[Bug 1842320] [NEW] Can't boot: "error: out of memory." immediately after the grub menu

Launchpad Bug Tracker 1842320 at bugs.launchpad.net
Wed Oct 5 16:19:39 UTC 2022


You have been subscribed to a public bug by Dan Bungert (dbungert):

[Impact]

 * If a user's initramfs grows large enough, grub2 may no longer be
able to load it.

 * Some real cases from OEM projects:

On many laptops with a built-in 4K monitor and nvidia drivers, u-d-c
puts nvidia*.ko into the initramfs, which grows it to ~120M. In
addition, gfxpayload=auto keeps the 4K resolution, since that is what
EFI POST passed on.

In this case, grub cannot load the initramfs because grub_memalign()
cannot find a suitable block of memory for the larger file:

```
#0 grub_memalign (align=1, size=592214020) at ../../../grub-core/kern/mm.c:376
#1 0x000000007dd7b074 in grub_malloc (size=592214020) at ../../../grub-core/kern/mm.c:408
#2 0x000000007dd7a2c8 in grub_verifiers_open (io=0x7bc02d80, type=131076)
    at ../../../grub-core/kern/verifiers.c:150
#3 0x000000007dd801d4 in grub_file_open (name=0x7bc02f00 "/boot/initrd.img-5.17.0-1011-oem",
    type=131076) at ../../../grub-core/kern/file.c:121
#4 0x000000007bcd5a30 in ?? ()
#5 0x000000007fe21247 in ?? ()
#6 0x000000007bc030c8 in ?? ()
#7 0x000000017fe21238 in ?? ()
#8 0x000000007bcd5320 in ?? ()
#9 0x000000007fe21250 in ?? ()
#10 0x0000000000000000 in ?? ()
```

Based on grub_mm_dump, memory is fragmented (some parts are probably in
use because of the 4K resolution?) and there is no contiguous free
region large enough for the file, so this check fails:

```
grub_real_malloc(...)
...
if (cur->size >= n + extra)
```
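
To illustrate why the request fails even when plenty of memory is free in total, here is a minimal sketch of a first-fit scan over a fragmented free list. It is not GRUB's allocator: `struct free_region`, `total_free()` and `first_fit()` are hypothetical names, and sizes are plain bytes rather than GRUB's internal allocation units. Only the comparison mirrors the `cur->size >= n + extra` check above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified free-region descriptor (not GRUB's real type). */
struct free_region {
    uint64_t start; /* first byte of the free region */
    uint64_t size;  /* length of the region in bytes */
};

/* Sum of all free bytes, whether or not they are contiguous. */
uint64_t total_free(const struct free_region *r, size_t count)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += r[i].size;
    return sum;
}

/* First-fit scan: index of the first region that can hold n + extra
 * bytes, or -1 if no single region is large enough. */
int first_fit(const struct free_region *r, size_t count,
              uint64_t n, uint64_t extra)
{
    for (size_t i = 0; i < count; i++) {
        /* Same shape of test as the one quoted from grub_real_malloc. */
        if (r[i].size >= n + extra)
            return (int) i;
    }
    return -1;
}
```

With three free regions of 300 MiB, 200 MiB and 250 MiB, `total_free()` reports 750 MiB, yet `first_fit()` returns -1 for the 592214020-byte request from the backtrace, because no single region is big enough.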

Based on UEFI Specification Section 7.2[1] and the UEFI driver writer's
guide, section 4.2.3[2], we can request memory above the 32-bit range
from AllocatePages().

As most x86_64 platforms should support 64-bit addressing, we should
extend GRUB_EFI_MAX_USABLE_ADDRESS to 64 bits to make more memory
available.
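
As a rough illustration of what raising the limit buys, the sketch below counts how many bytes of a firmware memory region stay usable under a given maximum address. `usable_bytes()` is a hypothetical helper written for this report, not code from grub-core/kern/efi/mm.c, and it assumes the region does not wrap around the end of the address space.

```c
#include <stdint.h>

/* Bytes of the region [start, start + len) that lie at or below max_addr.
 * Assumption: start + len - 1 does not overflow a 64-bit integer. */
uint64_t usable_bytes(uint64_t start, uint64_t len, uint64_t max_addr)
{
    if (len == 0 || start > max_addr)
        return 0;                     /* region is entirely out of reach */

    uint64_t last = start + len - 1;  /* last byte of the region */
    if (last > max_addr)
        last = max_addr;              /* trim the part above the limit */

    return last - start + 1;
}
```

For a 2 GiB region starting at 4 GiB, the result is 0 with a limit of 0xffffffff and the full 2 GiB with a 64-bit limit. A region that straddles 4 GiB is trimmed to the part below it.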

 * When users grow the initramfs, it will probably not be found at
boot, which is really annoying and hurts the user experience (the
system is not able to boot).

[Test Plan]

 * Detailed instructions to reproduce the bug:

1. Use any method to grow the initramfs, such as installing nvidia-driver.

2. Developers who want to reproduce it can instead add a hook that runs
dd if=/dev/random of=... bs=1M count=500, for example:

```
$ cat /usr/share/initramfs-tools/hooks/zzz-touch-a-file
#!/bin/sh

PREREQ=""

prereqs()
{
        echo "$PREREQ"
}

case $1 in
# get pre-requisites
prereqs)
        prereqs
        exit 0
        ;;
esac

. /usr/share/initramfs-tools/hook-functions
dd if=/dev/random of=${DESTDIR}/test-500M bs=1M count=500
```

Then run update-initramfs.

 * After applying my patches, the issue is gone.

 * I also tested my grubx64.efi build on:

1. X86_64 qemu with
1.1. 60M initramfs + 5.15.0-37-generic kernel
1.2. 565M initramfs + 5.17.0-1011-oem kernel

2. Amd64 HP mobile workstation with
2.1. 65M initramfs + 5.15.0-39-generic kernel
2.2. 771M initramfs + 5.17.0-1011-oem kernel

All work well.

[Where problems could occur]

* The changes are almost entirely in i386/efi, so the impact is limited to i386 / x86_64 EFI systems.
The other change modifies “grub-core/kern/efi/mm.c”, but it keeps the original addressing for “arm/arm64/ia64/riscv32/riscv64”.
Thus it should not affect them.

* There is a “#if defined(__x86_64__)” guard, which is intended to keep
the > 32-bit code out of i386 builds, as well as:

```
 #if defined (__code_model_large__)
-#define GRUB_EFI_MAX_USABLE_ADDRESS 0xffffffff
+#define GRUB_EFI_MAX_USABLE_ADDRESS __UINTPTR_MAX__
+#define GRUB_EFI_MAX_ALLOCATION_ADDRESS 0x7fffffff
 #else
 #define GRUB_EFI_MAX_USABLE_ADDRESS 0x7fffffff
+#define GRUB_EFI_MAX_ALLOCATION_ADDRESS 0x3fffffff
 #endif
```

If everything works as expected, i386 should keep working.

If we are unlucky, then according to the UEFI driver writer's guide[2],
i386 could be given a memory region above 4GB that it can never access.

[Other Info]

 * Upstream grub2 bug #61058
https://savannah.gnu.org/bugs/index.php?61058

 * Test PPA: https://launchpad.net/~os369510/+archive/ubuntu/lp1842320

 * Test grubx64.efi:
https://people.canonical.com/~jeremysu/lp1842320/grubx64.efi.lp1842320

 * Test source code: https://github.com/os369510/grub2/tree/lp1842320

 * If you build the package, the test grubx64.efi is under
“obj/monolithic/grub-efi-amd64/grubx64.efi”, in my case:
`/var/cache/pbuilder/build/276481/build/grub2-2.06/obj/monolithic/grub-efi-amd64/grubx64.efi`

 * My build command: `sudo PBSHELL=1 pbuilder build --hookdir ~/hook-dir
ubuntu-grub/grub2_2.06-2ubuntu7+jeremydev2.dsc 2>&1 | tee build.log`

 * My qemu command: `qemu-system-x86_64 -bios
edk2/Build/OvmfX64/DEBUG_GCC5/FV/OVMF.fd -hda Templates/grub.qcow2 -m 6G
-vga cirrus -smp 8 -machine type=q35,accel=kvm -cpu host -enable-kvm
-boot menu=on` (I built an edk2 binary with debugging log)

 * You can use my grubx64.efi with debug symbols from
https://people.canonical.com/~jeremysu/lp1842320/grubx64.efi.lp1842320-dev-with-debug-symbols
and the source code is at
https://github.com/os369510/grub2/tree/jeremy-dev .

After building the package from source, you can attach gdb to the qemu
session:

```
ubuntu@ubuntu-HP-ZBook-Fury-16-G9-Mobile-Workstation-PC [ /var/cache/pbuilder/build/35354/tmp/buildd/grub2-2.06/obj/grub-efi-amd64/grub-core ]
$ gdb -x gdb_grub # with “add-symbol-file kernel.img ${address}”
```

The address above can be read from the qemu serial port output: find
the last line of the form
“Loading driver at 0x000xxxxxxxxxx EntryPoint=0x000xxxxxxxabc”

In the case above, use “0x000xxxxxxxabc” as ${address}.
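
If you want to pull that value out of a captured serial log programmatically, a minimal sketch could look like the following. `parse_entry_point()` is a hypothetical helper, and it assumes only the “EntryPoint=0x...” token shown above; any sample line you feed it is your own log output, not something defined by this report.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Extract the hexadecimal value after "EntryPoint=" from one log line.
 * Returns 0 and stores the value in *entry on success, -1 otherwise. */
int parse_entry_point(const char *line, uint64_t *entry)
{
    static const char key[] = "EntryPoint=";
    const char *p = strstr(line, key);
    if (p == NULL)
        return -1;                 /* the line has no EntryPoint token */

    p += sizeof(key) - 1;          /* skip past "EntryPoint=" */

    char *end;
    /* Base 16 also accepts an optional leading "0x". */
    unsigned long long value = strtoull(p, &end, 16);
    if (end == p)
        return -1;                 /* no hex digits followed the token */

    *entry = (uint64_t) value;
    return 0;
}
```

Run it over the log line by line and keep the last successful result, since the text above says to use the last “Loading driver” line.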

[1] https://uefi.org/sites/default/files/resources/UEFI_Spec_2_9_2021_03_18.pdf
[2] https://edk2-docs.gitbook.io/edk-ii-uefi-driver-writer-s-guide/4_general_driver_design_guidelines/readme.2/423_use_uefi_memory_allocation_services

---

Upgraded from 19.04 to current 19.10 using "do-release-upgrade -d". Can
still boot using the previous 5.0.0-25-generic kernel, but the
5.2.0-15-generic fails to start.

On selecting Ubuntu from Grub, the message "error: out of memory." is
immediately shown. Pressing a key attempts to start boot-up but fails to
mount root fs.

Machine is HP Spectre X360 with 8GB RAM. Under kernel 5.0.0, free shows
the following (run from Gnome terminal):

              total        used        free      shared  buff/cache   available
Mem:        7906564     1761196     3833240     1020216     2312128     4849224
Swap:       1003516           0     1003516

Kernel packages installed:

linux-generic                              5.2.0.15.16 amd64
linux-headers-5.2.0-15                     5.2.0-15.16 all
linux-headers-5.2.0-15-generic             5.2.0-15.16 amd64
linux-headers-generic                      5.2.0.15.16 amd64
linux-image-5.0.0-25-generic               5.0.0-25.26 amd64
linux-image-5.2.0-15-generic               5.2.0-15.16+signed1 amd64
linux-image-generic                        5.2.0.15.16 amd64
linux-modules-5.0.0-25-generic             5.0.0-25.26 amd64
linux-modules-5.2.0-15-generic             5.2.0-15.16 amd64
linux-modules-extra-5.0.0-25-generic       5.0.0-25.26 amd64
linux-modules-extra-5.2.0-15-generic       5.2.0-15.16 amd64

Photo of kernel panic attached.

NVMe drive partition layout (GPT):

Device           Start        End   Sectors   Size Type
/dev/nvme0n1p1    2048    1050623   1048576   512M EFI System
/dev/nvme0n1p2 1050624    2549759   1499136   732M Linux filesystem
/dev/nvme0n1p3 2549760 1000214527 997664768 475.7G Linux filesystem

$ sudo pvs
  PV                          VG        Fmt  Attr PSize    PFree
  /dev/mapper/nvme0n1p3_crypt ubuntu-vg lvm2 a--  <475.71g    0

$ sudo lvs
  LV     VG        Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root   ubuntu-vg -wi-ao---- 474.75g
  swap_1 ubuntu-vg -wi-ao---- 980.00m

Partition 3 is LUKS encrypted. Root LV is ext4.
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu7
Architecture: amd64
AudioDevicesInUse:
 USER        PID ACCESS COMMAND
 /dev/snd/controlC0:  gmckeown   1647 F.... pulseaudio
CurrentDesktop: ubuntu:GNOME
DistroRelease: Ubuntu 19.10
InstallationDate: Installed on 2019-08-15 (18 days ago)
InstallationMedia: Ubuntu 19.04 "Disco Dingo" - Release amd64 (20190416)
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 003: ID 8087:0a2b Intel Corp.
 Bus 001 Device 002: ID 04f2:b593 Chicony Electronics Co., Ltd HP Wide Vision FHD Camera
 Bus 001 Device 004: ID 046d:c52b Logitech, Inc. Unifying Receiver
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: HP HP Spectre x360 Convertible 13-ae0xx
Package: linux (not installed)
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.0.0-25-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash
ProcVersionSignature: Ubuntu 5.0.0-25.26-generic 5.0.18
RelatedPackageVersions:
 linux-restricted-modules-5.0.0-25-generic N/A
 linux-backports-modules-5.0.0-25-generic  N/A
 linux-firmware                            1.181
Tags:  eoan
Uname: Linux 5.0.0-25-generic x86_64
UpgradeStatus: Upgraded to eoan on 2019-09-02 (0 days ago)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 05/17/2019
dmi.bios.vendor: AMI
dmi.bios.version: F.25
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: 83B9
dmi.board.vendor: HP
dmi.board.version: 56.43
dmi.chassis.type: 31
dmi.chassis.vendor: HP
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAMI:bvrF.25:bd05/17/2019:svnHP:pnHPSpectrex360Convertible13-ae0xx:pvr:rvnHP:rn83B9:rvr56.43:cvnHP:ct31:cvrChassisVersion:
dmi.product.family: 103C_5335KV HP Spectre
dmi.product.name: HP Spectre x360 Convertible 13-ae0xx
dmi.product.sku: 2QH38EA#ABU
dmi.sys.vendor: HP

** Affects: grub
     Importance: Unknown
         Status: Unknown

** Affects: oem-priority
     Importance: Critical
     Assignee: jeremyszu (os369510)
         Status: Triaged

** Affects: grub2-signed (Ubuntu)
     Importance: Critical
         Status: Triaged

** Affects: initramfs-tools (Ubuntu)
     Importance: Critical
         Status: Won't Fix

** Affects: linux (Ubuntu)
     Importance: Critical
         Status: Confirmed


** Tags: apport-collected foundations-todo jammy jellyfish-edge-staging jiayi kinetic oem-priority originate-from-1972964 patch
-- 
Can't boot: "error: out of memory." immediately after the grub menu
https://bugs.launchpad.net/bugs/1842320
You received this bug notification because you are a member of Ubuntu Foundations Bugs, which is subscribed to the bug report.


