[Bug 1929860] Re: initrdfail can result in resuming with different initrd images and hanging resume
Brian Murray
1929860 at bugs.launchpad.net
Wed Aug 18 19:18:53 UTC 2021
I also tested this update on an Intel NUC and was able to reboot
without any issues.
bdmurray@atom:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.2 LTS"
bdmurray@atom:~$ uname -a
Linux atom 5.4.0-81-generic #91-Ubuntu SMP Thu Jul 15 19:09:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
bdmurray@atom:~$ dpkg -l | grep ii.*grub
ii grub-common 2.04-1ubuntu26.13 amd64 GRand Unified Bootloader (common files)
ii grub-efi-amd64 2.04-1ubuntu44.2 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 version)
ii grub-efi-amd64-bin 2.04-1ubuntu44.2 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 modules)
ii grub-efi-amd64-signed 1.167.2+2.04-1ubuntu44.2 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 version, signed)
ii grub2-common 2.04-1ubuntu26.13 amd64 GRand Unified Bootloader (common files for version 2)
ii ubuntu-recovery-grub-hotkey 1.1columbia1 all Ubuntu Recovery Grub Hotkey Configuration
--
https://bugs.launchpad.net/bugs/1929860
Title:
initrdfail can result in resuming with different initrd images and
hanging resume
Status in grub2 package in Ubuntu:
Fix Released
Status in grub2 source package in Focal:
Fix Committed
Bug description:
[Impact]
Ubuntu Focal (and newer releases) on AWS will normally boot without an
initrd image (just the microcode.cpio). There is a fallback mechanism
to reboot with the full initrd image when the boot fails to complete.
The grub environment variable "initrdfail" is used to track when a
boot failed and to switch between the optimized initrd-less boot path
and the full initrd path.
On a normal successful boot, the "initrdfail" variable is cleared by
grub-initrd-fallback.service. However, this doesn't happen when
resuming from hibernation. As a result, the initrd fallback will get
triggered on the second hibernation / resume cycle despite the
original boot using only the microcode.cpio. This switch in initrd
images leads to the second resume hanging.
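The sequence described above can be modeled with a toy shell
simulation. This is only a sketch: "initrdfail" here is a plain shell
variable standing in for the grub environment variable, the function
names (grub_pick_initrd, cold_boot, resume) are hypothetical, and it
assumes that grub sets the flag before each initrd-less attempt, which
is inferred from the description rather than taken from the grub
scripts.

```shell
#!/bin/sh
# Toy model only; nothing here touches the real grubenv.
initrdfail=""   # stand-in for the grub environment variable
picked=""       # which image the modeled grub chose last

# Assumption: grub arms the flag before trying the initrd-less path,
# and falls back to the full initrd when the flag is already set.
grub_pick_initrd() {
    if [ -n "$initrdfail" ]; then
        picked="full-initrd"
    else
        picked="microcode-only"
        initrdfail=1
    fi
}

# Normal boot: grub-initrd-fallback.service clears the flag afterwards.
cold_boot() {
    grub_pick_initrd
    initrdfail=""
}

# Resume from hibernation: the flag is not cleared.
resume() {
    grub_pick_initrd
}

cold_boot; echo "boot:     $picked"
resume;    echo "resume 1: $picked"
resume;    echo "resume 2: $picked"
```

In this model the first resume still uses the microcode-only path but
leaves the flag set, so the second resume switches to the full initrd.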
We've been able to successfully avoid this issue by adding the
following to the ec2-hibinit-agent resume handler:
/usr/bin/grub-editenv - unset initrdfail
/usr/bin/grub-editenv - unset recordfail
(Note: clearing recordfail may not be necessary, will need to try
again without it.)
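For illustration, the two commands above could be wrapped in a small
shell function. This is a sketch and not the actual ec2-hibinit-agent
code: the function name clear_boot_failure_flags and the GRUB_EDITENV
override (added only so the helper can be exercised without touching
the real grubenv) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper; GRUB_EDITENV may be overridden for dry runs.
GRUB_EDITENV="${GRUB_EDITENV:-/usr/bin/grub-editenv}"

# Unset the failure-tracking variables in the default grubenv ("-").
# Stops at the first failing call and reports a non-zero status.
clear_boot_failure_flags() {
    for var in initrdfail recordfail; do
        "$GRUB_EDITENV" - unset "$var" || return 1
    done
}
```

A resume handler would call clear_boot_failure_flags once the system
is back up. If clearing recordfail turns out to be unnecessary, it can
simply be dropped from the loop.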
This bug was filed against grub2 as it appears to own initrdfail.
[Test plan]
TBD w/ CPC
[Regression potential]
The services are changed to oneshot and are now wanted by both
multi-user and sleep (wantedby=multi-user sleep). We might miss other
places where they should run, or record the wrong thing on resume.