[Bug 1929860] [NEW] initrdfail can result in resuming with different initrd images and hanging resume

Francis Ginther 1929860 at bugs.launchpad.net
Thu May 27 18:49:37 UTC 2021


Public bug reported:

Ubuntu Focal (and newer releases) on AWS will normally boot without an
initrd image (just the microcode.cpio). There is a fallback mechanism to
reboot with the full initrd image when the boot fails to complete. The
grub environment variable "initrdfail" is used to track when a boot
failed and switch between the optimized initrd-less boot path and the
full initrd path.

On a normal successful boot, the "initrdfail" variable is cleared by
grub-initrd-fallback.service. However, this doesn't happen when resuming
from hibernation. As a result, the initrd fallback will get triggered on
the second hibernation / resume cycle despite the original boot using
only the microcode.cpio. This switch in initrd images leads to the
second resume hanging.
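
To confirm whether a stale "initrdfail" is present after a resume, the
GRUB environment block can be inspected with "grub-editenv - list",
which prints one name=value pair per line. The following is only a
minimal diagnostic sketch: the function name grub_var_is_set and the
GRUB_EDITENV override are hypothetical and not part of grub2 or
ec2-hibinit-agent.

```shell
#!/bin/sh
# Hypothetical diagnostic helper (not shipped by any package).
# GRUB_EDITENV can be overridden to point at a different binary.
GRUB_EDITENV="${GRUB_EDITENV:-/usr/bin/grub-editenv}"

# Succeeds (exit 0) if the named variable is present in the GRUB
# environment block, fails otherwise.
grub_var_is_set() {
    # "-" selects the default environment block file;
    # "list" prints one name=value pair per line.
    "$GRUB_EDITENV" - list 2>/dev/null | grep -q "^$1="
}
```

For example, running "grub_var_is_set initrdfail && echo stale" as root
after the first resume should show whether the variable was left
behind.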

We've been able to successfully avoid this issue by adding the following
to the ec2-hibinit-agent resume handler:

/usr/bin/grub-editenv - unset initrdfail
/usr/bin/grub-editenv - unset recordfail

(Note: clearing recordfail may not be necessary; we will need to try
again without it.)
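
As a rough sketch, the two commands above could be wrapped in a small
function that the resume handler calls. The function name
clear_grub_fail_flags and the GRUB_EDITENV override are assumptions made
for illustration; this is not the actual ec2-hibinit-agent code.

```shell
#!/bin/sh
# Hypothetical wrapper around the workaround commands (illustration only).
# GRUB_EDITENV can be overridden, e.g. for a dry run.
GRUB_EDITENV="${GRUB_EDITENV:-/usr/bin/grub-editenv}"

# Unset the failure-tracking variables in the default GRUB environment
# block so the next boot does not take the fallback path.
clear_grub_fail_flags() {
    for var in initrdfail recordfail; do
        # "-" selects the default environment block file.
        "$GRUB_EDITENV" - unset "$var" ||
            echo "warning: could not unset $var" >&2
    done
}
```

The resume handler would then call clear_grub_fail_flags once, as root,
after the system has resumed. Setting GRUB_EDITENV=echo prints the
commands instead of modifying the environment block.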

This bug was filed against grub2 as it appears to own initrdfail.

** Affects: grub2 (Ubuntu)
     Importance: Undecided
         Status: New

** Package changed: ec2-hibinit-agent (Ubuntu) => grub2 (Ubuntu)

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to ec2-hibinit-agent in Ubuntu.
https://bugs.launchpad.net/bugs/1929860

Title:
  initrdfail can result in resuming with different initrd images and
  hanging resume

Status in grub2 package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1929860/+subscriptions


