[Bug 1840686] Re: Xenial images won't reboot if disk size is > 2TB when using GPT
Matthew Ruffell
1840686 at bugs.launchpad.net
Wed Nov 6 23:06:17 UTC 2019
As per Łukasz's request, I have performed verification testing across a
wide range of affected and unaffected systems, with the
grub2-2.02~beta2-36ubuntu3.23 package in -proposed.
All tests were performed on GCP.
Test case one:
Summary: efi enabled image with 3072gb disk.
Disk: 3072gb
Bios/efi: efi
Image: daily-ubuntu-1604-xenial-v20190731, ubuntu-os-cloud-devel
Affected: Yes
Behaviour with grub2-2.02~beta2-36ubuntu3.22 from -updates:
Fails to reboot - as expected because of this bug.
Log: https://paste.ubuntu.com/p/KKH7r3vdrC/
Behaviour with grub2-2.02~beta2-36ubuntu3.23 from -proposed:
If you follow the instructions in the test section, the instance reboots successfully. Note, grub-install must be explicitly called. If you do not manually run grub-install, rebooting will fail.
Log: https://paste.ubuntu.com/p/7KxmDCjFSG/
Log (apt upgrade, no manual grub-install): https://paste.ubuntu.com/p/j3VK5PV7GR/
Test case two:
Summary: efi enabled image with 10gb disk.
Disk: 10gb
Bios/efi: efi
Image: daily-ubuntu-1604-xenial-v20190731, ubuntu-os-cloud-devel
Affected: No
Behaviour with grub2-2.02~beta2-36ubuntu3.22 from -updates:
Reboots successfully.
Behaviour with grub2-2.02~beta2-36ubuntu3.23 from -proposed:
Reboots successfully.
Logs: https://paste.ubuntu.com/p/c65jjYjj6S/
Note: grub-install was not invoked manually; this represents a typical user apt upgrade with no interaction.
Test case three:
Summary: bios enabled image with 10gb disk.
Disk: 10gb
Bios/efi: bios
Image: ubuntu-1604-xenial-v20191024
Affected: No
Behaviour with grub2-2.02~beta2-36ubuntu3.22 from -updates:
Reboots successfully.
Behaviour with grub2-2.02~beta2-36ubuntu3.23 from -proposed:
Reboots successfully.
Logs: https://paste.ubuntu.com/p/N3QFX63WCS/
Note: grub-install was not invoked manually; this represents a typical user apt upgrade with no interaction.
Test case four:
Summary: bios enabled image with 3072gb disk.
Disk: 3072gb
Bios/efi: bios
Image: ubuntu-1604-xenial-v20191024
Affected: No
Behaviour with grub2-2.02~beta2-36ubuntu3.22 from -updates:
Reboots successfully.
Behaviour with grub2-2.02~beta2-36ubuntu3.23 from -proposed:
Reboots successfully.
Logs: https://paste.ubuntu.com/p/KZw4kcD6pS/
Note: grub-install was not invoked manually; this represents a typical user apt upgrade with no interaction.
Log (apt upgrade, WITH manual grub-install): https://paste.ubuntu.com/p/RFBR5BbbtH/
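The four test cases above reduce to a simple rule: only the efi image with a disk above the 2 TB mark is affected. A minimal sketch of that rule follows. The helper name is_affected is hypothetical, and the 2049 GB threshold is taken from the bug description rather than measured here.

```shell
#!/bin/sh
# Hypothetical helper summarising the test matrix above.
# $1 = boot mode ("efi" or "bios"), $2 = boot disk size in GB.
is_affected() {
    mode=$1
    size_gb=$2
    # The bug description gives 2049 GB as the smallest affected disk size.
    if [ "$mode" = "efi" ] && [ "$size_gb" -ge 2049 ]; then
        echo "yes"
    else
        echo "no"
    fi
}

is_affected efi 3072    # test case one
is_affected efi 10      # test case two
is_affected bios 10     # test case three
is_affected bios 3072   # test case four
```

This only encodes the observed results; it does not inspect the disk or the filesystem.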
Conclusion:
grub2-2.02~beta2-36ubuntu3.23 from -proposed fixes this bug. It does not
introduce any regressions for non-affected use cases.
The only thing to note is that for efi based images with disk > 2tb, if
the user does not manually run grub-install after installing the package
from -proposed, rebooting will still fail. This is the same outcome as
the current situation, where such an image fails to reboot regardless.
For the same reason, there are unlikely to be any users running an image
with disk > 2tb who have never rebooted, so this is unlikely to be a
problem.
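To confirm which grub an instance actually has before rebooting, the installed version can be compared against the fixed one. This is a rough sketch: the function names are hypothetical, and it assumes grub-common carries the same version string as the grub2 source package.

```shell
#!/bin/sh
FIXED_VERSION="2.02~beta2-36ubuntu3.23"

# Succeeds if the version in $1 is at least the fixed version.
grub_is_fixed() {
    dpkg --compare-versions "$1" ge "$FIXED_VERSION"
}

# Prints the installed version of package $1 (empty if not installed).
installed_version() {
    dpkg-query -W -f='${Version}' "$1" 2>/dev/null
}

# Report on grub-common when dpkg is available on this machine.
if command -v dpkg-query >/dev/null 2>&1; then
    if grub_is_fixed "$(installed_version grub-common)"; then
        echo "grub-common is at or above $FIXED_VERSION"
    else
        echo "grub-common is older than $FIXED_VERSION (or not installed)"
    fi
fi
```

A passing version check does not mean the bootloader on disk has been updated. On an affected efi image, grub-install still has to be run by hand, as noted above.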
This will, however, fix all images and new instances using the fixed
version of grub moving forward.
Taking this into consideration, I am happy to mark this as verified.
Pat, feel free to also test. I will also see if the customer is
interested in testing as well.
** Tags removed: verification-needed verification-needed-xenial
** Tags added: verification-done-xenial
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to grub2 in Ubuntu.
https://bugs.launchpad.net/bugs/1840686
Title:
Xenial images won't reboot if disk size is > 2TB when using GPT
Status in cloud-init:
Won't Fix
Status in grub2 package in Ubuntu:
Fix Released
Status in grub2-signed package in Ubuntu:
Fix Released
Status in grub2 source package in Xenial:
Fix Committed
Status in grub2-signed source package in Xenial:
Fix Committed
Bug description:
[Impact]
On Xenial images which use GPT instead of MBR to enable efi based
booting, there is an issue where, after booting an instance that has a
disk size of 2049 GB or higher, the instance hangs on the next boot
(logs show it stopping at "Booting Hard Disk 0").
This is a problem in grub2 where the system would become unbootable
after ext* online resize if no resize_inode was created at ext* format
time.
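One way to see whether a filesystem is in the state described above is to look at its feature flags. The sketch below rests on an assumption: that an affected root filesystem has the meta_bg flag set after the online resize, which is what the title of the upstream fix ("ext2: Support META_BG") suggests. The helper name has_fs_feature is hypothetical. It reads `dumpe2fs -h` style output on stdin.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the feature named in $1 appears on the
# "Filesystem features:" line of the text read from stdin.
has_fs_feature() {
    awk -v want="$1" '
        /^Filesystem features:/ {
            # Fields 1 and 2 are the label; the feature names start at field 3.
            for (i = 3; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit !found }
    '
}
```

On an instance this could be used as `sudo dumpe2fs -h /dev/sda1 | has_fs_feature meta_bg && echo "meta_bg is set"`, where /dev/sda1 is an assumed root partition and may differ on a given image.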
[Test Case]
To reproduce:
1) Create an image with a disk size of 3072 GB using a serial that has
GPT:
gcloud compute instances create test-3072-xenial --image daily-ubuntu-1604-xenial-v20190731 --image-project ubuntu-os-cloud-devel --boot-disk-size 3072
2) Reboot the instance
The instance will hang on reboot and you cannot connect. If you go to
GCP console and select Logs > Serial port 1 (console), you will see
the boot process has stopped at "Booting Hard Disk 0".
I have built a test package, which is available here:
https://launchpad.net/~mruffell/+archive/ubuntu/lp1840686-test
If you perform step 1) but do not reboot, and instead add the PPA and
install the new grub as follows:
1) gcloud compute instances create test-3072-xenial --image daily-ubuntu-1604-xenial-v20190731 --image-project ubuntu-os-cloud-devel --boot-disk-size 3072
2) sudo add-apt-repository ppa:mruffell/lp1840686-test
3) sudo apt-get update
4) sudo apt remove grub-common grub-efi-amd64 grub-efi-amd64-bin grub-efi-amd64-signed grub-pc-bin grub2-common
5) sudo apt install grub-common grub-efi-amd64 grub-efi-amd64-bin grub-pc-bin grub2-common
6) sudo grub-install /dev/sda
7) sudo reboot
The instance will boot successfully and you will be able to connect.
Note, we must use "daily-ubuntu-1604-xenial-v20190731" as the image,
as it is enabled for GPT and efi. GCP was reverted back to MBR and
bios booting because of this bug, so the latest images will not
reproduce the problem.
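Since the image has to be booted via efi to reproduce the problem, it can help to confirm the boot mode of a running instance first. A minimal sketch, relying on the usual Linux convention that /sys/firmware/efi exists only when the system was booted via efi. The function name and its optional argument are hypothetical.

```shell
#!/bin/sh
# Prints "efi" if the firmware directory exists, otherwise "bios".
# $1 is an optional sysfs root (defaults to /sys); it exists only so the
# function can be exercised against a scratch directory.
boot_mode() {
    sysfs_root=${1:-/sys}
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo "efi"
    else
        echo "bios"
    fi
}

boot_mode
```

The partition table type can be checked separately, for example with `sudo parted /dev/sda print`, which reports a "Partition Table:" line.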
[Regression Potential]
Grub is a core package and every care must be taken in order to not
introduce any regressions.
The commit is present in B, D, E and F, and is considered well tested
and widely adopted by the community.
The commit comes with its own testcase, to test the ext4_metabg fix.
The changes are localised to ext* based filesystems, although since
they are the most popular family of filesystems used by the community,
this does not reduce risk of breakage by much.
If a regression were to happen, it would have a large impact and, in
the worst case, could lead to unbootable systems and data loss for
users who are not technical enough to reinstall grub from a working
package inside a chroot of the broken system.
[Other Info]
In comment #4, Sultan identifies the fix as:
commit e20aa39ea4298011ba716087713cff26c6c52006
Author: Vladimir Serbinenko <phcoder at gmail.com>
Date: Mon Feb 16 20:53:26 2015 +0100
Subject: ext2: Support META_BG.
This commit is from upstream grub2, and can be found here:
https://git.savannah.gnu.org/cgit/grub.git/commit/?id=e20aa39ea4298011ba716087713cff26c6c52006
Looking at when this was merged:
$ git describe --contains e20aa39ea4298011ba716087713cff26c6c52006
2.02-beta3~429
This commit is present in B, D, E and F, leaving X as the only version
needing an SRU.
The commit cleanly cherry picks to X, because the delta from
2.02~beta2-36ubuntu3.22 to 2.02-beta3~429 is small.
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1840686/+subscriptions