[Bug 1960758] Re: UEFI libvirt servers can't boot on Ubuntu 20.04 hypervisors with Ussuri/Victoria

Mauricio Faria de Oliveira 1960758 at bugs.launchpad.net
Sat Sep 2 20:27:36 UTC 2023


Verification done for focal-proposed and 
(via upgrade to) focal-victoria/proposed,
later upgraded to focal-wallaby/proposed
(to confirm no changes are needed there).

focal-proposed:
--------------

First, (juju) deployed openstack on focal, and verified with steps in
comment #15.

With focal-updates, a UEFI server does NOT boot:

	$ juju ssh nova-compute/0 'dpkg -s nova-compute | grep ^Version:' 2>/dev/null
	Version: 2:21.2.4-0ubuntu2.5

	$ openstack image set jammy --property hw_firmware_type=uefi
	$ openstack server create --image jammy --flavor m1.small --network private test

	$ juju ssh nova-compute/0 sudo virsh dumpxml instance-00000002 2>&1 | sed -n '/<os>/,/<\/os>/p'
	...
	    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
	...

	$ openstack console log show test
	$

With focal-proposed, the server does boot successfully:

	$ juju ssh nova-compute/0 'sudo add-apt-repository --yes "deb http://archive.ubuntu.com/ubuntu focal-proposed main"'

I didn't find an option to "upgrade" to focal-proposed
(just to later OpenStack releases/cloud archive), thus
I manually ran the `apt install` command as the charm would.

The config change isn't an issue with the nova-compute
charm due to dpkg options `--force-conf{new,def}`:

@ charm-nova-compute/hooks/nova_compute_utils.py:

	dpkg_opts = [
	    '--option', 'Dpkg::Options::=--force-confnew',
	    '--option', 'Dpkg::Options::=--force-confdef',
	]

@ dpkg(1):

	confnew: If a conffile has been modified and the version in the package did change, 
	always install the new version without prompting, unless the --force-confdef is also specified, 
	in which case the default action is preferred.
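
The interaction of the two options can be modelled roughly as below.
This is only a simplified sketch of the dpkg(1) wording, not dpkg's
actual logic; the function name `conffile_action`, its parameters and
its return strings are hypothetical, and `default_action` stands for
the default dpkg would offer at the prompt (keeping the local file,
for a modified conffile).

```python
def conffile_action(locally_modified, package_changed,
                    force_confdef=False, force_confnew=False,
                    default_action="keep"):
    """Return 'keep', 'install' or 'prompt' for a single conffile.

    Simplified model of the dpkg(1) text quoted above (an assumption,
    not dpkg's implementation). Pass default_action=None to model the
    case where dpkg has no default action to fall back on.
    """
    if not package_changed:
        # The package ships the same conffile as before: nothing to decide.
        return "keep"
    if not locally_modified:
        # Unmodified conffile: the new version is installed silently.
        return "install"
    if force_confdef and default_action is not None:
        # --force-confdef wins whenever a default action exists.
        return default_action
    if force_confnew:
        return "install"
    return "prompt"


# Both options set, as the charm does, on a modified nova.conf:
charm_case = conffile_action(True, True, force_confdef=True, force_confnew=True)
print(charm_case)  # keep
```

With both options the modified file is kept, which is why the config
has to be rewritten afterwards for the new option to show up.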


And the charm does rewrite the configs anyway (`configs.write_all()`),
so I installed and triggered the config change/rewrite w/ our option.

	$ juju ssh nova-compute/0 'sudo apt install --yes --option=Dpkg::Options::=--force-confnew --option=Dpkg::Options::=--force-confdef nova-compute'
	...
	Configuration file '/etc/nova/nova.conf'
	 ==> Modified (by you or by a script) since installation.
	 ==> Package distributor has shipped an updated version.
	 ==> Keeping old config file as default.
	...

	$ juju ssh nova-compute/0 'dpkg -s nova-compute | grep ^Version:' 2>/dev/null
	Version: 2:21.2.4-0ubuntu2.6

	$ juju config nova-compute config-flags='ubuntu_libvirt_uefi_loader_path=True'

	$ juju ssh nova-compute/0 sudo grep ubuntu_libvirt_uefi_loader_path /etc/nova/nova.conf
	ubuntu_libvirt_uefi_loader_path = True
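
For scripted checks, the same lookup can be sketched in Python. This is
a hedged illustration only: nova reads its config with oslo.config, not
configparser, and a real nova.conf may contain constructs configparser
rejects. The helper name `expected_loader` is hypothetical; the two
loader paths are the ones seen in the guest XML in this report, and the
mapping (option off: Secure Boot loader, option on: plain loader) is
inferred from the observed behavior, not taken from the patch.

```python
import configparser

# Loader paths as they appear in the guest XML in this report.
SECBOOT_LOADER = "/usr/share/OVMF/OVMF_CODE.secboot.fd"
PLAIN_LOADER = "/usr/share/OVMF/OVMF_CODE.fd"


def expected_loader(nova_conf_text):
    """Hypothetical helper: which loader the guest XML should reference."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(nova_conf_text)
    # The option lives in [DEFAULT] and is off unless explicitly enabled.
    opted_in = parser.getboolean(
        "DEFAULT", "ubuntu_libvirt_uefi_loader_path", fallback=False)
    return PLAIN_LOADER if opted_in else SECBOOT_LOADER


sample_conf = "[DEFAULT]\nubuntu_libvirt_uefi_loader_path = True\n"
print(expected_loader(sample_conf))  # /usr/share/OVMF/OVMF_CODE.fd
```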

And the server boots correctly:
 
	$ openstack server stop test
	$ openstack server start test

	$ openstack server show test | grep -e instance_name -e task_state -e vm_state -e status
	| OS-EXT-SRV-ATTR:instance_name       | instance-00000002                                        |
	| OS-EXT-STS:task_state               | None                                                     |
	| OS-EXT-STS:vm_state                 | active                                                   |
	| status                              | ACTIVE                                                   |

	$ juju ssh nova-compute/0 sudo virsh dumpxml instance-00000002 2>&1 | sed -n '/<os>/,/<\/os>/p'
	...
	    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
	...

	$ openstack console log show test | grep -o 'test login:'
	test login:
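
Note that ACTIVE alone does not prove the guest booted (the broken
server is also started from OpenStack's point of view), so both checks
above are needed. A minimal sketch of combining them, assuming the
`openstack server show` table rows and the console log have already
been captured as text; the function names and the "login:" marker
default are hypothetical choices for illustration.

```python
def parse_show_table(text):
    """Turn '| key | value |' rows into a dict, ignoring other lines."""
    fields = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) == 2 and cells[0]:
            fields[cells[0]] = cells[1]
    return fields


def guest_booted(show_text, console_log, marker="login:"):
    """True only if the server is ACTIVE and the console shows the marker."""
    status = parse_show_table(show_text).get("status")
    return status == "ACTIVE" and marker in console_log


show_rows = (
    "| OS-EXT-STS:vm_state                 | active |\n"
    "| status                              | ACTIVE |\n"
)
print(guest_booted(show_rows, "test login:"))  # True
print(guest_booted(show_rows, ""))             # False
```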

focal-victoria/proposed:
-----------------------

Then, upgraded to (cloud-archive) focal-victoria, as in comment #17.
Except for the last step, which used -proposed, of course:

	$ juju config nova-compute openstack-origin=cloud:focal-victoria/proposed

	$ juju ssh nova-compute/0 'dpkg -s nova-compute | grep ^Version:' 2>/dev/null
	Version: 2:22.4.0-0ubuntu1~cloud5

	$ cmadison nova | grep victoria
	 nova | 2:22.4.0-0ubuntu1~cloud4                        | victoria          | focal-updates   | source
	 nova | 2:22.4.0-0ubuntu1~cloud5                        | victoria-proposed | focal-proposed  | source

The fix continues to work in victoria-proposed:

	$ openstack server stop test
	$ openstack server start test

	$ juju ssh nova-compute/0 sudo virsh dumpxml instance-00000002 2>&1 | sed -n '/<os>/,/<\/os>/p'
	...
	    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
	...

focal-wallaby:
-------------

Finally, upgraded to focal-wallaby (which does not need the fix at all),
as in comment #17, using 'cloud:focal-wallaby'. Versions (#17 plus one):

	$ for app in rabbitmq-server keystone cinder glance neutron-api neutron-gateway placement nova-cloud-controller nova-compute; do juju status | awk '/'$app' / { print $1, $2; exit }'; done
	rabbitmq-server 3.8.2
	keystone 19.0.1
	cinder 18.2.1
	glance 22.1.1
	neutron-api 18.6.0
	neutron-gateway 18.6.0
	placement 5.0.1
	nova-cloud-controller 23.2.2
	nova-compute 23.2.2

Verified the patched code is not there anymore:

	$ juju ssh nova-compute/0 'grep -r ubuntu_libvirt_uefi_loader_path /usr/lib/python3/dist-packages/nova/' 
	$

And the VM still boots successfully after stop/start:

	$ openstack console log show test | grep -o 'test login:'
	test login:

The new guest XML is refactored/different from Ussuri/Victoria,
as expected, due to qemu firmware metadata files in Wallaby:

	$ juju ssh nova-compute/0 sudo virsh dumpxml instance-00000002 2>&1 | sed -n '/<os>/,/<\/os>/p'
	  <os>
	    <type arch='x86_64' machine='pc-i440fx-4.2'>hvm</type>
	    <loader readonly='yes' secure='no' type='pflash'>/usr/share/OVMF/OVMF_CODE_4M.fd</loader>
	    <nvram template='/usr/share/OVMF/OVMF_VARS_4M.fd'>/var/lib/libvirt/qemu/nvram/instance-00000002_VARS.fd</nvram>
	    <boot dev='hd'/>
	    <smbios mode='sysinfo'/>
	  </os>
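
The firmware-related fields of such an <os> element can also be pulled
out programmatically instead of with sed. The following is a sketch
using only the Python standard library; the helper name
`firmware_summary` is hypothetical, and detecting a Secure Boot build
by the ".secboot." part of the file name is an assumption based on the
paths seen in this report.

```python
import xml.etree.ElementTree as ET


def firmware_summary(os_xml):
    """Summarize the firmware settings in a libvirt <os> element."""
    root = ET.fromstring(os_xml)
    os_elem = root if root.tag == "os" else root.find("os")
    if os_elem is None or os_elem.find("loader") is None:
        raise ValueError("no <os>/<loader> element found")
    loader = os_elem.find("loader")
    nvram = os_elem.find("nvram")
    type_elem = os_elem.find("type")
    path = (loader.text or "").strip()
    return {
        "machine": type_elem.get("machine") if type_elem is not None else None,
        "loader": path,
        "secure_attr": loader.get("secure"),
        # Assumption: Secure Boot builds carry ".secboot." in the file name.
        "secboot_build": ".secboot." in path,
        "nvram_template": nvram.get("template") if nvram is not None else None,
    }


wallaby_os = """
<os>
  <type arch='x86_64' machine='pc-i440fx-4.2'>hvm</type>
  <loader readonly='yes' secure='no' type='pflash'>/usr/share/OVMF/OVMF_CODE_4M.fd</loader>
  <nvram template='/usr/share/OVMF/OVMF_VARS_4M.fd'>/var/lib/libvirt/qemu/nvram/instance-00000002_VARS.fd</nvram>
  <boot dev='hd'/>
  <smbios mode='sysinfo'/>
</os>
"""
summary = firmware_summary(wallaby_os)
print(summary["loader"], summary["secure_attr"], summary["secboot_build"])
```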


** Tags removed: verification-needed verification-needed-focal verification-victoria-needed
** Tags added: verification-done verification-done-focal verification-victoria-done

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1960758

Title:
  UEFI libvirt servers can't boot on Ubuntu 20.04 hypervisors with
  Ussuri/Victoria

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive ussuri series:
  Triaged
Status in Ubuntu Cloud Archive victoria series:
  Fix Committed
Status in OpenStack Compute (nova):
  Invalid
Status in OpenStack Compute (nova) ussuri series:
  Invalid
Status in OpenStack Compute (nova) victoria series:
  Invalid
Status in nova package in Ubuntu:
  Invalid
Status in nova source package in Focal:
  Fix Committed

Bug description:
  Impact:
  ===

  Currently, setting `hw_firmware_type=uefi` may create
  _unbootable_ servers on 20.04 hypervisors with Ussuri
  and Victoria (Wallaby and later are OK).

  We should not use the Secure Boot firmware on the 'pc'
  machine type, as 'q35' is _required_ by OVMF firmware
  if SMM feature is built (usually the case, to actually
  secure the SB feature).
  [See comment #6 for research and #7 for test evidence.]

  We should not use the Secure Boot firmware on the 'q35'
  machine type _either_, as it might not work regardless,
  since other libvirt XML options such as SMM and S3/S4
  disable may be needed for Secure Boot to work, but are
  _not_ configured by OpenStack Ussuri (no SB support).

  
  Approach:
  ===

  Considering how long Focal/Ussuri have been out there
  (and maybe worked with UEFI enabled for some cases?),
  add a config option to _opt-in_ to actually supported
  UEFI loaders for nova/libvirt.

  As this seems to benefit downstream/Ubuntu more (although
  other distros might be affected), add the config option
  "ubuntu_libvirt_uefi_loader_path" (disabled by default)
  in the DEFAULT config section of nova.conf (so it can be
  set in nova-compute charm's 'config-flags' option).

  
  Test Plan:
  ===

  $ openstack image set --property hw_firmware_type=uefi $IMAGE
  $ openstack server create --image $IMAGE --flavor $FLAVOR --network $NETWORK uefi-server

  (with patched packages:)
  Set `ubuntu_libvirt_uefi_loader_path = true` in `[DEFAULT]` in /etc/nova/nova.conf
  (eg `juju config nova-compute config-flags='ubuntu_libvirt_uefi_loader_path=true'`)
  $ openstack server stop uefi-server
  $ openstack server start uefi-server

  - Expected Result:

  The server's libvirt XML uses UEFI _without_ Secure Boot.

          <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>

  The guest boots, and console log confirms UEFI mode:

          $ openstack console log show uefi-server | grep -i -e efi -e bios
          ...
          Creating boot entry "Boot0003" with label "ubuntu" for file "\EFI\ubuntu\shimx64.efi"
          ...
          [    0.000000] efi: EFI v2.70 by EDK II
          [    0.000000] efi:  SMBIOS=0x7fbcd000  ACPI=0x7fbfa000  ACPI 2.0=0x7fbfa014  MEMATTR=0x7eb30018
          [    0.000000] SMBIOS 2.8 present.
          [    0.000000] DMI: OpenStack Foundation OpenStack Nova, BIOS 0.0.0 02/06/2015
          ...

  - Actual Result:

  The server's libvirt XML uses UEFI _with_ Secure Boot.

          <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>

  The guest doesn't boot; the console log is empty; qemu-kvm loops at
  100% CPU.

          $ openstack console log show uefi-server | grep -i -e efi -e bios
          $ openstack console log show uefi-server | wc -l
          0

          $ juju run --app nova-compute 'top -b -d1 -n5 | grep qemu'
            67205 libvirt+  ... 100.0   1.4   1:18.35 qemu-sy+
            67205 libvirt+  ... 100.0   1.4   1:19.36 qemu-sy+
            67205 libvirt+  ...  99.0   1.4   1:20.36 qemu-sy+
            67205 libvirt+  ... 101.0   1.4   1:21.37 qemu-sy+
            67205 libvirt+  ... 100.0   1.4   1:22.38 qemu-sy+

  
  Where problems could occur:
  ===

  The changes are opt-in with `ubuntu_libvirt_uefi_loader_path=true`,
  so users are not affected by default.

  Theoretically, regressions would more likely manifest and be contained
  in nova's libvirt driver, when `hw_firmware_type=uefi` (not by default).

  The expected symptoms of regressions are boot failures (server starts
  from OpenStack's perspective, but doesn't boot to the operating system).

  
  Other Info:
  ===

  - Hypervisor running Ubuntu 20.04 LTS (Focal)
  - Nova packages from Ussuri (Ubuntu Archive) or Victoria (Cloud Archive).
