[Bug 1978489] Re: libvirt / cgroups v2: cannot boot instance with more than 16 CPUs
Stefan Lupsa
1978489 at bugs.launchpad.net
Mon Feb 12 08:29:23 UTC 2024
> - comment #12: it shows a live migration problem which I'm not sure is
> due to this bug, or something else.
The patch fixes the problem on jammy.
The same patch should also be made available in the cloud archive
(cloud:focal-yoga) so that VMs can be migrated to a jammy node during
an upgrade. Without it, any instance with >= 10 vCPUs cannot be
live-migrated to an upgraded node (running jammy), even if that node
already has the patch; the migration fails with the same error. It is
the same problem, and the patch addresses it by updating the CPU shares
of the instance during the migration. That update is performed by the
migration source, which in a Juju environment would be an Ubuntu focal
host running the yoga cloud archive.
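
To illustrate what "updating the CPU shares during the migration" could
mean on the source side, here is a minimal sketch. It is not the code
from the patch: the helper name cap_cpu_shares is hypothetical, and
capping the value at the cgroups v2 maximum is an assumption made for
the example (this comment does not say whether the patch caps,
recomputes, or drops the value).

```python
import xml.etree.ElementTree as ET

# Maximum accepted by cgroups v2, as stated in the bug description.
CGROUP_V2_CPU_WEIGHT_MAX = 10000


def cap_cpu_shares(domain_xml: str, limit: int = CGROUP_V2_CPU_WEIGHT_MAX) -> str:
    """Hypothetical helper: cap domain/cputune/shares at `limit`.

    Assumption: capping is only one possible way to "update" the shares;
    it is used here purely for illustration.
    """
    root = ET.fromstring(domain_xml)
    shares = root.find("./cputune/shares")
    if shares is not None and shares.text and int(shares.text) > limit:
        shares.text = str(limit)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # A 16-vCPU guest with the default of 1024 shares per CPU.
    xml = "<domain><cputune><shares>16384</shares></cputune></domain>"
    print(cap_cpu_shares(xml))
```

The point is only that the XML sent to the destination has to be
adjusted by the source host, which is why the source also needs the fix.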
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to nova in Ubuntu.
https://bugs.launchpad.net/bugs/1978489
Title:
libvirt / cgroups v2: cannot boot instance with more than 16 CPUs
Status in OpenStack Compute (nova):
In Progress
Status in nova package in Ubuntu:
Confirmed
Status in nova source package in Jammy:
Fix Committed
Bug description:
Description
===========
Using the libvirt driver and a host OS that uses cgroups v2 (RHEL 9,
Ubuntu Jammy), an instance with 10 or more CPUs cannot be booted.
Steps to reproduce
==================
1. Boot an instance with 10 (or more) CPUs on RHEL 9 or Ubuntu Jammy
using Nova with the libvirt driver.
Expected result
===============
Instance boots.
Actual result
=============
Instance fails to boot with a 'Value specified in CPUWeight is out of
range' error.
Environment
===========
Originally reported as a libvirt bug in RHEL 9 [1]
Additional information
======================
This is happening because Nova defaults to 1024 * (# of CPUs) for the
value of domain/cputune/shares in the libvirt XML. This is then passed
directly by libvirt to the cgroups API, but cgroups v2 has a maximum
value of 10000. 10000 / 1024 ~= 9.77, so the default exceeds the
maximum for any instance with 10 or more CPUs.
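
The arithmetic above can be checked with a short sketch. The constant
and function names are made up for the example; the two numbers (1024
shares per CPU, maximum of 10000) come from the description.

```python
# Values taken from the bug description; the names are illustrative only.
SHARES_PER_CPU = 1024
CGROUP_V2_CPU_WEIGHT_MAX = 10000


def default_cpu_shares(cpus: int) -> int:
    """Default domain/cputune/shares value: 1024 * (# of CPUs)."""
    return SHARES_PER_CPU * cpus


def exceeds_cgroup_v2_limit(cpus: int) -> bool:
    """True if the default shares value is above the cgroups v2 maximum."""
    return default_cpu_shares(cpus) > CGROUP_V2_CPU_WEIGHT_MAX


if __name__ == "__main__":
    for cpus in (8, 9, 10, 16):
        print(cpus, default_cpu_shares(cpus), exceeds_cgroup_v2_limit(cpus))
```

With these numbers, 9 CPUs gives 9216 (accepted) and 10 CPUs gives
10240 (out of range).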
[1] https://bugzilla.redhat.com/show_bug.cgi?id=2035518
====================================
Ubuntu SRU Details:
[Impact]
See above.
[Test Case]
See above.
[Regression Potential]
We've had this change in other jammy-based versions of the nova package
for a while now, including zed, antelope, and bobcat.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1978489/+subscriptions