[Bug 1602057] Re: [SRU] (libvirt) KeyError updating resources for some node, guest.uuid is not in BDM list
Brian Murray
brian at ubuntu.com
Thu May 4 20:48:43 UTC 2017
Hello shiliang, or anyone else affected,
Accepted nova into xenial-proposed. The package will build now and be
available at https://launchpad.net/ubuntu/+source/nova/2:13.1.3-0ubuntu2
in a few hours, and then in the -proposed repository.
Please help us by testing this new package. See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.
If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, and change the tag
from verification-needed to verification-done. If it does not fix the
bug for you, please add a comment stating that, and change the tag to
verification-failed. In either case, details of your testing will help
us make a better decision.
Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance!
** Changed in: nova (Ubuntu Xenial)
Status: Triaged => Fix Committed
** Tags added: verification-needed
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1602057
Title:
[SRU] (libvirt) KeyError updating resources for some node, guest.uuid
is not in BDM list
Status in Ubuntu Cloud Archive:
Fix Released
Status in Ubuntu Cloud Archive mitaka series:
Triaged
Status in Ubuntu Cloud Archive newton series:
Fix Released
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) mitaka series:
Won't Fix
Status in OpenStack Compute (nova) newton series:
Fix Committed
Status in nova package in Ubuntu:
Fix Released
Status in nova source package in Xenial:
Fix Committed
Bug description:
[Impact]
There is a race condition in which the compute resource_tracker
periodic task polls existing instances and checks their BDMs (block
device mappings) before any mappings have been created, e.g. the root
disk mapping for a new instance. This patch ensures that instances
without any BDMs are skipped.
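As a rough illustration of the skip described above, here is a minimal standalone sketch. It is not the actual nova patch: the function name, its parameters and the `size_of` callback are hypothetical, and only the `local_instances[uuid], bdms[uuid]` lookup is taken from the traceback in this report.

```python
def disk_over_committed_total(guest_uuids, local_instances, bdms, size_of):
    """Sum per-guest over-committed disk sizes, skipping guests without BDMs.

    All names here are illustrative stand-ins, not nova's real API.
    """
    total = 0
    for uuid in guest_uuids:
        if uuid not in bdms:
            # Assumed guard: a new instance may not have any block device
            # mappings yet, so skip it instead of raising KeyError.
            continue
        total += size_of(local_instances[uuid], bdms[uuid])
    return total
```

Without the `uuid not in bdms` check, the `bdms[uuid]` lookup would raise KeyError for a guest whose mappings have not been created yet, which matches the failure shown in the traceback.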
[Test Case]
* deploy OpenStack Mitaka with debug logging enabled (not essential but helps)
* create an instance
* delete its BDMs - pastebin.ubuntu.com/24287419/
* watch /var/log/nova/nova-compute.log on the hypervisor hosting the
instance and wait for the next resource_tracker tick
* ensure that the exception mentioned in this LP bug does not occur (it
happens after "Auditing locally available compute resources for node")
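The last two steps can also be checked with a small script instead of watching the log by eye. The sketch below is only a convenience under stated assumptions: the helper name is hypothetical, and the two marker strings are copied from the log messages quoted in this report.

```python
AUDIT_MARKER = "Auditing locally available compute resources for node"
ERROR_MARKER = "Error updating resources for node"


def count_failed_ticks(log_lines):
    """Return (ticks, failures) for an iterable of nova-compute log lines.

    A tick is counted at each audit message; a failure is counted when the
    error message appears after an audit message and before the next one.
    """
    ticks = 0
    failures = 0
    in_tick = False
    for line in log_lines:
        if AUDIT_MARKER in line:
            ticks += 1
            in_tick = True
        elif ERROR_MARKER in line and in_tick:
            failures += 1
            in_tick = False  # count at most one failure per tick
    return ticks, failures
```

To use it, open /var/log/nova/nova-compute.log on the hypervisor and pass the file object to `count_failed_ticks`; with the fixed package, at least one tick should be reported and no failures.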
[Regression Potential]
The resource tracker information is used by the scheduler when
deciding which compute hosts are able to have instances scheduled
to them. With this patch the resource tracker skips instances
that would otherwise contribute to disk overcommit ratios. As such it is
possible that the scheduler will have momentarily skewed information
about resource consumption on that compute host until the next
resource_tracker tick. Since the likelihood of this race condition
occurring is hopefully slim, and provided that users have a reasonable
frequency for the resource_tracker, the likelihood of this becoming a
long term problem is low, since the issue will always be corrected by a
subsequent tick (although if the compute host in question were
saturated, that would not be fixed until an instance was deleted or
migrated).
[Other]
Note that this patch did not make it into the upstream stable/mitaka branch due to the stable cutoff, so the proposal is to carry it in the archive (indefinitely).
--------
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager [req-d5d5d486-b488-4429-bbb5-24c9f19ff2c0 - - - - -] Error updating resources for node controller.
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager Traceback (most recent call last):
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6726, in update_available_resource
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager rt.update_available_resource(context)
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 500, in update_available_resource
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager resources = self.driver.get_available_resource(self.nodename)
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5728, in get_available_resource
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager disk_over_committed = self._get_disk_over_committed_size_total()
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7397, in _get_disk_over_committed_size_total
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager local_instances[guest.uuid], bdms[guest.uuid])
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager KeyError: '0a5c5743-9555-4dfd-b26e-198449ebeee5'
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1602057/+subscriptions