[Bug 1092108] [NEW] resume_state_on_host_boot fails on instances in error state
James Troup
james.troup at canonical.com
Wed Dec 19 13:24:01 UTC 2012
Public bug reported:
After an unexpected host reboot, all the guests went away. I added
'--start_guests_on_host_boot=true' to /etc/nova/nova.conf and started
up nova-compute. It started some instances but then died on:
2012-12-19 11:11:47 CRITICAL nova [-] Domain not found: no domain with matching name 'instance-000000bb'
2012-12-19 11:11:47 TRACE nova Traceback (most recent call last):
2012-12-19 11:11:47 TRACE nova File "/usr/bin/nova-compute", line 49, in <module>
2012-12-19 11:11:47 TRACE nova service.wait()
2012-12-19 11:11:47 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/service.py", line 413, in wait
2012-12-19 11:11:47 TRACE nova _launcher.wait()
2012-12-19 11:11:47 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/service.py", line 131, in wait
2012-12-19 11:11:47 TRACE nova service.wait()
2012-12-19 11:11:47 TRACE nova File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 166, in wait
2012-12-19 11:11:47 TRACE nova return self._exit_event.wait()
2012-12-19 11:11:47 TRACE nova File "/usr/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
2012-12-19 11:11:47 TRACE nova return hubs.get_hub().switch()
2012-12-19 11:11:47 TRACE nova File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 177, in switch
2012-12-19 11:11:47 TRACE nova return self.greenlet.switch()
2012-12-19 11:11:47 TRACE nova File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 192, in main
2012-12-19 11:11:47 TRACE nova result = function(*args, **kwargs)
2012-12-19 11:11:47 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/service.py", line 101, in run_server
2012-12-19 11:11:47 TRACE nova server.start()
2012-12-19 11:11:47 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/service.py", line 162, in start
2012-12-19 11:11:47 TRACE nova self.manager.init_host()
2012-12-19 11:11:47 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 269, in init_host
2012-12-19 11:11:47 TRACE nova block_device_info)
2012-12-19 11:11:47 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 114, in wrapped
2012-12-19 11:11:47 TRACE nova return f(*args, **kw)
2012-12-19 11:11:47 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 852, in resume_state_on_host_boot
2012-12-19 11:11:47 TRACE nova block_device_info=block_device_info)
2012-12-19 11:11:47 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 790, in _hard_reboot
2012-12-19 11:11:47 TRACE nova virt_dom = self._conn.lookupByName(instance['name'])
2012-12-19 11:11:47 TRACE nova File "/usr/lib/python2.7/dist-packages/libvirt.py", line 2370, in lookupByName
2012-12-19 11:11:47 TRACE nova if ret is None:raise libvirtError('virDomainLookupByName() failed', conn=self)
2012-12-19 11:11:47 TRACE nova libvirtError: Domain not found: no domain with matching name 'instance-000000bb'
2012-12-19 11:11:47 TRACE nova
This instance is in an error state:
RESERVATION r-n1d0t747 c519923c921a404c96ebc8210a4ec67a juju-canonistack2, juju-canonistack2-10
INSTANCE i-000000bb ami-000000bf server-187 server-187 error None (c519923c921a404c96ebc8210a4ec67a, alce) 0 m1.small 2012-07-02T02:12:56.000Z nova monitoring-disabled instance-store
The instance no longer exists on alce. I couldn't find any reasonable
way to kill the instance entirely (ec2-terminate-instances as an admin
user had no effect) or trivially remove it from the database. I ended
up modifying the nova libvirt driver to skip instances it can't find
with the attached patch.
(For the avoidance of doubt, I'm attaching the patch mostly to
illustrate the problem and our workaround, not necessarily for use as
is in the packages or upstream.)
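To make the workaround concrete, here is a rough sketch of the "skip
instances the hypervisor doesn't know about" idea. It is not the attached
patch and not nova's actual code: DomainNotFound, FakeConnection and
resume_instances are hypothetical stand-ins invented for illustration, so
that the example runs without libvirt installed. With the real bindings
one would presumably catch libvirt.libvirtError and check that
get_error_code() equals libvirt.VIR_ERR_NO_DOMAIN rather than swallowing
every error.

```python
# Illustrative sketch only: NOT the attached patch and NOT nova's real code.
# DomainNotFound and FakeConnection are hypothetical stand-ins for
# libvirt.libvirtError and a libvirt connection object.


class DomainNotFound(Exception):
    """Stand-in for the 'Domain not found' libvirtError in the traceback."""


class FakeConnection(object):
    """Stand-in for a libvirt connection that knows a fixed set of domains."""

    def __init__(self, domain_names):
        self._domains = set(domain_names)

    def lookupByName(self, name):
        # Mirrors the failure mode in the traceback: unknown names raise.
        if name not in self._domains:
            raise DomainNotFound("no domain with matching name %r" % name)
        return name


def resume_instances(conn, instance_names):
    """Look up each instance; skip the ones the hypervisor cannot find."""
    resumed, skipped = [], []
    for name in instance_names:
        try:
            conn.lookupByName(name)
        except DomainNotFound:
            # A single missing domain should not abort startup for the rest.
            skipped.append(name)
            continue
        # A real driver would hard-reboot the domain at this point.
        resumed.append(name)
    return resumed, skipped


if __name__ == "__main__":
    conn = FakeConnection(["instance-000000ba", "instance-000000bc"])
    resumed, skipped = resume_instances(
        conn,
        ["instance-000000ba", "instance-000000bb", "instance-000000bc"],
    )
    print("resumed: %s" % resumed)
    print("skipped: %s" % skipped)
```

The point of the design is only that the per-instance lookup failure is
contained inside the loop, so nova-compute can finish init_host instead
of dying on the first stale database record.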
This is all with current Ubuntu 12.04 packages (including
precise-proposed).
** Affects: nova (Ubuntu)
Importance: Undecided
Status: New
** Tags: canonistack
** Patch added: "Skip instances which can't be found in hard_reboot"
https://bugs.launchpad.net/bugs/1092108/+attachment/3463885/+files/diff.txt
** Tags added: canonistack
--
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to nova in Ubuntu.
https://bugs.launchpad.net/bugs/1092108