[Bug 1993149] Re: VMs stay stuck in scheduling when rabbitmq leader unit is down
Brian Murray
1993149 at bugs.launchpad.net
Wed Oct 26 12:24:44 UTC 2022
Hello Aqsa, or anyone else affected,
Accepted python-oslo.messaging into jammy-proposed. The package will
build now and be available at
https://launchpad.net/ubuntu/+source/python-oslo.messaging/12.13.0-0ubuntu1.1
in a few hours, and then in the -proposed repository.
Please help us by testing this new package. See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.
If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and what testing has
been performed on the package, and change the tag from
verification-needed-jammy to verification-done-jammy. If it does not fix
the bug for you, please add a comment stating that, and change the tag to
verification-failed-jammy. In either case, without details of your
testing we will not be able to proceed.
Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance for helping!
N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.
** Changed in: python-oslo.messaging (Ubuntu Jammy)
Status: Triaged => Fix Committed
** Tags added: verification-needed-jammy
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1993149
Title:
VMs stay stuck in scheduling when rabbitmq leader unit is down
Status in OpenStack RabbitMQ Server Charm:
Triaged
Status in Ubuntu Cloud Archive:
Triaged
Status in Ubuntu Cloud Archive yoga series:
Triaged
Status in Ubuntu Cloud Archive zed series:
Triaged
Status in oslo.messaging:
New
Status in python-oslo.messaging package in Ubuntu:
Fix Committed
Status in python-oslo.messaging source package in Jammy:
Fix Committed
Status in python-oslo.messaging source package in Kinetic:
Fix Committed
Bug description:
When testing rabbitmq-server HA in our OpenStack Yoga cloud
environment (Rabbitmq Server release 3.9/stable) we faced the
following issues:
- When the leader unit is down we are unable to launch any VMs and the
launched ones stay stuck in the 'BUILD' state.
- While checking the logs we see that several OpenStack services have
issues communicating with the rabbitmq-server
- After restarting all the services using rabbitmq (like Nova, Cinder,
Neutron etc) the issue gets resolved and the VMs can be launched
successfully
The corresponding logs are available at:
https://pastebin.ubuntu.com/p/Bk3yktR8tp/
We also observed the same behaviour when the rabbitmq-server unit that
is listed first in the 'nova.conf' file goes down; after restarting the
affected rabbitmq unit, scheduling of VMs works fine again.
This can also be seen in this part of the log:
"Reconnected to AMQP server on 192.168.34.251:5672 via [amqp] client with port 41922."
====== Ubuntu SRU Details =======
[Impact]
Active/active HA for rabbitmq is broken when a node goes down.
[Test Case]
Deploy openstack with 3 units of rabbitmq in active/active HA.
[Regression Potential]
Due to the criticality of this issue, I've decided to revert the upstream change that is causing the problem as a stop-gap until a proper fix is in place. The change being reverted came in via https://bugs.launchpad.net/oslo.messaging/+bug/1935864. As a result of the revert we may see performance degradation in polling, as described in that bug.
To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-rabbitmq-server/+bug/1993149/+subscriptions