[Bug 1953431] Re: [SRU] fix unsafe access in unregister_conn()
gerald.yang
1953431 at bugs.launchpad.net
Thu Dec 9 08:14:22 UTC 2021
** Description changed:
[Impact]
Without holding sufficient locks, accept_conn and shutdown_connections in AsyncMessenger can cause OSD processes to crash.
[Test plan]
+ Run ceph_test_async_networkstack repeatedly; this triggers many async messenger events. Make sure that:
+ 1. the OSD process does not crash
+ 2. no deadlock occurs
[Where problems could occur]
Holding sufficient locks and decrementing l_msgr_active_connections prevents OSD processes from crashing due to the race condition.
The only problem I can imagine is a deadlock: one process holds lock A and waits for lock B, while another process holds lock B and waits for lock A.
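The lock A / lock B scenario can be illustrated with a small C++17 sketch. This is not the Ceph code or the attached patch: ToyMessenger, conn_lock, counter_lock and stress are hypothetical names, and the counter only stands in for l_msgr_active_connections. It assumes two mutexes that several operations need at the same time, and uses std::scoped_lock, which acquires multiple mutexes with a deadlock-avoidance algorithm, so that the order in which each caller names them does not matter.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical stand-in for a messenger; NOT Ceph's AsyncMessenger.
class ToyMessenger {
  std::mutex conn_lock;     // "lock A": guards conns
  std::mutex counter_lock;  // "lock B": guards active
  std::set<int> conns;      // registered connection ids
  long active = 0;          // stand-in for an active-connections counter

public:
  // Register a connection and bump the counter under both locks.
  void accept_conn(int id) {
    std::scoped_lock guard(conn_lock, counter_lock);
    if (conns.insert(id).second) ++active;
  }

  // Remove a connection. The mutexes are named in the opposite order on
  // purpose: std::scoped_lock still acquires them without deadlocking.
  void unregister_conn(int id) {
    std::scoped_lock guard(counter_lock, conn_lock);
    if (conns.erase(id) == 1) --active;
  }

  // Drop every connection and reset the counter in one critical section.
  void shutdown_connections() {
    std::scoped_lock guard(conn_lock, counter_lock);
    // Invariant: the counter always matches the set while both locks are held.
    assert(active == static_cast<long>(conns.size()));
    conns.clear();
    active = 0;
  }

  long active_connections() {
    std::scoped_lock guard(counter_lock);
    return active;
  }
};

// Hammer the messenger from several threads and return the final counter.
long stress(int threads, int rounds) {
  ToyMessenger m;
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t) {
    pool.emplace_back([&m, t, rounds] {
      for (int i = 0; i < rounds; ++i) {
        int id = t * rounds + i;  // unique id per thread and round
        m.accept_conn(id);
        m.unregister_conn(id);
      }
    });
  }
  // One extra thread shuts connections down while the others are running.
  pool.emplace_back([&m] {
    for (int i = 0; i < 100; ++i) m.shutdown_connections();
  });
  for (auto& th : pool) th.join();
  m.shutdown_connections();
  return m.active_connections();
}
```

Calling stress(4, 1000) should return 0 without hanging. If the two-mutex operations instead took the locks one at a time in opposite orders, the same workload could deadlock in exactly the way described above.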
-
[Other info]
from upstream:
octopus backport tracker
https://tracker.ceph.com/issues/50482
octopus backport PR
https://github.com/ceph/ceph/pull/43325
** Patch added: "0001-msgr-async-fix-unsafe-access-in-unregister_conn.patch"
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1953431/+attachment/5546429/+files/0001-msgr-async-fix-unsafe-access-in-unregister_conn.patch
** Tags added: sts sts-sru-needed verification-needed-focal
** Also affects: ceph (Ubuntu Focal)
Importance: Undecided
Status: New
** Changed in: ceph (Ubuntu Focal)
Status: New => In Progress
** Changed in: ceph (Ubuntu Focal)
Assignee: (unassigned) => gerald.yang (gerald-yang-tw)
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/1953431
Title:
[SRU] fix unsafe access in unregister_conn()
Status in ceph package in Ubuntu:
In Progress
Status in ceph source package in Focal:
In Progress
Bug description:
[Impact]
Without holding sufficient locks, accept_conn and shutdown_connections in AsyncMessenger can cause OSD processes to crash.
[Test plan]
Run ceph_test_async_networkstack repeatedly; this triggers many async messenger events. Make sure that:
1. the OSD process does not crash
2. no deadlock occurs
[Where problems could occur]
Holding sufficient locks and decrementing l_msgr_active_connections prevents OSD processes from crashing due to the race condition.
The only problem I can imagine is a deadlock: one process holds lock A and waits for lock B, while another process holds lock B and waits for lock A.
[Other info]
from upstream:
octopus backport tracker
https://tracker.ceph.com/issues/50482
octopus backport PR
https://github.com/ceph/ceph/pull/43325
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1953431/+subscriptions