[Bug 1828617] Re: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration
David A. Desrosiers
1828617 at bugs.launchpad.net
Thu Jun 13 16:55:32 UTC 2019
Just adding that I've worked around this issue with the following added
to the lvm2-monitor overrides
(/etc/systemd/system/lvm2-monitor.service.d/custom.conf):
[Service]
ExecStartPre=/bin/sleep 60
This results in 100% success on every boot, with no missed disks and no
missing LVM volumes on those block devices.
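For reference, one way to apply that drop-in (a sketch assuming the
standard systemd override mechanism; the file path is the one above):

sudo mkdir -p /etc/systemd/system/lvm2-monitor.service.d
sudo tee /etc/systemd/system/lvm2-monitor.service.d/custom.conf <<'EOF'
[Service]
ExecStartPre=/bin/sleep 60
EOF
sudo systemctl daemon-reload

Note that an ExecStartPre= added via a drop-in is appended to the unit's
existing ExecStartPre= commands rather than replacing them, so the sleep
simply runs before the stock startup steps.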
We've also disabled NVMe multipathing on every Ceph storage node by
adding the following kernel boot argument in /etc/default/grub:
nvme_core.multipath=0
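For completeness, on a stock Ubuntu install that means editing
/etc/default/grub along these lines (a sketch; the existing contents of
the variable will differ per host) and then regenerating the GRUB
config:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nvme_core.multipath=0"
sudo update-grub

followed by a reboot for the parameter to take effect.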
Note: this LP was cloned from an internal customer case whose Ceph
storage nodes were directly impacted by this issue; the above is the
workaround currently deployed, until/unless we can find a consistent
root cause for this issue in an upstream package.
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/1828617
Title:
Hosts randomly 'losing' disks, breaking ceph-osd service enumeration
Status in ceph package in Ubuntu:
In Progress
Bug description:
Ubuntu 18.04.2 Ceph deployment.
Ceph OSD devices utilizing LVM volumes pointing to udev-based physical devices.
The LVM module is supposed to create PVs from devices using the links in the /dev/disk/by-dname/ directory, which are created by udev.
However, on reboot it sometimes happens (not always; it behaves like a race condition) that the Ceph services cannot start and pvdisplay shows no volumes. By the end of the boot process, however, the /dev/disk/by-dname/ directory does contain all the necessary device links.
The behaviour can be fixed manually by running "/sbin/lvm pvscan
--cache --activate ay /dev/nvme0n1" to re-activate the LVM
components, after which the services can be started.
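As a sketch of the broader manual recovery (the device name above is
just an example; ceph-osd.target is the systemd target shipped by the
Ubuntu Ceph packaging):

# re-scan all block devices and autoactivate eligible LVs
sudo /sbin/lvm pvscan --cache --activate ay
# then bring the OSD services back up
sudo systemctl restart ceph-osd.target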
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1828617/+subscriptions