[Bug 1818680] Re: booting should succeed even if vault is unavailable
James Page
james.page at ubuntu.com
Wed Jun 12 10:00:22 UTC 2019
I've uploaded for bionic-backports and raised the associated bug task;
with regard to xenial, this will be delivered via the Ubuntu Cloud
Archive.
** Also affects: bionic-backports
Importance: Undecided
Status: New
** Changed in: bionic-backports
Status: New => In Progress
** Changed in: bionic-backports
Assignee: (unassigned) => James Page (james-page)
** Also affects: cloud-archive
Importance: Undecided
Status: New
** Also affects: cloud-archive/queens
Importance: Undecided
Status: New
** Changed in: cloud-archive
Status: New => Invalid
** Changed in: cloud-archive/queens
Status: New => Triaged
** Changed in: cloud-archive/queens
Importance: Undecided => High
** Changed in: cloud-archive/queens
Assignee: (unassigned) => James Page (james-page)
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1818680
Title:
booting should succeed even if vault is unavailable
Status in Bionic Backports:
In Progress
Status in OpenStack ceph-osd charm:
Invalid
Status in Ubuntu Cloud Archive:
Invalid
Status in Ubuntu Cloud Archive queens series:
Triaged
Status in vaultlocker:
Fix Committed
Status in vaultlocker package in Ubuntu:
Fix Released
Status in vaultlocker source package in Cosmic:
Fix Released
Status in vaultlocker source package in Disco:
Fix Released
Bug description:
[Impact]
Decryption of vaultlocker-encrypted block devices blocks network-online.target; this means that if vault is hosted on the same hardware that uses vaultlocker for encryption at rest, the servers will fail to boot fully if they are all rebooted at the same time.
[Test Case]
Deploy ceph+vaultlocker+vault
Power cycle all servers
Servers never reach multi-user.target because the vaultlocker-decrypt services block network-online.target, so LXD containers never get started.
[Regression Potential]
The proposed fix drops the Before=network-online.target stanza from the vaultlocker-decrypt systemd unit, so the impact should be minimal.
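One way to check whether an installed unit still carries the ordering described above is a small script along these lines. This is only a sketch: the function name orders_before and the sample unit contents are hypothetical and are not taken from the vaultlocker package; the real unit file will contain other directives.

```python
def orders_before(unit_text: str, target: str = "network-online.target") -> bool:
    """Return True if any Before= line in the unit text names the target."""
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip comments and anything that is not a key=value directive.
        if line.startswith(("#", ";")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Before= may list several space-separated units.
        if key.strip() == "Before" and target in value.split():
            return True
    return False


# Hypothetical unit contents, for illustration only.
OLD_UNIT = """\
[Unit]
Description=example decrypt unit (hypothetical)
Before=network-online.target
"""
# The same unit with the ordering stanza dropped, as the fix proposes.
NEW_UNIT = OLD_UNIT.replace("Before=network-online.target\n", "")

if __name__ == "__main__":
    print("old unit orders before network-online.target:", orders_before(OLD_UNIT))
    print("new unit orders before network-online.target:", orders_before(NEW_UNIT))
```

To check a real system, the text of the installed unit could be read from disk and passed to the same function.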
[Original bug report]
If ceph is using vault secrets to encrypt its volumes and vault is not available, booting is not possible without manual intervention, as the ceph-volume and vaultlocker-decrypt services will hang forever.
In case of a full cloud outage, bootstrapping the mysql and vault nodes will require quite a bit of manual intervention, as all required nodes will have to be booted in single user mode to bypass the volume decryption services.
Decryption of the ceph volumes should instead time out, and allow the
rest of the machine to complete the boot sequence.
To manage notifications about this bug go to:
https://bugs.launchpad.net/bionic-backports/+bug/1818680/+subscriptions