[Bug 1834875] Re: cloud-init growpart race with udev
Dimitri John Ledkov
launchpad at surgut.co.uk
Fri Nov 1 13:28:58 UTC 2019
Some observations:
* growpart uses sfdisk without the --no-tell-kernel option, meaning that it does notify the kernel about partition changes
* growpart later calls partx, which may be redundant and cause no changes or events
* as a side note, partprobe and blockdev --rereadpt can also be used to reread partition tables; I'm not sure what the difference between them is
* growpart does not take an exclusive lock on the device, which means sgdisk is known to race with udev events
IMHO the sequence of commands should be:
* take flock on the device, to neutralise udev
* modify device with sfdisk
* reread the partition table (I would say with blockdev --rereadpt, rather than partx/partprobe)
* release the flock
* udevadm trigger --action=add --wait device (or trigger && settle)
This ensures that no udev events are processed for the device
whilst we are operating on it and rereading its partitions. Then we
release the lock, at which point everything has to be quiet and steady:
trigger, settle, done.
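The sequence above could be sketched in Python roughly as follows. This is only an illustration under assumptions, not growpart's or cloud-init's actual code: the function names and the `run` parameter are hypothetical, and feeding `, +` to `sfdisk -N` is just one common way to grow a partition to the maximum size. It uses the "trigger && settle" variant of the last step.

```python
import fcntl
import os
import subprocess


def post_unlock_commands(disk):
    """udev commands to run after the lock is released (trigger, then settle)."""
    return [
        ["udevadm", "trigger", "--action=add", disk],
        ["udevadm", "settle"],
    ]


def grow_partition_locked(disk, partnum, run=subprocess.run):
    """Grow partition `partnum` on `disk` while holding an exclusive flock.

    `run` is injectable (hypothetical parameter) so the command sequence
    can be checked without touching a real disk.
    """
    fd = os.open(disk, os.O_RDONLY)
    try:
        # udevd skips event handling for a whole-disk device that is
        # exclusively locked, per the sfdisk man page quoted below.
        fcntl.flock(fd, fcntl.LOCK_EX)
        # Assumption: ", +" asks sfdisk to keep the start and use all free space.
        run(["sfdisk", "-N", str(partnum), disk], input=b", +\n", check=True)
        # Reread the partition table while udev is still held off.
        run(["blockdev", "--rereadpt", disk], check=True)
        fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)  # closing the descriptor also drops the lock on error
    # Only now let udev process the device, and wait until it is done.
    for cmd in post_unlock_commands(disk):
        run(cmd, check=True)
```

Calling `grow_partition_locked("/dev/sda", 1)` as root would run the four commands in order; a caller should only look up /dev/disk/by-partuuid/... links after the final settle has returned.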
See:
sfdisk uses BLKRRPART (reread partition table) ioctl to make sure that the device is not used
by system or another tools (see also --no-reread). It's possible that this feature or another
sfdisk activity races with udevd. The recommended way how to avoid possible collisions is to
use exclusive flock for the whole-disk device to serialize device access. The exclusive lock
will cause udevd to skip the event handling on the device. For example:
    flock /dev/sdc sfdisk /dev/sdc
Note, this semantic is not currently supported by udevd for MD and DM devices.
at http://manpages.ubuntu.com/manpages/eoan/en/man8/sfdisk.8.html
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1834875
Title:
cloud-init growpart race with udev
Status in cloud-init:
Incomplete
Status in cloud-utils:
New
Status in linux-azure package in Ubuntu:
New
Status in systemd package in Ubuntu:
New
Bug description:
On Azure, cloud-init's growpart module regularly (20-30% of the time)
fails to extend the partition to full size.
For example:
========================================
2019-06-28 12:24:18,666 - util.py[DEBUG]: Running command ['growpart', '--dry-run', '/dev/sda', '1'] with allowed return codes [0] (shell=False, capture=True)
2019-06-28 12:24:19,157 - util.py[DEBUG]: Running command ['growpart', '/dev/sda', '1'] with allowed return codes [0] (shell=False, capture=True)
2019-06-28 12:24:19,726 - util.py[DEBUG]: resize_devices took 1.075 seconds
2019-06-28 12:24:19,726 - handlers.py[DEBUG]: finish: init-network/config-growpart: FAIL: running config-growpart with frequency always
2019-06-28 12:24:19,727 - util.py[WARNING]: Running module growpart (<module 'cloudinit.config.cc_growpart' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py'>) failed
2019-06-28 12:24:19,727 - util.py[DEBUG]: Running module growpart (<module 'cloudinit.config.cc_growpart' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py'>) failed
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 812, in _run_modules
    freq=freq)
  File "/usr/lib/python3/dist-packages/cloudinit/cloud.py", line 54, in run
    return self._runners.run(name, functor, args, freq, clear_on_fail)
  File "/usr/lib/python3/dist-packages/cloudinit/helpers.py", line 187, in run
    results = functor(*args)
  File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 351, in handle
    func=resize_devices, args=(resizer, devices))
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2521, in log_time
    ret = func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 298, in resize_devices
    (old, new) = resizer.resize(disk, ptnum, blockdev)
  File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 159, in resize
    return (before, get_size(partdev))
  File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 198, in get_size
    fd = os.open(filename, os.O_RDONLY)
FileNotFoundError: [Errno 2] No such file or directory: '/dev/disk/by-partuuid/a5f2b49f-abd6-427f-bbc4-ba5559235cf3'
========================================
@rcj suggested this is a race with udev. This seems to only happen on
Cosmic and later.