[Bug 1828558] Re: installing ubuntu on a former md raid volume makes system unusable

Michael Hudson-Doyle mwhudsonlp at fastmail.fm
Mon Jul 29 01:12:07 UTC 2019


Having looked into this a bit, I'm surprised -- assuming the drive was
previously part of a raid array with metadata version 0.90 -- that
your system ever booted at all. Partman creates a disk label with
ped_disk_new_fresh, which arranges for ped_disk_clobber to be called on
the disk. That zeroes the first and last 10 KiB of the disk, which
wipes out the mdraid superblock for every other metadata version. So
if you use a device in an MD raid array with 0.90 metadata and then
install to it with ubiquity, you end up with a disk that has both a
partition table and raid metadata.
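
If I'm remembering the on-disk format correctly, the 0.90 superblock
lives in the last 64-128 KiB of the device, well outside the 10 KiB
that parted clobbers, which is why it survives. You can check whether
a disk is in this state with something like the following (the device
name is only an example):

# reports any leftover superblock, including its metadata version
sudo mdadm --examine /dev/sda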

udev gets its information about block devices from libblkid and this, in
general, seems to check for raid metadata before it checks for a
partition table:

mwhudson@ringil:~/images$ blkid --probe raid1.img
raid1.img: VERSION="0.90.0" UUID="c9d611d5-1d1e-839b-14d5-894fb9296617" TYPE="linux_raid_member" USAGE="raid"
mwhudson@ringil:~/images$ blkid --probe --usage filesystem raid1.img
raid1.img: PTUUID="c5e0e910" PTTYPE="dos"
mwhudson@ringil:~/images$ blkid --probe --usage raid raid1.img
raid1.img: VERSION="0.90.0" UUID="c9d611d5-1d1e-839b-14d5-894fb9296617" TYPE="linux_raid_member" USAGE="raid"

Watching udevadm monitor while attaching a block device like this does
show device nodes for the partitions appearing very briefly, so it's
possible that your first reboots somehow won the race and managed to
mount the partition before the device node went away again -- but in
my testing the node for the partition is only present for a few
milliseconds. I guess it might stick around for longer in the busy
environment of early boot. (When I tried to recreate your setup in a
VM, the installed system didn't boot even once.)
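
If anyone wants to reproduce the observation, something along these
lines should show the partition nodes being added and then removed
again shortly afterwards:

# watch block-device uevents while plugging in / re-reading the disk
sudo udevadm monitor --udev --subsystem-match=block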

ANYWAY, the fix for this is clearly for the installer to clear the md
metadata somehow. I think one could make an argument that parted
should do this, but there might be subtle reasons I don't know about
that would make it a bad idea. The blunter approach would be to jam a
call to "mdadm --zero-superblock" in somewhere.
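
Concretely, I mean something like the following, run against the whole
disk during partitioning -- the device name is only an example, and
where exactly in partman/ubiquity the call would live is the open
question:

# remove any md superblock, whatever its metadata version
sudo mdadm --zero-superblock /dev/sda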

As for your followup comment:

> Zeroing the md superblock in ubiquity - if that's what you are thinking about, will fix this
> issue in my particular scenario, but what if disk was partitioned in some other way?

I actually think inserting a call to --zero-superblock is more or less
safe, because all the other superblocks I know about will already be
wiped by the clobbering parted does (there are probably some obscure
ones that will not). We could run wipefs -a instead to get all the
superblocks libblkid (and hence udev!) knows about. Or of course we
could zero the entire device, but that's likely to be unacceptably
slow.
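
For comparison, wipefs with no options just lists the signatures
libblkid can see, and -a erases all of them (device name again only an
example):

# list every signature libblkid detects on the disk
sudo wipefs /dev/sda
# erase all of them
sudo wipefs --all /dev/sda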

> I am
> afraid it is a partial workaround, the proper albeit more complex way of handling this issue
> is to make sure that properly formatted partition table always takes precedence over leftover
> superblocks during boot.

I don't think there really is a way of choosing which of the superblocks
on a device you want to respect. I suppose in theory one could be added,
but this gets way further into kernel uevent/udev land than I am
confident even speculating about.

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to ubiquity in Ubuntu.
https://bugs.launchpad.net/bugs/1828558

Title:
  installing ubuntu on a former md raid volume makes system unusable

Status in ubiquity package in Ubuntu:
  New

Bug description:
  18.04 is installed using the GUI installer in 'Guided - use entire
  volume' mode on a disk which was previously used as an md raid 6
  volume. The installer repartitions the disk and installs the system,
  and the system reboots any number of times without issues. Then
  packages are upgraded to their current versions and some new packages
  are installed, including mdadm, which *might* be the culprit. After
  that the system won't boot any more, dropping to the initramfs prompt
  with a 'gave up waiting for root filesystem device' message; at this
  point blkid shows the boot disk as a device with
  TYPE='linux_raid_member', not as two partitions for EFI and root
  (/dev/sda, not /dev/sda1 and /dev/sda2). I was able to fix this issue
  by zeroing the whole disk (dd if=/dev/zero of=/dev/sda bs=4096) and
  reinstalling. Probably the md superblock is not destroyed when the
  disk is partitioned by the installer, is not overwritten by the
  installed files, and somehow takes precedence over the partition
  table (GPT) during boot.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ubiquity/+bug/1828558/+subscriptions


