[Bug 2054672] Re: Deploying a server with bcache on top of HDD and mdadm can frequently fail

Fri Feb 23 22:04:45 UTC 2024

** Merge proposal linked:
   https://code.launchpad.net/~alexsander-souza/curtin/+git/curtin/+merge/461160

** Changed in: maas
       Status: New => Triaged

** Changed in: maas
   Importance: Undecided => Critical

** Changed in: maas
     Assignee: (unassigned) => Alexsander de Souza (alexsander-souza)

** Changed in: maas
    Milestone: None => 3.5.0

** Also affects: maas/3.3
   Importance: Undecided
       Status: New

** Also affects: maas/3.4
   Importance: Undecided
       Status: New

https://bugs.launchpad.net/bugs/2054672

Title:
  Deploying a server with bcache on top of HDD and mdadm can frequently
  fail

Status in curtin:
  New
Status in MAAS:
  Triaged
Status in MAAS 3.3 series:
  New
Status in MAAS 3.4 series:
  New

Bug description:
  Environment :
  * MAAS 3.3 and 3.4
  * Ubuntu 22.04
  * deployment / commissioning OS : 20.04 and 22.04
  * Servers to deploy with slow drives such as HDD

  When deploying a server that uses bcache as the device for its rootfs,
  especially on top of software RAID (mdadm) with slow drives such as
  hard drives, the Ubuntu installation frequently fails at the storage
  configuration step.

  #
  # Reproducer :
  #
  The slow-drive environment can be recreated with libvirt using the following setup :
  1) Create 6 or more VMs (see the script "create-slow-vms.sh" for the exact commands), each with :
   * 3 vCPUs
   * 4 GB of RAM
   * 3 disks :
     * 1 x 10 GB fast, used as the bcache cache device
     * 2 x 30 GB with limited I/O (150 IOPS, 30 MB/s maximum throughput)

  2) With the following disk topology (see reproducer-storage-config.png) :
   * /dev/vda --> 2 partitions
     - 1GB for md0
     - 29GB for md1
   * /dev/vdb --> 2 partitions
     - 1GB for md0
     - 29GB for md1
   * /dev/md0 --> ext4 for /boot
   * /dev/vdc (fast drive) --> bcache0 cache set
   * /dev/md1 --> bcache0 backing device
   * /dev/bcache0 --> ext4 for /

  3) Deploy Ubuntu 22.04 to all VMs
  --> some of the VMs will fail with the same curtin error

  4) (Optional) Not erasing the drives when releasing the server, and
  then redeploying it right away, seems to greatly increase the
  likelihood of a failed deployment.
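
  As an illustration of the I/O limits in step 1, libvirt can throttle a
  disk through the <iotune> element of the domain XML. The sketch below
  only generates such a <disk> element; it is not the content of
  "create-slow-vms.sh". The function name, the image path in the usage
  note, the qcow2/virtio choices, and reading "30MB/s" as 30 MiB/s are
  all assumptions.

```python
import xml.etree.ElementTree as ET


def throttled_disk_xml(image_path, target_dev, iops=150, mib_per_sec=30):
    """Build a libvirt <disk> element with I/O throttling (hypothetical helper)."""
    disk = ET.Element("disk", type="file", device="disk")
    # Assumption: qcow2 images attached over virtio, which gives /dev/vdX names.
    ET.SubElement(disk, "driver", name="qemu", type="qcow2")
    ET.SubElement(disk, "source", file=image_path)
    ET.SubElement(disk, "target", dev=target_dev, bus="virtio")
    # <iotune> caps combined read/write IOPS and bytes per second.
    iotune = ET.SubElement(disk, "iotune")
    ET.SubElement(iotune, "total_iops_sec").text = str(iops)
    ET.SubElement(iotune, "total_bytes_sec").text = str(mib_per_sec * 1024 * 1024)
    return ET.tostring(disk, encoding="unicode")
```

  For example, throttled_disk_xml("/var/lib/libvirt/images/slow-vm1-vda.qcow2", "vda")
  returns the XML for one of the two slow 30 GB disks; the fast cache
  disk would simply omit the <iotune> element.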

  #
  # logs
  #
  I'm attaching some more logs to the bug report :
  * quick-summary-logs.txt --> some logs from baremetal servers on customer's hardware.
  * reproducer-installation-output.txt --> full installation output from a failing deployment in my reproducer test.

  # theory
  At first glance, this looks like a race condition, because retrying the Ubuntu deployment on the same server may succeed.
  It may be triggered because the hard drives are already busy with mdadm syncing the disks, and they become even slower when a change such as creating a bcache backing device is requested; curtin then fails because of the race.
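
  One way to check the "mdadm is still syncing" part of this theory on a
  failing node is to look at /proc/mdstat while the storage step runs.
  The sketch below is only a diagnostic aid, not something curtin does;
  the function name is hypothetical, and it only recognises progress
  lines of the form "resync = N%" or "recovery = N%" (a delayed or
  pending resync is not detected).

```python
import re

# Progress lines in /proc/mdstat look like "... resync =  5.9% (...) finish=..."
SYNC_RE = re.compile(r"(resync|recovery)\s*=\s*([0-9.]+)%")
HEADER_RE = re.compile(r"^(md\d+)\s*:")


def md_sync_progress(mdstat_text, md_name):
    """Return (action, percent) if md_name is syncing, else None."""
    current = None
    for line in mdstat_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            # A new array section starts, e.g. "md1 : active raid1 ..."
            current = header.group(1)
            continue
        if current == md_name:
            progress = SYNC_RE.search(line)
            if progress:
                return progress.group(1), float(progress.group(2))
    return None
```

  Calling md_sync_progress(open("/proc/mdstat").read(), "md1") just
  before the bcache step would show whether md1 was still resyncing at
  the time of the failure.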

  On a large deployment such as OpenStack, this makes the installation
  process cumbersome, as one or more servers may randomly fail to
  deploy.

  Looking at the installation output from MAAS, curtin seems to fail to confirm the bcache backing device.
  # main differences
  ## working
  2024-02-06T10:09:43+00:00 server-node3 cloud-init[2701]: check just created bcache /dev/md1 if it is registered, try=2
  2024-02-06T10:09:43+00:00 server-node3 cloud-init[2701]: Running command ['udevadm', 'settle'] with allowed return codes [0] (capture=False)
  2024-02-06T10:09:43+00:00 server-node3 cloud-init[2701]: TIMED udevadm_settle(): 0.018
  2024-02-06T10:09:43+00:00 server-node3 cloud-init[2701]: Found bcache dev /dev/md1 at expected path /sys/class/block/md1/bcache
  2024-02-06T10:09:43+00:00 server-node3 cloud-init[2701]: validating bcache backing device '/dev/md1' from sys_path '/sys/class/block/md1/bcache'
  2024-02-06T10:09:43+00:00 server-node3 cloud-init[2701]: bcache device /sys/class/block/md1/bcache using bcache kname: bcache6
  2024-02-06T10:09:44+00:00 server-node3 cloud-init[2701]: bcache device /sys/class/block/md1/bcache has slaves: ['md1']

  ## non-working
  2024-02-06T10:09:52+00:00 server-node1 cloud-init[2698]: check just created bcache /dev/md1 if it is registered, try=2
  2024-02-06T10:09:52+00:00 server-node1 cloud-init[2698]: Running command ['udevadm', 'settle'] with allowed return codes [0] (capture=False)
  2024-02-06T10:09:52+00:00 server-node1 cloud-init[2698]: TIMED udevadm_settle(): 0.019
  2024-02-06T10:09:52+00:00 server-node1 cloud-init[2698]: Found bcache dev /dev/md1 at expected path /sys/class/block/md1/bcache
  2024-02-06T10:09:52+00:00 server-node1 cloud-init[2698]: validating bcache backing device '/dev/md1' from sys_path '/sys/class/block/md1/bcache'
  2024-02-06T10:09:52+00:00 server-node1 cloud-init[2698]: bcache dev /dev/md1 at path /sys/class/block/md1/bcache successfully registered on attempt 2/60
  2024-02-06T10:09:52+00:00 server-node1 cloud-init[2698]: devname '/dev/md1' had holders: []
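
  In the non-working case the backing device is reported as registered,
  but its holders list is still empty. Assuming that list comes from the
  sysfs "holders" directory of the block device (for example
  /sys/class/block/md1/holders), a retry loop along the following lines
  would tolerate a slow device. This is a sketch of the idea only, not
  curtin's code or the linked merge proposal; the function name and the
  default retry values are hypothetical.

```python
import os
import time


def wait_for_holders(sys_block_path, attempts=60, delay=1.0):
    """Poll <sys_block_path>/holders until it is non-empty or attempts run out.

    Returns the sorted list of holder names, or [] if none appeared.
    """
    holders_dir = os.path.join(sys_block_path, "holders")
    for attempt in range(1, attempts + 1):
        try:
            holders = sorted(os.listdir(holders_dir))
        except FileNotFoundError:
            # The directory may not exist yet on a slow device.
            holders = []
        if holders:
            return holders
        if attempt < attempts:
            time.sleep(delay)
    return []
```

  With the topology above, wait_for_holders("/sys/class/block/md1")
  would be expected to eventually return the bcache device sitting on
  top of md1, instead of failing on the first empty read.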
