[Bug 2025563] Re: System can not shutdown if system has multiple VROC RAID arrays

Łukasz Zemczak 2025563 at bugs.launchpad.net
Thu Aug 24 18:10:47 UTC 2023


Hello Cyrus, or anyone else affected,

Accepted systemd into jammy-proposed. The package will build now and be
available at
https://launchpad.net/ubuntu/+source/systemd/249.11-0ubuntu3.10 in a few
hours, and then in the -proposed repository.

Please help us by testing this new package. See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will help us get this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug
mentioning the version of the package you tested and what testing you
performed, and change the tag from verification-needed-jammy to
verification-done-jammy. If it does not fix the bug for you, please add
a comment stating that, and change the tag to verification-failed-jammy.
In either case, without details of your testing we will not be able to
proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: systemd (Ubuntu Jammy)
       Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-jammy

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/2025563

Title:
  System can not shutdown if system has multiple VROC RAID arrays

Status in OEM Priority Project:
  In Progress
Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Jammy:
  Fix Committed
Status in systemd source package in Kinetic:
  Fix Released

Bug description:
  [ Impact ]

  The system cannot shut down if it has multiple VROC RAID arrays.
  Intel has fixed this in systemd v251 [1].
  The commit needs to be cherry-picked to ubuntu-jammy systemd 249.11-0ubuntu3.9.

  [1] The commit that fixes the issue:
  commit 3a3b022d2cc112803ea7b9beea98bbcad110368a
  Author: Mariusz Tkaczyk <mariusz.tkaczyk at linux.intel.com>
  Date:   Tue Mar 29 12:49:54 2022 +0200

      shutdown: get only active md arrays.

      Current md_list_get() implementation filters all block devices, started from
      "md*". This is ambiguous because list could contain:
      - partitions created upon md device (mdXpY)
      - external metadata container- specific type of md array.

      For partitions there is no issue, because they aren't handle STOP_ARRAY
      ioctl sent later. It generates misleading errors only.

      Second case is more problematic because containers are not locked in kernel.
      They are stopped even if container member array is active. For that reason
      reboot or shutdown flow could be blocked because metadata manager cannot be
      restarted after switch root on shutdown.

      Add filters to remove partitions and containers from md_list. Partitions
      can be excluded by DEVTYPE. Containers are determined by MD_LEVEL
      property, we are excluding all with "container" value.

      Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk at linux.intel.com>
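
  As a rough illustration of the filtering rule described in the commit
  message, the sketch below models each block device as a plain dict of
  udev-style properties. It is not the systemd code (the actual fix is
  written in C inside systemd-shutdown); the function name, the dict
  representation, the /dev/md127 container entry and the RAID levels
  assigned to each array are all assumptions made for the example.

```python
# Illustrative model only: the real fix is C code in systemd-shutdown.
# Each device is represented by a plain dict of udev-style properties.

def active_md_arrays(devices):
    """Return the md devices that should be sent STOP_ARRAY.

    Skips partitions (DEVTYPE == "partition") and external metadata
    containers (MD_LEVEL == "container"), as the commit message describes.
    """
    selected = []
    for dev in devices:
        name = dev.get("DEVNAME", "")
        if not name.startswith("/dev/md"):
            continue  # not an md block device
        if dev.get("DEVTYPE") == "partition":
            continue  # mdXpY: partition created on top of an md device
        if dev.get("MD_LEVEL") == "container":
            continue  # external metadata container, must stay up
        selected.append(name)
    return selected


# Hypothetical property sets; only the md124/md126 names come from the log.
example_devices = [
    {"DEVNAME": "/dev/md127", "DEVTYPE": "disk", "MD_LEVEL": "container"},
    {"DEVNAME": "/dev/md126", "DEVTYPE": "disk", "MD_LEVEL": "raid10"},
    {"DEVNAME": "/dev/md124", "DEVTYPE": "disk", "MD_LEVEL": "raid0"},
    {"DEVNAME": "/dev/md124p1", "DEVTYPE": "partition"},
    {"DEVNAME": "/dev/md124p2", "DEVTYPE": "partition"},
    {"DEVNAME": "/dev/sda", "DEVTYPE": "disk"},
]

print(active_md_arrays(example_devices))
```

  With this rule only the member arrays remain in the list, so the
  container and the mdXpY partitions are never sent a stop request.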

  In the journal, we can see systemd-shutdown looping repeatedly as it
  tries and fails to detach all md devices:

  ...
  [  513.416293] systemd-shutdown[1]: Stopping MD /dev/md124p2 (259:5).
  [  513.422953] systemd-shutdown[1]: Could not stop MD /dev/md124p2: Device or resource busy
  [  513.431227] systemd-shutdown[1]: Stopping MD /dev/md124p1 (259:4).
  [  513.437952] systemd-shutdown[1]: Could not stop MD /dev/md124p1: Device or resource busy
  [  513.449298] systemd-shutdown[1]: Stopping MD /dev/md124 (9:124).
  [  513.456278] systemd-shutdown[1]: Could not stop MD /dev/md124: Device or resource busy
  [  513.465323] systemd-shutdown[1]: Not all MD devices stopped, 4 left.
  [  513.472564] systemd-shutdown[1]: Couldn't finalize remaining  MD devices, trying again.
  [  513.485302] systemd-shutdown[1]: Failed to open watchdog device /dev/watchdog: No such file or directory
  [  513.496195] systemd-shutdown[1]: Stopping MD devices.
  [  513.502176] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
  [  513.513382] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
  [  513.521436] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
  [  513.534810] systemd-shutdown[1]: Stopping MD /dev/md126 (9:126).
  [  513.545384] systemd-shutdown[1]: Failed to sync MD block device /dev/md126, ignoring: Input/output error
  [  513.557265] md: md126 stopped.
  [  513.561451] systemd-shutdown[1]: Stopping MD /dev/md124p2 (259:5).
  [  513.576673] systemd-shutdown[1]: Could not stop MD /dev/md124p2: Device or resource busy
  [  513.589274] systemd-shutdown[1]: Stopping MD /dev/md124p1 (259:4).
  [  513.597976] systemd-shutdown[1]: Could not stop MD /dev/md124p1: Device or resource busy
  [  513.607263] systemd-shutdown[1]: Stopping MD /dev/md124 (9:124).
  [  513.615067] systemd-shutdown[1]: Could not stop MD /dev/md124: Device or resource busy
  [  513.625157] systemd-shutdown[1]: Not all MD devices stopped, 4 left.
  [  513.632209] systemd-shutdown[1]: Couldn't finalize remaining  MD devices, trying again.
  [  513.641474] systemd-shutdown[1]: Failed to open watchdog device /dev/watchdog: No such file or directory
  [  513.653660] systemd-shutdown[1]: Stopping MD devices.
  [  513.661257] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
  [  513.668833] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
  [  513.677347] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
  [  513.687047] systemd-shutdown[1]: Stopping MD /dev/md126 (9:126).
  [  513.697206] systemd-shutdown[1]: Failed to sync MD block device /dev/md126, ignoring: Input/output error
  [  513.707193] md: md126 stopped.
  ...
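
  When checking a captured journal for this symptom, it can help to
  count how often each device fails to stop. The following is a minimal
  sketch that assumes the log has been saved as text in the format shown
  above; the function name and the regular expression are illustrative
  and not part of any systemd tooling.

```python
import re
from collections import Counter

# Matches the "Could not stop MD <device>: <reason>" lines in the excerpt.
FAILED_STOP = re.compile(r"Could not stop MD (/dev/\S+): (.+)$")


def count_failed_stops(journal_text):
    """Count 'Could not stop MD' messages per device name."""
    counts = Counter()
    for line in journal_text.splitlines():
        match = FAILED_STOP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines copied from the journal excerpt in this bug description.
sample = """\
[  513.416293] systemd-shutdown[1]: Stopping MD /dev/md124p2 (259:5).
[  513.422953] systemd-shutdown[1]: Could not stop MD /dev/md124p2: Device or resource busy
[  513.456278] systemd-shutdown[1]: Could not stop MD /dev/md124: Device or resource busy
[  513.576673] systemd-shutdown[1]: Could not stop MD /dev/md124p2: Device or resource busy
"""

for device, failures in count_failed_stops(sample).items():
    print(device, failures)
```

  A device whose count keeps growing across retries is one that
  systemd-shutdown is looping on.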

  [ Test Plan ]

  1. Create two VROC RAID arrays: a RAID 0 array for the system volume and a RAID 10 array for the data volume.
  2. Install the system on the system volume.
  3. Update systemd.
  4. Reboot the system.
  5. Verify that the system reboots successfully.

  [ Where problems could occur ]

  The patch is confirmed to fix the reboot issue on a system with two
  VROC RAID arrays, but systems with more than two arrays and other
  combinations of RAID levels have not all been tested. The patch itself
  adds logic to exclude partitions and containers from the list of md
  devices that systemd-shutdown tries to stop. Therefore any regressions
  would also be related to stopping md devices in systemd-shutdown.

  [ Scope ]

  Jammy

To manage notifications about this bug go to:
https://bugs.launchpad.net/oem-priority/+bug/2025563/+subscriptions



