[Bug 1166086] [NEW] raid assembly can break into divergent parts with conflicting changes

ceg 1166086 at bugs.launchpad.net
Mon Apr 8 09:05:55 UTC 2013


Public bug reported:

Consider a laptop with a raid setup that consists of one internal disk
and a second, external disk that resides in the laptop's docking station.

Undocking the laptop during operation causes the raid to switch into
degraded operation. The laptop is turned off afterwards.

Back in the office, the laptop gets mounted in the dock and is switched
on. The hardware setup is such that the disk in the docking station is
found first.

mdadm --incremental sets up the raid device with the external disk (the
internal disk is refused by --incremental or may have been ejected).
Because the raid does not come up completely, the raid device is then
started in degraded mode with only the external disk.

We now have two divergent parts of the same raid device. The machine
runs with the old disk state as of the undocking of the laptop. All
changes made during the undocked state have been saved to the internal
disk, but they are not used now. New changes will only be written to
the external disk.

Suggested Fix:
For each missing device, store the value of the event counter at the
time of the degradation in the superblocks of the remaining member
devices.
--incremental should continue to (re)add a device automatically only if
the event count of the member device that is to be (re)added is equal to
or older than the count recorded at the time that device failed.
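
The proposed check can be sketched roughly as follows. This is only an
illustration of the idea, not mdadm code: the Superblock class, its
failed_at field and may_auto_readd() are hypothetical names, and
refusing a device for which no failure was recorded is an assumption
made here to stay on the safe side.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Superblock:
    """Hypothetical, simplified view of one member's superblock."""
    device: str
    events: int
    # Proposed addition: event count at which each missing member dropped out.
    failed_at: Dict[str, int] = field(default_factory=dict)


def may_auto_readd(running: Superblock, candidate: Superblock) -> bool:
    """Return True if the candidate may be (re)added automatically."""
    recorded = running.failed_at.get(candidate.device)
    if recorded is None:
        # Assumption: without a recorded failure, do not re-add automatically.
        return False
    # The candidate must not be newer than when it was marked as missing.
    return candidate.events <= recorded


# Scenario from this report: the event count was 100 at undocking time.
internal = Superblock("internal", events=150, failed_at={"external": 100})
external = Superblock("external", events=100)

print(may_auto_readd(internal, external))  # internal disk is running
print(may_auto_readd(external, internal))  # external disk is running
```

With the internal disk running, the external disk is accepted. With the
external disk running, the internal disk is not re-added automatically,
because the external disk holds no record of the internal disk failing.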

When the incrementally assembled raid device is already running in the
(auto) read only state from a device that had failed earlier, attempt
to switch to the newer state of the added device, if that device
correctly describes the failed state of the older device (the older
device's event counter is equal to or older than at the time it failed).
Otherwise, abort and print an error message that the devices contain
conflicting changes.
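
The switch-or-abort decision described above might look like this
sketch. Again, nothing here is existing mdadm behaviour: Member,
failed_at, ConflictingChanges and choose_authoritative() are
hypothetical names that only illustrate the proposed rule.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Member:
    """Hypothetical, simplified superblock data of one member device."""
    name: str
    events: int
    # Proposed addition: event count recorded for each member that went missing.
    failed_at: Dict[str, int] = field(default_factory=dict)


class ConflictingChanges(Exception):
    """Raised when both devices were written to after they diverged."""


def choose_authoritative(running: Member, added: Member) -> Member:
    """Pick the member whose state the array should continue from."""
    # Normal case: the running member saw the added one fail, and the
    # added one has not changed since then.
    seen_by_running = running.failed_at.get(added.name)
    if seen_by_running is not None and added.events <= seen_by_running:
        return running
    # Reverse case: the added member saw the running one fail, and the
    # running one (still read only) has not changed since then.
    seen_by_added = added.failed_at.get(running.name)
    if seen_by_added is not None and running.events <= seen_by_added:
        return added
    raise ConflictingChanges(
        f"{running.name} and {added.name} contain conflicting changes")


internal = Member("internal", events=150, failed_at={"external": 100})
stale_external = Member("external", events=100)    # untouched since undocking
written_external = Member("external", events=120)  # modified after docking

print(choose_authoritative(stale_external, internal).name)
try:
    choose_authoritative(written_external, internal)
except ConflictingChanges as err:
    print("abort:", err)
```

In the first call the array would switch to the internal disk. In the
second call the external disk has already been written to after the
divergence, so the sketch aborts instead of picking a side.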

** Affects: mdadm (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to mdadm in Ubuntu.
https://bugs.launchpad.net/bugs/1166086


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1166086/+subscriptions



