[Bug 557429] Re: array with conflicting changes is assembled with data corruption/silent loss

Sat Jul 6 11:32:03 UTC 2013

** No longer affects: mdadm (Ubuntu Lucid)

https://bugs.launchpad.net/bugs/557429

Title:
  array with conflicting changes is assembled with data
  corruption/silent loss

Status in mdadm - Tool for managing linux software RAID arrays.:
  New
Status in Release Notes for Ubuntu:
  Fix Released
Status in “mdadm” package in Ubuntu:
  Triaged

Bug description:
  Re-attaching parts of an array that have been running degraded separately and that
  contain the same number of conflicting changes, or conflicting changes within the
  range of a write-intent bitmap, results in the assembly of a corrupt array.

  ----
  Using the latest beta-2 server ISO and following http://testcases.qa.ubuntu.com/Install/ServerRAID1

  Booting an out-of-sync RAID1 array fails with ext3: it comes up as
  synced, but is corrupted.

  (According to comment #18, ext3 vs. ext4 seems to be mere
  happenstance.)

  Steps to reproduce:

  1. In a KVM virtual machine, using 2 virtio qcow2 disks each 1768M in size, 768M RAM and 2 VCPUs, in the installer I create the md devices:
  /dev/md0: 1.5G, ext3, /
  /dev/md1: ~350M, swap

  Choose to boot in degraded mode. All other installer options are
  defaults.

  2. Reboot into the Lucid install and check /proc/mdstat: OK, both disks
  show up and are in sync.

  3. Shut down the VM. Remove the 2nd disk, power on the VM and check
  /proc/mdstat: OK, boots degraded and mdstat shows the disk.

  4. Shut down the VM. Reconnect the 2nd disk and remove the 1st disk,
  power on the VM and check /proc/mdstat: OK, boots degraded and mdstat
  shows the disk.

  5. Shut down the VM. Reconnect the 1st disk (so now both disks are
  connected, but out of sync) and power on the VM.

  Expected results:
  At this point it should boot degraded, with /proc/mdstat showing that it
  is syncing (recovering). This is how it works with ext4. Note that in
  the past one had to run 'sudo mdadm -a /dev/md0 /dev/MISSING-DEVICE'
  before syncing would occur. This no longer seems to be required.
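
  One way to tell the expected outcome (degraded and recovering) from the
  actual one (reported as in sync) is to parse /proc/mdstat. The following
  is only a minimal sketch, assuming the usual "[n/m] [UU]" status
  notation. The sample text, device names, block counts and the function
  name array_states are illustrative and not taken from this report.

```python
import re

# Illustrative /proc/mdstat content for a two-disk RAID1 array that is
# rebuilding. Device names and numbers are made up for this sketch.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 vdb1[2] vda1[0]
      1571776 blocks [2/1] [U_]
      [==>..................]  recovery = 12.6% (198400/1571776) finish=0.5min speed=49600K/sec

unused devices: <none>
"""


def array_states(mdstat_text):
    """Return {array: state}, state being 'in sync', 'degraded' or 'syncing'."""
    states = {}
    current = None
    for line in mdstat_text.splitlines():
        # Array header lines look like "md0 : active raid1 ...".
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)
            states[current] = "unknown"
            continue
        if current is None:
            continue
        # Status line: "[total/active] [U_]" where "_" marks a missing member.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status:
            total, active = int(status.group(1)), int(status.group(2))
            states[current] = "in sync" if active == total else "degraded"
        elif re.search(r"\b(recovery|resync)\s*=", line):
            # A progress line means the array is being rebuilt or resynced.
            states[current] = "syncing"
    return states


print(array_states(SAMPLE_MDSTAT))
```

  On a live system the text would come from reading /proc/mdstat instead
  of the sample string.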

  Actual results:
  Array comes up with both disks in the array and in sync.

  Sometimes there are error messages saying there are disk errors, and
  the boot continues to login, but root is mounted readonly and
  /proc/mdstat shows we are in sync.

  Sometimes fsck notices this and complains a *lot*:
  /dev/md0 contains a filesystem with errors
  Duplicate or bad block in use
  Multiply-claimed block(s) in inode...
  ...
  /dev/md0: File /var/log/boot.log (inode #68710, mod time Wed Apr  7 11:35:59 2010) has multiply-claimed block(s), shared with 1 file(s):
   /dev/md0:     /var/log/udev (inode #69925, mod time Wed Apr  7 11:35:59 2010)
  /dev/md0:
  /dev/md0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.

  The boot loops infinitely on this because mountall reports that
  fsck terminated with status 4, then reports that '/' is a filesystem
  with errors, then tries again (and again, and again).
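
  For reference, fsck's exit status is a bitwise OR of the codes listed in
  fsck(8), and status 4 means errors were left uncorrected. This is why
  retrying the same check cannot make progress. A small decoding sketch
  follows; the function name decode_fsck_status is made up for
  illustration.

```python
# Exit-status bits as documented in fsck(8); the status is their bitwise OR.
FSCK_EXIT_BITS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status):
    """Return the list of conditions encoded in an fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in FSCK_EXIT_BITS.items() if status & bit]


print(decode_fsck_status(4))
```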

  See:
  http://iso.qa.ubuntu.com/qatracker/result/3918/286

  I filed this against 'linux'; please adjust as necessary.

  -----

  From linux-raid list:
  mdadm --incremental should only include both disks in the array if
  1/ their event counts are the same, or +/- 1, or
  2/ there is a write-intent bitmap and the older event count is within
     the range recorded in the write-intent bitmap.
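
  The rule quoted from the list can be written down as a small predicate.
  This is a sketch of the stated rule only, not mdadm's actual code. The
  function name and the bitmap_oldest_events parameter (standing in for
  "the range recorded in the write-intent bitmap") are assumptions made
  for illustration.

```python
def may_assemble_together(events_a, events_b, bitmap_oldest_events=None):
    """Apply the quoted rule to the event counts of two members.

    bitmap_oldest_events is a hypothetical stand-in for the oldest event
    count still covered by the write-intent bitmap (None: no bitmap).
    """
    newer = max(events_a, events_b)
    older = min(events_a, events_b)
    # 1/ event counts are the same, or differ by at most one
    if newer - older <= 1:
        return True
    # 2/ a bitmap exists and the older count lies within its recorded range
    if bitmap_oldest_events is not None and older >= bitmap_oldest_events:
        return True
    return False


print(may_assemble_together(120, 121))       # counts within +/- 1
print(may_assemble_together(120, 140))       # too far apart, no bitmap
print(may_assemble_together(120, 140, 100))  # covered by the bitmap
```

  Note that event counts alone say nothing about which writes happened:
  two halves that each ran degraded on their own can end up with counts
  that pass this check, which is the situation this bug describes.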

  Fixing:

  * When assembling, mdadm could check for conflicting "failed" states in the
    superblocks of members to detect conflicting changes. On conflicts, i.e. if an
    additional member claims an already running member has failed:
     + that member should not be added to the array
     + report (console and --monitor event) that an alternative
       version with conflicting changes has been detected: "mdadm: not
       re-adding /dev/<member> to /dev/<array> because it constitutes an
       alternative version containing conflicting changes"
     + require and support --force with --add for manual re-syncing of
       alternative versions (because unlike with re-syncing outdated
       devices/versions, in this case changes will get lost).
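
  The proposed check can be sketched as follows. This is a hedged
  illustration of the idea only: the Superblock class, its failed_peers
  field and both function names are hypothetical and do not reflect
  mdadm's real data structures or interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class Superblock:
    """Hypothetical, simplified view of a member's superblock."""
    device: str
    # Devices this member's superblock records as failed.
    failed_peers: set = field(default_factory=set)


def conflicting_members(running, candidate):
    """Return the running devices that the candidate claims have failed."""
    running_devices = {sb.device for sb in running}
    return sorted(candidate.failed_peers & running_devices)


def try_add(array, running, candidate, force=False):
    """Decide whether the candidate may join; return (added, message)."""
    conflicts = conflicting_members(running, candidate)
    if conflicts and not force:
        # Message modelled on the wording proposed in this report.
        return False, (
            f"mdadm: not re-adding {candidate.device} to {array} because it "
            "constitutes an alternative version containing conflicting changes"
        )
    return True, f"added {candidate.device} to {array}"


# Each half ran degraded on its own and marked the other half as failed.
vda = Superblock("/dev/vda1", failed_peers={"/dev/vdb1"})
vdb = Superblock("/dev/vdb1", failed_peers={"/dev/vda1"})
print(try_add("/dev/md0", [vda], vdb))
```

  A member that was merely unplugged and never ran on its own would carry
  no such "failed" record and would be re-added and re-synced as before.
  The force flag mirrors the proposed --force with --add.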

  Enhancement 1)
    To facilitate easy inspection of alternative versions (i.e. for safe and
    easy diffing, merging, etc.) --incremental could assemble array
    components that contain alternative versions into temporary
    auxiliary devices.
    (would require temporarily mangling the fs UUID to ensure there are no
    duplicates in the system)

  Enhancement 2)
    Those who want to be able to disable hot-plugging of
    segments with conflicting changes/alternative versions (after an
    incident with multiple versions connected at the same time occurred)
    will need some additional enhancements:
     + A way to mark some raid members (segments) as containing
       known alternative versions, and to mark them as such when an
       incident occurs in which they come up after another
       segment of the array is already running degraded.
       (possibly a superblock marking itself as failed)
     + An option like
       "AUTO -SINGLE_SEGMENTS_WITH_KNOWN_ALTERNATIVE_VERSIONS"
       to disable hotplug support for alternative versions once they came
       up after some other version and got marked as containing an alternative version.
