[Bug 925280] Re: Software RAID fails to rebuild after testing degraded cold boot

iMac 925280 at bugs.launchpad.net
Wed Apr 18 18:27:07 UTC 2012


IMHO, this is the *new* expected behavior.  If both RAID members left
the array in a good state (i.e. you unplugged one while the system was
off), then you need to zero the superblock of the removed member to get
it back into the array.
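
As a rough sketch of that recovery, assuming the stale member is
/dev/sdb1 and the array is /dev/md0 (the names used in the bug
description), the sequence might look like the following.  The
readd_as_fresh helper and the DRY_RUN switch are made up for
illustration; only "mdadm --zero-superblock" and "mdadm --add" come from
the mdadm messages quoted in the report.  Zeroing the superblock
discards the member's RAID metadata, so expect a full resync afterwards.

```shell
#!/bin/sh
# Sketch only: wipe the md superblock on a stale member and add it back,
# which should trigger a full resync from the surviving member.
# DRY_RUN and readd_as_fresh are hypothetical names, not part of mdadm.
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

readd_as_fresh() {
    array="$1"    # e.g. /dev/md0
    member="$2"   # e.g. /dev/sdb1
    run mdadm --zero-superblock "$member" || return 1
    run mdadm --add "$array" "$member"
}

readd_as_fresh /dev/md0 /dev/sdb1
```

With the default DRY_RUN=1 the script only prints what it would do;
setting DRY_RUN=0 (as root) runs the commands for real and destroys the
RAID metadata on the named member.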

I suspect your test case would work with a disk that had only the RAID
structures and not the clean data inside.  Perhaps try a live pull on
the cable (to simulate a controller failure) for your test, in an
environment where you don't care about the data.

In that case, upon restart, I would expect the "dirty" and "old" md disk
to be automatically rebuilt.
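
To check after the restart whether an array is still missing a member,
one option is to look for "_" in the status field of /proc/mdstat (for
example [U_]).  This is a minimal sketch; degraded_arrays is a
hypothetical helper name, and it assumes array names of the form mdN.
Note that an array that is rebuilding still shows "_" until the resync
finishes.

```shell
#!/bin/sh
# Sketch: list md arrays whose member-status field (e.g. [U_]) contains "_",
# meaning a slot is missing. Reads /proc/mdstat-style text on stdin.
# degraded_arrays is a hypothetical helper name, not an mdadm command.
degraded_arrays() {
    awk '
        # Array header line, e.g. "md0 : active raid1 sda1[0]"
        /^[ \t]*md[0-9]+[ \t]*:/ { name = $1; sub(/:$/, "", name); next }
        # Status line, e.g. "48826296 blocks super 1.2 [2/1] [U_]"
        /\[[U_]*_[U_]*\]/ && name != "" { print name; name = "" }
    '
}

if [ -r /proc/mdstat ]; then
    degraded_arrays < /proc/mdstat
fi
```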

In one of my use cases, where I use mdadm slightly differently across
two computers, the new behavior solves a problem where the older disk is
sometimes mounted when both md members are clean.  In that case the new
data is overwritten by the old, which can be a real issue caused by the
old behavior.
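
One way to tell which of two clean members is the older one is the
event counter that md keeps in each superblock: the member with the
lower count saw fewer updates.  A minimal sketch, assuming version 1.2
metadata as in this report, where "mdadm --examine" prints an
"Events : N" line with a plain integer.  The helper names
events_from_examine and compare_events are hypothetical.

```shell
#!/bin/sh
# Sketch: compare the event counters of two md members to see which is stale.
# events_from_examine and compare_events are hypothetical helper names.

# Pull the number out of the "Events : N" line of `mdadm --examine` output
# read on stdin (assumes a plain integer counter, as with 1.2 metadata).
events_from_examine() {
    awk -F: '/^[ \t]*Events[ \t]*:/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Report which of two event counters is higher.
compare_events() {
    if [ "$1" -gt "$2" ]; then
        echo "first member is newer"
    elif [ "$1" -lt "$2" ]; then
        echo "second member is newer"
    else
        echo "event counts match"
    fi
}

# Example with real devices (needs root; device names are from the bug report):
#   a="$(mdadm --examine /dev/sda1 | events_from_examine)"
#   b="$(mdadm --examine /dev/sdb1 | events_from_examine)"
#   compare_events "$a" "$b"
```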

The cases where old data could previously overwrite new data depend on
individual disk spin-up times and on the availability of disks at boot
(especially with remote block devices), which is probably the reason for
this 'feature'.  My observations are in the duplicate bug linked below.

Your use case should probably include a real 'spare' rather than an old
member in a good state (which should probably not be overwritten by
default).

https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/945786

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to mdadm in Ubuntu.
https://bugs.launchpad.net/bugs/925280

Title:
  Software RAID fails to rebuild after testing degraded cold boot

Status in “linux” package in Ubuntu:
  Incomplete
Status in “mdadm” package in Ubuntu:
  Confirmed

Bug description:
  Attempting the RAID install test with Precise server AMD64.

  Hardware config is a 1U server with 2 SATA drives with the following
  partitions:

  sda: 500GB SATA
  sda1: 50GB RAID
  sda2: 20GB RAID
  sda3: 180GB RAID

  sdb: 250GB SATA
  sdb1: 50GB RAID
  sdb2: 20GB RAID
  sdb3: 180GB RAID

  Using the instructions found here:
  http://testcases.qa.ubuntu.com/Install/ServerRAID1

  I created the three partitions for each physical disk.  I then created
  three RAID devices, md0 - md2, as follows:

  md0: 50GB RAID1 using sda1 and sdb1 for /
  md1: 20GB RAID1 using sda2 and sdb2 for swap
  md2: 180GB RAID1 using sda3 and sdb3 for /home

  I then completed the install and rebooted.  On the initial boot, I
  verified that all three RAID devices were present and active:

  bladernr at ubuntu:~$ cat /proc/mdstat
  Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
  md0 : active raid1 sda1[0] sdb1[1]
        48826296 blocks super 1.2 [2/2] [UU]

  md2 : active raid1 sda3[0] sdb3[1]
        175838136 blocks super 1.2 [2/2] [UU]

  md1 : active raid1 sda2[0] sdb2[1]
        19529656 blocks super 1.2 [2/2] [UU]

  I then powered the machine down per the test case instructions,
  removed disk 2 (sdb) and powered back up.  On reboot, I verified that
  the array was active and degraded and powered the system back down,
  again per the test instructions.

  I re-inserted drive 2 (sdb) and powered the system up again.  After
  logging in, I rechecked /proc/mdstat, expecting to see both drives for
  each md device and a resync in progress.  Instead, I found that the
  second drive was missing from md0 and md2 while md1 (the swap LUN) was
  fine.

  bladernr at ubuntu:~$ cat /proc/mdstat
  Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
  md0 : active raid1 sda1[0]
        48826296 blocks super 1.2 [2/1] [U_]

  md2 : active raid1 sda3[0]
        175838136 blocks super 1.2 [2/1] [U_]

  md1 : active raid1 sda2[0] sdb2[1]
        19529656 blocks super 1.2 [2/2] [UU]

  The instructions indicated that I may have to re-add the missing
  drives manually, so I attempted this:

  bladernr at ubuntu:~$ sudo mdadm --add /dev/md0 /dev/sdb1
  mdadm: /dev/sdb1 reports being an active member for /dev/md0, but a --re-add fails.
  mdadm: not performing --add as that would convert /dev/sdb1 in to a spare.
  mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdb1" first.

  I also tried using --re-add:

  bladernr at ubuntu:~$ sudo mdadm --re-add /dev/md0 /dev/sdb1
  mdadm: --re-add for /dev/sdb1 to /dev/md0 is not possible

  So here's some info from mdadm:

  /dev/md0:
          Version : 1.2
    Creation Time : Wed Feb 1 20:53:34
       Raid Level : raid1
       Array Size : 48826296 (46.56 GiB 50.00GB)
    Used Dev Size : 48826296 (46.56 GiB 50.00GB)
     Raid Devices : 2
    Total Devices : 1
      Persistence : Superblock is persistent

      Update Time : Wed Feb 1 23:54:04 2012
            State : clean, degraded
   Active Devices : 1
  Working Devices : 1
   Failed Devices : 0
    Spare Devices : 0

             Name : ubuntu:0 (local to host ubuntu)
             UUID : 118d60db:4ddc5cf2:040c4cb2:bd896eaf
           Events : 118

      Number   Major   Minor   RaidDevices   State
         0       8       1          0        active sync   /dev/sda1
         1       0       0          1        removed

  So according to the test instructions, this test is a failure because
  I can't rebuild the array (nor is it automatically rebuilt).

  ProblemType: Bug
  DistroRelease: Ubuntu 12.04
  Package: linux-image-3.2.0-12-generic 3.2.0-12.21
  ProcVersionSignature: Ubuntu 3.2.0-12.21-generic 3.2.2
  Uname: Linux 3.2.0-12-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---T 1 root audio 116,  1 Feb  1 23:35 seq
   crw-rw---T 1 root audio 116, 33 Feb  1 23:35 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 1.91-0ubuntu1
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory
  Date: Wed Feb  1 23:38:28 2012
  HibernationDevice: RESUME=UUID=e573077c-98b5-42e5-9f37-b8efaa2ba74a
  InstallationMedia: Ubuntu-Server 12.04 LTS "Precise Pangolin" - Alpha amd64 (20120201.1)
  IwConfig:
   lo        no wireless extensions.
   
   eth1      no wireless extensions.
   
   eth0      no wireless extensions.
  MachineType: Supermicro X7DVL
  PciMultimedia:
   
  ProcEnviron:
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-12-generic root=UUID=a84486b9-e72d-4134-82a8-263f91d7d894 ro
  RelatedPackageVersions:
   linux-restricted-modules-3.2.0-12-generic N/A
   linux-backports-modules-3.2.0-12-generic  N/A
   linux-firmware                            1.68
  RfKill: Error: [Errno 2] No such file or directory
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 06/23/2008
  dmi.bios.vendor: Phoenix Technologies LTD
  dmi.bios.version: 2.1
  dmi.board.name: X7DVL
  dmi.board.vendor: Supermicro
  dmi.board.version: PCB Version
  dmi.chassis.type: 1
  dmi.chassis.vendor: Supermicro
  dmi.chassis.version: 0123456789
  dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvr2.1:bd06/23/2008:svnSupermicro:pnX7DVL:pvr0123456789:rvnSupermicro:rnX7DVL:rvrPCBVersion:cvnSupermicro:ct1:cvr0123456789:
  dmi.product.name: X7DVL
  dmi.product.version: 0123456789
  dmi.sys.vendor: Supermicro

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/925280/+subscriptions