encrypted system fails to boot when Sil3132 is present.

Xen list at xenhideout.nl
Wed Aug 3 11:11:44 UTC 2016


How can I debug a failure in the initrd boot sequence if and when a root 
filesystem cannot be loaded?

I have a Sil3132 SATA RAID controller card that, when installed in the 
system (even with no hard disks attached to it), causes my encrypted 
system to fail to boot.

The first visible symptom is that the volume group on which the root 
filesystem is situated cannot be found.

"Volume group "linux" not found"

But before that, the crypt device opens *fine*. This happens on both 
Kubuntu 16.04 and Mint 18, which share the same Ubuntu 16.04 base.

So:

- crypt is getting unlocked and *should* be getting mapped to a 
/dev/mapper/xxx name.

- this *should* automatically be followed by pvscan and activation of 
the volume groups

- this is not happening
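
The sequence above can also be walked through by hand, for instance from 
a Live session or from the initramfs shell. The following is only a 
sketch under assumptions: the device path /dev/sdX6 and the mapper name 
sdX6_crypt are placeholders, "linux" is the volume group named in the 
error message, and the script defaults to a dry run that merely prints 
the commands.

```shell
#!/bin/sh
# Dry-run sketch of the expected unlock -> scan -> activate sequence.
# RUN defaults to "echo", so the commands are only printed; set RUN to an
# empty string (RUN= sh script.sh) to really execute them as root.
RUN=${RUN-echo}

# All three arguments are placeholders chosen for illustration.
unlock_and_activate() {
    dev=$1      # LUKS partition, e.g. /dev/sdX6 (hypothetical)
    name=$2     # mapper name, shows up as /dev/mapper/$name
    vg=$3       # volume group expected inside the container

    $RUN cryptsetup open "$dev" "$name"   # unlock and map the container
    $RUN lvm pvscan                       # look for physical volumes
    $RUN lvm vgchange -ay "$vg"           # activate the volume group
    $RUN lvm lvs "$vg"                    # list its logical volumes
}

unlock_and_activate /dev/sdX6 sdX6_crypt linux
```

If the second or third step is the one that comes back empty when run by 
hand in the failing environment, that would narrow the problem down to 
the LVM scan rather than the crypt mapping.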



Booting from an (older) Kubuntu Live DVD, the system boots just fine. I 
will conjecture that this is also going to be the case with a Kubuntu 
16.04 DVD.

When the card is installed, its ports normally become scsi host0 and 
host1. However, I can change the order in the BIOS; I suppose they would 
then be host4 and host5. There is no issue whatsoever with using any 
attached device while booting from the Live DVD:

/dev/sde1: UUID="14DCC30ADCC2E558" TYPE="ntfs" PARTUUID="021460ee-01"
/dev/sde2: UUID="5866C79466C770F4" TYPE="ntfs" PARTUUID="021460ee-02"

sde                8:64   0 698.7G  0 disk
├─sde1             8:65   0   450M  0 part
└─sde2             8:66   0  58.6G  0 part

At the same time, opening the crypt container on the primary hard disk 
(attached to the motherboard, not to the controller) works fine too. 
That disk contains Mint in this case, but the same boot failure also 
happens with a Kubuntu 16.04 stick (an installed system, not Live).

In this case that was:

cryptsetup open /dev/sdf6 sdf6_c

lvs

<shows all the volumes of volume group "linux">

(I did not actually attempt to mount those LVM volumes, but they were 
detected fine and should not give any errors.)


So basically:

- the RAID controller functions normally (at least in non-raid mode)

- booting from a Live DVD allows access to both the controller-attached 
drive as well as the motherboard-attached drive

- there seems to be no reason why mounting any opened volumes from the 
main drive would fail after manually opening the crypt.

- while booting, the crypt device opens fine by way of the initrd 
password prompt (graphical).

[[ Linux Mint uses the graphical prompt in this case, which seems to 
work fine; on my Kubuntu stick the crypt device is opened with a key 
file embedded in the initrd that grub unlocks.

That should have no bearing on the result, though: I can see no errors 
related to the opening of the crypt device. ]]

- even though the thing has opened fine, it then cannot find the root 
device (that is situated on LVM).


In order to test, I can attach another disk that has a non-encrypted 
root filesystem (also on LVM), with the PV sitting directly on the 
device (/dev/sda, no partitions). If the booting system does not find 
the PV in this case, it means it will simply not find any PVs.

I will finish this email after testing this, so postponing.

So back in Kubuntu now the system boots just fine.

Apart from the encryption, the differences between the systems are 
minimal:

- both Mint 18 and the Kubuntu 16.04 stick are running relatively recent 
kernels with Mint somewhat more up to date; the current Kubuntu system 
is also up to date; but both Mint and the stick (both encrypted with 
LUKS) fail.

- there are no adjustments to the initrd of the encrypted systems other 
than (naturally) the inclusion of the cryptsetup tools, hooks and 
scripts.
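
One way to check that claim is to compare the file listings of the 
initrds on the working and the failing system. The sketch below assumes 
the initrds were built by initramfs-tools, so that lsinitramfs is 
available; the helper name initrd_relevant and the sample paths fed to 
it are made up for illustration.

```shell
#!/bin/sh
# Filter an initrd file listing (one path per line on stdin) down to the
# pieces involved here: cryptsetup, LVM, device-mapper and sata_sil24.
initrd_relevant() {
    grep -E 'cryptsetup|cryptroot|lvm|dm-crypt|dm-mod|sata_sil24'
}

# Illustrative input only; a real listing comes from lsinitramfs.
printf '%s\n' bin/busybox sbin/cryptsetup sbin/lvm \
    scripts/local-top/cryptroot | initrd_relevant
```

On a real system the input would come from something like 
lsinitramfs /boot/initrd.img-"$(uname -r)" | initrd_relevant, run once 
per system, with the two outputs then compared using diff.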

And yet after opening the crypt device the root filesystem will not be 
found, whereas it will be found just fine if the controller is not 
there.

How can this be?

Grub makes this device map (which is as it should be):

(hd0)   /dev/disk/by-id/ata-ST1000LM024_HN-M101MBB_S2ZWJ9DG703561
(hd1)   /dev/disk/by-id/ata-WDC_WD7500BPKX-00HPJT0_WD-WX81AB574LVH

The current system is booted from the ST1000.

The WD7500 is on the extra controller (the Sil3132).

Again: even if you remove that second drive, the system still fails to 
boot.

The controller gets host2 (I was mistaken) and host4:

[    2.694148] scsi host2: sata_sil24

But ata5 and ata6, I believe:

[    2.707848] scsi host4: sata_sil24
[    2.707956] ata5: SATA max UDMA/100 host m128 at 0xfdcff000 port 0xfdcf8000 irq 16
[    2.707960] ata6: SATA max UDMA/100 host m128 at 0xfdcff000 port 0xfdcfa000 irq 16


[    4.928048] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[    4.932888] ata5.00: ATA-8: WDC WD7500BPKX-00HPJT0, 01.01A01, max UDMA/133
[    4.932891] ata5.00: 1465149168 sectors, multi 16: LBA48 NCQ (depth 31/32)
[    4.936887] ata5.00: configured for UDMA/100
[    4.937122] scsi 2:0:0:0: Direct-Access     ATA      WDC WD7500BPKX-0 1A01 PQ: 0 ANSI: 5
[    4.937469] sd 2:0:0:0: [sdf] 1465149168 512-byte logical blocks: (750 GB/699 GiB)
[    4.937472] sd 2:0:0:0: [sdf] 4096-byte physical blocks
[    4.937525] sd 2:0:0:0: [sdf] Write Protect is off
[    4.937528] sd 2:0:0:0: [sdf] Mode Sense: 00 3a 00 00
[    4.937552] sd 2:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    4.937812] sd 2:0:0:0: Attached scsi generic sg6 type 0
[    4.942838]  sdf: sdf1 sdf2
[    4.943215] sd 2:0:0:0: [sdf] Attached SCSI disk
[    7.016034] ata6: SATA link down (SStatus 0 SControl 0)
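
To avoid reading the host numbers off by eye, they can be pulled out of 
the kernel log with a small filter. This is a sketch only: the function 
name hosts_for_driver is made up, and the sample input is the two 
sata_sil24 lines quoted above.

```shell
#!/bin/sh
# Print the SCSI host numbers registered by a given driver, reading
# kernel log text on stdin. $1 is the driver name as the kernel logs it.
hosts_for_driver() {
    sed -n "s/.*scsi host\([0-9][0-9]*\): $1\$/\1/p"
}

# Two lines copied from the log above, used as sample input.
sample='[    2.694148] scsi host2: sata_sil24
[    2.707848] scsi host4: sata_sil24'

printf '%s\n' "$sample" | hosts_for_driver sata_sil24
```

For the sample this prints 2 and 4, one per line. On a running system 
the input would be dmesg | hosts_for_driver sata_sil24.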


There is no dmesg output from pvscan (on this working system). So how do 
I find out what happens in the boot sequence, or does anyone have any 
clues?
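
One possible starting point, assuming the initrd is built by 
initramfs-tools: its manual page documents the kernel parameters debug 
(which writes a trace to /run/initramfs/initramfs.debug) and break= 
(which drops to a shell at a chosen stage such as premount). The helper 
below is only a sketch that builds the edited command line to type into 
the GRUB editor; the function name and the sample command line are 
hypothetical.

```shell
#!/bin/sh
# Append a kernel parameter to a command line unless it is already there.
add_boot_param() {
    case " $1 " in
        *" $2 "*) printf '%s\n' "$1" ;;         # already present
        *)        printf '%s %s\n' "$1" "$2" ;; # append it
    esac
}

# Hypothetical command line; the real one can be read from /proc/cmdline.
cmdline='root=/dev/mapper/linux-root ro'
cmdline=$(add_boot_param "$cmdline" debug)
cmdline=$(add_boot_param "$cmdline" break=premount)
printf '%s\n' "$cmdline"
```

From the shell that break= provides, the crypt and LVM steps can then be 
tried by hand to see which one fails while the controller is present.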











