[SRU][N/O][PATCH 0/1] raid1: Fix NULL pointer dereference in process_checks()

Matthew Ruffell matthew.ruffell at canonical.com
Mon Jun 9 05:57:01 UTC 2025


BugLink: https://bugs.launchpad.net/bugs/2112519

[Impact]

A NULL pointer dereference was found in raid1 during failure-mode testing.
A raid1 array was created with --failfast, filled with data, and a check
operation was started. While the check was underway, all underlying iSCSI
disks were forcefully disconnected, and the following kernel oops occurred:

md/raid1:: dm-0: unrecoverable I/O read error for block 527744
md/raid1:: dm-1: unrecoverable I/O read error for block 527616
md/raid1:: dm-0: unrecoverable I/O read error for block 527744
md/raid1:: dm-1: unrecoverable I/O read error for block 527616
md/raid1:: dm-1: unrecoverable I/O read error for block 527616
md/raid1:: dm-0: unrecoverable I/O read error for block 527744
md/raid1:: dm-1: unrecoverable I/O read error for block 527616
md/raid1:: dm-0: unrecoverable I/O read error for block 527744
BUG: kernel NULL pointer dereference, address: 0000000000000040
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0 
SMP NOPTI
CPU: 3 PID: 19372 Comm: md_1t889zmbfni_ Kdump: loaded Not tainted 6.8.0-1029-aws #31-Ubuntu
Hardware name: Amazon EC2 m6a.xlarge/, BIOS 1.0 10/16/2017
RIP: 0010:process_checks+0x25e/0x5e0 [raid1]
Code: 8e 19 01 00 00 48 8b 85 78 ff ff ff b9 08 00 00 00 48 8d 7d 90 49 8b 1c c4 49 63 c7 4d 8b 74 c4 50 31 c0 f3 48 ab 48 89 5d 88 <4c> 8b 53 40 45 0f b6 4e 18 49 8b 76 40 49 81 7e 38 a0 04 7c c0 75
RSP: 0018:ffffb39e8142bcb8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000002 RSI: 0000000000000004 RDI: ffffb39e8142bd50
RBP: ffffb39e8142bd80 R08: ffff9a2e001ea000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9a2e0cd63280
R13: ffff9a2e50d1f800 R14: ffff9a2e50d1f000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff9a3128780000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000040 CR3: 00000001035b2004 CR4: 00000000003706f0
Call Trace:
 <TASK>
 ? show_regs+0x6d/0x80
 ? __die+0x24/0x80
 ? page_fault_oops+0x99/0x1b0
 ? do_user_addr_fault+0x2e0/0x660
 ? exc_page_fault+0x83/0x190
 ? asm_exc_page_fault+0x27/0x30
 ? process_checks+0x25e/0x5e0 [raid1]
 ? process_checks+0x125/0x5e0 [raid1]
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? ___ratelimit+0xc7/0x130
 sync_request_write+0x1c8/0x1e0 [raid1]
 raid1d+0x13a/0x3f0 [raid1]
 ? srso_alias_return_thunk+0x5/0xfbef5
 md_thread+0xae/0x190
 ? __pfx_autoremove_wake_function+0x10/0x10
 ? __pfx_md_thread+0x10/0x10
 kthread+0xda/0x100
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x47/0x70
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1b/0x30
 </TASK>

What happens is that process_checks() loops through all the available disks to
find a primary source with intact data. Here every disk has failed, so no valid
primary source exists, yet the code carries on regardless and dereferences a
NULL pointer. We shouldn't move forward without a valid primary source.

[Fix]

This was fixed in 6.15-rc3 with:

commit b7c178d9e57c8fd4238ff77263b877f6f16182ba
Author: Meir Elisha <meir.elisha at volumez.com>
Date:  Tue Apr 8 17:38:08 2025 +0300
Subject: md/raid1: Add check for missing source disk in process_checks()
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b7c178d9e57c8fd4238ff77263b877f6f16182ba

This has already been applied to focal, jammy and plucky through upstream
-stable. Noble and oracular are currently lagging behind and have not yet
reached the -stable release that contains the fix.

Bug focal:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2111448
Bug jammy:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2111606
Bug plucky:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2111268

[Testcase]

You don't need to set up a full iSCSI environment; you can make a local VM and
then forcefully remove the underlying disks using libvirt.

Create a VM and attach three scratch disks:

$ lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
vda     253:0    0   10G  0 disk 
├─vda1  253:1    0    9G  0 part /
├─vda14 253:14   0    4M  0 part 
├─vda15 253:15   0  106M  0 part /boot/efi
└─vda16 259:0    0  913M  0 part /boot
vdb     253:16   0  372K  0 disk 
vdc     253:32   0    3G  0 disk 
vdd     253:48   0    3G  0 disk 
vde     253:64   0    3G  0 disk 
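
If you are driving libvirt directly, the scratch disks can be created and
hot-plugged along these lines (a sketch: the domain name "raid1-test" and the
image paths are assumptions, substitute your own):

$ for d in 1 2 3; do sudo qemu-img create -f raw /var/lib/libvirt/images/scratch$d.img 3G; done
$ virsh attach-disk raid1-test /var/lib/libvirt/images/scratch1.img vdc --live --persistent
$ virsh attach-disk raid1-test /var/lib/libvirt/images/scratch2.img vdd --live --persistent
$ virsh attach-disk raid1-test /var/lib/libvirt/images/scratch3.img vde --live --persistent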

Create a raid1 array:

$ sudo mdadm --create --failfast --verbose /dev/md0 --level=1 --raid-devices=3 /dev/vdc /dev/vdd /dev/vde
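
The initial resync (and, later, the check) can be monitored with:

$ cat /proc/mdstat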

Make a filesystem:

$ sudo mkfs.xfs /dev/md0

$ sudo mkdir /mnt/disk
$ sudo mount /dev/md0 /mnt/disk

Fill the array with files:

$ cd /mnt/disk
$ for n in {1..1000}; do sudo dd if=/dev/urandom of=file$(printf %03d "$n").bin bs=1024 count=$((RANDOM)); done

Start a check:

$ sudo mdadm --action=check /dev/md0
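
You can confirm the check is actually in progress via sysfs:

$ cat /sys/block/md0/md/sync_action
check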

Use virt-manager or libvirt to detach all three disks and watch dmesg.
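
If using plain virsh, forcefully detaching the disks from the running guest
looks roughly like this (again assuming the domain name "raid1-test"):

$ virsh detach-disk raid1-test vdc --live
$ virsh detach-disk raid1-test vdd --live
$ virsh detach-disk raid1-test vde --live

Then inside the guest:

$ sudo dmesg -w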

Test kernels are available in the following PPA:

https://launchpad.net/~mruffell/+archive/ubuntu/sf411666-test
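
Enabling it is the usual add-apt-repository invocation; the exact kernel
package name to install depends on the series:

$ sudo add-apt-repository ppa:mruffell/sf411666-test
$ sudo apt update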

If you install the test kernel, the null pointer dereference no longer occurs.

[Where problems can occur]

We are changing the logic such that if all the reads fail in process_checks()
and we have no valid primary source, we disable recovery, mark the sync as
having failed, free the bio and return early. Previously we would have just
continued onward and run into the NULL pointer dereference.

This only affects situations where all backing disks are lost. That isn't too
uncommon, though, particularly when the disks are network storage and a network
issue cuts access to all of them at once. Behaviour remains unchanged as long
as at least one valid primary source disk exists.

If a regression were to occur, it would affect raid1 arrays only, and only
during check/repair operations. 

A workaround would be to disable check or repair operations on the md array 
until the issue is fixed.
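
For example, an in-progress check can be aborted with either of the following
(equivalent) commands:

$ sudo mdadm --action=idle /dev/md0
$ echo idle | sudo tee /sys/block/md0/md/sync_action

On Ubuntu the periodic check is typically scheduled by the mdadm package
(checkarray, run from /etc/cron.d/mdadm); disabling that entry stops scheduled
checks until the fixed kernel is installed.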

[Other info]

Upstream mailing list discussion:

V1:
https://lore.kernel.org/linux-raid/712ff6db-6b01-be95-a394-266be08a1d6e@huaweicloud.com/T/
V2:
https://lore.kernel.org/linux-raid/20250408143808.1026534-1-meir.elisha@volumez.com/T/

Meir Elisha (1):
  md/raid1: Add check for missing source disk in process_checks()

 drivers/md/raid1.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

-- 
2.48.1



