[Bug 1940207] [NEW] Mdadm slow RAID6 & RAID10 resync
Sergiu
1940207 at bugs.launchpad.net
Tue Aug 17 03:46:49 UTC 2021
Public bug reported:
I have a RAID 10 array made of 24 x Micron 9300 Pro 15.36TB drives. I
created the array with the following command:
mdadm --create /dev/md0 --raid-devices=24 --chunk=32 --level=raid10
/dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1
/dev/nvme5n1 /dev/nvme6n1 /dev/nvme7n1 /dev/nvme8n1 /dev/nvme9n1
/dev/nvme10n1 /dev/nvme11n1 /dev/nvme12n1 /dev/nvme13n1 /dev/nvme14n1
/dev/nvme15n1 /dev/nvme16n1 /dev/nvme17n1 /dev/nvme18n1 /dev/nvme19n1
/dev/nvme20n1 /dev/nvme21n1 /dev/nvme22n1 /dev/nvme23n1
What I have found is that the array resyncs at a total of 1.3GB/s and
there does not appear to be any way to speed it up. If built as 12 RAID1
arrays striped together with RAID0, each individual mirror resyncs at
~3.2GB/s, for a total throughput of about 38.4GB/s (12 x 3.2GB/s), which
is almost 30 times faster. However, in this configuration random IOPS
performance appears to be far less stable than with native RAID10, which
negates the advantage of the faster resync. Since mdadm offers a native
RAID10 level, it should offer the same resync behavior as nested RAID
1+0, but it does not, due to a lack of parallelization in the plain
RAID10 resync. The dev.raid.speed_limit_max parameter was raised to
5000000 during this exercise.
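For reference, the nested layout described above could be created roughly
as follows. This is only a sketch: it prints the mdadm commands instead of
running them, and the drive pairing (nvme0n1 with nvme1n1, and so on), the
md1..md12 names for the mirrors, and reusing --chunk=32 for the stripe are
assumptions for illustration, not details taken from the report.

```shell
#!/bin/sh
# Dry run: print the mdadm commands for N RAID1 pairs striped by one RAID0.
# Nothing is executed. Device pairing and md names are illustrative assumptions.
print_nested_raid10() {
    pairs=$1            # number of RAID1 pairs (12 for 24 drives)
    members=""
    i=0
    while [ "$i" -lt "$pairs" ]; do
        a=$((2 * i))        # first drive of the pair
        b=$((2 * i + 1))    # second drive of the pair
        md=$((i + 1))       # mirrors are named md1..mdN
        echo "mdadm --create /dev/md$md --level=1 --raid-devices=2 /dev/nvme${a}n1 /dev/nvme${b}n1"
        members="$members /dev/md$md"
        i=$((i + 1))
    done
    # Stripe across all mirrors; the chunk size is copied from the RAID10 command.
    echo "mdadm --create /dev/md0 --level=0 --raid-devices=$pairs --chunk=32$members"
}

print_nested_raid10 12
```

Each of the 12 mirrors then resyncs as an independent md device, which is
where the parallelism in this layout comes from.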
The same resync speed is also observed with RAID6. There, however,
setting group_thread_cnt to 12 increases the resync speed from 1.3GB/s to
about 10.5GB/s, which is still far from the theoretical speed of 72GB/s.
Note that increasing group_thread_cnt above 12 does not scale further; on
the contrary, throughput starts decreasing.
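The tunables mentioned above can be set and observed roughly like this. It
is a minimal sketch: the helper names and the MD_SYSFS variable are made up
for illustration, and /dev/md0 is assumed as the array name. The sysfs
files group_thread_cnt and sync_speed are the standard md attributes;
sync_speed is reported in KiB/s.

```shell
#!/bin/sh
# MD_SYSFS defaults to the md sysfs directory of /dev/md0 (assumed name).
# Point it at a scratch directory to try the helpers without root.
MD_SYSFS=${MD_SYSFS:-/sys/block/md0/md}

# Set the number of raid5/raid6 stripe worker threads.
set_group_thread_cnt() {
    printf '%s\n' "$1" > "$MD_SYSFS/group_thread_cnt"
}

# Print the current resync rate as reported by md (KiB/s).
read_sync_speed() {
    cat "$MD_SYSFS/sync_speed"
}

# Convert a KiB/s value to decimal GB/s for comparison with the figures above.
kib_to_gb() {
    awk -v k="$1" 'BEGIN { printf "%.2f\n", k * 1024 / 1e9 }'
}

# Example usage on a real array (as root):
#   sysctl -w dev.raid.speed_limit_max=5000000
#   set_group_thread_cnt 12
#   kib_to_gb "$(read_sync_speed)"
```

The block only defines functions, so sourcing it changes nothing until one
of the helpers is called.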
Note also that the server has more than enough CPU power (2 x 64 cores)
and all SSDs are directly attached NVMe, so there is no bus sharing of
any kind. This was confirmed by running fio on all devices concurrently
and observing the maximum theoretical speed.
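A concurrent fio check of this kind could look roughly like the sketch
below. It prints a single fio command with one job per drive instead of
running it. The block size, queue depth, I/O engine, and runtime are
illustrative assumptions, since the exact fio settings used for the test
are not recorded here.

```shell
#!/bin/sh
# Dry run: print one fio command that reads from N NVMe drives at once.
# Nothing is executed. All fio job settings below are assumed values.
print_fio_cmd() {
    count=$1
    # --readonly guards against accidental writes; the global job carries
    # the settings shared by every per-drive job.
    cmd="fio --readonly --name=global --direct=1 --ioengine=libaio --rw=read"
    cmd="$cmd --bs=1M --iodepth=32 --runtime=30 --time_based --group_reporting"
    i=0
    while [ "$i" -lt "$count" ]; do
        cmd="$cmd --name=nvme$i --filename=/dev/nvme${i}n1"
        i=$((i + 1))
    done
    echo "$cmd"
}

print_fio_cmd 24
```

With group_reporting, fio reports the aggregate bandwidth across all jobs,
which is the number to compare against the sum of the per-drive limits.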
** Affects: mdadm (Ubuntu)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to mdadm in Ubuntu.
https://bugs.launchpad.net/bugs/1940207