Glad to hear things have improved.
It was an mdadm update in one of the recent kernel versions, meant to improve performance, which accidentally fixed the freezing bug.
The only thing that changed in my system between today and the last reboot a week ago is the kernel, which went from 5.11 to 5.13.
5.13 had work done on the mdadm implementation, so whatever they changed fixed this by accident.
Hi @UhClem, I didn't see you had replied here; I guess the email notification was lost in the ether.
That command will happily trigger the stalling.
mashie@IONE:~$ dd if=/dev/zero of=/mnt/storage/60MBzero bs=12M count=5 oflag=direct
5+0 records in
5+0 records out
62914560 bytes (63 MB, 60 MiB)...
I successfully grew the array from 10 to 12 drives and expanded the file system, which took quite a while.
At this stage the stall would still happen after a reboot. Something had changed, though: it no longer did the heavy reading/seeking on just 5 specific drives, it was now reading from all 12 drives and...
And off we go, the reshaping should be done by Sunday evening at the current speed:
mashie@IONE:~$ sudo mdadm /dev/md0 --add /dev/sdh1
mdadm: added /dev/sdh1
mashie@IONE:~$ sudo mdadm /dev/md0 --add /dev/sdk1
mdadm: added /dev/sdk1
mashie@IONE:~$ sudo mdadm --detail /dev/md0
/dev/md0...
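For anyone following along: the --add step above only turns the new drives into spares. A rough sketch of the remaining steps to actually grow the array and the file system (assuming the array is /dev/md0 and the file system is ext4, which the e4defrag output elsewhere in this thread suggests):

```shell
# Grow the array from 10 to 12 active devices; this kicks off the (long) reshape
sudo mdadm --grow /dev/md0 --raid-devices=12

# Monitor reshape progress and the estimated finish time
cat /proc/mdstat

# After the reshape completes, grow the ext4 file system to fill the new space
sudo resize2fs /dev/md0
```

The reshape runs in the background, so the array stays usable (if slow) while it works.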
I originally had the array on the on-board SATA controllers, and moving to the LSI 3905 was one attempt to rule out the motherboard. I don't have a spare motherboard/CPU to try with.
The most important bits I have on Google Drive already so if the rest is lost it is mainly a massive inconvenience.
e4defrag started to speed up, with many files thankfully not fragmented; it just finished:
Success: [ 22324/24054 ]
Failure: [ 1730/24054 ]
Total extents: 256096->242202
Fragmented percentage: 26%->22%
And it made no difference at all as...
I may have to cancel the defrag before it completes, though; at the current rate it will take around a month.
Here is some of the output from e4defrag to show how much the extents are tweaked as part of the defrag. I have no point of reference for what number of extents is...
I was looking at top while the stalling happened today and spotted this little process pop up for pretty much the entire duration of the stalling:
303 root 20 0 0 0 0 D 1.0 0.0 0:00.56 kworker/u24:11+flush-9:0
Some googling later and one explanation was...
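In case it helps anyone else: the flush-9:0 part of that kworker name refers to the writeback flusher for block device major 9, minor 0, i.e. /dev/md0, so a D-state flush worker suggests dirty page writeback to the array is what is blocking. A quick way to watch writeback while reproducing the stall (standard /proc and sysctl interfaces, nothing array-specific):

```shell
# Watch how much dirty/writeback memory is queued while reproducing the stall;
# a large Dirty figure that drains slowly points at writeback to the array
watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'

# Show the writeback thresholds (percent of memory before flushing kicks in)
sysctl vm.dirty_background_ratio vm.dirty_ratio
```

If Dirty balloons and then drains at the array's stalled write speed, that would fit the symptoms here.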
I have thankfully not had any incidents while expanding the array over the years, which I have done a few times now as I add 1-2 drives at a time. I started off with an array of 3 as a test and then did an expansion to 4 before putting any data at risk just to try the process. After that it was...
Thank you for your response.
1. The setup is a default mdadm RAID6. I changed the stripe cache from 256 to 16384 yesterday, but it made no improvement to the stalling. I think this issue may have started after the array was expanded from 8 to 10 drives. The last two drives are not showing any...
All three options got the system to stall; however, the amount of logs generated varied a lot. The last option caused the system to stall for over 4 minutes.
mashie@IONE:~$ sudo dd if=/dev/nvme0n1 of=/mnt/storage/test.txt bs=1M count=20k iflag=direct oflag=direct
20480+0 records in
20480+0 records...