Strange MDADM RAID 6 behaviour

Mashie · Oct 15, 2021

UhClem said:
!!! EDIT !!! I didn't see your defrag posting till after posting the below. (Had this thread opened, then switched to analog life, and neglected to refresh upon return. ) Will be interesting to see the effect.

I may have to cancel the defrag before it is completed, though; at the current rate it will take around a month to complete.

Here is some of the output from e4defrag to show how much the extents change as part of the defrag. I have no point of reference for how many extents are normal or acceptable for 20-80GB files.

Code:

[11542/24046]/mnt/storage/file01:    100%  extents: 29 -> 26    [ OK ]
[11543/24046]/mnt/storage/file02:    100%  extents: 18 -> 8    [ OK ]
[11544/24046]/mnt/storage/file03:    100%  extents: 36 -> 36    [ OK ]
[11545/24046]/mnt/storage/file04:    100%  extents: 30 -> 30    [ OK ]
[11546/24046]/mnt/storage/file05:    100%  extents: 34 -> 34    [ OK ]
[11547/24046]/mnt/storage/file06:    100%  extents: 12 -> 10    [ OK ]
[11548/24046]/mnt/storage/file07:    100%  extents: 268 -> 31    [ OK ]
[11549/24046]/mnt/storage/file08:    100%  extents: 47 -> 47    [ OK ]
[11550/24046]/mnt/storage/file09:    100%  extents: 367 -> 235    [ OK ]
[11551/24046]/mnt/storage/file10:    100%  extents: 302 -> 36    [ OK ]
[11552/24046]/mnt/storage/file11:    100%  extents: 177 -> 30    [ OK ]
[11553/24046]/mnt/storage/file12:    100%  extents: 9 -> 9    [ OK ]
[11554/24046]/mnt/storage/file13:    100%  extents: 185 -> 37    [ OK ]
[11555/24046]/mnt/storage/file14:    100%  extents: 9 -> 9    [ OK ]
[11556/24046]/mnt/storage/file15:    100%  extents: 14 -> 12    [ OK ]
[11557/24046]/mnt/storage/file16:    100%  extents: 256 -> 30    [ OK ]
[11558/24046]/mnt/storage/file17:    100%  extents: 37 -> 37    [ OK ]
[11559/24046]/mnt/storage/file18:    100%  extents: 28 -> 28    [ OK ]
[11560/24046]/mnt/storage/file19:    100%  extents: 42 -> 37    [ OK ]
[11561/24046]/mnt/storage/file20:    100%  extents: 13 -> 13    [ OK ]
[11562/24046]/mnt/storage/file21:    100%  extents: 96 -> 15    [ OK ]
[11563/24046]/mnt/storage/file22:    100%  extents: 324 -> 197    [ OK ]
[11564/24046]/mnt/storage/file23:    100%  extents: 43 -> 39    [ OK ]
[11565/24046]/mnt/storage/file24:    100%  extents: 17 -> 17    [ OK ]
[11566/24046]/mnt/storage/file25:    100%  extents: 10 -> 10    [ OK ]
[11567/24046]/mnt/storage/file26:    100%  extents: 371 -> 43    [ OK ]
[11568/24046]/mnt/storage/file27:    100%  extents: 265 -> 44    [ OK ]
[11569/24046]/mnt/storage/file28:    100%  extents: 61 -> 9    [ OK ]
[11570/24046]/mnt/storage/file29:    100%  extents: 23 -> 23    [ OK ]
[11571/24046]/mnt/storage/file30:    100%  extents: 13 -> 13    [ OK ]
[11572/24046]/mnt/storage/file31:    100%  extents: 34 -> 34    [ OK ]
[11573/24046]/mnt/storage/file32:    100%  extents: 424 -> 424    [ OK ]
[11574/24046]/mnt/storage/file33:    100%  extents: 39 -> 39    [ OK ]
[11575/24046]/mnt/storage/file34:    100%  extents: 58 -> 58    [ OK ]
[11576/24046]/mnt/storage/file35:    100%  extents: 42 -> 42    [ OK ]
[11577/24046]/mnt/storage/file36:    100%  extents: 162 -> 269    [ OK ]
[11578/24046]/mnt/storage/file37:    100%  extents: 67 -> 67    [ OK ]
[11579/24046]/mnt/storage/file38:    100%  extents: 29 -> 27    [ OK ]
[11580/24046]/mnt/storage/file39:    100%  extents: 109 -> 79    [ OK ]
[11581/24046]/mnt/storage/file40:    100%  extents: 167 -> 167    [ OK ]
[11582/24046]/mnt/storage/file41:    100%  extents: 247 -> 293    [ OK ]
[11583/24046]/mnt/storage/file42:    100%  extents: 78 -> 78    [ OK ]
[11584/24046]/mnt/storage/file43:    100%  extents: 12 -> 12    [ OK ]
[11585/24046]/mnt/storage/file44:    100%  extents: 9 -> 9    [ OK ]
[11586/24046]/mnt/storage/file45:    100%  extents: 195 -> 221    [ OK ]
[11587/24046]/mnt/storage/file46:    100%  extents: 358 -> 358    [ OK ]
[11588/24046]/mnt/storage/file47:    100%  extents: 93 -> 93    [ OK ]
[11589/24046]/mnt/storage/file48:    100%  extents: 1074 -> 232    [ OK ]
[11590/24046]/mnt/storage/file49:    100%  extents: 73 -> 7    [ OK ]
[11591/24046]/mnt/storage/file50:    100%  extents: 207 -> 26    [ OK ]
[11592/24046]/mnt/storage/file51:    100%  extents: 25 -> 25    [ OK ]
[11593/24046]/mnt/storage/file52:    100%  extents: 14 -> 14    [ OK ]
[11594/24046]/mnt/storage/file53:    100%  extents: 240 -> 147    [ OK ]
[11595/24046]/mnt/storage/file54:    100%  extents: 26 -> 26    [ OK ]
[11596/24046]/mnt/storage/file55:    100%  extents: 238 -> 61    [ OK ]
[11597/24046]/mnt/storage/file56:    100%  extents: 290 -> 152    [ OK ]
[11598/24046]/mnt/storage/file57:    100%  extents: 78 -> 14    [ OK ]
[11599/24046]/mnt/storage/file58:    100%  extents: 70 -> 41    [ OK ]
[11600/24046]/mnt/storage/file59:    100%  extents: 286 -> 47    [ OK ]
[11601/24046]/mnt/storage/file60:    100%  extents: 13 -> 11    [ OK ]
[11602/24046]/mnt/storage/file61:    100%  extents: 29 -> 29    [ OK ]
[11603/24046]/mnt/storage/file62:    100%  extents: 31 -> 9    [ OK ]
[11604/24046]/mnt/storage/file63:    100%  extents: 193 -> 36    [ OK ]
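
A rough way to get an overview of a log like the one above, rather than eyeballing it, is to total the "before -> after" extent counts. This is only a sketch: the function name `summarise_e4defrag` is made up for illustration, and it assumes the per-file lines look exactly like the ones posted here.

```shell
#!/usr/bin/env bash
# Summarise e4defrag per-file lines of the form
#   [n/total]/path:    100%  extents: BEFORE -> AFTER    [ OK ]
# Reads the log on stdin and prints how many files improved, stayed the
# same or got worse, plus the total extent count before and after.
# NOTE: summarise_e4defrag is a hypothetical helper, not part of e2fsprogs.
summarise_e4defrag() {
    awk '
        /extents: [0-9]+ -> [0-9]+/ {
            # Locate the "extents:" token so paths with spaces do not matter.
            for (i = 1; i <= NF; i++) {
                if ($i == "extents:") {
                    before = $(i + 1) + 0
                    after  = $(i + 3) + 0
                }
            }
            files++
            total_before += before
            total_after  += after
            if (after < before)      improved++
            else if (after > before) worse++
            else                     same++
        }
        END {
            printf "files=%d improved=%d same=%d worse=%d extents=%d->%d\n",
                   files, improved, same, worse, total_before, total_after
        }
    '
}
```

With the output saved to a file (say `defrag.log`, a placeholder name), `summarise_e4defrag < defrag.log` prints a single summary line. Files that end up with more extents than before, such as file36 above, are counted under `worse`.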

UhClem said:
That is actually very good to hear.

I don't want to be pessimistic, but there's an ~even chance that the 8==>10 won't change anything (except capacity). In either case, just so you don't feel pressured into doing the --grow asap, it might be useful to devise a "coping plan" so that this wart has minimal impact on life-and-wife. E.g., you can minimize the frequency of shutdown/boot-up by seeing what procedures for Hibernate are available in Ubuntu.

And, we can force the stall-event to happen at the last stage of boot-up, and avoid the surprise annoyance, as currently. This might also avoid the rattling of those 5 disks, but that is just a "theory" I have. The 120 seconds of timeout is probably unavoidable. Details on this forcing can wait till later--just something to ponder in the meantime.

Next time you reboot (normally--no need to force the event), try to cause the stall with the following command:

Code:

dd if=/dev/zero of=/mnt/storage/40MBz bs=8M count=5 oflag=direct

The system is on 24/7 with a reboot once every 1-4 weeks to apply security updates, so the issue is mainly an annoyance as long as it isn't a sign of horrible things to come.

If the system can be forced to stall as part of the boot-up that would be a good last resort if the other options fail. I will give that command a try after next reboot whenever that is.

UhClem · Oct 15, 2021

Mashie said:
I may have to cancel the defrag before it is completed, though; at the current rate it will take around a month to complete.

I think you should cancel. From the e2fsck you did, /dev/md0 was only 3.x% non-contiguous, so little to be gained de-frag-wise. As for affecting/improving the stall situation, I believe this is also tangerines-vs-tomatoes.

Here is some of the output from e4defrag to show how much the extents change as part of the defrag. I have no point of reference for how many extents are normal or acceptable for 20-80GB files.

You can use

Code:

hdparm --fibmap pathname

to see the extent layout for a file.

The system is on 24/7 with a reboot once every 1-4 weeks to apply security updates, so the issue is mainly an annoyance as long as it isn't a sign of horrible things to come.

Understood, on the reboot freq. As for the "horrible" part, it doesn't feel like that, but that's just my hunch, based only on the simplistic reproducibility of the glitch and the consistently benign outcome (to date). ["Grains of salt": Long ago, when even top CS faculty had never heard of Unix, I knew the kernel, totally. But, that was then ... I retired 20+ yrs ago, and 20 yrs earlier, I stopped doing the kernel.]

Mashie · Oct 18, 2021

e4defrag started to speed up, with many files thankfully not fragmented, and it just finished:

Code:

    Success:            [ 22324/24054 ]
    Failure:            [ 1730/24054 ]
    Total extents:            256096->242202
    Fragmented percentage:         26%->22%

And it made no difference at all, as you expected. If anything it made things worse, as it will now stall for 4min30sec:

Code:

Oct 18 13:58:06 IONE kernel: [  242.648073] INFO: task jbd2/md0-8:1163 blocked for more than 120 seconds.
Oct 18 13:58:06 IONE kernel: [  242.648086]       Tainted: P           OE     5.11.0-37-generic #41~20.04.2-Ubuntu
Oct 18 13:58:06 IONE kernel: [  242.648090] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 18 13:58:06 IONE kernel: [  242.648093] task:jbd2/md0-8      state:D stack:    0 pid: 1163 ppid:     2 flags:0x00004000
Oct 18 13:58:06 IONE kernel: [  242.648101] Call Trace:
Oct 18 13:58:06 IONE kernel: [  242.648107]  __schedule+0x44c/0x8a0
Oct 18 13:58:06 IONE kernel: [  242.648116]  schedule+0x4f/0xc0
Oct 18 13:58:06 IONE kernel: [  242.648121]  jbd2_journal_commit_transaction+0x300/0x18f0
Oct 18 13:58:06 IONE kernel: [  242.648129]  ? dequeue_entity+0xd8/0x410
Oct 18 13:58:06 IONE kernel: [  242.648139]  ? wait_woken+0x80/0x80
Oct 18 13:58:06 IONE kernel: [  242.648145]  ? try_to_del_timer_sync+0x54/0x80
Oct 18 13:58:06 IONE kernel: [  242.648154]  kjournald2+0xb6/0x280
Oct 18 13:58:06 IONE kernel: [  242.648161]  ? wait_woken+0x80/0x80
Oct 18 13:58:06 IONE kernel: [  242.648165]  ? commit_timeout+0x20/0x20
Oct 18 13:58:06 IONE kernel: [  242.648171]  kthread+0x12b/0x150
Oct 18 13:58:06 IONE kernel: [  242.648179]  ? set_kthread_struct+0x40/0x40
Oct 18 13:58:06 IONE kernel: [  242.648185]  ret_from_fork+0x22/0x30
Oct 18 13:58:06 IONE kernel: [  242.648218] INFO: task pool-Thunar:4737 blocked for more than 120 seconds.
Oct 18 13:58:06 IONE kernel: [  242.648223]       Tainted: P           OE     5.11.0-37-generic #41~20.04.2-Ubuntu
Oct 18 13:58:06 IONE kernel: [  242.648226] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 18 13:58:06 IONE kernel: [  242.648228] task:pool-Thunar     state:D stack:    0 pid: 4737 ppid:  2576 flags:0x00000000
Oct 18 13:58:06 IONE kernel: [  242.648234] Call Trace:
Oct 18 13:58:06 IONE kernel: [  242.648236]  __schedule+0x44c/0x8a0
Oct 18 13:58:06 IONE kernel: [  242.648240]  ? __mod_memcg_lruvec_state+0x25/0xe0
Oct 18 13:58:06 IONE kernel: [  242.648252]  schedule+0x4f/0xc0
Oct 18 13:58:06 IONE kernel: [  242.648256]  rwsem_down_read_slowpath+0x184/0x3c0
Oct 18 13:58:06 IONE kernel: [  242.648264]  down_read+0x43/0xa0
Oct 18 13:58:06 IONE kernel: [  242.648269]  ext4_da_map_blocks.constprop.0+0x2dc/0x380
Oct 18 13:58:06 IONE kernel: [  242.648276]  ext4_da_get_block_prep+0x55/0xe0
Oct 18 13:58:06 IONE kernel: [  242.648281]  ext4_block_write_begin+0x14a/0x530
Oct 18 13:58:06 IONE kernel: [  242.648285]  ? ext4_da_map_blocks.constprop.0+0x380/0x380
Oct 18 13:58:06 IONE kernel: [  242.648290]  ? __ext4_journal_start_sb+0x106/0x120
Oct 18 13:58:06 IONE kernel: [  242.648297]  ext4_da_write_begin+0x1de/0x460
Oct 18 13:58:06 IONE kernel: [  242.648303]  generic_perform_write+0xc2/0x1c0
Oct 18 13:58:06 IONE kernel: [  242.648314]  ext4_buffered_write_iter+0x98/0x150
Oct 18 13:58:06 IONE kernel: [  242.648321]  ext4_file_write_iter+0x53/0x220
Oct 18 13:58:06 IONE kernel: [  242.648326]  ? common_file_perm+0x72/0x170
Oct 18 13:58:06 IONE kernel: [  242.648335]  do_iter_readv_writev+0x152/0x1b0
Oct 18 13:58:06 IONE kernel: [  242.648343]  do_iter_write+0x88/0x1c0
Oct 18 13:58:06 IONE kernel: [  242.648350]  vfs_iter_write+0x19/0x30
Oct 18 13:58:06 IONE kernel: [  242.648356]  iter_file_splice_write+0x276/0x3c0
Oct 18 13:58:06 IONE kernel: [  242.648365]  do_splice_from+0x21/0x40
Oct 18 13:58:06 IONE kernel: [  242.648371]  do_splice+0x2e8/0x650
Oct 18 13:58:06 IONE kernel: [  242.648377]  __do_splice+0xde/0x160
Oct 18 13:58:06 IONE kernel: [  242.648383]  __x64_sys_splice+0x99/0x110
Oct 18 13:58:06 IONE kernel: [  242.648389]  do_syscall_64+0x38/0x90
Oct 18 13:58:06 IONE kernel: [  242.648394]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Oct 18 13:58:06 IONE kernel: [  242.648401] RIP: 0033:0x7faa74c4a7f3
Oct 18 13:58:06 IONE kernel: [  242.648406] RSP: 002b:00007faa71adc700 EFLAGS: 00000293 ORIG_RAX: 0000000000000113
Oct 18 13:58:06 IONE kernel: [  242.648411] RAX: ffffffffffffffda RBX: 0000000000100000 RCX: 00007faa74c4a7f3
Oct 18 13:58:06 IONE kernel: [  242.648414] RDX: 0000000000000016 RSI: 0000000000000000 RDI: 0000000000000017
Oct 18 13:58:06 IONE kernel: [  242.648417] RBP: 0000000000000000 R08: 0000000000100000 R09: 0000000000000004
Oct 18 13:58:06 IONE kernel: [  242.648420] R10: 00007faa71adc840 R11: 0000000000000293 R12: 0000000000000016
Oct 18 13:58:06 IONE kernel: [  242.648423] R13: 0000000000000000 R14: 0000000000000017 R15: 00007faa71adc850
Oct 18 14:00:07 IONE kernel: [  363.478661] INFO: task jbd2/md0-8:1163 blocked for more than 241 seconds.
Oct 18 14:00:07 IONE kernel: [  363.478673]       Tainted: P           OE     5.11.0-37-generic #41~20.04.2-Ubuntu
Oct 18 14:00:07 IONE kernel: [  363.478677] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 18 14:00:07 IONE kernel: [  363.478679] task:jbd2/md0-8      state:D stack:    0 pid: 1163 ppid:     2 flags:0x00004000
Oct 18 14:00:07 IONE kernel: [  363.478688] Call Trace:
Oct 18 14:00:07 IONE kernel: [  363.478694]  __schedule+0x44c/0x8a0
Oct 18 14:00:07 IONE kernel: [  363.478703]  schedule+0x4f/0xc0
Oct 18 14:00:07 IONE kernel: [  363.478707]  jbd2_journal_commit_transaction+0x300/0x18f0
Oct 18 14:00:07 IONE kernel: [  363.478715]  ? dequeue_entity+0xd8/0x410
Oct 18 14:00:07 IONE kernel: [  363.478725]  ? wait_woken+0x80/0x80
Oct 18 14:00:07 IONE kernel: [  363.478732]  ? try_to_del_timer_sync+0x54/0x80
Oct 18 14:00:07 IONE kernel: [  363.478741]  kjournald2+0xb6/0x280
Oct 18 14:00:07 IONE kernel: [  363.478748]  ? wait_woken+0x80/0x80
Oct 18 14:00:07 IONE kernel: [  363.478752]  ? commit_timeout+0x20/0x20
Oct 18 14:00:07 IONE kernel: [  363.478758]  kthread+0x12b/0x150
Oct 18 14:00:07 IONE kernel: [  363.478766]  ? set_kthread_struct+0x40/0x40
Oct 18 14:00:07 IONE kernel: [  363.478773]  ret_from_fork+0x22/0x30
Oct 18 14:00:07 IONE kernel: [  363.478804] INFO: task pool-Thunar:4737 blocked for more than 241 seconds.
Oct 18 14:00:07 IONE kernel: [  363.478809]       Tainted: P           OE     5.11.0-37-generic #41~20.04.2-Ubuntu
Oct 18 14:00:07 IONE kernel: [  363.478812] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 18 14:00:07 IONE kernel: [  363.478814] task:pool-Thunar     state:D stack:    0 pid: 4737 ppid:  2576 flags:0x00000000
Oct 18 14:00:07 IONE kernel: [  363.478820] Call Trace:
Oct 18 14:00:07 IONE kernel: [  363.478823]  __schedule+0x44c/0x8a0
Oct 18 14:00:07 IONE kernel: [  363.478827]  ? __mod_memcg_lruvec_state+0x25/0xe0
Oct 18 14:00:07 IONE kernel: [  363.478839]  schedule+0x4f/0xc0
Oct 18 14:00:07 IONE kernel: [  363.478842]  rwsem_down_read_slowpath+0x184/0x3c0
Oct 18 14:00:07 IONE kernel: [  363.478851]  down_read+0x43/0xa0
Oct 18 14:00:07 IONE kernel: [  363.478856]  ext4_da_map_blocks.constprop.0+0x2dc/0x380
Oct 18 14:00:07 IONE kernel: [  363.478863]  ext4_da_get_block_prep+0x55/0xe0
Oct 18 14:00:07 IONE kernel: [  363.478868]  ext4_block_write_begin+0x14a/0x530
Oct 18 14:00:07 IONE kernel: [  363.478872]  ? ext4_da_map_blocks.constprop.0+0x380/0x380
Oct 18 14:00:07 IONE kernel: [  363.478877]  ? __ext4_journal_start_sb+0x106/0x120
Oct 18 14:00:07 IONE kernel: [  363.478884]  ext4_da_write_begin+0x1de/0x460
Oct 18 14:00:07 IONE kernel: [  363.478890]  generic_perform_write+0xc2/0x1c0
Oct 18 14:00:07 IONE kernel: [  363.478901]  ext4_buffered_write_iter+0x98/0x150
Oct 18 14:00:07 IONE kernel: [  363.478908]  ext4_file_write_iter+0x53/0x220
Oct 18 14:00:07 IONE kernel: [  363.478914]  ? common_file_perm+0x72/0x170
Oct 18 14:00:07 IONE kernel: [  363.478923]  do_iter_readv_writev+0x152/0x1b0
Oct 18 14:00:07 IONE kernel: [  363.478932]  do_iter_write+0x88/0x1c0
Oct 18 14:00:07 IONE kernel: [  363.478938]  vfs_iter_write+0x19/0x30
Oct 18 14:00:07 IONE kernel: [  363.478944]  iter_file_splice_write+0x276/0x3c0
Oct 18 14:00:07 IONE kernel: [  363.478954]  do_splice_from+0x21/0x40
Oct 18 14:00:07 IONE kernel: [  363.478960]  do_splice+0x2e8/0x650
Oct 18 14:00:07 IONE kernel: [  363.478966]  __do_splice+0xde/0x160
Oct 18 14:00:07 IONE kernel: [  363.478972]  __x64_sys_splice+0x99/0x110
Oct 18 14:00:07 IONE kernel: [  363.478978]  do_syscall_64+0x38/0x90
Oct 18 14:00:07 IONE kernel: [  363.478983]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Oct 18 14:00:07 IONE kernel: [  363.478990] RIP: 0033:0x7faa74c4a7f3
Oct 18 14:00:07 IONE kernel: [  363.478995] RSP: 002b:00007faa71adc700 EFLAGS: 00000293 ORIG_RAX: 0000000000000113
Oct 18 14:00:07 IONE kernel: [  363.479000] RAX: ffffffffffffffda RBX: 0000000000100000 RCX: 00007faa74c4a7f3
Oct 18 14:00:07 IONE kernel: [  363.479004] RDX: 0000000000000016 RSI: 0000000000000000 RDI: 0000000000000017
Oct 18 14:00:07 IONE kernel: [  363.479006] RBP: 0000000000000000 R08: 0000000000100000 R09: 0000000000000004
Oct 18 14:00:07 IONE kernel: [  363.479009] R10: 00007faa71adc840 R11: 0000000000000293 R12: 0000000000000016
Oct 18 14:00:07 IONE kernel: [  363.479012] R13: 0000000000000000 R14: 0000000000000017 R15: 00007faa71adc850

Stephan · Oct 18, 2021

Can you transplant the drives with controller to a different mainboard? Even just for testing, without case, spread out on a table. These kernel warnings are concerning and should never happen like this.

Also it is high time to back up whatever is on the raid, in case the ext4 on that raid blows up. Make sure you have MD5 checksums of all files, just in case something runs even more amok and starts spraying junk all over the array. If things are too big, maybe prioritize and copy the most important stuff off to a 14-16 TB USB3 drive like WD Book. Just to play safe and prevent tears.
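
The checksum advice could be carried out along these lines. This is a minimal sketch, assuming GNU coreutils and findutils (`sort -z`, `xargs -0 -r`, `md5sum --quiet`) as shipped with Ubuntu; the two function names are hypothetical, and the manifest should be written somewhere off the array.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for keeping MD5 checksums of everything under a
# directory. Assumes GNU find/sort/xargs/md5sum.

# make_md5_manifest DIR MANIFEST
#   Writes one "hash  ./relative/path" line per regular file under DIR.
#   MANIFEST should live outside DIR (ideally on another disk).
make_md5_manifest() {
    ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r md5sum ) > "$2"
}

# verify_md5_manifest DIR MANIFEST
#   Re-hashes the files under DIR and compares them with MANIFEST.
#   Prints only mismatches; returns non-zero if any file differs or is missing.
verify_md5_manifest() {
    ( cd "$1" && md5sum --quiet -c - ) < "$2"
}
```

For example, `make_md5_manifest /mnt/storage ~/storage.md5` followed later by `verify_md5_manifest /mnt/storage ~/storage.md5` (the manifest path is a placeholder). On an array this size both steps read every file, so expect them to take a long time.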

Mashie · Oct 19, 2021

Stephan said:
Can you transplant the drives with controller to a different mainboard? Even just for testing, without case, spread out on a table. These kernel warnings are concerning and should never happen like this.

Also it is high time to back up whatever is on the raid, in case the ext4 on that raid blows up. Make sure you have MD5 checksums of all files, just in case something runs even more amok and starts spraying junk all over the array. If things are too big, maybe prioritize and copy the most important stuff off to a 14-16 TB USB3 drive like WD Book. Just to play safe and prevent tears.

I originally had the array on the on-board SATA controllers, and moving to the LSI 9305 was one attempt to rule the motherboard out. I don't have a spare motherboard/CPU to try with.

The most important bits I have on Google Drive already so if the rest is lost it is mainly a massive inconvenience.

Mashie · Oct 19, 2021

UhClem said:
Next time you reboot (normally--no need to force the event), try to cause the stall with the following command:

Code:

dd if=/dev/zero of=/mnt/storage/40MBz bs=8M count=5 oflag=direct

This worked perfectly well for triggering the stall.

UhClem · Oct 19, 2021

Mashie said:
This worked perfectly well for triggering the stall.

Good. [the intent was to have a minimal "provoker" not involving foreign actor (thunar) or device (nvme)]

I originally had the array use the on-board SATA controllers and moving to the LSI 9305 was one attempt to rule the motherboard out.

I think that does rule out the mobo--since the stall occurs with only the on-board SATAs (all on the C612 chipset), and (separately) with only the 9305 (on a CPU-PCIe slot).
[A bad memory location as culprit is effectively eliminated, since stall occurs with 2 different kernel versions.]

[ ... waiting on the --grow from 10==>N ... ]

Mashie · Oct 21, 2021

UhClem said:
[ ... waiting on the --grow from 10==>N ... ]

And off we go, the reshaping should be done by Sunday evening at the current speed:

Code:

mashie@IONE:~$ sudo mdadm /dev/md0 --add /dev/sdh1
mdadm: added /dev/sdh1
mashie@IONE:~$ sudo mdadm /dev/md0 --add /dev/sdk1
mdadm: added /dev/sdk1
mashie@IONE:~$ sudo mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Sun Jun 30 22:27:54 2019
        Raid Level : raid6
        Array Size : 78129610752 (74510.20 GiB 80004.72 GB)
     Used Dev Size : 9766201344 (9313.78 GiB 10000.59 GB)
      Raid Devices : 10
     Total Devices : 12
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Thu Oct 21 23:46:53 2021
             State : clean 
    Active Devices : 10
   Working Devices : 12
    Failed Devices : 0
     Spare Devices : 2

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : IONE:0  (local to host IONE)
              UUID : 1f8e4385:3ef16ed6:20147617:818d417e
            Events : 144463

    Number   Major   Minor   RaidDevice State
       0       8       65        0      active sync   /dev/sde1
       1       8       81        1      active sync   /dev/sdf1
       2       8       97        2      active sync   /dev/sdg1
       3       8       17        3      active sync   /dev/sdb1
       4       8       33        4      active sync   /dev/sdc1
       6       8      177        5      active sync   /dev/sdl1
       5       8       49        6      active sync   /dev/sdd1
       7       8      193        7      active sync   /dev/sdm1
       9       8      129        8      active sync   /dev/sdi1
       8       8      145        9      active sync   /dev/sdj1

      10       8      113        -      spare   /dev/sdh1
      11       8      161        -      spare   /dev/sdk1
mashie@IONE:~$ sudo mdadm --grow --raid-devices=12 --backup-file=/root/md0_grow.bak /dev/md0
mdadm: Need to backup 20480K of critical section..
mashie@IONE:~$
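
While a reshape like this runs, the kernel reports progress in /proc/mdstat. The sketch below pulls out just the percentage, estimated finish time and speed. It assumes the usual progress line layout (`reshape = N% (done/total) finish=...min speed=...K/sec`) and array names of the form `mdN`; `reshape_status` is a hypothetical name.

```shell
#!/usr/bin/env bash
# reshape_status [FILE]
#   Prints "<array> <percent> <finish> <speed>" for every array that is
#   currently reshaping, resyncing or recovering.
#   FILE defaults to /proc/mdstat; passing a saved copy also works.
# NOTE: hypothetical helper; assumes the standard mdstat progress line.
reshape_status() {
    awk '
        /^md[0-9]+ :/ { dev = $1 }          # remember which array we are in
        /(reshape|resync|recovery) *=/ {
            pct = ""; eta = ""; speed = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/)       pct = $i
                if ($i ~ /^finish=/) { eta = $i;   sub(/^finish=/, "", eta) }
                if ($i ~ /^speed=/)  { speed = $i; sub(/^speed=/, "", speed) }
            }
            print dev, pct, eta, speed
        }
    ' "${1:-/proc/mdstat}"
}
```

Running `watch -n 60 cat /proc/mdstat` shows the same information unfiltered.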

lpallard · Oct 24, 2021

I'm very late to the game but a while back (about 8 years ago) I had to give up mdadm for a storage server because of constant kernel panics. If I remember correctly this was due to a severe bug in mdadm for some kernel series... First thing you should try is to get a cheap IBM M1015 and flash it to IT mode. If you're using consumer grade hardware, never eliminate the possibility of some firmware or BIOS issues and quality issues. I'll try to get some details of what happened to me and post back if I find anything relevant.

Mashie · Oct 24, 2021

lpallard said:
I'm very late to the game but a while back (about 8 years ago) I had to give up mdadm for a storage server because of constant kernel panics. If I remember correctly this was due to a severe bug in mdadm for some kernel series... First thing you should try is to get a cheap IBM M1015 and flash it to IT mode. If you're using consumer grade hardware, never eliminate the possibility of some firmware or BIOS issues and quality issues. I'll try to get some details of what happened to me and post back if I find anything relevant.

Thanks, any info about MDADM issues is welcome.

I'm already on workstation hardware (E5-1650 v3 Xeon, ECC RAM and LSI 9305-24i controller).

Mashie · Nov 1, 2021

UhClem said:
[ ... waiting on the --grow from 10==>N ... ]

I grew successfully from 10 -> 12 and expanded the file system which took quite a while.
At this stage the stall would still happen after a reboot. Something had changed, though: it no longer did the heavy reading/seeking on just 5 specific drives, but read from all 12 drives without much of the very noisy seeking. The stall, however, now lasts just over 6 minutes.

As no particular drive was standing out at this point I expanded from 12 -> 14 and things neither improved nor degraded further.

This is the output from triggering the stalls now:

Code:

mashie@IONE:~$ dd if=/dev/zero of=/mnt/storage/40MBz bs=8M count=5 oflag=direct
5+0 records in
5+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 553.211 s, 75.8 kB/s
mashie@IONE:~$
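
For reference, the rate dd prints is simply bytes divided by elapsed seconds, in units of 1000 bytes. A small sketch to redo that arithmetic for any run (`dd_rate_kb` is a made-up helper name):

```shell
#!/usr/bin/env bash
# dd_rate_kb BYTES SECONDS
#   Prints the average throughput in kB/s (1 kB = 1000 bytes, the unit dd
#   uses in its summary line), rounded to one decimal place.
# NOTE: hypothetical helper, only here to show the arithmetic.
dd_rate_kb() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", bytes / secs / 1000 }'
}
```

`dd_rate_kb 41943040 553.211` prints `75.8`, matching the figure above: a 40 MiB direct write took over nine minutes.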

And the usual entries in syslog:

Code:

Nov  1 09:07:31 IONE kernel: [  363.766976] INFO: task jbd2/md0-8:1217 blocked for more than 120 seconds.
Nov  1 09:07:31 IONE kernel: [  363.766988]       Tainted: P           OE     5.11.0-38-generic #42~20.04.1-Ubuntu
Nov  1 09:07:31 IONE kernel: [  363.766992] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov  1 09:07:31 IONE kernel: [  363.766995] task:jbd2/md0-8      state:D stack:    0 pid: 1217 ppid:     2 flags:0x00004000
Nov  1 09:07:31 IONE kernel: [  363.767003] Call Trace:
Nov  1 09:07:31 IONE kernel: [  363.767009]  __schedule+0x44c/0x8a0
Nov  1 09:07:31 IONE kernel: [  363.767021]  schedule+0x4f/0xc0
Nov  1 09:07:31 IONE kernel: [  363.767028]  jbd2_journal_commit_transaction+0x300/0x18f0
Nov  1 09:07:31 IONE kernel: [  363.767038]  ? dequeue_entity+0xd8/0x410
Nov  1 09:07:31 IONE kernel: [  363.767047]  ? wait_woken+0x80/0x80
Nov  1 09:07:31 IONE kernel: [  363.767053]  ? try_to_del_timer_sync+0x54/0x80
Nov  1 09:07:31 IONE kernel: [  363.767062]  kjournald2+0xb6/0x280
Nov  1 09:07:31 IONE kernel: [  363.767069]  ? wait_woken+0x80/0x80
Nov  1 09:07:31 IONE kernel: [  363.767073]  ? commit_timeout+0x20/0x20
Nov  1 09:07:31 IONE kernel: [  363.767078]  kthread+0x12b/0x150
Nov  1 09:07:31 IONE kernel: [  363.767086]  ? set_kthread_struct+0x40/0x40
Nov  1 09:07:31 IONE kernel: [  363.767093]  ret_from_fork+0x22/0x30
Nov  1 09:09:32 IONE kernel: [  484.597705] INFO: task jbd2/md0-8:1217 blocked for more than 241 seconds.
Nov  1 09:09:32 IONE kernel: [  484.597712]       Tainted: P           OE     5.11.0-38-generic #42~20.04.1-Ubuntu
Nov  1 09:09:32 IONE kernel: [  484.597713] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov  1 09:09:32 IONE kernel: [  484.597714] task:jbd2/md0-8      state:D stack:    0 pid: 1217 ppid:     2 flags:0x00004000
Nov  1 09:09:32 IONE kernel: [  484.597717] Call Trace:
Nov  1 09:09:32 IONE kernel: [  484.597722]  __schedule+0x44c/0x8a0
Nov  1 09:09:32 IONE kernel: [  484.597727]  schedule+0x4f/0xc0
Nov  1 09:09:32 IONE kernel: [  484.597729]  jbd2_journal_commit_transaction+0x300/0x18f0
Nov  1 09:09:32 IONE kernel: [  484.597734]  ? dequeue_entity+0xd8/0x410
Nov  1 09:09:32 IONE kernel: [  484.597739]  ? wait_woken+0x80/0x80
Nov  1 09:09:32 IONE kernel: [  484.597742]  ? try_to_del_timer_sync+0x54/0x80
Nov  1 09:09:32 IONE kernel: [  484.597746]  kjournald2+0xb6/0x280
Nov  1 09:09:32 IONE kernel: [  484.597750]  ? wait_woken+0x80/0x80
Nov  1 09:09:32 IONE kernel: [  484.597752]  ? commit_timeout+0x20/0x20
Nov  1 09:09:32 IONE kernel: [  484.597754]  kthread+0x12b/0x150
Nov  1 09:09:32 IONE kernel: [  484.597758]  ? set_kthread_struct+0x40/0x40
Nov  1 09:09:32 IONE kernel: [  484.597760]  ret_from_fork+0x22/0x30
Nov  1 09:11:33 IONE kernel: [  605.427075] INFO: task jbd2/md0-8:1217 blocked for more than 362 seconds.
Nov  1 09:11:33 IONE kernel: [  605.427086]       Tainted: P           OE     5.11.0-38-generic #42~20.04.1-Ubuntu
Nov  1 09:11:33 IONE kernel: [  605.427090] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov  1 09:11:33 IONE kernel: [  605.427093] task:jbd2/md0-8      state:D stack:    0 pid: 1217 ppid:     2 flags:0x00004000
Nov  1 09:11:33 IONE kernel: [  605.427101] Call Trace:
Nov  1 09:11:33 IONE kernel: [  605.427107]  __schedule+0x44c/0x8a0
Nov  1 09:11:33 IONE kernel: [  605.427118]  schedule+0x4f/0xc0
Nov  1 09:11:33 IONE kernel: [  605.427122]  jbd2_journal_commit_transaction+0x300/0x18f0
Nov  1 09:11:33 IONE kernel: [  605.427130]  ? dequeue_entity+0xd8/0x410
Nov  1 09:11:33 IONE kernel: [  605.427140]  ? wait_woken+0x80/0x80
Nov  1 09:11:33 IONE kernel: [  605.427147]  ? try_to_del_timer_sync+0x54/0x80
Nov  1 09:11:33 IONE kernel: [  605.427156]  kjournald2+0xb6/0x280
Nov  1 09:11:33 IONE kernel: [  605.427163]  ? wait_woken+0x80/0x80
Nov  1 09:11:33 IONE kernel: [  605.427167]  ? commit_timeout+0x20/0x20
Nov  1 09:11:33 IONE kernel: [  605.427173]  kthread+0x12b/0x150
Nov  1 09:11:33 IONE kernel: [  605.427181]  ? set_kthread_struct+0x40/0x40
Nov  1 09:11:33 IONE kernel: [  605.427188]  ret_from_fork+0x22/0x30
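
The hung-task reports repeat roughly every 120 seconds with a growing "blocked for more than N seconds" figure, so the longest stall per task can be read straight off the log. A minimal sketch, assuming the message format shown above and task names without spaces; `hung_task_summary` is a hypothetical name.

```shell
#!/usr/bin/env bash
# hung_task_summary
#   Reads kernel log lines on stdin and prints "<task:pid> <seconds>" with
#   the largest "blocked for more than N seconds" value seen for each task.
# NOTE: hypothetical helper; assumes task names contain no spaces.
hung_task_summary() {
    awk '
        /INFO: task .* blocked for more than [0-9]+ seconds/ {
            for (i = 2; i <= NF; i++) {
                if ($i == "task" && $(i - 1) == "INFO:") id = $(i + 1)
                if ($i == "than") secs = $(i + 1) + 0
            }
            if (secs > max[id]) max[id] = secs
        }
        END { for (id in max) print id, max[id] }
    ' | sort
}
```

Usage would be something like `grep 'blocked for more than' /var/log/syslog | hung_task_summary`, or piping `journalctl -k` into it.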

Goose · Nov 15, 2021

How about booting with a livecd and then mounting the drive to see if the issue still occurs?

Just because your MD array is affected doesn't mean it's the cause... I think it's likely that another service is doing something that causes the drives to be busy.

EDIT:
If the issue still occurs with the livecd, then it's likely that one of your disks/disk paths is unhealthy. Try doing a long SMART self-test or a non-destructive badblocks run to see what's happening. I suppose you could also use WD's disk test tool. It will tell you if the drive(s) have reassigned sectors.

UhClem · Nov 15, 2021

@Mashie , pardon my sloth -- kept getting bogged down trying to write a script that would have the best chance of exposing real/useful info.
... meanwhile ... Kudos to @Goose :

Just because your MD array is affected doesn't mean it's the cause... I think it's likely that another service is doing something that causes the drives to be busy.

I've had this suspicion ever since seeing:

Code:

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
md0             105.00       420.00         0.00         0.00        420          0          0
nvme0n1           0.00         0.00         0.00         0.00          0          0          0
sda              21.00        84.00         0.00         0.00         84          0          0    (disk 7)
sdb              21.00        84.00         0.00         0.00         84          0          0    (disk 1)
sdc               0.00         0.00         0.00         0.00          0          0          0
sdd              21.00        84.00         0.00         0.00         84          0          0    (disk 3)
sde               0.00         0.00         0.00         0.00          0          0          0
sdf              21.00        84.00         0.00         0.00         84          0          0    (disk 9)
sdg              21.00        84.00         0.00         0.00         84          0          0    (disk 10)
sdh               0.00         0.00         0.00         0.00          0          0          0
sdi               0.00         0.00         0.00         0.00          0          0          0
sdj               0.00         0.00         0.00         0.00          0          0          0
sdk               0.00         0.00         0.00         0.00          0          0          0

4KB reads are the "tell".
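
The 4KB figure comes from dividing kB_read/s by tps (84 / 21 = 4 for each busy disk, 420 / 105 = 4 for md0). A sketch that does this for every active device in an iostat table is below. It assumes the column order shown above and that no writes are happening, since tps counts reads and writes together; `avg_read_kb` is a hypothetical name.

```shell
#!/usr/bin/env bash
# avg_read_kb
#   Reads "iostat -d"-style rows on stdin (Device tps kB_read/s ...) and
#   prints "<device> <kB per request>" for every device with tps > 0.
#   Only meaningful as a read size while kB_wrtn/s is zero.
# NOTE: hypothetical helper, assumes the column layout shown in the post.
avg_read_kb() {
    awk '$1 != "Device" && $2 + 0 > 0 { printf "%s %.1f\n", $1, $3 / $2 }'
}
```

Idle devices are skipped, so only md0 and the five busy member disks show up for the table above.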

I intended my new script to be run without Desktop/GUI cruft, or at single-user. LiveCD might be better, but it possibly introduces a new "variable" of a different OS version/image, whereas just changing runlevel (on your existing boot) would only eliminate (all the) variables introduced with Desktop.

With your newly-grown array, use:

Code:

dd if=/dev/zero of=/mnt/storage/60MBzero  bs=12M count=5 oflag=direct

to (try to) provoke.

Mashie · Nov 25, 2021

Hi @UhClem I didn't see you had replied here, I guess the email notification was lost in the ether.

That command will happily trigger the stalling.

Code:

mashie@IONE:~$ dd if=/dev/zero of=/mnt/storage/60MBzero  bs=12M count=5 oflag=direct
5+0 records in
5+0 records out
62914560 bytes (63 MB, 60 MiB) copied, 532.051 s, 118 kB/s
mashie@IONE:~$

Whichever option you think is best to get rid of the cruft, I'm happy to give it a go. The stalling is getting close to 10 minutes now, which is starting to become quite annoying.

UhClem · Nov 29, 2021

Mashie said:
Whichever option you think is best to get rid of the cruft, I'm happy to give it a go. The stalling is getting close to 10 minutes now, which is starting to become quite annoying.

After more pondering, my bet is that Thunar is the agent provocateur[**], so, rather than mucking with runlevels, etc., (please humor me) try booting with Thunar disabled/eliminated, and try that dd command. (If I'm wrong, we can resort to mucking ...)

[**] This is/would-be not a fault of Thunar (it's just user-level code); but I believe Thunar might be exposing a (soft) bug in md.

Mashie · Nov 29, 2021

UhClem said:
After more pondering, my bet is that Thunar is the agent provocateur[**], so, rather than mucking with runlevels, etc., (please humor me) try booting with Thunar disabled/eliminated, and try that dd command. (If I'm wrong, we can resort to mucking ...)

[**] This is/would-be not a fault of Thunar (it's just user-level code); but I believe Thunar might be exposing a (soft) bug in md.

What is the easiest way to disable Thunar?

UhClem · Nov 29, 2021

Mashie said:
What is the easiest way to disable Thunar?

I don't know; I am (since 1973) strictly command-line, on Unix.
[Mea culpa ... "Do as I say, not as I do."]
(...Googling...)
... maybe there is something to toggle/comment in a startup file (for Xfce?).
Speaking of which, maybe there is an (easy?) way to suppress startup of Xfce (vs runlevels, systemctl, etc).
Now, it is I who is a stranger in a strange land.

Any help here, STHers??

Goose · Dec 8, 2021

If you don't want to try a liveCD, then try booting to a lower runlevel such as 3. See What Are “Runlevels” on Linux? for info on how to do it.

That way X won't be loaded so you won't have issues with Thunar, but TBH I doubt that's the issue. It may be some form of indexing but again probably not.

Mashie · Jan 18, 2022

Just to give an update on this, today kernel 5.13.0-25 was pushed out to this system and the issue is finally gone.

MrCalvin · Jan 18, 2022

A good example that running "stable" (an older kernel) doesn't always mean you get a stable system!
RHEL comes to mind, where an old kernel is chosen to give the highest stability, but is it true? I feel there is an exaggerated trust in old kernels.
(sorry going a little off topic)
