I have Xubuntu 20.04 with a 10x10TB MDADM RAID 6 array.
It works great, except for the first large (>7GB) file write to the array after a reboot. The transfer hangs at a different point each time, e.g. at 2.5GB, 3.6GB, 4.3GB or 5.6GB. While it hangs, the same 5 disks are busy for 3-4 minutes before everything resumes, and every subsequent file transfer then works normally until the next reboot.
Any idea what is going on? SMART isn't showing errors on any of the disks.
Code:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sdb1[3] sdc1[0] sdd1[1] sde1[2] sdi1[9] sdk1[4] sdh1[5] sdg1[7] sdf1[8] sda1[6]
78129610752 blocks super 1.2 level 6, 512k chunk, algorithm 2 [10/10] [UUUUUUUUUU]
bitmap: 0/73 pages [0KB], 65536KB chunk
unused devices: <none>
Linux 5.4.0-88-generic (IONE) 06/10/21 _x86_64_ (12 CPU)
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.08    0.00    0.75    5.25    0.00   92.92

Device      tps  kB_read/s  kB_wrtn/s  kB_dscd/s  kB_read  kB_wrtn  kB_dscd
md0      105.00     420.00       0.00       0.00      420        0        0
nvme0n1    0.00       0.00       0.00       0.00        0        0        0
sda       21.00      84.00       0.00       0.00       84        0        0  (disk 7)
sdb       21.00      84.00       0.00       0.00       84        0        0  (disk 1)
sdc        0.00       0.00       0.00       0.00        0        0        0
sdd       21.00      84.00       0.00       0.00       84        0        0  (disk 3)
sde        0.00       0.00       0.00       0.00        0        0        0
sdf       21.00      84.00       0.00       0.00       84        0        0  (disk 9)
sdg       21.00      84.00       0.00       0.00       84        0        0  (disk 10)
sdh        0.00       0.00       0.00       0.00        0        0        0
sdi        0.00       0.00       0.00       0.00        0        0        0
sdj        0.00       0.00       0.00       0.00        0        0        0
sdk        0.00       0.00       0.00       0.00        0        0        0
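To put numbers on the iostat sample above, here is a minimal Python sketch. It was not part of the original diagnostics: the function name `busy_devices` and the embedded `SAMPLE` text are for illustration only. It assumes the device lines look like the ones above, with tps in the second column and kB_read/s and kB_wrtn/s in the third and fourth. For every device with activity it works out the average kB moved per request.

```python
# Sample copied from the iostat output above (a subset of the device lines).
SAMPLE = """\
Device tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd
md0 105.00 420.00 0.00 0.00 420 0 0
nvme0n1 0.00 0.00 0.00 0.00 0 0 0
sda 21.00 84.00 0.00 0.00 84 0 0 (disk 7)
sdb 21.00 84.00 0.00 0.00 84 0 0 (disk 1)
sdc 0.00 0.00 0.00 0.00 0 0 0
sdd 21.00 84.00 0.00 0.00 84 0 0 (disk 3)
sdf 21.00 84.00 0.00 0.00 84 0 0 (disk 9)
sdg 21.00 84.00 0.00 0.00 84 0 0 (disk 10)
"""


def busy_devices(text):
    """Return {device: (tps, avg_kb_per_request)} for devices with tps > 0.

    Assumes the column layout shown above; lines that do not fit it
    (headers, the avg-cpu block) are skipped.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # too short to be a device line
        try:
            tps, kb_read_s, kb_wrtn_s = (float(f) for f in fields[1:4])
        except ValueError:
            continue  # column header line
        if tps > 0:
            result[fields[0]] = (tps, (kb_read_s + kb_wrtn_s) / tps)
    return result


if __name__ == "__main__":
    for dev, (tps, avg_kb) in busy_devices(SAMPLE).items():
        print(f"{dev}: {tps:.0f} requests/s, {avg_kb:.1f} kB per request")
```

On the sample above this lists md0 plus the five busy disks, each at 4 kB per request, and all of it is reads (the kB_wrtn/s column is zero everywhere).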
Syslog 120s after the transfer hangs:
Code:
Oct 6 21:32:58 IONE kernel: [ 605.733369] INFO: task jbd2/md0-8:1158 blocked for more than 120 seconds.
Oct 6 21:32:58 IONE kernel: [ 605.733375] Tainted: P OE 5.4.0-88-generic #99-Ubuntu
Oct 6 21:32:58 IONE kernel: [ 605.733377] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 6 21:32:58 IONE kernel: [ 605.733380] jbd2/md0-8 D 0 1158 2 0x80004000
Oct 6 21:32:58 IONE kernel: [ 605.733383] Call Trace:
Oct 6 21:32:58 IONE kernel: [ 605.733395] __schedule+0x2e3/0x740
Oct 6 21:32:58 IONE kernel: [ 605.733402] ? __wake_up_common_lock+0x8a/0xc0
Oct 6 21:32:58 IONE kernel: [ 605.733405] schedule+0x42/0xb0
Oct 6 21:32:58 IONE kernel: [ 605.733411] jbd2_journal_commit_transaction+0x258/0x17f0
Oct 6 21:32:58 IONE kernel: [ 605.733415] ? __switch_to_asm+0x40/0x70
Oct 6 21:32:58 IONE kernel: [ 605.733416] ? __switch_to_asm+0x34/0x70
Oct 6 21:32:58 IONE kernel: [ 605.733418] ? __switch_to_asm+0x40/0x70
Oct 6 21:32:58 IONE kernel: [ 605.733420] ? __switch_to_asm+0x34/0x70
Oct 6 21:32:58 IONE kernel: [ 605.733422] ? __switch_to_asm+0x40/0x70
Oct 6 21:32:58 IONE kernel: [ 605.733424] ? __switch_to_asm+0x34/0x70
Oct 6 21:32:58 IONE kernel: [ 605.733426] ? __switch_to_asm+0x40/0x70
Oct 6 21:32:58 IONE kernel: [ 605.733428] ? __switch_to_asm+0x40/0x70
Oct 6 21:32:58 IONE kernel: [ 605.733430] ? __switch_to_asm+0x34/0x70
Oct 6 21:32:58 IONE kernel: [ 605.733432] ? __switch_to_asm+0x34/0x70
Oct 6 21:32:58 IONE kernel: [ 605.733435] ? wait_woken+0x80/0x80
Oct 6 21:32:58 IONE kernel: [ 605.733444] ? try_to_del_timer_sync+0x54/0x80
Oct 6 21:32:58 IONE kernel: [ 605.733449] kjournald2+0xb6/0x280
Oct 6 21:32:58 IONE kernel: [ 605.733452] ? wait_woken+0x80/0x80
Oct 6 21:32:58 IONE kernel: [ 605.733458] kthread+0x104/0x140
Oct 6 21:32:58 IONE kernel: [ 605.733461] ? commit_timeout+0x20/0x20
Oct 6 21:32:58 IONE kernel: [ 605.733464] ? kthread_park+0x90/0x90
Oct 6 21:32:58 IONE kernel: [ 605.733466] ret_from_fork+0x35/0x40
Oct 6 21:32:58 IONE kernel: [ 605.733497] INFO: task pool-Thunar:4920 blocked for more than 120 seconds.
Oct 6 21:32:58 IONE kernel: [ 605.733499] Tainted: P OE 5.4.0-88-generic #99-Ubuntu
Oct 6 21:32:58 IONE kernel: [ 605.733501] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 6 21:32:58 IONE kernel: [ 605.733503] pool-Thunar D 0 4920 2573 0x00000000
Oct 6 21:32:58 IONE kernel: [ 605.733505] Call Trace:
Oct 6 21:32:58 IONE kernel: [ 605.733509] __schedule+0x2e3/0x740
Oct 6 21:32:58 IONE kernel: [ 605.733513] schedule+0x42/0xb0
Oct 6 21:32:58 IONE kernel: [ 605.733518] rwsem_down_read_slowpath+0x16c/0x4a0
Oct 6 21:32:58 IONE kernel: [ 605.733523] down_read+0x85/0xa0
Oct 6 21:32:58 IONE kernel: [ 605.733527] ext4_da_map_blocks.constprop.0+0x2d3/0x380
Oct 6 21:32:58 IONE kernel: [ 605.733530] ext4_da_get_block_prep+0x55/0xe0
Oct 6 21:32:58 IONE kernel: [ 605.733533] ext4_block_write_begin+0x157/0x520
Oct 6 21:32:58 IONE kernel: [ 605.733536] ? ext4_da_map_blocks.constprop.0+0x380/0x380
Oct 6 21:32:58 IONE kernel: [ 605.733540] ? __ext4_journal_start_sb+0x69/0x120
Oct 6 21:32:58 IONE kernel: [ 605.733544] ext4_da_write_begin+0x1cf/0x460
Oct 6 21:32:58 IONE kernel: [ 605.733551] generic_perform_write+0xc2/0x1c0
Oct 6 21:32:58 IONE kernel: [ 605.733556] __generic_file_write_iter+0x107/0x1d0
Oct 6 21:32:58 IONE kernel: [ 605.733561] ext4_file_write_iter+0xb9/0x360
Oct 6 21:32:58 IONE kernel: [ 605.733565] ? common_file_perm+0x5e/0x110
Oct 6 21:32:58 IONE kernel: [ 605.733572] do_iter_readv_writev+0x14f/0x1d0
Oct 6 21:32:58 IONE kernel: [ 605.733575] do_iter_write+0x84/0x1a0
Oct 6 21:32:58 IONE kernel: [ 605.733577] vfs_iter_write+0x19/0x30
Oct 6 21:32:58 IONE kernel: [ 605.733584] iter_file_splice_write+0x24d/0x390
Oct 6 21:32:58 IONE kernel: [ 605.733588] do_splice+0x23f/0x650
Oct 6 21:32:58 IONE kernel: [ 605.733593] __x64_sys_splice+0x131/0x150
Oct 6 21:32:58 IONE kernel: [ 605.733599] do_syscall_64+0x57/0x190
Oct 6 21:32:58 IONE kernel: [ 605.733602] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Oct 6 21:32:58 IONE kernel: [ 605.733605] RIP: 0033:0x7fc61952a7f3
Oct 6 21:32:58 IONE kernel: [ 605.733613] Code: Bad RIP value.
Oct 6 21:32:58 IONE kernel: [ 605.733615] RSP: 002b:00007fc616bbd700 EFLAGS: 00000293 ORIG_RAX: 0000000000000113
Oct 6 21:32:58 IONE kernel: [ 605.733617] RAX: ffffffffffffffda RBX: 0000000000100000 RCX: 00007fc61952a7f3
Oct 6 21:32:58 IONE kernel: [ 605.733619] RDX: 0000000000000016 RSI: 0000000000000000 RDI: 0000000000000017
Oct 6 21:32:58 IONE kernel: [ 605.733620] RBP: 0000000000000000 R08: 0000000000100000 R09: 0000000000000004
Oct 6 21:32:58 IONE kernel: [ 605.733622] R10: 00007fc616bbd840 R11: 0000000000000293 R12: 0000000000000016
Oct 6 21:32:58 IONE kernel: [ 605.733623] R13: 0000000000000000 R14: 0000000000000017 R15: 00007fc616bbd850
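For anyone who wants to pull just the blocked tasks out of a long syslog like the one above, here is a small Python sketch. The function name `blocked_tasks` and the embedded sample are illustrative, and the regular expression assumes the "INFO: task NAME:PID blocked for more than N seconds." wording seen in these log lines.

```python
import re

# Three lines copied from the syslog excerpt above.
SAMPLE = """\
Oct 6 21:32:58 IONE kernel: [ 605.733369] INFO: task jbd2/md0-8:1158 blocked for more than 120 seconds.
Oct 6 21:32:58 IONE kernel: [ 605.733383] Call Trace:
Oct 6 21:32:58 IONE kernel: [ 605.733497] INFO: task pool-Thunar:4920 blocked for more than 120 seconds.
"""

# Assumed message format, taken from the log lines in this post.
HUNG_TASK = re.compile(
    r"INFO: task (.+):(\d+) blocked for more than (\d+) seconds"
)


def blocked_tasks(text):
    """Return a list of (task_name, pid, seconds) for each hung-task report."""
    found = []
    for line in text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            name, pid, seconds = match.groups()
            found.append((name, int(pid), int(seconds)))
    return found


if __name__ == "__main__":
    for name, pid, seconds in blocked_tasks(SAMPLE):
        print(f"{name} (pid {pid}) blocked for more than {seconds}s")
```

Run against the excerpt, it reports the two tasks visible in the traces: the ext4 journal thread `jbd2/md0-8` and the Thunar copy worker `pool-Thunar`.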