NetApp DS4246 with Dell Compellent controller read speeds slow


amertahir

New Member
I have a NetApp DS4246 with a Dell Compellent controller and 24 4 TB 7.2K RPM SAS drives. I connected it to a Dell R730xd with an LSI SAS9207-8E and installed TrueNAS SCALE.

After creating a pool of mirrored vdevs (6 mirrored vdevs with 3 drives each), I started testing read and write speeds using dd and file copies over SMB. I see a weird issue: if only a single file is being read from the pool and no other operations are running on it (scrub, data copy, etc.), I only get ~50 MB/s read speeds. However, if I simultaneously start another dd/SMB file copy, essentially doing multiple reads from the pool, the read speed jumps to ~700 MB/s aggregate (i.e. the first file transfer speeds up as well).
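
For reference, this is roughly the kind of test I'm running (file paths are just examples):

  # single reader: ~50 MB/s
  dd if=/mnt/tank/test1.bin of=/dev/null bs=1M status=progress
  # start a second reader in another shell and both jump to ~700 MB/s aggregate
  dd if=/mnt/tank/test2.bin of=/dev/null bs=1M status=progress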

The write speeds are unaffected: even with one write at a time, I get around 700 MB/s, which is good. It's only the reads that are weird. I tried changing the HBA to a Dell H200e (flashed to IT mode) and also switching to a NetApp IOM6 instead of the Dell Compellent, and saw the same problem.

If I have 12 or fewer drives in a pool, then read speeds are fine even for a single read operation, i.e. I get around 400 MB/s (good for 4 vdevs).



Does anyone have any ideas about what's going on?
 

acquacow

Well-Known Member
Have you set all the sysctl tunables that are recommended for 10GigE?

Have you done bidirectional iperf3 tests between the client and server?

Have you enabled SMB multichannel? Set the AIO read and write sizes?
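
For example, something like this (10.0.0.2 standing in for the NAS; values are just a starting point):

  # on the TrueNAS box
  iperf3 -s
  # on the client: forward, then reverse direction
  iperf3 -c 10.0.0.2
  iperf3 -c 10.0.0.2 -R

and for SMB, auxiliary parameters along the lines of:

  server multi channel support = yes
  aio read size = 1
  aio write size = 1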
 

amertahir

New Member
This is happening while doing 'dd if=somefile of=/dev/null bs=1M' in the TrueNAS shell, so it isn't even going through the network. The iperf3 tests and the SMB multichannel/AIO settings probably won't help, then.
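
I'm watching per-vdev behaviour during the run with something like (pool name is an example):

  zpool iostat -v tank 1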
 

amertahir

New Member
Thanks for the replies. I understand dd isn't a good benchmark tool, but I'm trying to pin down the sudden change in transfer speeds, which I see with CIFS/dd/NFS alike, between a single file transfer and multiple simultaneous ones.

I tried switching to IOM3 controllers on the NetApp DS4243 and I still get only ~50 MB/s with 18 drives configured as 3-disk mirrored vdevs. I'm beginning to suspect there's an underlying hardware issue with my disk shelf backplane, or maybe an 18-drive ZFS pool is too big for my setup (a single 4-lane 6 Gb/s link expanded out to 24 disks).
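
As a back-of-the-envelope check on that theory: each 6 Gb/s SAS lane carries about 600 MB/s after 8b/10b encoding, so a 4-lane wide port is roughly 2.4 GB/s. Even shared across all 24 disks that is still ~100 MB/s per drive, so on paper the oversubscribed link by itself shouldn't drag a single read down to 50 MB/s.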
 

amertahir

New Member
I have a NetApp 12 Gb/s quad-port HBA on the way (it will be delivered sometime next week) and I'll try that to rule out any HBA-related issues. I have another LSI quad-port 6 Gb/s HBA on the way as well. The issue could be the controller modules in the disk shelf (I've tried IOM3, IOM6 and the 6 Gb/s Dell Compellent), the cables (I tried QSFP-to-SFF-8088 with the IOM3/IOM6 and SFF-8088-to-SFF-8088 with the Dell Compellent controllers), or the disk shelf backplane (I only have access to one disk shelf, so I can't compare). I don't think the drives are at fault, since I've gotten faster transfer speeds with fewer drives per pool and I've shuffled different drives into different pool configurations. If there were a slow/failing drive, I would've seen it in the iostat output.
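
For reference, this is the kind of per-drive check I mean (on SCALE; device names are examples):

  iostat -x 1
  # a slow/failing drive normally sticks out with much higher await/%util than its mirror partners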

So, if I can't find the root cause after switching HBAs, controllers and cables, I'm just going to give up on it; I've spent too much time and money on it already.
 

amertahir

New Member
I tried using a NetApp 12 Gb/s HBA as well, on TrueNAS Core this time, with IOM6 modules installed in the back of the shelf, and I still see the same issue.

So far, I see that the read speeds drop after the drives have been up and running for several hours (>9), even if the pool has been idle the whole time. If I power cycle the disk shelf, or unplug all the drives and plug them back in, the read speeds return to normal (~700 MB/s). However, some hours later they drop again. I have two pools built from the disks in the shelf: one with six 3-drive mirrored vdevs (18 drives) and another with three 2-drive mirrored vdevs (6 drives). The smaller pool always gives high read and write speeds regardless of how long the drives have been powered up.
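
Since a power cycle fixes it, one thing I still want to rule out (just a guess at this point) is the drives dropping into a power-saving idle state after a few hours of uptime. The SAS Power Condition mode page can be dumped with sdparm to see what's enabled (device name is an example):

  sdparm --page=po /dev/sda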

The issue might be the SAS drives themselves, but the smaller pool uses the same model of drive (IBM-branded Seagate SAS SED 4 TB 7.2K RPM) and has no issue.

I ordered another DS4246, which is on the way; if that shows the same issue, then the problem is definitely in my drives.
 

amertahir

New Member
I tried with another NetApp HBA, with another machine (on TrueNAS Core this time) and a new DS4246. Even in the new enclosure I see the same issue: single-stream read speeds drop to ~50-70 MB/s after a couple of hours.

I've concluded there's something wrong with my drives. These are IBM-branded Seagate ST4000NM0043 SED drives; I reset them to factory defaults (so no encryption enabled) and formatted them to 512-byte sectors. They should work just fine, and they do work fine individually, and even in small pools (<=6 drives).


I'll try a new set of drives (WD-branded SAS drives this time) and hopefully I won't see the issue again; the IBM custom firmware is probably doing something wonky under ZFS.
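
For anyone else with these drives, the reformat to 512-byte sectors was done with sg_format from sg3_utils, roughly like this (device name is an example, and it wipes the drive):

  sg_format --format --size=512 /dev/sdX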
 

lyoth

New Member
Any update on this? I'm experiencing similar problems as well.
 

Stephan

Well-Known Member
If you can and pool is not in use yet, try setup the ZFS pool with a ZIL device. Like some small Octane drive or an Intel DC P3700. 50-100 GB is plenty. Check if synchronous writes improve. Check write caches on drives, invert setting. Shouldn't matter but worth a shot. Check single drive write speeds without any pool. dd is a blunt tool, I like fio.