Strange bottleneck on 12Gbps Hitachi HUSMM SSDs


whitey

Moderator
What in the heck is going on here? Here are the results of my 'insanity' check, taken right in the middle of a 40GB VM sVMotion at steady state (rough averages):

Single HUSMM
Code:
                                           capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
husmm1640                               9.17G   361G      0  3.31K      0   166M
  gptid/a9e37996-c17c-11e7-a22f-0050569a060b  9.17G   361G      0  3.31K      0   166M
--------------------------------------  -----  -----  -----  -----  -----  -----

Mirror of two HUSMMs
Code:
                                           capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
husmm1640-mirror                        4.01G   366G      0  3.38K      0   164M
  mirror                                4.01G   366G      0  3.38K      0   164M
    gptid/b24fc567-c17f-11e7-a22f-0050569a060b      -      -      0  3.25K      0   164M
    gptid/b285c529-c17f-11e7-a22f-0050569a060b      -      -      0  3.25K      0   164M
--------------------------------------  -----  -----  -----  -----  -----  -----

Striped mirror of 4 HUSMMs
Code:
                                           capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
husmm1640-stripped-mirror               21.8G   718G      0  3.69K      0   176M
  mirror                                10.9G   359G      0  1.82K      0  89.3M
    gptid/769fb787-c1a8-11e7-a22f-0050569a060b      -      -      0  1.78K      0  89.4M
    gptid/76d8d32b-c1a8-11e7-a22f-0050569a060b      -      -      0  1.78K      0  89.4M
  mirror                                10.9G   359G      0  1.87K      0  86.6M
    gptid/a21bb9ca-c1a8-11e7-a22f-0050569a060b      -      -      0  1.79K      0  86.7M
    gptid/a251ead4-c1a8-11e7-a22f-0050569a060b      -      -      0  1.79K      0  86.7M
--------------------------------------  -----  -----  -----  -----  -----  -----

Raidz of 4 HUSMMs
Code:
                                           capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
husmm1640-rz                            5.62G  1.44T      0  3.37K      0   161M
  raidz1                                5.62G  1.44T      0  3.37K      0   161M
    gptid/02c4749f-c1ac-11e7-a22f-0050569a060b      -      -      0  2.45K      0  57.0M
    gptid/02f5705d-c1ac-11e7-a22f-0050569a060b      -      -      0  2.45K      0  53.9M
    gptid/0326a0a8-c1ac-11e7-a22f-0050569a060b      -      -      0  2.45K      0  57.0M
    gptid/03586024-c1ac-11e7-a22f-0050569a060b      -      -      0  2.45K      0  53.9M
--------------------------------------  -----  -----  -----  -----  -----  -----

It just seems that no matter what I do pool config-wise (single dev, mirror, striped mirror, raidz) with these HUSMMs, which are a DAMN good drive or so I thought, I only get roughly 150-175MB/s out of them until I add a SLOG, and then it only takes things up to 300-350MB/s (another 150-175MB/s... coincidence? I think not).

BOOO at a consistent 30-50MB/s into these devices in ANY ZFS pool config when an identical SLOG device happily soaks up 175MB/s... shouldn't they ALL be able to take 175MB/s consistently? That's the theory, which would give me roughly 700MB/s of disk I/O throughput and 5+Gbps on the network (HELL, I'd take 500-600 at this point).

Anyone who can shed light on this, I 'may' owe ya a kidney/liver/both :-D
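If the theory is that sync writes over NFS are what's choking the pool w/o a SLOG, a quick-n-dirty way to check would be something like the below (dataset name is just an example, and sync=disabled is for testing only -- don't leave VM storage that way):
Code:
# see what the NFS dataset is currently set to (dataset name is an example)
zfs get sync husmm1640/nfs

# temporarily turn off sync writes and re-run the same 40GB sVMotion
zfs set sync=disabled husmm1640/nfs

# watch per-vdev write bandwidth while it runs
zpool iostat -v husmm1640 1

# put it back when done -- async NFS + power loss = trashed VMs
zfs set sync=standard husmm1640/nfs
If throughput jumps well past 150-175MB/s with sync disabled and no SLOG, the drives themselves aren't the wall, the ZIL path is.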
 


whitey

Moderator
Anyone w/ a kind heart/soul, 4 HUSMMs, and an AIO FreeNAS on a vSphere platform care to reproduce? I'll send ya some beers of your choice! :-D

EDIT: Also, anyone care to take a bet on whether things get any better perf-wise if I set it up w/ 2 or 4 HUSMMs in a striped (raid0) config? I'd bet NOT. Again, when I add another smaller HUSMM device as a SLOG, pool perf doubles. Do I need a higher-end/more kick-ass SLOG device even though my main pool devices are already good, or are the main pool devices shooting me in the foot/not as good as I thought? See the sketch below for what I mean by the striped test.
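For the record, the striped (raid0) test would just be something along these lines -- device names below are placeholders for the four HUSMMs, not my actual da numbers:
Code:
# plain 4-way stripe, no redundancy -- device names are placeholders
zpool create husmm1640-stripe da1 da2 da3 da4

# re-run the same sVMotion and see if per-device writes climb past ~50MB/s
zpool iostat -v husmm1640-stripe 1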
 

T_Minus

Build. Break. Fix. Repeat
While not your situation exactly, this may interest you -- it's from my testing last year with a 12Gb/s SAS3 SSD (only 200GB in this case).

I'm having issues with the forum updating, but here are the #s you care about. Both runs were done from Windows... one against a datastore managed directly in ESXi, the other against ZFS (napp-it) over NFS, both 'base' and with a 'basic tune'.

NFS base on E1000:
4K: 9.7MB/s read / 16.76MB/s write

NFS tuned on VMXNET3 (vSwitch MTU 9k, OmniOS MTU 9k, OmniOS buffer/packet increases):
4K: 43.53MB/s read / 18.2MB/s write <-- not exactly a good write speed
(This is with sync=off too, single SAS3 SSD)

ESXi direct datastore (no ZFS/NFS/etc.):
4K: 30.99MB/s read / 76.86MB/s write

Looking at the screenshots I have, it appears that 4K WRITE takes a HUGE hit from ZFS and likely NFS. This was on an AIO, so I'm not sure how much true network loss there was, or how much is lost in ESXi. Maybe 80%+ of the perf loss is from ZFS itself?
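For what it's worth, the 'basic tune' on the OmniOS side was roughly along these lines -- the interface name and buffer values below are just examples of the sort of thing that got changed, not exact settings:
Code:
# jumbo frames on the vmxnet3 link (the vSwitch was set to MTU 9000 in ESXi as well)
dladm set-linkprop -p mtu=9000 vmxnet3s0

# bump TCP buffer limits (example values)
ipadm set-prop -p max_buf=4194304 tcp
ipadm set-prop -p send_buf=1048576 tcp
ipadm set-prop -p recv_buf=1048576 tcp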
 

T_Minus

Build. Break. Fix. Repeat
4K @ Q32T1, tuned on VMXNET3 (same as above):
245MB/s read & 98MB/s write

4K @ Q32T1, ESXi direct (same as above):
384MB/s read & 239MB/s write

4K @ Q32T1, SSD passed through from ESXi to the VM (bare/empty drive):
340MB/s read & 309MB/s write
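If someone wants to take the Windows VM and the network out of the picture, a roughly equivalent 4K Q32T1 write run straight against the pool with fio (if it's installed) would look something like this -- the directory, size, and runtime are just example values:
Code:
# 4K random write, queue depth 32, single job -- roughly the 4K Q32T1 write case above
fio --name=4k-q32 --directory=/mnt/pool --size=4G \
    --rw=randwrite --bs=4k --iodepth=32 --numjobs=1 \
    --ioengine=posixaio --runtime=60 --time_based --group_reporting
Swap --rw=randwrite for randread to get the read side of the same test.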
 

whitey

Moderator
Just for fun I reverted back to my orig config (4-disk HUSMM raidz plus one 200GB HUSMM SLOG) and set up iSCSI just to put it side by side with NFS (I usually credit iSCSI with 10-15% perf gains, with the knowledge that it is a PITA to manage compared to NFS all in all).

Here is what happened: iSCSI initially looked like it took off, delivering 400MB/s or so, then went boom and perf dropped back to what I have been seeing with an all-HUSMM pool plus a matching-device SLOG (250-275MB/s).

Code:
                                           capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
husmm1640-rz                            68.8G  1.38T      0  4.83K      0   379M
  raidz1                                68.8G  1.38T      0  2.28K      0   118M
    gptid/b534270e-c1b2-11e7-a22f-0050569a060b      -      -      0    761      0  41.3M
    gptid/b56d33f5-c1b2-11e7-a22f-0050569a060b      -      -      0    762      0  41.2M
    gptid/b5a7da7b-c1b2-11e7-a22f-0050569a060b      -      -      0    755      0  41.3M
    gptid/b5dd79b3-c1b2-11e7-a22f-0050569a060b      -      -      0    760      0  41.3M
logs                                        -      -      -      -      -      -
  gptid/b60fd941-c1b2-11e7-a22f-0050569a060b   308M   186G      0  2.55K      0   260M
--------------------------------------  -----  -----  -----  -----  -----  -----
DROP OFF
Code:
                                           capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
husmm1640-rz                            66.2G  1.38T      0  3.74K      0   253M
  raidz1                                66.2G  1.38T      0  1.70K      0  45.1M
    gptid/b534270e-c1b2-11e7-a22f-0050569a060b      -      -      0  1.33K      0  16.2M
    gptid/b56d33f5-c1b2-11e7-a22f-0050569a060b      -      -      0  1.33K      0  15.7M
    gptid/b5a7da7b-c1b2-11e7-a22f-0050569a060b      -      -      0  1.34K      0  16.2M
    gptid/b5dd79b3-c1b2-11e7-a22f-0050569a060b      -      -      0  1.33K      0  15.7M
logs                                        -      -      -      -      -      -
  gptid/b60fd941-c1b2-11e7-a22f-0050569a060b   154M   186G      0  2.04K      0   208M
--------------------------------------  -----  -----  -----  -----  -----  -----
The devices drop to 15-20MB/s, with the SLOG picking up even more slack than in the NFS case from the initial screenshots, so the totals 'look' the same... don't let that fool you. Guess it really doesn't matter, it's more interesting than anything: if the pool devices cannot soak up writes in a regular pool, let alone with a comparable SLOG device throwing de-stages at them, eff it! WTF, I am losing my mind. I should probably leave 'good nuff' alone and come to terms with the fact that this is all the juice I can squeeze outta them with this particular workload/test case.
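Next step is probably to watch the raw disks during the drop-off to see whether the HUSMMs are actually pegged or just loafing while the SLOG soaks everything up -- something like this from the FreeNAS shell (gstat -p only shows physical disks):
Code:
# per-disk %busy and ms/write while the sVMotion runs
gstat -p

# side by side with per-vdev bandwidth on the pool
zpool iostat -v husmm1640-rz 1
If the drives sit way under 100% busy at 15-20MB/s, the limit is upstream of them (ZIL/NFS/iSCSI path), not the drives themselves.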



Kinda strange that over iSCSI the SLOG gets hit even a bit harder vs. NFS w/ the same sVMotion operation:
(screenshot attached)
 

whitey

Moderator
Guess I shouldn't b|tch too much, latency and IOPS are lookin' ok.

(screenshot attached)