whitey's FreeNAS ZFS ZIL testing


whitey

Moderator
@T_Minus my working dataset IS sync=always because I am using NFS for my VMware datastores. I ran 3 VMs, each on a separate host of my 3-node vSphere cluster, totaling 180GB of data, so I KNOW I am WAY outside of my memory bounds (ARC cache), with no L2ARC in the mix. I DID try FreeNAS iSCSI volumes and forcing sync=always, and that did indeed force synchronous writes to the SLOG in that mode (you can see it live by turning sync on and off during an sVMotion on that iSCSI zvol), whereas by default iSCSI will go async and you will see NO SLOG usage. That's what I really love about NFS: VMware/virtualization on NFS is nearly the perfect use case for a SLOG.
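
(For reference, a minimal sketch of the knobs involved here; the pool/dataset names are placeholders, not the actual layout in this thread.)

Code:
# Force sync writes on the dataset or zvol backing the datastore
# ("tank/vmstore" is a placeholder name)
zfs set sync=always tank/vmstore

# Watch per-vdev activity; writes landing under the "logs" vdev
# confirm the SLOG is actually being used
zpool iostat -v tank 1

# Back to the default; on an iSCSI zvol this drops writes back to
# async and the SLOG traffic disappears, as described above
zfs set sync=standard tank/vmstore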

What is odd, and what I can't seem to wrap my damn head around, is this: two more HUSMM 200GB SAS3 drives showed up today as new SLOG devices. I added them to my 4-device HUSMM 400GB raidz pool as a striped SLOG and immediately saw writes distributed evenly to each SLOG device, but the overall throughput to the SLOG was the same whether it had one device or two (about 200MB/sec). Creepy to see one device happily soaking up 200MB/sec, thinking you're going to add another striped SLOG device and watch log performance double, and instead each device drops to half of what one device was doing... TOTALLY a wtf moment.
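
(For reference, a sketch of how a second log device ends up striped rather than mirrored; pool and device names are placeholders.)

Code:
# Listing two devices after a single "log" keyword adds them as
# separate (striped) log vdevs; "log mirror da10 da11" would mirror them instead
zpool add tank log da10 da11

# Both should show up under "logs", and iostat shows how writes split
zpool status tank
zpool iostat -v tank 1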

EDIT: I gotta say, while the P3700 is the SLOG king IMHO among what I could get my grubby hands on, a 400GB P3700 is not cheap. With an $80-100 200GB HUSMM SAS3 device providing 200MB/sec of SLOG write throughput vs. 300MB/sec for a P3700, the cost difference is staggering for the bang for the buck you get (or don't).

My 2 cents. Here's hoping the Intel Optane DC P4800X test is out of this world. Has anyone else tested one of those yet as a SLOG with a similar setup/config/use case?
 

T_Minus

Build. Break. Fix. Repeat
@whitey I'm not saying your testing is non-sync. I'm saying we need to make sure others' testing is done with sync=always and that the results are taken at steady state, not simply after a minute or two of transfer.

We all know even consumer drives are fast for a few minutes, then they can't keep up and drop off.

What I'm saying is that a 100GB transfer with sync=always is extremely unlikely to put anything but the cheapest drive into steady state.

Just as other review sites and companies (Intel specifically comes to mind) benchmark SSD performance, the same must be done (especially) for a SLOG... testing must happen after the drive has reached steady state.

Intel SSD DC S3500 Review (480GB): Part 1

S3500 480GB - look at the numbers at 400 seconds, then look at 1000 and 1400 seconds as things level out.

Just because a SLOG device can handle one minute of sustained usage doesn't mean it's the best choice for constant usage, is what I'm trying to say in way too many words ;) lol.


Would love to see a SLOG test that records data points and goes out to 1500 or 2000 seconds of transfer (a rough capture sketch is below).

Thoughts?

Also, how many GBs or TBs have you pushed through so far?
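
(For reference, a minimal sketch of a long-run capture along those lines; the pool name and output file are placeholders.)

Code:
# ~2000 one-second samples, each iteration prefixed with a timestamp (-T d),
# logged to a file so the drop-off into steady state can be plotted afterwards
zpool iostat -v -T d tank 1 2000 > slog_steady_state.log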
 

marcoi

Well-Known Member
Any updates on the Intel Optane DC P4800X testing?
Looks like the Optane 900P is rolling out shortly. I'm wondering if that might be a good SLOG as well?
 

Patrick

Administrator
Staff member
Any updates on the Intel Optane DC P4800X testing?
Looks like the Optane 900P is rolling out shortly. I'm wondering if that might be a good SLOG as well?
Yeah, I put a bad SSD in there yesterday that is throwing fits.
 

Patrick

Administrator
Staff member
Any updates?
@BackupProphet was fixing the box yesterday under FreeBSD 11.1.

The machine, I believe, has an S3610, S3700, P3700, Optane P4800X, and a Toshiba PX02 SAS3 SSD in it. I also installed 3x 15K RPM SAS3 hard drives for a storage pool. The next step is to change this to CentOS. I may add a few more SAS3 SSDs in there in the meantime.
 

whitey

Moderator
Yeah, we can put a fork in this one for this thread. I think I am satisfied, having seen recent results for my use case with what I have on hand. Interested to see the results of the ZLOG testing thread.
 

marcoi

Well-Known Member
@BackupProphet was fixing the box yesterday under FreeBSD 11.1.

The machine, I believe, has an S3610, S3700, P3700, Optane P4800X, and a Toshiba PX02 SAS3 SSD in it. I also installed 3x 15K RPM SAS3 hard drives for a storage pool. The next step is to change this to CentOS. I may add a few more SAS3 SSDs in there in the meantime.
@Patrick - Do you plan to do a write-up article on the testing, or just post on the forums? It would be awesome to have an article. If you decide on an article, would it be possible to see:
1. Benchmarking of each SLOG against a data pool; any benchmark tools would work (with sync=always on the pool).
2. Real-world experiments such as vMotion using both iSCSI and NFS connections at 10Gb or higher.
3. Cost versus performance for each drive.
4. Two pools, one of spinners and one of SSDs, with a single SLOG device partitioned in two and shared between them, then both pools tested at the same time. This is more of a way to justify the high cost (at least in my mind) of a SLOG drive if it can be partitioned into smaller sizes, e.g. 50-100GB, with each partition used as a SLOG for a different pool. I'm not sure how or if this will work, but it would be cool to test whether it is possible and what the results are (a rough sketch of how it could be set up is below).
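
(For reference, a rough sketch of how point 4 could be set up on FreeNAS/FreeBSD; the device name nvd0 and the pool names are placeholders, and whether one device can keep up with both pools is exactly what the test would show.)

Code:
# GPT-label the SLOG device and carve it into two partitions
gpart destroy -F nvd0
gpart create -s gpt nvd0
gpart add -t freebsd-zfs -a 1m -s 100G nvd0   # creates nvd0p1
gpart add -t freebsd-zfs -a 1m -s 100G nvd0   # creates nvd0p2

# Give each pool one partition as its log device
zpool add spinners log nvd0p1
zpool add ssdpool log nvd0p2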

Thanks,
marco
 

Patrick

Administrator
Staff member
Marco, we will likely do some of that. There are only so many resources to put on this stuff.
 

Benten93

Member
There you go, taken in the last few seconds of transfer:

Code:
                                                 capacity     operations     bandwidth
pool                                            alloc   free   read   write   read  write
----------------------------------------------  -----  -----  -----  ------  -----  -----
HDD                                             20.8T  6.44T      0  29.51K      0  1.16G
  raidz2                                        20.8T  6.44T      0  12.57K      0   393M
    gptid/ed98e7eb-7cf0-11e7-a2a3-90e2bab861f8      -      -      0   2.46K      0  97.9M
    gptid/ee66676f-7cf0-11e7-a2a3-90e2bab861f8      -      -      0   2.63K      0  99.5M
    gptid/ef1c65a5-7cf0-11e7-a2a3-90e2bab861f8      -      -      0   2.48K      0  98.4M
    gptid/efd7ca38-7cf0-11e7-a2a3-90e2bab861f8      -      -      0   2.38K      0  96.3M
    gptid/f093591e-7cf0-11e7-a2a3-90e2bab861f8      -      -      0   2.46K      0  97.7M
    gptid/f14efa24-7cf0-11e7-a2a3-90e2bab861f8      -      -      0   2.41K      0  97.1M
logs                                                -      -      -       -      -      -
  gptid/3f1e782a-7cf0-11e7-a2a3-90e2bab861f8     4.8G   295G      0  16.94K      0   768M
----------------------------------------------  -----  -----  -----  ------  -----  -----
Will rerun the test tomorrow if possible.
Test setup:

6x 5TB HDDs in RAIDZ2, with an Optane P4800X overprovisioned to 300G as the SLOG, reached over a 2x 10Gbit iSCSI connection from the ESXi host (local P3700) to this array.
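
(For reference, one common way to overprovision like this is to allocate only a 300G partition and leave the rest of the device untouched; a sketch assuming a FreeBSD NVMe device name, since the exact method used here wasn't stated.)

Code:
# Only 300G of the P4800X is ever written; the unpartitioned remainder
# acts as extra spare area ("nvd0" is a placeholder device name)
gpart create -s gpt nvd0
gpart add -t freebsd-zfs -a 1m -s 300G nvd0   # creates nvd0p1
zpool add HDD log nvd0p1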
 

whitey

Moderator
Guys,

I am testing the P4800X right now, will post the results here in about 30 minutes when the transfer has finished!
So far it looks very promising! :)
Please outline your testing process at a high level; there has been quite a bit of variation between the testing procedures/datasets we have seen throughout this thread.

EDIT: NM, beat me to it by mere seconds :-D Let me digest. So was that an sVMotion to that storage (what was the src array/config, if so), or maybe some other method of testing (zfs send/recv)?

2nd EDIT: DAMN, I am losing it. OK, I see the src was a P3700 - same host, or across the network/hosts? Jumbo frames in the mix if it crossed hosts on the 10G Ethernet fabric?
 

Benten93

Member
Please outline your testing process at a high level; there has been quite a bit of variation between the testing procedures/datasets we have seen throughout this thread.

EDIT: NM, beat me to it by mere seconds :-D Let me digest. So was that an sVMotion to that storage (what was the src array/config, if so), or maybe some other method of testing (zfs send/recv)?

2nd EDIT: DAMN, I am losing it. OK, I see the src was a P3700 - same host, or across the network/hosts? Jumbo frames in the mix if it crossed hosts on the 10G Ethernet fabric?
It was an sVMotion of an offline VM of about 830GB in size, from the local ESXi P3700 to a remote iSCSI share via dual 10G on my FreeNAS SAN. No jumbo frames!

EDIT: some language fixes :D
 

whitey

Moderator
Sampling right smack dab in the middle of the sVMotion would be better/ideal, or a quick sampling for an average, which could just be a 'spot check' every 5-10 mins as the sVMotion progresses. If the numbers stay close enough, then I say you have one helluva SLOG device. Good testing methodology btw; happy to see some serious stress with 800+ GB of data moving between source and destination storage platforms.

EDIT: 6Gbps of sVMotion traffic is no joke if that really is where it runs at 'steady state' for the duration of the operation. I got 3Gbps here with my current config using a 4-device HUSMM raidz and the same class of SSD for the SLOG, so good show!
 