10G write speed for ZFS filer

Rand__

Well-Known Member
Mar 6, 2014
4,491
877
113
So today the boxes decided to play nice again...
Win to OmniOS (iperf2)

Code:
C:\>iperf -c 192.168.124.24
------------------------------------------------------------
Client connecting to 192.168.124.24, TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[216] local 192.168.124.35 port 49201 connected with 192.168.124.24 port 5001
[ ID] Interval       Transfer     Bandwidth
[216]  0.0-10.0 sec  22.5 GBytes  19.3 Gbits/sec
Going to higher window sizes decreased performance... I don't really understand this behavior, but if it's working for a change...
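One way to reason about why bigger windows stop helping is the bandwidth-delay product: the TCP window only needs to cover bandwidth × RTT to keep the link full, and on a clean low-latency LAN that number is small. A rough back-of-the-envelope sketch (the 10 Gbit/s and 0.1 ms RTT figures below are illustrative assumptions, not measurements from this setup):

```python
# Bandwidth-delay product: the TCP window needed to keep a link saturated.
# Link speed and RTT below are illustrative assumptions.

def bdp_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Return the bandwidth-delay product in bytes."""
    return bandwidth_bits_per_s * rtt_s / 8

# A 10 Gbit/s link with a 0.1 ms round-trip time:
window = bdp_bytes(10e9, 0.0001)
print(f"{window / 1024:.0f} KiB")  # prints: 122 KiB
```

So on a LAN like this, a window in the low-hundreds-of-KB range is already enough; going far beyond that buys nothing and can interact badly with buffers along the path.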

Speeds to a 2x2 Optane mirror via NFS - not that much faster than a single drive, but not too shabby, I guess.

upload_2018-11-25_0-17-41.png
 

azev

Active Member
Jan 18, 2013
737
203
43
@Rand__ I almost forgot about this thread; I found it while reading another thread. Anyway, a few months ago I found some info online about tuning a ZFS system for NVMe/SSD use, and the results were amazing.
bench1.PNG tunables.PNG
My ZFS setup is dual E5-2680v2 with 256GB of RAM and dual 40Gb Mellanox NICs. The drives are 16 x HGST HUSMMR1680, set up as 8 mirrored vdevs, plus an Optane 900p (480GB) carved down to a 50GB partition for the ZIL (SLOG). SYNC=ALWAYS. The ESXi host has 2x 10Gb NICs dedicated to iSCSI, and LUNs are served via iSCSI multipath.

Anyway, based on my understanding of the publication I found online, these tunables are completely safe to use. I have had a power failure while my UPS was on bypass (waiting for replacement batteries) and my data seems intact.
Maybe some of the ZFS gurus here can chime in on whether these tunables are good or bad; in any case, as you can see, performance is amazing. It almost seems like the limit is the dedicated dual 10Gb NICs.
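For readers who can't see the screenshots: on OmniOS/illumos, tunables of this sort are typically set in /etc/system. The fragment below is only a hypothetical sketch of the general form - the parameter choices and values are illustrative assumptions, not a transcription of tunables.PNG:

```
* /etc/system fragment (illumos/OmniOS) - illustrative sketch only,
* NOT the actual tunables from tunables.PNG above.
* Deeper per-vdev write queues for low-latency SSD/NVMe pools:
set zfs:zfs_vdev_sync_write_max_active = 64
set zfs:zfs_vdev_async_write_max_active = 64
* Shorter transaction group commit interval (seconds):
set zfs:zfs_txg_timeout = 5
```

Changes to /etc/system take effect after a reboot; as always, sanity-check any queue-depth values against your actual devices before trusting them with data.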
 

Rand__
Ah nice, thanks - I had a play with some of these already, but they didn't have much effect for me; might need to give it another try.
They look safe when used with sync=always, but I'm no expert on these either.

Re your numbers:
1. Do you have a similar result set from before the tuning?
2. You go up to 64MB block size, which is rather large; I'm not sure that will ever actually be used by ESXi (as a datastore - it might be different/application-dependent for iSCSI disks).
 

i386

Well-Known Member
Mar 18, 2016
1,932
503
113
31
Germany
Rand__ said:
You go up to 64MB block size, which is rather large; I'm not sure that will ever actually be used by ESXi
VMFS uses 1MB as its block size; it's the best compromise between storage efficiency and performance for virtually all VM workloads.
(ESXi <= 5.0 supported block sizes up to 8MB :D)
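If anyone wants to verify what a given datastore actually uses, the block size can be read from the ESXi shell with vmkfstools (the datastore name below is a placeholder):

```shell
# ESXi shell - datastore name is a placeholder
vmkfstools -Ph /vmfs/volumes/datastore1
# the output includes the VMFS version and the file system block size
```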
 

Rand__
1MB block size on the VMFS5/6 partition; this might then be true for an iSCSI volume too (as in a remotely attached hard drive)...

IIRC it would not hold true for the actual block size on an NFS-based datastore; I think that's down to 64KB.
 

azev
Rand__ said:
Ah nice, thanks - I had a play with some of these already, but they didn't have much effect for me; might need to give it another try.
They look safe when used with sync=always, but I'm no expert on these either.

Re your numbers:
1. Do you have a similar result set from before the tuning?
2. You go up to 64MB block size, which is rather large; I'm not sure that will ever actually be used by ESXi (as a datastore - it might be different/application-dependent for iSCSI disks).
1. I did some testing prior to the tunables and the results were not as good as post-tunables. Unfortunately I didn't document the results properly, so I don't have anything to show for comparison.
2. I tried to copy the test setup in your original post, although I'm somewhat curious why you set the queue depth to only 1? If I set the test up with a larger queue depth, I get much higher numbers on the lower block size tests.
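For anyone reproducing this: in fio, the queue depth under discussion is the iodepth setting. A minimal sketch of a QD=1 sync-write job - the device path, engine, and values are placeholder assumptions, not the actual benchmark config from this thread:

```ini
; hypothetical fio job - filename and values are placeholders
; WARNING: randwrite against a raw device is destructive; use a scratch device
[qd1-sync-write]
ioengine=libaio
filename=/dev/nvme0n1
rw=randwrite
bs=4k
; the queue-depth knob under discussion:
iodepth=1
direct=1
; issue O_SYNC writes, matching testing against sync=always
sync=1
runtime=60
time_based
```

Raising iodepth lets the device coalesce and pipeline requests, which is why small-block numbers climb so much at higher queue depths.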
 

azev
It's a more accurate representation of his use case at home (usually QD is not more than 3-4)
I suppose it depends on what you run in your private home cloud. I currently have around 40-50 VMs that power the home, running various services. Some of these are pretty disk-intensive, like the syslog server, netflow collector (especially when torrenting), etc.

Anyway, ever since I made the mods, things feel much speedier than before, especially when I do system patching and reboots once a month or so. Usually things would slow down noticeably when many VMs reboot concurrently, but with the tunables things are much better.
 

Rand__
Never noticed background services impacting anything, but then my env is smaller... maybe 20 VMs tops.
Also currently on vSAN (4 nodes), so distributed to boot.

What's your netflow solution? :)
 

azev
Rand__ said:
What's your netflow solution? :)
I tried a few different open-source tools and trial versions of various commercial offerings. The best free one, and the one I like most, is ElastiFlow.
It's a bit of a pain to set up, especially since I am not a Linux guru, but once running it seems to work great, although it consumes a lot of space (around 1GB of flow data daily). My setup unfortunately does not have any rollup feature or database cleanup.
What I do is add a task to my normal monthly maintenance to go and manually clean up old records on the ElastiFlow box.
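For that kind of manual cleanup, old indices can be dropped with a single Elasticsearch DELETE. The index name pattern below is an assumption based on ElastiFlow's default daily-index naming - check what yours are actually called first:

```shell
# list your actual index names first - the pattern below is an assumption:
curl 'http://localhost:9200/_cat/indices/elastiflow-*'
# then delete, e.g., all of January 2020's daily flow indices:
curl -X DELETE 'http://localhost:9200/elastiflow-*-2020.01.*'
```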
 

Rand__
Nice! That wasn't around the last time I looked into this (or I couldn't find it).
I think Scrutinizer was the last one I looked at, but the visualization (OOTB) was lacking.

Will try to find some time for a before/after comparison of the tunables, probably on 11.3 to test that while we're at it ;)
 

Rand__
So I spent half the day testing the sysctl parameters (not particularly successfully; I'll spare you the details - possibly some network issues).

Spent the other part with ElastiFlow - tried unsuccessfully with the Bitnami ELK stack OVA (after I dumped it, it dawned on me that it might just have been a firewall issue); then did a fresh install on Ubuntu, and after a few missing parts I got it going.

Unfortunately, current Logstash does not play nice with NetFlow v9 (Logstash 7.4.0 error Exception in inputworker · Issue #427 · robcowart/elastiflow), so I had to downgrade to 7.3... which seems to have a reverse DNS issue, but OK, I can live with that...

upload_2020-2-2_22-59-45.png



Have you looked into Managing the index lifecycle | Elasticsearch Reference [7.3] | Elastic for your maintenance issues?
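Since ILM came up: the usual approach is a lifecycle policy that rolls indices over and deletes them after a retention window. A hedged sketch of such a policy body (the rollover thresholds and 30-day retention are illustrative values, not a recommendation for this setup):

```json
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "1d", "max_size": "5gb" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```

Attached to the flow indices via an index template, this would replace the monthly hand-cleanup entirely.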


Edit: Nice - vSan at work (VLAN 7)
upload_2020-2-3_7-58-24.png
 