Very slow ioping NFS


timvdh

New Member
Jun 5, 2017
Hi,

I currently have 3 different SANs running OmniOS and napp-it.
They are connected to Proxmox hosts over NFS.

The problem is that performance is inconsistent.

When I run ioping -c 50 against the NFS mount on a Proxmox host, the results vary wildly, anywhere from 15 to 2000 IOPS.
I don't think this is very good.
When I run the same test against my Synology NAS over NFS I consistently get around 2700 IOPS, which is much more stable.
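
For reference, the test is basically just this (the mount path is only an example, adjust it to your storage name):

# 50 ioping requests against the NFS-backed directory (4 KiB random reads by default)
ioping -c 50 /mnt/pve/san-nfs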

I tried it on 3 different SANs, each behind a different switch:
10 Gbit Ubiquiti, 1 Gbit Cisco, 1 Gbit Juniper. The results are the same.

I also tried tweaking my NFS config:
servers=512
lockd_listen_backlog=256
lockd_servers=128
lockd_retransmit_timeout=5
grace_period=90
server_versmin=3
server_versmax=3
client_versmin=3
client_versmax=3
server_delegation=on
nfsmapid_domain=
max_connections=1
protocol=ALL
listen_backlog=32
device=
mountd_listen_backlog=64
mountd_max_threads=16
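
(For anyone wanting to reproduce: these are the OmniOS sharectl nfs properties; they can be set like this, for example. The values above are just what I ended up with, not a recommendation.)

sharectl set -p servers=512 nfs
sharectl set -p lockd_servers=128 nfs
sharectl set -p server_versmax=3 nfs
sharectl get nfs    # verify the active values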

With the same SAN I created an iSCSI mount and installed a VM on it.
Inside this VM I consistently get around 2500 IOPS, which looks a lot better.

Transfer speed over NFS is not terrible, but it is not very stable either.
I am currently testing with iozone, but I still have to learn how it works. How can I produce some useful output with it?
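
The kind of command line I have been trying looks like this (file size and record size are just my guesses, suggestions welcome):

# write/rewrite, read/reread and random read/write of a 4 GB file with 4k records
iozone -i 0 -i 1 -i 2 -r 4k -s 4g -f /mnt/pve/san-nfs/iozone.tmp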

Does anyone have a clue why my NFS is so unstable?

Thanks!
 

gea

Well-Known Member
Dec 31, 2010
We need some more details

1.
How much RAM do you use on OmniOS?

Background:
Unlike ntfs or ext4 (which is probably what your Syno uses), ZFS is a copy-on-write filesystem with checksums. While this gives you superior data security with snapshots and crash resistency, it spreads your data (and, due to the checksums, more data than on ext4) all over the pool, which means your access pattern is more or less IOPS limited. Even with a sequential video stream, the on-disk access pattern is not sequential.

To compensate for this you must either use an IOPS-optimized pool, e.g. with SSDs, or you need enough RAM for the write cache (up to 4 GB) and a larger read cache. With enough RAM, nearly all random reads are served from RAM.
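
You can check how large the read cache (ARC) currently is and how well it is hitting with kstat, for example:

kstat -p zfs:0:arcstats:size
kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses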

2.
What is your pool layout, and which disks?
This relates somewhat to point 1.

3.
Have you disabled sync write (see menu ZFS filesystems > sync)?
With sync write enabled (usually the default with NFS), every single write of a datablock must be committed to disk before the next one can occur. This disables all write caches - the source of performance. Most lower-end NAS systems disable sync for better performance, at the risk of data loss if a crash happens during a write. With older filesystems you need a hardware RAID with cache and BBU to reduce this problem. On ZFS, sync does the same, even better, and you can add a dedicated log device if your pool has too few IOPS.
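
To compare, you can also toggle this per filesystem from the CLI (which is what the napp-it menu sets underneath); 'tank/nfs' is just a placeholder:

zfs get sync tank/nfs            # standard | always | disabled
zfs set sync=disabled tank/nfs   # for benchmarking only - risks data loss on a crash
zfs set sync=standard tank/nfs   # back to the default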

4.
OmniOS TCP and NFS defaults are tuned for 1 Gbit networks.
With faster networks you should increase the TCP buffers, NFS buffers and thread counts.
You can set these values manually, or on napp-it Pro via the system tuning menu.
See/compare the values in chapter 21 of http://napp-it.org/doc/downloads/napp-it.pdf
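
A rough sketch of the kind of settings meant here (treat the values as examples only; see the PDF for what fits your hardware):

# larger TCP buffers for 10G (illumos ipadm)
ipadm set-prop -p max_buf=4194304 tcp
ipadm set-prop -p send_buf=1048576 tcp
ipadm set-prop -p recv_buf=1048576 tcp
# more NFS server threads
sharectl set -p servers=1024 nfs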
 

timvdh

New Member
Jun 5, 2017
Hi Gea,

I have different pool types running and have tested them:

5x SM863 960GB, RAIDZ, 64GB RAM, 10Gbit

12x 2TB Hitachi, ZFS RAID 10, DC3710 ZIL, 850 PRO L2ARC, 64GB RAM, 10Gbit

8x 3TB Toshiba, ZFS RAID 10, DC3700 ZIL, 850 PRO L2ARC, 64GB RAM, tested on 1Gbit and 10Gbit

Tested with sync writes on and off; not much difference.

Test: iSCSI VM on Proxmox

FIO:
test: (groupid=0, jobs=1): err= 0: pid=31821: Mon Jun 5 13:23:12 2017
  read : io=3071.7MB, bw=208635KB/s, iops=52158, runt= 15076msec
  write: io=1024.4MB, bw=69575KB/s, iops=17393, runt= 15076msec
  cpu : usr=9.41%, sys=45.10%, ctx=141770, majf=0, minf=25
  IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
  submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
  complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
  issued : total=r=786347/w=262229/d=0, short=r=0/w=0/d=0
  latency : target=0, window=0, percentile=100.00%, depth=64

IOPING:
--- . (xfs /dev/dm-0) ioping statistics ---
99 requests completed in 38.7 ms, 396 KiB read, 2.56 k iops, 10.0 MiB/s
generated 100 requests in 1.65 min, 400 KiB, 1 iops, 4.04 KiB/s
min/avg/max/mdev = 246.0 us / 390.6 us / 1.06 ms / 108.1 us

Test: NFS VM on Proxmox

FIO

test: Laying out IO file(s) (1 file(s) / 4096MB)
Jobs: 1 (f=1): [m] [99.0% done] [21482KB/6997KB/0KB /s] [5370/1749/0 iops] [eta 00m:01s]
test: (groupid=0, jobs=1): err= 0: pid=473611: Mon Jun 5 13:30:27 2017
  read : io=3071.7MB, bw=32872KB/s, iops=8217, runt= 95686msec
  write: io=1024.4MB, bw=10962KB/s, iops=2740, runt= 95686msec
  cpu : usr=2.06%, sys=14.09%, ctx=600466, majf=0, minf=25
  IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
  submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
  complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
  issued : total=r=786347/w=262229/d=0, short=r=0/w=0/d=0
  latency : target=0, window=0, percentile=100.00%, depth=64

IOPING
--- . (xfs /dev/dm-0) ioping statistics ---
99 requests completed in 3.29 s, 396 KiB read, 30 iops, 120.4 KiB/s
generated 100 requests in 1.67 min, 400 KiB, 0 iops, 3.99 KiB/s
min/avg/max/mdev = 262.6 us / 33.2 ms / 2.34 s / 238.3 ms


Looks like an extreme difference...

Maybe I should just switch to iSCSI?
 

gea

Well-Known Member
Dec 31, 2010
I am not experienced with Proxmox, but the results are not as expected.
Under ESXi, iSCSI and NFS perform similarly with the same sync setting enabled/disabled on both (on iSCSI the corresponding setting is the writeback cache: enabled is the same as sync disabled).

There should also be a noticeable to huge difference on writes with sync disabled vs. enabled.

The only ZFS parameter that may additionally be performance relevant is the blocksize.
iSCSI uses 8k, while NFS uses a dynamic value of up to 128k by default.
For VM usage I would try a lower value like 32k (on ZFS this is a filesystem property and only applies to subsequent writes),
but this does not explain the huge difference.
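
For example (a sketch, the filesystem name is just a placeholder):

zfs get recordsize tank/nfs
zfs set recordsize=32k tank/nfs    # only affects newly written blocks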
 

timvdh

New Member
Jun 5, 2017
Hi Gea,

I also tried setting the blocksize to different values; it doesn't make a difference.
With sync disabled, I thought it was normal not to see a difference on an all-SSD pool?


I like NFS because it's easy.
Do napp-it replication and iSCSI also make it easy to reconnect Proxmox to the new iSCSI server when the first one fails?

When something goes wrong with my Proxmox cluster, I can simply access all the VM files on the pool with FTP or SCP directly on my SAN and put them into a new virtualisation environment. Is there also a way to do this with iSCSI? These were the reasons I chose NFS earlier.
 

gea

Well-Known Member
Dec 31, 2010
I also prefer NFS over iSCSI as handling is so much easier, especially
when paired with SMB and Windows "previous versions" for snaps/versioning or copy/clone.

I would really like to know why your NFS is this slow
(Solarish is one of the best NFS servers; remember, Sun invented NFS).

If you use iSCSI, you must replicate the zvol.
To use the zvol as a LUN on your backup server, you must do the following.

When the zvols were created as a filesystem/share property
in menu ZFS filesystems:

- disable replication and set the zvol to read/write
- re-enable iSCSI sharing in menu ZFS filesystems
That's all; napp-it keeps the target GUID and view the same, as they are part of the zvol name.

When the zvols were created in menu Comstar
(a rough command-line sketch for this case follows below):

- disable replication and set the zvol to read/write
- import a logical unit from the zvol; you must enter a GUID
(you need to know the old one if you want to keep the former LUN workable,
otherwise you must reimport the LUN)
- set a view on the targets
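
On the command line, the Comstar case corresponds roughly to this (a sketch with placeholder pool/zvol names and GUID; napp-it wraps these steps in its menus):

# replicate the zvol to the backup server
zfs snapshot tank/vm-zvol@repl-1
zfs send tank/vm-zvol@repl-1 | ssh backup zfs receive backup/vm-zvol
# on the backup server: make the zvol writable again
zfs set readonly=off backup/vm-zvol
# import it as a logical unit, reusing the old GUID, then add a view
stmfadm create-lu -p guid=<old-guid> /dev/zvol/rdsk/backup/vm-zvol
stmfadm add-view <old-guid>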
 

timvdh

New Member
Jun 5, 2017
Hi Gea,

I also like NFS the most.
So if you know of anything to troubleshoot this, I'll be happy to try it. I also ran the tuning under System, but it doesn't make a difference.

Tim
 

gea

Well-Known Member
Dec 31, 2010
Hmm, just to rule something out:
add an entry for your NFS clients to /etc/hosts (ip hostname),
in menu System > Eth network > hosts, for reverse name resolution.
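
That is, one line per client, something like (addresses and names are placeholders):

192.168.1.21   pve1
192.168.1.22   pve2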
 

timvdh

New Member
Jun 5, 2017
Today I tried to test the following:

I installed a clean server with OmniOS and napp-it,
put in one SSD and created a basic ZFS pool. I created a ZFS filesystem and enabled NFS.
I mounted this on a freshly installed Proxmox host, and on a laptop running Ubuntu.
I started benchmarking with ioping. The first test went all right.
After that I generated a little bit of load on the NFS share and ioping dropped, varying from 10 to 1000 IOPS. Very bad, exactly like the problem I already had.
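
For reference, the setup was essentially this (device, pool and IP are placeholders):

zpool create testpool c2t0d0
zfs create testpool/nfs
zfs set sharenfs=on testpool/nfs
# on the Proxmox host / Ubuntu laptop:
mount -t nfs 192.168.1.50:/testpool/nfs /mnt/test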

The next thing I did was reinstall the box from OmniOS to OpenIndiana and install napp-it on it.
I imported the ZFS pool and ran the same tests. The results were perfect as far as I could test for now: stable ioping around 3K IOPS, even with heavy load in my VM on the same NFS share.

So it looks like the problem lies with OmniOS; it does not exist on OpenIndiana.
Does anybody have a clue how to fix this in OmniOS, or should I maybe switch to OpenIndiana?

I use these OmniOS versions:
OmniOS 5.11 omnios-cac2b76 October 2016
and OmniOS 5.11 omnios-2fb9a48 March 2017

Both have the same problem...
 

dragonme

Active Member
Apr 12, 2016
@timvdh


I am not following.

How is it attached to the Proxmox host: virtual all-in-one, or over physical cables and switches?

Which drivers are being used for the network ports carrying the NFS traffic?

We need more config data.
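
For example, post the output of something like this from the OmniOS box (just a suggestion):

dladm show-phys    # physical NICs and their drivers (e1000g, vmxnet3s, ixgbe, ...)
dladm show-link    # links, MTU and state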

I personally have not had very good luck with napp-it from a performance standpoint.

Writes to the ZFS store over SMB are terrible, 60 MB/s or so, while downloads from the same filer to the same workstation saturate my physical line at 130 MB/s.

The vmxnet3 network drivers in OmniOS are bursty and seem temperamental, and they don't improve overall throughput over the e1000; the e1000 takes a LOT more CPU, but it is more level and not bursty.

The e1000 doesn't seem as temperamental either.

Either way, performance is not great.

SMB is at 2.x, which is archaic, and Solaris in general is a bastard stepchild as far as development goes, so with OmniOS coming to an end, being forced in a new direction is probably a good thing.

If OpenIndiana is really that much better I might give it a try, but I have been trying to get the performance of this all-in-one box even remotely on par with my old, 9-year-old ZFS server with no luck, and I think OmniOS is a big culprit.
 

gea

Well-Known Member
Dec 31, 2010
Fast reads (mostly from cache) and slow writes on OmniOS point more to a configuration or pool problem than to the OS itself. The kernel-based, multithreaded Solaris CIFS server is known to be very fast; usually there is no problem reaching full 1G performance, or 600-900 MB/s on 10G.

Regarding OpenIndiana:
OmniOS and OpenIndiana are distributions based on current Illumos, so from a core-OS point of view they are identical. They may only differ in some system settings and optimisations. With the same TCP, NFS and ZFS settings they should perform identically, as the OS and drivers are the same. (OpenIndiana is more or less pure, ongoing Illumos.)

In my tests Illumos is one of the fastest Open-ZFS options, especially as its ZFS RAM usage is Solaris-optimised. Oracle Solaris is often a bit faster than the free forks around Illumos, with a much faster resilvering time; on 10G/40G I have seen up to 30% better throughput than on OmniOS, but Oracle Solaris is not free.