Napp-in-one (OmniOS) on ESXi 6.5U1, performance issues?

sstillwell

New Member
Feb 21, 2018
12
0
1
60
Hello, all! First-time poster, long-time reader of articles.

Here's what I have:

Napp-it latest appliance (with Home Complete license) running as a VM with 8 GB RAM and 2 vCPU on:

Mac Pro 2008 - dual quad-core Xeon 2.8 GHz, 64 GB RAM, running ESXi 6.5U1 and latest critical and non-critical patches

Internally in the SATA bays, I have two 250 GB SSDs (OWC Mercury Extreme Pro 6G, Samsung 850 EVO) and a couple of 4TB WD RED drives. Napp-it lives on one of the 4TB internal drives. One SSD is assigned to ESXi vFlash Cache, the other has VMFS where the virtual L2ARC and SLOG "disks" live.

I have an LSI SAS 9201-16e in the chassis, passed through to Napp-it. The card is flashed to the latest firmware (20) and BIOS (7.3?). In a separate (non-expander) SAS chassis I have 4 x 8TB WD RED drives and 2 x 4TB WD RED drives.

I have two pools - one for low-priority storage that simply uses the 2 x 4TB RED drives as a mirror. The other is two sets of mirrored 8TB drives (RAID10?), with L2ARC and SLOG .vmdks on the SSD datastore.

NFS datastores are defined and are connected to Napp-it via a second virtual NIC in the VM and a separate VMkernel NIC and vSwitch all with 9000 MTU. The rest of the network is 1500 MTU for compatibility.

The issue at hand is that I can't seem to get more than about 50-60 MB/s out of it...err, wait. Jeez, never mind.

On a second test I'm getting more like 95 MB/s, peaking at 115 MB/s. I guess that's not bad over a 1 Gb network at 1500 MTU. As has been noted before, CIFS performance is DEFINITELY lower than SMB. ESXi will connect to the NFS datastore via a separate network with 9000 MTU, noted above.

Is there any performance tuning that can or should be done at the command line or GUI? A special concern is NFS performance for VMs that will run on this same host.

I do have one additional question - how do you change NFS permitted IP ranges after creating the NFS share? The only way I've found so far is to un-share it and re-share with the additional IP addresses or ranges.

Thanks to all for the information resources and to Gea for Napp-it. This is a lot of fun, and VERY useful besides!
 

sstillwell

Oh, actually...one problem I'm getting is that the Napp-it appliance doesn't advertise its presence on the network. It's joined to my AD domain and I have a WINS server that it should be registered to...not understanding what additional steps I might need to take. I THINK that NetBIOS is enabled and running...
 

sstillwell

Bonnie++ results on the RAID10 array:

Pool: vmstore   Size: 14.5T   Date (y.m.d): 2018.02.21   File: 16G

Test          Result      %CPU
Seq-Wr-Chr    222 MB/s    79
Seq-Write     186 MB/s    56
Seq-Rewr      129 MB/s    49
Seq-Rd-Chr    173 MB/s    96
Seq-Read      384 MB/s    73
Rnd Seeks     2501.4/s    22

Files: 16   Seq-Create: 24071/s   Rnd-Create: 21594/s

How does that compare to similar systems?
 

gea

Well-Known Member
Dec 31, 2010
2,589
879
113
DE
You can run benchmarks to check whether a system performs as expected, or whether a tuning change made things better or worse. For the first, you must consider the extremes of sequential and I/O performance, and the test conditions.

A single disk like a WD Red is quite slow (5400 rpm), with a max sequential data rate (data is on disk like a daisy chain) of 178 MB/s on the outer tracks according to the data sheet, maybe around 100 MB/s on the inner tracks. Without cache effects this performance can only be achieved with strictly sequential data, which is possible with a data stream on a simple/older filesystem.

A raid, especially with Copy-on-Write like ZFS, spreads data over the whole array. This means you do not have a purely sequential workload but one where iops matter. So let's check the extreme there. Your WD Red with 5400 rpm can deliver around 80 raw iops. This is the number of read/write actions that can be achieved if the heads must always be repositioned and you must wait until the data is under the heads. If you want to read/write 4k data blocks that are evenly spread over the whole disk, your worst-case value is 80 x 4k = 320 KB/s. If you run benchmarks like CrystalDiskMark you can see performance values over block size.
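The worst-case arithmetic above can be sketched in a few lines of shell. This is only an illustration: the function name is made up here, the 80 iops figure is the estimate from the text, and the 128k line simply assumes a larger block size (the ZFS default recordsize) for comparison.

```shell
#!/bin/sh
# Worst-case random throughput when every request needs a head reposition.
# worst_case_kb_per_s is a hypothetical helper name, not a napp-it command.
# Usage: worst_case_kb_per_s IOPS BLOCK_KB
worst_case_kb_per_s() {
    echo $(( $1 * $2 ))
}

# Values from the text: about 80 raw iops, 4k blocks.
echo "4k random:   $(worst_case_kb_per_s 80 4) KB/s"

# Assumed comparison: 128k blocks move more data per head movement.
echo "128k random: $(worst_case_kb_per_s 80 128) KB/s"
```

The same 80 iops give 320 KB/s at 4k but about 10 MB/s at 128k, which is why results depend so strongly on block size and why caches matter.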

Your Bonnie benchmark shows around 180 MB/s sequential read/write on a Raid-10. I would call this "within range" until I know more details like fill rate, cache effects or HBA quality. Maybe you can run Pool > Benchmark with the newest napp-it. This is a sequence of tests that checks sequential and I/O write and read performance. In these tests you can see the effect of the RAM-based write cache when you compare sync write vs non-sync write, and optionally the effect of the RAM-based read cache when you disable read caching. Default tests should be done with readcache=all. If you post the result, please insert it as code for better readability.

about your questions
A typical performance tuning is to increase the TCP and NFS buffers and the number of NFS server threads. MTU 9000 is not relevant for ESXi-internal data transfers, as these are done in software. MTU only becomes relevant for TCP transfers over real Ethernet.
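As a rough sketch of what such tuning could look like on OmniOS from the command line: the script below only prints the commands, it does not run them. The buffer sizes and the server thread count are illustrative assumptions, not recommendations, and the property names are as I remember them from illumos, so verify them on your own system first.

```shell
#!/bin/sh
# Sketch only: prints candidate tuning commands instead of executing them.
# All values below are assumptions for illustration.
MAX_BUF=4194304      # assumed TCP maximum buffer size (4 MiB)
SOCK_BUF=1048576     # assumed TCP send/receive buffer size (1 MiB)
NFS_SERVERS=1024     # assumed maximum number of NFS server threads

tuning_commands() {
    echo "ipadm set-prop -p max_buf=$MAX_BUF tcp"
    echo "ipadm set-prop -p send_buf=$SOCK_BUF tcp"
    echo "ipadm set-prop -p recv_buf=$SOCK_BUF tcp"
    echo "sharectl set -p servers=$NFS_SERVERS nfs"
}

tuning_commands
```

Before changing anything, the current values can be inspected with `ipadm show-prop tcp` and `sharectl get nfs`, and a benchmark run before and after shows whether the change helped at all.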

L2ARC on a vmdk can help a little, but I would not expect too much from it. For initial tests I would always disable it. An Slog on a vmdk is mostly not helpful. In your case, with consumer SSDs, it does not help because their performance is "not good enough" and they lack powerloss protection. Either provide an enterprise SSD like an Intel S3700 directly, use an Intel Optane like a 900P via vmdk, or disable sync and do backups. For initial tests I would remove the Slog and then run additional tests with an Slog.

For your system, I would simply remove L2ARC and Slog and give more RAM (16-24 GB) to the storage VM. If you want faster crash-safe write behaviour with sync enabled, think of a better suited SSD or NVMe. Compare my benchmark series at http://napp-it.org/doc/downloads/optane_slog_pool_performane.pdf
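For reference, removing cache and log devices can also be done at the command line with `zpool remove`. The sketch below only prints the commands. The pool name is taken from the Bonnie output above; the two device names are hypothetical placeholders, so use the names shown under "cache" and "logs" in `zpool status`.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of executing them.
POOL=vmstore        # pool name from the Bonnie++ output
CACHE_DEV=c2t1d0    # hypothetical L2ARC device name
LOG_DEV=c2t2d0      # hypothetical Slog device name

removal_commands() {
    echo "zpool status $POOL"             # find the real cache/log device names
    echo "zpool remove $POOL $CACHE_DEV"  # detach the L2ARC device
    echo "zpool remove $POOL $LOG_DEV"    # detach the Slog device
    echo "zfs get sync $POOL"             # check the current sync setting
}

removal_commands
```

Cache and log devices can be removed from a live pool without touching the data vdevs, so this is easy to revert for a later test with a better Slog.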

about netbios
check Services > SMB > Properties for netbios, enable if disabled

about CIFS
Do you refer to CIFS because the Solarish ZFS/kernel-based SMB server is often called Solaris CIFS, or do you mean CIFS = SMB1? The first naming has historical reasons, while the current SMB implementation is SMB 2.1/3 depending on the distribution.
 

sstillwell

Thank you for responding. I'll run some of the tests and process your advice later in the day (3:00 AM here now), but when I said CIFS I was referring to Apple's use of cifs:// in server URLs, in other words, SMB1.

I've removed L2ARC and SLOG from the RAID10 pool.

Argh...forgot. NetBIOS is enabled in SMB properties, yet I still don't see the server showing up in browse lists like other Windows machines or even Macs.
 

gea

For Macs to see the server in the Finder you must enable the Bonjour/Multicast service and add SMB as a discovered resource (set it up manually or use the menu Services > Bonjour in napp-it Pro). For Windows you need an additional WINS server. Solaris does not act as a WINS server.

Mac
Use Go > Connect to Server with smb://ip to connect via SMB 2/3.
From 10.11.5 on, SMB is slow on Macs. While I have done SMB tests with OSX on former versions at nearly 1 GB/s over 10G, I now only get around 300 MB/s even with the signing fix.

see
http://napp-it.org/doc/downloads/performance_smb2.pdf
How to Fix Slow SMB File Transfers on OS X 10.11.5+ and macOS Sierra
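The signing fix mentioned above is usually described as a small client-side config file on the Mac. A minimal sketch, assuming the commonly cited `/etc/nsmb.conf` setting: the helper name is made up, and the script writes to a scratch file by default so nothing on the system is changed by accident.

```shell
#!/bin/sh
# Writes the client-side SMB settings to the file given as first argument.
# write_nsmb_conf is a hypothetical helper name.
write_nsmb_conf() {
    printf '[default]\nsigning_required=no\n' > "$1"
}

# Scratch path by default; on a real Mac the target would be
# /etc/nsmb.conf, which needs root to write.
TARGET="${TMPDIR:-/tmp}/nsmb.conf.example"
write_nsmb_conf "$TARGET"
cat "$TARGET"
```

Disabling signing trades some protection against tampering for speed, so it is best kept to a trusted LAN, and existing SMB mounts need to be reconnected before it takes effect.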
 

sstillwell

Ah, actually the second option in Bonjour and Autostart, "Advertize SMB service for <hostname> (valid Pro or Dev edition)", is permanently set to "disabled" (the drop-down menu is enabled and responsive, but "disabled" is the only choice). Is that feature not included with a Home Complete license?

I added:
Code:
dns-sd -R "<hostname>" _smb._tcp local. 139 &
dns-sd -R "<hostname>" _device-info._tcp. local 139 model=Xserve &
to the settings "edit bonjour.sh" dialog and I can use dns-sd -B to see that it in fact is advertising its presence. Trying to connect to it by clicking in the Finder sidebar and choosing "Connect as" fails, though. Error message is
Code:
There was a problem connecting to the server "<hostname>".  
The server may not exist or it is unavailable at this 
time. Check the server name or IP address, check 
your network connection, and then try again.
Okay, it seems you must use the newer SMB-over-TCP port 445 (as opposed to SMB over NetBIOS on port 139) for it to work correctly:
Code:
dns-sd -R "<hostname>" _smb._tcp local. 445 &
dns-sd -R "<hostname>" _device-info._tcp. local 445 model=Xserve &
Active Directory domain login is quite slow, but once connected everything seems fine.

In the meantime, I'm in the middle of rsync-ing 1.2 TB of archived files from my aging Thecus N7700PRO onto the mirrored pair...almost halfway done!
 

sstillwell

Upgraded the Napp-it VM to 16 GB RAM and restricted the ARC to about 13 GB of that. I may see some degradation in performance if I upgrade to 10 Gb networking, but at 1 Gb it's pretty much wire-speed throughput to the disk. Reads are a bit slower than writes, but it's otherwise perfectly functional so far. Nice!
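The post does not say how the ARC limit was set. One common way on OmniOS is a `zfs_arc_max` entry in `/etc/system`, which takes effect after a reboot. The sketch below only prints such a line for a given size in GiB; the helper name is hypothetical and 13 is the approximate figure mentioned above.

```shell
#!/bin/sh
# Print an /etc/system line that caps the ZFS ARC at the given size in GiB.
# arc_max_line is a hypothetical helper name.
arc_max_line() {
    printf 'set zfs:zfs_arc_max = 0x%x\n' $(( $1 * 1024 * 1024 * 1024 ))
}

arc_max_line 13   # prints: set zfs:zfs_arc_max = 0x340000000
```

Leaving a few GB outside the ARC keeps memory free for the OS and the napp-it services inside the VM.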