Looking to update OmniOS/NAPP-IT from r151014


NOTORIOUS VR

Member
Nov 24, 2015
78
7
8
43
Just looking for some info: I'd like to update OmniOS to a more current build.

What would be the best (safest) way to go about this: export the pools, spin up the new VM OVA and import the pools, or do an update through some sort of package manager from within OmniOS?

Running on ESXi 5.5.

Thanks!
 

gea

Well-Known Member
Dec 31, 2010
3,141
1,184
113
DE
You can update OmniOS,
see http://www.napp-it.org/doc/downloads/setup_napp-it_os.pdf

I would export the pool and install the template with OmniOS 151024 (the last one with support for ESXi 5.5).
You can then update OmniOS to a newer release.

Maybe I would update the whole server to the current ESXi 6.7U1 and optionally wait a few days for 151028 (the next stable). You can update 5.5 to 6.7 by booting the ESXi installer (either the ISO or a USB stick created from the ISO with Rufus). You can also use a new bootdisk, install ESXi 6.7U1 on it and import the VMs (this would leave the old system intact).

Unless you upgrade your pool, the data remains accessible from both an older OmniOS and 151028.
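
From the console this would look roughly as follows (pool name is just an example; check the release notes of the target OmniOS release for the exact repository URL):

Code:
# on the old VM: export the data pool before shutting it down
zpool export tank

# on the new OmniOS template: list importable pools, then import
zpool import
zpool import tank

# later, to move OmniOS itself to a newer release via IPS:
pkg set-publisher -G '*' -g https://pkg.omniosce.org/r151028/core omnios
pkg update
beadm list   # the update creates a new boot environment you can boot back into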
 

NOTORIOUS VR

Member
Nov 24, 2015
78
7
8
43
Thanks for the advice! I will back up my VMs and plan to do a full upgrade in that case.

Regarding the pool update to the newer version: is that something that is done (or offered as an option?) when I re-import the pools into the newer OmniOS version?

Sorry, one more question: in the newest version that is about to be released, what is the state of SMB? Still only v1?
 

gea

Well-Known Member
Dec 31, 2010
3,141
1,184
113
DE
A pool update is not done automatically; you have to initiate it manually (napp-it menu Pools, click on the ZFS version).
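
From the console the equivalent would be roughly (pool name is just an example):

Code:
zpool status         # notes when supported features are not yet enabled
zpool upgrade        # lists pools whose feature flags are not all enabled
zpool upgrade tank   # enables all features on pool 'tank'; older OS releases may then no longer import it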

Regarding the SMB version of the ZFS/kernel-based Solarish SMB server:
151014 is SMB 1.
From 151018 on, the SMB server is 2.1 and the client is SMB 1.

Currently an SMB 2.1 client is under development and should be ready soon:
Feature #9735: Need to provide SMB 2.1 Client - illumos gate - illumos.org

This is required to join a domain when SMB1 is disabled on the AD server.

Oracle Solaris 11.4 is on SMB 3.1 (client 2.1).
 

NOTORIOUS VR

Member
Nov 24, 2015
78
7
8
43
Fantastic, thank you very much! You're always very helpful. I hope to purchase a home license from you one day soon for your fantastic work.
 

NOTORIOUS VR

Member
Nov 24, 2015
78
7
8
43
You can update OmniOS,
see http://www.napp-it.org/doc/downloads/setup_napp-it_os.pdf

I would export the pool and install the template with OmniOS 151024 (the last one with support for ESXi 5.5).
You can then update OmniOS to a newer release.

Maybe I would update the whole server to the current ESXi 6.7U1 and optionally wait a few days for 151028 (the next stable). You can update 5.5 to 6.7 by booting the ESXi installer (either the ISO or a USB stick created from the ISO with Rufus). You can also use a new bootdisk, install ESXi 6.7U1 on it and import the VMs (this would leave the old system intact).

Unless you upgrade your pool, the data remains accessible from both an older OmniOS and 151028.
Well, I finally did it (not by choice, but due to my own stupidity regarding other things - always learning, I guess!) after many hours rebuilding ESXi from scratch with a new install...

I've moved to ESXi 6.7U1 and all VMs are running fine. I then loaded your latest OVF, set it up from scratch network-wise, imported the pool (previously exported from the original VM), added the AD servers and bound the users... all the shares came right back up! Using VMXNET3 adapters.

Unfortunately I now have a major issue with speed and I'm at a loss why. Previously, without any tuning, I would saturate my GigE connection (110-112 MB/s); now I'm lucky to crack 30 MB/s.

An iperf test from my desktop to the server (napp-it as the iperf server) comes back with 930 Mbit/s, so the network connection seems to be OK.
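
For reference, this was just the standard iperf pair, nothing fancy (the IP below is a placeholder):

Code:
# on the napp-it VM
iperf -s

# on the desktop
iperf -c 192.168.1.10 -t 30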

All I did was import the pool, nothing else.

Attached are some screen shots, maybe you have an idea where I could look?




I did a benchmark test from napp-it... not sure if I did it correctly though; I just used the default settings.

Code:
start filebench..
Filebench Version 1.4.9.1
16633: 0.000: Allocated 126MB of shared memory
16633: 0.003: File-server Version 3.0 personality successfully loaded
16633: 0.003: Creating/pre-allocating files and filesets
16633: 0.016: Fileset bigfileset: 10000 files, 0 leafdirs, avg dir width = 20, avg dir depth = 3.1, 1254.784MB
16633: 0.022: Removed any existing fileset bigfileset in 1 seconds
16633: 0.022: making tree for filset /storage_z2/filebench.tst/bigfileset
16633: 0.051: Creating fileset bigfileset...
16633: 2.047: Preallocated 8015 of 10000 of fileset bigfileset in 2 seconds
16633: 2.047: waiting for fileset pre-allocation to finish
16633: 2.048: Starting 1 filereader instances
16671: 2.091: Starting 50 filereaderthread threads
16633: 4.094: Running...
16633: 34.129: Run took 30 seconds...
16633: 34.133: Per-Operation Breakdown
statfile1            48686ops     1621ops/s   0.0mb/s      0.4ms/op       15us/op-cpu [0ms - 2521ms]
deletefile1          48614ops     1619ops/s   0.0mb/s      3.1ms/op       66us/op-cpu [0ms - 2390ms]
closefile3           48694ops     1621ops/s   0.0mb/s      0.1ms/op        6us/op-cpu [0ms - 690ms]
readfile1            48696ops     1621ops/s 214.1mb/s      1.5ms/op       63us/op-cpu [0ms - 2284ms]
openfile2            48699ops     1621ops/s   0.0mb/s      0.9ms/op       20us/op-cpu [0ms - 2105ms]
closefile2           48700ops     1621ops/s   0.0mb/s      0.1ms/op        6us/op-cpu [0ms - 2323ms]
appendfilerand1      48706ops     1622ops/s  12.6mb/s      3.7ms/op       53us/op-cpu [0ms - 1720ms]
openfile1            48712ops     1622ops/s   0.0mb/s      1.0ms/op       21us/op-cpu [0ms - 2483ms]
closefile1           48712ops     1622ops/s   0.0mb/s      0.1ms/op        6us/op-cpu [0ms - 1885ms]
wrtfile1             48718ops     1622ops/s 202.6mb/s      6.1ms/op       84us/op-cpu [0ms - 2450ms]
createfile1          48730ops     1622ops/s   0.0mb/s      4.4ms/op       62us/op-cpu [0ms - 2415ms]
16633: 34.133:

IO Summary:
535667 ops, 17835.374 ops/s, (1621/3244 r/w), 429.4mb/s,    375us cpu/op,   7.1ms latency
16633: 34.133: Shutting down processes

ok.
 
Last edited:

gea

Well-Known Member
Dec 31, 2010
3,141
1,184
113
DE
Is read different from write?

Your filebench values are OK. Can you add the output of
Pools > Benchmark? This is a set of benchmarks with sync disabled vs. enabled.

I would assume that you have sync write enabled, in which case 30 MB/s write is as expected.
 

NOTORIOUS VR

Member
Nov 24, 2015
78
7
8
43
Is read different from write?

Your filebench values are OK. Can you add the output of
Pools > Benchmark? This is a set of benchmarks with sync disabled vs. enabled.

I would assume that you have sync write enabled, in which case 30 MB/s write is as expected.
Hi Gea,

Actually, read is just as fast (just checked from my desktop), at about 110 MB/s sustained over GigE. The same file copied back to the server maxes out at 35 MB/s, but it's not even able to maintain that; it goes up and down between 25-35 MB/s over the course of the transfer.

Below is the output from the test you requested.

I haven't made any changes to the server, I only imported the pools, which is why I am shocked at the write speeds. Now seeing the read speeds, it does seem like it must be a setting somewhere (I was worried that maybe the passthrough of the LSI controllers under 6.7U1 was the issue). Is it possible that the new server has sync write enabled by default where the old one did not?

By the way, updating the zpool is something I would like to do (once this has been figured out)... is that command line only? Anything I should be aware of if/when I do that?

The napp-it VM has 24 GB of RAM dedicated to it; I'm not sure if I can do any further optimization for that?

Code:
begin tests ..
Bennchmark filesystem: /storage_z2/_Pool_Benchmark
Read: filebench, Write: filebench_sequential, date: 12.28.2018

hostname                        batcavefs  Memory size: 24576 Megabytes
pool                            storage_z2 (recsize=128k, compr=off, readcache=all)
slog                            -
remark                         


Fb3                             sync=always                     sync=disabled                 

Fb4 singlestreamwrite.f         sync=always                     sync=disabled                 
                                164 ops                         2961 ops
                                32.798 ops/s                    591.973 ops/s
                                7855us cpu/op                   3139us cpu/op
                                30.3ms latency                  1.6ms latency
                                32.6 MB/s                       591.8 MB/s
________________________________________________________________________________________
 
read fb 7-9 + dd (opt)          randomread.f     randomrw.f     singlestreamr
pri/sec cache=all               96.2 MB/s        43.2 MB/s      1.5 GB/s                   
________________________________________________________________________________________
Code:
pool: storage_z2
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
   still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
   the pool may no longer be accessible by software that does not support
   the features. See zpool-features(5) for details.
  scan: scrub repaired 0 in 12h31m with 0 errors on Mon Dec 24 15:31:30 2018
config:

   NAME                       STATE     READ WRITE CKSUM      CAP            Product /napp-it   IOstat mess          SN/LUN
   storage_z2                 ONLINE       0     0     0
     raidz2-0                 ONLINE       0     0     0
       c0t50014EE20C8A61A8d0  ONLINE       0     0     0      4 TB           WDC WD40EFRX-68W   S:0 H:0 T:0          WDWCC4E0RLC2XL
       c0t50014EE20C8A802Ad0  ONLINE       0     0     0      4 TB           WDC WD40EFRX-68W   S:0 H:0 T:0          WDWCC4E6PVPYN7
       c0t50014EE261DFBF31d0  ONLINE       0     0     0      4 TB           WDC WD40EFRX-68W   S:0 H:0 T:0          WDWCC4E6PVPACX
       c0t50014EE261DFCCC6d0  ONLINE       0     0     0      4 TB           WDC WD40EFRX-68W   S:0 H:0 T:0          WDWCC4E7FZ2SL8
       c0t50014EE20C8A235Cd0  ONLINE       0     0     0      4 TB           WDC WD40EFRX-68W   S:0 H:0 T:0          WDWCC4E0TRF0NK
       c0t50014EE2B7350224d0  ONLINE       0     0     0      4 TB           WDC WD40EFRX-68W   S:0 H:0 T:0          WDWCC4E1SANR1T
       c0t50014EE2B7353608d0  ONLINE       0     0     0      4 TB           WDC WD40EFRX-68W   S:0 H:0 T:0          WDWCC4E0TRFL0A
       c0t50014EE2B7355807d0  ONLINE       0     0     0      4 TB           WDC WD40EFRX-68W   S:0 H:0 T:0          WDWCC4E4SZ275E
       c0t50014EE2B735BA1Ad0  ONLINE       0     0     0      4 TB           WDC WD40EFRX-68W   S:0 H:0 T:0          WDWCC4E6PVPZUE
       c0t50014EE2B737F8D9d0  ONLINE       0     0     0      4 TB           WDC WD40EFRX-68W   S:0 H:0 T:0          WDWCC4E7JLA4TL

errors: No known data errors
 

gea

Well-Known Member
Dec 31, 2010
3,141
1,184
113
DE
Sync write performance is around 33 MB/s (unsynced: 592 MB/s).
Can you check in menu ZFS Filesystems whether sync for the SMB-shared filesystem is disabled?

A pool upgrade can be done in menu Pools when you click on the pool version 5000 (or "-" on an older napp-it).
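
From the console the same check would be roughly (filesystem name is just an example, use your SMB-shared filesystem):

Code:
zfs get sync storage_z2/share          # standard, always or disabled
zfs set sync=disabled storage_z2/share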
 

NOTORIOUS VR

Member
Nov 24, 2015
78
7
8
43
Sync write performance is around 33 MB/s (unsynced: 592 MB/s).
Can you check in menu ZFS Filesystems whether sync for the SMB-shared filesystem is disabled?

A pool upgrade can be done in menu Pools when you click on the pool version 5000 (or "-" on an older napp-it).
Hi Gea,

Sync is set to standard, which according to your menu means sync was on for certain writes? Changing it to disabled will, I'm guessing, get my speed back (and I suppose that is possibly how the old napp-it server was set by default?), but are there any issues for data integrity in turning sync off at all? I appreciate the speed, but if there is another way I'm interested as well. Maybe turning on compression; I've heard that can help quite a bit with performance in many cases.
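
If I do try compression, I assume from the console it would be something like the following (filesystem name is just an example, and LZ4 only applies to newly written data):

Code:
zfs set compression=lz4 storage_z2/share
zfs get compression,compressratio storage_z2/share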

Re. the upgrade... thank you.

 

gea

Well-Known Member
Dec 31, 2010
3,141
1,184
113
DE
Set sync to disabled and check performance.

Sync=standard (the default) means that the writing application decides:
ESXi+NFS requests sync; SMB does not by default.

Without sync, the content of the RAM-based write cache is lost on a crash. This does not affect the integrity of ZFS, due to copy-on-write, but there is data loss. If you store VMs, they may become corrupted.
 

NOTORIOUS VR

Member
Nov 24, 2015
78
7
8
43
Set sync to disabled and check performance.

Sync=standard (the default) means that the writing application decides:
ESXi+NFS requests sync; SMB does not by default.

Without sync, the content of the RAM-based write cache is lost on a crash. This does not affect the integrity of ZFS, due to copy-on-write, but there is data loss. If you store VMs, they may become corrupted.
Well, that is interesting. I cannot check with iperf from my media server for some reason (connection refused errors)...

I have changed from SMB to NFS shares on my media server (Plex/NZB, etc.), connected via a separate vswitch (no jumbo frames yet), and through Glances I've been seeing bursts of 1.5-2.5 GB/s transfers on that network (where only NFS is mapped).

But on my Win10 desktop PC I'm using SMB, and that's where it is capped at 35 MB/s, which is quite odd considering the only things that changed are the ESXi version and the latest napp-it instance.

Also, I don't believe I ever changed the sync option on the old napp-it instance, so I'm going to assume it was on the default as well, but I have no proof of that. I do not use that Z2 store for anything except media and files; no VMs are currently on there. I'll test with sync disabled and report back.

Speaking of jumbo frames, I read an old thread ( https://forums.servethehome.com/index.php?threads/esxi-5-5-vswitch-network-setup-all-in-one.4276/ ) about ESXi 5.5 and napp-it and the big disaster of the VMware tools being the old version (for Solaris 10), which is why jumbo frames didn't work. Do you know if your latest instance works correctly with jumbo frames? Can I use the tuning section to enable them (maybe you've seen a guide somewhere? :) )? I believe your instance is running open-vm-tools, yes? 6.7U1 keeps telling me VMware tools is outdated; I suppose I can dismiss/ignore that message permanently?

Cheers!
 

gea

Well-Known Member
Dec 31, 2010
3,141
1,184
113
DE
About jumbo frames:
This requires support from every server, network device and client, which is why it needs some testing. It can improve performance but can also introduce stability problems. For ESXi-internal transfers it is useless, as those are handled in software only.

Illumos includes the newest open-vm-tools. You could switch to the Solaris tools from 6.7U1, but I would not, except on Solaris where you must use the genuine VMware tools. You can ignore the outdated message then.
Ignore posts about ESXi or OS releases that are years old.
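
If you want to experiment with jumbo frames later anyway, the illumos side is set per datalink with dladm; a minimal sketch, assuming the link shows up as vmxnet3s0 and that the vSwitch and every client are also set to MTU 9000 (older vmxnet3s drivers may instead need the MTU set in the driver's .conf file):

Code:
dladm show-link                            # find the actual link name
dladm show-linkprop -p mtu vmxnet3s0       # current MTU
dladm set-linkprop -p mtu=9000 vmxnet3s0   # may require unplumbing the IP interface first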
 

NOTORIOUS VR

Member
Nov 24, 2015
78
7
8
43
Hi Gea,

OK, understood. Currently I was only interested in the internal vswitch side, but I do have 10G hardware as well, just not yet to my desktop, so I will leave that for now.

As for disabling sync, that didn't seem to do anything for speed. If anything, it might have made the total throughput slightly worse. :(

 

gea

Well-Known Member
Dec 31, 2010
3,141
1,184
113
DE
Yes, as SMB is async by default, sync=disabled should not make a difference, but it should also not be slower.

Have you enabled jumbo frames?
Try without.
 

NOTORIOUS VR

Member
Nov 24, 2015
78
7
8
43
Yes, as SMB is async by default, sync=disabled should not make a difference, but it should also not be slower.

Have you enabled jumbo frames?
Try without.
Am I looking in the wrong place to disable sync for SMB?

I have not enabled jumbo frames or made any other changes to the base system as you created it. I simply imported my pool and connected to AD.
 

gea

Well-Known Member
Dec 31, 2010
3,141
1,184
113
DE
On Solarish, SMB shares and sync settings are both properties of the same filesystem.

As your local async performance is around 580 MB/s and sync is off, the problem is not clear. Can you try a Windows VM with vmxnet3s and check its performance, at least to see whether the problem is within ESXi or outside?

btw
Do you use any "acceleration tools" on Windows? Is the NIC driver the newest, and is the NIC not of type "Realtek"? In the Windows NIC settings, interrupt throttling is critical; check whether you have such a setting (and disable it).

Can you try another Windows client?
 

NOTORIOUS VR

Member
Nov 24, 2015
78
7
8
43
On Solarish, SMB shares and sync settings are both properties of the same filesystem.

As your local async performance is around 580 MB/s and sync is off, the problem is not clear. Can you try a Windows VM with vmxnet3s and check its performance, at least to see whether the problem is within ESXi or outside?

btw
Do you use any "acceleration tools" on Windows? Is the NIC driver the newest, and is the NIC not of type "Realtek"? In the Windows NIC settings, interrupt throttling is critical; check whether you have such a setting (and disable it).

Can you try another Windows client?
No acceleration tools, just the standard onboard NIC of my PC, which I think is a Realtek. But it was never an issue before, and nothing changed on the desktop. Also, while it is not scientific, it is another data point: an internet speed test from my desktop does 950 Mbit/s both ways on my gigabit fiber connection. So the desktop and network gear are still functioning.

I've done the tests you requested in a Win10 VM with the vmxnet3 drivers installed. The results surprised me a little, to be honest.

Write speeds are around 190-215 MB/s, but read speed starts at 150 MB/s max and shortly falls to a stable 110-115 MB/s for some reason. I would have expected read to be much faster than write. On the old napp-it/ESXi setup I was in the 300-350 MB/s range for both read and write.

READ:


WRITE:
 

gea

Well-Known Member
Dec 31, 2010
3,141
1,184
113
DE
You can compare with another physical Windows client (non-Realtek), but I would check whether there is a newer NIC driver. Optionally, check whether the NIC allows disabling interrupt throttling in its settings.

Read slower than write can happen on ZFS, depending on the access pattern.
Writes always go through the RAM-based write cache, so even small and slow random writes are committed as large and fast sequential writes.

Reads are cached by the ARC and L2ARC, but these cache mainly metadata and random reads, not sequential reads. Sequential reads, especially when fragmentation is high, can therefore be slower.
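
To see how much the cache helps during such a copy, you can watch the ARC while the read runs; a minimal sketch, assuming arcstat is present on your OmniOS build (it ships with recent illumos):

Code:
arcstat 5                        # hit/miss rates and ARC size every 5 seconds
kstat -n arcstats | grep -i hit  # raw counters if arcstat is not installed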
 

NOTORIOUS VR

Member
Nov 24, 2015
78
7
8
43
You can compare with another physical Windows client (non-Realtek), but I would check whether there is a newer NIC driver. Optionally, check whether the NIC allows disabling interrupt throttling in its settings.

Read slower than write can happen on ZFS, depending on the access pattern.
Writes always go through the RAM-based write cache, so even small and slow random writes are committed as large and fast sequential writes.

Reads are cached by the ARC and L2ARC, but these cache mainly metadata and random reads, not sequential reads. Sequential reads, especially when fragmentation is high, can therefore be slower.
Hi Gea,

I updated to the latest driver (2018): no change. But that is not surprising, as the issue is certainly not with my client; I'm 100% sure about this.

I wonder if there is an issue somewhere with ESXi 6.7U1 + the napp-it instance causing this... network setup issues, maybe, I don't know. My setup right now is quite simple, in my opinion, so I do not understand why, after changing the ESXi version and napp-it, this speed issue is now present even in the VM, where I never seemed to have issues before (copying files through my DC previously would net transfer speeds of up to 300 MB/s without an issue).

I suppose I should see if I can transfer quickly between VMs; then I will know if it is napp-it or the ESXi network.

EDIT: VM-to-VM results...

from Win10 to DC (both VM):



from DC to Win10:



Not quite sure I understand why the speed fell when writing to one VM and not the other... both VMs are stored on an ESXi datastore made up of 2x 512 GB SSDs presented as one 1 TB datastore.
 
Last edited: