A few zfs questions for a rebuild...


danwood82

Hey all,

I'm about to do a clean wipe and rebuild of my home-office server, after a solid 5 years running OmniOS/napp-it.

It's been a mostly great experience - except for a recurring issue where the SMB service would peg the CPU at 100%, thrashing it past 85°C while the server was otherwise completely idle, until I manually logged in and restarted the SMB service. That's my main reason to wipe and update - I assume whatever that particular gremlin is, it'll probably be gone in later versions.
Oh - and the 9-out-of-12 Seagate 3TB drive failures have been annoying, especially as they were all out of warranty, but that's on Seagate.

I thought I'd take the opportunity to clue up a bit more about how to tune the setup. Dumb amateur questions incoming :)

Firstly - I use it as a general everything-and-anything file-storage device, but with special emphasis on visual-effects simulations kicking out massive cache files (writing 1-2GB per frame, every few minutes, from 4 different machines... but then wanting to load sequences of 200-300GB of those frames back as quickly as possible once they're written). Each of those four machines is connected via a dedicated 10GbE link to one port of the server's two dual-port 10GbE adapters, without a switch.

Given all that - I'm wondering about my understanding of l2arc, zil/slog, sync enabled/disabled, etc.

As it stands, I have a 6-core Xeon (E5-1650 I think), 64GB of RAM in there, and 3x Corsair Neutron GTX 240GB drives as an L2ARC.
Now, I've gotten the impression over time that the L2ARC wasn't really doing much for my workload, and I've just today read a post suggesting that I'm probably massively overdoing it, possibly to the detriment of my RAM-based caching, given that the L2ARC needs to maintain a sizeable index in RAM.
- I know I'd be better off whacking in an extra 64GB of RAM - but I'd be interested to know whether that would actually improve things much given my single-user (four client machines), mostly-large-sequential workload. Is it worth buying the extra RAM?
- With or without the extra RAM, would it actually be a good idea to reduce and/or scrap the L2ARC altogether?
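
(In case it helps anyone advise: my understanding is that the relevant counters can be read on OmniOS with kstat - something roughly like this, though I may have the stat names slightly wrong:)

Code:
# overall ARC size vs. its limit
kstat -p zfs:0:arcstats:size
kstat -p zfs:0:arcstats:c_max

# L2ARC size on the SSDs, and how much RAM its index/headers consume
kstat -p zfs:0:arcstats:l2_size
kstat -p zfs:0:arcstats:l2_hdr_size

# is the L2ARC actually getting hits?
kstat -p zfs:0:arcstats:l2_hits
kstat -p zfs:0:arcstats:l2_misses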

I've run the thing with sync=disabled up until now - with no SLOG, and with Optane not existing back when I built it, it clearly couldn't function as high-performance storage with sync enabled.
Firstly, I'm wondering if sync=disabled is fine to leave as it is, given I'm running without a UPS. Most of what I write isn't critical - if a file doesn't get written, I just relaunch that sim/render.
Then, if I have sync=disabled - is there *any* point to having an SLOG device? Does sync=disabled literally turn off the intent log, or would there still be a benefit? i.e. if such a thing exists, could writing even an async intent log to an Optane 900p - which would presumably complete much sooner than flushing to the pool - act as a form of insurance against data loss? Would there be any actual performance benefit? And even if there is, is it worth bothering with at all, or should I maybe just go for a cheap 32GB Optane instead?


Lastly, I was surprised at the time, but I was told at some point that it was fine to run SMB and NFS simultaneously... (the rack machines are Linux, the main workstation is Windows, and Windows' NFS client is a pain in the ass that stalls out for 30 seconds whenever I don't use the connection for a couple of minutes). I've found it to work fine in this configuration, and have never noticed any corruption or unpleasant behaviour.
Mostly just want to double-check that this is definitely still sound advice, and I'm not ignorantly tempting fate.
Additionally though, are there any performance-penalty implications here? Could it be a better idea to try and get the Linux clients connecting over SMB instead of NFS? (I think I'm about done fiddling with Windows' NFS implementation.)


Any additional insight most welcome too.

Thanks in advance to anyone who's taken the time to wade through that wall of text!
 

gea

I feel your pain with the 3TB Seagates. I also nearly lost a Z3 backup pool with them - after two years of working fine, they failed disk by disk.

Basically you use OmniOS as a filer with a multi-user workload of large sequential files. The basic read/write behaviour is then:

On reads you mainly read data sequentially from disk. The RAM-based ARC read cache mainly helps with small random I/O and metadata, not with your large sequential reads, so your read performance is limited mainly by the sequential read performance of the pool. I would not expect much help from more RAM or an L2ARC, aside from a small L2ARC effect when you enable read-ahead.

For writes, ZFS collects all writes in the RAM-based write cache (default 10% of RAM, max 4GB), so in your case you are using a 4GB write cache. When the cache is full it is flushed to the pool as one large, fast sequential write. This is what makes ZFS fast with small I/O or multi-user I/O.
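
(The write cache size is governed by the dirty-data tunables - on current illumos/Open-ZFS these should be zfs_dirty_data_max and zfs_dirty_data_max_percent, but verify against your release. If you ever want a larger cache it can be set in /etc/system, e.g.:)

Code:
* /etc/system - raise the ZFS write cache cap from 4GB to 8GB (reboot required)
set zfs:zfs_dirty_data_max = 0x200000000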

On a crash during a write, the content of the cache with the last-written files is lost. Due to CopyOnWrite this does not affect ZFS itself, only the files currently being processed. You can enable sync to protect the cache, but with large files this will not really help, as a file that was not yet fully written is corrupted anyway. I would disable sync in your case. An SLOG is then not used/needed.
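
The sync property is per pool or per filesystem, so you can keep it on for anything you care about. A minimal example (pool/filesystem names are placeholders):

Code:
zfs set sync=disabled tank          # async writes for the whole pool
zfs set sync=standard tank/docs     # keep the default for important data
zfs get -r sync tank                # verify what is in effect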

In your case, check performance first. If it is too slow, add more vdevs (to improve IOPS) and data disks (to improve sequential performance) to your pool. Mainly you can improve read performance with a fast pool layout. Run a Pool > Benchmark, which displays a mix of random and sequential performance values with sync enabled and disabled. If you like, you can copy/paste the values here as code to keep the table readable.
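
You can also watch how load is spread over the vdevs from the console while a copy is running, e.g. (pool name is a placeholder):

Code:
zpool iostat -v tank 5    # per-vdev bandwidth and iops, refreshed every 5s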

You can use NFSv3 and SMB simultaneously. The only problem is permissions, as NFSv3 does not offer authentication or authorisation. Best is a fully open file ACL setting with SMB guest access, or optionally restrict access on the SMB side via share-based permissions.
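
A rough sketch of the "fully open ACL" variant (filesystem name is only an example):

Code:
# share the same filesystem over SMB and NFS
zfs set sharesmb=on tank/data
zfs set sharenfs=on tank/data

# everyone gets full access, inherited by new files and folders
/usr/bin/chmod -R A=everyone@:full_set:fd:allow /tank/data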
 

danwood82

Many thanks as ever gea!

When you say "a fast pool layout" - I've been running 12 disks as 6 mirror pairs striped. I presume that's the best layout I can use for performance while maintaining redundancy?
Also, just to double-check - the way to set up a stripe of mirror pairs is to create the pool as a single mirror pair, then add each further mirror pair to it as another vdev, correct? There's not a specific step to set up the stripe itself - or have I just been running a bunch of independent mirrors this whole time?
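
(i.e. what I've been doing is essentially this - disk names are just placeholders:)

Code:
# start the pool with one mirror pair...
zpool create tank mirror c1t0d0 c1t1d0
# ...then add each further mirror pair as an extra vdev
zpool add tank mirror c1t2d0 c1t3d0
zpool add tank mirror c1t4d0 c1t5d0
# writes are striped across all mirror vdevs automatically
zpool status tank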
 

gea

A multi-mirror pool (raid-10) is the fastest layout. Every additional mirror increases IOPS by around 100, sequential write performance by roughly one disk, and sequential read performance by roughly two disks (when the pool is balanced and you do not hit other hardware limits). With your six mirrors that works out to roughly 600 write IOPS, about 6x single-disk sequential write and about 12x single-disk sequential read.
 

gea

Current data remains where it is. Only new or modified data (due to CopyOnWrite) is spread over the whole pool. To rebalance existing data you must copy or replicate it, e.g. to a temporary filesystem, then delete the old folder or filesystem and rename the temporary one.
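
For example with a send/receive inside the pool (names are only examples):

Code:
# replicate the filesystem to a new one on the same, now larger pool
zfs snapshot tank/data@rebalance
zfs send tank/data@rebalance | zfs receive tank/data_new

# after verifying the copy, remove the old filesystem and rename the new one
zfs destroy -r tank/data
zfs rename tank/data_new tank/data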
 

danwood82

Neat, good to know.

Well, I think that covers everything (and saves me some unnecessary expenditure :))

Thanks again for all your help and work!
 

danwood82

Hmm, well I've done my rebuild, using OmniOS r151030 and napp-it 18.12w.

All is well, except for some glaring network issues:

For some reason, my 10GbE adapter gives blistering-fast speeds reading from the server, but write speeds have dropped to about half of a 1GbE connection (55-60MB/s)... I tried writing large sequential files to the server over the 1GbE connection via a switch, and that is twice as fast and saturates the link. This is with a standard Intel X520-T2 (ixgbe0) with all default settings except 9000 MTU (set at both ends) - as far as I recall, exactly as I had it before the rebuild, where it worked fine.

The other big issue: I assumed the rebuild would mean that SMB shares would now default to SMB 2.1, given that recent Windows 10 updates deprecated support for SMB 1 (and my laptop specifically is, for some reason, preventing me from reinstalling the Windows component to support it again).
Is there something I need to configure to make it serve 2.1 shares? When I check through Windows PowerShell (Get-SmbConnection), it tells me the share is "Dialect 1.5".
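
(For reference, this is the check I'm running on the Windows side:)

Code:
# shows the negotiated SMB dialect per mapped connection
Get-SmbConnection | Select-Object ServerName,ShareName,Dialect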
 

danwood82

Urgh... seems Windows is being its usual strange self. I mapped a different drive letter to the share over the 1GbE link, and it shows that one as "Dialect 2.1". Seems there's some stale information attached to the mapping on the 10GbE adapter that I'm struggling to get rid of.
This is all on an older version of Windows 10 that still has SMB 1.0 support, however. On my laptop, with the updated version and the missing 1.0 support, it still can't find the share at all. Is there some issue where Windows still needs the 1.0 package even to access 2.1 shares?
 

danwood82

Right, worked out the laptop thing... in case anyone else is looking for the information - apparently it's an additional security change introduced alongside deprecating SMB 1.0: guest accounts are now also denied access to SMB 2.x shares by default.

The Group Policy setting is at:
Computer Configuration > Administrative templates > Network > Lanman Workstation > Enable insecure guest logons
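
(For anyone on Windows 10 Home without gpedit: I believe the equivalent registry value is the one below, but double-check before relying on it - setting it to 1 re-allows insecure guest logons for the SMB client.)

Code:
reg add HKLM\SYSTEM\CurrentControlSet\Services\LanmanWorkstation\Parameters /v AllowInsecureGuestAuth /t REG_DWORD /d 1 /f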

Just got to work out what's going on with the phantom 1.5 mapping on the 10GbE card now.
(Edit: Finally got that too... you have to remove the relevant stale entries under HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\MountPoints2\
- then delete and re-create the share mappings. It's reporting 2.1 for me at last, and my write speeds are now back where they should be. What a mess!)

Seems like the server itself is working great... all this is just Windows issues, so I suppose I should stop talking to myself about it in this particular forum!
 

DedoBOT

Be sure that SMB signing is disabled at both ends too.
In some cases it also helps to set the interrupt moderation rate to low/medium in the properties of the Windows clients' NICs - note that the CPU will take on a little more load.
Don't use SMB 1 - that's one of the very few security rules I obey :)

Try without jumbo frames.
The first thing is to check the raw network connectivity (iperf3 is the tool), in both directions, and tune it up if necessary. With an out-of-the-box 10GbE setup it's not unusual to get under 5Gb/s in one direction or the other.
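
Roughly like this - start the server side on one end, then run the client both ways (the address is just an example):

Code:
iperf3 -s                      # on the server / OmniOS box
iperf3 -c 10.0.0.1             # client -> server throughput
iperf3 -c 10.0.0.1 -R          # reverse direction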
 

danwood82

DedoBOT said:
Be sure that SMB signing is disabled at both ends too. ... The first thing is to check the network connectivity (iperf3 is the tool), in both directions, and tune it up if necessary.
Thanks - just wanted to update in case anyone else is messing with this stuff.

Because the 10GbE NICs are all direct links without a switch, I ended up disabling Flow Control and Interrupt Moderation entirely... and combined with 9k jumbo frames, sync writes disabled, and the upgrade to SMB 2.1, I'm now pretty much maxing out the connection - ~1GB/s reads and writes consistently.
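
(I did it through the adapter's Advanced properties dialog; I believe the PowerShell equivalent is roughly the below, though the exact display names depend on the Intel driver - list them first to be sure. The adapter name is just a placeholder.)

Code:
Get-NetAdapterAdvancedProperty -Name "10GbE link"
Set-NetAdapterAdvancedProperty -Name "10GbE link" -DisplayName "Flow Control" -DisplayValue "Disabled"
Set-NetAdapterAdvancedProperty -Name "10GbE link" -DisplayName "Interrupt Moderation" -DisplayValue "Disabled"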

Nearly brought a tear to mine eye :)
 

DedoBOT

I feel you :)
With the latest additional 64GB, the RAM is now 192GB @ 2400MHz across 6 channels. The outcome:
(attached benchmark screenshot: aja50-50r.jpg)
NAS with 2x 10GbE NICs in active LACP and a single 10GbE Win7 Pro client.
 