High performance local server build - starting advice


gea

Well-Known Member
Dec 31, 2010
3,157
1,195
113
DE
For a Solaris-based free option, I would choose OmniOS, as it is stable with regular security fixes and the newest Illumos enhancements. It may also be the fastest, thanks to LZ4 compression of ARC data.

ReleaseNotes
 

33_viper_33

Member
Aug 3, 2013
204
3
18
Correct - ZFS with either Solaris or Linux will not be able to use SMB3 RDMA; it will be able to use iSER or SRP RDMA, but Windows does not have iSER or SRP baked into Server 2012 or Windows 8/8.1. I have only read one post on OFED indicating Server 2012 running SRP with some older drivers loaded in safe mode, but it sounded very hacky and a terrible pain to get up and running. So the only RDMA access for Windows is with SMB3, with no block-level RDMA protocols, whereas other OSs (Linux and Solaris) have the RDMA block protocols and NFS over RDMA. Hopefully this is what you were looking for?
As I understand it, the WinOF driver adds SRP support. Is this not correct? This won't help him on 10GbE, but could on InfiniBand.
 

danwood82

Member
Feb 23, 2013
66
0
6
gea said: For a Solaris-based free option, I would choose OmniOS, as it is stable with regular security fixes and the newest Illumos enhancements. It may also be the fastest, thanks to LZ4 compression of ARC data.
Ah, yes, well that looks a tad more promising, I'll admit :) I gave OmniOS a try today, and it seems to leave the rest standing!

At first I got some weird connection dropping similar to my FreeNAS experience, but only when using CrystalDiskMark; for some reason it stopped of its own accord after I messed about with NFS for a bit and then switched back to SMB. Now it seems to be rock-solid.

The only question mark now: in CrystalDiskMark I can consistently get ~450MB/s writes, but reads top out at around ~160MB/s, even when the files being read are definitely already cached in RAM.

Do you have any top-tips for tweaking OmniOS/napp-it for maximum raw performance?

I was also wondering about a couple of details regarding sync writes:
- What are the implications if I switch sync to disabled on the zpool? Is there any risk of data corruption, or is the only risk that a file might not get written at all in the event of a power outage, etc.? I seem to get more consistent performance with it off, and with my workload it's not a disaster if a file simply fails to write. Corruption would be bad, though.
- If I switch sync to disabled, does that effectively mean the ZIL is bypassed entirely, or would I still benefit from a dedicated ZIL SSD?
- (Edit) Oh yes, forgot to ask - you mentioned LZ4 compression - is it enabled by default, or would I need to activate it somehow?

Thanks for the suggestion, gea - even if I can't tweak any more out of it, it's already a massive improvement over the Windows setup I currently have. I think I'll definitely be heading down the ZFS route for the final server build. It feels so liberating to think of a system free of mediocre hardware RAID performance :)
 

bds1904

Active Member
Aug 30, 2013
271
76
28
Have you tried disabling QoS packet scheduling in Windows to see if it makes a difference? I've also run into limitations with 1500-byte frames and SMB before. Enabling jumbo frames (7000 bytes) on a 10Gb network made SMB reads go from ~200Mbit to ~700Mbit on a project I worked on with a friend.
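
On the Windows side that is roughly the following in PowerShell (the adapter name "Ethernet" is just an example, and the exact jumbo frame values offered depend on the NIC driver):

# disable the QoS Packet Scheduler binding on the 10GbE adapter (ms_pacer = QoS Packet Scheduler)
Disable-NetAdapterBinding -Name "Ethernet" -ComponentID ms_pacer

# see which jumbo frame sizes the driver offers, then enable one
Get-NetAdapterAdvancedProperty -Name "Ethernet" -DisplayName "Jumbo Packet" | Format-List DisplayValue, ValidDisplayValues
Set-NetAdapterAdvancedProperty -Name "Ethernet" -DisplayName "Jumbo Packet" -DisplayValue "9014 Bytes"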

Just remember that compression can affect speed and latency immensely, but it is worth testing. It is normally disabled by default.

Depending on the workload, sync writes can be a very important feature. You mentioned power outages and you are absolutely correct, but having sync on also helps protect you in the event of a kernel panic. That said, with a good SSD write log (like a striped+mirrored 4x SSD ZIL) you won't see any performance hit from leaving it on. Even with a single SSD ZIL (as long as the SSD has really good write speed) you won't notice the difference.
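
For reference, a striped+mirrored 4x SSD ZIL is just two mirrored log vdevs added to the pool, roughly like this (pool and device names are placeholders):

# two mirrored log vdevs; ZFS stripes across the mirrors
zpool add tank log mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0

# or a single fast SSD as log device
zpool add tank log c1t4d0

# verify the log vdevs
zpool status tank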
 

gea

Well-Known Member
Dec 31, 2010
3,157
1,195
113
DE
I would start with a local benchmark like bonnie (menu pool - benchmark) to check basic values.
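
If you prefer the command line over the napp-it menu, roughly the following (assuming bonnie++ is installed and the pool is mounted at /tank):

mkdir /tank/bench
# -s: test file size (use at least 2x RAM so the ARC cannot hide the disks), -u: user to run as
bonnie++ -d /tank/bench -s 64g -u root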

Next, you should create a volume (menu disks - volume), say 100 GB, then go to menu Comstar, create a LU on it, then a target and a target group with the target as a member, and set a view from the LU to this target group. Now you can run CrystalDiskMark benchmarks via iSCSI from Windows.
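
For anyone who prefers a shell over the napp-it menus, the rough CLI equivalent would look something like this (pool name 'tank', the group name and the <...> placeholders are only examples):

# create a 100 GB volume (zvol) to export via iSCSI; -b sets the blocksize
zfs create -V 100G -b 64K tank/testvol

# make sure the COMSTAR framework and the iSCSI target service are running
svcadm enable stmf
svcadm enable -r svc:/network/iscsi/target:default

# create a LU backed by the volume (note the GUID it prints)
sbdadm create-lu /dev/zvol/rdsk/tank/testvol

# create a target and a target group with the target as member
itadm create-target
stmfadm create-tg tg-bench
stmfadm offline-target <target-iqn>      # group membership changes need the target offline
stmfadm add-tg-member -g tg-bench <target-iqn>
stmfadm online-target <target-iqn>

# set a view from the LU to this target group
stmfadm add-view -t tg-bench <lu-guid>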

Performance-relevant settings are the blocksize (prefer larger values for first tests) and the writeback cache setting (on = fast, off = secure but slow without a ZIL). For first tests I would disable sync (writeback cache = on) and disable compress.
If you add a ZIL for fast, secure sync writes, it must be at least large enough to hold 10s of network traffic (even for a single 10 Gb link, an 8 GB ZIL can be enough, like a ZeusRAM - the best of all); important is very low latency with very good write performance. A good cheaper ZIL SSD is an Intel S3700.
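
The same settings from the CLI, for reference (names are placeholders; wcd is COMSTAR's "write cache disabled" flag, so wcd=false corresponds to writeback cache = on):

# fast but unsafe for the last ~5s of writes: disable sync and compress for the first tests
zfs set sync=disabled tank/testvol
zfs set compression=off tank/testvol

# per-LU writeback cache: wcd=false -> cache on (fast), wcd=true -> cache off (secure, slow without ZIL)
stmfadm modify-lu -p wcd=false <lu-guid>

# later, add a dedicated log device (e.g. an S3700 or ZeusRAM) and re-enable sync
zpool add tank log c2t0d0
zfs set sync=standard tank/testvol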

If your performance tests show massively better writes than reads, I would also look at the Windows side (driver or setting problems etc.). Reads should always be better than writes; otherwise you have an additional problem (Windows, drivers, switch, compress, dedup, ashift etc.).

Compare my values with 10 Gb, mostly sync vs. non-sync:
http://napp-it.org/doc/manuals/benchmarks.pdf

Benchmarks usually show values unaffected by caches; if you are interested in cached values, you need a very large cache or very small benchmark files. For reads you can push performance with a lot of RAM.
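
To see whether reads are really served from the ARC during a benchmark, you can watch the arcstats kstat on Illumos (the statistic names below are the usual ones):

# current ARC size and limits
kstat -p zfs:0:arcstats:size zfs:0:arcstats:c zfs:0:arcstats:c_max

# hit/miss counters - note them before and after a read pass and compare
kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses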

If you disable sync (= enable writeback with iSCSI), all writes for 5s are cached and written as one large sequential write. If you enable sync, this basic behaviour is the same, but every single write command must additionally be confirmed by the log device before it goes to RAM and the next write command is processed. (The ZIL is a separate logging mechanism, not the regular ZFS write caching that transforms 5s of small writes into one large sequential write.) This means that without sync writes you can lose up to 5s of the last writes.

This does not affect your pool consistency, as ZFS is copy-on-write. Your file system will not see any corruption unless your pool goes offline because more disks fail than your redundancy level allows; in that case, your pool is lost. But if the disks come back, your pool is OK without any data corruption. You can pull disks during writes and, aside from the last writes, your pool is OK when you reinsert the disks. This is very different from hardware RAID, where such an action may be a reason for a loss of data and RAID consistency.

For writes, you can push performance only with fast disks, as your write cache is always at most 5s of writes (i.e. 5s of your max network traffic).
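
(I believe that 5s window is the ZFS transaction group interval; on Illumos this should be the zfs_txg_timeout tunable, default 5 seconds. If you really want to look at or change it:)

# show the current value (seconds) via the kernel debugger
echo "zfs_txg_timeout/D" | mdb -k

# make a change persistent by adding this line to /etc/system, then reboot
set zfs:zfs_txg_timeout = 10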

If you want to compare L2ARC behaviour, check the primarycache and secondarycache ZFS properties (I have not tested L2ARC compress myself); see https://www.illumos.org/issues/3137
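
For reference, the relevant properties and how an L2ARC device is added (device name is a placeholder):

# what the ARC and L2ARC are allowed to cache (all | metadata | none)
zfs get primarycache,secondarycache tank

# add an SSD as cache (L2ARC) device
zpool add tank cache c3t0d0
zfs set secondarycache=all tank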

Other tuning options: see the napp-it downloads page (napp-it // webbased ZFS NAS/SAN appliance for OmniOS, OpenIndiana and Solaris).
 

Dennis Wood

New Member
Feb 11, 2014
4
0
1
www.cinevate.com
I have been testing 10G throughput options for about a month now and can make a few suggestions:

1. SMB3 multichannel features in Windows 8 and Server 2012 are quite key for throughput. I have been able to measure 1.48 GB/s real-world transfer speeds using dual 10GbE links between Windows 8.1 workstations over the last week. Jumbo frames as well as some basic NIC driver setting tweaks are required. RSS combined with SMB3 multichannel maps a thread to each CPU core, and setting the RSS queues to match the logical core count in the NIC driver configuration makes a significant difference. Connecting two 10G ports to both server and workstation results in multichannel being negotiated entirely by the OS, so zero config, and nothing required on the switch (see the PowerShell commands after this list).

2. Having tested 2012 server (Storage Spaces and SSD tiering were way too slow), I ended up using a rather inexpensive RocketRAID 2720 and 6 x 4TB Hitachi 7200 rpm drives for real-world 700MB/s writes and 500MB/s reads in RAID 5 with 18TB usable. The 2720 can take 8 SATA 6Gb/s connections. The test server has 16GB RAM (all drives have read/write caching turned on), as does the workstation, so at this point files under 14TB are moving to the server at 1.1 GB/s. A backup using ShadowProtect (compressing and encrypting all files) sustained 400MB/s from a single-SSD host to the server array over 10G. I'm doing my final build with a single 8-core processor (instead of the four-core in the server now), as the 2720 RAID card requires some CPU during writes. More importantly, the RSS vs. core count vs. MS SMB3 multichannel relationship is obvious in testing. Samba's SMB3 implementation does not support multichannel... yet.
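
To check and tune the multichannel/RSS side on Windows 8 / Server 2012, roughly the following (adapter names and queue counts are only examples; match the RSS queues to your logical core count):

# confirm SMB multichannel is actually in use and over which interfaces
Get-SmbMultichannelConnection
Get-SmbClientNetworkInterface

# check RSS and set the number of receive queues on the 10GbE adapters
Get-NetAdapterRss -Name "10GbE*"
Set-NetAdapterRss -Name "10GbE 1" -NumberOfReceiveQueues 8 -Enabled $true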

So, a better RAID card (so parity is not on the CPU), eight 4TB drives, and 2012 server with as much RAM (cache!) as you can afford would give you a large cache at DDR3 RAM speeds and disk I/O to keep up. I'm using Intel X540 cards and fairly pedestrian workstations in my testing. The "server" right now is just an i5 at 3.4 GHz with 16GB RAM. The final server will use a Z87 chipset workstation board with an i7 4770 and 32GB RAM. It will host our creative team (connected via 10G for a shared Adobe CC workflow) and R&D team, and provide SQL services etc. for my business team. Embedded 10G server boards all seem to be dual-Xeon iterations, and power efficiency (with adequate performance) steered my decision. Drives in, we're at about $2300 to build this box.

Cheers,
Dennis (Cinevate - Tools for Filmmakers and Photographers)
 

Aluminum

Active Member
Sep 7, 2012
431
46
28
I see a ZFS "RAID10" powered by an X9SRH-7TF in your future; it's cheap and checks the SAS HBA and 10GbE boxes. (Flash the HBA to IT mode for ZFS. BTW, is your 10GbE setup SFP+ or RJ45?)

Not all hardware cards do it properly, but ZFS definitely reads "RAID10" configs as fast as a RAID 0 with the same drive count.
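
A ZFS "RAID10" is just a pool of striped mirror vdevs, something like this (device names are placeholders):

# four striped 2-way mirrors: reads are spread over all eight drives
zpool create tank mirror c0t0d0 c0t1d0 mirror c0t2d0 c0t3d0 mirror c0t4d0 c0t5d0 mirror c0t6d0 c0t7d0
zpool status tank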

I'm also with gea: RAM it up. The good news is that most good UP E5 boards take unbuffered or registered DIMMs and have 8 DIMM slots, so 64-256GB is possible; the bad news is that RAM is 2-3x more expensive than it used to be.

Another silver lining: 4 or 6 cores is plenty for a filer, and UP 16xx E5 Xeons are relatively cheap and not clock-neutered like their 2P counterparts. (Compare and shake your head at Intel's fully operational monopoly.)
 

Dennis Wood

New Member
Feb 11, 2014
4
0
1
www.cinevate.com
That Supermicro board (Supermicro | Products | Motherboards | Xeon® Boards | X9SRH-7TF) stood out in my online search. With embedded 10G as well as 8 x SAS2 on the LSI 2308, even at $800 or so it's a great choice where 10G and RAID are required. It was the only single-socket board (the duals are pretty watt-thirsty) I could find currently available with all the right stuff for a small file server. Asus has several comparable embedded 10GbE boards, but none seem to be listed anywhere yet.