SSD disk performance issues with ZFS


bibin

New Member
Hello,

We are currently testing ZFS on Linux as a storage platform for our VPS nodes, but we don't seem to be getting the performance figures we expected. Can you please suggest what we should be tweaking to reach higher IOPS?

Hardware is Supermicro with the MegaRAID 2108 chipset as a daughter card on each server. We tested three servers: pure SSD with 4 x 480GB Mushkin Chronos drives; 4 x 600GB SAS 10k drives with a 480GB SSD cache; and 4 x 1TB SAS 7.2k drives with a 480GB SSD cache.

We set the onboard RAID controller to essentially JBOD (a RAID-0 per drive with the controller cache turned off). We got the best performance using RAID-Z2 with LZ4 compression. Here are the results we saw:

Server | RAID | Filesystem | Read Speed | Write Speed | Read IOPS | Write IOPS
Pure SSD, 4 x 480GB Chronos drives | Soft RAID-Z2 | ZFS without compression | 4.1GB/s | 778MB/s | 23025 | 7664
Pure SSD, 4 x 480GB Chronos drives | Soft RAID-Z2 | ZFS with LZ4 compression | 4.6GB/s | 1.8GB/s | 47189 | 15715
4 x 600GB SAS 10k drives, 480GB SSD cache | Soft RAID-Z2 | ZFS without compression | 4.0Gb/s | 486Mb/s | 10234 | 3413
4 x 600GB SAS 10k drives, 480GB SSD cache | Soft RAID-Z2 | ZFS with LZ4 compression | 4.8Gb/s | 2.2Gb/s | 51056 | 17077
4 x 1TB SAS 7.2k drives, 480GB SSD cache | Soft RAID-Z2 | ZFS without compression | 4.1Gb/s | 1.4Gb/s | 53486 | 17840
4 x 1TB SAS 7.2k drives, 480GB SSD cache | Soft RAID-Z2 | ZFS with LZ4 compression | 4.4Gb/s | 1.7Gb/s | 37803 | 12594


It doesn't seem like there is a big difference between the pure SSD setup and the others, even without the SSD cache on the other setups. Is there something we are missing here, or something we should be looking into? We were expecting the IOPS to be a lot higher than these results.
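For reference, a minimal sketch of how a pool like this might have been created (the exact commands weren't posted; the pool name "tank" and the /dev/sd* device names are assumptions):

Code:
# One RAID-Z2 vdev across the four drives the controller presents as JBOD.
# ashift=12 aligns to 4K sectors, a common choice for SSDs.
zpool create -o ashift=12 tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde
# Enable LZ4 for the "with LZ4 compression" runs:
zfs set compression=lz4 tank
# Verify layout and properties:
zpool status tank
zfs get compression,recordsize tank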

Thank you for your help!
 

whitey

Moderator
That's because you have the cache drives doing most of the heavy lifting for the hybrid pools. For SUSTAINED or low-latency apps/stacks, I bet the pure SSD AFA datasets run circles around the hybrid zpools.
 

capn_pineapple

Active Member
What were your testing parameters, i.e. number of files and file size? Or did you just iperf it?

It's bizarre that the hybrid pools would perform better than the pure SSD pool.
 

whitey

Moderator
OK, you got me there, my eyes crossed... still not bad numbers for single-vdev RAID-Z2 configs, depending on the benchmark used.
 

bibin

New Member
Hi,

I ran those benchmarks on the zpool datasets with and without LZ4 compression. None of the servers show a big difference in IOPS performance. Please advise if there are any settings we should tweak on the servers, because we were expecting higher IOPS, especially on the SSD server.
 

cperalt1

Active Member
What is the max IOPS of just one SSD? When doing a RAID-Z pool, the speed of the pool is limited by the slowest device, and I believe that is what you are seeing with the pure SSD pool, since every transaction must be confirmed on each SSD. In the hybrid pool it is only confirmed on the SSD cache and then flushed to disk, hence the slightly higher IOPS.
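To put that in concrete terms, here is a hedged sketch (pool name and device names assumed) of the two layouts: random IOPS scale with the number of vdevs, so two mirror vdevs should give roughly twice the random IOPS of a single four-disk RAID-Z2 vdev.

Code:
# Option A: one RAID-Z2 vdev - the pool delivers roughly the random IOPS of one disk.
zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde
# Option B: striped mirrors ("RAID-10") - two vdevs, so roughly twice the random IOPS.
zpool create tank mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde
# (Create only one of these; the same pool name is shown just for comparison.)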
 

whitey

Moderator
Also, those aren't exactly enterprise-grade SSDs; you may see better results from higher-end devices or from using a striped-mirror zpool setup (RAID-10).
 

bibin

New Member
Hi,

We have tested with one SSD and RAID-10, but we keep getting much the same results as the Z2 runs. Here are the results we saw:

Server | RAID | Filesystem | Read Speed | Write Speed | Read IOPS | Write IOPS
Pure SSD, 4 x 480GB Chronos drives | Soft RAID-0 | ZFS without compression | 4GB/s | 552MB/s | 14078 | 4693
Pure SSD, 4 x 480GB Chronos drives | Soft RAID-0 | ZFS with LZ4 compression | 4.6GB/s | 1.9GB/s | 46211 | 15356
Pure SSD, 4 x 480GB Chronos drives | Soft RAID-10 | ZFS without compression | 4.2GB/s | 995MB/s | 24868 | 8302
Pure SSD, 4 x 480GB Chronos drives | Soft RAID-10 | ZFS with LZ4 compression | 4.6GB/s | 1.9GB/s | 48585 | 16224


We tested IOPS using the fio tool, and there doesn't seem to be a big difference compared to the other drives. Is there something we should be looking into? I think the IOPS should reach up to 78,000, as per the manufacturer's rated performance.
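Since the exact fio job wasn't posted, here is a hedged example of the kind of 4K random-write run that would produce figures like these (the mount point /tank/fiotest, file size, job count and queue depth are all assumptions):

Code:
# 4K random writes against files on the ZFS dataset; the total size should exceed
# RAM/ARC or the ARC will absorb most of the I/O. O_DIRECT may not be supported
# on ZFS datasets, so buffered I/O is used here.
fio --name=randwrite4k \
    --directory=/tank/fiotest \
    --rw=randwrite --bs=4k \
    --size=8G --numjobs=4 --iodepth=32 \
    --ioengine=libaio \
    --runtime=60 --time_based \
    --group_reporting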
 

bibin

New Member
We have also tested a Samsung SSD and an Intel SSD as ZIL, and we got slightly better performance than the pure SSD server when we used a Samsung 850 EVO and an Intel S3700 configured as ZIL alongside the SAS drives. Here are the results we saw:

Server | RAID | Filesystem | Read Speed | Write Speed | Read IOPS | Write IOPS
4 x 1TB SAS 7.2k + 800GB Intel S3700 SLOG/ZIL | Soft RAID-Z2 | ZFS with LZ4 compression | 4.4Gb/s | 1.7Gb/s | 54385 | 18045
4 x 1TB SAS 7.2k + 800GB Intel S3700 SLOG/ZIL | Soft RAID-Z2 | ZFS without compression | 3.7Gb/s | 1.3Gb/s | 39275 | 13082
4 x 600GB SAS 10k + 250GB Samsung 850 EVO SLOG/ZIL | Soft RAID-Z2 | ZFS with LZ4 compression | 4.8Gb/s | 2.2Gb/s | 51056 | 17077
4 x 600GB SAS 10k + 250GB Samsung 850 EVO SLOG/ZIL | Soft RAID-Z2 | ZFS without compression | 4.0Gb/s | 486Mb/s | 10234 | 3413



We were expecting higher IOPS on the pure SSD server. Can you please suggest what we should be tweaking to reach higher IOPS?
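For reference, a hedged sketch of how a SLOG (and an optional L2ARC) device is attached to an existing pool; the pool name and device paths are placeholders, as the commands actually used weren't posted:

Code:
# Add the Intel S3700 as a separate log device (SLOG) to the pool "tank":
zpool add tank log /dev/disk/by-id/ata-INTEL_S3700_PLACEHOLDER
# Optionally add another SSD as an L2ARC read cache:
zpool add tank cache /dev/disk/by-id/ata-SAMSUNG_850_EVO_PLACEHOLDER
# The SLOG only helps synchronous writes; check what the dataset is doing:
zfs get sync tank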
 

T_Minus

Build. Break. Fix. Repeat
Is there a reason you're not testing your SSD Pool + S3700 SLOG ?? I'd do that.
 

Quasduco

Active Member
bibin said: (quoting the RAID-0 / RAID-10 benchmark results and the 78,000 IOPS expectation from the post above)
Unless there is another Mushkin Chronos drive not on their site, they show the plain Chronos at 37k IOPS max and the Chronos Deluxe at 42k.

See:
http://www.poweredbymushkin.com/index.php/catalog/item/11-chronos/673-chronos-480gb
http://www.poweredbymushkin.com/ind...663:chronos-deluxe-480gb&cid=9:chronos-deluxe
 

bibin

New Member
Hi,

We are using the Mushkin Chronos drives shown in the attachment. They are rated at up to 78,000 IOPS for 4KB random read and up to 42,000 IOPS for 4KB random write, but we are getting 4KB random write IOPS between 15,000 and 17,000. You can see the performance details at the top of this thread. We are looking for some tweaks; please advise.
 

Attachments

gea

Well-Known Member
What do you expect?

The Chronos is an older SandForce-based desktop SSD.
If you get 10,000 write IOPS under load you can be happy -
I would expect less.

The problem with datasheets
The values are "synthetic values" from a very special lab environment,
measured on a new SSD, and they can be achieved only for a very short time.

If you compare one of the best enterprise SSDs, an Intel S3700, they claim
30,000-40,000 IOPS for 4k random write, and they have the highest-quality
flash, a very good controller and huge built-in overprovisioning. They also
have powerloss protection - quite a must for production use of SSDs, as there
is always background garbage-collection activity by the firmware; at least
for an Slog, but suggested also for SSDs in a datapool.

Compare values:
Mushkin Chronos 240 GB Review

The general problem
A single 6G spindle disk can give 50-200 MB/s sequentially, depending on inner/outer
tracks and whether you use one large file or several smaller files, with about 100 IOPS.

A single 6G desktop SSD can give 400-500 MB/s sequentially and up to 10k write IOPS
under load, with higher read values. Enterprise SSDs reach up to 4x that.

Sequential read/write performance scales with the number of data disks (2x the data disks
on reads from a mirror), while IOPS scale with the number of vdevs, because every disk must
reposition on every read/write - which means a RAID-Z with one vdev has the same IOPS
as a single disk.

If your tests give you much better values, you must check for cache effects. That is not
bad, as you use a cache to increase performance; it is more of a problem if you want to
test pure disk quality.

What I would do (a command sketch follows below):
- do all tests without LZ4 (you do not want to benchmark LZ4)
- do all basic tests without L2ARC or Slog
- if you need IOPS from spindles, go RAID-10
- check arcstat.pl to decide whether you need to add an L2ARC
- you can use a Chronos for L2ARC; the S3700 as Slog is perfect
- use an Slog (sync enabled) only when needed, e.g. databases or VM storage with old filesystems
- do a secure erase and add manual overprovisioning of 10-20% if you want the best write IOPS under load from the Chronos

If you want to test real disk performance, reduce RAM to 2GB with test file sizes of 4GB and above
(or reduce the max ARC cache in /etc/system).
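A hedged sketch of the commands behind these suggestions (the 2GB figure is just the example above; /etc/system applies to Solaris-family systems, while ZFS on Linux uses a module parameter):

Code:
# Watch ARC/L2ARC hit rates (5-second interval) to decide whether an L2ARC would help:
arcstat.pl 5
# Solaris/OmniOS: cap the ARC at ~2GB in /etc/system, then reboot:
#   set zfs:zfs_arc_max=2147483648
# ZFS on Linux equivalent:
echo 2147483648 > /sys/module/zfs/parameters/zfs_arc_max
# Enable sync (and therefore the Slog) only on datasets that need it, e.g. VM storage:
zfs set sync=always tank/vms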
 

whitey

Moderator
THANK YOU THANK YOU THANK YOU Gea, you hit it on the head!

Not trying to be a wiseguy at all, but if you expect your results to be in EXACT lockstep w/ what vendors 'claim', you'd have to abide by the old saying 'there's a sucker born every minute'.

What you are getting is MORE than enough to drive a substantial workload/amount of I/O. Are you seeing bottlenecks currently w/ a 'real' workload running on these zpools, or are you just trying to squeeze the last bit o' juice outta the pool?
 

Shog

New Member
Hello,

I know this is a bit late, and sorry to resurrect the thread. The results posted in the opening post may be more different than you have realised. On the assumption that the throughput figures come from fio (I'd be interested in knowing exactly what was run, and how) and are not typos, they plainly show major differences between the SSDs and the HDDs. Throughput is listed in GB and MB per second for the SSDs, but in Gb and Mb per second for the HDDs, which means there is an 8x difference between figures that look similar at first glance: "B" as in byte, "b" as in bit, of which there are 8 in a byte:

Pure SSD, 4 x 480GB Chronos drives | Soft RAID-Z2 | ZFS without compression | 4.1GB/s | 778MB/s | 23025 | 7664
4 x 600GB SAS 10k drives, 480GB SSD cache | Soft RAID-Z2 | ZFS without compression | 4.0Gb/s | 486Mb/s | 10234 | 3413
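
Worked out explicitly, assuming the units are exactly as printed: 4.0Gb/s ÷ 8 = 0.5GB/s and 486Mb/s ÷ 8 ≈ 61MB/s for the SAS pool, versus 4.1GB/s and 778MB/s for the pure SSD pool.
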
I only noticed this thread because of Google-fu while trying to troubleshoot performance on a FreeNAS system with 8 x 2TB Samsung 850 EVOs as 4 x mirror vdevs and 2 x partitioned 280GB Intel Optane 900Ps as SLOG and L2ARC.

Most of the issues I've been chasing appear related to drive write caches not being enabled, and you'd be amazed how slow an SSD can get when ZFS addresses it synchronously.

Having noticed and fixed the issue on one system (a Dell R730 with the PERC 730 Mini in HBA mode not enabling write caches on the drives), I now appear to be hitting it on a subsequent system where write caching _is_ enabled. There, the fact that I created the pool by hand rather than through the GUI appears to be causing ZFS to treat the drives synchronously. Also, the FreeNAS GUI is apparently unable to detect and import a pool built on a per-disk basis rather than a per-partition basis.
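A hedged sketch of the checks involved (pool and device names are placeholders; camcontrol is the FreeBSD/FreeNAS tool, hdparm the Linux one):

Code:
# Is ZFS forcing synchronous semantics on the dataset?
zfs get sync tank
# sync=standard honours what the application asks for; sync=always pushes every
# write through the ZIL/SLOG; sync=disabled ignores sync requests entirely.
# FreeBSD/FreeNAS: inspect the drive's caching mode page (WCE bit):
camcontrol modepage da0 -m 8
# Linux: check (and with -W1, enable) the drive's write cache:
hdparm -W /dev/sda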

If you can post your fio commands, I'll attempt to run them on the setup here...