Confused about ZFS performance on SSD


bleomycin

Member
Nov 22, 2014
54
6
8
37
tldr: What is going on with these benchmarks, and do I really need a P3700 to get decent performance for VMs hosted on SSD ZFS pools? Ceph: how to test if your SSD is suitable as a journal device? | Sébastien Han

So I have a proxmox host that currently has the following configuration:

2x 240GB mirrored ZFS SanDisk Extreme II SSDs
2x 480GB mirrored ZFS SanDisk Extreme II SSDs

The 240GB drives are mostly empty, hosting just Proxmox itself, and my VMs live on the 480GB mirror. I recently noticed that Samba file transfers in one of my Windows 10 VMs were stalling when copying files from my NAS to itself over 10Gbit.

I ran this simple dd write test (with compression disabled on the filesystem) on the 480GB pool, which I know isn't terribly accurate:

Code:
dd if=/dev/zero of=tempfile bs=1M count=4024 conv=fdatasync,notrunc
4024+0 records in
4024+0 records out
4219469824 bytes (4.2 GB) copied, 35.0058 s, 121 MB/s
I saw as low as 80MB/s in further tests. Chalking this up to the lack of TRIM in ZoL, and the fact that the pool was probably close to 80% full at one point, I destroyed the 480GB pool and forced a TRIM on the drives with:

Code:
mkfs.ext4 -F -E discard /dev/sda
And performance was restored to ~450+MB/s or so in the same tests once I rebuilt the mirror. I wasn't able to secure-erase these drives because they are frozen and in a machine I don't have easy physical access to, so TRIM was the best I could do.
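Side note for anyone finding this later: ZFS on Linux 0.8 and later apparently gained native TRIM support, so on a newer release the mkfs detour shouldn't be needed. A sketch, assuming a recent ZoL version; the pool name tank is a placeholder:

```shell
# let ZFS trim freed blocks automatically as they are released
zpool set autotrim=on tank
# or kick off a one-shot manual trim of the whole pool
zpool trim tank
```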

Long story short, I'm considering getting some newer SSDs that will perform better (especially without TRIM), and stumbled across this post showing benchmarks for a ton of drives: Ceph: how to test if your SSD is suitable as a journal device? | Sébastien Han

I'm horribly confused about what's going on there. How is an 850 Pro only writing at 1.5MB/s, and how will that affect someone like me, a hobbyist without very demanding needs? I just want a system that's reasonably quick and responsive, but from that page you'd think you need nothing less than a P3700 to pull that off.

Thank you for any help!
 

voodooFX

Active Member
Jan 26, 2014
247
52
28
What I would do in your place is monitor the mirror's performance and live in peace unless I see a drastic drop.
Moreover, storing VM disk files is not the same as storing an OSD journal, so I would not get worked up over the results of the benchmarks you linked.
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,625
2,043
113
How many VMs were live when you ran the test?
What do the VMs do?

Just because it's an SSD doesn't mean that one SSD is going to perform well for a lot of VMs, especially a consumer one that isn't overprovisioned.
 

BackupProphet

Well-Known Member
Jul 2, 2014
1,083
640
113
Stavanger, Norway
olavgg.com
On FreeBSD, with 4 Intel DC S3500s connected to a Dell H310 (flashed to IT mode), I get 2000MB/s read and 1800MB/s write on ZFS.

That Ceph test disables the write cache on every SSD and then measures how each SSD performs without it. A volatile cache is not something you want on a server. The Intel DC SSDs can't have their cache deactivated (there is no need to, either), and that is why they perform so well. Beware of SSDs that don't honor acks for when the data is flushed.
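If you want to run that kind of test yourself, it boils down to tiny synchronous writes. A rough sketch against a scratch file so it's safe to paste; the linked post targets the raw device and adds oflag=direct, so expect device results to be even lower:

```shell
# oflag=dsync forces the drive to acknowledge the flush of every 4k
# block, so a volatile cache can't hide the cost; drives that can't
# flush cheaply often collapse to a few MB/s in this kind of test
dd if=/dev/zero of=ssd-sync-test.bin bs=4k count=1000 oflag=dsync
```

Delete ssd-sync-test.bin afterwards.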
 

bleomycin

Member
Nov 22, 2014
54
6
8
37
On FreeBSD, with 4 Intel DC S3500s connected to a Dell H310 (flashed to IT mode), I get 2000MB/s read and 1800MB/s write on ZFS.

That Ceph test disables the write cache on every SSD and then measures how each SSD performs without it. A volatile cache is not something you want on a server. The Intel DC SSDs can't have their cache deactivated (there is no need to, either), and that is why they perform so well. Beware of SSDs that don't honor acks for when the data is flushed.
Ahhhh, that makes much more sense now! Thank you for explaining that. If only it were possible to find more benchmarks of various consumer drives with the write cache disabled, for additional comparison.
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,625
2,043
113
@bleomycin I started benchmarking/testing consumer/cheaper SSDs, but with the price of the S3500 now it's easy to avoid buying consumer at all and just go S3500 :)
 

bleomycin

Member
Nov 22, 2014
54
6
8
37
@bleomycin I started benchmarking/testing consumer/cheaper SSDs, but with the price of the S3500 now it's easy to avoid buying consumer at all and just go S3500 :)
Yeah, that makes sense. What's your opinion on an Intel 750 instead? I see it performs extremely well and is about the same price as an S3500.
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,625
2,043
113
Depends on your usage... if it's consistently in use reading and writing, I would go S3500, starting with 4 drives in mirrored vdevs and then scaling up to 8+ depending on storage/performance needs. The 750 is a fast drive, but its mixed-workload performance and consistency are not at enterprise levels, even compared to SATA. I personally used the 750, then upgraded to a P3700 NVMe plus 8x 400GB SAS SSDs instead :)
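For reference, the 4-drive mirrored-vdev layout I mean looks like this in zpool terms (a sketch; tank and the device names are placeholders), and scaling to 8 drives is just more mirror pairs appended to the same command:

```shell
# two 2-way mirrors striped together: half the raw capacity,
# with reads and writes spread across both vdevs
zpool create tank mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd
```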
 

gea

Well-Known Member
Dec 31, 2010
3,141
1,182
113
DE
NVMe with the PCIe interface is a whole new dimension.
I have some P750s and tested them against the more expensive Intel S3610 in a 10G environment
(use case: multi-user video editing up to 4K).

The result:
You need 2-3 S3610s in a RAID-0 for results comparable to a single P750.
Results with the P3600 were similar to the P750 in my workload.

If you do not need hotplug and can live with up to 6 NVMe drives (which means a max of 6 TB in a RAID-Z1, or 3.6 TB with mirrors, from 1.2 TB P750s on a 7-slot board with one slot left for a 10/40G NIC), you will not find a SATA/SAS SSD solution that is nearly as fast in any workload.
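The capacity arithmetic behind those numbers, as a quick sketch (six 1.2 TB drives):

```shell
# raid-z1 across 6 drives loses one drive to parity: 5 x 1.2 TB usable
awk 'BEGIN { printf "raidz1:  %.1f TB\n", 5 * 1.2 }'
# three 2-way mirrors keep half the raw capacity: 3 x 1.2 TB usable
awk 'BEGIN { printf "mirrors: %.1f TB\n", 3 * 1.2 }'
```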

With regular SSDs I prefer the Intel S3610 as a compromise. It includes the tech from the S3700 (one of the best for write IOPS) but with less overprovisioning, which makes it cheaper. For price-sensitive enterprise SSD purchases I am increasingly considering the Samsung PM/SM line.