NVMe performance on IBM x3650 M3


decibel83

New Member
Aug 13, 2019
Hi,
On my IBM x3650 M3 server (model 7945K3G) I added two Kingston A1000 M.2 NVMe SSDs with 960 GB of capacity (the exact model is SA1000M8/960G), each mounted in an M.2-to-PCIe adapter (https://www.amazon.it/gp/product/B07CBJ6RH7/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&psc=1).

I'm using Proxmox, and I created a new ZFS mirror pool on them to run some benchmarks.
The server already has a RAID5 array of 8 x 500 GB Seagate ST9500430SS (IBM-ESXSST9500430SS) drives configured on an IBM M5015 RAID controller with a BBU:

Code:
root@server:/# megacli -LDInfo -Lall -Aall
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-5, Secondary-0, RAID Level Qualifier-3
Size                : 3.176 TB
Sector Size         : 512
Is VD emulated      : No
Parity Size         : 464.729 GB
State               : Optimal
Strip Size          : 128 KB
Number Of Drives    : 8
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAheadNone, Cached, Write Cache OK if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Cached, Write Cache OK if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Enabled
Encryption Type     : None
Is VD Cached: No
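For context, the M5015's BBU-backed writeback cache is what lets the controller acknowledge fsyncs from its cache RAM instead of the spinning disks. A rough way to confirm the battery and cache policy are in good shape (same megacli binary as above; exact output fields vary by firmware):

Code:
# BBU health -- writeback normally stays enabled only while the battery is good
megacli -AdpBbuCmd -GetBbuStatus -aAll | grep -iE 'battery state|charg'

# Cache policy currently applied to the virtual drive(s)
megacli -LDGetProp -Cache -LAll -aAll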
I used Proxmox's pveperf utility to run some tests, and what surprised me is that I'm getting a worse fsync value on the new NVMe drives than on the RAID5 array:

pveperf execution on RAID5 array:

Code:
root@server:/# pveperf /var/lib/vz
CPU BOGOMIPS:      76803.76
REGEX/SECOND:      1253534
HD SIZE:           3044.94 GB (/dev/mapper/pve-data)
BUFFERED READS:    63.95 MB/sec
AVERAGE SEEK TIME: 14.82 ms
FSYNCS/SECOND:     1781.54
DNS EXT:           70.81 ms
DNS INT:           4.11 ms (domain.com)
pveperf execution on ZFS on NVMe:

Code:
root@server:/# pveperf /mnt/nvme/
CPU BOGOMIPS:      76803.76
REGEX/SECOND:      1150312
HD SIZE:           31.25 GB (/dev/zd16)
BUFFERED READS:    1129.42 MB/sec
AVERAGE SEEK TIME: 0.04 ms
FSYNCS/SECOND:     421.13
DNS EXT:           167.97 ms
DNS INT:           4.84 ms (domain.com)
1781.54 on SAS RAID5 and 421.13 on NVMe!
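
As a sanity check outside pveperf, I suppose something like the fio run below should show the same behaviour, since FSYNCS/SECOND is essentially small writes each followed by an fsync (the file path is just an example on the NVMe pool):

Code:
# 4k writes with an fsync after every write, similar to what pveperf measures
fio --name=fsync-test --filename=/mnt/nvme/fsync-test.dat \
    --ioengine=sync --rw=write --bs=4k --size=256M --fsync=1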

Could you help me understand where I'm going wrong, please?
Thank you very much!
 

vanfawx

Active Member
Jan 4, 2015
Vancouver, Canada
Kingston consumer SSDs are not known for their fsync performance. Most likely each fsync is forcing the drive to flush its write buffer, because the NVMe has no power-loss protection, and TLC NAND isn't known for its speed, so those flushes take real time.

You might want to look at M.2 enterprise SSDs that have power-loss protection.
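
If you want to confirm what the drive is doing, nvme-cli can show whether a volatile write cache is present and enabled -- if it is, every sync write from ZFS comes with a cache flush that the A1000 has to commit to NAND before acknowledging (the device path below is just an example):

Code:
# Does the controller report a volatile write cache? (vwc field, 0x1 = present)
nvme id-ctrl /dev/nvme0 | grep -i vwc

# Is the volatile write cache currently enabled? (feature 0x06)
nvme get-feature /dev/nvme0 -f 0x06 -H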
 

decibel83

New Member
Aug 13, 2019
Thanks for your answer!

Most likely each fsync is forcing the drive to flush its write buffer, because the NVMe has no power-loss protection, and TLC NAND isn't known for its speed, so those flushes take real time.
What do you mean by TLC?

You might want to look at M.2 enterprise SSDs that have power-loss protection.
Could you recommend some models with good fsync performance?
Maybe the Samsung 970 EVO or 970 PRO?
 

decibel83

New Member
Aug 13, 2019
Thanks!
I read both articles and saw that Intel Optane is the best, and I understand that the Samsung PRO is not much better than the Samsung EVO for this purpose.
But generally, how can I recognise a good M.2 SSD for ZFS before buying it?
 

vanfawx

Active Member
Jan 4, 2015
Vancouver, Canada
If you're okay with the tradeoffs, you can run "zfs set sync=disabled dataset", which tells ZFS to treat sync calls as asynchronous writes that are committed with each transaction group flush (every 5 s by default). The downside to setting this flag is that on a crash or power loss you can lose whatever data is in memory that hasn't been flushed to disk yet.

I'm honestly not very familiar with M.2 SSDs with power-loss protection; hopefully someone else on the forum can make some recommendations. But they won't be cheap. I'd recommend setting sync=disabled and retrying your test.
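
Something like this (the dataset name is a placeholder, substitute your actual pool/dataset):

Code:
# "nvmepool/data" is a placeholder -- use your real dataset name
zfs get sync nvmepool/data
zfs set sync=disabled nvmepool/data
pveperf /mnt/nvme/
# put it back to the default once you've seen the numbers
zfs set sync=standard nvmepool/data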