Supermicro poor NVMe SSD performance in Linux


cuddylier

New Member
Apr 23, 2016
Hi

I've tried different NVMe SSDs in various Supermicro motherboards, both in the on-board M.2 slots and in PCIe slots via an M.2 to PCIe adapter, on boards ranging from an X9DRI-LN4F+ to the latest ones such as the X11SCZ-F. Typically the read speeds are okay but the write speeds are low, e.g. on CentOS 7 with 2 x Intel P3605 in mdadm RAID 1 plugged into PCIe 3.0 x4 slots:

689MB/s sequential write
2341MB/s sequential read

I've tried other NVMe SSDs such as a 1TB Samsung 970 Pro and only saw around 800MB/s on the Supermicro boards, both in the M.2 slot and via an M.2 to PCIe adapter, versus over 1.5GB/s on other motherboards such as the ASRock Z370 Pro4. I've also tried other Linux distros such as Ubuntu with the same results.

I've also seen issues with NVMe SSDs slowing down over time on the Supermicro boards, even when there's at least 20% free space and they've been trimmed regularly.

Windows doesn't appear to have any issues and easily reaches the quoted write speeds on the Supermicro boards with tools such as CrystalDiskMark.

Does anyone know why Supermicro boards running Linux seem to consistently deliver much poorer NVMe SSD throughput, especially for writes?

Thanks
 

Kev

Active Member
Feb 16, 2015
The board itself can't be the cause; they're just PCIe lanes. Start looking at the Linux drivers, etc.
 

cuddylier

New Member
Apr 23, 2016
The board itself can't be the cause; they're just PCIe lanes. Start looking at the Linux drivers, etc.
That's what I thought, but in this case I've seen consistently massive performance differences with the same NVMe SSDs between Supermicro and ASRock (and ASRock Rack) motherboards, always with the same CentOS 7 base install, and with multiple ASRock and Supermicro boards tested.

The vendor drivers for both the Samsung 970 Pro and the Intel P3600 series seem to be Windows only. I researched this before and I believe the consensus was that support is already integrated into most Linux distros, or at least a generic NVMe driver is.
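For what it's worth, it's easy to confirm the in-kernel driver is the one bound to the controller (a quick sketch; device naming assumed):

Code:
# show which kernel driver is bound to the NVMe controller
lspci -k | grep -i -A 3 "non-volatile"
# the built-in nvme module should be loaded
lsmod | grep nvme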
 

Jeggs101

Well-Known Member
Dec 29, 2010
Did you check whether you're putting them into the PCH's PCIe lanes rather than the CPU's? Also that they're both on the same CPU rather than split across sockets? I'd say it's that or it's a driver issue. SM NVMe servers are used by a lot of big storage companies, so I wouldn't look at the hardware first unless it's the PCH lanes.
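If it helps, you can usually see which socket and root complex a drive hangs off from Linux itself (a rough sketch; nvme0 assumed, and -1 means no NUMA info):

Code:
# NUMA node (i.e. which CPU socket) the NVMe controller is attached to
cat /sys/class/nvme/nvme0/device/numa_node
# PCI topology tree; devices behind the PCH show up under its root ports
lspci -tv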
 

cuddylier

New Member
Apr 23, 2016
Did you check whether you're putting them into the PCH's PCIe lanes rather than the CPU's? Also that they're both on the same CPU rather than split across sockets? I'd say it's that or it's a driver issue. SM NVMe servers are used by a lot of big storage companies, so I wouldn't look at the hardware first unless it's the PCH lanes.
That wasn't something I knew about. I did use the single M.2 slot on a Supermicro board for one of the tests that was getting poor results, though, and moving that exact SSD (with the OS on it) to the single M.2 slot on an ASRock board showed a considerable improvement in write performance (1GB/s -> 1.5GB/s) with no other changes, hence my assumption that Supermicro was having some sort of issue on boards without dedicated NVMe slots.

How do I know which PCIe slots I should be using for, e.g., this board: Supermicro | Products | Motherboards | Xeon® Boards | X9DRi-LN4F+? I'm not familiar with identifying PCH and CPU PCIe lanes.

The motherboard manual (https://www.supermicro.com/manuals/motherboard/C606_602/MNL-1258.pdf, page 18) has a diagram of where each PCIe slot connects, but it doesn't mention PCH and CPU lanes, at least not in those words.
 

Evan

Well-Known Member
Jan 6, 2016
If you used any slot other than slot #2 it would have been connected to a CPU. I haven't checked the details, but slot 2 has some strange connection to both the CPU and the PCH.

But I will echo what others have said: I don't think it's a Supermicro issue as such, and I've never seen any issues like you describe (X10 & X11 only though, I've never played with NVMe on an X9).
 

cuddylier

New Member
Apr 23, 2016
If you used any slot other than slot #2 it would have been connected to a CPU. I haven't checked the details, but slot 2 has some strange connection to both the CPU and the PCH.

But I will echo what others have said: I don't think it's a Supermicro issue as such, and I've never seen any issues like you describe (X10 & X11 only though, I've never played with NVMe on an X9).
What sort of read/write performance have you seen for NVMe on Linux with X10 or X11 boards?
 

Evan

Well-Known Member
Jan 6, 2016
Read 2800-2900 MB/s
Write 1900 MB/s

RHEL7
X10SRM-L
E5-2680v4
4 x 32GB 2400 RAM
Samsung PM983 3.84TB

Nothing special in terms of setup, though airflow over the drive was good; it gets properly warm otherwise and I'd assume it would throttle. I just ran a quick test when I was looking at the SMART data, as I bought it used from a member here at STH.
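If you want to rule out thermal throttling, something like this shows the temperature and throttle counters (assumes nvme-cli and smartmontools are installed; adjust the device name):

Code:
# controller temperature, thermal warning/critical time, wear, etc.
nvme smart-log /dev/nvme0
# smartmontools alternative
smartctl -a /dev/nvme0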
 

cuddylier

New Member
Apr 23, 2016
Read 2800-2900 MB/s
Write 1900 MB/s

RHEL7
X10SRM-L
E5-2680v4
4 x 32GB 2400 RAM
Samsung PM983 3.84TB

Nothing special in terms of setup, though airflow over the drive was good; it gets properly warm otherwise and I'd assume it would throttle. I just ran a quick test when I was looking at the SMART data, as I bought it used from a member here at STH.
I could only dream of seeing such numbers. My NVMe drives seem to be sufficiently cool as they usually sit around 20-30°C.

Anything special with partitioning? My partitioning for the NVMe drives, which make up an mdadm RAID 1 array, is below. The partitions start at sector 2048; I did see an Intel document (https://www.intel.com/content/dam/w...briefs/ssd-partition-alignment-tech-brief.pdf, page 8) suggesting the starting offset should be divisible by 4096 bytes, but by that measure mine is okay (2048 sectors x 512 bytes = 1,048,576 bytes, which is divisible by 4096).

Code:
Disk /dev/nvme1n1: 1600.3 GB, 1600321314816 bytes, 3125627568 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000028d1

        Device Boot      Start         End      Blocks   Id  System
/dev/nvme1n1p1            2048  3125627567  1562812760   fd  Linux raid autodetect

Disk /dev/nvme0n1: 1600.3 GB, 1600321314816 bytes, 3125627568 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x00010c7d

        Device Boot      Start         End      Blocks   Id  System
/dev/nvme0n1p1            2048  3125627567  1562812760   fd  Linux raid autodetect
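As an extra sanity check, parted can confirm the alignment too (partition 1 assumed):

Code:
# prints "1 aligned" if partition 1 meets the optimal alignment
parted /dev/nvme1n1 align-check optimal 1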
At least I know those speeds are possible, thanks for the help.
 

cuddylier

New Member
Apr 23, 2016
Read 2800-2900 MB/s
Write 1900 MB/s
Do you get the write speed you mentioned with the following command, if it's not what you already used?

Code:
dd if=/dev/zero of=test_$$ bs=64k count=16k conv=fdatasync && rm -f test_$$
 

Evan

Well-Known Member
Jan 6, 2016
If using dd, use a bigger block size, e.g. bs=128k (sequential needs big blocks; random needs high QD and lots of processes to see anywhere near the specified IOPS).

No RAID and just one partition (the whole disk), no over-provisioning or anything, but it was a totally empty disk.

I would run it again if the system still had Linux on it, but I do have some consumer Samsung NVMe drives and an X10SDV-F board I was planning to test before selling; if I get a chance today I'll try it out. In my experience, though, M.2 drives never match their 2.5" cousins, just due to the maximum voltage and power draw available.
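If you want something more controlled than dd, a sequential fio run along these lines should do the job (just a sketch; adjust the target filename and size to your setup):

Code:
# 1M sequential writes, direct I/O, 30 second run
fio --name=seqwrite --filename=/mnt/test/fio.tmp --size=8G \
    --rw=write --bs=1M --iodepth=32 --ioengine=libaio --direct=1 \
    --runtime=30 --time_based --group_reporting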
 

cuddylier

New Member
Apr 23, 2016
If using dd, use a bigger block size, e.g. bs=128k (sequential needs big blocks; random needs high QD and lots of processes to see anywhere near the specified IOPS).

No RAID and just one partition (the whole disk), no over-provisioning or anything, but it was a totally empty disk.

I would run it again if the system still had Linux on it, but I do have some consumer Samsung NVMe drives and an X10SDV-F board I was planning to test before selling; if I get a chance today I'll try it out. In my experience, though, M.2 drives never match their 2.5" cousins, just due to the maximum voltage and power draw available.
Thanks, much appreciated. I'm actually primarily testing another X9DRI-LN4F server I have with 2 x 2.5" Intel P3605 1.6TB in it. The two drives are plugged into PCIe slots through PCIe to U.2 adapters. I'm seeing 600-700MB/s write and 2400MB/s read. Read looks fine to me; it's just write that isn't. There's 500GB of available space and the drives are fully trimmed (free space hasn't dropped below 500GB since formatting anyway).
 

Evan

Well-Known Member
Jan 6, 2016
Strange indeed. The P3605 is not a super fast writer, but I'd still easily expect 1GB/s+ speeds.
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
In 9-10 days we can test a P3605 1.6TB on X9 and X10 Supermicro boards with a Funtin adapter and let you know what we get.
We'll primarily be using Ubuntu but can do a CentOS install to see if we get anything strange like this.

I've not seen such poor performance with Linux and the P3605 in the past, though I usually test on Ubuntu.

I don't have an X9DRI-LN4F anymore, but I have 1 or 2 X9 variants. We'll also be testing this drive on an Intel motherboard with an E5 v2 CPU, and if you're interested I can share Optane results as well.
 

cuddylier

New Member
Apr 23, 2016
In 9-10 days we can test a P3605 1.6TB on X9 and X10 Supermicro boards with a Funtin adapter and let you know what we get.
We'll primarily be using Ubuntu but can do a CentOS install to see if we get anything strange like this.

I've not seen such poor performance with Linux and the P3605 in the past, though I usually test on Ubuntu.

I don't have an X9DRI-LN4F anymore, but I have 1 or 2 X9 variants. We'll also be testing this drive on an Intel motherboard with an E5 v2 CPU, and if you're interested I can share Optane results as well.
That'll be very helpful, thanks a lot.
 

Kev

Active Member
Feb 16, 2015
Check in the BIOS that the slot wasn't configured for PCIe Gen 2 (5GT/s). Usually Auto will do.
 

cuddylier

New Member
Apr 23, 2016
Check in the BIOS that the slot wasn't configured for PCIe Gen 2 (5GT/s). Usually Auto will do.
Thanks for the idea. I checked the output of "lspci -vv" and it seems to confirm the link is running at PCIe 3.0 (8GT/s) x4:

Code:
82:00.0 Non-Volatile memory controller: Intel Corporation PCIe Data Center SSD (rev 01) (prog-if 02 [NVM Express])
Capabilities: [60] Express (v2) Endpoint, MSI 00
LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
And dmidecode shows the following for the two NVMe slots:
Code:
x16 PCI Express 3 x16
x8 PCI Express 3 x8
 

cuddylier

New Member
Apr 23, 2016
I checked which PCIe slots I'm using on the X9DRI-LN4F+: slots 1 and 6, which are on different CPUs, so that rules out both drives hanging off the same CPU's lanes. The write speed does almost look like exactly half what I'd expect, though: I'm seeing 700MB/s to the mdadm RAID 1 array and would expect around 1400MB/s (the drives are rated at 1600MB/s).
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
I assume you're comparing performance with the RAID 1 under both the SM and ASRock boards, or are you comparing bare drives? What workload are you using to test the speeds?

mdadm introduces several variables on top of just the hard drive topology (for instance, a write-intent bitmap can often limit write speed) so if at all possible I'd suggest blowing the array away and testing on the bare drives first.
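You can check for (and temporarily drop) the bitmap with something like the below (md device assumed; yours may differ):

Code:
# does the array carry an internal write-intent bitmap?
cat /proc/mdstat
mdadm --detail /dev/md2 | grep -i bitmap
# remove it for testing; re-add later with --bitmap=internal
mdadm --grow --bitmap=none /dev/md2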
 

cuddylier

New Member
Apr 23, 2016
I assume you're comparing performance with the RAID 1 under both the SM and ASRock boards, or are you comparing bare drives? What workload are you using to test the speeds?

mdadm introduces several variables on top of just the hard drive topology (for instance, a write-intent bitmap can often limit write speed) so if at all possible I'd suggest blowing the array away and testing on the bare drives first.
I've tested the bare drives relatively recently and got slightly better performance, but not by much; still below 1GB/s write speed. I'll have another go this coming week and test again.

Read speed: hdparm -t /dev/md2 to test the RAID 1 array, and hdparm -t /dev/nvme0n1 etc. to test the individual drives
Write speed:
dd if=/dev/zero of=test_$$ bs=64k count=16k conv=fdatasync && rm -f test_$$

The NVMe array is just a mounted data partition in CentOS 7, since the motherboard doesn't support booting from NVMe.
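When I blow the array away I could also try a direct write to the raw device to take the filesystem and page cache out of the picture (destructive, so only before rebuilding the array; device name assumed):

Code:
# WARNING: overwrites data on the target device
dd if=/dev/zero of=/dev/nvme0n1 bs=1M count=16384 oflag=direct status=progress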