Supermicro poor NVMe SSD performance in Linux


EffrafaxOfWug

Radioactive Member
Both hdparm and dd are poor ways to test an SSD, particularly a very fast one - they're not really proper benching tools and dd especially will often hit the limits of a single thread.

fio, using/tweaking a test script such as this one, should give a much better indication of "real world" performance.
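For a rough idea, here's a cut-down sketch of what a job file like that contains - the mount point, file name, sizes and the libaio/iodepth settings here are placeholders/assumptions to edit, not the linked script verbatim:
Code:
; sketch of a simple fio job file - edit directory, size and bs to suit
[global]
ioengine=libaio
direct=1
bs=4k
size=10g
directory=/mount-point-of-ssd
filename=ssd.test.file
iodepth=32
group_reporting

[seq-read]
rw=read
stonewall

[seq-write]
rw=write
stonewall

[rand-read]
rw=randread
stonewall

[rand-write]
rw=randwrite
stonewall
fio creates the test file itself and only touches that file.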
 

cuddylier

New Member
Thanks. Do you have an example fio command for using that config file? I Googled it but couldn't find anything that explains how to use such a file with fio. I've used fio in the past to test a drive without it needing a partition table (it overwrote existing data), but I want to make sure this doesn't hurt existing data on the NVMes.
 

EffrafaxOfWug

Radioactive Member
It's simpler than it sounds - if you've got a pre-defined job in a file like that, you basically go:
Code:
fio /path/to/test.fio
...and it'll go off and run. The results can be a bit hard on the eyes if you're not used to them, but you can make things simpler by deleting the rand-read and rand-write stanzas from the script since you're only testing sequential throughput at the moment.

Likely you'll want to tweak the values inside - for example the mount point of the drive/array and the block size (the script I linked to uses 4k blocks, which is where SSDs shine, but that might not max out the sequential throughput, so perhaps change it to 128k or even 1M blocks).
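For example, to chase sequential throughput you'd only need to change a couple of lines in the [global] section of the job file - something like this (the iodepth is my guess at keeping a fast NVMe busy, tune to taste):
Code:
; in the [global] section of the job file
bs=1M
iodepth=32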

fio's a very powerful - and necessarily complex - tool, but like a Swiss army knife it's pretty easy to figure out what at least some of the attachments do :)

cuddylier said: "I've used fio in the past to test a drive without it needing a partition table (it overwrote existing data) but want to make sure this doesn't hurt existing data on the NVMes."
I'm pretty sure fio always needs a test file sitting on the filesystem (I've only ever used it as such); the fio script I linked to, unedited, would try to create a 10GB file called ssd.test.file in the directory /mount-point-of-ssd. Obviously you might want to pick, say, a 120GB file called cuddyliers_fio.test at /path/to/my_raid_array.
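In job-file terms that's just a few [global] lines - the path and 120GB size below are purely illustrative:
Code:
; point the test file at the array rather than a single SSD's mount point
directory=/path/to/my_raid_array
filename=cuddyliers_fio.test
size=120g
Since fio creates that file itself and writes only within it, existing data on the filesystem should be left alone - just delete the file afterwards.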

Up one directory is a dizzying array of different test scripts which you can have a look through for other ideas, but the one I linked to is a good starter for ten.
 

cuddylier

New Member
Thanks a lot. I ran 4k, 128k, 1m and 2m blocks to see, and performance actually seems pretty good for 128k, 1m and 2m at least; 4k is low on throughput, but probably as expected, as you mentioned:

(fio results attached for 4k, 128k, 1m and 2m block sizes)

Is there anything that might cause the 1m and 2m block tests to show such high random read bandwidth figures, e.g. 4420MiB/s (4634MB/s)? The sequential read and write figures match the P3600's specs of around 2600MB/s read and 1600MB/s write; I've read that the P3605 has slightly better real-world performance than these specs in some cases, though.
 

EffrafaxOfWug

Radioactive Member
First off, have you tested the same fio script on the different motherboards to check if performance is significantly different on the various motherboards? My point about using fio rather than other tools is that it should be considerably better at benching the SSD on different boards than tools like dd (for any number of potential reasons) and - assuming there isn't an issue with the PCIe on the SM motherboard(s) - should give you comparable numbers across different hardware.
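Something along these lines keeps the comparison honest - the same job file on every board, with the output saved to a file (the file names here are just examples):
Code:
# on the Supermicro board
fio /path/to/test.fio --output=supermicro.txt
# on the comparison board, identical job file
fio /path/to/test.fio --output=other_board.txt
# or JSON output if you'd rather diff/parse the numbers
fio /path/to/test.fio --output-format=json --output=supermicro.json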

It would take a godly controller to be able to max out a PCIe link of 2GB/s using 4k IOs ;) 2GB/s = 2,097,152 kB/s, which means you'd need to sustain 524,288 4k IOPS to saturate a 2GB/s link. Still, 30,000 4k read IOPS is nothing to sniff at. I assume from the device being md2 that these are running on the mdadm RAID1 and not on single drives?
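Back-of-the-envelope, if anyone wants to check the sums:
Code:
# 2 GB/s expressed in kB/s, then divided by 4 kB per IO
echo $((2 * 1024 * 1024))      # 2097152 kB/s
echo $((2 * 1024 * 1024 / 4))  # 524288 IOPS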

If so, that would explain why your reads are so high; since it's an entirely read-based operation, mdadm can read from both discs in the RAID1 pair at the same time (basically like RAID0, striping reads from all devices), allowing a theoretical doubling of read speed in this scenario - random reads at large block sizes scale across multiple devices better than a single large sequential read does. Writes, however, always have to go to both devices simultaneously; with RAID1 writes you'll always be limited by the maximum write speed of the slowest device.
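An easy way to see it happening is to watch the member devices while the read job runs - the device names below are just examples, check /proc/mdstat for the real members of md2 (iostat comes from the sysstat package):
Code:
# confirm which partitions make up md2
cat /proc/mdstat
# per-device throughput at 1-second intervals; with RAID1 both members
# should each show roughly half of the total reads
iostat -xm 1 nvme0n1 nvme1n1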
 

cuddylier

New Member
I haven't run the fio script on the different boards yet, but now that I know about it, I can.

Yes, these results were from an mdadm RAID1 array. That explains why the reads are what they are, thanks.
 

EffrafaxOfWug

Radioactive Member
Best way to prove it would be to stop one of the RAID devices and see if the read speeds drop accordingly.
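Roughly like this - the member device name is just an example, check /proc/mdstat for the real one, and bear in mind the array resyncs when the member is added back, so not something to do on a box mid-production:
Code:
# mark one mirror member as failed and pull it from md2
mdadm /dev/md2 --fail /dev/nvme1n1p3
mdadm /dev/md2 --remove /dev/nvme1n1p3
# re-run the fio read job - reads should drop to single-drive speed
# then add the member back and let it resync
mdadm /dev/md2 --add /dev/nvme1n1p3
cat /proc/mdstat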