Samsung SM863 performance on z97 SATA and RMS25LB controller


heathen

Member
Hi all!

I've got a weird situation and would like to ask for help.

I have a number of Samsung SM863 1.92TB SSDs, and my servers have finally arrived. First I tested the SSDs on a workstation motherboard with a Z97 chipset via an onboard SATA3 port and got the following results (I also posted them in the comments on Sébastien Han's blog):

Samsung SM863 MZ7KM1T9HAJM-00005 1.92TB - firmware GXM1003Q - fio 2.12

Commands:
Code:
hdparm -W 0 /dev/sdc
fio --filename=/dev/sdc --direct=1 --sync=1 --rw=write --bs=4k --numjobs=X --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test

1 job:
write: io=9559.9MB, bw=163151KB/s, iops=40787, runt= 60001msec
clat (usec): min=22, max=19940, avg=24.09, stdev=15.52

5 jobs:
write: io=20166MB, bw=344155KB/s, iops=86038, runt= 60001msec
clat (usec): min=23, max=9054, avg=57.56, stdev=18.88

10 jobs:
write: io=20243MB, bw=345477KB/s, iops=86369, runt= 60001msec
clat (usec): min=28, max=19841, avg=115.18, stdev=42.61

Tested on Intel Z97 onboard SATA3 port.
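For anyone who wants to reproduce this, here is a minimal sketch that sweeps the job counts in one go (the device path /dev/sdc is an assumption; it writes to the raw device, so only use a scratch disk):
Code:
#!/bin/bash
# Sketch: run the 4k O_SYNC write test at several job counts.
# WARNING: writes to the raw device and destroys any data on it.
DEV=/dev/sdc                      # assumed scratch device - adjust
sudo hdparm -W 0 "$DEV"           # turn the drive's volatile write cache off
for jobs in 1 5 10; do
    echo "=== numjobs=$jobs ==="
    sudo fio --filename="$DEV" --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs="$jobs" --iodepth=1 --runtime=60 --time_based \
        --group_reporting --name=journal-test | grep -E 'write:|clat \(usec\)'
done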

Now I've tested the same drives in the newly arrived Intel H2312JFFKR servers. I also got and installed the 6 Gbps SAS option (AH2000JF6GKIT), which includes the RMS25LB 6 Gbps SAS module, and I found and flashed the latest firmware for it.

I can see that the SSDs are connected at 6 Gbps:
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.10.0-33-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: SAMSUNG MZ7KM1T9HAJM-00005
Serial Number: S2HNNXxxxxxxxxxx
LU WWN Device Id: 5 002538 c4040827a
Firmware Version: GXM1003Q
User Capacity: 1,920,383,410,176 bytes [1.92 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Wed Sep 6 15:25:27 2017 +05
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Unavailable
Rd look-ahead is: Enabled
Write cache is: Enabled
ATA Security is: Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled
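For reference, the info section above can be reproduced with smartctl's extended output, and checking just the negotiated link speed is a one-liner (the device path is an assumption):
Code:
# Extended device info, including the cache/AAM/APM section shown above:
sudo smartctl -x /dev/sda
# Just the negotiated link speed:
sudo smartctl -i /dev/sda | grep 'SATA Version'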

But the same tests show totally different results:

1 job:
write: io=3485.3MB, bw=59481KB/s, iops=14870, runt= 60001msec
clat (usec): min=63, max=1332, avg=66.06, stdev= 4.39

5 jobs:
write: io=7992.3MB, bw=136399KB/s, iops=34099, runt= 60001msec
clat (usec): min=83, max=817, avg=145.25, stdev=12.15

10 jobs:
write: io=9994.8MB, bw=170572KB/s, iops=42642, runt= 60002msec
clat (usec): min=110, max=1227, avg=233.07, stdev=18.05

Am I missing something? These results are just terrible for such drives, and more than 2x worse than on the Z97 SATA port. Are there any parameters I can tune on the RMS25LB adapter?

Thanks!
 

heathen

Member
A little update for those who are interested in my investigation.

I haven't had time yet to check the parameters on my Z97 workstation running Fedora Linux, but I got a pretty impressive improvement by switching the drive to write-through caching:
Code:
echo "temporary write through" | sudo tee /sys/block/sdc/device/scsi_disk/0\:0\:0\:0/cache_type
Jobs=1:
write: io=6183.6MB, bw=105530KB/s, iops=26382, runt= 60001msec
clat (usec): min=35, max=267, avg=36.89, stdev= 2.05

Jobs=5:
write: io=14762MB, bw=251931KB/s, iops=62982, runt= 60001msec
clat (usec): min=37, max=775, avg=78.18, stdev=14.62

Jobs=10:
write: io=14603MB, bw=249210KB/s, iops=62302, runt= 60002msec
clat (usec): min=66, max=947, avg=158.92, stdev=30.05
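One practical note: the 0:0:0:0 part of the sysfs path is the SCSI address and differs per controller and slot, so a glob avoids hardcoding it (a sketch; sdc is an assumption):
Code:
# Show the kernel's current cache mode for the drive:
cat /sys/block/sdc/device/scsi_disk/*/cache_type
# Switch to write-through; the "temporary" prefix changes only the
# kernel's setting and skips the MODE SELECT to the drive itself:
echo "temporary write through" | sudo tee /sys/block/sdc/device/scsi_disk/*/cache_type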

Beware: please don't use write-through cache mode with non-enterprise SSDs! Enterprise drives (the SM863(a) included) have capacitors, so even in case of a power failure they can flush cached data to permanent storage. Consumer-level SSDs don't have capacitors, so there's a good chance you'll lose your data.

These results are still ~25% lower than the non-optimized tests on the Z97 chipset. I'm going to put one drive back into that workstation and check what's going on with the settings. I don't know whether that's important/interesting for anyone, though...
 

heathen

Member
It turns out hdparm's write-cache flag and the kernel's cache_type are two different parameters here:

Code:
admin@e001n03:~$ sudo hdparm -W /dev/sda

/dev/sda:
 write-caching =  1 (on)

admin@e001n03:~$ cat /sys/block/sda/device/scsi_disk/0\:0\:0\:0/cache_type
write back

admin@e001n03:~$ sudo fio --filename=/dev/sda4 --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.2.10
Starting 1 process
...
  write: io=3444.7MB, bw=58788KB/s, iops=14696, runt= 60001msec
    clat (usec): min=63, max=421, avg=66.80, stdev= 4.68
...

admin@e001n03:~$ sudo hdparm -W 0 /dev/sda

/dev/sda:
 setting drive write-caching to 0 (off)
 write-caching =  0 (off)

admin@e001n03:~$ sudo fio --filename=/dev/sda4 --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.2.10
Starting 1 process
...
  write: io=3455.1MB, bw=58980KB/s, iops=14745, runt= 60001msec
    clat (usec): min=62, max=2981, avg=66.62, stdev= 5.45
...

admin@e001n03:~$ echo "temporary write through" | sudo tee /sys/block/sda/device/scsi_disk/0\:0\:0\:0/cache_type
temporary write through

admin@e001n03:~$ cat /sys/block/sda/device/scsi_disk/0\:0\:0\:0/cache_type
write through

admin@e001n03:~$ sudo fio --filename=/dev/sda4 --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.2.10
Starting 1 process
...
  write: io=5390.7MB, bw=91999KB/s, iops=22999, runt= 60001msec
    clat (usec): min=36, max=273, avg=42.33, stdev= 3.05
...
I've heard an explanation for this: Intel drives ship with the FLUSH command effectively disabled by default ("hey, we have power-loss protection, just don't worry and trust us!"), while Samsung takes the opposite approach: you can choose whatever you want, but by default flushes are honored.
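On kernels new enough to expose it (4.7+, so both machines here qualify), you can also ask the block layer directly whether it will issue flushes for sync writes (the path is an assumption):
Code:
# "write back"    -> the kernel sends FLUSH/FUA to make sync writes durable
# "write through" -> the kernel assumes durability and skips flushes
cat /sys/block/sda/queue/write_cache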

By the way, changing the I/O scheduler doesn't change the results much.
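For completeness, this is roughly how I switch schedulers between runs (a sketch; the available names depend on the kernel and whether blk-mq is in use):
Code:
# List the available schedulers; the active one is shown in brackets:
cat /sys/block/sda/queue/scheduler
# Switch to noop (or "none" on blk-mq) for SSD testing:
echo noop | sudo tee /sys/block/sda/queue/scheduler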

But I still don't understand why there's such a huge difference from the Intel Z97 SATA controller results. Going to do more tests today.
 

heathen

Member
I've run some tests on the Z97 workstation, and it's actually funny:

Code:
[vlad@red ~]$ sudo hdparm -W 1 /dev/sdc

/dev/sdc:
 setting drive write-caching to 1 (on)
 write-caching =  1 (on)

[vlad@red ~]$ cat /sys/block/sdc/device/scsi_disk/2\:0\:0\:0/cache_type
write back

[vlad@red ~]$ sudo fio --filename=/dev/sdc4 --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
fio-2.12
Starting 1 process
...
  write: io=5035.6MB, bw=85939KB/s, iops=21484, runt= 60001msec
    clat (usec): min=41, max=181, avg=46.03, stdev= 3.75
...

[vlad@red ~]$ sudo hdparm -W 0 /dev/sdc

/dev/sdc:
 setting drive write-caching to 0 (off)
 write-caching =  0 (off)

[vlad@red ~]$ cat /sys/block/sdc/device/scsi_disk/2\:0\:0\:0/cache_type
write through

[vlad@red ~]$ sudo fio --filename=/dev/sdc4 --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
fio-2.12
Starting 1 process
...
  write: io=9621.2MB, bw=164198KB/s, iops=41049, runt= 60001msec
    clat (usec): min=22, max=148, avg=23.85, stdev= 3.1
...
This workstation runs Fedora 25, kernel 4.11.9.
So after I use hdparm -W, the kernel automatically changes the cache_type as well (as it should). That doesn't happen with the Intel RMS25LB controller.
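A quick way to check that coupling on any controller is to toggle hdparm and watch cache_type (a sketch; the device path is an assumption, and the comments reflect my results above):
Code:
for w in 1 0; do
    sudo hdparm -W "$w" /dev/sdc > /dev/null
    echo -n "hdparm -W $w -> cache_type: "
    cat /sys/block/sdc/device/scsi_disk/*/cache_type
done
# On the Z97 AHCI port this prints "write back" then "write through";
# behind the RMS25LB it stays "write back" both times.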

Quote:
"You can enable write cache with sudo hdparm -W 1 /dev/sdb. Though on enterprise SSD drives I get 40-50% better performance with it off."

Yes, the difference is even bigger: almost 100% on the Z97 controller after disabling the write cache.

But I still don't understand why there is such a huge difference between the controllers, or how it can be reduced. It's also interesting that hdparm -W 0 apparently does nothing in the case of the RMS25LB.
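One thing I still want to try: since the RMS25LB is a SAS module, the write cache bit may be reachable through SCSI mode pages rather than the ATA path hdparm uses. sdparm can read and clear the WCE (write cache enable) bit directly (a sketch, untested on this controller; the device path is an assumption):
Code:
# Read the current Write Cache Enable bit from the caching mode page:
sudo sdparm --get=WCE /dev/sda
# Clear it, i.e. turn the drive's write cache off at the SCSI level:
sudo sdparm --clear=WCE /dev/sda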