This will be a work in progress, with the intent to eventually implement this on the recommended 4 nodes (here is a link to the current recommendations for Storage Spaces Direct hardware requirements). The plan is to give Storage Spaces Direct a trial, then set up ScaleIO on the same hardware and compare the results. I may not be able to post my ScaleIO results due to some limitations with EMC's ULA, but I will reach out to them when I get to that point. I am starting with two nodes simply because of cost; I did not want to purchase 4 full systems just to find out that performance was not where I wanted it. If performance scaling is acceptable, I will purchase the 1 extra node needed for ScaleIO (minimum of 3 nodes) or the 2 extra nodes needed for Storage Spaces Direct (S2D) (minimum of 4 nodes).
**Hardware used to test per node**
- 2x E5-2670
- 128GB DDR3 ECC registered 1600MHz
- 4x 400GB Intel 750 NVMe drives
- 2x Mellanox ConnectX-3 40/56Gb VPI dual-port cards
- SuperMicro X9DRH-iF
***Spoiler: Large IO is easy for S2D***
(1024KB blocks, 32 threads, 1 outstanding IO at 29.6 GB/s)
Day #1 (Trial using 2 nodes)
**1st test: single node, single NVMe drive** - to see the speed of a single drive. For the sake of all of these tests, small random IO will be the only thing I list, since sequential large IO is very easy for S2D but small IO scaling will be the issue. I use DiskSpd 2.0.15 for Windows for testing, and it can really push the IO.
./diskspd.exe -c100G -d10 -r -w0 -t32 -o32 -b4K -h -L D:\testfile.dat
417,670 random 4k read iops
./diskspd.exe -c100G -d10 -r -w100 -t32 -o32 -b4K -h -L D:\testfile.dat
250,728 random 4k write iops
CPU during both reads and writes, to show that DiskSpd can fully utilize all the cores of both CPUs.
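The utilization figures here come from screenshots; as a rough alternative, the same per-core numbers can be logged from PowerShell during a run with the standard Processor counter set (the interval and sample count below are arbitrary, not what I used):
Code:
# Sample per-core CPU utilization 10 times, 2 seconds apart, while DiskSpd is running.
Get-Counter -Counter '\Processor(*)\% Processor Time' -SampleInterval 2 -MaxSamples 10 |
    ForEach-Object { $_.CounterSamples | Select-Object InstanceName, CookedValue }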
**2nd test: single node, 4x NVMe drives, 4-column simple space w/ default interleave**
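For reference, here is a rough sketch of how a 4-column simple space like this can be created from the four NVMe drives; the pool and virtual disk names are placeholders, not necessarily what I used:
Code:
# Pool the four NVMe drives and carve out a simple (non-resilient) space with 4 columns
# and the default interleave; "NVMePool" and "Simple4Col" are placeholder names.
$nvme = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "NVMePool" -StorageSubSystemFriendlyName "Windows Storage*" -PhysicalDisks $nvme
New-VirtualDisk -StoragePoolFriendlyName "NVMePool" -FriendlyName "Simple4Col" -ResiliencySettingName Simple -NumberOfColumns 4 -UseMaximumSize
Then the same single-file test as before, now against the striped space: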
./diskspd.exe -c100G -d10 -r -w0 -t32 -o32 -b4K -h -L D:\testfile.dat
326,857 random 4k read iops - 4x NVMe 4-column simple space
First issue to solve: why did this happen? To dig deeper I wanted to play with thread count, since I have experienced this issue once before with socket 1366 nodes, where NUMA placement turned out to be the problem. Observe what happens around a thread count of 14 using this script, which runs the benchmark through increasing thread counts:
Code:
# Run the same DiskSpd test at thread counts from 1 to 32 and print one summary line per run.
# -w0 = 100% reads (the read numbers quoted in this post), -r -b4k = random 4K, -o32 = 32 outstanding IOs per thread.
1..32 | % {
    $param = "-t $_"
    $result = C:\Diskspd-v2.0.15\amd64fre\diskspd.exe -c100G -d10 -w0 -r -b4k $param -o32 -h -L D:\testfile.dat
    # Grab the "total:" summary line and the "avg." CPU line from DiskSpd's output.
    foreach ($line in $result) { if ($line -like "total:*") { $total = $line; break } }
    foreach ($line in $result) { if ($line -like "avg.*") { $avg = $line; break } }
    # Fields after "total:" are pipe-delimited: bytes | I/Os | MB/s | IO/s | avg latency.
    $mbps = $total.Split("|")[2].Trim()
    $iops = $total.Split("|")[3].Trim()
    $latency = $total.Split("|")[4].Trim()
    $cpu = $avg.Split("|")[1].Trim()
    "Param $param, $iops iops, $mbps MB/sec, $latency ms, $cpu CPU"
}
712,465 random 4k read iops with the 4x NVMe 4-column simple space. This is better than 326,857, but only ~42% of what 4 drives should deliver.
So let's go back and check what 4 separate NVMe drives can do.
./diskspd.exe -c100G -d10 -r -w0 -t8 -o8 -b4K -h -L d:\test.dat e:\test.dat f:\test.dat g:\test.dat
Even worse. Let's try this test using just one CPU.
./diskspd.exe -c100G -d10 -r -w0 -t4 -o8 -b4K -h -L -n d:\test.dat e:\test.dat f:\test.dat g:\test.dat (-n disables DiskSpd's default CPU affinity; I also dropped the thread count from 8 to 4 so the run would only hit one CPU, and ran it twice since the first run lands on CPU 0 and the next run on CPU 1)
983,278 random 4k read iops over 4 independent NVMe drives. This is getting better, but I will have to do some reading to figure out why this is happening; I am not having any scaling problems with my dual E5-2680 v2 setup. I will post images when I move the drives back into it to re-verify this problem.
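If the NUMA hop turns out to be the culprit, DiskSpd's -a affinity list (if this build supports it) is another way to test it: instead of -n plus repeat runs, the threads can be pinned to specific CPUs so each socket is targeted explicitly. A sketch, assuming socket 0 owns logical CPUs 0-15 and socket 1 owns 16-31 (the numbering may differ on other systems):
Code:
# Pin 4 threads to the first four cores of socket 0, then repeat on socket 1.
./diskspd.exe -c100G -d10 -r -w0 -t4 -o8 -b4K -h -L -a0,1,2,3 d:\test.dat e:\test.dat f:\test.dat g:\test.dat
./diskspd.exe -c100G -d10 -r -w0 -t4 -o8 -b4K -h -L -a16,17,18,19 d:\test.dat e:\test.dat f:\test.dat g:\test.dat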
./diskspd.exe -c100G -d10 -r -w100 -t8 -o8 -b4K -h -L d:\test.dat e:\test.dat f:\test.dat g:\test.dat
Still need to figure out what is going on with these systems and hopefully get them close to the 1.5-2 million read iops that I am under the impression they can do.
**3rd test: two-node hyperconverged cluster, mirrored 4-column NVMe**
**12/15/2015: since I spent a lot of time today troubleshooting, I only have a few shots of these benchmarks; more will be added soon**
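For context, here is a rough sketch of how a two-node cluster, the S2D pool, and a mirrored CSV can be stood up; the cluster name, node names, and volume size are placeholders rather than my exact values, and on this tech-preview build extra steps (validation, a witness, manual pool creation) may also be needed:
Code:
# Build the cluster, enable Storage Spaces Direct, and create a mirrored CSVFS volume.
New-Cluster -Name S2DTest -Node node1,node2 -NoStorage
Enable-ClusterStorageSpacesDirect
New-Volume -StoragePoolFriendlyName "S2D*" -FriendlyName "Volume1" -FileSystem CSVFS_ReFS -ResiliencySettingName Mirror -Size 1TB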
Ran this increasing thread count script for reads on cluster shared volume 1
Code:
# Same thread-count sweep as above, pointed at the cluster shared volume (-w0 = 100% random 4K reads).
1..32 | % {
    $param = "-t $_"
    $result = C:\Diskspd-v2.0.15\amd64fre\diskspd.exe -c100G -d10 -w0 -r -b4k $param -o32 -h -L C:\ClusterStorage\Volume1\testfile.dat
    foreach ($line in $result) { if ($line -like "total:*") { $total = $line; break } }
    foreach ($line in $result) { if ($line -like "avg.*") { $avg = $line; break } }
    $mbps = $total.Split("|")[2].Trim()
    $iops = $total.Split("|")[3].Trim()
    $latency = $total.Split("|")[4].Trim()
    $cpu = $avg.Split("|")[1].Trim()
    "Param $param, $iops iops, $mbps MB/sec, $latency ms, $cpu CPU"
}
312,513 random 4k read iops from the node that does not own the cluster shared volume. Since I am seeing single-system limits of 650,000-675,000 iops, running a mirror set and getting about half of that isn't too bad, but we are now getting random read speeds of less than a single NVMe drive. I have no doubt that it's going to take some tweaking to fully maximize the speeds of these systems, but all things considered, not a bad first attempt.
A note to consider: the second node, which was the owner node for the cluster shared volume, was at 100% CPU utilization during this test, while, as the image above shows, the node running DiskSpd was only seeing 46.40% utilization. When I changed the CSV owner over to the node running DiskSpd, that node was then utilizing 100% CPU and the other node was only at 46%.
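One way to check and flip which node owns (coordinates) the CSV is shown below; the swap described above can also be done from Failover Cluster Manager, and the CSV and node names here are placeholders:
Code:
# Show which node currently owns each cluster shared volume, then move ownership.
Get-ClusterSharedVolume | Select-Object Name, OwnerNode
Get-ClusterSharedVolume "Cluster Virtual Disk (Volume1)" | Move-ClusterSharedVolume -Node node1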
There will be more to come, and I have no doubt I can pull more speed from this setup (it may take different hardware to hit my goals, i.e. BIOS updates, v2 CPUs, etc.; not sure yet), but there is one saving grace, and that is large IO, which was pretty impressive:
./diskspd.exe -c1000G -d10 -w0 -t32 -o1 -b1024K -h -L C:\ClusterStorage\Volume1\test.dat
Yep, that's 29.618 gigabytes per second of read speed, which is faster than my 8 drives in the dual E5-2680 v2 box running the same command, by about 6 gigabytes per second.