Windows Server 2012 Storage Spaces Benchmarks


dba (Moderator)
I'd like to share a few benchmarks that I ran to test Windows Server 2012 Storage Spaces performance.

The Server:
HP DL180 G6 - the 3Gb/s expander backplane version with 12 drive bays
Single Intel Xeon L5520 CPU
24 GB RAM (3x8GB) DDR3-1333 ECC
1 Samsung 840 Pro 128GB SSD drive for boot - mounted inside the chassis
12x Seagate Constellation 7200 RPM SATA drives for bulk storage
4x Samsung 840 Pro 128GB SSD drives for high IOPS storage - mounted inside the chassis
4x StarWind 4GB RAM disks for testing
LSI 9201-8i SAS HBA for the four SSD drives
HP P410 PCIe RAID card w/ battery for the 12 bulk storage drives
1 GbE connection to my network
1 Mellanox ConnectX-2 Infiniband QDR connection to my network, configured as 32Gbit IPoIB. The card has custom firmware that provides RDMA support in Windows.

Drive Configuration:
The 12 SATA drives were used to create a RAID6 array on the HP P410 card. I tested these just for fun; the array was still initializing, so the results aren't very meaningful.
The 4 SSD drives were used to create a simple (striped with no redundancy) Windows storage pool.
The 4 RAM disks were formatted as standalone drives.
The bulk storage, the SSD pool, and the four RAM disks were configured as Windows shares (a rough sketch of this setup follows below).
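For anyone who wants to replicate this layout, here's a minimal sketch of creating the simple (striped) SSD pool and sharing it out, driven from Python via subprocess around the stock Windows Server 2012 Storage Spaces and SMB cmdlets. This is not the exact script used for these tests; the pool, virtual disk, drive letter, and share names are placeholders, and you'd want tighter share permissions than Everyone.

```python
import subprocess

# A rough equivalent of the pool/share setup described above, expressed as the
# standard Storage Spaces / SMB cmdlets and launched from Python. Names, drive
# letter, and permissions are placeholders, not the exact values used here.
ps_script = r"""
$disks = Get-PhysicalDisk -CanPool $true                      # the four unclaimed SSDs
New-StoragePool -FriendlyName SSDPool `
    -StorageSubSystemFriendlyName (Get-StorageSubSystem).FriendlyName `
    -PhysicalDisks $disks
New-VirtualDisk -StoragePoolFriendlyName SSDPool -FriendlyName SSDSimple `
    -ResiliencySettingName Simple -UseMaximumSize             # Simple = striped, no redundancy
Get-VirtualDisk -FriendlyName SSDSimple | Get-Disk |
    Initialize-Disk -PassThru |
    New-Partition -DriveLetter E -UseMaximumSize |
    Format-Volume -FileSystem NTFS -Confirm:$false
New-SmbShare -Name ssd -Path E:\ -FullAccess Everyone         # share seen by the test client
"""

subprocess.run(["powershell.exe", "-NoProfile", "-Command", ps_script], check=True)
```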

Test Client:
Over-the-network testing was driven by a Dell C6100 node with dual L5520 CPUs, 96GB of RAM, a 1GbE connection, and a Mellanox QDR InfiniBand connection running IPoIB, again with custom firmware to enable RDMA in Windows.

Baseline Tests:
First, I used IOMeter, running on the server itself, to baseline the RAM disks, the SSD pool, and the bulk storage pool. I wanted to see what they were capable of by themselves, without any network access. In all cases I used a 16GB test file (or 4x4GB files in the case of the RAM disks) and ran IO at a queue depth of 32 with a single worker.
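If you don't have IOMeter handy, here's a rough Python sketch of the same idea for the 4KB random-read case: a set of workers hammering a pre-created test file with 4KiB reads at random offsets. It goes through the OS cache (IOMeter does not) and Python overhead keeps it far below the IOPS numbers that follow, so treat it as an illustration of the access pattern rather than a replacement; the file path and worker count are assumptions, and pointing it at a UNC path mimics the over-the-network runs.

```python
import os, random, threading, time

TEST_FILE = r"E:\iometer_test.dat"   # hypothetical path to a pre-created 16GB test file
BLOCK = 4 * 1024                     # 4KiB reads
WORKERS = 32                         # loosely mirrors IOMeter's queue depth of 32
DURATION = 30                        # seconds
ops = [0] * WORKERS

def worker(idx: int) -> None:
    size = os.path.getsize(TEST_FILE)
    blocks = size // BLOCK
    # One handle per thread; buffering=0 skips Python's buffer but not the OS cache.
    with open(TEST_FILE, "rb", buffering=0) as f:
        deadline = time.time() + DURATION
        while time.time() < deadline:
            f.seek(random.randrange(blocks) * BLOCK)
            f.read(BLOCK)
            ops[idx] += 1

threads = [threading.Thread(target=worker, args=(i,)) for i in range(WORKERS)]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"{sum(ops) / (time.time() - start):,.0f} IOPS (4KiB random reads)")
```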

RAM Disks:
4KB random reads: 189,200 IOPS
1MB random reads: 3,728 MB/s
4KB random read latency: 0.005ms
RAM disk speed is, obviously, high enough to not be a bottleneck!

SSD Pool: (4 SSD drives)
4KB random reads: 115,900 IOPS
1MB random reads: 2,220 MB/s
4KB random read latency: 0.28ms

Bulk Storage Pool: (12 drives)
4KB random reads: 2,454 IOPS
1MB random reads: 473 MB/s
The bulk pool array was still initializing, which hurt performance, but the main point I'd like to convey is that even the best magnetic hard drives have simply dismal performance compared to ordinary SSDs.
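Putting the baseline numbers above on a per-device basis makes the gap obvious (just arithmetic on the figures already reported):

```python
# Per-device 4KB random-read IOPS implied by the baseline results above.
bulk_iops, bulk_drives = 2_454, 12     # RAID6 of 7200 RPM SATA, still initializing
ssd_iops, ssd_drives = 115_900, 4      # simple pool of Samsung 840 Pros

print(bulk_iops / bulk_drives)         # ~205 IOPS per spinning disk
print(ssd_iops / ssd_drives)           # ~29,000 IOPS per SSD -- over 100x per device
```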


Windows Networking Tests:
In this round of testing, I ran IOMeter with the same test configuration on a separate client machine, accessing the same storage as above, but this time over the network via the SMB shares - in this case over the 32Gbit IPoIB connection with RDMA.

RAM Disks:
4KB random reads: 80,941 IOPS
1MB random reads: 3,210 MB/s
4KB random read latency: 0.367ms

SSD Pool: (4 SSD drives)
4KB random reads: 71,902 IOPS
1MB random reads: 2,208 MB/s
4KB random read latency: 0.44ms

Bulk Storage Pool: (12 drives)
4KB random reads: 2,512 IOPS
1MB random reads: 477 MB/s


1GbE iSCSI Tests:

SSD Pool: (4 SSD drives)
4KB random reads: 1,862 IOPS
1MB random reads: 119 MB/s
4KB random read latency: 18.03ms - 41x higher than SMB3 over QDR IPoIB


Discussion:
I have been very impressed by the combination of SMB3, Windows Storage Spaces, and IPoIB with RDMA. Looking at the SSD pool results, accessing the drives remotely over the network showed throughput 99% as high as accessing them directly. Remote IOPS were only 62% as high as direct IOPS, but still very impressive at 72K. Stepping up to the much more capable RAM disks improved IOPS only to 81K, which would seem to indicate that we're getting close to an absolute limit for our IPoIB setup.

Contrast that with the 1GbE iSCSI results. While we were able to saturate our single network connection, which is positive, we did so with just 119MB/s of throughput. Further, the lower speed and higher latency of the gigabit network allowed us only 1,862 IOPS. This shows us the limits of gigabit networking. Note that the 40x higher latency on the gigabit network compared to IPoIB resulted in 1/40 the IOPS, an indication that small-block file server performance is very much driven by latency.
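That latency/IOPS relationship is essentially Little's Law: with a fixed number of I/Os outstanding, IOPS is roughly the queue depth divided by the average latency. A quick sanity check against the figures reported above (my arithmetic, not extra measurements):

```python
# IOPS ~= outstanding I/Os / average latency, at IOMeter's queue depth of 32.
QD = 32

def predicted_iops(latency_ms: float) -> float:
    return QD / (latency_ms / 1000.0)

print(predicted_iops(0.28))    # ~114,000 vs. 115,900 measured (SSD pool, local)
print(predicted_iops(0.44))    # ~72,700  vs.  71,902 measured (SSD pool, SMB3 over IPoIB)
print(predicted_iops(18.03))   # ~1,775   vs.   1,862 measured (SSD pool, 1GbE iSCSI)

# The ~41x latency gap between gigabit iSCSI and RDMA-backed SMB3 maps almost
# directly onto the ~39x IOPS gap.
print(18.03 / 0.44, 71_902 / 1_862)    # ~41.0, ~38.6
```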

By the way, curious about CPU utilization? No more than 2.7% on the server while pushing 3,210MB/s to the client, and just under 12% when pushing 80K IOPS.


[Screenshot: Testing the RAM disks with 1MB random reads in IOMeter]

[Screenshot: Accessing the same storage pool remotely over 32Gbit IPoIB and SMB3 with RDMA]

[Screenshot: iSCSI over Gigabit Ethernet looking rather anemic when testing 1MB random reads]

dba (Moderator)
Quote: "Very cool, is it RAID-5 where storage spaces falls apart so they recommend RAID-10?"
Yes. Windows Storage Spaces software parity RAID performance is quite bad - as you'd expect from a software implementation without the benefit of a battery-backed cache. If you have a single gigabit network connection and just push around big files then you may never notice it, but for higher-speed networking or any kind of small-block access (like VMs) you really want to avoid Storage Spaces parity RAID entirely.
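To illustrate why parity writes hurt so much without a write-back cache to absorb them: under the classic read-modify-write path, every small random write fans out into several back-end I/Os. A back-of-the-envelope sketch (generic parity-RAID behavior, not a measurement of Storage Spaces internals):

```python
# Small-write amplification for parity RAID via read-modify-write:
# read old data + old parity block(s), then write new data + new parity block(s).
def write_penalty(parity_blocks: int) -> int:
    return 2 * (1 + parity_blocks)

print(write_penalty(1))   # 4 back-end I/Os per small write (single parity)
print(write_penalty(2))   # 6 back-end I/Os per small write (dual parity)

# With ~200-IOPS spinning disks and no battery-backed cache to coalesce these
# I/Os, small-block write performance collapses quickly.
```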