Andy, I think dba has been seeing something very similar. He is currently testing a motherboard with an onboard SAS 2308, a 9207-8e, and a few other things. Great update.
You are welcome. Which MBs are we talking about - single, dual or quad socket?
I'm building a single and a dual socket system; unfortunately, a quad Sandy Bridge rig is outside the envelope for my project. I haven't come across a quad motherboard which routes all 160 PCIe 3.0 lanes to I/O slots. Intel's MB seems to hold the high watermark for now with 6 x16 PCIe 3.0 slots (plus 2 internal x8 slots). The trend in RAID/HBA adapters seems to converge on the x8 PCIe slot factor rather than x16 (i.e. with 24-32 SATA/SAS ports).
Hi Andy,
Like you, I've been finding that the new LSI SAS2308 is capable of some amazing IOPS, more than I'll ever need and certainly higher than the SAS2008 chip or anything else I've measured, but not as high as the specified 700K.
Also, can you share your IOMeter setup, specifically the "Maximum Disk Size" in sectors from the Disk Targets tab?
There are 2 things I'd like to understand better.
1) The scaling from 1-8 SSDs per controller. Even with the lower spec of the Samsung 128GB drives (80K IOPS), perfect scaling would get us to 640K IOPS per controller. I am currently at 445K, so I "lost" roughly 200K somewhere. If the HBA can't get above - let's say 500K - that's fine with me. But LSI somehow manages 700K with appropriate SSDs. I'm not sure whether I need to start with 120K IOPS drives (8 x 120K = 960K IOPS) so that the unavoidable losses still land me at the architectural limit of the HBA. (A quick back-of-the-envelope sketch of this follows below.)
Or, (2) are MLC-based drives, with their higher overhead, simply unable to work with the 2308 controller in concert towards 700K? Could that be the exclusive realm of SLC SSDs? I don't want to spend that amount per GB, but I might take a closer look at the SuperSSpeed S301 drives, which should show up soon.
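For what it's worth, the back-of-the-envelope math behind point 1 fits in a few lines of Python. The per-drive spec and the measured aggregate are the figures quoted above; treating the shortfall as one flat "efficiency" factor of the HBA is my own simplification, not anything LSI documents.

```python
# Rough scaling check for 8 SSDs behind one SAS2308 (numbers from the tests above).
# Treating the shortfall as a single flat efficiency factor is a simplification.

drive_iops = 80_000      # random read spec of the Samsung 128GB drives, as quoted above
drives     = 8
measured   = 445_000     # aggregate actually measured on the 2308

ideal      = drive_iops * drives        # 640,000 IOPS with perfect scaling
efficiency = measured / ideal           # ~0.70, i.e. ~195K IOPS "lost"

# If that efficiency were a property of the HBA rather than of the drives,
# a single drive would have to deliver roughly this much to reach LSI's 700K figure:
required_per_drive = 700_000 / (drives * efficiency)   # ~126K IOPS

print(f"ideal aggregate   : {ideal:,} IOPS")
print(f"scaling efficiency: {efficiency:.0%}")
print(f"needed per drive  : {required_per_drive:,.0f} IOPS for 700K total")
```

Under that (crude) model, roughly 125K-class drives would be needed, which is why I wonder about starting with 120K or faster IOPS drives.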
My settings in IOMeter are simple - nothing fancy:
For random read I take a test file of at least 50% of the drive's capacity - maximising the likelihood that all flash chips get some work to do. To be on the safe side, one can take 100%. For data transfer tests I take 10GB (for a single SSD) or 50GB (for a RAID0).
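Since you asked about the "Maximum Disk Size" in sectors: I simply derive it from the file size I want. A minimal sketch of that conversion, assuming the usual 512-byte sectors (adjust if your drive reports 4K sectors):

```python
# Convert a target test-file size into IOMeter's "Maximum Disk Size" value (sectors).
# Assumes 512-byte sectors; adjust SECTOR_BYTES if the drive reports 4K sectors.

SECTOR_BYTES = 512

def sectors_for(size_gb: float) -> int:
    """Sectors needed for a test file of size_gb gigabytes (GiB)."""
    return int(size_gb * 1024**3) // SECTOR_BYTES

def sectors_for_fraction(capacity_gb: float, fraction: float = 0.5) -> int:
    """Sectors covering a fraction of the drive, e.g. 50% of a 128GB SSD."""
    return sectors_for(capacity_gb * fraction)

print(sectors_for(10))              # 10GB file for single-SSD transfer tests
print(sectors_for(50))              # 50GB file for a RAID0 set
print(sectors_for_fraction(128))    # ~50% of a 128GB drive for the random tests
```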
(Note: This is the area where I saw significant differences with quite a few drives which, according to public benchmarks and specs, are faster than the Samsungs. Writing the full drive in one sweep either gave very low write speeds in the second half (OCZ Vertex 4, Agility 4), or the drive needed a cool-down period after the write to show better performance than it did immediately after the write operation (SanDisk, Vertex 3). Due to many rearrangements of the drive/RAID/HBA settings, the amount of data written to the individual drives is not synchronized; some of my drives get far more write cycles than others. I don't have enough drives of all models, but so far the Samsungs don't exhibit much variation, independent of the structure and combination of setups I am currently checking. I'm not sure whether this would be the case with all drives, nor am I sure it is an issue at all. It is just a comment for those who assemble setups with more than 8 SSDs.)
Data transfer was measured with: 1MB blocks, 100% sequential, 100% read, QD16
Data transfer for write is currently limited to 24 SSDs - as I still haven't got my new PSU with enough power on the 5-volt rail. It should be here any day now.
IOPS tests benefit far more from a larger test file than sequential transfer does. Settings for read: 512B (or 4K), 100% random, 100% read, QD 128 (single SSD), QD >512 (on RAID). I was too lazy to map out the "perfect" QD curve; my only finding was that QD64 does not deliver the peak value.
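One remark on the queue depths: the reason I go above QD 512 on a RAID set is simply that the outstanding I/Os get spread across the member drives, so each SSD only sees a fraction of the total. A small sketch, assuming an even spread (which is optimistic):

```python
# Effective queue depth each SSD sees when one worker drives a striped set.
# Assumes outstanding I/Os distribute evenly across the members (optimistic).

def per_drive_qd(total_qd: int, drives: int) -> float:
    return total_qd / drives

print(per_drive_qd(128, 1))    # single SSD: QD 128 at the drive
print(per_drive_qd(512, 8))    # 8-drive RAID0: only ~QD 64 per drive
print(per_drive_qd(1024, 8))   # doubling the total QD gets each drive back to ~128
```

Which lines up with my observation that QD64 alone is not yet at the peak.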
Beware of the CPU load when measuring many individual SSDs. For production work, I'd rather take the lower absolute transfer and IOPS figures of RAID-ing in the HBA and keep the CPU free for its main duty.
Tip: IOMeter does not write the test file (e.g. 10GB) very fast. It is usually faster to create one template file and copy it with the file manager to the SSDs under test.
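For those who prefer to script the copy instead of using the file manager, a minimal sketch could look like this; the template path and the target drive letters are placeholders for my setup, and the file name should match whatever your IOMeter targets expect (iobw.tst by default, if I recall correctly):

```python
# Copy one pre-built template test file to every SSD under test, instead of
# letting IOMeter create it slowly on each drive. Paths and drive letters are
# placeholders; the file name should match what IOMeter expects (iobw.tst).

import shutil
from pathlib import Path

TEMPLATE = Path(r"D:\templates\iobw.tst")   # 10GB (or 50GB) file, written once
TARGETS  = ["E", "F", "G", "H"]             # drive letters of the SSDs under test

for letter in TARGETS:
    dest = Path(f"{letter}:\\") / TEMPLATE.name
    shutil.copy(TEMPLATE, dest)
    print(f"copied template to {dest}")
```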
Thanks for sharing those metrics, Andy. I recall you mentioned you are in the process of setting up a dual-proc system. Nevertheless, I think these are incredible results for a single CPU.
You are welcome, ehorn. Agreed, it is just mind-boggling what the combination of recent developments (CPU, I/O architecture, controller, software, ...) delivers these days. 1,300,000 IOPS used to require 13,000 disk drives (and some CPUs).
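Just to make that comparison concrete - the old rule of thumb was on the order of 100 random IOPS per spindle (that per-drive figure is my assumed ballpark for rotating disks, not a measurement):

```python
# How many rotating disks the random-IOPS number above would have required.
# ~100 random IOPS per spindle is an assumed ballpark for HDDs, not a measurement.

target_iops      = 1_300_000
iops_per_spindle = 100

print(f"{target_iops / iops_per_spindle:,.0f} disk drives")   # ~13,000 spindles
```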
BTW, today I "tuned" my workstation. A simple change of CPU cooler got me about 10% more performance. I have some heavyweight apps which triggered CPU throttling when the CPU hit 90 degrees Celsius, and it can't be good to run the CPU for hours at 90 degrees anyway. This evening I picked up a Corsair H100 CPU water cooler set. Installation is a snap, and the impact so far is incredible. With a decent air cooler the CPU hit 90 degrees when running at 3.5 GHz and sometimes throttled down to 3.2 or even 3.0. With the water cooler, the CPU (without overclocking it) stays rock solid at 3.8 GHz, which is the maximum turbo frequency of the 3930K. Temperature at full 100% load on all cores and all functional units never exceeded 58 degrees. With normal workloads (ca. 30%) the temperature is below 40 degrees. Amazing.
A screenshot of HWInfo while the CPU was really maxed out. Think of something like Intel's Linpack benchmark taxing the CPU, plus 3-4 GB/sec of I/O on top. Runtime this time was 3 hrs. Check the CPU temp, which is at least 30 degrees lower, plus I get 10% more CPU frequency than before. If you're interested: with this load on the CPU, memory, 4 LSI controllers, 33 SSDs and fans, power consumption is approx. 300 watts at the wall socket.
With this setup,
Intel's Linpack delivers 149.5 GFlops vs. 135 GFlops before. Settings: 20000, 20000, 4K. (A quick check against the theoretical peak follows below.)
Stream shows 41 GByte/sec on the memory bus
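As a sanity check on the Linpack number: the 3930K has 6 cores, and with AVX it can retire 8 double-precision FLOPs per core per cycle, so at a sustained all-core 3.8 GHz the theoretical peak would be about 182 GFlops. The assumptions (AVX throughput, no clock dips) are mine, but 149.5 GFlops then comes out at roughly 82% of peak, which is a plausible Linpack efficiency.

```python
# Theoretical DP peak of an i7-3930K at a sustained all-core 3.8 GHz.
# Assumes AVX throughput of 8 DP FLOPs per core per cycle (4-wide add + 4-wide mul).

cores         = 6
clock_ghz     = 3.8
flops_per_clk = 8

peak_gflops = cores * clock_ghz * flops_per_clk   # 182.4 GFlops
measured    = 149.5                               # Linpack result above

print(f"theoretical peak  : {peak_gflops:.1f} GFlops")
print(f"Linpack efficiency: {measured / peak_gflops:.0%}")   # ~82%
```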
Second workstation:
It is in the making. The components are ordered except for the SSDs. Some are already here, some will come this week. I've made up my mind on the CPUs (2 x E5-2687W), memory (16 x 8 GB regECC 1600 MHz) and MB (Asus Z9PE-D16 - not the L version, but the one with 4 x GBit ports).
On the SSDs, I am still thinking. The Samsungs were a good decision a few weeks ago, but there are new drives coming which I'd like to test first before making a decision. The Neutron GTX, Plextor M5Pro and SuperSSpeed S301 are the ones I will probably buy one of first and compare to the other drives in my closet. I'll see.
To get 700k IOPS from the SAS2308 you would need 8x SSDs capable of at least 90k IOPS each.
The Vertex 4 supposedly maxes out at 125K IOPS, but I doubt it will ever get that high without the tailwind, downhill test.
Pieter,
You are right, 90K IOPS drives would be needed to get me to 700K IOPS per HBA. But you might have seen the "poor" scaling of the 2308 beyond 5 drives (the Samsungs are spec'ed at 80K IOPS). Like probably most of us, I don't need those last IOPS beyond the 445K I already measured; it is rather a matter of curiosity to understand the "why". Do I need 150K SSDs to start with, so that the unavoidable scaling losses still produce an outstanding 700K, or is there something else I overlooked? Is it the IR firmware which limits this exercise, or do I need (your fav) IT firmware to avoid the scaling effect currently seen? Or is there some magic in the drives, and an undocumented setting in the HBA driver unleashes the last bit of performance?
Needless to say, but I'll say it anyway:
The workstation is great fun to work with in its current state of development.
rgds,
Andy