Intro & Built notes


ehorn

Active Member
Jun 21, 2012
I like dba's methodical approach to finding the BW curves. Keep adding drives (one at a time) and plot the curve to see where it begins to roll over.

Is it possible I/O is becoming saturated on this platform at this point (~14 GB/s)?

I recall the EchoStreams guys (still hoping to see some data on that system) were saying they were getting ~20 GB/s with (48) 520s on a dual Romley platform with (5) 9207s. This indicates that linear scaling has rolled off by those levels, so clearly the curve begins to roll off somewhere on the way to those heights of throughput.

What if you created 6 small ramdisks and ran some tests in IOMeter, assigning one to each worker/core, and examined the results? Would that give you a quick and dirty indication of I/O saturation levels (assuming you have enough RAM left over to not starve IOMeter and other processes)?

When I did this on a 1155/3570K platform (4 ramdisks, 4 workers, 1MB seq @ QD32), I saw right around that level and could get no more regardless of how much more I stacked on. Perhaps that is a quick and dirty method to find saturation points.
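
Something like this rough sketch is the idea - purely illustrative; IOMeter is the proper tool, and the ramdisk drive letters and file size below are assumptions:

```python
# Quick-and-dirty aggregate sequential-write check across several ramdisks.
# Assumptions: one ramdisk is mounted per path below (the drive letters are
# made up) and each is large enough to hold FILE_MB of data.
import os, time, threading

RAMDISKS = ["R:\\", "S:\\", "T:\\", "U:\\"]   # hypothetical ramdisk mount points
FILE_MB  = 2048                               # data written per worker
CHUNK    = 1024 * 1024                        # 1 MB sequential writes
results  = {}

def worker(path):
    buf = os.urandom(CHUNK)
    fname = os.path.join(path, "iotest.bin")
    start = time.perf_counter()
    with open(fname, "wb", buffering=0) as f:
        for _ in range(FILE_MB):
            f.write(buf)
    results[path] = FILE_MB / (time.perf_counter() - start)
    os.remove(fname)

threads = [threading.Thread(target=worker, args=(p,)) for p in RAMDISKS]
for t in threads: t.start()
for t in threads: t.join()

for path, mbps in sorted(results.items()):
    print(f"{path} {mbps:,.0f} MB/s")
print(f"aggregate: {sum(results.values()):,.0f} MB/s")
```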

Either way, I am not sure how this relates (directly or indirectly)... just thought it was interesting to note.

peace,
 

Andreas

Member
Aug 21, 2012
Have a link to the version you used? I want to try doing this myself.
Thanks ehorn. That's the link.

I like dba's methodical approach to finding the BW curves. Keep adding drives (one at a time) and plot the curve to see where it begins to roll over.
Is it possible I/O is becoming saturated on this platform at this point (~14 GB/s)?
ehorn,
I will do dba's proposal when I am back - it is the way to identify emerging bottlenecks. There are a few initial checks I can run on my current setup to look for SAS controller or remaining-system induced limitations (a small evaluation sketch follows the checks below):
Check 1:
Series 1: incrementally add 1 to 8 drives on one controller
Series 2: incrementally add 1 to 8 drives across all 4 controllers, round robin

Check 2:
continue Check 1 up to 32 drives, in increments of 2 or 4 drives

Check 3:
look into the sweet spot of HBA RAID0 usage vs. software JBOD stripe sets
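
To make the roll-off point easy to spot across the series, something along these lines could chew on the (drive count, throughput) pairs - the sample numbers are placeholders, not measurements:

```python
# Sketch: spot where throughput scaling rolls off versus ideal linear scaling.
# The sample numbers below are placeholders, not real measurements.

def rolloff_point(samples, threshold=0.90):
    """samples: list of (drive_count, MB_per_s). Returns the first drive count
    whose per-drive throughput drops below `threshold` of the 1-drive baseline."""
    per_drive_baseline = samples[0][1] / samples[0][0]
    for count, mbps in samples:
        efficiency = (mbps / count) / per_drive_baseline
        print(f"{count:2d} drives: {mbps:7.0f} MB/s   {efficiency:6.1%} of linear")
        if efficiency < threshold:
            return count
    return None

# Placeholder data for "Series 1" (one controller, 1..8 drives):
series1 = [(1, 500), (2, 1000), (3, 1500), (4, 2000),
           (5, 2450), (6, 2800), (7, 3000), (8, 3100)]
print("roll-off at:", rolloff_point(series1), "drives")
```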

wrt your question on the saturation point:
Yes, sure, it can get saturated at much lower levels than the current 14 GB/s high-water mark of my system - not at an absolute level, but percentage-wise. It could be the controller on the LSI card, which would be perfectly capable of delivering (as an output function) close to 6 GB/s on the PCIe3 bus, but due to HW/firmware limitations runs into an issue driving both 4-port connectors with 4 SAS/SATA lanes each at full speed. A second possibility is timing related within a bunch of 8 SSDs. If a drive theoretically needs 1/8 of its time for garbage collection or similar performance-limiting activities, a 4-drive benchmark might get away with all 4 drives running at full throttle during the benchmark. In an 8-drive setup, statistically, one drive of the eight will be in GC mode, impacting the whole array.

This is one of the fundamental differences between HD and SSD arrays: the former are "passive" and hence much more predictable components, while with SSDs the active behavior models inside the drive impact the total performance of larger aggregations in more subtle and more complex ways. I shared my experiences with the other SSDs in an earlier post. Imagine if, during a run, 7 OCZ Vertex 4 SSDs are in performance mode and only one drive is in storage mode (about 5x slower) - the whole array would be impacted in significant ways. Interesting area for investigation though.
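
A quick Monte Carlo sketch of that effect - all figures are illustrative assumptions (each drive independently in GC 1/8 of the time and roughly 5x slower while it is):

```python
# Monte Carlo sketch: how per-drive GC pauses drag down an N-drive stripe.
# All figures are illustrative assumptions, not measurements.
import random

FULL_SPEED = 400            # MB/s per drive when not in GC (assumed)
GC_SPEED   = FULL_SPEED / 5 # "storage mode"/GC speed, ~5x slower (assumed)
P_GC       = 1 / 8          # fraction of time a drive spends in GC (assumed)
TRIALS     = 100_000

def stripe_throughput(n_drives):
    total = 0.0
    for _ in range(TRIALS):
        # In a synchronous stripe the array moves at the pace of its slowest member.
        slowest = min(GC_SPEED if random.random() < P_GC else FULL_SPEED
                      for _ in range(n_drives))
        total += n_drives * slowest
    return total / TRIALS

for n in (4, 8):
    ideal = n * FULL_SPEED
    avg = stripe_throughput(n)
    print(f"{n} drives: ~{avg:,.0f} MB/s average vs {ideal:,} MB/s ideal ({avg/ideal:.0%})")
```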

Beware of a RAM disk at the speed levels we are currently at. It would tax the memory bus 2x and is, in my humble opinion, a completely unrelated performance number vs. the straight PCIe-delivered I/O we are looking at. The Romley platform addressed another I/O bottleneck (cache handling) in bandwidth-intensive I/O. Not sure the same approach is taken with the LGA2011 desktop CPUs.

cheers,
Andy
 

ehorn

Active Member
Jun 21, 2012
Agreed... apples-to-bananas analysis... The memory subsystem (aka the CPU) was crying "uncle"... hehe... Thought it was interesting nevertheless.

It seems the manufacturers and integrators are well aware of the behaviors you cite, as we are seeing them announce software to manage performance consistency and leveling (among other feature sets) for their products/systems.

Hopefully I can contribute some data to the discussion, as I have a couple more HBAs/drives en route. Testing is fun, albeit very time consuming. I imagine you guys would agree - finding the time is not so easy these days. :)

Have a great day.

peace,
 

Patrick

Administrator
Staff member
Dec 21, 2010
Interjecting here... of course, as you guys explore this, feel free to let me know if/when you want to do a main site post. Many more eyes there, and this stuff is something that I think folks would be very interested in seeing.
 

dba

Moderator
Feb 20, 2012
San Francisco Bay Area, California, USA
Good idea Patrick. It sounds like Andreas is planning to incrementally add drives and then run more tests. His curves will let us know if there is a point of non-linearity for a single-CPU i7. I'll be doing the same for a dual-CPU system. I propose that we combine data in a few weeks and put together a main site post. What do you think, Andreas?

Also, it might be useful to first run the IOMeter tests using separate disks (no RAID). That will focus the test on raw throughput as opposed to raw throughput plus RAID processing - which is a good thing for a first round of testing. If you set up an IOMeter test with 4 workers and 8 separate non-RAID drives, for example, IOMeter will automatically distribute the load to two drives per worker. Once the raw throughput testing is done, you can then run tests to look at RAID implementations if you wish, hardware and software. If this looks like just too much work, keep in mind that you can start testing with four or five drives per controller instead of one - we know that the 9207 is easily capable of handling that much throughput with linear scaling.
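
As a rough illustration of what that first round buys you, something like this (placeholder numbers, not measurements) would quantify the RAID processing overhead once you have both sets of results:

```python
# Sketch: quantify RAID processing overhead by comparing the sum of raw
# per-drive results against the measured throughput of the same drives in RAID0.
# All numbers are placeholders.
raw_per_drive = [505, 498, 510, 502, 495, 508, 500, 503]   # MB/s, non-RAID targets
raid_measured = 3650                                        # MB/s, same 8 drives in RAID0

raw_total = sum(raw_per_drive)
overhead  = 1 - raid_measured / raw_total
print(f"raw aggregate : {raw_total} MB/s")
print(f"RAID0 measured: {raid_measured} MB/s")
print(f"RAID overhead : {overhead:.1%}")
```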

No problem if you want to test on your own and/or decline any suggestions - it's your hardware after all.

Jeff

 

Andreas

Member
Aug 21, 2012
Jeff,
the IOMeter tests sound like a good approach. I'd really like to know the point of non-linearity (and the cause), but it will take some time. It is fun to dive into that area, but thanks to a day job it has to be lower priority....

BTW, the next step in my little journey is a dual E5-2687W system. The motherboard and 128GB RAM (16x8GB, reg ECC, 1600MHz) already arrived, but a few things are still missing: the 2 CPUs, 2 more 9207-8i (for a total of 6 PCIe3 x8 slots), and 16 Samsung SSDs - I will pick them up after vacation.

Thanks for your eBay pointer and the good article on the 9202-16e. Just ordered 4 pcs. My offer of $225 apiece was accepted. :)

The ASUS Z9PE-D16 has 4 PCIe3 x16 slots, providing another option to check for perf boundary conditions with the 4 x 9202-16e controllers. My Samsungs deliver 320 MB/s writes, which should be well within the performance envelope the cards' PCIe2 x16 interface can cope with (16 x 320 MB/s = 5120 MB/s). Need to be careful with the size of the test matrix though; these things tend to grow quite fast.
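
As a sanity check on that bandwidth budget (the per-lane figures are the usual theoretical PCIe rates after encoding overhead; real-world numbers will be noticeably lower):

```python
# Back-of-envelope bandwidth budget per 9202-16e (16 drives per card).
# Per-lane rates are theoretical maxima after encoding overhead.
DRIVES_PER_CARD = 16
WRITE_MB_S      = 320          # Samsung 830 sequential write, as measured
PCIE2_LANE_MB_S = 500          # PCIe 2.0: 5 GT/s with 8b/10b encoding
PCIE3_LANE_MB_S = 985          # PCIe 3.0: 8 GT/s with 128b/130b encoding

need      = DRIVES_PER_CARD * WRITE_MB_S
pcie2_x16 = 16 * PCIE2_LANE_MB_S
pcie3_x16 = 16 * PCIE3_LANE_MB_S

print(f"needed per card     : {need} MB/s")       # 5120 MB/s
print(f"PCIe 2.0 x16 budget : {pcie2_x16} MB/s")  # 8000 MB/s
print(f"PCIe 3.0 x16 slot   : {pcie3_x16} MB/s")  # 15760 MB/s
```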

wrt power: I need to check how to get the Add2PSU adapter for the new system - the online shop has issues with PayPal. The single-socket system will be covered by a single 40A/5V PSU - should be OK.

regards from vacation,
Andy
 

Andreas

Member
Aug 21, 2012
Have a look here at what happens with RAID on the SAS2308 and SSD's
Thanks, interesting read

BTW,
my numbers are in IR mode, RAID0 of 8 SSDs per controller, SW striping across the 4 logical drives.
FW version: 14;
BIOS turned off on all 4 controllers.
I took the Asus X79 WS motherboard as it has only native PCIe slots, no PLX bridges in between.
rgds,
Andy
 

Andreas

Member
Aug 21, 2012
mobilenvidia,
the guy in the other forum used Vertex 4 drives. As written above, I could not get them to deliver sustainable performance.

I.e.:
Vertex 4, 128 GB, FW 1.5.

1) Secure erase
2) 1 hr pause
3) write first 64GB file: speed = 400 MB/s
4) write second 64GB file: speed = 100 MB/s
5) read first 64 GB file: 490 MB/sec
6) read second 64 GB file: 320 MB/sec
7) delete both files and write a 128 GB file: 155 MB/sec
8) read 128 GB file: 392 MB/sec
9) delete 128 GB file
10) write 64 GB file: 336 MB/sec
11) write second 64 GB file: 102 MB/sec
12) read first 64 GB file: 490 MB/sec
13) read second 64 GB file: 330 MB/sec

For the Samsung 830 / 128 GB:
All write speeds are 320 MB/sec
all read speeds are 520 MB/sec
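
For reference, a rough sketch of how one could reproduce this kind of 64 GB file write/read test - the target path and chunk size are assumptions, and the read pass assumes a cold cache:

```python
# Sketch: time a large sequential file write and read, roughly the kind of
# test behind the numbers above. Path and sizes are assumptions.
import os, time

PATH    = "D:\\bench.bin"     # hypothetical target on the SSD under test
SIZE_MB = 64 * 1024           # 64 GB
CHUNK   = 4 * 1024 * 1024     # 4 MB sequential I/O
buf     = os.urandom(CHUNK)   # note: one buffer repeated; SandForce drives would compress this

t0 = time.perf_counter()
with open(PATH, "wb", buffering=0) as f:
    for _ in range(SIZE_MB * 1024 * 1024 // CHUNK):
        f.write(buf)
    os.fsync(f.fileno())
print(f"write: {SIZE_MB / (time.perf_counter() - t0):.0f} MB/s")

t0 = time.perf_counter()
with open(PATH, "rb", buffering=0) as f:    # assumes cold OS cache for a fair read number
    while f.read(CHUNK):
        pass
print(f"read : {SIZE_MB / (time.perf_counter() - t0):.0f} MB/s")
```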

Andy
 

Patrick

Administrator
Staff member
Dec 21, 2010
mobilenvidia,
the guy in the other forum used Vertex 4 drives. As written above, I could not get them to deliver sustainable performance.

I.e.:
Vertex 4, 128 GB, FW 1.5.

1) Secure erase
2) 1 hr pause
3) write first 64GB file: speed = 400 MB/s
4) write second 64GB file: speed = 100 MB/s
5) read first 64 GB file: 490 MB/sec
6) read second 64 GB file: 320 MB/sec
7) delete both files and write a 128 GB file: 155 MB/sec
8) read 128 GB file: 392 MB/sec
9) delete 128 GB file
10) write 64 GB file: 336 MB/sec
11) write second 64 GB file: 102 MB/sec
12) read first 64 GB file: 490 MB/sec
13) read second 64 GB file: 330 MB/sec

For the Samsung 830 / 128 GB:
All write speeds are 320 MB/sec
all read speeds are 520 MB/sec

Andy
That makes a lot of sense. If you wait awhile, performance does get better but if you go over half of the spare area quickly, it is an issue. I also think that's why you want to oversize the Vertex 4 for an application. In your scenario, there is a huge premium going from 128GB to 256GB across 32 drives though.
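
The full-drive number is consistent with that 50/50 behavior, by the way - a quick check using Andreas's figures:

```python
# Quick consistency check: a 128 GB fill at 400 MB/s for the first half and
# 100 MB/s for the second half blends to roughly the measured full-drive speed.
first_half_gb,  first_speed  = 64, 400   # MB/s (from the numbers above)
second_half_gb, second_speed = 64, 100   # MB/s

total_gb  = first_half_gb + second_half_gb
total_sec = first_half_gb * 1024 / first_speed + second_half_gb * 1024 / second_speed
print(f"blended write speed: {total_gb * 1024 / total_sec:.0f} MB/s")  # ~160 vs 155 measured
```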
 

Mike

Member
May 29, 2012
EU
The Samsung 830 is known for having pretty aggressive garbage collection, just like the first Kingston SSDs out there. That's why these have been doing fairly well in RAID arrays without TRIM (until recently) compared to lots of other SSDs, I guess.
 

Patrick

Administrator
Staff member
Dec 21, 2010
So that gives me a thought. I should test drives exclusively on non-Intel controllers. Everyone tests on Intel controllers as they are basically the industry standard for the desktop/enthusiast market. The big difference here is that LSI, Adaptec, Areca, ATTO, etc. do not support TRIM.
 

Andreas

Member
Aug 21, 2012
That makes a lot of sense. If you wait awhile, performance does get better but if you go over half of the spare area quickly, it is an issue. I also think that's why you want to oversize the Vertex 4 for an application. In your scenario, there is a huge premium going from 128GB to 256GB across 32 drives though.
Patrick,
I did not want to consider all these kinds of SSD "internal" issues in my project. As stated above, the 4 TB SSD array can be filled and read completely within 4-7 minutes, which is a real speed difference compared to anything I had before. The behavior of the Samsungs is predictable, like with HDs.

The SandForce-controller-based SSDs (Vertex and SanDisk) don't exhibit this 50/50 pattern like the Agility/Vertex 4 drives do, but are impacted by compressible/incompressible data.

Last point:
The Samsung has the highest write power consumption, but if you consider the energy consumed to write the full drive, it is the most power efficient of all 9 drives due to its high avg. write speed.
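
A rough way to compare that "energy to fill the drive" metric - the Samsung figures are my own measurements from above, while the second drive's values are placeholders just to show the calculation:

```python
# Energy needed to write the drive full: power * (capacity / write speed).
# Samsung numbers are from my measurements above; the comparison drive's
# figures are placeholders to illustrate the calculation.
def fill_energy_wh(capacity_gb, write_mb_s, write_watts):
    seconds = capacity_gb * 1024 / write_mb_s
    return write_watts * seconds / 3600

print(f"Samsung 830 128GB : {fill_energy_wh(128, 320, 4.5):.2f} Wh")
print(f"placeholder drive : {fill_energy_wh(128, 155, 3.5):.2f} Wh")
```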

Andy
 

john4200

New Member
Jan 1, 2011
The Samsung has the highest write power consumption, but if you consider the energy consumed to write the full drive, it is the most power efficient of all 9 drives due to its high avg. write speed.
I agree with your choice of the Samsung 830 for your experiments, since it is an excellent value. But I just wanted to point out that the Plextor M3P is similarly consistent in performance, actually writes a little faster than the Samsung 830, and uses significantly less power doing so. For anyone who wants even better performance and lower power consumption, and is willing to pay for it (the M3P is more expensive than the 830), look at the Plextor M3P (or the M5P, which should be widely available in a month).
 

Andreas

Member
Aug 21, 2012
http://www.tomshardware.com/reviews/ssd-recommendation-benchmark,3269-3.html
The 830 uses no power compared to other drives
Good write-up on timing and testing a heap of SSDs
Thanks for the link.

The 830's power consumption numbers are different from my quick measurements, while the Vertex 4 is in the same ballpark.

My Vertex 4 idle value is 1.3 W (vs. TH's 1.5 W).
My Samsung 830 idle value is 0.45 W (vs. TH's 0.08 W).

In particular, the write number of 0.15 W is way below the 4.5 W I measured with a couple of 830 drives.

If these Samsung numbers were correct, my 5V circuit breaker wouldn't fire (32 x 0.15 W = 4.8 W).
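
Just to put the two figures side by side (simple arithmetic, nothing more):

```python
# Total 5 V draw for 32 drives under the two write-power figures.
DRIVES = 32
for label, watts in (("my measurement", 4.5), ("TH chart", 0.15)):
    total_w = DRIVES * watts
    print(f"{label:>14}: {total_w:6.1f} W  = {total_w / 5:5.1f} A on the 5 V rail")
```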

rgds,
Andy