The Million IOPS Club Thread (Or more than 1 Million IOPS)


Patrick

Administrator
Staff member
Dec 21, 2010
12,516
5,811
113
I know we have a lot of storage nuts on the STH forums. 1 million IOPS used to be a big number; these days, it is fairly easy to attain.

How to see if your storage is part of the club:
  • Use FIO, IOMeter, or another common benchmarking tool (for fio users, see the sketch after this list). Best practice is to run a heavy write pre-conditioning workload first to get the drives to steady state.
  • As an example, use IOMeter 4K Random Read (100% read, 100% random) on the drives. For today's NAND NVMe drives you will need to pump the queue depth up high enough (often QD128 or QD256) to reach peak performance.
  • Post system specs and screenshots of your entries to this thread. If you want, system price and date acquired are also good figures to have.
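For fio on Linux, something along these lines should land in the same ballpark. This is only a sketch, not the STH test scripts; the device path, job counts, and run times below are placeholders you will want to adjust:
Code:
# 4K random read; aggregate queue depth is roughly iodepth x numjobs (32 x 8 = 256 here)
# point --filename at each drive under test (a colon-separated list also works) and sum the results
fio --name=4k-randread --filename=/dev/nvme0n1 --ioengine=libaio --direct=1 \
    --rw=randread --bs=4k --iodepth=32 --numjobs=8 --runtime=120 --time_based \
    --group_reporting

# 128K sequential read for the bandwidth number
fio --name=128k-seqread --filename=/dev/nvme0n1 --ioengine=libaio --direct=1 \
    --rw=read --bs=128k --iodepth=32 --numjobs=4 --runtime=120 --time_based \
    --group_reporting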
For those looking to run simple read Iometer tests, you can do the following on Windows:
  1. Download and extract Iometer 1.1.0
  2. Download and extract STHiometer zip for example read tests (note this does NOT have the pre-condition workloads)
  3. Run Iometer as administrator
  4. Delete the default Manager
  5. Open the STHiometer4k128kRead.icf profile from Iometer
  6. Add a manager
  7. Use 4 workers per disk (e.g. for 1 drive being tested you would have 4 workers; for 4 drives you would have 16) and ensure they are set up.
  8. Select a descriptive file name to save the results under.
  9. Get over 1 million IOPS (if you want the dashboard picture, go to Results Display and click the > character on the right side to bring that view up).
  10. If you want to speed things along, for NVMe NAND SSDs try tests 6, 7 and 13, 14 of 14, as those will effectively be at QD128 and QD256. For SATA SSDs, look around tests 3 and 4, 10 and 11 for lower QD figures. You can use the "next" button on the dashboard page to move to the next QD test.
Why do we not include all of the write workloads/pre-conditioning in that profile? Our pre-conditioning scripts for STH testing can easily write over 100TB to a consumer-grade SSD. For many consumer drives that is a significant amount, and I do not want someone to run the tests a few times and use up the majority of their drive's write endurance.
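If you do want to pre-condition anyway, a generic steady-state prep pass looks roughly like the sketch below. To be clear, this is not the STH script, the device path and durations are placeholders, and it will eat a meaningful chunk of a consumer drive's write endurance (and destroy any data on the target):
Code:
# WARNING: destructive, and it consumes drive endurance
# Pass 1: sequentially fill the whole device twice
fio --name=seq-fill --filename=/dev/nvme0n1 --ioengine=libaio --direct=1 \
    --rw=write --bs=128k --iodepth=32 --loops=2
# Pass 2: sustained 4K random writes until performance settles to steady state
fio --name=rand-precondition --filename=/dev/nvme0n1 --ioengine=libaio --direct=1 \
    --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 --runtime=1800 --time_based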
 

Attachments

Last edited:

Patrick

Administrator
Staff member
Dec 21, 2010
12,516
5,811
113
Going first here.
  • System: Dell PowerEdge R930
  • RAM: 512GB
  • CPU: 4x E7-8890 V4 Processors
  • OS: Windows Server 2012 R2
  • OS Drives: 2x 300GB SAS hard drives
  • Drives being tested: 8x Dell 400GB NVMe SSDs
  • Capacity being tested: 3.2TB
  • Acquire date: 9/2016
  • Approximate Cost: $50,000
Results
  • Tool Used: IOMeter
  • 4K Random IOPS: 3 million
upload_2016-9-13_8-27-24.png
  • 128K Sequential Read: 21GB/s
upload_2016-9-13_8-24-18.png
 
Last edited:

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,641
2,058
113
Ooh man, making me get some more NVMe in bare metal :D for this 'test'.

@Patrick -- Is it possible to save/export IOMeter configs so we can share the same one?

Can people also post what benchmark tool they used too? I'd find that useful vs just "I got 1m".
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,516
5,811
113
Ooh man, making me get some more NVMe in bare metal :D for this 'test'.

@Patrick -- Is it possible to save/export IOMeter configs so we can share the same one?

Can people also post what benchmark tool they used too? I'd find that useful vs just "I got 1m".
Yes. Let me work on this.

Edit: @T_Minus read Iometer profiles added.
 
Last edited:
  • Like
Reactions: T_Minus

Patrick

Administrator
Staff member
Dec 21, 2010
12,516
5,811
113
Low cost SATA SSD Option
  • System: Supermicro 2U 24-bay w/ 3x LSI SAS 3008 HBAs
  • RAM: 256GB
  • CPU: 2x E5-2698 V3 Processors
  • OS: Windows Server 2012 R2
  • OS Drive: SATA DOM
  • Drives being tested: 24x Phison S10DC 960GB SATA SSDs
  • Capacity being tested: 23TB
  • Acquire date: 8/2016
  • Approximate Cost: $25,000
Results
  • Tool Used: IOMeter
  • 4K Random IOPS: 1.5 million
Phison S10DC 4K Rand Read IOPS.JPG
  • 128K Sequential Read: 10GB/s
Phison Sequential Read.JPG
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,516
5,811
113
Maybe we will see someone join and post some Optane numbers
First-gen Optane is still going to be limited by the PCIe bus.

@RobertFontaine - this is intended to be a longer-term thread. Frankly, you could have 3x of these in your workstation: Fusion-io ioDrive II 1.2TB MLC Flash SSD PCIe 2.0 x8 Card KCC-REM-FIO-ioDrive2. That setup would do 1.0-1.1 million read IOPS at under $1500. That pricing will, of course, keep dropping to the point where it is even more home-lab friendly.

Now - if you wanted to say data center or home lab in your post, that is OK as well.
 
  • Like
Reactions: MiniKnight

TuxDude

Well-Known Member
Sep 17, 2011
616
338
63
Even at work I wouldn't be able to get any of our arrays to put out a million IOPS, at least not without cheating a bit. And if I'm gonna cheat, may as well just take the easy road - no one ever said these IOPS had to be done against persistent storage ;)

So using a CentOS 7.2 VM I happened to have laying around (4 cores / 8GB RAM VM hosted on a dual Xeon 5560 ESXi 6 host), I turned on a 4GB ramdisk (zram based) and ran fio against it. 4K random reads, running 32 parallel jobs each with a queue depth of 4, was happily sitting at about 1.25 million iops with all 4 vCPUs pegged at 100%.

Ok - you can all return to your ridiculous hardware now.
 

Chuntzu

Active Member
Jun 30, 2013
383
98
28
Well shoot, now I need to boot up my 32x SSD server and 10x NVMe server to get these screenshots. Both are home use only. I really like this thread; it makes me feel better that I am not the only one this crazy!!!! 2 million IOPS x 2, here we go!

Sent from my SM-N920T using Tapatalk
 
  • Like
Reactions: Patrick

MiniKnight

Well-Known Member
Mar 30, 2012
3,073
974
113
NYC
Tried a system today and got stuck at 950K IOPS. Need to wait for a storage system acquisition to burn in.
 

whitey

Moderator
Jun 30, 2014
2,766
868
113
41
Even at work I wouldn't be able to get any of our arrays to put out a million IOPS, at least not without cheating a bit. And if I'm gonna cheat, may as well just take the easy road - no one ever said these IOPS had to be done against persistent storage ;)

So using a CentOS 7.2 VM I happened to have laying around (4 cores / 8GB RAM VM hosted on a dual Xeon 5560 ESXi 6 host), I turned on a 4GB ramdisk (zram based) and ran fio against it. 4K random reads, running 32 parallel jobs each with a queue depth of 4, was happily sitting at about 1.25 million iops with all 4 vCPUs pegged at 100%.

Ok - you can all return to your ridiculous hardware now.
Be interested to hear your zram ramdisk testing methodology end-to-end @TuxDude if you would be so kind? I could go research/implement but would like to follow the testing parameters you did as there seems to be a LOT of variance when it comes to performance testing and some repeatable documented processes/guidelines would be nice to follow for sanity/accuracy sake....and learning of course!

TIA, whitey
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,516
5,811
113
As a reality check, what kind of IOPS can you get out of an M1015 and 8x S3500s?
500k-600k random 4K read would be my guess if they were on a SAS3008, but I have not tried on an M1015. It also depends on what capacity you are using; 800GB S3500s are much faster than the 80GB variants.
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,516
5,811
113
Here is another 3+ million IOPS machine, with about 26GB/s of sequential reads, in the DemoEval lab.

Low cost NVMe SSD Option
  • System: Supermicro 2U 24-bay NVMe chassis
  • RAM: 512GB DDR4-2400
  • CPU: 2x E5-2698 V4 Processors
  • OS: Windows Server 2012 R2
  • OS Drive: Intel DC S3700 400GB
  • Drives being tested: 24x Intel DC P3320 2TB
  • Capacity being tested: 47TB
  • Acquire date: 8/2016
  • Approximate Cost: $32,000
Results
  • Tool Used: IOMeter
  • 4K Random IOPS: 3.1 million
  • 128K Sequential Read: 26GB/s

Intel DC P3320 4K Rand Read IOPS.JPG Intel DC P3320 sequential read.JPG

We are running into architectural limits on these machines.

This one has over 40TB of usable NVMe space, so it is a bit more useful than the Dell one as a mass storage server.

Also, we have seen this machine run a big data platform under KVM that, even with virtualization, was achieving around 20GB/s read speeds.

Remember, a single DDR4-2400 module tops out around 19.2GB/s (2,400 MT/s x 8 bytes per 64-bit transfer), making this fairly impressive.
 
Last edited:

TuxDude

Well-Known Member
Sep 17, 2011
616
338
63
Be interested to hear your zram ramdisk testing methodology end-to-end @TuxDude if you would be so kind? I could go research/implement but would like to follow the testing parameters you did as there seems to be a LOT of variance when it comes to performance testing and some repeatable documented processes/guidelines would be nice to follow for sanity/accuracy sake....and learning of course!

TIA, whitey
Well, since I was just doing it for humor purposes, it was very much a quick'n'dirty run. I just happened to have everything needed for the test already assembled in a VM that I've been using for some other storage benchmarking at home.

The first step was to load the zram module and tell it how many ramdisk devices you want.
Code:
modprobe zram num_devices=1
Then tell it how big it should be (note: zram is a compressed block device, so it should use somewhat less RAM than this depending on the data stored).
Code:
echo 4294967296 > /sys/block/zram0/disksize
Lately I've been playing with Fio-Visualizer, which is being developed/used by Intel for testing their SSDs and includes a bunch of workloads designed to stress NVMe drives. I borrowed the NVMe_04k_RR_QD4_32J.ini job from that (you can grab just that file from the GitHub link if you want and use it with regular command-line fio) and made a few modifications. Most importantly, I changed the 'filename' setting to point at '/dev/zram0', but I also turned the runtime down to 300 (5 minutes) since it wasn't really a serious test and I'm not concerned with pre-conditioning RAM. And so the final command, with the modified job ini file copied over to my test VM (I named the modified file 4krr.ini):
Code:
fio 4krr.ini
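If you don't want to grab the Intel file, a job file roughly along those lines (this is just a sketch of the settings I described, not the actual NVMe_04k_RR_QD4_32J.ini contents) would look something like:
Code:
; 4K random read, 32 jobs at QD4, aimed at the zram device -- sketch only
[global]
ioengine=libaio
direct=1
rw=randread
bs=4k
iodepth=4
numjobs=32
runtime=300
time_based
group_reporting
filename=/dev/zram0

[4k-randread]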
I suppose, since I was CPU limited, that I could probably get much better numbers by switching zram's compression algorithm to something with less overhead, or disabling compression completely if that's an option. But my goal was just to be able to make some type of 1-million-IOPS claim, and that config managed to do it, so I didn't spend any more time tweaking it.

Also, just to be clear to everyone reading: there is no battery protection or any type of job to back up the data to persistent storage. The zram disk is not suitable for any kind of real use, either as a storage device or a write cache (I had originally started playing with zram to experiment with bcache a bit). All of the data stored in it is lost on a crash, reboot, etc. So don't go using these instructions to make the fastest VM storage pool ever and then come whining when all your VMs suddenly disappear.
 

gigatexal

I'm here to learn
Nov 25, 2012
2,913
607
113
Portland, Oregon
alexandarnarayan.com
The next iteration of this kind of thread should be an STH OLTP benchmark that installs a standard DB, configures it in a standard way, and then runs a production-like load test to exercise both the CPUs and the disks. IOPS is neat and all, but it's still a bit abstract.
 
  • Like
Reactions: Patrick