The Million IOPS Club Thread (Or more than 1 Million IOPS)


JDM

Member
Jun 25, 2016
@JDM, what about fio write I/O numbers? Nice read IOPS run, btw!
Here they are below; I took that benchmark as well but figured people wouldn't want a single post to be so long :)

Config:
[global]
thread
ioengine=libaio
direct=1
buffered=0
group_reporting=1
rw=randwrite
bs=4k
iodepth=64
numjobs=2
size=50%

[job1]
filename=/dev/nvme1n1

[job2]
filename=/dev/nvme2n1
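
(If anyone wants to rerun this, save the job file above as something like randwrite.fio, the filename is just an example, and kick it off with:

fio randwrite.fio

The two job sections with numjobs=2 each are what give the 4 threads shown in the output below.)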

Output:
job1: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
...
job2: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
...
fio-2.16
Starting 4 threads
Jobs: 4 (f=4): [w(4)] [100.0% done] [0KB/4338MB/0KB /s] [0/1111K/0 iops] [eta 00m:00s]
job1: (groupid=0, jobs=4): err= 0: pid=123737: Tue Nov 7 13:15:07 2017
write: io=534182MB, bw=4322.6MB/s, iops=1106.6K, runt=123580msec
slat (usec): min=1, max=30029, avg= 1.94, stdev= 2.67
clat (usec): min=42, max=30240, avg=228.85, stdev=38.90
lat (usec): min=43, max=30242, avg=230.84, stdev=38.97
clat percentiles (usec):
| 1.00th=[ 191], 5.00th=[ 193], 10.00th=[ 195], 20.00th=[ 197],
| 30.00th=[ 199], 40.00th=[ 211], 50.00th=[ 231], 60.00th=[ 235],
| 70.00th=[ 239], 80.00th=[ 258], 90.00th=[ 274], 95.00th=[ 294],
| 99.00th=[ 326], 99.50th=[ 338], 99.90th=[ 366], 99.95th=[ 378],
| 99.99th=[ 402]
lat (usec) : 50=0.01%, 100=0.01%, 250=78.37%, 500=21.63%
lat (msec) : 50=0.01%
cpu : usr=22.30%, sys=60.08%, ctx=12805960, majf=0, minf=4
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=0/w=136750572/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
WRITE: io=534182MB, aggrb=4322.6MB/s, minb=4322.6MB/s, maxb=4322.6MB/s, mint=123580msec, maxt=123580msec

Disk stats (read/write):
nvme1n1: ios=162/68310148, merge=0/0, ticks=0/15535956, in_queue=16306048, util=100.00%
nvme2n1: ios=162/68297331, merge=0/0, ticks=0/15516168, in_queue=16584844, util=100.00%
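
Quick sanity check on those numbers (my own arithmetic, not part of the fio output):

136,750,572 writes / 123.580 s ≈ 1,106.6K IOPS aggregate, or roughly 553K random-write IOPS per drive
1,106.6K IOPS x 4 KiB ≈ 4,322.6 MiB/s, which lines up with the reported bw=4322.6MB/s (fio's "MB" here is binary)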
 

JDM

Member
Jun 25, 2016
What about pools of NVMe on ZFS? Could that work, maybe with some mirrored Optanes as read cache, etc.?
These are now (after the raw-device benchmarking) in a mirrored zpool to host VMs; I'll let you know how that goes as soon as I get time...which may not be until this weekend.
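
For reference, the mirror itself is nothing exotic, roughly along these lines (pool name is just a placeholder, and ashift=12 assumes 4K-sector drives):

# mirrored pool across the two NVMe devices (sketch only)
zpool create -o ashift=12 nvmepool mirror /dev/nvme1n1 /dev/nvme2n1
zpool status nvmepool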
 

i386

Well-Known Member
Mar 18, 2016
Germany
Wow. I wonder if the SAN vendors are scared by what could be some competent DIY offerings.
I think they will stick around for a while.
The real strength of 3D XPoint will be servers with many NVDIMM slots and large (128+ GB) NVDIMMs replacing NVMe/SAS SSDs, bypassing the "PCIe bottleneck".
Tape and mainframes are still alive, so FC has decades ahead of it.
In 2014/15 our company ran a project for a customer to work out how much it would cost to migrate from IBM System z to x86 servers and run the applications on x86...
The customer bought two new IBM z13 machines after that project :D
 

nkw

Active Member
Aug 28, 2017
Ceph can't even use a single NVMe drive's performance yet... it's bottlenecked... Optane would be a waste.
Huh? Are you just referring to the fact that you might need to run multiple OSD instances per physical NVMe drive?
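
(For context, "multiple OSDs per drive" usually means carving one NVMe device into several partitions and backing one OSD with each, roughly like the sketch below. Device names are placeholders and the exact ceph-volume invocation depends on the Ceph release.)

# split one NVMe into two halves and stand up one OSD on each (illustrative only)
parted -s /dev/nvme0n1 mklabel gpt
parted -s /dev/nvme0n1 mkpart osd0 0% 50%
parted -s /dev/nvme0n1 mkpart osd1 50% 100%
ceph-volume lvm create --data /dev/nvme0n1p1
ceph-volume lvm create --data /dev/nvme0n1p2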
 

Patriot

Moderator
Apr 18, 2011
Huh? Are you just referring to the fact that you might need to run multiple OSD instances per physical NVMe drive?
Aaaaaand that negates the redundancy, i.e. the reason you are using Ceph in the first place. It's done for benchmark numbers, not production.
 

gigatexal

I'm here to learn
Nov 25, 2012
Portland, Oregon
alexandarnarayan.com