I'm learning to use fio to benchmark disk speed. Any fio gurus out there who could drop some knowledge on me? Below is sample output from a sequential write to a ZFS pool. While the test runs, the one-line progress display hovers around 950-990 MiB/s, but the results at the end of the test show a total of bw=1057MiB/s. I never see a number that high during the test, so are these two numbers calculated differently?
As a second, unrelated question: how can I simulate writing through the buffer cache when memory is full? In general I want to use the kernel buffer cache, but when writing a huge file the cache will sometimes fill up, so I also need to benchmark the underlying disks. On a regular SSD with an XFS filesystem I could use direct=1, but I don't think that's quite the same workload. I tried fdatasync=1, but that runs very slowly.
Code:
$ fio --name=seqwrite --rw=write --bs=1M --numjobs=1 --size=30G --directory=/tank --ioengine=sync
seqwrite: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=sync, iodepth=1
fio-3.7
Starting 1 process
seqwrite: Laying out IO file (1 file / 30720MiB)
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=950MiB/s][r=0,w=950 IOPS][eta 00m:00s]
seqwrite: (groupid=0, jobs=1): err= 0: pid=5708: Mon Sep 9 18:04:29 2019
write: IOPS=1057, BW=1057MiB/s (1109MB/s)(30.0GiB/29054msec)
clat (usec): min=249, max=100307, avg=929.44, stdev=1016.86
lat (usec): min=256, max=100321, avg=944.79, stdev=1016.62
clat percentiles (usec):
| 1.00th=[ 314], 5.00th=[ 334], 10.00th=[ 359], 20.00th=[ 693],
| 30.00th=[ 914], 40.00th=[ 1004], 50.00th=[ 1020], 60.00th=[ 1090],
| 70.00th=[ 1106], 80.00th=[ 1123], 90.00th=[ 1139], 95.00th=[ 1156],
| 99.00th=[ 1188], 99.50th=[ 1205], 99.90th=[ 1237], 99.95th=[ 1254],
| 99.99th=[ 1598]
bw ( MiB/s): min= 900, max= 2822, per=100.00%, avg=1057.39, stdev=345.13, samples=58
iops : min= 900, max= 2822, avg=1057.38, stdev=345.13, samples=58
lat (usec) : 250=0.01%, 500=14.21%, 750=7.50%, 1000=17.06%
lat (msec) : 2=61.22%, 100=0.01%, 250=0.01%
cpu : usr=1.96%, sys=32.88%, ctx=188200, majf=0, minf=26
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,30720,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=1057MiB/s (1109MB/s), 1057MiB/s-1057MiB/s (1109MB/s-1109MB/s), io=30.0GiB (32.2GB), run=29054-29054msec
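As a sanity check on the first question, here is a small Python sketch that recomputes the summary bandwidth from the total I/O and the runtime reported above. The variable names are my own, and this only shows that the final figure is consistent with total data divided by total runtime; it doesn't say anything about how the progress line is computed. The bw line in the output also reports max=2822 MiB/s, so at least one sample was well above anything I noticed on screen.

```python
# Numbers copied from the fio summary above.
total_mib = 30 * 1024      # io=30.0GiB, i.e. 30720 MiB
runtime_s = 29054 / 1000   # run=29054msec

# Whole-run average: total data written divided by total runtime.
avg_mib_s = total_mib / runtime_s

# Same figure in decimal megabytes per second (1 MiB = 1,048,576 bytes).
avg_mb_s = avg_mib_s * 1024 * 1024 / 1_000_000

print(f"{avg_mib_s:.0f} MiB/s ({avg_mb_s:.0f} MB/s)")
```

This matches the BW=1057MiB/s (1109MB/s) in the summary, so that number looks like an average over the whole 29 seconds rather than a snapshot of the rate at any one moment.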