Understanding fio bandwidth display


Elliott

New Member
Sep 3, 2019
I'm learning to use fio to benchmark disk speed. Any fio gurus out there who could drop some knowledge on me? Below is a sample of output from a sequential write to a ZFS pool. While the test runs, the one-line progress display hovers around 950-990 MiB/s, but the results at the end show bw=1057MiB/s. I never see a number that high during the test, so are these two numbers calculated differently?

Code:
$ fio --name=seqwrite --rw=write --bs=1M --numjobs=1 --size=30G --directory=/tank --ioengine=sync
seqwrite: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=sync, iodepth=1
fio-3.7
Starting 1 process
seqwrite: Laying out IO file (1 file / 30720MiB)
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=950MiB/s][r=0,w=950 IOPS][eta 00m:00s]
seqwrite: (groupid=0, jobs=1): err= 0: pid=5708: Mon Sep  9 18:04:29 2019
  write: IOPS=1057, BW=1057MiB/s (1109MB/s)(30.0GiB/29054msec)
    clat (usec): min=249, max=100307, avg=929.44, stdev=1016.86
     lat (usec): min=256, max=100321, avg=944.79, stdev=1016.62
    clat percentiles (usec):
     |  1.00th=[  314],  5.00th=[  334], 10.00th=[  359], 20.00th=[  693],
     | 30.00th=[  914], 40.00th=[ 1004], 50.00th=[ 1020], 60.00th=[ 1090],
     | 70.00th=[ 1106], 80.00th=[ 1123], 90.00th=[ 1139], 95.00th=[ 1156],
     | 99.00th=[ 1188], 99.50th=[ 1205], 99.90th=[ 1237], 99.95th=[ 1254],
     | 99.99th=[ 1598]
   bw (  MiB/s): min=  900, max= 2822, per=100.00%, avg=1057.39, stdev=345.13, samples=58
   iops        : min=  900, max= 2822, avg=1057.38, stdev=345.13, samples=58
  lat (usec)   : 250=0.01%, 500=14.21%, 750=7.50%, 1000=17.06%
  lat (msec)   : 2=61.22%, 100=0.01%, 250=0.01%
  cpu          : usr=1.96%, sys=32.88%, ctx=188200, majf=0, minf=26
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,30720,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=1057MiB/s (1109MB/s), 1057MiB/s-1057MiB/s (1109MB/s-1109MB/s), io=30.0GiB (32.2GB), run=29054-29054msec
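For what it's worth, the final number does seem to match total data over total runtime:

Code:
30.0 GiB / 29.054 s = 30720 MiB / 29.054 s ≈ 1057 MiB/s
So my guess is that the end-of-run bw is an overall average across the whole job, while the progress line shows bandwidth over the most recent sampling window, but I'd like someone to confirm that.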
As a second, unrelated question: how can I simulate writing through the buffer cache once memory is full? In general I want to use the kernel buffer cache, but when writing a huge file the cache will eventually fill up, so I need to benchmark the underlying disks. On a regular SSD with an XFS filesystem I could use direct=1, but I don't think that's quite the same workload. I tried fdatasync=1, but that runs very slowly.
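In case it helps the discussion, this is the kind of variant I've been experimenting with. It's just a guess at a closer match, not something I'm sure is right: end_fsync=1 should make fio issue an fsync() at the end of the job so the final flush to disk is included in the timed run, while the writes themselves still go through the page cache as normal.

Code:
# Hedged sketch of what I've been trying, not a verified answer:
# writes go through the buffer cache, but end_fsync=1 makes fio fsync()
# the file at the end of the job so the timing includes the final flush.
$ fio --name=seqwrite --rw=write --bs=1M --numjobs=1 --size=30G \
      --directory=/tank --ioengine=sync --end_fsync=1

# Another thought: fdatasync takes an integer, so fdatasync=32 would sync
# every 32 writes instead of every write, which may be less punishing
# than fdatasync=1 while still keeping the cache from absorbing everything.
Does that seem like a reasonable way to approximate the "cache is full" case, or is there a better-established approach?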