How to monitor/identify used blocksize (for different data)

Rand__

Well-Known Member
Mar 6, 2014
4,610
918
113
Hi,

I've decided to split up my large FreeNAS pool into smaller ones. While doing that, I want to optimize the recordsize of the new pools for the different types of content I have (video, photos, documents, software, VMs).

While that is fairly simple for videos, I struggled with photos. Historically I classified those as random IO with small blocksizes, but nowadays each photo is at least a couple of megabytes, and while I might access many different ones one after another (browsing a gallery), within a single photo the access is not random.

So the next thing I thought about was to read up on it, but I did not find much. Of course there is lots of application-specific information (primarily for databases), but most applications I use (e.g. Lightroom) work at the file system level.

Then I wondered: can I measure it? But on that I did not find much either.

Also, if all the tools do is read files at the OS layer, is there such a thing as a data-specific blocksize at all? Or does it all depend on the storage blocksize only, with the source OS being agnostic? And then there is something like TCP window scaling, which changes the transport layer (potentially all the time); how does that change things...

Looks like I need to do some basic reading here first, but maybe somebody is further along that road, or has a totally different point of view :)
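To make my plan concrete, here is the kind of layout I have in mind: one dataset per content type, each with its own recordsize. This is only a sketch; the dataset names and values are placeholders I haven't settled on, and 1M records need the large_blocks pool feature on older ZFS:

```shell
# Sketch only - dataset names and recordsize values are placeholders.
# recordsize is a per-dataset ZFS property, so each content type can
# get its own value on the same pool.
zfs create -o recordsize=1M   tank/video      # large sequential files
zfs create -o recordsize=1M   tank/photo      # multi-MB images, read whole
zfs create -o recordsize=128K tank/documents  # mixed small files (default)
zfs create -o recordsize=64K  tank/vms        # match guest FS / workload
zfs get recordsize tank/video                 # verify the setting
```

The open question is whether the values I'd pick by gut feeling actually match the IO the applications generate.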

Cheers
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
1,285
439
83
There's a million ways to skin this particular cat, and it sounds like you want more detail than I've bothered with in the past. As a starter for ten, you can set up a perfmon counter in Windows, or use something like sar/iostat/sysstat under Linux (and if you're using ZFS, I believe it has its own inbuilt version of iostat); those should be available out of the box.

On Windows, the rather confusingly named counters "Avg. Disk Bytes/Read" and "Avg. Disk Bytes/Write" (under both PhysicalDisk and LogicalDisk) will report the average IO size seen over each tick period. I don't have a Windows box to hand, but it should be fairly easy for you to have a fiddle with.

Similarly on *nix, iostat will give you a rundown of the average request size. Here I take a reading from a lightly used MD array every 3 seconds and watch the average request size (avgrq-sz):
Code:
effrafax@wug:~$ iostat -x 3 /dev/md32 
Linux 4.9.0-9-amd64 (wug)  09/06/19    _x86_64_    (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.76    0.11    0.39    0.59    0.00   97.15

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
md32              0.00     0.00   88.70    2.28  6247.47   540.61   149.22     0.00    0.00    0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.04    0.00    0.08    0.00    0.00   99.87

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
md32              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.00    0.08    0.00   99.92

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
md32              0.00     0.00    0.00    0.67     0.00   197.33   592.00     0.00    0.00    0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.75    0.00    0.21    0.00    0.00   99.04

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
md32              0.00     0.00    0.00    5.67     0.00    22.67     8.00     0.00    0.00    0.00    0.00   0.00   0.00
I should add that avgrq-sz is measured in sectors rather than bytes, so you'll need to multiply it by your sector size (almost certainly either 512 or 4096) to get the size in bytes.
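For example, taking the 149.22 figure from the first sample above and assuming 512-byte sectors, a quick bit of awk does the conversion:

```shell
# avgrq-sz is in sectors; multiply by the sector size to get bytes.
# 149.22 is the value from the iostat sample above; 512 is an assumed
# sector size - check yours with `blockdev --getss /dev/sdX`.
echo "149.22" | awk '{ printf "%.0f bytes (%.1f KiB)\n", $1 * 512, $1 * 512 / 1024 }'
```

So the average read in that sample was roughly 75 KiB, which already tells you those weren't tiny random IOs.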

If you're running sar periodically (debian defaults to running sar stats every 5 minutes) you can view historical block device stats with sar -d.

If you want more detail than that, more advanced tools will be needed. I've only toyed with it a little, but dtrace (or various partial ports like systemtap) will let you query and process a bewildering amount of performance data.
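As a poor man's version of dtrace's quantize() on Linux, you can also strace the application and bucket the returned IO sizes yourself with awk. The sample lines below are made-up stand-ins for real output from something like `strace -e trace=read -p <pid>`, just to show the idea:

```shell
# Bucket read() return sizes into power-of-two bins, similar to what
# dtrace's quantize() aggregation produces. The here-doc lines are
# fabricated sample strace output (hypothetical data).
cat <<'EOF' |
read(3, "..."..., 131072) = 131072
read(3, "..."..., 131072) = 131072
read(3, "..."..., 4096) = 4096
read(4, "..."..., 131072) = 65536
EOF
awk -F'= ' '$2 > 0 {
    # round each returned byte count down to a power of two
    n = $2; p = 1; while (p * 2 <= n) p *= 2;
    hist[p]++
}
END { for (b in hist) printf "%8d bytes: %d\n", b, hist[b] }' | sort -n
```

Swap the here-doc for a real strace capture and you get a crude histogram of the IO sizes the application itself is issuing, before any block layer or network gets involved.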

Your usage scenario is somewhat more complicated because (reading between the lines) you're running an application (presumably on Windows?) against a file server of some sort (presumably Linux/BSD?) and want to take network stats into account as well. That's where things can get even more complicated due to the nature of whatever layers are sitting between the base discs and the application, but even so, I think you'll get the best idea of what your IO patterns actually look like by simply watching the IO sizes that hit your file server.

Edit: looking around for the current state of the dtrace tooling under Linux led me to a util I'd never heard of called sysdig; from the examples page it's exceptionally detailed yet has a reasonably simple syntax:
draios/sysdig
 

Rand__

Well-Known Member
Mar 6, 2014
4,610
918
113
Thanks, will have a look at these.
Most don't seem to have identical output on FreeBSD (FreeNAS), but I'll see what they can tell me :)
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
1,285
439
83
IIRC iostat is mostly the same program on all the *nix platforms, but since it pulls data from different kernel counters, some of the field names differ... and a quick gander at the FreeBSD man page suggests it doesn't have an equivalent field for IO sizes:
Code:
r/s     read operations per second
w/s     write operations per second
kr/s    kilobytes read per second
kw/s    kilobytes write per second
qlen    transactions queue length
ms/r    average duration of read transactions, in milliseconds
ms/w    average duration of write transactions, in milliseconds
ms/o    average duration of all other transactions, in milliseconds
ms/t    average duration of all transactions, in milliseconds
%b      % of time the device had one or more outstanding transactions
Assuming you're using FreeNAS with ZFS, does zfs/zpool iostat show you better information?
 

Rand__

Well-Known Member
Mar 6, 2014
4,610
918
113
Not really (for this use case)
Code:
 zpool  iostat -v tank
                                           capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
tank                                    22.4T  6.61T    590     63  68.2M  2.87M
  mirror                                6.26T  1012G    133      9  15.1M   523K
    gptid/f72b1d2b-126b-11e7-b2b0-0050569e17a3      -      -     50      7  12.1M   526K
    gptid/f7af499b-126b-11e7-b2b0-0050569e17a3      -      -     50      7  12.1M   526K
  mirror                                6.22T  1.03T    132     14  15.0M   748K
    gptid/f82c7777-126b-11e7-b2b0-0050569e17a3      -      -     49     10  12.1M   750K
    gptid/f8b7f18d-126b-11e7-b2b0-0050569e17a3      -      -     49     10  12.2M   750K
  mirror                                5.59T  1.66T    183     20  21.5M   866K
    gptid/61f9f99a-1719-11e7-b3a3-0050569e17a3      -      -     59     12  10.8M   869K
    gptid/62892b14-1719-11e7-b3a3-0050569e17a3      -      -     58     12  10.8M   869K
  mirror                                4.32T  2.93T    141     19  16.5M   806K
    gptid/659d00e3-1719-11e7-b3a3-0050569e17a3      -      -     49     11  8.27M   808K
    gptid/662cca81-1719-11e7-b3a3-0050569e17a3      -      -     48     11  8.28M   808K
--------------------------------------  -----  -----  -----  -----  -----  -----
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
1,285
439
83
Harrumph, I would have hoped it would give better stats than that, but from a look through the docs that seems to be the best you can expect.

I'm not a FreeNAS user, and I don't use ZFS much either (and when I do, it's ZoL), so I'm fumbling in the dark a little. From a bit of reading it looks like gstat also provides IO stats on FreeBSD, but again from a quick look around it doesn't seem to report the actual sizes of IOs.
 

Rand__

Well-Known Member
Mar 6, 2014
4,610
918
113
Yeah, I have not really found much that provides the data either - that's why I asked ;)

In the end it will probably make only a minor difference, maybe 5 or 10% at most - it just would have been nice to be able to measure it ;)
Thanks for trying:)
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
1,285
439
83
Aye, I'm quite surprised that it doesn't seem to be a thing on ZFS/BSD... so I feel (much like yourself, perhaps) that I'm missing something obvious!

As you say, it's probably going to make a negligible difference (especially with RAM and flash caches in play, which will generally turn small random IOs into larger sequential ones), but it's nevertheless interesting to know and to test.
 

Rand__

Well-Known Member
Mar 6, 2014
4,610
918
113
So I found a few nice dtrace scripts, but unfortunately they don't run on FreeNAS... too consumer-ish, I assume.