Linux-Bench: Storage


Jeggs101

Well-Known Member
Dec 29, 2010
While those are how it's used in practice... raw testing allows a consistent measure, rather than having a kernel update completely change the XFS performance attributes.
I would rather do raw testing plus the occasional filesystem comparison. Play with one variable at a time.
That's a fundamental question, though. Are you trying to make a completely objective measure of speed, or are you trying to set up a generic environment as a baseline and then compare within that baseline?

The CPU Linux-Bench is great because it's basically: boot a LiveCD, run three commands, and go. Not perfect, but the output makes sense on a relative basis.
 

handruin

Member
May 24, 2015
This is the only thread I've seen on storage-related benchmarking, so I wanted to see if it could be rekindled at all, or if there has been any other progress here or elsewhere.

I've spent the last month working with fio and scripts around this utility so that each of our storage devices can be benchmarked at the block layer and above, to aid in platform qualification among vendors for a project we're working on. Basically, I've been testing for differences in SSDs and NVMe devices in our Dell, HP, and Lenovo rack servers. I've also been testing their included controllers' I/O scalability through concurrent SSD benchmark runs (1 drive, 2 drives, 3, etc.) to see if there is any I/O drop-off as the number of devices increases.

Typically, all tests have been run from a kickstart-customized and minimized CentOS 6.7 ISO that I've built, and executed through a bash shell script which calls fio with command-line options (this can be converted to Python if desirable). All tests are done with O_DIRECT and O_SYNC using the libaio engine, to bypass OS caching layers and perform I/O against the raw block device with no filesystem on top.
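For illustration, a single run of the kind described above might look like the sketch below (the device path is a placeholder; the flags are standard fio options):

    #!/bin/bash
    # Sketch: one 4k random-read pass against a raw block device, bypassing the page cache.
    # /dev/sdX is a placeholder -- point it at an otherwise-unused test device.
    fio --name=randread-4k \
        --filename=/dev/sdX \
        --rw=randread --bs=4k --iodepth=16 \
        --ioengine=libaio --direct=1 --sync=1 \
        --runtime=300 --time_based \
        --output=randread-4k.log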

The scripting I've been working on makes use of sg_format to wipe an SSD (secure erase) to improve consistency, but this command is not universal to all drives and mainly applies to SAS devices. The script can take a combination of iodepths and block sizes, with a configurable run time per test. I've been stepping through combinations of 4k and 64k at iodepths 1, 8, and 16, but only because those numbers met the needs of our project.
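The sweep itself can be a simple nested loop; a minimal sketch using the block sizes and iodepths mentioned above (the device path and per-test runtime are placeholders):

    #!/bin/bash
    DEV=/dev/sdX      # placeholder test device
    RUNTIME=300       # seconds per combination
    for bs in 4k 64k; do
      for qd in 1 8 16; do
        fio --name="randwrite-${bs}-qd${qd}" --filename="$DEV" \
            --rw=randwrite --bs="$bs" --iodepth="$qd" \
            --ioengine=libaio --direct=1 --sync=1 \
            --runtime="$RUNTIME" --time_based \
            --output="result-${bs}-qd${qd}.log"
      done
    done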

At the end, I have another script that parses the fio output file into a semicolon-separated file which can be imported into Google Sheets or Excel (or any other system). There is also the option to use JSON output if that's more portable for a global benchmark-tracking system.
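As a sketch of the JSON route (this assumes fio was run with --output-format=json and that jq is installed; note the latency field paths vary by fio version, with older releases reporting usec under .read.lat.mean instead):

    # Pull per-job name, read IOPS, and mean read latency out of fio's JSON output.
    jq -r '.jobs[] | [.jobname, .read.iops, .read.lat_ns.mean] | @csv' result.json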

Would any of this be helpful in this project?
 
Reactions: Chuntzu and Patrick

handruin

Member
May 24, 2015
I'd like to take a stab at this in my own repo but wanted some feedback on some ideas.

OS: Linux
The script will install required packages based on the distribution, or maybe we just pick one distribution for now and deploy it as a container of some kind? I can start with Ubuntu Server 16.04.1 until I see more feedback.
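On Ubuntu, the dependencies discussed in this thread would amount to something like the following (stock package names; exact versions will differ):

    # Benchmarking plus drive-inspection and secure-erase tooling.
    sudo apt-get update
    sudo apt-get install -y fio smartmontools nvme-cli sg3-utils hdparm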

Test types (selectable|default) :
  • Short test run: 5 min
  • Medium test run: 20 min
  • Long test run: 60 min

IO testing strategy (a sketch assembling these options follows the list):
  • I/O types: randwrite, randread, write, read
  • variable block size (one per test: 4k, 8k, 16k, etc.)
  • variable queue depth (one per test: 1, 2, 4, 8, etc.)
  • variable number of jobs (1 for now)
  • ioengine: libaio
  • direct I/O enabled: avoid buffered I/O (fio's direct=1, i.e. O_DIRECT); results are more consistent when OS RAM isn't used as a buffer
  • use the device as a raw block device without a filesystem
  • option to use a filesystem for those who can't provide an unformatted block device
  • time based: 5 min, 20 min, or 60 min
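A hedged sketch of how those selections might map onto an fio invocation (the shell variable names are mine; the filesystem fallback just points fio at a file instead of a device):

    #!/bin/bash
    TARGET=/dev/sdX            # raw block device (preferred)...
    # TARGET=/mnt/test/fio.dat # ...or a file for the filesystem option
    #                          #    (add --size=<n> so fio knows how much to lay down)
    RW=randwrite               # one of: randwrite randread write read
    BS=4k                      # one block size per test
    QD=8                       # one queue depth per test
    RUNTIME=1200               # 300/1200/3600 for the 5/20/60-minute runs
    fio --name="${RW}-${BS}-qd${QD}" --filename="$TARGET" \
        --rw="$RW" --bs="$BS" --iodepth="$QD" --numjobs=1 \
        --ioengine=libaio --direct=1 \
        --runtime="$RUNTIME" --time_based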

Capturing performance results:

  • Need to think about this... we could parse all available fields returned by fio, or strip out only the ones we want.
  • Format the file to be added into some collection system that's yet to be defined.

Capturing drive information:
  • Install the latest smartctl (smartmontools) with NVMe support enabled. This may require building the package if it doesn't come with that baked in.
  • Collect the following info if possible: drive make, model, part number, bus speed, advanced-format status, etc.
  • For SSDs, collect the wear indication (remaining endurance) and over-provisioning details if possible? Also collect how much capacity is in use during the test.
  • Also collect temperature readings, as this influences throttled performance in some SSD/NVMe drive controllers (see the sketch after this list).
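A sketch of that collection step (standard smartmontools and nvme-cli invocations; device paths are placeholders):

    # Identity, capabilities, and SMART attributes (wear, temperature) for SATA/SAS:
    sudo smartctl -x /dev/sdX
    # Same for an NVMe device (requires smartmontools built with NVMe support):
    sudo smartctl -x /dev/nvme0
    # Alternative NVMe health log (temperature, percentage used) via nvme-cli:
    sudo nvme smart-log /dev/nvme0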

SSD conditional options:
  • Allow a secure erase before testing? This can be tricky and may cause problems due to inconsistent behavior in how each SSD manufacturer supports it.
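For illustration, the erase step would have to branch per transport; these are the usual commands (all destructive, and drive support genuinely varies, hence making it a conditional option):

    # SAS: format the drive (what sg_format is already used for above).
    sudo sg_format --format /dev/sdX
    # SATA: ATA Security Erase via hdparm (drive must not be security-frozen).
    sudo hdparm --user-master u --security-set-pass p /dev/sdY
    sudo hdparm --user-master u --security-erase p /dev/sdY
    # NVMe: format with secure-erase setting 1 (user-data erase).
    sudo nvme format /dev/nvme0n1 --ses=1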

Thoughts?
 
Reactions: Patrick

Patrick

Administrator
Staff member
Dec 21, 2010
I love the idea.

We certainly need a 70/30 read/write mix test. QD 1, 2, 4, 8, 16, 32, 64, 128, 256.

For "normal" benchmarking you need a pre-conditioning run before each actual run. Generally these are in the multi-hour range. I think my normal iometer script is around 36 hours for "quick" runs.

Ubuntu 16.04.1 is great.
 

William

Well-Known Member
May 7, 2015
Patrick said:
I love the idea.

We certainly need a 70/30 read/write mix test. QD 1, 2, 4, 8, 16, 32, 64, 128, 256.

For "normal" benchmarking you need a pre-conditioning run before each actual run. Generally these are in the multi-hour range; I think my normal Iometer script is around 36 hours for "quick" runs.

Ubuntu 16.04.1 is great.
Yes, they take a long time. I just finished benches for two drives; they started Saturday morning :(
 

handruin

Member
May 24, 2015
Thanks for passing this along; that's good data. It makes me curious what tooling they use to collect the I/O sizes in their systems. I'd like to do something similar for a couple of my own projects.

I think it makes sense to make this flexible enough to allow changing the pattern if needed, but it would be nice to agree on a standard set so that people have a common baseline to compare against.

The fio project on GitHub has a bunch of example tests, and after reading through a handful of them I can see some that offer the split read/write mix as well as steady-state and fill workloads.

I'd also like to ask more about the 36-hour duration of your quick runs, and what the deciding factors were for that length versus shorter (or longer) runs. I understand that for SSDs a certain amount of data is needed to fill the drive and its over-provisioning to reach steady state, versus testing it empty, but 36 hours seems really excessive. If the test uses direct and sync I/O, there's no need to fill the host OS page cache.
 

Patrick

Administrator
Staff member
Dec 21, 2010
@handruin I think they are collecting the information at the box level and sending data back via phone-home features.

On the 36-hour duration: pre-conditioning an enterprise drive to hit steady state easily takes over an hour, often three or more, per workload. So if you want steady-state performance numbers, you need a long pre-conditioning curve. You also need to ensure the drive stays in steady state.
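For reference, the PTS steady-state check itself is easy to script; a minimal sketch over the last five per-round IOPS samples (the 20% data-excursion limit is from the PTS; the one-value-per-line input file is an assumption here):

    # iops.txt: one IOPS value per pre-conditioning round, newest last.
    tail -n 5 iops.txt | awk '
      { v[NR] = $1; sum += $1 }
      END {
        avg = sum / NR
        max = v[1]; min = v[1]
        for (i = 2; i <= NR; i++) { if (v[i] > max) max = v[i]; if (v[i] < min) min = v[i] }
        # PTS data-excursion criterion: max-min within 20% of the window average
        # (the PTS also requires a best-fit slope within 10%, omitted for brevity).
        print ((max - min <= 0.20 * avg) ? "steady state reached" : "keep pre-conditioning")
      }'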

Good reading:
Solid State Storage (SSS) Performance Test Specification (PTS) | SNIA

Also, I had this one bookmarked but have not had the chance to run it (plenty of hardware, just no time)
GitHub - cloudharmony/block-storage: Block storage test suite based on SNIA's Solid State Storage Performance Test Specification Enterprise v1.1

I do like the report writer functionality.
 
Reactions: William