28× 1TB HDD ZFS pool layout benchmark test in PVE


thx0701tw

Member
Dec 2, 2016
Again, sorry for my bad English. We wrote this test up in Chinese last month.
CPU: Intel Xeon E5-2660 v4 × 2
RAM: 64 GB × 4 = 256 GB
HDD: Seagate 7200 RPM 1 TB × 28 hard drives
Motherboard model: X10DSC+
Case model: 6048R-E1CR60L
SAS card model: AOC-S3008L-L8-P (uses the LSI SAS 3008 chip)
SAS expander backplane model: BPN-SAS3-946SEL1
PVE 5.0

Summary

1. A single ZFS RAIDZ1 vdev performs slightly worse than striped RAIDZ1 vdevs (several RAIDZ1 vdevs combined in a RAID 0-like stripe).

2. Check the total bandwidth along the path bus -> SAS HBA -> SAS expander -> each hard disk.
We compared ZFS RAID 0 against mdadm RAID 0 for this.
Doing this test when you get a new machine is very important: first run a single-drive test on every disk, because HBA/HDD compatibility issues are a big problem. (A rough sketch of such a check follows this summary.)

3. fio test
ZFS RAM usage was restricted: the cache (ARC) was limited to 2 GB and the test files were larger than 2 GB, so the results are not served from RAM.
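
To make point 2 concrete, here is a rough sketch (not the script used in this test) of how such a check could be driven with fio from Python: benchmark every drive on its own first, then the assembled pool, and compare the totals. The device names, the pool file path and the fio parameters are assumptions for illustration only.

```python
#!/usr/bin/env python3
"""Rough sketch of the 'new machine' check from point 2: measure sequential
read speed of every single drive, then of the whole pool, and compare.
Device names, the pool file path and the fio parameters are placeholders."""
import json
import subprocess

DISKS = [f"/dev/sd{c}" for c in "bcde"]   # extend to all 28 drives
POOL_TEST_FILE = "/tank/fio-testfile"     # a file on the ZFS pool (placeholder)

def fio_seq_read(target: str, size: str = "4G", direct: int = 1) -> float:
    """Run a 60 s sequential read with fio and return the throughput in MB/s."""
    cmd = [
        "fio", "--name=seqread", f"--filename={target}", f"--size={size}",
        "--rw=read", "--bs=1M", f"--direct={direct}", "--ioengine=libaio",
        "--iodepth=16", "--runtime=60", "--time_based",
        "--output-format=json",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    job = json.loads(result.stdout)["jobs"][0]
    return job["read"]["bw"] / 1024        # fio reports bandwidth in KiB/s

if __name__ == "__main__":
    # 1) single-drive check: every disk should reach its raw sequential speed;
    #    one slow drive or an HBA/HDD compatibility problem shows up here.
    for disk in DISKS:
        print(f"{disk}: {fio_seq_read(disk):.0f} MB/s")

    # 2) whole-pool check: a total far below (number of disks x single-disk
    #    speed) points at the bus / HBA / expander, not at the drives.
    #    O_DIRECT may not be supported on ZFS, so run this one without it.
    print(f"pool: {fio_seq_read(POOL_TEST_FILE, direct=0):.0f} MB/s")
```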

Result



If you want to see more detail, please read the original Chinese write-up (Google Translate).


 

gea

Well-Known Member
Dec 31, 2010
Your sequential results seem heavily limited by your system, e.g. the expander (if you count at least 100 MB/s per disk, the raw sequential disk read/write values should be higher), especially given your large amount of RAM, which gives you a RAM-based write cache of 4 GB and a huge read cache for random reads and metadata reads.

You can see this especially in your write values. They are nearly the same regardless of the RAID layout (a 28-disk RAID 0 should have a raw read/write capacity of around 3 GB/s; you land at around 600 MB/s on writes, higher on reads due to caching).
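
A quick back-of-the-envelope check of those numbers, assuming roughly 100 MB/s of raw sequential throughput per 7200 RPM disk:

```python
# Expected vs. measured sequential throughput (rough estimate only).
disks = 28
per_disk_mb_s = 100                      # assumed raw sequential rate per disk
expected_raid0 = disks * per_disk_mb_s   # ~2800 MB/s for a 28-disk raid-0
measured_write = 600                     # roughly what the benchmark reached

print(f"expected raid-0 throughput : ~{expected_raid0} MB/s")
print(f"measured write throughput  : ~{measured_write} MB/s")
print(f"fraction actually reached  : {measured_write / expected_raid0:.0%}")
```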

Your IOPS values reflect cache quality more than disk/RAID quality with your benchmark. The raw IOPS of a single disk is around 100. This is also the raw value of a single RAID-Z vdev, as for every single read/write all heads in the vdev must be positioned. A 4-vdev RAID-Z2 is therefore limited to around 400 raw IOPS. On mirrors, the raw write IOPS scales with the number of mirror vdevs, and the raw read IOPS is about twice that, as you can read from both halves of each mirror. The huge RAM-based write cache and the efficiency of ZFS read caching are the key to your high values.
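
Written out as a small calculation (assuming ~100 raw IOPS per 7200 RPM disk and ignoring all caching):

```python
DISK_IOPS = 100   # assumed raw IOPS of a single 7200 RPM disk

def raidz_pool_iops(vdevs):
    """(read, write) raw IOPS of a pool of raid-Z vdevs:
    each vdev behaves like a single disk."""
    return vdevs * DISK_IOPS, vdevs * DISK_IOPS

def mirror_pool_iops(mirrors):
    """(read, write) raw IOPS of a pool of 2-way mirrors:
    reads can be served from both halves of every mirror."""
    return 2 * mirrors * DISK_IOPS, mirrors * DISK_IOPS

print("4x raid-Z2 vdevs  (read, write):", raidz_pool_iops(4))    # (400, 400)
print("14x 2-way mirrors (read, write):", mirror_pool_iops(14))  # (2800, 1400)
```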

So the basic question is whether you want to check the overall performance (disk subsystem plus RAM) or the disk subsystem only. For pure disk subsystem tests or RAID comparisons, set sync=always, which limits the effect of the RAM-based write cache to a minimum, and set primarycache=none to disable read caching.
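
For example, switching a pool between the two test modes could look like this (a minimal sketch; the pool name 'tank' is a placeholder):

```python
import subprocess

POOL = "tank"   # placeholder pool name

def zfs_set(prop: str, value: str) -> None:
    subprocess.run(["zfs", "set", f"{prop}={value}", POOL], check=True)

def disk_only_benchmark_mode() -> None:
    # Take RAM out of the picture: every write goes to disk before it is
    # acknowledged, and nothing is served from the ARC read cache.
    zfs_set("sync", "always")
    zfs_set("primarycache", "none")

def normal_mode() -> None:
    # Default behaviour: async writes are collected in the RAM write cache,
    # and the ARC caches data and metadata reads.
    zfs_set("sync", "standard")
    zfs_set("primarycache", "all")
```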

If you instead disable sync and enable primarycache, you can see the quality of ZFS in handling poor disk quality and in overcoming the performance disadvantages of ZFS compared to older filesystems: it must process more data due to the additional checksums, it must always read/write at least a complete ZFS datablock (no single-byte read/write option), and it is more affected by fragmentation due to the copy-on-write mechanism.

Then select your RAID/pool layout according to your expected real workload and the desired performance optimisation. Any benchmark can only give a hint; real load and benchmark load are often different.

For example, if this is a backup system, you can use a pool of two 14-disk RAID-Z2 vdevs; for a filer, a multi-RAID-Z2 pool with around 6-10 disks per vdev is optimal. If this is mainly for database or VM use, go with the 14-mirror option and add an Intel Optane 900P or better as Slog, and optionally an L2ARC with its read-ahead option enabled.
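
As an illustration only (placeholder pool and device names; a real system should use stable /dev/disk/by-id paths), the three layouts could be created roughly like this:

```python
# Build example `zpool create` command lines for the three layouts above.
disks = [f"disk{i:02d}" for i in range(28)]   # placeholder device names

# Backup box: 2 raid-Z2 vdevs of 14 disks each
backup = "zpool create tank " + " ".join(
    "raidz2 " + " ".join(group) for group in (disks[:14], disks[14:])
)

# Filer: 4 raid-Z2 vdevs of 7 disks each (within the 6-10 disk range)
filer = "zpool create tank " + " ".join(
    "raidz2 " + " ".join(disks[i:i + 7]) for i in range(0, 28, 7)
)

# VM / database: 14 two-way mirrors, plus an NVMe Slog and optional L2ARC
vm = "zpool create tank " + " ".join(
    "mirror " + " ".join(disks[i:i + 2]) for i in range(0, 28, 2)
) + " log nvme0n1 cache nvme1n1"

print(backup, filer, vm, sep="\n\n")
```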
 