zfs ashift question


BoredSysadmin

Not affiliated with Maxell
Mar 2, 2019
So my FreeNAS box has two pools: one with 6 drives (2 x Z1 vdevs with 3 drives each, 512e/4k physical), and one with 12 x WD 2TB RE4 drives, which are 512n (all 12 drives in a single Z2 vdev - not ideal, but the idea was to maximize capacity) [checked with smartctl -a /dev/adaX].

Due to lack of knowledge, both pools were created with the default ashift of 12. I was able to check that today with the "zdb -U /data/zfs/zpool.cache" command.
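For reference, this is roughly what I ran (ada0 is just an example device):

# dump the pool config from the FreeNAS cache file and look at the ashift of each vdev
zdb -U /data/zfs/zpool.cache | grep ashift
# cross-check what the drive itself reports (logical/physical sector size)
smartctl -a /dev/ada0 | grep "Sector Size"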

I recently learned that the 512n RE4 drives should probably have been set up with ashift=9, and I also know that I can't change it now.

Since day one, write/delete performance of the second pool (the 12 RE4 disks) has been terrible. I guess I now want to know whether some (or all) of the blame could come down to the wrong ashift?

If I resilver and replace these drives one by one with 4k-native drives, should I expect a noticeable performance improvement?
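If I do go that route, my understanding is it would be one replacement at a time, waiting for each resilver to finish before moving on - something like this (pool and device names are made up):

# swap one old RE4 for the new 4Kn disk, then wait for the resilver to complete
zpool replace tank ada3 ada14
zpool status tank    # watch resilver progress before touching the next disk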

I also know that the best way of fixing it is to replace the pool with 4-8 new drives and be done with it, but that's not within the budget for now.
 

gea

Well-Known Member
Dec 31, 2010
DE
If you create a vdev with ashift=12 (4k) on 512n disks,

you have the disadvantages of
- wasting some space with small files (reduced capacity)
- possibly reduced performance with small files (the same as with 4k disks)

and the advantages of
- gaining some performance with larger files
- being able to replace those disks with newer 512e/4k disks (you cannot replace 512n disks with newer 4k-physical disks in an ashift=9 pool). If you replace 512n disks in an ashift=12 pool, your main advantage is that newer disks are simply faster than disks from 10 years ago.

So be happy; I would always use ashift=12, even with 512n disks.

If performance is bad with the old RE4s (I had a lot of them), it is because they are slow by today's expectations. More RAM may help, as may a multi-mirror pool. You may also check for single weak disks (a pool is only as fast as its slowest disk).
 

BoredSysadmin

Not affiliated with Maxell
Mar 2, 2019
Hmm, I see. I guess the problem lies somewhere else. Thanks. As usual, your answers @gea are awesome!
RAM isn't a problem - I have 32GB. I did disable sync (not a concern, as the data isn't critical and the box is UPS and generator protected) - that helped a lot with writes, but deletes are still painfully slow.
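For completeness, the sync change was just the standard property (dataset name is mine, adjust as needed):

# disable synchronous writes for the media dataset - only acceptable because the data is non-critical
zfs set sync=disabled tank/media
zfs get sync tank/media    # verify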
I wonder if I should abandon ZFS and try MergerFS and SnapRAID for my media files.
 

gea

Well-Known Member
Dec 31, 2010
DE
If you use an older filesystem instead of ZFS,
it has to process less data (no checksums) and has less fragmentation (no Copy on Write), which means less security but better performance, especially in low-RAM situations.

If you avoid realtime RAID (RAID-Zn, RAID 5/6), as with SnapRAID, you always have the performance of a single disk, while realtime RAID gives you n x the data disks. In a realtime RAID you spread data over the whole array, so IO matters, not only pure sequential performance. With the RE4s this may mean 100 MB/s per disk and 10 x 100 MB/s in a Z2 of 12 disks (ok, Z2 does not scale truly linearly, so maybe only 700-800 MB/s). If you have a weak disk with SnapRAID, it only affects the data on that disk; in a realtime RAID it affects the whole array.

For pure media streaming this is not relevant.
For a single video stream, much less than the 100 MB/s of a single disk is enough.

What you can check with a local performance test (example commands below):
- Is the Z2 pool sequentially between 500 and 800 MB/s, as it should be with 12 disks?

If not:
- Are the wait% and busy% iostat values of the disks in the ZFS pool roughly equal (they should be), or are one or more disks much worse (weak disks)?
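A simple local test, assuming a FreeNAS-style mount under /mnt and a scratch dataset you can create and destroy (names are only examples):

# sequential write/read test with compression off, so /dev/zero is not compressed away;
# use a size well above your 32GB RAM so the read-back is not served from ARC
zfs create -o compression=off tank/perftest
dd if=/dev/zero of=/mnt/tank/perftest/test.bin bs=1M count=65536    # ~64 GiB write
dd if=/mnt/tank/perftest/test.bin of=/dev/null bs=1M                # read it back
zfs destroy tank/perftest

# while the dd runs, watch whether all 12 disks are similarly busy or one lags behind
zpool iostat -v tank 5
gstat -p          # FreeBSD per-disk %busy; on Solarish, iostat -xn 5 shows wait%/busy%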

What always remains is the superior data security of ZFS. You can trust your data, which you cannot with older filesystems.
 

BoredSysadmin

Not affiliated with Maxell
Mar 2, 2019
"What you can check with a local performance test." ? Could you please point me to specific command if possible?

SnapRAID does include checksums, but as you correctly said, they aren't real-time; they are a scheduled post-write operation. SnapRAID will protect against bitrot just like ZFS, albeit with some caveats.
I'd probably continue to store more critical data on ZFS, but media files are far less critical, thus my interest in alternatives.
 

gea

Well-Known Member
Dec 31, 2010
DE
SnapRAID does not have checksum protection for current/new data, nor does it have crash protection comparable to Copy on Write.

What you can do is a backup-like sync run of your data, where the backup is checksum protected from the time of the run. If data on disk is already corrupted at that moment, the SnapRAID data is as well.
 

gb00s

Well-Known Member
Jul 25, 2018
Poland
I'm far from being a ZFS expert, but take note that a RAIDZ2 setup never was and is not a 'write' performance monster. Write performance doesn't scale with the HDD count; you simply have too many write penalties with RAIDZ2. Only the read performance increases. If you need good write performance, you may consider changing to a RAID10-style setup (striped mirrors), with all its disadvantages space-wise, but it can be extended pretty easily. We are using RAIDZ2 here only for storage used for read purposes. As @gea said, ashift=12 is always preferred.
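Roughly what that would look like (disk names are placeholders), and why it extends easily - you just add another mirror pair later:

# striped mirrors ("RAID10"): each mirror vdev adds iops, unlike one wide RAIDZ2 vdev
zpool create tank mirror da0 da1 mirror da2 da3 mirror da4 da5
# growing the pool later is just another pair
zpool add tank mirror da6 da7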
 

gea

Well-Known Member
Dec 31, 2010
DE
I'm far from being a ZFS expert, but take note that a RAIDZ2 setup never was and is not a 'write' performance monster. Write performance doesn't scale with the HDD count; you simply have too many write penalties with RAIDZ2. Only the read performance increases.
This is true for iops (random access), where a RAID-Zn always has the iops of a single disk. For sequential transfers, performance (read and write) goes up with the number of data disks, as every disk only has to read/write a part of the overall data.
 

BoredSysadmin

Not affiliated with Maxell
Mar 2, 2019
I'm far from being a ZFS expert, but take note that a RAIDZ2 setup never was and is not a 'write' performance monster. Write performance doesn't scale with the HDD count; you simply have too many write penalties with RAIDZ2. Only the read performance increases. If you need good write performance, you may consider changing to a RAID10-style setup (striped mirrors), with all its disadvantages space-wise, but it can be extended pretty easily. We are using RAIDZ2 here only for storage used for read purposes. As @gea said, ashift=12 is always preferred.
I'm not really looking for max performance, and I am well aware that Z2 has significant performance drawbacks. I could have gone with TWO vdevs of 6 drives each in Z2; I'd gain performance due to striping across two vdevs, but I'd lose total usable capacity. I made the choice to sacrifice performance for maximum capacity.
The issue is that delete operations on this volume are painfully slow. I am just trying to see if anything else is the culprit.
 

BoredSysadmin

Not affiliated with Maxell
Mar 2, 2019
SnapRAID does not have checksum protection for current/new data, nor does it have crash protection comparable to Copy on Write.

What you can do is a backup-like sync run of your data, where the backup is checksum protected from the time of the run. If data on disk is already corrupted at that moment, the SnapRAID data is as well.
Again, to my limited knowledge, SnapRAID writes NEW data without CoW and without calculating checksums in real time.
However, existing data WILL get its checksums calculated in a post-process. In the event of bitrot, SnapRAID will heal it, just like ZFS would for long-term stored data.
This is of course a less secure method than ZFS, and as I said I would still prefer ZFS for storing critical data, but for much less important data (read: easily replaceable) I don't see an issue here. There are pros and cons to both approaches; for example, in my case SnapRAID writes will be as fast as a single disk can go, which should be much faster than 12 old disks in Z2 (which I believe is roughly 0.6 to 0.7x the write speed of a single disk, or somewhere in that ballpark).
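For anyone curious, the SnapRAID model I have in mind is roughly this (paths and disk names are only examples), with sync/scrub run from cron rather than in real time:

# /etc/snapraid.conf - one parity disk, content files spread over the data disks
parity /mnt/parity1/snapraid.parity
content /mnt/disk1/snapraid.content
content /mnt/disk2/snapraid.content
data d1 /mnt/disk1/
data d2 /mnt/disk2/

# scheduled (e.g. nightly) instead of realtime:
snapraid sync          # update parity and checksums for new/changed files
snapraid scrub -p 5    # verify ~5% of the array against its checksums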