ZFS Snapshots Send/Recv & Compression

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,641
2,058
113
For those of you who use ZFS send/receive: what's the most efficient compression you've used on the snaps before sending them?

Just bzip and go, or is it worth actually testing different options here?

Is this still effective on a ZFS dataset that already uses lz4 or other compression?
 

T_Minus

Oracle Docs

If you need to store many copies, consider compressing a ZFS snapshot stream representation with the gzip command. For example:

# zfs send pool/fs@snap | gzip > backupfile.gz
 

Monoman

Active Member
Oct 16, 2013
410
160
43
nice! I will be doing the same tests myself. Once zfsend makes it into the GUI.
 

T_Minus

@niekbergboer thanks I'll look into that!

@Monoman it's literally 2 lines of code, one for the snap one for the send with piped ssh/receive for the other machine. If you're just sending it to another pool on the same system just output the file (gz, bz, etc) to the other pool location like the example above after creating the snap. Obviously if you want to do more complex things or transfer multiple zfs fs it's gonna take a few more rows or pipes.
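The "one line for the snap, one for the send" idea can be sketched roughly as below. This is a minimal sketch, not a tested backup script: the function names, the dataset `tank/data`, the host `backup-host`, and the file paths are all hypothetical placeholders, and the remote variant assumes key-based ssh and a user allowed to run `zfs receive` on the other side.

```shell
#!/bin/sh
# Sketch only: all function names, datasets, hosts and paths are made up.

# Snapshot a dataset and save the gzip-compressed stream to a file
# (for example, a file on another pool in the same box).
# $1 = dataset, $2 = snapshot name, $3 = output file
send_snapshot_to_file() {
    zfs snapshot "$1@$2" || return 1
    zfs send "$1@$2" | gzip > "$3"
}

# Restore a previously saved stream into a new dataset.
# $1 = compressed stream file, $2 = target dataset
restore_from_file() {
    gzip -dc "$1" | zfs receive "$2"
}

# Snapshot a dataset and pipe the stream straight into another machine.
# $1 = dataset, $2 = snapshot name, $3 = ssh host, $4 = target dataset
send_snapshot_to_host() {
    zfs snapshot "$1@$2" || return 1
    zfs send "$1@$2" | ssh "$3" zfs receive "$4"
}
```

Usage would look something like `send_snapshot_to_file tank/data nightly /backup/data-nightly.gz` or `send_snapshot_to_host tank/data nightly backup-host backup/data`.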

I told myself this time I was going to spend enough time with snap/send/receive to actually have a better understanding of how the fs stores the snaps, the utilization/effect on other files/snaps, what's most efficient since bandwidth is limited, etc... The Oracle pages and other examples/explanations make it easier to grasp.

I like to read different people's perspectives aside from the docs, so here are some I've found to be helpful:
Snap/Send/Receive
Fun with ZFS send and receive
Aaron Toponce : ZFS Administration, Part XIII- Sending and Receiving Filesystems
Sending and Receiving ZFS Data - Oracle Solaris ZFS Administration Guide
http://docs.oracle.com/cd/E19253-01/819-5461/gbinw/
http://docs.oracle.com/cd/E19253-01/819-5461/6n7ht6r4f/
All about Snaps: (send/recv at bottom)
http://docs.oracle.com/cd/E19253-01/819-5461/gavvx/index.html
This is an important piece of info:
"When a snapshot is created, its disk space is initially shared between the snapshot and the file system, and possibly with previous snapshots. As the file system changes, disk space that was previously shared becomes unique to the snapshot, and thus is counted in the snapshot's used property. Additionally, deleting snapshots can increase the amount of disk space unique to (and thus used by) other snapshots.

A snapshot's space referenced property value is the same as the file system's was when the snapshot was created."
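To see the `used` and `referenced` values from the quote above for your own snapshots, something like the following should work. It is a small sketch: `snapshot_space` is a hypothetical helper name and `tank/data` in the usage line is a placeholder dataset.

```shell
#!/bin/sh
# Sketch only: the helper name is hypothetical.
# List every snapshot under a dataset with the space unique to it (used)
# and the space it points at (referenced).
# $1 = dataset
snapshot_space() {
    zfs list -r -t snapshot -o name,used,referenced "$1"
}
```

For example, `snapshot_space tank/data`. Watching `used` change after deleting a neighbouring snapshot is a quick way to see the sharing behaviour the docs describe.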

General ZFS Commands
Sun - ZFS cheatsheet
19.3. zpool Administration
http://docs.oracle.com/cd/E19253-01/819-5461/6n7ht6r01/index.html
 

nitrobass24

Moderator
Dec 26, 2010
1,087
131
63
TX
If you are already compressing your dataset, then I doubt you will be successful compressing the snapshot (by anything significant let alone worth the CPU time/power/heat) since the underlying data would already be compressed.
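The general effect is easy to reproduce without ZFS at all. The sketch below is only an illustration: it uses repetitive text as a stand-in for compressible data and random bytes as a stand-in for data that is already compressed (JPEGs, archives and so on), then gzips both. The helper name `gz_size` is made up. One caveat worth checking on your own platform: as far as I know, on OpenZFS a plain `zfs send` stream carries the data uncompressed unless compressed send (`-c`) is available and used, so the dataset's lz4 setting may not carry over into the stream.

```shell
#!/bin/sh
# Demonstration only: compressible vs. already-dense data under gzip.
workdir=$(mktemp -d)

# Repetitive text stands in for uncompressed, compressible data.
yes "the quick brown fox jumps over the lazy dog" | head -c 1000000 > "$workdir/text.bin"

# Random bytes stand in for data that is already compressed.
head -c 1000000 /dev/urandom > "$workdir/random.bin"

# Print the size in bytes of a file after gzip compression.
gz_size() {
    gzip -c "$1" | wc -c | tr -d ' '
}

for f in text.bin random.bin; do
    echo "$f: 1000000 -> $(gz_size "$workdir/$f") bytes"
done
```

The text file should shrink to a tiny fraction of its size, while the random file should come out about the same size or slightly larger.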
 

T_Minus

@nitrobass24 that's what I was thinking, but since Oracle says to try it I'll try it and compare :) I'll post what I find a bit later today. And, doh! moment... this pool has compression=off so I can't compare/test for you, sorry :(
 

T_Minus

GZIP failed on my Napp-IT/Solaris with "gzip: stdout: Permission denied" after it wrote about 3GB of the gz

- GZIP and BZIP2 consistently consumed 4GHz while compressing the ZFS stream; this was about 50% of the CPU available to this 4-CPU VM


Switched to BZIP2 and it's already passed the 3GB line... it's taking a very long time to do it... I think it's due to the ZFS FS I wanted to test on. It's my smallest, but it's also my wife's data, which means likely 95% or more is pictures or docs... TONS of tiny files. BZip2 took approx 7 minutes and it's now done.

According to ZFS list of snapshots this snapshot is: 3.3GB
BZIP2 Size = 3.29GB
Compress Contents = 3.58GB
BZIP2 compressed file contains the ZFS Send file.

This is NOT a compressed pool by the way.

So basically BZIP did 0 for my wife's file contents, and still took around 10 minutes to accomplish it :)

If I could go from 3.3GB to 2.5GB that would be worth it. These are dedicated storage boxes, so I can afford the CPU utilization. HOWEVER, this makes me rethink using a very low CPU frequency and core count for any ZFS system that will compress snaps on send/recv. Alternatively, you could mitigate the CPU utilization by streaming the ZFS send to another ZFS implementation in a VM on a host with a lot of spare CPU cycles, gzipping it there on the fly, and moving it over to your backup from there. That adds network overhead, but a direct connection or a high-performance network takes care of that. Obviously this won't work for everyone, or for anyone with TBs of snaps to send/receive, but for a home setup it may let you keep a super-low-power storage system and use free resources on your VM host to compress your data better, if it matters.
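A simpler variant of the offloading idea is to skip the second ZFS install and just pipe the raw stream over ssh so the other machine does the gzip work. The sketch below assumes that variant; `offload_compress`, the host name and the paths are hypothetical, and the remote path must not contain single quotes.

```shell
#!/bin/sh
# Sketch only: names, host and paths are placeholders.
# Send a snapshot uncompressed over ssh and let the remote host spend
# the CPU cycles on gzip, writing the result to a file on that host.
# $1 = dataset@snapshot, $2 = ssh host, $3 = output path on the remote host
offload_compress() {
    zfs send "$1" | ssh "$2" "gzip -c > '$3'"
}
```

For example, `offload_compress tank/data@nightly vm-host /backup/data-nightly.gz`. The trade-off is the one described above: the stream crosses the network uncompressed, so this only pays off when the link is faster than the storage box can compress.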


I'm going to compress and send some of the other ZFS filesystems on this box and see if I can get any better compression with different file types, and re-run this specific one with --best to compare. Since it didn't even shrink 5% I'm guessing --best won't yield anything worthwhile, but it's worth a test :)
 

T_Minus

Well that was strange... maybe @gea can provide some input on this weird issue.

I renamed the compressed file by adding -default-bzip2 to the file name so I could compare it to the new bzip2 --best... I watched the directory as the new file was growing in size, and the other was still there.

10 minutes later I refresh and BOTH files are gone.

No error on command line, it just finished the send/compression like before.
Snapshot still exists, and no error creating ZFS send stream :(

When I check ZFS FS utilization this FS shows 6.xGB which would be exactly 2 of these compressed snaps... yet not visible to me anymore?

Any ideas?
 

T_Minus

Over SSH, if I go to where I put the compressed snaps I can see them with an ls -la, but in the Windows share they're both no longer visible... very odd... any ideas?

BTW: Both are same size.
 

T_Minus

Code:
total 6918603
drwxrwxrwx+ 3 root root          5 May 11 15:03 .
drwxr-x---  2 root sys           3 May 11 14:40 .$EXTEND
drwxr-xr-x+ 3 root root          3 May 11 14:40 ..
-rwxrwxrwx+ 1 root root 3538688015 May 11 14:54 wife-2017.05.11.01.02.02-default--bzip2.gz
-rwxrwxrwx+ 1 root root 3538688015 May 11 15:12 wife-2017.05.11.01.02.02.gz
I had no problem copying the original to my Windows desktop and decompressing it... then once I started the 2nd send stream to be compressed, both files disappeared from my network share, and now the 3rd file I'm creating isn't showing up either. Yet, as you can see above, they're there.

Looks like a Windows error.
I closed Explorer, reloaded it, and bam, all 3 are there again... (I did try F5/refreshing prior, too)
 

T_Minus

Second test on 7.4GB with bzip2 --best is reduced to 6.5GB.
Not a lot, and it took well over 10 minutes to compress.
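To put the two results so far on the same scale, the space saved can be worked out as a percentage. This is just arithmetic on the figures posted in this thread; `savings_pct` is a hypothetical helper name.

```shell
#!/bin/sh
# Percentage of space saved, given the size before and after compression.
# $1 = size before, $2 = size after (same unit for both)
savings_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (1 - after / before) * 100 }'
}

savings_pct 3.3 3.29   # first test: 3.3GB snapshot -> 3.29GB bzip2 file
savings_pct 7.4 6.5    # second test: 7.4GB -> 6.5GB with bzip2 --best
```

That works out to roughly 0.3% for the first dataset and about 12% for the second.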

Keep in mind these ZFS FS are not compressed to start. Obviously most of what I'm storing isn't too compressible, yet.

Next up I have 40GB of websites/backups; we'll see how much better that does :)
 

T_Minus

This is all done locally, from a raidz2 pool of 6 drives to a mirror vdev/pool of 2 drives, no network.
All times are local compression/copy time.
 

T_Minus

40GB still going
Most cores are under 25% ~ 10-20% while 1 core is around 45% utilization.

No usage other than the zfs send stream, compressing it and saving on another pool.
 

JustinH

Active Member
Jan 21, 2015
124
76
28
48
Singapore
Are you setting any of the compression flags on bzip etc?

It would be interesting if you could try lz/7zip. That does about 30% on my disk image files compared to bzip at its highest compression settings. (But pegs the CPU)
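For comparing tools and flags side by side, a small loop over a saved stream file can help. This is a rough sketch: `compare_compressors` is a made-up name, the list of tools and levels is just an example, and any tool that is not installed is skipped. One thing worth knowing when reading the results: according to the bzip2 man page, `--best` merely selects the default behaviour (the 900k block size), which would fit with the default and `--best` files above coming out the same size.

```shell
#!/bin/sh
# Sketch only: the function name and the tool list are examples.
# Print the compressed size of one input file under several compressors.
# $1 = input file (e.g. a saved "zfs send" stream)
compare_compressors() {
    input=$1
    original=$(wc -c < "$input" | tr -d ' ')
    echo "original: $original bytes"
    for spec in "gzip -6" "gzip -9" "bzip2 -9" "xz -6"; do
        tool=${spec%% *}
        # Skip compressors that are not installed on this system.
        command -v "$tool" >/dev/null 2>&1 || continue
        # $spec is left unquoted on purpose so it splits into tool and flag.
        size=$($spec -c < "$input" | wc -c | tr -d ' ')
        echo "$spec: $size bytes"
    done
}
```

To use it, save a stream once (for example `zfs send pool/fs@snap > /otherpool/stream.bin`) and run `compare_compressors /otherpool/stream.bin`; wrapping each run in `time` would add the CPU cost to the picture.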

