48TB Raw, 18TB Formatted?


TeeJayHoward

I've got 16TB of data, and I'm almost out of space on 24x 2TB SAS drives. This doesn't seem right to me. I should have...

24 disks - 4 lost to the two raidz2 vdevs = 20 data disks @ 2TB = 40TB - 1/64th for ZFS = ~39.3TB
39.3TB - 1.1TB of snapshots = 38.2TB

So... 38.2TB, give or take. Instead, I've got less than half of that. Where did all my free space go?

Code:
[root@nas ~]# df -h|grep -v tmpfs|grep -v datastore|grep -v /dev
Filesystem               Size  Used Avail Use% Mounted on
pool                      18T   16T  2.1T  88% /pool

[root@nas ~]# zpool list pool
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
pool  43.5T  40.3T  3.19T    92%  1.00x  ONLINE  -

[root@nas ~]# zfs list pool
NAME   USED  AVAIL  REFER  MOUNTPOINT
pool  33.4T  2.08T  15.0T  /pool

[root@nas ~]# zpool status pool
  pool: pool
 state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
    still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
    pool will no longer be accessible on software that does not support
    feature flags.
  scan: scrub repaired 0 in 9h53m with 0 errors on Sat Mar 21 09:54:16 2015
config:

    NAME                              STATE     READ WRITE CKSUM
    pool                              ONLINE       0     0     0
      raidz2-0                        ONLINE       0     0     0
        scsi-35000c50034f36cff        ONLINE       0     0     0
        scsi-35000c50034eb58bb        ONLINE       0     0     0
        scsi-35000c50034f44577        ONLINE       0     0     0
        scsi-35000c50034e85e4b        ONLINE       0     0     0
        scsi-35000c50034f422b7        ONLINE       0     0     0
        scsi-35000c50034e85c3f        ONLINE       0     0     0
        scsi-35000c50040cf0c4f        ONLINE       0     0     0
        scsi-35000c500409ae567        ONLINE       0     0     0
        scsi-35000c500409946ff        ONLINE       0     0     0
        scsi-35000c5003c95a907        ONLINE       0     0     0
        scsi-35000c50034fbe17b        ONLINE       0     0     0
        scsi-35000c50034f3dfc7        ONLINE       0     0     0
      raidz2-1                        ONLINE       0     0     0
        scsi-35000c50034f3cc5f        ONLINE       0     0     0
        scsi-35000c50034f3e81f        ONLINE       0     0     0
        scsi-35000c50034ea0857        ONLINE       0     0     0
        scsi-35000c50034ff6167        ONLINE       0     0     0
        scsi-35000c50034f3decf        ONLINE       0     0     0
        scsi-35000c50034f421c7        ONLINE       0     0     0
        scsi-35000c50034f3daeb        ONLINE       0     0     0
        scsi-35000c50034ff1b8b        ONLINE       0     0     0
        scsi-35000c50034f42db7        ONLINE       0     0     0
        scsi-35000c50034f3d3ab        ONLINE       0     0     0
        scsi-35000c50034e011d3-part1  ONLINE       0     0     0
        scsi-35000c5003c95abdf        ONLINE       0     0     0

errors: No known data errors

[root@nas ~]# zfs list -t snapshot
NAME              USED  AVAIL  REFER  MOUNTPOINT
pool@2015-02-13  3.23G      -  14.3T  -
pool@2015-02-14  2.07G      -  14.3T  -
pool@2015-02-15  3.19G      -  14.3T  -
pool@2015-02-25  1.95G      -  14.4T  -
pool@2015-02-26  1.45G      -  14.4T  -
pool@2015-02-27  1.68G      -  14.4T  -
pool@2015-02-28  2.10G      -  14.4T  -
pool@2015-03-01  1.61G      -  14.4T  -
pool@2015-03-02  1.21G      -  14.4T  -
pool@2015-03-03  1.46G      -  14.4T  -
pool@2015-03-04  3.29G      -  14.5T  -
pool@2015-03-05  2.13G      -  14.5T  -
pool@2015-03-06  2.44G      -  15.5T  -
pool@2015-03-07  4.14G      -  15.5T  -
pool@2015-03-08  2.02G      -  14.6T  -
pool@2015-03-09  1.97G      -  14.6T  -
pool@2015-03-10  37.7G      -  14.7T  -
pool@2015-03-11  9.69G      -  14.7T  -
pool@2015-03-12  1.94G      -  14.8T  -
pool@2015-03-13  2.23G      -  14.7T  -
pool@2015-03-18  25.8G      -  14.8T  -
pool@2015-03-19   148G      -  14.9T  -
pool@2015-03-20  8.63M      -  14.8T  -
pool@2015-03-21  28.9M      -  15.2T  -
pool@2015-03-22  33.6M      -  15.5T  -
pool@2015-03-23  54.6M      -  15.6T  -
pool@2015-03-24   416G      -  15.5T  -
pool@2015-03-25   401G      -  5.86T  -
pool@2015-03-26      0      -  15.0T  -
 

markarr

Snapshots: they don't take up space, they subtract from your total space. I ran into that when using ZFS for a backup pool that was being replicated to another box; the snapshot total didn't add up to the amount that was missing on mine either. Someone more educated on ZFS can chime in with more detail, but the snapshots are killing your pool size.
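If you want to see exactly how much of the pool the snapshots are holding onto, the per-pool "used by" breakdown should show it - something along these lines (pool name taken from your output above):
Code:
# how much of USED is pinned by snapshots vs. live data vs. child datasets
zfs get usedbysnapshots,usedbydataset,usedbychildren pool
# the same breakdown as a single table
zfs list -o space -r pool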
 

T_Minus

Snapshots: they don't take up space, they subtract from your total space. I ran into that when using ZFS for a backup pool that was being replicated to another box; the snapshot total didn't add up to the amount that was missing on mine either. Someone more educated on ZFS can chime in with more detail, but the snapshots are killing your pool size.
That is GOOD to know ahead of time. I'd be scratching my head too.
 

TeeJayHoward

Someone more educated on ZFS can chime in with more detail, but the snapshots are killing your pool size.
1.1TB of deltas is removing 20TB of space? That's crazy! I can't understand ANYONE wanting to use snapshots if that were the case.
 

markarr

I think with ZFS there is much more data in the snapshots than just the deltas, due to all of the integrity checks that ZFS does. I think it keeps everything it would need to revert or remove itself (don't quote me on this, it's just my observation).

Short version: I don't think ZFS snapshots are meant for long-term use.
 

cperalt1

Another thing to check is how many copies you are keeping of each object. By default it should be set to 1, but if it is 2 or more that can also explain your size discrepancy:
zfs get copies pool

Also, do you have any other datasets under the pool "pool", such as a zvol or something else you didn't list? You only listed the root ZFS dataset:

zfs list
 

TeeJayHoward

Another thing to check is how many copies you are keeping of each object. By default it should be set to 1, but if it is 2 or more that can also explain your size discrepancy:
zfs get copies pool

Also, do you have any other datasets under the pool "pool", such as a zvol or something else you didn't list? You only listed the root ZFS dataset:

zfs list
Code:
[root@nas ~]# zfs get copies pool
NAME  PROPERTY  VALUE   SOURCE
pool  copies    1       default
[root@nas ~]# zfs list
NAME        USED  AVAIL  REFER  MOUNTPOINT
datastore  63.0G   874G  63.0G  /datastore
pool       33.4T  2.08T  15.0T  /pool
[root@nas ~]#
Nope. What makes this even more odd is that my backup system, which rsyncs nightly, doesn't have this issue. Same number of snapshots, same data... 10TB extra space.

Code:
[root@backup ~]# df -h|grep -v tmpfs|grep -v /dev
Filesystem               Size  Used Avail Use% Mounted on
backup                    26T   16T   11T  60% /backup
[root@backup ~]# zfs list -t snapshot
NAME                USED  AVAIL  REFER  MOUNTPOINT
backup@2015-02-25      0      -   575K  -
backup@2015-02-26      0      -   575K  -
backup@2015-02-27      0      -   575K  -
backup@2015-02-28      0      -   575K  -
backup@2015-03-01      0      -   575K  -
backup@2015-03-02      0      -   575K  -
backup@2015-03-03      0      -   575K  -
backup@2015-03-04      0      -   575K  -
backup@2015-03-05  1.34T      -  3.13T  -
backup@2015-03-06   578G      -  7.96T  -
backup@2015-03-07  1.67T      -  12.8T  -
backup@2015-03-08   658G      -  16.1T  -
backup@2015-03-09   641G      -  16.2T  -
backup@2015-03-10   657G      -  16.2T  -
backup@2015-03-18   103G      -  17.4T  -
backup@2015-03-19  67.1K      -  17.4T  -
backup@2015-03-20  67.1K      -  17.4T  -
backup@2015-03-21  73.1G      -  14.8T  -
backup@2015-03-22   417G      -  15.1T  -
backup@2015-03-23   417G      -  15.1T  -
backup@2015-03-24   417G      -  15.1T  -
backup@2015-03-25      0      -  15.1T  -
backup@2015-03-26      0      -  15.1T  -
[root@backup ~]# zpool status backup
  pool: backup
state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on software that does not support
        feature flags.
  scan: scrub repaired 0 in 19h18m with 0 errors on Sat Mar 21 19:19:31 2015
config:

        NAME                        STATE     READ WRITE CKSUM
        backup                      ONLINE       0     0     0
          raidz2-0                  ONLINE       0     0     0
            scsi-35000c50034f447c7  ONLINE       0     0     0
            scsi-35000c50034d58277  ONLINE       0     0     0
            scsi-35000c50034e7a243  ONLINE       0     0     0
            scsi-35000c50034f37e2f  ONLINE       0     0     0
            scsi-35000c50034e7b3af  ONLINE       0     0     0
            scsi-35000c50034f435c7  ONLINE       0     0     0
            scsi-35000c50034ff4993  ONLINE       0     0     0
            scsi-35000c50034f41723  ONLINE       0     0     0
            scsi-35000c50034ff2b87  ONLINE       0     0     0
            scsi-35000c50034f76637  ONLINE       0     0     0
            scsi-35000c50034e9a92b  ONLINE       0     0     0
            scsi-35000c50034fbc2d7  ONLINE       0     0     0
          raidz2-1                  ONLINE       0     0     0
            scsi-35000c50034edb303  ONLINE       0     0     0
            scsi-35000c50034ec07eb  ONLINE       0     0     0
            scsi-35000c50034f3718b  ONLINE       0     0     0
            scsi-35000c50034f4426b  ONLINE       0     0     0
            scsi-35000c50034f38adf  ONLINE       0     0     0
            scsi-35000c50034ff5eb3  ONLINE       0     0     0
            scsi-35000c50034f4467b  ONLINE       0     0     0
            scsi-35000c50034eb7d9f  ONLINE       0     0     0
            scsi-35000c50034f3dd6b  ONLINE       0     0     0
            scsi-35000c50034f3df0b  ONLINE       0     0     0
            scsi-35000c50034f43aeb  ONLINE       0     0     0
            scsi-35000c50034f39007  ONLINE       0     0     0

errors: No known data errors
 

TeeJayHoward

Code:
[root@nas ~]# zfs list -o space -r pool
NAME  AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
pool  2.08T  33.4T     18.3T   15.0T              0      4.07G
That's definitely a command that's going into my toolbox. Sure enough, the snapshots are the issue. Looks like I just need to be careful about storage for the next 30 days.

edit: It's great to know what's wrong, but now I'm curious as to the "why" of it all. Why does zfs list -t snapshot show a different number than zfs list -o space -r pool?

Code:
[root@nas ~]# zfs list -o space -r pool
NAME  AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
pool  2.08T  33.4T     18.3T   15.0T              0      4.07G

[root@nas ~]# zfs list -t snapshot -o space -r pool
NAME             AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
pool@2015-02-13      -  3.23G         -       -              -          -
pool@2015-02-14      -  2.07G         -       -              -          -
pool@2015-02-15      -  3.19G         -       -              -          -
pool@2015-02-25      -  1.95G         -       -              -          -
pool@2015-02-26      -  1.45G         -       -              -          -
pool@2015-02-27      -  1.68G         -       -              -          -
pool@2015-02-28      -  2.10G         -       -              -          -
pool@2015-03-01      -  1.61G         -       -              -          -
pool@2015-03-02      -  1.21G         -       -              -          -
pool@2015-03-03      -  1.46G         -       -              -          -
pool@2015-03-04      -  3.29G         -       -              -          -
pool@2015-03-05      -  2.13G         -       -              -          -
pool@2015-03-06      -  2.44G         -       -              -          -
pool@2015-03-07      -  4.14G         -       -              -          -
pool@2015-03-08      -  2.02G         -       -              -          -
pool@2015-03-09      -  1.97G         -       -              -          -
pool@2015-03-10      -  37.7G         -       -              -          -
pool@2015-03-11      -  9.69G         -       -              -          -
pool@2015-03-12      -  1.94G         -       -              -          -
pool@2015-03-13      -  2.23G         -       -              -          -
pool@2015-03-18      -  25.8G         -       -              -          -
pool@2015-03-19      -   148G         -       -              -          -
pool@2015-03-20      -  8.63M         -       -              -          -
pool@2015-03-21      -  28.9M         -       -              -          -
pool@2015-03-22      -  33.6M         -       -              -          -
pool@2015-03-23      -  54.6M         -       -              -          -
pool@2015-03-24      -   416G         -       -              -          -
pool@2015-03-25      -   401G         -       -              -          -
pool@2015-03-26      -   313K         -       -              -          -
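For what it's worth, the difference seems to be this: each snapshot's USED column only counts blocks unique to that one snapshot (roughly what you would get back by destroying just it), while the dataset's USEDSNAP also counts blocks shared between two or more snapshots - space that only comes back once every snapshot referencing it is gone. A throwaway file-backed pool shows the effect; the pool name "demo", paths, and sizes below are just placeholders:
Code:
# scratch pool backed by a sparse file (purely for illustration)
truncate -s 2G /tmp/demo.img
zpool create demo /tmp/demo.img

# write some data, then take two snapshots of the same state
dd if=/dev/urandom of=/demo/blob bs=1M count=500
zfs snapshot demo@a
zfs snapshot demo@b

# delete the file: its blocks are now pinned by BOTH snapshots
rm /demo/blob
sync

# each snapshot's own USED stays tiny (destroying just one would free almost nothing)...
zfs list -t snapshot -r demo
# ...but USEDSNAP on the dataset shows the full ~500M they pin together
zfs list -o space -r demo

# clean up
zpool destroy demo
rm /tmp/demo.img
With a month of dailies all referencing the same churned data, nearly everything ends up in USEDSNAP without showing up against any single snapshot's USED - which looks like exactly what the listing above shows.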
 

TeeJayHoward

Another tool you might want to try is
zfs diff pool@snap1 pool@snap2

to see the differences between the snaps.

Overview of ZFS Snapshots - Oracle Solaris ZFS Administration Guide
Good lord, every single file I have is modified in my latest snapshot.

...Wait, I guess that makes sense. I deleted them, then copied them back from a previous snapshot. That would update the file creation time. Now the question is: does ZFS think each one is a completely new file, and so keep two copies of the same data, or does it compare the checksums of the two and avoid storing the data again since it's the same?
 

cperalt1

Just asked on IRC #smartos, and yes, that is what you are seeing: the files now have new checksums. Are you the one from the other thread who blew away the pool due to the xattr=sa setting? The proper way to avoid using twice the blocks, accounting-wise, would have been to clone the old snap, sync in the latest changes from the current dataset, then promote the clone and rename it to put everything back in place.
 

TeeJayHoward

Just asked on IRC #smartos, and yes, that is what you are seeing: the files now have new checksums. Are you the one from the other thread who blew away the pool due to the xattr=sa setting? The proper way to avoid using twice the blocks, accounting-wise, would have been to clone the old snap, sync in the latest changes from the current dataset, then promote the clone and rename it to put everything back in place.
Sure enough, that's me. I understand the concept behind what you're saying, but not the application. Could you walk me through how to do that?

What I'm thinking:
============================================
Clone the old snap: zfs clone pool@2015-03-24 pool/clone
Sync in changes: ???
Promote clone: zfs promote pool/clone
Rename: zfs rename pool/clone pool
 

gea

ZFS is a copy-on-write filesystem, which means that every edit or copy creates new data blocks. The former data version - even a delete - is frozen at the data-block level whenever you create a snap; otherwise ZFS would be free to overwrite the old data blocks.

So a snap does not consume space at creation time, but it freezes the delta against the former state, which costs you capacity for every modification you make after the snap (if you want previous versions of a file, you must give ZFS the space to keep that data).

If you enable dedup, only a single copy of a data block (identified by its checksum) is kept on disk (it works pool-wide) - but as ZFS dedup is realtime dedup, you need a lot of RAM if you enable it. Mostly this is not recommended.
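You can watch that happen on a throwaway file-backed pool - a minimal sketch, with made-up names and sizes:
Code:
# scratch pool purely for illustration
truncate -s 2G /tmp/cow.img
zpool create cow /tmp/cow.img
dd if=/dev/urandom of=/cow/file bs=1M count=200

zfs snapshot cow@before
zfs get usedbysnapshots cow      # ~0 right after the snap is created

# overwrite the file: the old blocks can no longer be recycled,
# so they are charged to the snapshot
dd if=/dev/urandom of=/cow/file bs=1M count=200
sync
zfs get usedbysnapshots cow      # now roughly 200M

zpool destroy cow
rm /tmp/cow.img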
 

TeeJayHoward

If you enable dedup, only a single copy of a data block (identified by its checksum) is kept on disk (it works pool-wide) - but as ZFS dedup is realtime dedup, you need a lot of RAM if you enable it. Mostly this is not recommended.
So if dedup is enabled, ZFS will do checksum comparisons on files rather than assuming that it's a new file because the timestamp changed? How does it store the timestamp difference? Or is all that metadata the reason that dedup needs so much RAM?
 

gea

If you write a data block to disk with dedup enabled, ZFS checks whether a block with that checksum already exists in your pool, to decide if a fast reference or a full copy is needed.

This means that if you cannot keep the dedup table in RAM (so it has to be read from disk), you take a massive performance hit (a snapshot destroy can take a week or so).
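If you are curious how big that table would be for an existing pool, you can get a rough idea without actually turning dedup on - a hedged aside: the exact output varies by zfs version, and the ~320 bytes per unique block is only a commonly quoted ballpark, not a guarantee:
Code:
# simulate dedup on the existing data: prints a DDT histogram and an
# overall "dedup = X.XX" ratio at the end (read-only, but it can take a
# long time and hammer the disks on a pool this size)
zdb -S pool

# rough in-core DDT size ~= (total unique blocks from the histogram) x ~320 bytes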
 

cperalt1

Sure enough, that's me. I understand the concept behind what you're saying, but not the application. Could you walk me through how to do that?

What I'm thinking:
============================================
Clone the old snap: zfs clone pool@2015-03-24 pool/clone
Sync in changes: ???
Promote clone: zfs promote pool/clone
Rename: zfs rename pool/clone pool
Sync in changes as in rsync from the current pool to the clone - but you will still have the original issue you had regarding xattrs. I guess the real question you have to ask yourself is how important those snapshots are, given that you also have them on your backup pool. What is your retention policy? For example, I use the ZFS attribute com.sun:auto-snapshot=true on a per-dataset basis, and as I am on Linux this utilizes zfsonlinux/zfs-auto-snapshot · GitHub to create frequent, hourly, daily, weekly, and monthly snaps, pruning them as it goes along. That can help corral your snapshot space usage.
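For what it's worth, a hedged sketch of how that clone/rsync/promote sequence might look, with made-up names - it assumes the data lives in a child dataset (pool/data here) rather than directly on the pool's root dataset, since you cannot rename a filesystem over the root, and pool/data@2015-03-24 just stands in for whichever snap you would go back to:
Code:
# writable clone of the known-good snapshot
zfs clone pool/data@2015-03-24 pool/restore

# sync in whatever changed since that snap (this is the rsync step;
# -aHAX keeps hard links, ACLs and xattrs, for what that is worth here)
rsync -aHAX --delete /pool/data/ /pool/restore/

# make the clone independent of its origin
zfs promote pool/restore

# swap the names so everything ends up mounted where it used to be
zfs rename pool/data pool/data.old
zfs rename pool/restore pool/data
zfs destroy -r pool/data.old      # only once you are happy with the result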
 

TeeJayHoward

Sync in changes as in rsync from the current pool to the clone - but you will still have the original issue you had regarding xattrs.
Aah, never mind then. The whole point of the copy was to get rid of the xattr issue.

As for my snapshots, I suppose they're not all that important. They'll fall off after 30 days anyway... I'll just blow 'em away now.
Code:
[root@nas Computer]# for x in {01..26};do zfs destroy pool@2015-03-$x;done
[root@nas Computer]# zfs list -t snapshot
no datasets available
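(In hindsight, a dry run first would have shown how much space that was about to free - a hedged aside, and the @first%last range syntax needs a reasonably recent zfs:)
Code:
# -n = dry run, -v = report what would be destroyed and how much space reclaimed
zfs destroy -nv pool@2015-02-13%2015-03-26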
And then I thought it'd be fun to watch it free up the disk space... So I went a little stupid with a while loop:
Code:
[root@nas Computer]# echo "NAME  AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD";while true; do zfs list -o space -r pool|grep -v NAME;sleep 5;done
NAME  AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
pool  12.3T  23.2T         0   15.0T              0      8.14T
pool  12.5T  23.0T         0   15.0T              0      7.93T
pool  12.7T  22.8T         0   15.0T              0      7.72T
pool  12.9T  22.6T         0   15.0T              0      7.52T
pool  13.1T  22.4T         0   15.0T              0      7.34T
pool  13.3T  22.2T         0   15.0T              0      7.16T
pool  13.4T  22.0T         0   15.0T              0      6.98T
pool  13.6T  21.8T         0   15.0T              0      6.79T
pool  13.8T  21.6T         0   15.0T              0      6.59T
pool  14.0T  21.4T         0   15.0T              0      6.40T
pool  14.2T  21.3T         0   15.0T              0      6.21T
pool  14.4T  21.1T         0   15.0T              0      6.02T
pool  14.6T  20.9T         0   15.0T              0      5.81T
pool  14.8T  20.7T         0   15.0T              0      5.62T
pool  15.0T  20.5T         0   15.0T              0      5.42T
pool  15.2T  20.3T         0   15.0T              0      5.23T
pool  15.4T  20.1T         0   15.0T              0      5.04T
pool  15.6T  19.9T         0   15.0T              0      4.83T
pool  15.8T  19.7T         0   15.0T              0      4.63T
pool  16.0T  19.5T         0   15.0T              0      4.45T
pool  16.2T  19.3T         0   15.0T              0      4.26T
pool  16.3T  19.1T         0   15.0T              0      4.07T
pool  16.5T  18.9T         0   15.0T              0      3.88T
pool  16.7T  18.7T         0   15.0T              0      3.69T
pool  16.9T  18.5T         0   15.0T              0      3.50T
pool  17.1T  18.4T         0   15.0T              0      3.31T
pool  17.3T  18.2T         0   15.0T              0      3.11T
pool  17.5T  18.0T         0   15.0T              0      2.93T
pool  17.7T  17.8T         0   15.0T              0      2.76T
pool  17.8T  17.6T         0   15.0T              0      2.60T
pool  18.0T  17.5T         0   15.0T              0      2.42T
pool  18.2T  17.3T         0   15.0T              0      2.22T
pool  18.4T  17.1T         0   15.0T              0      2.03T
pool  18.6T  16.9T         0   15.0T              0      1.84T
pool  18.8T  16.7T         0   15.0T              0      1.64T
pool  19.0T  16.5T         0   15.0T              0      1.45T
pool  19.1T  16.3T         0   15.0T              0      1.27T
pool  19.3T  16.1T         0   15.0T              0      1.08T
pool  19.5T  16.0T         0   15.0T              0       926G
pool  19.7T  15.8T         0   15.0T              0       739G
pool  19.9T  15.6T         0   15.0T              0       537G
pool  20.1T  15.4T         0   15.0T              0       323G
pool  20.3T  15.2T         0   15.0T              0       125G
pool  20.4T  15.0T         0   15.0T              0       227M
pool  20.4T  15.0T         0   15.0T              0       227M
 