Hi,
I've done a lot of testing lately with ZFS and dedup, and I got a result I was not expecting. First, to my knowledge, when you want to access ZFS through iSCSI, you have to create a volume in your pool (tank/myvol). When you define this volume, you have to specify the space needed. The way ZFS handles the size is by reserving the actual space from the pool without any consideration for dedup (which is normal, as it doesn't know what kind of dedup ratio it could get). That also means that you can't specify a size larger than the actual available space in the pool.
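For reference, here is roughly what that looks like on a ZFS system (the pool/volume names follow the tank/myvol example above; exact output varies by ZFS version):

```shell
# Create a 100 GB zvol for iSCSI export; ZFS reserves the full
# 100 GB from the pool (refreservation), regardless of whatever
# dedup ratio is achieved later.
zfs create -V 100G tank/myvol

# Both properties reflect the full reserved size.
zfs get volsize,refreservation tank/myvol

# Asking for more than the pool's free space fails with something like:
#   cannot create 'tank/myvol': out of space
```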
That said, I was expecting the volume to return the unused space to VMware, which would then report the free space accurately. That's not what is happening. Let's say you have a 100 GB volume as your datastore. If you create 3 clones of the same 30 GB VM in that datastore with dedup active, you would expect to use 90 GB, and that's indeed what the datastore reports. However, when you look at your volume or pool from ZFS's point of view, it says that you are using only about 30 GB. So the pool benefits from the dedup, but from the datastore's point of view the volume looks as if no dedup was done at all. So, if you try to clone a 4th image, VMware ends with an error saying there is no space left, since only 10 GB was remaining.
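You can see the two views side by side with something like the following (tank/myvol as above; the numbers are illustrative of the 3-clone scenario, not guaranteed output):

```shell
# Pool-side view: dedup collapses the three identical clones to
# roughly one copy's worth of allocated blocks, and the pool
# reports the achieved ratio (e.g. ALLOC ~30G, DEDUP ~3.00x).
zpool list tank

# Volume-side view: the zvol still carries its full reservation,
# which is what the iSCSI initiator (and thus VMFS) sees.
zfs get used,refreservation,dedup tank/myvol
```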
When sharing the same pool over NFS, the free space seems to be accurate. My guess is that VMware understands that NFS is shared storage it doesn't fully control, so it actually asks the NFS server to report free space and simply displays it, the NFS server being authoritative on this information. It is actually quite interesting, because the "Capacity" keeps getting bigger than the pool originally was: I assume VMware is counting the data it has written to the datastore and adding it to the remaining free space. That gives funny results when you clone the same VM 10 times on the same 50 GB datastore.
Is there any way to have an "expandable" iSCSI/FC LUN as a datastore, which would allow migrating more VMDKs onto it than would normally fit without dedup?
As side information, I did the same test with Starwind, and a virtual disk behaves the same way as a volume from ZFS would. There is one difference: you can create a virtual disk that is bigger than the physical pool. That costs a lot of memory, as it reserves upfront the memory needed to handle dedup for that whole volume, but it's kind of weird doing so because you have to guesstimate the capacity you will get out of the physical space.
I guess the way Starwind handles it is similar to thin provisioning, where you can create a volume bigger than the physical space. That can work, but it's not easy to manage. Thinking about that, I just checked, and ZFS allows the creation of a "sparse" volume that can be bigger than the pool. That would allow doing the same thing as Starwind. However, is that the only/best way of handling a datastore when dedup is used?
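For completeness, creating such a sparse zvol is just the -s flag (again a sketch; the 500G size is an arbitrary over-provisioned example, and actually filling it past the pool's real capacity will still fail at write time):

```shell
# -s makes the zvol sparse ("thin"): no refreservation is taken,
# so volsize may exceed the pool's free space.
zfs create -s -V 500G tank/sparsevol

# refreservation shows "none" for a sparse volume, so space is
# only consumed from the pool as blocks are actually written.
zfs get volsize,refreservation tank/sparsevol
```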
Thanks.