The other thing about dedup ON HARD DRIVES is that it intrinsically causes fragmentation. So even if you have enough RAM for the DDTs (dedup tables), your performance will tank over time as the filesystem fragments; the seeks will bite you in the ass. This is not a problem with SSDs, as seeks are "free", but it still has a subtle effect: multiple reads have to be issued instead of one large contiguous one. To see how the fragmentation occurs, consider the following scenario:
- create a VM and write out its backing file A
- create a similar VM and write out its backing file B; B is almost the same as A except for a few blocks. Great dedup: you get two files for the price of one, except for the few differing blocks. But when reading B you are really reading A except where the blocks differ: you seek way over there to get the first of the different blocks, then seek back to where you were in A, then seek way over there to get the next of B's different blocks, then seek back to A... You get the picture
- rinse and repeat for all your VMs
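The seek pattern above can be sketched with a toy model. Everything here is made up for illustration (block addresses, the 100-block file size, the three changed blocks); it just counts how many non-contiguous jumps the disk head has to make when B's shared blocks live in A's region and B's unique blocks live far away:

```python
# Toy model of reading a deduped file B whose shared blocks point
# into file A's region on disk. All numbers are illustrative.

def read_addresses(n_blocks, changed, a_start=0, b_start=1_000_000):
    """On-disk address read for each logical block of B.

    Shared blocks dedup against A and live at a_start + i;
    changed blocks were written later, far away at b_start + i.
    """
    return [b_start + i if i in changed else a_start + i
            for i in range(n_blocks)]

def count_seeks(addresses):
    """A 'seek' is any jump that isn't the next contiguous block."""
    return sum(1 for prev, cur in zip(addresses, addresses[1:])
               if cur != prev + 1)

contiguous = read_addresses(100, changed=set())         # no dedup
deduped    = read_addresses(100, changed={10, 50, 90})  # 3 blocks differ

print(count_seeks(contiguous))  # 0: one long sequential read
print(count_seeks(deduped))     # 6: each changed block costs a seek out and back
```

Each differing block costs two seeks (out to B's region and back to A's), so even a file that dedups 97% reads far worse than a contiguous one on spinning disks.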
Now combine that with the COW nature of ZFS and you can see that a file's blocks can end up scattered all over the disk, and this only gets worse over time. Dedup is a great win for a few use cases, but not many. I'll bet that most dedup'd pools eventually get converted back to non-deduped ones, at great inconvenience, since this can't be done in place.
If you want to play with dedup do it on a disposable pool.