ZFS build for VM storage. SSD or HDD pool advice.

legen

Active Member
Mar 6, 2013
208
35
28
Sweden
Do you guys see any problem with running an SSD pool alongside a slower, ordinary HDD pool?

I'm thinking about possible issues with the ARC when it's shared between a smaller SSD pool and a much larger HDD pool. As before, the HDD pool might have an SSD log device attached.
 

dswartz

Active Member
Jul 14, 2011
393
33
28
Keep in mind an SSD will have such low latency that you could disable ARC caching for it, or maybe just cache metadata. Not sure what you mean by SSD log device - that plays no part in ARC caching...
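
For reference, the ZFS knob for this is the per-dataset `primarycache` property, which accepts `all`, `none` or `metadata`. A minimal sketch, assuming hypothetical dataset names `ssdpool/vms` and `hddpool/vms` (not from this thread); it only builds the `zfs set` commands and does not run them:

```python
# Sketch only: builds `zfs set` command lines, it does not execute them.
# The dataset names below are hypothetical placeholders.

VALID_PRIMARYCACHE = {"all", "none", "metadata"}


def primarycache_cmd(dataset: str, mode: str) -> list[str]:
    """Return the argv that sets the ARC caching policy for one dataset."""
    if mode not in VALID_PRIMARYCACHE:
        raise ValueError(f"primarycache must be one of {sorted(VALID_PRIMARYCACHE)}")
    return ["zfs", "set", f"primarycache={mode}", dataset]


# Cache only metadata for the low-latency SSD pool, everything for the HDD pool.
ssd_cmd = primarycache_cmd("ssdpool/vms", "metadata")
hdd_cmd = primarycache_cmd("hddpool/vms", "all")

print(" ".join(ssd_cmd))
print(" ".join(hdd_cmd))
```

Whether metadata-only caching actually helps depends on the workload, so it is worth measuring before and after.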
 

legen

Active Member
Mar 6, 2013
208
35
28
Sweden
I meant possibly using an SSD ZIL (log device) for the HDD pool :). As I understand it, the log device is attached to a pool. Thus the SSD pool will need nothing added, while I can add a ZIL device to the HDD pool if needed.
 

ColdCanuck

Member
Jul 23, 2013
33
3
8
Halifax NS
The other thing about dedup ON HARD DRIVES is that it intrinsically causes fragmentation. So even if you have enough RAM for the DDTs, your performance will tank over time as the filesystem fragments; the seeks will bite you in the ass. This is not a problem with SSDs as seeks are "free" but it still has a subtle effect as multiple reads have to be issued instead of one large contiguous one. To see how the fragmentation occurs consider the following scenario:

- create a VM and write out its backing file A

- create a similar VM and write its backing file B ; B is almost the same as A except for a few blocks. Great dedup, get two files for the price of one except for the few blocks. But when reading B you read A except where the blocks are different, then you seek way over there to get the first of the different blocks, then seek back to where you were in A, then seek way over there to get the next of B's different blocks, then seek back to A..... You get the picture

- rinse and repeat for all your VMs

Now combine that with the COW nature of ZFS and you can see that you might have a file's blocks scattered all over the disk, and this will get worse over time. Dedup is a great win for a few use cases, but not many. I'll bet that most dedup'd pools are eventually converted back to non-deduped ones, and at great inconvenience, as this can't be done in place.
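
The A/B scenario above can be put into a toy model. This is a rough sketch under simplifying assumptions that are not from the thread: file A sits contiguously at block addresses 0..n-1, B's differing blocks are appended right after A, and a "seek" is any step where the next block is not physically adjacent to the previous one. It ignores COW rewrites, caching and prefetch.

```python
def layout_b(n_blocks, differing):
    """Physical addresses of file B's blocks when B is deduped against A.

    Assumes A occupies addresses 0..n_blocks-1 and that B's differing
    blocks are written one after another directly behind A.
    """
    differing = set(differing)
    next_free = n_blocks
    addrs = []
    for i in range(n_blocks):
        if i in differing:
            addrs.append(next_free)  # unique block, stored "way over there"
            next_free += 1
        else:
            addrs.append(i)          # shared block, read from A's copy
    return addrs


def count_seeks(addrs):
    """Count steps of a sequential read that land on a non-adjacent block."""
    return sum(1 for prev, cur in zip(addrs, addrs[1:]) if cur != prev + 1)


n = 1000
deduped = layout_b(n, differing=[100, 400, 700])
no_dedup = list(range(n, 2 * n))  # B stored whole, contiguously after A

print(count_seeks(deduped))   # 6: one seek out and one back per differing block
print(count_seeks(no_dedup))  # 0
```

In this model every differing block costs two seeks on a sequential read of B, which is the back-and-forth pattern described above.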

If you want to play with dedup do it on a disposable pool.
 

legen

Active Member
Mar 6, 2013
208
35
28
Sweden
Great points. In theory, I actually think the dedup fragmentation could increase performance on our SSD pool. A file's blocks will, to a greater extent, be spread across different NAND chips inside the SSD (instead of residing on only one). Reads of the file's pieces would then be served by several NAND chips and not be limited to the speed of a single chip. This is highly theoretical and I have no proof that it holds in the real world.

I have written a build log in my other thread here. In our step 2 we will build the first SSD pool, and in step 3 another one plus an HDD pool. This gives us a great opportunity to test whatever we want in step 2. If we see that "darn, dedup ruined everything" in step 3, we can easily migrate the data to either of our newly added pools.

So my plan is to begin with dedup on (given that our testing shows that we want it). Then if all goes haywire we migrate to a pool without dedup in step 3 :)
 

legen

Active Member
Mar 6, 2013
208
35
28
Sweden
Question
I was looking into SSD prices and noticed that the Samsung PRO 512GB costs almost the same as a Crucial M500 960GB SSD.

According to the AnandTech review, "The Crucial/Micron M500 Review (960GB, 480GB, 240GB, 120GB)", the Samsung has better performance, BUT not by that much. The Crucial M500's price per gigabyte is almost half that of the Samsung, for a little less performance. The Crucial also has some "enterprise" features like a power-loss capacitor and their Redundant Array of Independent NAND (RAIN) technology; see the HardOCP "Crucial M500 480GB SSD Review".

Has anyone used the Crucial M500 960GB with ZFS? Are the capacitor and their RAIN tech a plus or a minus?
Any other pros/cons?
 

lmk

Member
Dec 11, 2013
128
20
18
capacitor = good
rain = good

performance = slightly worse, but other than benchmarks/synthetic tests it really may not be an appreciable difference

If you can get 2x the capacity with the M500 over the Samsung, for the same price, I would get the M500.

Oh, and the M500 is already overprovisioned: 960GB usable out of 1024GB (whereas the Samsung exposes 512GB rather than 480GB). That is another way of extending the life of your NAND/SSD. Yes, it is possible Samsung has hidden spare NAND to compensate, but I don't think it would be the usual 7%-10%.
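
To put rough numbers on that, here is a small sketch of the arithmetic. The 1024GB raw figure is the one quoted above; the price is a made-up placeholder, since the thread only says the two drives cost about the same.

```python
def spare_fraction(raw_gb, usable_gb):
    """Fraction of the raw NAND held back as spare area."""
    return (raw_gb - usable_gb) / raw_gb


def cost_per_gb(price, usable_gb):
    """Price divided by user-visible capacity."""
    return price / usable_gb


# M500: 960GB usable, assuming 1024GB of raw NAND as stated in the post.
m500_spare = spare_fraction(1024, 960)

# Hypothetical identical price for both drives; only the ratio matters here.
price = 600.0
ratio = cost_per_gb(price, 960) / cost_per_gb(price, 512)

print(f"M500 spare area: {m500_spare:.2%}")        # 6.25%
print(f"M500 $/GB relative to Samsung: {ratio:.2f}")  # 0.53
```

So at equal price the M500 comes out at a bit over half the cost per gigabyte, with about 6% of its raw capacity reserved as spare area under these assumptions.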
 

legen

Active Member
Mar 6, 2013
208
35
28
Sweden
After reading some more I agree with you. Capacitor + RAIN = good.
Some sources, if someone wants to read more:
Capacitor, see first answer.
RAIN

The only problem here is that I do not know how (or in what way) this will affect ZFS running on top, given that RAIN, for example, adds parity bits in the NAND itself. I would think that ZFS doesn't care about what the underlying storage devices do internally, but I am not sure....
 

ColdCanuck

Member
Jul 23, 2013
33
3
8
Halifax NS
You are overthinking the RAIN bit. Conceptually it is no different from the ECC or spare blocks on a spinning disk. That is, it is invisible to anything outside the device's firmware. In other words, ignore it; the OS already does.


As an aside, RAIN is a great marketing gimmick: it provides no benefit over simple overprovisioning and good ECC, which all modern SSDs provide. But it does have a nice name, and perhaps it will get my "whites whiter than white".


Power loss caps are great if you are in the habit of yanking the plug on your computer to turn it off. I would assume that for a backup server a UPS and orderly shutdowns would be the norm, so again not much benefit IMO. However if it's free then go for it.
 

caveat lector

New Member
Jan 4, 2014
22
0
1
Oregon, USA
... And as usual I have to put in a plug for my own favorite VM architecture for a lab or small network: locally attached non-RAID SSD drives for VM storage, plus block-level incremental replication every few minutes to a separate server. Locally attached SSD means great IOPS, which VMs love. Giving up RAID reduces costs dramatically, while the very frequent replication provides protection against disaster with a maximum data loss window of only a few minutes. Using replication as the DR strategy also means that you automatically maintain very frequent snapshots on the replication server, protecting against human error as well.
Do you replicate with rsync?
 

legen

Active Member
Mar 6, 2013
208
35
28
Sweden
I might have been somewhat misled by the RAIN marketing paper :). We have a UPS, so the capacitors might not be needed. Then my only remaining argument is the $/GB of the M500 vs the Samsung PRO.