ReFS or Deduplication


zakari

New Member
Sep 10, 2012
9
0
1
Hello,

I recently built a server that I use mainly for media storage (pictures, movies, etc.).

Here are the basic details :
- Norco 4220 Chassis
- Supermicro X9SCM-F board
- E3-1270
- 16GB DDR3 ECC
- Areca 1882 IX24
- Mellanox ConnectX2 board
- 256 GB SSD Boot
- 2TB 2.5" disk for regular access stuff
- 3TB WD RED drives for the array (not full yet, RAID 6 array)

I just reinstalled the whole thing with Windows Server 2012 (mainly for testing and for SMB3 over RDMA), and I find myself a bit puzzled:

On one side there's ReFS, which looks great in terms of resiliency, but the RAID card should handle that just fine (every two weeks I run a check of the array to see whether any parity or data blocks have gone bad).
Microsoft's main argument was that NTFS can be corrupted by a power loss. That's true, and I've experienced it in the past, but not in the last couple of years.

The other great feature of ReFS (the one I like most) is long path support: paths are no longer limited to 255 characters in total.


On the other side is Data Deduplication, which is also very tempting but is supported only on NTFS (that's right ...). I'm not sure it would save me a lot of space, but even 10% is great on a 20-drive array.

Do you have any technical suggestions regarding that choice?
I'm not sure yet which way to go and would really appreciate hearing your input :)
 

acesea

New Member
Oct 7, 2011
8
1
3
You should benchmark the available options and post the results. Also, whichever one you stick with, take SHA hashes of several large files as a reference and re-check them from time to time to verify that all is well.
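
The hash check could look something like the sketch below. It is only a minimal example, not a recommended tool: the function names, the JSON manifest format and the choice of SHA-256 are my own assumptions, nothing in this thread prescribes them.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def write_manifest(paths, manifest_path):
    """Record a reference hash per file (run once, while the data is known good)."""
    manifest = {str(p): sha256_of(p) for p in paths}
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))
    return manifest


def verify_manifest(manifest_path):
    """Re-hash every file in the manifest; return the paths that are missing or changed."""
    manifest = json.loads(Path(manifest_path).read_text())
    bad = []
    for name, expected in manifest.items():
        if not Path(name).is_file() or sha256_of(name) != expected:
            bad.append(name)
    return bad
```

Run write_manifest once on a handful of large files, keep the manifest somewhere off the array, and call verify_manifest every now and then. An empty list means the hashes still match.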
 

zakari

New Member
Sep 10, 2012
9
0
1
I'm trying Deduplication on a 2TB drive full of "real life" data (videos, music, programs, documents, etc.).

I'll let it run for a week or so, then post the actual savings.

I'll try to get the hashes for several files, thanks for the suggestion.
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,519
5,827
113
Interested to see the results! (Only have so much time to test and write stuff on my end.)
 

zakari

New Member
Sep 10, 2012
9
0
1
Here's the original data (not really "real life" though: this disk holds loads of movie files and no pictures or audio).




And here's the result on the Server Storage management console (look at the HD204UI drive)




120 GB saved, less than 10%. I think this thing is really best suited to virtualisation data (Hyper-V and others).
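
For anyone who wants a rough idea of what to expect before enabling the feature, here is a small sketch that counts duplicate chunks across a folder. Big caveats: it uses fixed-size 64 KiB chunks and no compression, which is my own simplification and not how the Windows feature works internally (as far as I know it uses variable-size chunks plus compression), so treat the number as a ballpark only. The function names are made up for the example.

```python
import hashlib
import os


def estimate_dedup_savings(root, chunk_size=64 * 1024):
    """Return (total_bytes, unique_bytes) for all files under root.

    Assumption: fixed-size chunks, each identified by its SHA-256 digest.
    """
    seen = set()
    total = 0
    unique = 0
    for folder, _dirs, files in os.walk(root):
        for name in files:
            with open(os.path.join(folder, name), "rb") as handle:
                for chunk in iter(lambda: handle.read(chunk_size), b""):
                    total += len(chunk)
                    key = hashlib.sha256(chunk).digest()
                    if key not in seen:
                        # First time this chunk content is seen: it must be stored.
                        seen.add(key)
                        unique += len(chunk)
    return total, unique


def savings_percent(total, unique):
    """Share of the data that duplicate chunks would save, in percent."""
    return 0.0 if total == 0 else 100.0 * (total - unique) / total
```

Note that the set of digests lives in memory, so on a multi-terabyte volume it is more practical to point this at a representative subfolder than at the whole disk.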

I might give it a shot on the R5 array, which is still NTFS and is made up of this:




That way we can see the results on a more varied dataset (still mainly movie files, since HD content takes up a lot of space).