Tool to compare x percent of files on a fileshare?

Sep 4, 2017
47
25
18
46
I just replaced an old fileserver with a new one, and copied over all of the data. I want to have a high degree of confidence that the data is intact on the new array, but I can't spare the time to binary compare 100 TB of files. I'd be satisfied if I could verify, say, 10% of the files.

Is there any Windows-based tool that will compare a random x% of files on one share vs. another?
 

Evan

Well-Known Member
Jan 6, 2016
3,346
598
113
md5deep is what I would consider a good option in this case. It does hash every file rather than a sample, but it's the only tool I know of for your needs right now, other than the ones already mentioned.
 
Sep 4, 2017
47
25
18
46
Thanks folks but none of those actually do the task, so I guess I'll have to write a script by dumping the file list into Excel and constructing a CMD using FC.
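For anyone who lands here later, the same idea can be sketched in Python instead of Excel plus FC. This is only a rough sketch under a few assumptions: both shares are reachable from one machine, the sample is taken by file count (not by bytes), and the names `list_files` and `compare_sample` are made up for this example.

```python
import filecmp
import os
import random


def list_files(root):
    """Return the paths of all files under root, relative to root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def compare_sample(src_root, dst_root, percent, seed=None):
    """Byte-compare a random `percent` of the files in src_root against dst_root.

    Returns (checked, problems), where problems lists the relative paths that
    are missing on the destination or whose content differs.
    """
    files = sorted(list_files(src_root))
    if not files:
        return 0, []
    # Sample by file count; always check at least one file, never more than exist.
    count = min(len(files), max(1, round(len(files) * percent / 100)))
    sample = random.Random(seed).sample(files, count)
    problems = []
    for rel in sample:
        src = os.path.join(src_root, rel)
        dst = os.path.join(dst_root, rel)
        # shallow=False forces a content comparison instead of trusting size/mtime.
        if not os.path.isfile(dst) or not filecmp.cmp(src, dst, shallow=False):
            problems.append(rel)
    return len(sample), problems
```

Calling something like `compare_sample(r"\\oldserver\share", r"\\newserver\share", 10)` (hypothetical share names) would check roughly 10% of the files. Passing a `seed` makes the sample repeatable, which helps if you want to re-check the same files later.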
 

Evan

Well-Known Member
Jan 6, 2016
3,346
598
113
Since you say Windows, I assume you used robocopy to copy everything? Even the report from that should give a good level of confidence.
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
1,394
511
113
Depends how thorough you want to be about it, but I'd echo Evan's recommendation of md5deep/hashdeep. I use this to verify the integrity of my offsite backup server. You can turn one dir tree into a list of MD5 checksums, and then compare those checksums to another dir tree like so (Windows-ified version of the commands I use under Linux):
Code:
hashdeep -rlc md5 c:\tree1 > c:\tree1_md5sums.txt
hashdeep -ravvl -k c:\tree1_md5sums.txt c:\tree2
Assuming there are no differences between the dir trees you should get output like this:
Code:
hashdeep: Audit passed
          Files matched: 3975194
Files partially matched: 0
            Files moved: 0
        New files found: 0
  Known files not found: 0
Bear in mind this requires reading every last byte of each tree and computing the MD5 hash so it's quite IO and CPU intensive.

hashdeep and friends are something of a sledgehammer to crack a nut, though. If you're just worried about whether files copied from A to B via Explorer or robocopy arrived correctly, then it's probably well beyond your requirements.

If you don't need a full cryptographic hash and are content just to use date/time detection to show differences (and assuming both trees are accessible from the same computer) you can just use robocopy's /L param to list files it would copy from one dir tree to another.
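The metadata-only approach can also be sketched in Python. This is not how robocopy itself decides what to copy; it is only an illustration of the idea of flagging files by size and timestamp without reading their contents. The function name `metadata_differences` and the 2-second `mtime_tolerance` default are assumptions made for this example.

```python
import os


def metadata_differences(src_root, dst_root, mtime_tolerance=2.0):
    """Yield (relative_path, reason) for source files that look different
    on the destination, judged by existence, size and modification time only."""
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            try:
                dst_stat = os.stat(dst)
            except FileNotFoundError:
                yield rel, "missing"
                continue
            src_stat = os.stat(src)
            if src_stat.st_size != dst_stat.st_size:
                yield rel, "size"
            elif abs(src_stat.st_mtime - dst_stat.st_mtime) > mtime_tolerance:
                # Tolerance (an assumption) absorbs coarse timestamp granularity.
                yield rel, "mtime"
```

Because no file content is read, this runs far faster than hashing, but it cannot detect corruption that leaves size and timestamp unchanged.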

I've been wishing for years that robocopy would add a checksum option like the one available in rsync (IIRC rsync also uses MD5 now).
 

Celoxocis

New Member
Mar 28, 2017
3
0
1
41
There is Total Commander's "compare directories" feature if you prefer a GUI Windows tool. It can even compare by content: