Windows Server 2012 R2 reporting all hard drives as healthy, HDD Sentinel Pro 4.71 does not agree?

RamGuy

Member
Apr 7, 2015
35
2
8
34
I'm about to move back to hardware RAID after using Storage Spaces / Storage Pools in Windows Server 2012 R2 for a few years now. Mainly because I don't have enough ports and I need more storage, so I will be splitting up my RAID and adding new drives. I ended up ordering an LSI MegaRAID 9265-8i + Intel SAS Expander.

So I backed up all my data and decided to download HDD Sentinel Pro 4.71 just to inspect my hard drives, as they've been running 24/7 for years now and this is the perfect time to replace drives if any are bad.

Windows Server 2012 R2 does report health status on hard drives that are part of a Storage Pool, and they are all reported as being Perfect / Healthy. So I didn't expect anything funky when installing HDD Sentinel, but I was wrong.

According to HDD Sentinel, two of my WD RE 2TB hard drives are at a critically low health status of 4% due to bad sectors. I did a surface scan and they both have a few bad and damaged sectors, but the number was rather low. One of the drives had most of the bad ones in a small area of the disk surface, whereas the other had some issues all over.

I have filled my RAID to 98%, so both drives should have been pretty much jam-packed, and I haven't had any performance or stability issues. As I mentioned above, Windows Server itself doesn't seem to care much and reports both as healthy.


Should I be worried? I do notice that HDD Sentinel claims that doing a complete WRITE+READ or Re-Initialize disk surface might improve the situation and restore the drives to better health. I might give that a go before I configure things on the new hardware RAID, and I might also go with RAID6 instead of RAID5 on this particular RAID, so if both drives die anytime soon I will have time to replace them without losing anything.


What do you guys think?
 

TuxDude

Well-Known Member
Sep 17, 2011
616
338
63
Just grab the raw SMART data from the drive with whatever tool does that on Windows, and see what the drive itself thinks.
 

Fritz

Well-Known Member
Apr 6, 2015
3,372
1,375
113
69
Most people, me included, don't know how to interpret raw SMART data. :rolleyes:
 

RamGuy

I have no clue how to interpret raw data either. I did the re-initialization within HDD Sentinel, and there were no bad or damaged sectors on either disk during the full write wipe of the drives. But upon doing a new surface read scan afterwards, there are about 4 bad sectors and 1 damaged sector on both drives. That's way fewer than before the initialization and does sound like a low number, but I don't know how safe it would be to continue using these drives?
 

Fritz

I have HDDs with bad sectors that I use for non-critical / transient data.
 

TuxDude

Sounds to me like it's time to toss that drive.

All HDDs keep internal spare hidden sectors that are used as replacements when other sectors go bad. If you read from a bad sector you will get an error. But if you write to a bad sector, the HDD will internally fail to write that sector, then remap one of its spares to that address, write the data there instead, and return a success code up to the OS, so you never see bad sectors on writes. Re-initializing the drive is going to overwrite the entire drive, which will force the HDD to remap any bad sectors it encounters during the process - afterwards you should have no bad sectors left. If you do have bad sectors again right away, that would indicate that the drive is losing sectors at a pretty high rate, which probably means a bigger problem is causing them.

Go grab the SMART data yourself and you can see these things. You don't need to understand the majority of it, just a couple of basic things. First, look at the value for Reallocated Sector Count (it might not look exactly like that; my drives show it as 'Reallocated_Sector_Ct'). Having a couple of sectors reallocated is not a problem, and I've even continued to use drives with a couple thousand reallocated sectors as long as that number is stable. Watch that number for a while, and check it before and after things like re-initializing disks - if it's going up then the disk is dying, and if it's going up fast (more than a few a day) you probably don't have much time left.

The other number to watch when dealing with bad sectors is Offline Uncorrectable and/or Current Pending Sector - some drives have one or the other, some have both, and they mean almost the same thing. Those are the count of sectors that have had problems but have not yet been reallocated - like when you try to read from a bad sector but haven't written to it to remap it. Re-initializing the disk should reset those to zero, while adding about the same number to the reallocated sector count. If either of those is above zero, applications will see bad sectors if they try to read the whole drive - if you are going to keep using the drive you should overwrite them and force them to be reallocated. Again, the thing to watch for is whether they are increasing (or constantly coming back after you remap them) - if the drive is losing a lot of sectors it is near death.
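For anyone who wants to automate the before/after check described above, here is a minimal sketch in Python. It assumes you have saved the attribute table printed by `smartctl -A` (from smartmontools) as text before and after the re-initialization. The sample tables and their numbers are invented for illustration, and the helper names `parse_raw_values` and `compare` are hypothetical. Other tools label the attributes differently, so adjust the names to whatever your tool prints.

```python
# Attribute names as smartctl prints them; other tools may use other labels.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_raw_values(smartctl_text):
    """Return {attribute_name: raw_value} for the watched attributes."""
    values = {}
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw value.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            values[fields[1]] = int(fields[9])
    return values


def compare(before, after):
    """Return the change in each watched raw counter (after minus before)."""
    return {name: after.get(name, 0) - before.get(name, 0) for name in WATCHED}


# Invented sample data, shaped like the smartctl -A attribute table.
BEFORE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       5
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       4
"""

AFTER = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       17
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""

changes = compare(parse_raw_values(BEFORE), parse_raw_values(AFTER))
for name, delta in changes.items():
    print(f"{name}: {delta:+d}")
```

In this made-up example the pending and uncorrectable counts drop to zero while the reallocated count rises by a similar amount, which is the pattern described above for a successful re-initialization. If the pending count climbs again on the next snapshot, that is the warning sign.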
 

oily

New Member
Jul 5, 2016
1
0
1
58
Don't trust Windows Server. Ever.

Don't trust HD Sentinel on SSD's

DL CrystalDiskInfo and get a second opinion.
I'm beginning to agree. If not Windows Server 2012 R2 - then what? We currently use Hyper-V, Active Directory, file services, replicated filesystems (DFSR to remote branch office) .. in a Dell MD1000 JBOD.

What would / do you use? (CentOS, SAMBA, etc?)
 

Fritz

The problem with CrystalDiskInfo is that it will only see drives hooked directly to the MB. It's blind to drives hooked to an HBA or RAID controller. HD Sentinel OTOH will see all drives no matter how they're attached to the box. Either HDS is seeing info not included in SMART or it makes crap up. It will flag a drive as bad when CDI (and SMART) says all is perfect.
 

felmyst

New Member
Mar 16, 2016
27
6
3
28
This is technology, not magic. Storage Spaces reports a drive's health as bad when the drive has already died or is giving constant high latency and/or I/O errors, i.e. it is actively dying. Same for event logs: you should monitor them for disk block and I/O errors, but they serve the same purpose: to let you know which drive to replace in a big array. They can't predict a failure. It's you who can.
Never trust any software that gives a % health for any component. This ain't video games, an HDD doesn't have an HP pool of 100 or something. The only "% health" you can almost trust is the "health" or "wear" raw SMART metric of an SSD. But even in this case you should double-check this value using the raw "LBA written" SMART data, the SSD manual, a calculator and your brain.
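
The "calculator" step for an SSD can be sketched roughly like this. It is only an illustration under stated assumptions: the raw "LBA written" value is taken to count 512-byte units (some vendors use a different unit, so check the manual), the rated endurance (TBW) comes from the drive's datasheet, and the numbers and function names below are made up.

```python
def terabytes_written(lbas_written, bytes_per_lba=512):
    """Convert a raw 'LBAs written' counter to terabytes (10**12 bytes).

    bytes_per_lba=512 is an assumption; some drives report other units.
    """
    return lbas_written * bytes_per_lba / 1e12


def fraction_of_rated_endurance(lbas_written, rated_tbw, bytes_per_lba=512):
    """Return how much of the datasheet TBW rating has been used (1.0 = all)."""
    return terabytes_written(lbas_written, bytes_per_lba) / rated_tbw


# Hypothetical drive: raw counter of 58,593,750,000 LBAs, rated for 150 TBW.
raw_lbas = 58_593_750_000
used_tb = terabytes_written(raw_lbas)
used_fraction = fraction_of_rated_endurance(raw_lbas, rated_tbw=150)
print(f"{used_tb:.1f} TB written, {used_fraction:.0%} of rated endurance")
```

Compare the result with the wear percentage the tool reports. If the two are wildly different, one of the assumptions (usually the unit of the raw counter) is wrong, or the tool is guessing.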

>due to bad sectors

You should definitely get constant "bad block" and i/o errors in your event logs in this case. If there're none of them, your software lies to you.

>I have no clue how to interpret raw data either.

So go on and goddamnit google it. It ain't rocket science. Raw data is just a number of events. Reallocated sectors metric is constantly growing? Your drive is going to die soon. UltraDMA errors are growing? Replace SATA cable. Seek error is growing? Replace HDD ASAP.
ALWAYS look in raw data. NEVER trust percentages.
And seriously, go google it.
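
The rules of thumb above ("this raw counter is growing, so do that") are easy to write down as a lookup. A minimal sketch in Python, assuming you already have two snapshots of raw values as dictionaries keyed by the attribute names smartctl uses. The `advise` helper and the sample numbers are hypothetical, and some vendors pack extra information into the raw value of certain attributes, so a growing number is a prompt to look closer, not a verdict.

```python
# Rules of thumb from the post above, keyed by smartctl-style attribute names.
RULES = {
    "Reallocated_Sector_Ct": "drive is likely dying; plan a replacement",
    "UDMA_CRC_Error_Count": "suspect the SATA cable; replace it",
    "Seek_Error_Rate": "replace the HDD as soon as possible",
}


def advise(previous, current):
    """Return a list of advice strings for every watched counter that grew."""
    advice = []
    for name, message in RULES.items():
        if current.get(name, 0) > previous.get(name, 0):
            advice.append(f"{name} grew: {message}")
    return advice


# Invented snapshots taken some days apart.
last_week = {"Reallocated_Sector_Ct": 12, "UDMA_CRC_Error_Count": 0, "Seek_Error_Rate": 0}
today = {"Reallocated_Sector_Ct": 12, "UDMA_CRC_Error_Count": 3, "Seek_Error_Rate": 0}

for item in advise(last_week, today):
    print(item)
```

The point of comparing snapshots instead of reading a single value is the same one made above: a stable raw number is usually fine, a growing one is what matters.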