Has anyone else seen these drives report SMART values for attribute 199 UDMA_CRC_Error_Count?
I just noticed that 5 / 12 HDDs are reporting a value of 1 instead of 0. I find that odd, as I put them through hell to burn them in for what felt like an eternity (the largest drive I had burned in before was a 6TB HDD). Burn-in methodology = SMART short / SMART extended / badblocks / SMART short / SMART extended. All tests completed without error and with clean SMART data. But I've also learned one thing about burn-in: either (1) I'm not doing it right, or (2) a proper burn-in that completes cleanly doesn't mean you won't have a component fail a month later. I've had #2 occur with both HDDs and DIMMs.
I know that this isn't one of the SMART attributes that correlates strongly with failure, yet I find it interesting nevertheless, as I had 12 HGST Deskstar NAS drives also installed in that same server (and same backplane); those drives have 10k+ hours on them (minimum) and were in that server for about 3x longer than the WDC drives' current Power-On Time of 1200 hours (50 days). Of those 12 drives, now in an identical SC826, the "dashboard" provided above notes only one "blip", that being 199 UDMA_CRC_Error_Count = 1. I just ran the math on when it occurred, and it was prior to deploying my current servers (it occurred while the drive was in an SC836).
My point here = I think the integrity of the server and backplane can be ruled out.
I'm not super concerned about this appearing, yet I don't think it is something to be completely overlooked, either. From personal experience I know that HGST will RMA HDDs whose only SMART error is a value of 1 for the same attribute (I've done it twice with the aforementioned 6TB Deskstar NAS drives), so that error must mean something.
I have a cronjob set up in FreeNAS to run SMART tests with the following intervals:
- Long = 8th and 22nd of the month @ 4 AM; Short = 5th, 12th, 19th, and 26th @ 3 AM
- I mention this because the provided SMART data (see the spoiler in the next post) shows the last test was run at hour 822 (16 days ago), which means the cronjob missed the following scheduled tests: (1) Long test on 12/22, (2) Short test on 12/26, and (3) Short test on 1/5.
- While I don't doubt that I had the server powered off for at least one of those days/times, missing all three would really surprise me. I just looked at the cronjob in FreeNAS and it is set up correctly. It will be interesting to see if SMART extended runs on 1/8 at 4 AM.
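For anyone who wants to check the "missed tests" math, here is a rough Python sketch. It converts the power-on-hour gap (1200 now vs. 822 at the last test) into days and lists which scheduled tests fall inside that window. The year and the exact "now" timestamp are assumptions picked only for illustration, and the function names are made up. Power-On Hours only advance while the drive is powered, so the real wall-clock gap can only be equal to or longer than what this computes.

```python
from datetime import datetime, timedelta

# Schedule as described in the post: (days of month, hour of day).
SCHEDULE = {
    "long": ({8, 22}, 4),
    "short": ({5, 12, 19, 26}, 3),
}


def hours_to_days(power_on_now, power_on_last_test):
    """Days of powered-on time elapsed since the last logged self-test."""
    return (power_on_now - power_on_last_test) / 24


def scheduled_tests(start, end):
    """Return sorted (when, kind) pairs for scheduled tests with start < when <= end."""
    tests = []
    day = start.replace(hour=0, minute=0, second=0, microsecond=0)
    while day <= end:
        for kind, (days, hour) in SCHEDULE.items():
            when = day.replace(hour=hour)
            if day.day in days and start < when <= end:
                tests.append((when, kind))
        day += timedelta(days=1)
    return sorted(tests)


# ASSUMPTION: hypothetical "now"; the post does not give the year or exact time.
now = datetime(2018, 1, 6, 12, 0)
elapsed_days = hours_to_days(1200, 822)            # hours taken from the post
window_start = now - timedelta(days=elapsed_days)  # assumes the server never powered off
missed = scheduled_tests(window_start, now)

for when, kind in missed:
    print(f"{when:%m/%d %H:%M} {kind}")
```

With those assumed dates, the window covers the same three tests listed above (Long on 12/22, Short on 12/26, Short on 1/5). If the server was off for part of that time, the window would stretch further back and could include more.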
Full SMART data is contained in a spoiler in the next post (I hit the character limit when trying to include it in this post).