Metadata corruption on SuperMicro H8DG6

fossxplorer

Active Member
Mar 17, 2016
556
98
28
Oslo, Norway
Ah, of course! I already have a couple of those adapters at home. I bought 6 of them a while back for my 4GB SATA DOMs, which have old Molex power connectors.
Anyway, I've placed the order for the EVGA 750W SuperNOVA GQ; delivery takes up to 4 days, so hopefully I can test more this coming weekend.
Let's see whether the issue is power related or not (I'd be happy if it's not the MB, really).

And thanks a lot for your help so far!

You do get SATA to Molex adapters reasonably cheap, such as these: 8 inch SATA 15 Pin to 2 Molex 4 Pin Dual Power Adapter Y Splitter Cable Cord M/F. So it's not as big of an issue to hook it all up, if required, further down the troubleshooting road. Don't rule out the backplane or anything else for that matter just yet; take it one step at a time. Begin by making certain you have good, clean power to everything :)
 

fossxplorer

I finally received the PSU yesterday!
Since the server is at home, it can't be turned on 24/7 :( But so far, with the new EVGA 750 GQ connected directly (bypassing the PDU), it works just fine.
I couldn't reproduce the issue, but I will do some more testing with at least 10 hours of server uptime. Perhaps I'll be able to do that this weekend.

Also, I forgot to mention: the first time I turned on the server after receiving it from the DC, it smelled a bit burned, and my wife said the apartment was a bit foggy...
But when the server was at the DC, both PSUs were tested by taking one of them out, so it's likely not the PSUs either. At the moment, it looks like the PDU (the power distribution backplane) is the problem.
 

fossxplorer

Update:
I made assumptions too early! Today, after moving the server to a place where I can keep it turned on for longer, I got metadata corruption on one of the disks connected via the SAS backplane to the onboard SATA ports. The server has been on for some 8 hours, with a couple of reboots and shutdowns. The issue appeared right after a reboot, on the attempt to mount the disk partition.

Strangely enough, a second SSD and a single HDD (both on ports connected to an M1015 SAS controller) weren't corrupted. As the server is now running on the external PSU, the issue doesn't seem to be PSU or PDU related. I'm using SATA male to Molex adapters to power the SAS backplane.

Any tips on what I should test or do now, before I take the drastic step of replacing the MB and CPUs?
 

fossxplorer

So, after a while, and honestly believing that the issue was with one of the SSDs, I sent the server back to the DC.
Today I could see this: #395506 • Fedora Project Pastebin :( This was with another good, used Intel 320 pulled from another server.

I'm tempted to have the server shipped back to me and replace the MB, CPU and RAM. I have an Intel Tyan board with CPUs that I'm thinking of swapping in for all of it.

Any advice or tips are highly appreciated.
 

pricklypunter

Well-Known Member
Nov 10, 2015
1,714
520
113
Canada
When you had the server in front of you, did you re-seat all the RAM, clean the CPUs and re-seat them with new thermal paste? What is the CPU temperature like when the fault appears? Have you determined the disk temperatures? If temperatures seem fine/normal and the CPUs have been given some TLC, I think my next step would be a couple of sticks of known good RAM to test with, to see if the fault persists. But that is me being methodical; you may not have the time to waste on it :)

These kinds of intermittent issues can get you tied in a knot trying to track them down. If this is a critical server, though, especially given shipping times etc., and I had to have it in front of me again, I would swap the major components this time and be done with it. I wouldn't be sending it back either until I had confirmed once and for all that the issue has been dealt with :)
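
For the disk temperature question above, here is a rough Python sketch of one way to pull the reading out of `smartctl -A` output, so it can be noted down around the time the fault shows up. It assumes smartmontools is installed and that the drive reports one of the usual temperature attributes (not every SSD does). The function names and the sample text are made up for illustration; only the `smartctl -A` column layout is taken from the real tool.

```python
import re
import subprocess

# Attribute names that commonly carry drive temperature in `smartctl -A` output.
# Assumption: the drive exposes one of these; some SSDs expose neither.
TEMP_ATTRS = ("Temperature_Celsius", "Airflow_Temperature_Cel")


def parse_smart_temperature(smartctl_output):
    """Return the first temperature (deg C) found in `smartctl -A` text, else None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns: ID, name, flag, value,
        # worst, thresh, type, updated, when_failed, raw_value.
        if len(fields) >= 10 and fields[1] in TEMP_ATTRS:
            # The raw value may be followed by extras such as "(Min/Max 20/45)".
            match = re.match(r"\d+", fields[9])
            if match:
                return int(match.group())
    return None


def read_disk_temperature(device):
    """Hypothetical helper: run smartctl on a device (usually needs root)."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes for warnings too
    )
    return parse_smart_temperature(result.stdout)


# Illustrative sample only, not output from the server in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       12345
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       31 (Min/Max 20/45)
"""
```

On the real box you would call something like `read_disk_temperature("/dev/sda")` for each disk behind the backplane and compare the numbers against the disks on the M1015 that stay healthy.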