Supermicro board/memory problems


Jonkheer

New Member
Jun 18, 2017
12
2
3
42
So, I saw a nice deal on eBay and purchased an X10SDV-2C-TLN2F. I wanted to use it for a FreeNAS build that had been running fine under ESXi, with my LSI 9201-16e in pass-through, on a Supermicro X10SDV-TP8F (Xeon D-1518).

I took some of the RAM from the X10SDV-TP8F for the new X10SDV-2C-TLN2F: two 8GB modules that are on the Supermicro compatibility list and had been working flawlessly for more than a year in the donor system.

After re-flashing the latest BIOS and BMC/IPMI firmware, I went ahead and installed FreeNAS.
Both while installing FreeNAS and while running it, I noticed hangs, followed by resets triggered by the IPMI watchdog.
Looking at the BIOS log and IPMI event log I saw what is in the included screenshots.
bios_error_log.png
event log.png

The problems (crashing/freezing) would turn up after a short time (15 minutes?).
I decided to try an Ubuntu Live boot, which worked without problems.
Then I did a memtest86 for about 2 hours, without a problem or entry in the logs during these tests...
A test install of ESXi 6.7 also crashed after some time.

I started to think I had damaged the RAM modules while removing or installing them, so I put them back in the donor as its only modules. So far (an hour or more) the donor runs without a problem, which makes me think the memory is OK.
I still have two 16GB modules from the donor (also on the compatibility list), but I'm reluctant to install them in a potentially malfunctioning board.

Which takes me to my provisional conclusion, that I bought a lemon.

Any insights, tests or advice from you guys would be appreciated.
Also, the seller had a 'does not accept returns' policy... but I'm thinking of returning it if it turns out to be the board, will have to ask him first though.
If it is the board and returning it is not possible, what about RMA to Supermicro, is this even possible on a second hand buy and what will it cost me?
 

Terry Kennedy

Well-Known Member
Jun 25, 2015
1,140
594
113
New York City
www.glaver.org
Which takes me to my provisional conclusion, that I bought a lemon.

Any insights, tests or advice from you guys would be appreciated.
Do you happen to have any Dell-branded cards (mostly storage controllers, but Ethernet / FC also qualifies) in there?
Also, the seller had a 'does not accept returns' policy... but I'm thinking of returning it if it turns out to be the board, will have to ask him first though.
At least in the US, even on "Seller does not accept returns" items, the eBay "Money Back Guarantee" still applies. If the item was not as described and the seller won't take it back, just get refunded by eBay and let them deal with the seller.
If it is the board and returning it is not possible, what about RMA to Supermicro, is this even possible on a second hand buy and what will it cost me?
It depends on how the board was originally purchased. Distribution channel boards (both retail and white box versions) are normally RMA-able; OEM channel boards are not. If you are "out of market" (for example, the board was sold by Supermicro in the US and you're in Europe), probably not.
 

Jeggs101

Well-Known Member
Dec 29, 2010
1,529
241
63
So the sticks work fine in another SM X10SDV. The sticks work fine in that board under other OS's just not FreeNAS?

FreeNAS has had many watchdog issues, esp w/ AsRock. GoogleFu is your friend there.

If the RAM is OK and everything works with Linux, it's a FreeNAS problem. Just go Linux.

FreeNAS problems like these are why so many people have been leaving their platform.
 

Jonkheer

Do you happen to have any Dell-branded cards (mostly storage controllers, but Ethernet / FC also qualifies) in there?
I don't know about the LSI HBA, it did not have any Dell markings on it (bought it second hand). It is flashed to the FreeNAS advised P20.0.0.7 firmware. So, it could be Dell-branded. Is there a way to check? What if it is, could this be the root of my problems (note the card was also installed in my other system, without problems)?
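On the "is there a way to check?" question, here is a minimal sketch, assuming the card is inspected from a Linux live boot (such as the Ubuntu Live session mentioned earlier) where sysfs exposes PCI IDs. It lists every device whose PCI vendor ID is LSI's (0x1000) and reports whether the subsystem vendor ID is Dell's (0x1028). The function name `find_lsi_cards` and the `sysfs_root` parameter are illustrative, not part of any tool. A cross-flashed card may no longer report its original subsystem vendor, so treat the result as a hint only.

```python
from pathlib import Path

LSI_VENDOR_ID = 0x1000   # PCI vendor ID used by LSI Logic controllers
DELL_VENDOR_ID = 0x1028  # PCI vendor ID assigned to Dell


def read_hex(path):
    """Read a sysfs attribute such as '0x1028' and return it as an int."""
    return int(path.read_text().strip(), 16)


def find_lsi_cards(sysfs_root="/sys/bus/pci/devices"):
    """Return (pci_address, subsystem_vendor, is_dell) for each LSI device.

    sysfs_root is a parameter only so the function can be pointed at a
    test directory; on a real Linux system the default path applies.
    """
    root = Path(sysfs_root)
    cards = []
    if not root.is_dir():
        return cards  # not Linux, or sysfs not mounted
    for dev in sorted(root.iterdir()):
        try:
            vendor = read_hex(dev / "vendor")
            sub_vendor = read_hex(dev / "subsystem_vendor")
        except (OSError, ValueError):
            continue  # skip entries without readable ID files
        if vendor == LSI_VENDOR_ID:
            cards.append((dev.name, sub_vendor, sub_vendor == DELL_VENDOR_ID))
    return cards


if __name__ == "__main__":
    for address, sub_vendor, is_dell in find_lsi_cards():
        label = "Dell subsystem ID" if is_dell else "non-Dell subsystem ID"
        print(f"{address}: subsystem vendor 0x{sub_vendor:04x} ({label})")
```

Run as a regular user from the live session; no output means no device with the LSI vendor ID was found.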

The sticks work fine in that board under other OS's just not FreeNAS?
An ESXi installation (not even running any VMs) also crashed, so it is not limited to FreeNAS. Will try Ubuntu again if I have the time, but maybe I got lucky that one time? Still strange Memtest86 did not show any problems... Maybe I should also try a Windows 10 installation, just to be sure.
I'm willing to try other (NAS) OS's (Napp-It?), as long as they provide good performance, good documentation/community support and feature an easy management (web)interface.


So, it seems my two 8GB sticks run fine in the old system, meaning they are probably not defective.
Maybe I'll try the 16GB modules later today then, just to be sure.

As for the OEM-part, I don't think it is OEM as I have a nice box with SATA-cables and all.

If the other modules also fail, I'll check up with the seller for a refund (preferably a full refund).
 

Stephan

Well-Known Member
Apr 21, 2017
920
698
93
Germany
Your board logs clearly show "ECC uncorrectable" errors. The cause would be either the CPU (RAM interface), the board with its slots, or your RAM. If your RAM is OK elsewhere, even under intense bandwidth pressure (try prime95), that leaves the board and the CPU. I'd say contact the seller, describe the problem, ask for another board or a refund, and send it back. Don't waste much time on it. If the seller is uncooperative, get eBay involved through buyer protection or whatever it is called in the USA. In other words, Terry's recommendation above is, IMHO, what makes the most sense.
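Because the Ubuntu Live boot appeared to run cleanly, it may be worth checking whether Linux is quietly counting ECC events. A minimal sketch, assuming a Linux system with an EDAC driver loaded for the memory controller: the kernel then exposes per-controller `ce_count` (corrected) and `ue_count` (uncorrected) files under `/sys/devices/system/edac/mc/`. The function name `read_edac_counts` and the `edac_root` parameter are illustrative.

```python
from pathlib import Path


def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller_name: (corrected, uncorrected)} from EDAC sysfs.

    edac_root is a parameter only so the function can be pointed at a
    test directory; on a real Linux system the default path applies.
    """
    root = Path(edac_root)
    counts = {}
    if not root.is_dir():
        return counts  # EDAC not available on this system
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            corrected = int((mc / "ce_count").read_text().strip())
            uncorrected = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers without readable counters
        counts[mc.name] = (corrected, uncorrected)
    return counts


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found (driver not loaded?)")
    for name, (corrected, uncorrected) in counts.items():
        print(f"{name}: corrected={corrected} uncorrected={uncorrected}")
```

Counters that climb while a stress test runs with the suspect modules installed would point at the same fault the BIOS and IPMI logs recorded; counters that stay at zero do not prove the hardware is healthy.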
 

Jonkheer

Thank you for your feedback.

I went ahead and installed the two 16GB modules in the troublesome board and have been stressing FreeNAS (filling the ARC, so the RAM is full and in use) for more than an hour now without a single problem...

This makes me wonder if I should blame the board. Tomorrow (or later) I'll try to re-install the 8GB modules, hoping they might work better too now (maybe it is naive to think/hope it was just a bad fitting or connection between the ram and the slot after the first installation).
 

Jonkheer

So, a small update. I re-installed the 8GB modules and got instant failures once more. Re-installed the 16GB modules, no problems.
I'm starting to think the compatibility with Micron MTA18ASF1G72PZ-2G1A2 chips is not OK (although they are on the Supermicro list), at least not with the Crucial modules I have.
My Crucial 16GB modules with Micron MTA36ASF2G72PZ-2G1A2 chips are fine.
I still find it strange that two nearly identical boards from the same processor family handle the memory differently.

To conclude: I think I'll keep the board and wait for DDR4 prices to drop before buying some more memory for my now memory-starved ESXi machine.
 

chinesestunna

Active Member
Jan 23, 2015
621
191
43
56
Have you contacted Supermicro support? Maybe they have something to help. In the past, their support team has sent me firmware/BIOS updates that were way newer than what's on the website, and those resolved the issues I was having.