H12SSL and Epyc can't handle the ram

thatch45

New Member
May 12, 2022
12
1
3
I just got a Supermicro H12SSL motherboard and an AMD Epyc 7313 processor. The board boots up just fine and is running Linux like a champ - but only with half the RAM installed. I have 8x64GB memory sticks, and if I put in more than half of them the system refuses to POST. I can get into the IPMI web UI and it only shows 5 of the 8 sticks of RAM.

I have moved the sticks of ram all around, cleaned the connectors, etc. multiple times.

The memory is
SK hynix 64GB 4DRx4 PC4-2400T-LE1-11 LRDIMM

The BIOS is at the latest available version, 2.3 from October 2021.

So to sum up, I have 512GB of ram, but the system will only boot with 256GB installed.

Any ideas on what I could be missing? Thanks!
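
Since Linux boots with part of the memory installed, one way to cross-check what the IPMI web UI reports is to look at the SMBIOS memory table from the OS side. Below is a minimal sketch in Python that parses text in the format printed by `dmidecode -t memory` and splits the slots into populated and empty. The function names and the sample text (including the locator names) are made up for illustration and are not output from the system in this thread.

```python
import re


def parse_memory_devices(dmidecode_text):
    """Return a list of (locator, size) pairs, one per 'Memory Device' block."""
    devices = []
    # dmidecode separates SMBIOS structures with blank lines.
    for block in re.split(r"\n\s*\n", dmidecode_text):
        lines = [line.strip() for line in block.splitlines()]
        if "Memory Device" not in lines:
            continue  # skip 'Physical Memory Array' and other structures
        fields = {}
        for line in lines:
            key, sep, value = line.partition(": ")
            if sep:
                fields[key] = value.strip()
        if "Locator" in fields:
            devices.append((fields["Locator"], fields.get("Size", "Unknown")))
    return devices


def split_by_presence(devices):
    """Split locators into (populated, empty) based on the Size field."""
    populated = [loc for loc, size in devices if size != "No Module Installed"]
    empty = [loc for loc, size in devices if size == "No Module Installed"]
    return populated, empty


# Illustrative sample only (assumed slot names), not real output from this build.
SAMPLE = """\
Handle 0x0020, DMI type 17, 40 bytes
Memory Device
\tSize: No Module Installed
\tLocator: DIMMA1
\tBank Locator: P0_Node0_Channel0_Dimm0

Handle 0x0021, DMI type 17, 40 bytes
Memory Device
\tSize: 64 GB
\tLocator: DIMMD1
\tBank Locator: P0_Node0_Channel3_Dimm0
"""

populated, empty = split_by_presence(parse_memory_devices(SAMPLE))
print("populated:", populated)
print("empty:", empty)
```

On a real system the text would come from running `dmidecode -t memory` as root and passing its output to `parse_memory_devices`. If the empty list matches the slots the IPMI page is missing, the firmware and the BMC at least agree about which DIMMs were not detected.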
 

RageBone

Active Member
Jul 11, 2017
570
141
43
Test each slot individually with only a single stick of memory.
Look into the socket for bent pins, and reseat the CPU like rollo said.
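
If it helps to keep track of the single-stick tests, here is a rough sketch in Python of how the results could be tallied. It assumes you write down each attempt as (slot, stick, detected). A slot that never detects any stick is suspect, and so is a stick that is never detected in any slot. The slot and stick labels are hypothetical placeholders.

```python
from collections import defaultdict


def classify(observations):
    """observations: iterable of (slot, stick, detected) tuples.

    Returns (bad_slots, bad_sticks): slots where no stick was ever detected,
    and sticks that were never detected in any slot.
    """
    slot_results = defaultdict(list)
    stick_results = defaultdict(list)
    for slot, stick, detected in observations:
        slot_results[slot].append(detected)
        stick_results[stick].append(detected)
    bad_slots = sorted(s for s, seen in slot_results.items() if not any(seen))
    bad_sticks = sorted(s for s, seen in stick_results.items() if not any(seen))
    return bad_slots, bad_sticks


# Hypothetical test log: two sticks, each tried alone in two slots.
observations = [
    ("A1", "stick-1", False),
    ("A1", "stick-2", False),
    ("D1", "stick-1", True),
    ("D1", "stick-2", True),
]

bad_slots, bad_sticks = classify(observations)
print("suspect slots:", bad_slots)
print("suspect sticks:", bad_sticks)
```

With this made-up log, slot A1 is flagged and neither stick is, which would point at the slot (or the socket pins behind it) rather than the memory. It only works if each stick is tried in more than one slot.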
 

thatch45

Thanks!
I do not have a misplaced motherboard standoff, I checked before I posted
I can reseat the CPU, I will give that a shot
There are no errors during post and no errors in the IPMI


The big question I have is this:
Milan does not like Hynix LRDIMM

Really? What's that about? I will look that up and get back to you all :)
Thanks!
 

RolloZ170

Well-Known Member
Apr 24, 2016
1,935
501
113
55
The big question I have is this:
Milan does not like Hynix LRDIMM
Setting the memory to a lower frequency will usually solve this issue.
But if your system shows no errors and DIMMs are still hidden, then they are not properly connected.
 

Bjorn Smith

Well-Known Member
Sep 3, 2019
539
281
63
48
r00t.dk
not to forget: a misplaced motherboard standoff ?
Yes - I had that issue with a Supermicro board, and it only affected the RAM. Whenever I plugged something into certain slots, the board complained. Being silly, I called Supermicro and they suggested this, and voila - I had a misplaced motherboard standoff that was short-circuiting some of the RAM slots. Luckily nothing was permanently broken.
 

nabsltd

Active Member
Jan 26, 2022
156
87
28
Yes - I had that issue with a Supermicro board - only affected the RAM - so whenever I plugged something into certain slots
I'm still actively using an X9SRA that wouldn't allow me to use every DIMM slot until I wedged something at the edge of the board.

I had not bench tested the board before install. My first thought was a short caused by a standoff, since the board worked when I removed it and bench tested it, but careful inspection showed the only standoffs present were ones that were used with screws. When I put the board back in but did not screw it down (a quick check for shorts), it worked. As soon as I screwed it down, I lost some DIMMs again. After loosening the screws near the DIMM slots, they worked again.

Again, I thought it was a short, so I switched to plastic push-pin style standoffs, and the same DIMMs wouldn't work as long as I had the mounting holes near the DIMMs "locked down" in some way (either through screws or the tight pull of the plastic standoff). At some point in the testing, I finally tried a wedge under the side, as I was now fairly sure it was a "bending" issue, and that solved it. It's been that way for around 6 years, running great.
 

thatch45

Thanks @nabsltd, I think that is it. The chassis I am using is second-hand and I think part of it may be rising up and touching the motherboard. The problem came back when I installed a video card; now it can only see 128 GB of RAM.
I will need to figure out how to add some insulation or tighten where the chassis is loose.
 

thatch45

Nope, I installed some washers and now only 2 sticks show up. Hopefully the board is not damaged and the RAM is not bad. I have another server to build with the same board; that should tell me whether the issue is with the motherboard or the RAM.
 

thatch45

You guys have been fantastic!
I have reseated the CPU and carefully inspected it and the socket, I was unable to find any damage, dirt, oil, etc.
I have tried multiple memory configurations
I have reseated all memory sticks multiple times
I have installed washers below the motherboard pegs on the side with the cpu and memory

I have tried out multiple DIMM slots, this is where things get interesting...

Slots A, B, and C consistently show up as Not Present in the BIOS. IPMI and the board still do not show any errors.

I am starting to think I got a bad motherboard.

Again, you guys have been fantastic! Any other ideas? Or do you think I should be looking at swapping out this motherboard?
 

RageBone

Such problems can be due to the board or the CPU.
You could test both with confirmed good components; that is the only way to be sure that I know of.

Pictures of the board and socket could lead to more, because just because you don't see anything doesn't mean that we won't see anything.

It is very weird that you are not getting any errors in a log. If the sticks were detected and powered, you should be getting messages about that.
That you aren't could be because they aren't powered and/or detected.
With such things, I would usually expect the whole group to not work.
Only A, B and C being consistently not detected is weird though.

A more common issue I see is physical damage from rough handling.
Traces can get scratched and components ripped off just by sliding the board around on the standoffs in a case.
Pictures of both the front and back of the board would be good for spotting such damage, should that be the case.
 

oneplane

Active Member
Jul 23, 2021
268
125
43
This is very tricky considering the huge number of interfaces and traces that could affect this. Even a single pin in the CPU socket or DIMM slot could do it (e.g. if it messes up the register chip on the DIMM or a Vdd/Vcc line). The worst would be a hairline fracture in one or more traces in an internal layer: you'd never be able to see it, and even on an X-ray it might not show up.

Testing RAM + CPU on another board might validate those, the other way around (different CPU and RAM on your current board) could also do that, but both require extra parts which people usually don't have lying around.
 

Terry Kennedy

Well-Known Member
Jun 25, 2015
1,118
569
113
New York City
www.glaver.org
Slots A, B, and C consistently show up as Not Present in the BIOS. IPMI and the board still do not show any errors.
I don't know if they still do this, but some older Supermicro boards (I'm familiar with the X8 series) routed the SPD SMBus of some of the memory modules to some of the expansion slots. This meant that if there was a conflict (Dell HBAs were particularly notorious for this*), certain memory slots would show as empty if the conflicting controller was installed. Have you tested the system with just the motherboard, CPU(s) and memory, without any expansion cards installed?

* Some people like to blame Dell. But there is no overall governing body to avoid conflicts between various bus IDs/slave addresses. Also, Dell never sold those controllers as add-in cards for 3rd-party systems - we just buy them because they're usually inexpensive and widely available.
 

thatch45

I installed the CPU and RAM from the system in question on another board, and the exact same issues showed up: the same DIMM slots were bad. So I concluded that the RAM is either bad or, more likely, not compatible with the board. I returned the RAM and ordered some DDR4-3200 direct from Supermicro to replace it. It was a little more expensive, but I suspect it will not have the same issues.
 

thatch45

I also did try it with and without any adapters - same issues. I also looked at the board and could not see any damage, and I ruled that out after the second board gave me the same issues with the same CPU AND a different CPU
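
The elimination reasoning here can be written down as a small sketch in Python: record each build as a set of parts plus whether it worked, then keep only the parts that appear in every failing build and in no working build. The part labels below are hypothetical stand-ins for the two boards, two CPUs and the one RAM kit described in this thread.

```python
def suspects(trials):
    """trials: list of (components, ok), where components maps part type to a label.

    Returns the set of (part_type, label) pairs present in every failing trial
    and absent from every passing trial.
    """
    failing = [set(parts.items()) for parts, ok in trials if not ok]
    passing = [set(parts.items()) for parts, ok in trials if ok]
    if not failing:
        return set()
    common = set.intersection(*failing)
    for parts in passing:
        common -= parts  # a part that also worked elsewhere is less likely at fault
    return common


# Hypothetical labels; each trial had the full RAM kit installed and failed.
trials = [
    ({"board": "board-1", "cpu": "cpu-1", "ram": "lrdimm-kit"}, False),
    ({"board": "board-2", "cpu": "cpu-1", "ram": "lrdimm-kit"}, False),
    ({"board": "board-2", "cpu": "cpu-2", "ram": "lrdimm-kit"}, False),
]

print(suspects(trials))
```

With these three failing builds, the only part common to all of them is the RAM kit. Note that without a passing build using different memory, this cannot tell defective RAM apart from RAM that is simply incompatible with the platform.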
 

Terry Kennedy

I also did try it with and without any adapters - same issues. I also looked at the board and could not see any damage, and I ruled that out after the second board gave me the same issues with the same CPU AND a different CPU
It is good to have sufficient spares to be able to rule things out. Some years ago I helped a friend with his 1st-generation Zen build and he cycled through several motherboards, CPUs and RAM until we got a combination that worked. It turned out that the early hardware compatibility lists and BIOS versions were extremely picky about RAM, beyond what was listed in the HCL.