A1SAi-2750F Crashing with 32GB RAM


Mr. F

Active Member
Sep 5, 2011
172
30
28
I was running the Avoton board with 16GB of Kingston ECC SODIMMs and everything was fine.

I then installed 32GB (4x 8GB) of SK Hynix RAM, part #HMT41GA7BFR8A-PB, which is the exact 8GB module that Supermicro lists as qualified on their page. Hyper-V Server now freezes hard at the spinner, at the CTRL-ALT-DELETE screen, or after it gets far enough to let me log in and do a few things with PowerShell. In all cases it's a hard freeze - no BSOD, no logs.

My first thought was bad modules, so I ran memtest86, which reported 0 errors. I then booted a CentOS LiveCD and had a play around - no issues there.

I tried taking it back down to 16GB, swapping modules around, and everything was fine again with Hyper-V Server. I'm back to 32GB now, and ran "Windows Memory Diagnostic" before booting - crashed again.

Any ideas as to what's going on here? I'm going to try mixing the Kingston modules in with the SK Hynix modules next...
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,511
5,792
113
Let me think about it a bit. I have 3-4 of this family of boards with 32GB and another 2-3 with either 2x 8GB and 2x 4GB or 4x 4GB. Seems really odd.

Are they all seated properly? I have seen that be an issue before.

Is this only with Windows? Seems like that is a BIOS setting if so.
 

Mr. F

I tried seating multiple times. Swapped modules around too.

So far, it's only in Windows. I may let memtest run overnight with multiple passes if I can't figure it out before bed, but the first memtest run went about 6 hours and passed.

One thing I thought of: memtest definitely didn't load the i354 LAN drivers, and CentOS (an older LiveCD) possibly didn't either, while Hyper-V Server would have. Also, both of those booted without UEFI, whereas I believe Hyper-V Server is booting via UEFI.
 

Patrick

Good things to test. Are you on the latest BIOS revision?
 

Mr. F

Yes - updated BIOS and IPMI to the latest available from SuperMicro. Going to try mixing the Kingston modules in now for a 32GB total.
 

Patrick

If that does not work, I would also try sending SM a note.
 

Mr. F

2 Hynix + 2 Kingston modules has been running for about 15 minutes and hasn't crashed.

Tried the Hynix modules two at a time (16GB) to rule out bad modules again - no issues.

So as far as I can tell at this point, unless the Hynix + Kingston config crashes, the problem lies with running 4 of the Hynix modules with Hyper-V Server 2012 R2. Bizarre.
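
The results so far can be read as a small test matrix. The sketch below is illustrative only: the labels H1-H4 and K1/K2 and the exact pairings are assumptions, since the posts don't say which module went in which test. It shows that every module has appeared in at least one stable configuration, so the 4x Hynix failure can't be pinned on a single stick from these results alone.

```python
# Test matrix reconstructed from the thread.
# Module labels (H1..H4 = Hynix, K1/K2 = Kingston) and pairings are hypothetical.
tests = [
    ({"K1", "K2"}, True),               # original 16GB Kingston setup: stable
    ({"H1", "H2", "H3", "H4"}, False),  # 32GB, all Hynix: freezes in Hyper-V Server
    ({"H1", "H2"}, True),               # 16GB Hynix pair: stable
    ({"H3", "H4"}, True),               # other 16GB Hynix pair: stable
    ({"H1", "H2", "K1", "K2"}, True),   # 32GB mixed: stable so far
]

def cleared_modules(tests):
    """Modules that took part in at least one stable configuration."""
    cleared = set()
    for modules, stable in tests:
        if stable:
            cleared |= modules
    return cleared

def unexplained_failures(tests):
    """Failing configurations made up entirely of modules that passed elsewhere."""
    cleared = cleared_modules(tests)
    return [sorted(modules) for modules, stable in tests
            if not stable and modules <= cleared]

print(sorted(cleared_modules(tests)))
print(unexplained_failures(tests))
```

If a failing configuration consists only of cleared modules, the combination (or something outside the modules, such as the OS install or firmware settings) is the more likely suspect than any one stick.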
 

Chuckleb

Moderator
Mar 5, 2013
1,017
331
83
Minnesota
A few years back I bought a cluster, and in certain configurations we could cause the motherboard to die. It was odd: we had some code that, when run, would cause memory failures, and they tracked it down to how SM designed their memory circuit. They came in and soldered some parts onto a few hundred nodes over a day, which fixed the problem. It was pretty cool... I told them that we could kill 100% of the compute nodes we ran it on, and it did.

Anyway, problem fixed. They are good at tracking these things down. Supermicro is an awesome company.
 

Mr. F

Cool!

Still running with the 2 Hynix + 2 Kingston setup. I emailed support@supermicro.com with my issue. Hopefully they'll get back to me with some info.
 

NeverDie

Active Member
Jan 28, 2015
307
27
28
USA
How are you cooling it? I ask because I wonder if heat could be an issue.

Consider:
1. DIMMA2 and DIMMB2 are close to the CPU heatsink, which can get pretty hot even just idling. Also, with memory modules in those slots, the heatsink can't dissipate heat as well and will get even hotter.
2. You didn't experience the problem until those memory slots were used.

Possibly the heatsink is overheating the memory, or vice versa, or both.

The specification on your SM board says: "Operating Temperature: 0°C - 60°C". Try measuring it. It's not hard to imagine that 60°C is being exceeded.

If you notice, on the micro-ATX boards SM uses a larger heatsink, and it doesn't seem to get as hot as the one on the mini-ITX board that you're using. On the A1SAM-2750F, SM even put a fan on the CPU heatsink. Why would SM do that on that board if the passive heatsinks on the other SM boards are sufficient?

Anyhow, it's an easily testable hypothesis: use a heat gun to see if you can reproduce the problem and/or use aggressive cooling to see if it prevents the problem from occurring.
 

Mr. F

I have a 40mmx10mm fan cooling it.

According to IPMI, my CPU is at 18°C idle and peaks at around 23°C under load; the RAM modules are between 22°C and 27°C. I've got the event log running and it hasn't captured any over-temperature events.
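
As a rough sanity check against the 0°C - 60°C figure quoted from the board spec, here is a minimal sketch. The readings are typed in by hand from the numbers above, the sensor names are made-up labels, and nothing here actually queries IPMI. The spec figure may also refer to ambient rather than component temperature, so treat this as a ballpark comparison only.

```python
# Readings copied by hand from the post; sensor names are hypothetical labels.
OPERATING_RANGE_C = (0.0, 60.0)  # range quoted from the board spec in this thread

readings_c = {
    "CPU idle": 18.0,
    "CPU load": 23.0,
    "RAM min": 22.0,
    "RAM max": 27.0,
}

def out_of_range(readings, low_c, high_c):
    """Names of readings that fall outside [low_c, high_c]."""
    return [name for name, value in readings.items()
            if not (low_c <= value <= high_c)]

def headroom(readings, high_c):
    """Degrees C remaining below the upper limit for each reading."""
    return {name: high_c - value for name, value in readings.items()}

low, high = OPERATING_RANGE_C
print(out_of_range(readings_c, low, high))
print(min(headroom(readings_c, high).values()))
```

With these numbers nothing is out of range, and the hottest reading is still more than 30 degrees below the limit.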
 

NeverDie

I just now noticed that on the current SM website, the A1SAi-2750F is pictured as having a heatsink but no fan:




whereas Patrick's November 14, 2013 review of the same board shows it with a fan on it:



So, which is the proper configuration: with a fan or without? Did SM change the product offering without changing the product name?

According to IPMI my CPU is at 18°C idle - peaks to around 23°C under load, RAM modules are between 22°C and 27°C. I've got the event log running and it hasn't captured any over temp events.
Are you sure? Those seem like very low temperature values. For instance, 18C = 64.4F. Your room temperature would have to be colder than that for the idle temperature to be that. Is your ambient air temperature where you have the computer normally that cold?
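
The conversion is easy to double-check. A minimal sketch (the function names are just illustrative):

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0

def f_to_c(fahrenheit):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32.0) * 5.0 / 9.0

# Temperatures mentioned in this thread, in Celsius.
for c in (18, 23, 27, 60):
    print(f"{c} C = {c_to_f(c):.1f} F")
```

Since a fan-cooled part can't idle below the air around it, an 18°C idle reading implies a room at or below roughly 64°F.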
 

Mr. F

Ran all night with 2 Hynix + 2 Kingston. Seems the problem is definitely related to running 4 of the Hynix modules in Windows.

Are you sure? Those seem like very low temperature values. For instance, 18C = 64.4F. Your room temperature would have to be colder than that for the idle temperature to be that. Is your ambient air temperature where you have the computer normally that cold?
Those values are direct from the IPMI page. The system is in my basement and it's winter. It's cold down there.
 

Patrick

whereas Patrick's November 14, 2013 review of the same board shows it with a fan on it:
I think I have posted this a few times before, but that fan is actually something the SM lab guys use; the standard part does not have the fan. I got my first few boards before the big shipments came.
 

NeverDie

Apologies for doubting your temperatures then. I recently received this exact board, and mine runs so hot even with just two memory slots filled and even when just idling that I'm wondering whether to keep it or not, because I've also been testing a micro-ATX C2758 SM motherboard that doesn't have the same amount of concentrated heat. Both are presently sitting open-air (no cases) in the indoor living environment without any fans blowing on them, so plainly my test environment is far different.
 

DolphinsDan

Member
Sep 17, 2013
90
6
8
You need SOME airflow on them, just not a lot. You can put a silent 120mm fan atop the RAM and CPU and you'd be fine.

In the SM 1U's you'll see they have 2 fans over the HSF just to keep airflow up if one fan fails, but they spin really slow.
 

Mr. F

The boring outcome of this is that somehow changing the RAM to 4 sticks of Hynix when it was previously 2 sticks of Kingston caused Windows to freak out, and that was the cause of the crash. I don't get why it still worked with 2 Kingston + 2 Hynix - I even moved the Kingstons to different slots and it still worked with 32GB.

The 8GB Hynix modules now seem to be working fine - which is great because they were only about $40 apiece!

I moved all of the VMs to a different host and did a wipe and reinstall of Hyper-V Server 2012 R2. Everything seems fine now - I'm going to let it run for a few days before I put anything important back on it.
 

PigLover

Moderator
Jan 26, 2011
3,184
1,545
113
I'd guess one of the Hynix sticks has a fault. If you can put them back in and run a couple of passes of memtest86+, you'll probably be able to identify which one.
 

Mr. F

They've already run memtest for hours and hours and passed. I actually wasn't even considering an OS reinstall until Supermicro asked me to try it. Maybe I'll run another day of memtest just to be sure...

That was the strangest part of this ordeal - my already-installed Hyper-V Server was the only OS that had issues, and it was always just after boot. LiveCDs and memtest booted and ran just fine.