X10QBI and v3/v4 cpus (e.g. supermicro sys-4048b-trft)

tconrado

New Member
Jul 17, 2020
16
2
3
hey @tconrado did you manage to get this fixed?

I just did some critical server maintenance yesterday, and when I put it all back together again I got a DC detect failure for the first time ever. It turns out that you get that problem if a memboard isn't inserted all the way into the mobo. It also turns out that some memboard connectors are thicker than others!!!! One of my memboards can literally only fit into one slot and none of the others (it won't even go in when I sit on it; yes, I did that, don't try that at home).

So when I returned that memboard to the slot where it "belonged", everything worked fine after that. Maybe it was a similar case for you? If it were up to me, I would check all the pins in the mobo connector. Memboard issues are almost always an issue of geometry.

Also, after taking the motherboard out I noticed that there were more standoffs installed on the backboard than necessary. Now I bought this system entirely prebuilt. These metal standoffs were TOUCHING THE METAL TRACES ON THE BOTTOM. I was absolutely furious. Whoever built my system deserves a slap in the face. But anyways after taking the unnecessary ones out, all my system's issues spontaneously resolved. This could also be the case for you!! Maybe the metal standoffs have been shorting something important!!

To anyone else reading this: if you get a persistent VBAT lower non-recoverable (NR) error even after swapping in a new CMOS battery and updating the IPMI firmware and BIOS (which is the first thing Supermicro will tell you to do), and there is no visible damage to the motherboard, check the standoffs under the motherboard.
ohhhhhh, I did solve it, in much the same way. I even went to Supermicro, who made me buy v2 CPUs, so I tried v2/v3/v4 CPUs... all failed... then we swapped the motherboard and it worked. One of the things I asked the DC guys to do was check the grounding points, because a grounding problem would explain this result very well...

so the best part: did u notice any difference between the revs of the boards? I think Supermicro changed the boards and didn't properly mention it!
 

tconrado

BTW, I think I spent too much on my 8890v4 CPUs; those are beasts, but IOPS are the bottleneck and it fits "only 7 PCIe buffer drives".
It is working well; I can post some stats if anyone is interested...
4 x 8890v3; 1TB of RAM DDR3; 4 memboards, so yes, this is a board where u can save on RAM and go wild on CPU.
Asking $7500
 

lvx4

New Member
Jun 13, 2020
2
0
1
For those looking to watercool, Asetek makes a narrow ILM bracket for their coolers. So you can (theoretically, if the pump/waterblock fits in that socket spacing) fit most Corsair/NZXT/EVGA/etc AIOs.

There's no need to hack together cooling solutions.
 

angel_bee

Member
Jul 24, 2020
37
18
8
This is so much fun.

For a beginner like me, can you please provide a minimal parts list to get started? So of course the X10QBi motherboard. But how many memory boards do you need? Can you start with one and still operate with multiple CPUs? Do you need the official Supermicro power supply, or will any other one work?

I had to look up what a BMC board is and found this nice article.
Explaining the Baseboard Management Controller or BMC in Servers
But what I don't get is whether you can run a setup without a BMC.

I feel like a little kid again, figuring out all kinds of new stuff :)

If I have to create a new thread for these questions, just let me know.
im happy to clarify any questions that might not be immediately obvious, though i think your questions about the cpus and memboards have already been addressed in the previous pages. there is a lot of valuable information there. i really suggest reading the whole thread thoroughly before you buy any parts.

im pretty sure you can install as few or as many memboards as you want, but there are 2 memboard slots per cpu and a cpu without a memboard will not be active (someone correct me if im wrong). the x10qbi manual from supermicro has a comprehensive map of which memboard slots belong to which cpu. so you need at least 4 memboards if you want all 4 cpus to be running.

@jpk has noted before that you probably can't start with only one cpu; you need 2, 3 or 4. it could be because the board is technically 2 dual-socket boards combined

I wanted to post an update here.
I did get things working. It was similar to @gmaxwheel's experience: I got a controller card from the seller that had had the IPMI password reset, and I was able to flash the BIOS from within that, and now it boots!

Some of my observations about the system:

CPU/memory stuff
  • Apart from the long initialization before you get to the BIOS, it behaves a lot like a regular server.
  • That long initialization time is actually the memory cards: if you take out the memory cards (or possibly just don't populate them), the lines for those empty CPUs stay at 00:00 instead of going through the BA:0F etc.
  • It does recognize the CPUs in the system that don't have any memory cards: they show up as normal to the OS and are present in the BIOS.
  • It will boot with only 2 CPUs, as long as the second CPU is in either socket 3 or 4 - just having them in sockets 1 & 2 doesn't work. It also works with 3 processors (I only tested with sockets 1,2,3, but I don't see why it wouldn't work with 1,3,4 or 1,2,4) But it does require that socket 1 is filled. It kind of behaves like this might be two dual socket boards put together, and each set of sockets needs to have a CPU to boot fully.
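The socket observations above can be written down as a small rule check. This is only a sketch of what was reported in this thread, not documented Supermicro behaviour: the function name `can_boot` is made up for illustration, the rule itself (socket 1 filled, plus at least one of sockets 3/4) is inferred from the tests described, and the 1,3,4 and 1,2,4 layouts were never actually tried.

```python
def can_boot(populated_sockets):
    """Guess whether the board boots with the given CPU sockets filled.

    Rule inferred from the observations in this thread (an assumption,
    not vendor documentation): socket 1 must be filled, and at least one
    of sockets 3 or 4 must be filled as well.
    """
    sockets = set(populated_sockets)
    if not sockets <= {1, 2, 3, 4}:
        raise ValueError("socket numbers must be in the range 1-4")
    return 1 in sockets and bool(sockets & {3, 4})


if __name__ == "__main__":
    # Layouts mentioned in the thread, plus the single-CPU case.
    for layout in ([1], [1, 2], [1, 3], [1, 4], [1, 2, 3], [1, 2, 3, 4]):
        verdict = "boots" if can_boot(layout) else "does not boot"
        print(layout, "->", verdict)
```

If someone tests 1,3,4 or 1,2,4 and it fails, the rule needs adjusting; treat it as a summary of reports rather than a spec.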
pci-e stuff
  • it happily boots off an NVME M2 stick on a pci-e card off a pci-e slot attached to CPU3/4
  • I don't know why they put pci-e slots in between the center memory boards, because even with the extremely low profile card I got (only slightly bigger than the NVME M2 stick) the heatsinks on the memory cards nearly touch the memory card next to it, and so the card won't fit.
Noise/power
  • noise level isn't bad, but it is definitely louder than my other 846 (4U) machine. The four 92mm fans in the middle won't run any slower than just over 6k rpm (and the three 80mm fans that are supposed to pull air over the middle memory cards run at about the same speed). I'm pretty susceptible to that kind of noise, and there's no way I could have it in the same room as me for a long period of time.
  • you can run it off of a single power supply (assuming you don't use more power than a single one can supply)
  • It runs at about 320W idle booted into linux with 4 CPUs, 8 memory cards and 16x 16GB dimms.
  • Each set of two memory cards seems to take about 40W at idle - running only 2 CPUs + a pair of memory cards it idled just under 200W, and with 2 pairs of memory cards, it's idling just under 240W
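The idle figures above are enough for a rough back-of-the-envelope model. The sketch below assumes idle power grows linearly with the number of memory card pairs and rounds the "just under" readings up to 200W and 240W; the function names are hypothetical and the numbers are one person's wall measurements, so treat the output as a ballpark only.

```python
# Approximate idle readings reported above, in watts ("just under" rounded up).
IDLE_2CPU_1PAIR_W = 200  # 2 CPUs + 1 pair of memory cards
IDLE_2CPU_2PAIR_W = 240  # 2 CPUs + 2 pairs of memory cards


def watts_per_memboard_pair():
    """Extra idle draw of one additional pair of memory cards."""
    return IDLE_2CPU_2PAIR_W - IDLE_2CPU_1PAIR_W


def estimate_idle_watts(memboard_pairs):
    """Linear estimate of idle draw for a given number of memory card pairs."""
    if memboard_pairs < 0:
        raise ValueError("memboard_pairs must be non-negative")
    base = IDLE_2CPU_1PAIR_W - watts_per_memboard_pair()
    return base + memboard_pairs * watts_per_memboard_pair()


if __name__ == "__main__":
    for pairs in range(1, 5):
        print(pairs, "pair(s):", estimate_idle_watts(pairs), "W")
```

With 4 pairs (8 memory cards) the estimate lands on the roughly 320W reported for the full system, though that reading was taken with 4 CPUs installed, so the agreement may be partly coincidental.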
Weird stuff
  • It does not think my E7-8857 CPUs have hyper-threading (the line for HT is simply not present in the bios) which seems really weird. Has anyone else seen this before?
  • Does anyone have any experience putting different CPUs in different sockets in these? I'm wondering if it might be possible to have something different in sockets 3/4... I know from past experience with older dual-socket systems (x56xx), you could put mismatched CPUs in as long as they had the same amount of L2 cache (and I think same wattage needs)

Does anyone have some spare E7 v3/4 CPUs they would be interested in selling or lending me? I'd like to make sure that my system works well with v3/4 CPUs before I spend a lot of money on some larger ones.

Next things I want to look at are memory performance on only one card versus two - that might open up a couple of slots in the middle a little bit.
as for the power supply, the motherboard has the standard ATX connectors that you typically find on consumer PSUs, with the exception of special power connectors such as SATA-DOM. i would not recommend getting a consumer PSU instead of the 1620W redundant power supply that is designed for it, because to operate 4 sockets you need 8 EPS (8-pin) power connections. this is too much for many typical consumer power supplies, as the system routinely draws 1200W+. that being said, you can probably get away with a 1600W+ power supply.

but then the question is, where are you going to put it??? the supermicro case for the x10qbi (CSE-848X) does not have room for a consumer ATX power supply, and there are no other cases that will fit the x10qbi. it's a chungus motherboard that measures 19" x 17" (48.26cm x 43.18cm). I recommend being sensible and just using the supermicro case intended for the x10qbi, unless you have easy access to metal modding... well in that case I would like your contact details because I have some mod jobs to be done :p
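For anyone who wants to sanity-check the power and size figures above, here is a tiny sketch. It uses only the numbers quoted in this post (1200W draw, 1620W supply, 19" x 17" board); the helper names are made up for illustration, and real draw depends heavily on CPUs, memory and cards.

```python
CM_PER_INCH = 2.54


def psu_load_fraction(draw_watts, psu_rated_watts):
    """Fraction of a PSU's rated output used at a given draw."""
    if psu_rated_watts <= 0:
        raise ValueError("psu_rated_watts must be positive")
    return draw_watts / psu_rated_watts


def inches_to_cm(inches):
    """Convert inches to centimetres, rounded to two decimals."""
    return round(inches * CM_PER_INCH, 2)


if __name__ == "__main__":
    # 1200W draw on the 1620W Supermicro supply mentioned above.
    print(f"load on a 1620W PSU: {psu_load_fraction(1200, 1620):.0%}")
    # Board dimensions quoted above.
    print(f"board: {inches_to_cm(19)}cm x {inches_to_cm(17)}cm")
```

At 1200W the 1620W supply sits at roughly three quarters of its rating, which is why a smaller consumer unit leaves very little headroom.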

and no, the system will not boot without the BMC card. it should come with all prebuilt and pretested systems. the only time when it doesnt is when you're buying a barebones system.

**MINIMAL COMPONENTS**
- the mobo
- cpu heatsinks
- enough memboards and memory for the number of CPUs you want to use. as usual, make sure the DIMMs are the same type for each channel
- sufficient power supply
- if you use a supermicro power supply, make sure the case contains a power distribution board on the ground floor.
- if you plan to use discrete graphics or other devices that need their own power, the power distribution board has 8 modular connectors at your disposal.
- if you plan to use the 24 drive bays, you need a SAS backplane + SAS/raid controller. you can also use the onboard SATA ports, although cable management will be difficult because you also need to route SATA power.
- BMC card - this has onboard LAN and single-output VGA graphics
- fans for the midplane (92mm)
- strongly recommend rear fans to help suck air out (80mm)

excerpt from the official supermicro website (4048B-TRFT | 4U | SuperServer | Products | Super Micro Computer, Inc.):

[image: 1603371038821.png]
 

angel_bee

ohhhhhh, I did solve it, in much the same way. I even went to Supermicro, who made me buy v2 CPUs, so I tried v2/v3/v4 CPUs... all failed... then we swapped the motherboard and it worked. One of the things I asked the DC guys to do was check the grounding points, because a grounding problem would explain this result very well...

so the best part: did u notice any difference between the revs of the boards? I think Supermicro changed the boards and didn't properly mention it!
wow. what a disaster. yeah you'd think that supermicro tech support would be a bit more helpful... but unfortunately they aren't

im guessing they asked you to try xeon v2s because v3/v4s are not officially supported?
lmao what a joke.
 

synchrocats

New Member
Oct 20, 2020
4
5
3
From my experience the system doesn't start with one CPU. I haven't tested a two-CPU config, but 3 CPUs started with 3 memboards with 4 DIMMs installed in each memboard. For some unknown reason, when I installed 6 memboards the rear fans stopped and never turned on again, so I removed them. The midplane fans were enough to cool the system under load at 20% PWM and 4500 RPM when I tested them with an external PWM controller, before they were removed completely.
 

synchrocats

I played with the midplane fans' PWM back when I had the standard heatsinks. Once I got watercooling, the midplane fans were removed completely.
 

sth-n00b

New Member
Sep 30, 2020
8
0
1
I found this company which sells the Supermicro CSE-848X X10QBI.
Serverschmiede.com GmbH
So a server with 4 x Intel Xeon E7-4880v2 CPUs, heatsinks, 4 x 4GB DIMMs, 8 memory boards, backplane, BMC card, 4 x 1620W power supplies and the case itself would be €691. Shipping to my country is (only) €29, so for €720 I would have a deafeningly loud running server with 60 cores :) Sounds like a good deal, but I would love to hear anybody's opinion on this.
 

angel_bee

sounds good. this will save you all the hassle of piecing everything together, and I think it's cheaper to get all of this in one go anyways.
make sure that you have enough ram for 60 cores lol, and if you want to use the 24 drive bays, buy the drive trays and a raid controller too.
 

otspadmin

New Member
Nov 9, 2020
7
2
3
you have watercooling, that doesnt count :p

P.S. and yeah omgosh. like... i tried 1% PWM before and the fans are still around 1200RPM hahaha these fans cannot be stopped
@angel_bee Is there any way you could help me get my PWM fans down to 1200RPM? haha
I picked up an X10QBi for work, and the raw IPMI commands I found on other threads are not taking effect. So currently all 7 of my fans are running at 6000-7000RPM, which is painfully loud hahaha.

Any advice you can give me?
Thank you,
Jason
 

angel_bee

Hi Jason,

Congratulations on being a proud owner of a new x10qbi

Unfortunately, no. The firmware is hard-coded to run the fans at a minimum of 50% speed and no raw IPMI commands will be obeyed. It is possible that this behaviour is specific to BIOS versions newer than ~2019, because many other people reported raw IPMI commands working up until ~July 2020.

I do not know of any reliable source where u can get an old BIOS to flash. And if you have v3/v4 CPUs (and not v1/v2), u need the newest BIOS for maximum compatibility anyways.

The best bet is to get a fan controller and hook up only the fans' PWM wires to it. I know a guy who sells cheap-ish fan controllers with customisable curves, so if you are interested, u can msg me about that.
Know that some of your components aside from the CPUs, especially the RAID card, need a large amount of airflow, otherwise they will malfunction. Depending on how much heat your CPUs generate, there will be a limit to how low you can set the fan speeds before the CPUs simply get too hot.

But that being said, I have ziptied a cheap 40 mm fan to my raid card and my fans are running at 2000-4000rpm these days and it's fine. Just be sure you know what you're doing!
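To give an idea of what a "customisable curve" on an external fan controller does, here is a minimal sketch of a piecewise-linear temperature-to-PWM mapping with a minimum duty floor. Everything in it is hypothetical: the curve points, the 20% floor and the function name `fan_duty` are made up for illustration and do not come from any particular controller or from Supermicro firmware.

```python
from bisect import bisect_right

# Hypothetical curve: (temperature in deg C, PWM duty in percent).
# These points are illustrative only; pick your own after watching temps.
EXAMPLE_CURVE = [(30.0, 20.0), (50.0, 40.0), (70.0, 100.0)]


def fan_duty(temp_c, curve=EXAMPLE_CURVE, min_duty=20.0):
    """Interpolate a PWM duty (percent) for a temperature along the curve.

    Below the first point the first duty is used, above the last point the
    last duty is used, and the result never drops under min_duty.
    """
    temps = [t for t, _ in curve]
    if temp_c <= temps[0]:
        duty = curve[0][1]
    elif temp_c >= temps[-1]:
        duty = curve[-1][1]
    else:
        i = bisect_right(temps, temp_c)
        (t0, d0), (t1, d1) = curve[i - 1], curve[i]
        duty = d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)
    return max(min_duty, min(100.0, duty))


if __name__ == "__main__":
    for temp in (25, 40, 60, 80):
        print(f"{temp} C -> {fan_duty(temp):.0f}% PWM")
```

The floor matters here: as noted above, the RAID card and other components still need airflow even when the CPUs are cool.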
 

otspadmin

Thank you for the quick response!! Interestingly enough, I am running v2 CPUs; the bigger problem would be finding an older BIOS to revert to for testing. This server will not necessarily have a significant load on it, but I completely understand the need for proper cooling. If I were able to bring the fan speeds down, I would monitor the temps closely to ensure there are no thermal issues.

The only question I have for you, assuming I wanted to go down this path of BIOS reverting: do you think I would need to flash both the BMC and the X10QBi BIOS to earlier versions, or just the X10QBi BIOS? (I'm assuming the latter)
Thanks again
Jason
 

angel_bee

I have scoured the internet for old BIOSes before, but returned bitter and empty-handed lol

You can try at your own risk, as I have no experience in BIOS downgrading whatsoever. U have to make sure the source is legit or u risk bricking your BIOS.

In the event you do manage to downgrade your BIOS, I wouldn't be surprised at all if the BMC firmware becomes mismatched with the BIOS version and something breaks. The typical rule of thumb is to flash the BMC first and the BIOS second. Don't ask me why. Supermicro makes the rules. It might just be a matter of tinkering around with different combinations until you get it right, which in my opinion is not worth the risk of bricking the BMC. However, if you brick the BIOS, you can still use Supermicro's custom IPMI tool to force a BIOS update.

Hope that helps
 

otspadmin

[image: Capture.JPG]

Above are my current BMC firmware and X10QBi BIOS versions.
Well, after several different firmware/BIOS changes, these are the versions I am currently running (successfully), and still no dice on changing the zone via raw commands.
I'm very curious whether there is anything specific about the X10QBi that is stopping the raw commands from taking effect.

le sigh
 

angel_bee

wow lool i really admire your bravery!

i think the BMC and BIOS versions could still be too new... i know the dates say 2017/2018 but the x10qbi came out around 2014 i think
 

otspadmin

Welp,

Going to throw the server up in a location that doesn't care about noise haha. I would rather have the latest BIOS/firmware. (the 2015 BIOS still didn't work)
oh well. Going to try some Noctua fans really quick, but I would rather have proper cooling and the latest firmware than jury-rigging this.
Thanks for your help!
 

gmaxwheel

New Member
Dec 24, 2019
23
13
3
For some unknown reason, when I installed 6x memboards the rear fans stopped and never turned on again, so I removed them.
You might have shorted out one of the fans. The fan wires have to go outside the case; they're supposed to get threaded through an impossibly thin notch where the cards go. I broke the wires on one of my fans while installing and removing the BMC board, but fortunately I caught it before powering on. Connecting a 12V fan wire to the case would probably be pretty bad.

I'm really careful of those fan wires now and double check any time I take a system apart that I haven't nicked any of them.

FWIW, I have 8 of these systems now up from 6 in my original post all with v4 cpus. They're all still working fine. I hadn't noticed that this thread continued-- I'm super glad to see that other people have also had success.

On some of my systems the ipmi was set to higher fan speeds. I was able to reduce them somewhat, which dramatically reduced the noise. :) According to my shell history, the command I used was "ipmitool raw 0x30 0x45 1 2" ... I recall that there was a lower speed setting than that, but if I set it lower it would ignore it and just go to maximum speed. So if you tried reducing the fan speeds and had that result you might have run into the same problem.
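If you want to script that command, a minimal wrapper could look like the sketch below. The only thing taken from this thread is the command itself (`ipmitool raw 0x30 0x45 1 2`); what the last byte means on the X10QBi is not stated here, the function names are hypothetical, and other posters report that their firmware ignores raw commands entirely. It defaults to a dry run that just prints the command.

```python
import shutil
import subprocess


def build_fan_command(mode):
    """Build the argv for the raw command quoted above with a given last byte.

    The meaning of `mode` is an assumption left to the reader; the thread
    only reports that the value 2 reduced fan speed on some systems.
    """
    if not 0 <= mode <= 255:
        raise ValueError("mode must fit in one byte (0-255)")
    return ["ipmitool", "raw", "0x30", "0x45", "1", str(mode)]


def set_fan_mode(mode, dry_run=True):
    """Print the command (dry run) or execute it with ipmitool."""
    cmd = build_fan_command(mode)
    if dry_run:
        print(" ".join(cmd))
        return None
    if shutil.which("ipmitool") is None:
        raise RuntimeError("ipmitool not found on PATH")
    # check=True raises CalledProcessError if ipmitool reports a failure.
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


if __name__ == "__main__":
    set_fan_mode(2)  # dry run: only prints the command
```

Call `set_fan_mode(2, dry_run=False)` on the host itself to actually send it, and watch the fan RPMs afterwards, since a rejected value reportedly sends the fans to maximum speed.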

My systems are in a rack in a separate garage building from my house. With the fans reduced I can only just barely hear the systems inside the house if everything is quiet. :)

The second we tried 4 x 8890 v3 and we got this VMSE DC detect failure... also, sometimes we get thermal trips on processors 1, 3, 4
but sometimes the system does load and the stress test goes fine (with half of the memboards, 512GB)
so we moved the memboards from the working unit to this one, and we got the same error. Has anyone seen that before?
I have good news and bad news. I also got that on one system and was able to fix it!

The bad news is that the cause was infinitesimally bent pins in the cpu socket. It took me hours to fix it, working with super fine tweezers and a camera with a bunch of macro tubes to act as a microscope. The bends were so fine that they were just BARELY visible to the naked eye while using a flashlight and different angles (and even then were more visible as irregularities in the pattern rather than explicit bends, until I got out the camera-microscope).

I also got that error a couple of times when it was just a memory board that wasn't seated right. Sometimes it would still fail after multiple reseating attempts: when the system is full up with memory, the positioning of the boards is pretty tricky. The problem is that all the other memboards stress the alignment of the motherboard in the case, and everything has to be JUST RIGHT. So even if you think you have everything seated well, you might want to try again. And again. And if that fails, get out a flashlight and very carefully check the offending cpu socket.

The good news is that when it wasn't working, I was able to just remove the one offending memboard (the error identifies it) and the system still worked okay.
 