X10QBI and v3/v4 cpus (e.g. supermicro sys-4048b-trft)

angel_bee

Member
Jul 24, 2020
32
14
8
Hey guys,

We are building our 2nd X10QBi.

The 1st is running with 4 x 8890 v4 + 1TB of Samsung LRDIMM at 1333MHz (32 x 32GB) + 24 x 3TB 7200rpm SAS drives + 5 PCIe NVMe drives, and it is fine.

On the second we tried 4 x 8890 v3 and got this VMSE DC detect failure... Also, we sometimes get thermal trips on processors 1, 3 and 4,
but sometimes the system does boot and the stress test runs fine (with half of the memboards, 512GB).
So we moved the memboards from the working unit to this one and got the same error. Has anyone seen this before?

View attachment 15711
I can't remember if I've seen "DC Detect Failure" specifically, but if it's saying P1M1, then that is referring to that memory board. I usually just take it out, fiddle the RAMs on it for a bit, then put it back in and it boots fine haha :D It kind of reminds me of the Nintendo 64 back in the day you had to blow into the game cartridges ... maybe it's the same thing and there is some dust inside the RAM-board and the mobo slots? but anyways I hope the problem you're getting is an easy fix like giving it a good blow ... fingers crossed :)

could you please confirm that you got the 8880v3 working?
yes! they're working perfectly fine although they're MUCH hotter than the E7-4820 V2s that came with the base machine. It was a bit scary when I first installed them to be honest ... when I turned the system back on, it kept freezing at OS handoff and never moved beyond the second supermicro splash screen. But then it occurred to me to do a CMOS reset. Take the battery out, short the CMOS contact pads, put the battery back in. Works. PHEW. I've been running almost non-stop for a week with heavy RNA folding calculations, and it's stable. Only time it crashed was when I used up all my RAM lol.

p.s. HOLY SSS*** how much storage do you have on your 1st machine?!?!?!?!!!!! so jealous
 

tconrado

New Member
Jul 17, 2020
16
2
3
DNA folding? I have a Genetics PhD and I'm running masternodes for crypto on those machines!!! I do envy you!!!
The 8890 v3s run really hot, I've never seen anything like it (the 8890 v4s don't do this: more cores, yet they run cooler).
1600176748612.png


So we ran 4 extensive tests:
a) removed all the PCIe cards
b) moved all the memboards from the working unit to the bad one: same error, so it is not the memboards, the memory type or the DIMMs
c) moved the v3 processors to the working unit and they worked (hence not a processor problem; the 8890 v3s did fine there too)
d) moved the v4s to the bad unit, and the error triggered at P1M1-VMSE1 again... but with a different failure...
(this DDR training failure comes and goes)
1600175002488.png

I'm starting to think that it is something on the mainboard...
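
The elimination logic behind tests a) to d) can be written down as a small sketch. Python is only used as a convenient notation here; the component names and the `Observation` helper are made up for illustration, and the outcome of test a) is not stated in the post, so the code assumes the error persisted with the PCIe cards removed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    component: str    # part type the test says something about
    note: str         # what was done and what happened
    exonerates: bool  # True if the result rules this part type out


def remaining_suspects(components, observations):
    """Return the part types that no test has ruled out yet."""
    cleared = {o.component for o in observations if o.exonerates}
    return [c for c in components if c not in cleared]


# Hypothetical labels for the parts involved in the swap tests.
COMPONENTS = ["pcie cards", "memboards/DIMMs", "cpus", "mainboard"]

OBSERVATIONS = [
    # a) result not stated in the post; assumed the error persisted
    Observation("pcie cards", "a) removed all cards", True),
    Observation("memboards/DIMMs", "b) known-good memboards, same error", True),
    Observation("cpus", "c) v3 CPUs ran fine in the good unit", True),
    Observation("cpus", "d) known-good v4 CPUs still failed in the bad unit", True),
]

print(remaining_suspects(COMPONENTS, OBSERVATIONS))
```

With those four observations only the mainboard is left, which matches the conclusion above. Note that "mainboard" in this sketch lumps together everything that stays with the chassis, such as the memboard connectors and how the board is mounted.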
 

angel_bee


Hmm the CPUs get hot, but they shouldn't throttle!! not even with 8890v3s... if you're putting the midplane and rear fans at full scream, it can seriously dissipate heat. It works a treat for me but it's too loud and I'm looking into watercooling the whole server hahaha (stay tuned for those interested).
Have u checked for proper heatsink contact with the CPU packages after you replaced the CPUs with 8890v3s? Proper thermal paste quality? Measured the temps of the heatsinks (or otherwise) to see if they get as hot as the CPUs?

So you're saying that the errors never occur with a v3? That's so strange! have u tried booting with only 2 out of the 4 x v4s at a time to see whether one of them is at fault? Do the v4s work with no problem in a different machine?
Also does the IPMI record any weird events/sensor readings with the v4 inside?
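
For checking the IPMI event log from the OS instead of the web interface, here is a minimal sketch. It assumes `ipmitool` is installed and can reach the BMC; `ipmitool sel elist` prints the event log as pipe-separated lines. The keyword list and the sample log lines are invented for illustration and are not taken from a real X10QBi.

```python
import subprocess


def read_sel(cmd=("ipmitool", "sel", "elist")):
    """Return the BMC event log as text (needs ipmitool and BMC access)."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def filter_events(sel_text, keywords=("memory", "temperature", "processor", "voltage")):
    """Keep only log entries whose sensor field mentions one of the keywords."""
    hits = []
    for line in sel_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        sensor = fields[3].lower()
        if any(k in sensor for k in keywords):
            hits.append(fields)
    return hits


# Invented sample in the general layout of `ipmitool sel elist` output.
SAMPLE = """\
   1 | 09/15/2020 | 10:02:11 | Memory #0x53 | Uncorrectable ECC | Asserted
   2 | 09/15/2020 | 10:02:40 | Fan FAN3 | Lower Critical going low | Asserted
   3 | 09/15/2020 | 10:05:02 | Temperature CPU1 Temp | Upper Critical going high | Asserted
"""

for event in filter_events(SAMPLE):
    print(" | ".join(event))
```

On a live system the same filter would be fed with `filter_events(read_sel())`.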
 

tconrado

I guess I'm too hard on the stress testing
1600181609725.png


So, on the good server, both v3 and v4 work fine...
On the bad server, v3 gives one error and v4 gives another, both at P1M1-VMSE

It does start with half of the memboards (and because of that I think it is chipset related), but it gets unstable
 

angel_bee

does the IPMI give any clues as to any off readings? :/ e.g. event logs, sensors.
 

angel_bee

hmm... if only there was a service manual from Supermicro!!! TT

Maybe it could be worth messaging them for one... just like the ones that printer servicing guys carry and it's got exclusive hardcore troubleshooting and tweaking stuff
 

PabloChakon

New Member
Jun 2, 2020
8
1
3
Hey guys, I've decided to sell off the spare 8890 v3 QS I have. It's in great condition and tested on a working rig I already have. The price will be $500 plus shipping from Serbia to wherever you are. Message me here if anyone is interested. I really looked into building the third rig, but with covid and the market at the moment I'm not willing to risk another $1000 playing around with something I don't have enough experience with. Thanks for all the help you've offered so far, but it's just too much, and shipping X10QBis with the whole box is priced out of this world for Serbia, so it's a no from me. I would much rather have someone play with these properly.
 

sth-n00b

New Member
Sep 30, 2020
7
0
1
What a nice read this thread.

I am considering buying a secondhand server for fun, to learn and do some stuff with mining.

For example this one.
Supermicro CSE-848X X10QBI 4U Server 24x 3,5" LFF 4x Intel XEON E7-4800 v1 v2 DDR3 ECC 2x PSU Server -CTO-

Got some questions, hope you guys/girls know the answers.
  • Can you boot Linux (e.g. Debian) from a USB stick with this machine?
  • I understand this specific motherboard does accept E7-48xx V3/V4 and E7-88xx V3/V4 CPUs; but what about, for instance, an HP DL580 G8? Does it also take V3/V4 CPUs? (If I have to create a new thread for this question, please let me know.)
  • I am a bit nervous about the bulkiness and noise of this server because it will be placed in our home. Right now my HP Z640 workstation is very quiet. Will my girlfriend start ignoring me if I get one of these? :)
 

angel_bee

hi! :)

- yes you can boot from USB. it comes with 2 ports on the back of the case and at least 2 on the actual motherboard. note that it only comes with USB 2.0, not 3.0, so plan accordingly.
- i believe this thread is for x10qbi only. i have no experience with the HP system.
- The cse-848x is F#($*&%( LOUD. there's a high chance that your girlfriend won't be ignoring you because she's mad at you - she's ignoring you because she can't hear you.
 

angel_bee

hey @tconrado did you manage to get this fixed?

I just did some critical server maintenance yesterday and when i put it all back together again, I got DC detect failure for the first time ever. It turns out that you get that problem if the memboard isn't inserted all the way into the mobo. It also turns out that some memboard connectors are thicker than others!!!! One of my memboards can literally only fit into one slot and none of the others (it won't even go in when I sit on it, yes - I did that - don't try that at home.)

So when I returned that memboard to the slot where it "belonged", everything worked fine after that. Maybe it was a similar case for you? If it were up to me, I would check all the pins in the mobo connector. Memboard issues are almost always an issue of geometry.

Also, after taking the motherboard out I noticed that there were more standoffs installed on the backboard than necessary. Now I bought this system entirely prebuilt. These metal standoffs were TOUCHING THE METAL TRACES ON THE BOTTOM. I was absolutely furious. Whoever built my system deserves a slap in the face. But anyways after taking the unnecessary ones out, all my system's issues spontaneously resolved. This could also be the case for you!! Maybe the metal standoffs have been shorting something important!!

To anyone else reading this, if you get a persistent VBAT lower non recoverable (nr) error even after swapping for a new CMOS battery, updated IPMI firmware and BIOS (which is the first thing that Supermicro would tell you to do), in the absence of visible damage of the motherboard, check the standoffs under the motherboard.
 

sth-n00b

:)

My uneducated guess is that the case fans are making most of the noise? So what happens if you disconnect the power from those fans? Or do you need them to keep the processors at a normal temperature? Is it possible to use CPU coolers like this one and not use the case fans?
Mugen 5 PCGH Edition: CPU Kühler, Lüfter, Lüftersteuerung von Scythe
I only want four processors, some ram and a small ssd in my system, no hard drives or anything else.
 

angel_bee

do NOT operate without the fan XD

yes, it's the fans that make the noise. both the midplane fans and the rear fans. i'm not sure if this is standard issue, but my midplane fans are "San Ace 92", model no. 9GA0912P1H041.

the base system screams. the x10qbi is hard-wired to supply a PWM duty cycle of 50% and it's impossible to go below that, and yes, that means no ipmitool command will change it. This is the case at least for my BIOS version (3.2a), maybe later versions did not have the 50% limit, but note that to use V3/V4 CPUs, your BIOS needs to be quite new.

I have a friend who makes fan controllers and he supplied me one for $50 ($36 usd). And I've just got the PWM wires attached to the fan controller so I have full control over it. if u r interested, PM me and i can give you the details.

To answer your questions about temperature, I'm pretty sure Xeons are sturdy as heck and can run high temps 24/7 no problem. But try not to allow it to go over 80 degs. The fan speed increases as the power output of your CPUs increase. Which generally correlates to TDP which is different for each CPU model. But do not expect PC-levels of noise. PCs are silent in comparison to the x10qbi. Also note the PSU fans are still loud even if you tone down the other fans (but bearable, i'd say they're as loud as someone talking).
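
To keep an eye on that 80 degree guideline, a small script can flag hot sensors. This is a minimal sketch, assuming `ipmitool sdr type Temperature` is available and prints readings ending in "degrees C". The sample readings and sensor names below are invented for illustration; real names vary by board and BIOS.

```python
import re

LIMIT_C = 80  # the rule of thumb from the post above, not an official spec


def hot_sensors(sdr_text, limit_c=LIMIT_C):
    """Return (sensor name, temperature) pairs that read above limit_c."""
    hot = []
    for line in sdr_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        # The reading is the last field; "No Reading" and similar are skipped.
        match = re.fullmatch(r"(-?\d+(?:\.\d+)?) degrees C", fields[-1])
        if match and float(match.group(1)) > limit_c:
            hot.append((fields[0], float(match.group(1))))
    return hot


# Invented sample in the general layout of `ipmitool sdr type Temperature`.
SAMPLE = """\
CPU1 Temp        | 01h | ok  |  3.1 | 62 degrees C
CPU2 Temp        | 02h | ok  |  3.2 | 84 degrees C
PCH Temp         | 0Ah | ok  |  7.1 | 81 degrees C
P1-DIMMA1 Temp   | B0h | ns  | 32.64 | No Reading
"""

for name, temp in hot_sensors(SAMPLE):
    print(f"{name}: {temp:.0f} C is above {LIMIT_C} C")
```

On a real machine the text would come from running the ipmitool command, for example through `subprocess.run`, instead of the sample string.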

The stock fans are fast for a reason - there are a lot of components that must have airflow or you risk breaking them. The best example is RAID controllers: they seem to have "passive" cooled heatsinks, but that's because the manufacturer assumes they're going to be operating in a server environment with very high airflow. You have to know what you're doing if you're going to go around modding fans.

That CPU cooler that you linked is most likely not right for the x10qbi. There are a lot of considerations that may not be obvious if you are new to servers. The unique characteristics of the x10qbi sockets are:
1. close spacing - i can almost guarantee that the cooler is too wide for all 4 CPUs and they will clash. That being said, there do exist special server versions of these coolers which have the correct narrowness.
2. the cooler is probably too tall. The base server is very, very space-efficient with almost no free space. You have a maximum of ~12cm height for a cooler.
3. The socket mounting mechanism is "LGA-2011 narrow ILM". It's specific for server. Make sure you have the correct mounting mechanism/mounting plate that's compatible with the narrow rectangular screws around each CPU.

hope this helps
 

sth-n00b

Thanks. I've learned a lot. And I have a lot to learn:)

Ok, the new plan looks a bit like this:
1. Find two Xeon E5-2680 V2 or better CPUs to upgrade my HP Z640 (which has two E5-2620 V2 processors at the moment).
2a. Get an estimate for electricity and internet in the shed (is that the correct word?) so I can have a server in there. Although I don't think that will work because it will be way too hot in the summer.
2b. Do some desk research and/or create a new thread over here for building a quiet four socket workstation.
 

angel_bee

depending on how well insulated it is, a shed could work. your biggest enemy would actually be humidity. like when it rains and stuff.
a quiet x10qbi system is entirely possible - it just depends on how much money you have and how much you know your stuff.

the E5-2680 v2s seem like they run a lot cooler than what i've got (8880 v3s). you might be able to get away with fans at a lower speed. for reference, my fans manually set to 4500rpm are able to keep the temperatures down nicely (a bit less than 70 degs C). This is even at 100% load when the room temperature is like 15-20 degs C.

all the best! :) <3
 

synchrocats

New Member
Oct 20, 2020
4
5
3
So, it's a long story :) As far as I know, there are no cheap LGA 2011 narrow AIO water coolers. I bought four Gammaxx L240T and designed custom mounting brackets that were laser cut from 2 mm AISI304 steel. Also, the standard mounts (6 mm) were too high, so I used a set of one DIN912 M4x20 bolt + one standard nut + one DIN985 M4 nut with plastic insert to fix each water block.

Everything else was quite trivial; the x10qbi works just fine with four fans running at low rpm. As my system came without a backplane, I decided to disassemble the HDD holder grid and made a sort-of soundproof labyrinth that allows airflow but cancels high-frequency noise from the PSU fans a bit. Temperatures are fine, and only PCH10G is overheating, so I put two fans over the BMC board for now; I will probably change that to something else in the future.

The machine was installed in an IBM 11U rack that was soundproofed with felt, and two 140 mm fans were installed to pull the hot air from the rack. Now it works with 3x E7 8891 v2 idling at 40-45 C and running at 60-65 C under load. Noise level is 37 dB idling and 43 dB under full load, and most of the noise comes from the PSU, which I will change to something less annoying in the near future. The next steps will be the design and fabrication of a new top cover and hacking an EMC VNX disk shelf to use Noctua fans in the PSU.
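
To put the 37 dB and 43 dB figures in perspective, the decibel scale is logarithmic: every 10 dB is a tenfold change in sound intensity. A minimal sketch of the arithmetic, assuming both readings were taken the same way (same meter, distance and weighting), which the post does not state:

```python
def intensity_ratio(db_low, db_high):
    """Sound intensity ratio implied by a decibel difference."""
    return 10 ** ((db_high - db_low) / 10)


idle_db, load_db = 37, 43  # figures quoted in the post above
ratio = intensity_ratio(idle_db, load_db)
print(f"{load_db - idle_db} dB difference = about {ratio:.1f}x the sound intensity")
```

So the 6 dB gap between idle and full load works out to roughly four times the sound intensity.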
 


angel_bee

nice job! this is quite useful information and i agree that coming up with a nice way to mount waterblocks on the CPUs is not easy at all.

I'm personally coming up with a custom loop and I have LGA-2011 narrow ILM brackets from EK already: Mounting plate Supremacy LGA-2011 Narrow ILM.

But I wasn't about to go and spend $400 on waterblocks. Fortunately, nowadays, Chinese watercooling manufacturers such as Bykski and Barrow have very high quality legit stuff at a fraction of the cost. I bought a Barrow x99 block for ~$35 AUD and it actually fits the bloody mounting bracket from EK! haha! Now the only issue I had was actually the octagonal rubber ring. The one from Barrow was actually too thin for the mounting bracket. I mean the seal is O.K. but it is barely enough. So to be safe, I have 5 EK supremacy o-rings on the way. I'm confident that they'll work.

For the BMC cooling, have you looked into swapping out the stock heatsink for an active one? It seems like when you are finally able to turn down the fans, things that never got hot before are now burning. haha
I've been eyeing a few active heatsinks on ebay - I hope i can find one that's short enough to fit :D

Those temperatures are quite respectable with what you have. Since I can't be bothered with a rack, I'm planning on running tubing all the way to a MO-RA radiator (most likely 420mm) outdoors. That way, I can blast the fans without being worried about noise. This is valuable data because I always wondered if a single MO-RA would be enough...

I'm guessing your CPU temperature still has *a lot* of headroom, because all your radiator fans are dialed down to super-quiet levels?

As for the noise from the fans, I'm quite averse to using noctua fans. They're just so stupid expensive for not a lot of gain at all. I'm pretty sure the reason why the stock power supply fans (which are Nidec Ultraflo counter-rotating fans) are so loud is because the midplane fans are directly competing against them. This design is O.K. for an industrial environment but for the home with a watercooling setup, I think that the airflow needs optimisation.

My plan is to actually flip the power supply fans around. So this means the fans will be sucking air in from the back. And I'm going to remove one of the fans, leaving the other one inside just in case something goes wrong that I haven't accounted for and I need air blasting power. The noise level from the outlet fan is actually almost inaudible at standby. But I haven't tested it at full load yet because the system doesn't boot with only 1/2 fans detected. It thinks there's something wrong. I'll have to split the single fan's tachometer.

Instead of those teeny tiny 40mm fans trying to cool the PSUs, I'm going to let the midplane fans do all the work. They're much bigger and the CFM is much higher at lower RPM. The rear fan kits will help suck even more air out. To avoid circular airflow, I have a ventilation duct fitted to the rear of the case which goes out the window - like the portable air conditioners. I'm actually in the process of setting this up and it's going to help prevent the room from getting hot.

I think that will work??? Even if you don't plan to fit ducting like me, just put a plastic separator sheet so the exhaust doesn't get sucked back into the case lol. Let me know if you end up trying that. My watercooling build is still a bit away i think.
 

sth-n00b

This is so much fun.

For a beginner like me, can you please provide a minimal parts list to get started? So of course the X10QBi motherboard. But how many memory boards do you need? Can you start with one and still operate with multiple CPUs? Do you need the official SuperMicro power supply or is any other one working?

I had to look up what a BMC board is and found this nice article.
Explaining the Baseboard Management Controller or BMC in Servers
But what I don't get is whether you can run a setup without a BMC?

I feel like a little kid again figuring out all kind of new stuff:)

If I have to create a new thread for these questions, just let me know.
 