Dual EPYC 7742 H11DSi-NT Build

captainblie

New Member
Feb 12, 2021
Roaming the country
Build Name: Big John
OS: Fedora 33 Server (RIP: CentOS)
CPU: 2x AMD EPYC 7742
MB: Supermicro H11DSi-NT
RAM: 16x NEMIX 8GB DDR4 3200 (PC4 25600) ECC Registered RDIMM
HD: 2x Micron 9300 Pro 3.84TB NVMe
GPU: 4x NVIDIA GeForce RTX 3080 Founders Edition
Case: Mountain Mods U2-UFO Gold Digger
Case Fans: Noctua NF-F12 PWM
Risers: FebSmart PCI-E Riser
PSU: 2x Thermaltake Toughpower iRGB Plus 1250W 80+ Titanium
CPU Fan: 2x Noctua NH-D9 DX-3647 4U, with optional SP3/TR4 Mounts
Spacers: Generic 8mm Nylon standoffs for MB mounting
Fan Splitters: Generic PWM 4-way fan splitters

Usage Profile: This “workstation” is designed to be a low-priority BOINC/mining rig and to provide other services as needed at a higher priority.

Challenges: Bent CPU socket pin. Poor MB airflow for VRMs. OS installation onto NVMe. PSU challenges. Placement of the PCIe cards to accommodate the side-mounted fans.

Narrative: Being a long-time IT guy, I periodically get the bug to build a PC or a server, and this build is one of those. Minor challenges were encountered right out of the gate with component availability and costs. For instance, I still don’t have the 2,000-watt modular power supply that I preferred.

The name of Big John was chosen for this server because of the classic ballad “Big Bad John” by Jimmy Dean.

After the parts arrived, the assembly of Big John went fairly smoothly.

A Mountain Mods U2-UFO case was chosen because it can accommodate an EATX board, lots of fans, and up to 27 PCIe slots. Additionally, it has some optional features I liked, like casters, being stackable, movable HDD mounts, different styles of interchangeable top and side panels, etc. Keep in mind this case is large, functionally an 18” cube.

When the case arrived there were no instructions, so some brain cycles were needed to get it assembled. All in all, I like the case. Unsurprisingly, this case didn’t account for the nonstandard mounting holes on the Supermicro board, but that was easily fixed with female-to-female 8mm nylon PCB standoffs (think pegs), plus electronics washers because the nylon screw heads are small.

The first MB to arrive had a bent pin in the number two CPU socket. Daytona Web, of Amazon fame, was the Supermicro vendor I was working with, and they told me to send it to Supermicro and have it fixed for sixty dollars. That clearly is not how the customer support game is played, so I returned the board to them and ordered another from a different vendor.

Once a working MB was installed, installation of the CPUs, memory, and CPU coolers was next. This was largely uneventful, with one standout point. The CPU cooler, not being designed for this board, had optional free mounting brackets that needed to be ordered from Noctua. There were no instructions on how to use them, so a few brain cycles were spent figuring out that they are simple retention clips. Big John, now having a brain, started silently mocking me by saying, “I can has be smartz, yepo.”

Two daisy-chained 1,250-watt power supplies were next, to be plugged into a dedicated 20 amp circuit. This case only has one power supply mount point, so I placed the second one inside, above, and slightly offset to the front and mounted it with command strips and zip ties, and then connected them together with a dual PSU cable (non-redundant). I have limited access to tools, so I went with this classy mounting solution. One PSU was designated to power the MB and non-GPU peripherals and the other for the GPUs. Being a “Big” hungry miner, Big John approves of the power he would be fed.

This particular case has lots of fan mounts. With the optional side panels I chose, there are 18 120mm fan mounts: three behind the main 20-wide PCIe tray, nine on the front panel, four on the sides, and three on the top. I’m a fan (pun) of positive pressure inside my cases and like to follow standards, so I configured the front and sides as intakes, and the rear and top as exhausts. Apparently, mining can be a hot job.

The main fans chosen for cooling this rig are the Noctua NF-F12 PWM 120x25 mm fans. Keep in mind that, should you use this case, you may want slimmer fans for the side panels, as the wider fans can protrude into the area where you may have components installed. To accommodate the fans for this build, the PCIe cards were moved more to the center and the “optional,” ahem, second PSU was also moved slightly to the center of the case.

Twin NVMe drives were mounted to a fan in the center of the front panel using the optional HD mounts that you can get with this case. This leaves just enough clearance for the 3080 GPUs. This location was chosen because the center fan has to be “hard” mounted to the front panel instead of using the provided rubber retention mounts. My fat fingers are too big to fit into the narrow space to grasp the rubber dongles to mount the center fan with them, and like I said, limited access to tools. Since this fan was already an anomaly, it was elected to be completely different.

Mounting the PCIe cards and risers was super easy. If you have large hands, I recommend doing them one at a time because things can get crowded.

From here it was a matter of accessing the IPMI and configuring it, then powering on and configuring the BIOS. No front panel controls or indicator lights are installed in this build. Power is controlled from the IPMI or the PSUs, with the BIOS set to auto power on when power is restored.

After the first POST it was time to test the system. Ultimate Boot CD was used for burn-in testing. This is where VRM heat problems started to show. Sadly, the IPMI system logs don't do a good job of capturing this error.

I decided to take a risk and install Linux anyways, and it installed. No problems were encountered until I tried to reboot into the locally installed OS post-installation. Pro Tip: NVMe boot drives require UEFI, and Fedora is not smart enough to know of this need. I went back to the BIOS and changed the default dual "legacy and UEFI" boot setting to allow only UEFI. After that, I was able to install and then boot into Linux.
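
For anyone hitting the same wall, one quick way to see how a running Linux system (including the live installer) was actually booted is to look for the /sys/firmware/efi directory, which the kernel only exposes after a UEFI boot. A minimal Python sketch; the function name and the overridable path are made up for illustration:

```python
import os


def booted_via_uefi(efi_dir="/sys/firmware/efi"):
    """Return True if the running Linux system was booted in UEFI mode.

    The kernel creates /sys/firmware/efi only when it was started by
    UEFI firmware; on a legacy (BIOS/CSM) boot the directory is absent.
    efi_dir is a parameter only so the check can be pointed elsewhere
    when testing.
    """
    return os.path.isdir(efi_dir)


if __name__ == "__main__":
    mode = "UEFI" if booted_via_uefi() else "legacy BIOS"
    print(f"This system was booted in {mode} mode")
```

If the installer environment itself reports legacy BIOS, the installed system will most likely be set up for legacy boot too, which is the situation described above.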

After some basic configuration, I was able to do some more stress testing only to watch Big John shut off after about a minute. Hmm, nothing in the IPMI Event logs. Thankfully, Big John giving a loud solid never-ending beep of death clued me in that something was horribly wrong. After a hard power off and some research, I now know Big John has a heat problem… Yep, on a system with 18 case fans, 4 CPU fans, and 8 GPU fans, a heat problem. Really, I’m not making this up.

Pro Tip: The FAN1-6 plugs are for the “CPU” region and FANA-B are for the case.

More research and troubleshooting... it was identified that the main problem is the CPU VRMs and that Big John is not alone in his plight. Numerous forum posts clearly state that this board is best used in a server case with highly directed airflow shrouds. Some other options that were posited are dedicated VRM fans and water cooling.

It was during this research that I also noted the unusual surging fan behavior typical of low-RPM fans on Supermicro MBs. IPMI captures this as a FAN error in its logs, but not what’s overheating, sigh. This apparently can be adjusted using the Linux-native ipmitool command; search around and you’ll find it. Forums state there is a similar solution for those who run Windows on similar HW.
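
For reference, the fix usually discussed for surging fans is lowering the lower fan thresholds in the BMC so that slow-spinning fans stop tripping the alarm. ipmitool does this with `sensor thresh <name> lower <lnr> <lcr> <lnc>`. Below is a rough Python sketch that only builds and prints those commands. The sensor names are assumed to match the FAN1-6 and FANA-B header labels, and the RPM values are placeholder assumptions, not tested settings for this board:

```python
import shlex

# Assumed sensor names, taken from the board's fan header labels.
FAN_SENSORS = [f"FAN{i}" for i in range(1, 7)] + ["FANA", "FANB"]

# Placeholder thresholds in RPM (assumed values, tune for your fans):
# lower non-recoverable, lower critical, lower non-critical.
LOWER_THRESHOLDS = (100, 200, 300)


def thresh_command(sensor, thresholds=LOWER_THRESHOLDS):
    """Build the ipmitool argument list that sets a sensor's lower thresholds."""
    lnr, lcr, lnc = thresholds
    if not lnr <= lcr <= lnc:
        raise ValueError("thresholds must be ordered lnr <= lcr <= lnc")
    return ["ipmitool", "sensor", "thresh", sensor, "lower",
            str(lnr), str(lcr), str(lnc)]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for name in FAN_SENSORS:
        print(shlex.join(thresh_command(name)))
```

Running `ipmitool sensor` first shows the actual sensor names and current thresholds, which is worth checking before changing anything.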

Big John has a maximum TDP north of 2,162 watts, drawing about 18 amps at peak and creating about 7,400 BTU/hr of heat. I knew all of this when I spec’d out his build; however, I didn’t account for the airflow over the CPU VRM being impeded by the large CPU heatsinks. Cooling the offending VRM from above the CPU heatsinks won’t work because of the air crossflow the heatsinks generate. With the stock VRM heatsink, this leaves direct cooling from on top of the VRM heatsink or from the sides. Due to the orientation of the CPU VRM fins, side airflow will not be very effective without replacing the VRM heatsink itself. Top-down airflow from a 40x10 mm fan resting on the heatsink may work, but its efficacy will be limited by the strong airflow passing over it from the CPU heatsinks. I feel water cooling is the best option in this situation, which will have the nice side benefit of making Big John less noisy.
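
The power figures can be roughly reproduced from the component TDPs (225 W per CPU, 320 W per GPU). This is a back-of-the-envelope sketch, assuming a 120 V circuit and assuming the 2,162 figure comes from dividing the 1,730 W of CPU and GPU TDP by 0.8. That 0.8 factor is a guess that happens to match the numbers, not something stated in the build notes:

```python
CPU_TDP_W = 225            # per EPYC 7742
GPU_TDP_W = 320            # per RTX 3080 Founders Edition
MAINS_VOLTS = 120          # assumed circuit voltage
LOAD_FACTOR = 0.8          # assumed; chosen because it reproduces ~2,162 W
BTU_HR_PER_WATT = 3.412    # 1 W is about 3.412 BTU/hr


def power_budget(cpus=2, gpus=4):
    """Return (component watts, wall watts, amps, BTU/hr) for the build."""
    component_w = cpus * CPU_TDP_W + gpus * GPU_TDP_W
    wall_w = component_w / LOAD_FACTOR
    amps = wall_w / MAINS_VOLTS
    btu_hr = wall_w * BTU_HR_PER_WATT
    return component_w, wall_w, amps, btu_hr


if __name__ == "__main__":
    component_w, wall_w, amps, btu_hr = power_budget()
    print(f"CPU + GPU TDP:       {component_w} W")
    print(f"Estimated wall draw: {wall_w:.0f} W")
    print(f"Current at {MAINS_VOLTS} V:    {amps:.1f} A")
    print(f"Heat output:         {btu_hr:.0f} BTU/hr")
```

RAM, drives, fans, pumps, and VRM losses are left out here, which is one reason to read the result as a floor ("north of") rather than an exact number.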

While Big John is stable and runs under light load, this is not to be his lot in life. He needs to accommodate 100% utilization, 24/7. Both Big John and I doubt the 40x10 mm fans resting on his VRM heat sinks will keep him sufficiently cool, but we’re going to test it while waiting for the custom loop water-cooling components to arrive.

For the water-cooling system, we're largely relying on Thermaltake and EKWB parts because they seem to marry up with the case the best. I have five 360x120 mm locations that can be used. Since Big John likes things being safely overbuilt with extreme flexibility in mind, we are cooling the following items:
  • 2x CPUs with a TDP of 225 W each
  • 4x Nvidia GeForce RTX 3080 Founders Edition with a TDP of 320 W each
  • 1x CPU VRM with an unknown TDP
  • 4x RAM banks with an unknown TDP. Big John says this is probably unnecessary, but I’ve reassured him that it’s best to do a job right the first time and that there is no kill like overkill. After which he silently stares at his hands for some reason…

This gives us 11 hot units. The standard rule of thumb is N+1 sections of 120mm radiator space, with N being the number of hot units to be cooled. Since Big John travels with me in my RV office, I don’t have the luxury of being able to do a proper thermal model. This is also why I have limited access to tools.
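
As a quick sanity check of that rule of thumb against the planned parts: 11 hot units call for 12 sections of 120mm radiator, and the four 360mm radiators in the parts list provide exactly that. A small sketch, with the function names made up for illustration:

```python
# Hot units to be cooled, as listed above.
HOT_UNITS = {"CPU": 2, "GPU": 4, "CPU VRM": 1, "RAM bank": 4}

# Planned radiators by length in mm (three CL360 plus one C360).
RADIATOR_LENGTHS_MM = [360, 360, 360, 360]

SECTION_MM = 120  # one "section" of radiator is one 120mm fan position


def sections_needed(hot_units):
    """Rule of thumb: N + 1 sections for N hot units."""
    return sum(hot_units.values()) + 1


def sections_available(lengths_mm):
    """Count the 120mm sections across all radiators."""
    return sum(length // SECTION_MM for length in lengths_mm)


if __name__ == "__main__":
    need = sections_needed(HOT_UNITS)
    have = sections_available(RADIATOR_LENGTHS_MM)
    print(f"Sections needed: {need}, sections available: {have}")
```

Like the rule of thumb itself, this ignores radiator thickness, so the 64mm and 27mm radiators count the same.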

Aluminum parts are being avoided, leaving the to-be-added water-cooling component list looking like this:
  • Equipment Housing Adjustments.
    • Rear Panel HDD/SSD PCI mounting brackets, needed to move the NVMe drives from the front so there is adequate space for the big radiators.
    • GPUs will need to be moved slightly to accommodate the NVMe drives.
  • Pumps, Reservoirs, and Flowmeters.
    • Thermaltake Pacific DP100-DF distro plate and pump combo. This will mount on the one side panel that can accommodate a 360x120 attachment. That panel is currently the top in the pictures and will move to one of the sides depending on space, hopefully the PSU side.
    • Second pump EKWB EK-D5 Vario Pump. This was chosen because of the amount of resistance provided by all the water blocks, hoses, and radiators.
    • Anti-vibration pads were added to dampen the pump vibrations.
    • A basic flowmeter was added to easily verify that fluid is moving.
  • Hose and fixtures.
    • 20 feet of ½” ID x ¾” OD flexible hose. Chosen because I don’t have a lot of room for the tools needed to bend rigid pipes.
    • Roughly forty G ¼ compression fittings for ½” ID x ¾” OD flexible hose.
    • An assortment of 45 and 90 degree elbows.
    • A pressure equalization stop plug for the reservoir, because of imminent elevation changes.
  • Drainage.
    • One T-Splitter for a drainage hose junction point.
    • One ball valve for drainage.
    • EKWB metal EK-PLUG for extra safety on the drainage port.
  • Radiators
    • 3 Thermaltake CL360 64mm radiators, for mounting on the front panel.
    • 1 Thermaltake C360 27mm radiator, for mounting on the rear panel.
  • Water Blocks
    • 2x Thermaltake Pacific W6 TR4 CPU Water blocks, SP3 compatible.
    • RAM
      • 4x EKWB EK-RAM Monarch water blocks.
      • 8x EKWB EK-RAM Monarch Modules.
    • 4x Corsair Hydro X Series XG7 Founders Edition GPU Water Blocks.
    • 1x Anfi-tec UPC002 13x82-120mm VRM water block.
  • 4 liters of Clear pre-mixed coolant fluid
  • Fans
    • Existing Noctua fans will be utilized for the water-cooling radiators.

Because of the amount of heat in this build, radiators will be interspersed in the single loop, which will look like this:

Reservoir => pump => CPU1 => VRM => Mem bank 1 & 2 => 64 mm Radiator #1 => CPU2 => Mem bank 3 & 4 => 64 mm Radiator #2 => GPU 1-4 => 27 mm radiator => 64 mm Radiator #3 => Return to reservoir
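
To see how the heat is spread across that loop order, here is a small sketch that walks the loop and totals the known TDP picked up before each radiator. Only the CPU and GPU wattages are known; the VRM and RAM loads are unknown and are just listed by name. The data layout and function name are made up for illustration:

```python
# Loop order as listed above. Blocks carry their TDP in watts where it is
# known, or None where it is unknown. Radiators are tagged "RADIATOR".
LOOP = [
    ("CPU1", 225),
    ("VRM", None),
    ("Mem bank 1 & 2", None),
    ("RADIATOR", "64 mm #1"),
    ("CPU2", 225),
    ("Mem bank 3 & 4", None),
    ("RADIATOR", "64 mm #2"),
    ("GPU 1-4", 4 * 320),
    ("RADIATOR", "27 mm"),
    ("RADIATOR", "64 mm #3"),
]


def heat_before_each_radiator(loop):
    """Return (radiator, known_watts, unknown_blocks) for each radiator.

    Heat accumulates from the blocks since the previous radiator, so a
    radiator that directly follows another one sees no new blocks.
    """
    results, watts, unknown = [], 0, []
    for name, value in loop:
        if name == "RADIATOR":
            results.append((value, watts, unknown))
            watts, unknown = 0, []
        elif value is None:
            unknown.append(name)
        else:
            watts += value
    return results


if __name__ == "__main__":
    for radiator, watts, unknown in heat_before_each_radiator(LOOP):
        extra = f" plus unknown: {', '.join(unknown)}" if unknown else ""
        print(f"{radiator}: {watts} W known{extra}")
```

This only tallies nameplate heat per segment; it says nothing about actual water temperatures, which depend on flow rate and radiator performance.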

Once all the water-cooling parts arrive, I’ll post an update on how Big John fares.

Future Concerns: A third pump may be needed; this will be determined based on flow rate after build-out. A solution may need to be identified for the memory VRMs.

Hindsight: If I had to do it again, I may have chosen a different MB, even though I like Supermicro in general. I don’t know… What you can do with PCIe bifurcation is mind-boggling and this board has that capability in abundance. Big John might indeed become Big Bad John.

Disclaimer: For the inevitable trolls, yes I know mining is speculative, a waste of energy, a waste of money, etc. By that logic, all hobbies are. Also, I know Big John is an inanimate object and that he is being anthropomorphized for your entertainment purposes, do you? Finally, no miners were hurt, trapped, or developed health problems as a result of this write-up. All miners were worked only in accordance with Union regulations.

[Attached photos: 15 images]

gsrcrxsi

Member
Dec 12, 2018
I would recommend swapping out the USB risers for PCIe 3.0 ribbon risers for BOINC use.

not sure which projects you're considering, but some do benefit from the increased PCIe bandwidth (GPUGRID for example, but this project doesn't work with Ampere yet). and it would simplify your wire management a lot, not having to run power to the riser or deal with flaky USB cables.

but for mainly mining, the 2x 7742 is a huge waste. you can GPU mine with 10+ GPUs on a Pentium CPU lol.