4-node Open Compute Cluster Build


polyfractal

Member
Apr 6, 2016
First post here, but long-time lurker. This entire build was motivated by the excellent Open Compute guide provided by NorCalm over in deals.

Fair warning: I'm a developer, not an ops guy, so apologies for improper terminology and a slightly janky setup :)

Background

My primary development machine for the last few years has been a MacBook Air, which is decidedly underpowered. It was fine when I first joined the company, but as I've moved on to new projects it struggles with our builds/tests/benchmarks. We make a clustered search/analytics engine, so a lot of our tests involve spinning up multiple nodes and can be pretty resource intensive. To date I've been renting a beefy server at Hetzner for these kinds of tasks.

I was recently given budget to update my setup. Most people get a MacBook Pro or build a desktop. I opted for a third option: build a mini cluster based on E5-2670s.

The Build

Given how cheap E5-2670s are, the challenge was finding cheap motherboards. I opted for the Open Compute route, and essentially followed this STH guide. The chassis + 2x motherboards + 4x heatsinks + power supply together cost less than most LGA2011 motherboards alone, which made it a hard deal to pass up.

Final price came out to $504 per node ($469 without shipping).

Since I don't really have any home infra other than a FreeNAS tower, I wasn't worried about the non-standard rack size. I'll probably build a true homelab some day, but my current living circumstances (renting, moving soonish) prevent me from expanding. I'll deal with other rack sizes later. :)

I'm currently using one node as a desktop machine, and leave the other three powered off unless I need them. The BIOS claims to support suspend-to-RAM, but I'm still trying to get that to work. For now they are either physically off or suspended to disk. I'm also working to get SOL (Serial over LAN) and WOL (Wake-on-LAN) going so that cluster booting is a bit more automated.
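For the WOL piece, a minimal sketch of the plan, assuming the NICs/BIOS have Wake-on-LAN enabled and the nodes share a broadcast domain with the sender (the MACs below are placeholders):

```python
# Minimal Wake-on-LAN sender (a sketch; MACs are placeholders).
import socket

def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Standard WOL magic packet: 6 bytes of 0xFF, then the MAC repeated 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    packet = b"\xff" * 6 + mac_bytes * 16
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))

# Wake the three cold nodes (placeholder MACs):
for mac in ("00:11:22:33:44:55", "00:11:22:33:44:56", "00:11:22:33:44:57"):
    send_magic_packet(mac)
```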

Node specs:
  • 2x Xeon E5-2670
  • 96GB DDR3-1333 ECC RAM (12 of the 16 slots used)
  • 1TB 7200rpm Seagate HDD
  • Ubuntu 14.04.4 LTS server / desktop
The "desktop" node also has a HD6350 video card and a PCI usb expander. I purchased an SSD for it as well, but need to sort out how to mount it (the rack only has two 3.5" caddies)

Cluster stats:
  • 64 cores, 128 with hyperthreading
  • 384GB RAM
  • 4TB HDD

Practical considerations
  • Temperatures idle around 32-35C. At 100% utilization, the temp cranks up to ~93C before the fans kick into high gear, then settles down to ~80C. (A rough sketch of how to watch the temps follows this list.)

  • Noise is a pleasant 35dB at idle, which honestly is quieter than the Thinkserver on the floor. At 100% utilization the fans go into "angry bee" mode and noise rises to around 56dB. It's loud, but not unbearable, and I'll only be using the full resources for long tests, likely overnight. While idling, the 60x60x25 fan in the PSU is far louder than the node fans, and gives off a bit of a high-pitched whine due to its thinner profile.

  • Unsure of power consumption; I don't own a Kill A Watt (yet). The spec sheet claims 90-300W per node depending on activity, so probably around 360-1200W when the full cluster is running, plus some overhead from the step-up transformer.

  • As someone who has never worked with enterprise equipment before...these units were a pleasure to assemble! Only tool required was a screwdriver to install the heatsinks, everything else is finger-accessible. Boards snap into place with a latch, baffles lock onto the sushi-boat, PSU pops out by pulling a tab, hard drive caddies clip into place. Just a really pleasant experience, I kept saying "oh, that's a nice feature" while assembling the thing.

  • The only confusing aspect was the boot process and what the lights mean. As NorCalm discovered, blue == power, followed a few seconds later by yellow == HDD activity. If you boot and it stays blue, something is wrong. In my case it was a few sticks of bad RAM.

  • Nodes are configured to boot in staggered sequence. The PSU turns on, fans slow down, first node powers on, about 30 seconds later the second node powers on.

  • There are two buttons on the board: red (power), grey (reset). I think... it's confusing because the boards like to reboot themselves if they lose power, even when you manually switch them off. The power-on preference can be changed in the BIOS; the default is "last state". I haven't played with it yet.
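For watching the temps mentioned above, something like this works from Linux. It's a rough sketch that assumes the coretemp driver exposes readings under /sys/class/hwmon (true on stock Ubuntu here, but the hwmon indices and sensor labels vary by machine):

```python
# Rough CPU temperature poller (a sketch; hwmon paths vary by machine).
import glob
import time

def read_temps() -> dict:
    """Read every temp*_input under /sys/class/hwmon, in degrees C."""
    temps = {}
    for path in glob.glob("/sys/class/hwmon/hwmon*/temp*_input"):
        try:
            with open(path) as f:
                temps[path] = int(f.read()) / 1000.0  # millidegrees -> C
        except OSError:
            pass  # some sensors can't be read; skip them
    return temps

if __name__ == "__main__":
    while True:
        readings = read_temps()
        if readings:
            print("max %.1fC across %d sensors" % (max(readings.values()), len(readings)))
        time.sleep(5)
```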
Photos!


DSC09463.jpg
The components all laid out.
  • PCI USB hub
  • Radeon HD6350
  • x4 Seagate 1TB 7200RPM HDD (ST1000DM003)
  • TP-LINK TL-SG108 8-Port unmanaged GigE switch
  • SanDisk Ultra II 120GB SSD
  • x50 Kingston 8GB 2Rx4 PC3L-10600R
  • x8 Xeon E5-2670
  • ELC T-2000 2000-Watt Voltage Converter Transformer
DSC09466.jpg

One of the OCP nodes, unpacked before installation of components.

DSC09500.jpg

The final setup, including the second chassis. Setup is temporary for now, until I can build a rack.

The first chassis sits on three strips of neoprene rubber. The second chassis sits on top of the first, also separated by a layer of neoprene. The top is covered by some cardboard (temporary). It isn't needed for airflow, since the nodes have plastic baffles... It's just for my peace of mind, so I don't accidentally drop/spill something.


Screenshot from 2016-04-15 23:13:50.png

128 threads burning power :)



More photos of the build:

DSC09479.jpg, DSC09465.jpg, DSC09468.jpg, DSC09469.jpg, DSC09471.jpg, DSC09473.jpg

DSC09474.jpg, DSC09482.jpg, DSC09484.jpg, DSC09485.jpg, DSC09487.jpg, DSC09490.jpg, DSC09492.jpg

DSC09494.jpg, DSC09495.jpg, DSC09497.jpg, DSC09502.jpg, Screenshot from 2016-04-17 14:42:01.png, DSC09504.jpg
 

polyfractal

Member
Apr 6, 2016
Thanks! Yeah, I was pretty surprised how quiet it is idling. It's loud at 100% for sure (~56dB), but transient loads, or partial loads on only some of the cores, stay pretty quiet, as the fans don't really spin up until it hits 90C.
 

polyfractal

Member
Apr 6, 2016
Just checked: no fan spinup. -j 32 peaked around 50C, -j 64 peaked around 70C (and it ran right after the 32-job build, so it was perhaps influenced by that). Either way the fans held steady in their "low" mode.
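For reference, a sketch of one way to capture those peaks: poll the hwmon sensors in a background thread while the build runs (the make -j 64 call is just an example workload; swap in your own):

```python
# Sketch: record the peak CPU temp while a workload runs.
import glob
import subprocess
import threading
import time

def max_temp() -> float:
    """Highest reading across all hwmon temp sensors, in degrees C."""
    vals = []
    for path in glob.glob("/sys/class/hwmon/hwmon*/temp*_input"):
        try:
            with open(path) as f:
                vals.append(int(f.read()) / 1000.0)
        except OSError:
            pass
    return max(vals) if vals else 0.0

peak, done = 0.0, False

def watch() -> None:
    global peak
    while not done:
        peak = max(peak, max_temp())
        time.sleep(1)

t = threading.Thread(target=watch)
t.start()
subprocess.call(["make", "-j", "64"])  # example workload
done = True
t.join()
print("peak temp during run: %.1fC" % peak)
```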
 

nickscott18

Member
Mar 15, 2013
Looking at those two cases stacked like that, it seems like it would make the ideal setup for a tower-sized lab in a box. Built into a wooden tower case (provided those boxes don't mind being run on their side), it would make the ideal home lab for those without a rack. Now, how much to ship a couple of those boxes to the bottom of the world...
 

polyfractal

Member
Apr 6, 2016
Looking at those two cases stacked like that, it seems like it would make the ideal setup for a tower-sized lab in a box. Built into a wooden tower case (provided those boxes don't mind being run on their side), it would make the ideal home lab for those without a rack. Now, how much to ship a couple of those boxes to the bottom of the world...
I was actually considering exactly this, since it'd be more convenient to place these on the floor next to my Thinkserver. All the components seem to lock down tight, the motherboard trays don't have any slack when latched, and the HDDs seem secure too.

I was just afraid to mount them on their side because... could you? But it seems like a good idea...
 

TLN

Active Member
Feb 26, 2016
Wonder if you can measure power consumption while idle and under load.
P.S. Subscribed, really want to see the two chassis in a "tower case".
 

nickscott18

Member
Mar 15, 2013
I was actually considering exactly this, since it'd be more convenient to place these on the floor next to my Thinkserver. All the components seem to lock down tight, the motherboard trays don't have any slack when latched, and the HDDs seem secure too.

I was just afraid to mount them on their side because... could you? But it seems like a good idea...
I suppose the only thing that might be affected by running it on its side is how the cooling would work. But I'm inclined to think that, because it's got ducts and plenty of airflow running through it, that might not be an issue.

Also - picture 3 - it looks like the main 4 fans could be swapped out for 120mm units to reduce fan noise (i.e., working across both nodes - this may require some alterations to the chassis). And with the power supply fans, a similar thing could possibly be achieved with some ducting.
 

DonJon

Member
Apr 9, 2016
Could you please tell me your power-up sequence? Here is what happens with my unit; I'm wondering if it is defective.

Background on my setup
1. I got the same ELC transformer to power this server.
2. I installed 4x E5-2660s in them, along with the heatsinks.
3. Just to start, I added 4 sticks of RAM (8GB each, 1600MHz RDIMM ECC), 2 per node: one in the A0 slot and one in the B0 slot of each node. I only had 4 on hand to test with; the rest of the modules are on order. All 4 sticks were pulled from another working server, so they should be working, no doubt about that.
4. Added an Asus Radeon 5450 video card to one node. It has VGA, HDMI, and DVI outputs; I connected the VGA and HDMI to two displays. Normally the VGA works by default, as I have seen on other systems.
5. Connected a USB keyboard to the external port on the same node where the video card is installed.

Powered it up....
a. The green LEDs on the power supply blink a few times, very quickly, then change to yellow. The bottom LED near the hard drive cage also lights up yellow.
b. On one node, the blue LED flashes and goes off.
c. On the other node, the blue LED is solidly lit, but there is no power to the USB ports and no video output.
d. After about 4 minutes, the fans start spinning on this node.
e. No fans spin on the other node, and no LEDs light up.

What is the sequence you see on your units? I really suspect the LED on the power supply should be lit green instead of yellow on mine.

And I have no idea why one of the nodes doesn't respond with any indication.

Is this server defective by any chance?
 

polyfractal

Member
Apr 6, 2016
Wonder if you can measure power consumption while idle and under load.
P.S. Subscribed, really want to see the two chassis in a "tower case".
My Kill A Watt arrived today; here are my numbers (a rough cost sketch follows the lists). One PSU, one node:
  • idle: 85W / 1amp
  • 1 core: 126W
  • 2 core: 156W
  • 4 core: 177W
  • 8 core: 218W
  • 16 core: 292W
All four nodes running:
  • idle: 232W
  • 64 cores: 1165W
  • 64 cores and all fans at max: 1260W / 11 amps
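For a rough sense of running cost, a back-of-envelope calculation from those numbers (the $/kWh rate and the daily duty cycle are assumptions, not measurements):

```python
# Back-of-envelope energy cost from the Kill A Watt numbers above.
RATE = 0.12                   # USD per kWh (assumed; check your tariff)
IDLE_W, FULL_W = 232, 1165    # measured: all four nodes, idle vs. 64 cores
IDLE_H, FULL_H = 20, 4        # assumed hours per day in each state

daily_kwh = (IDLE_W * IDLE_H + FULL_W * FULL_H) / 1000
print("%.1f kWh/day, ~$%.2f/month" % (daily_kwh, daily_kwh * RATE * 30))
# -> 9.3 kWh/day, ~$33.48/month under these assumptions
```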
 

imchipz

Member
Apr 18, 2016
Wondering how it pulls 1260W; isn't the PSU rated at 700W?

Also, thanks to (what I assume is) you guys :p, the eBay price shot from $220 to $279! I was watching it :(
 

snake eyes

New Member
Jun 4, 2016
England
Hi guys, I'm really worried about the temperature of my server. At 100% load the temp is 85C, which is, in my opinion, incredibly hot! Please tell me, is this a normal temp for this kind of server?
 

polyfractal

Member
Apr 6, 2016
Hi guys, I'm really worried about the temperature of my server. At 100% load the temp is 85C, which is, in my opinion, incredibly hot! Please tell me, is this a normal temp for this kind of server?
It seems to be normal for my four nodes at least. I've been using them pretty regularly since this post (one all the time as a desktop, the others sporadically as needed).

Under full, 100% load, the temperature does indeed hover around 80-85C. It appears to be on purpose, as the fans are only running at maybe 70% capacity. The behavior I always see is:

1. Load pegs at 100%
2. Temperature creeps up to 80-85C, fans rev to around 40-50%.
3. Temp keeps climbing until about 90C, fans hit 100%.
4. Temp quickly drops to the 70-85C range, fans return to ~70%.
5. Temp basically stays consistent, same with the fans.

So it seems the board is happy to be in this range, since there is spare cooling capacity not being used.

Also note, under "real 100% load" situations, temps are a lot lower. E.g. if a database is chewing 90% CPU and hitting some disk IO, temps are a lot lower, since the pulses of inactivity, short as they are, let it cool down quickly.
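If it helps to picture it, the board seems to behave like a simple two-threshold (hysteresis) controller. This is only a guess at the logic from watching it; the thresholds and duty levels are inferred from the behavior above, not taken from any spec:

```python
# A guess at the board's fan behavior, modeled as a hysteresis controller.
# Thresholds and duty levels are inferred from observation, not documented.
def fan_duty(temp_c: float, current_duty: float) -> float:
    if temp_c >= 90.0:
        return 1.0    # panic: fans to 100%
    if current_duty >= 0.7 and temp_c >= 70.0:
        return 0.7    # post-panic "cruise": hold ~70% while temp sits at 70-90C
    if temp_c >= 80.0:
        return 0.5    # initial ramp: ~40-50% as the temp first creeps up
    return 0.3        # idle "low" mode
```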
 

Rus Taliss

New Member
Sep 14, 2017
It seems to be normal for my four nodes at least. I've been using them pretty regularly since this post (one all the time as a desktop, the others sporadically as needed)...
Under 100% load, temperature does indeed hover around 80-85C...
You said above the temp hits 93C before the fans spin up. Great article. Can we advance this to a small cluster benchmark, as I posted earlier in conversations? I don't know how to add people; in fact I can't even start a thread. How opaque is this blog!
If you have the time: do you need IB switches and 1U servers to act as the PF_ring front end? Happy to send them to you (if you own your own home; no renters, please).

One saving is the compact size for 2 nodes, and the mezzanine connector... does it fit the 341X3 NIC that I have been buying up?

polyfractal, I was about to plunge in with a rack order when one caution hit me: the ludicrous temp peak of 93C before the fans kick to high speed. I don't want thermal fatigue ruining my 2670s, and it will, eventually. Where is the BIOS in all this? Can't you change the fan speed or the kick-in temperature? As I really need a cluster, not a boiler, I bought 2 Foxconn 2650s; with the copper on those babies, I doubt temps go higher than 60C flat out. Waiting for the null modem cable delivery... even wide-geometry 32nm silicon develops fatigue eventually. Once the price comes down...
 