Switch Recommendations for HPC Cluster


mikexrv

New Member
Feb 28, 2020
I'm setting up an HPC network with 4 nodes.

I set up an HPC cluster with 10G Ethernet (10GBASE-T) a year ago and want to delve into 40GbE for a new test environment. I will connect ASUS

I'm looking for a simple and quiet solution for home use.
Any recommendations or advice would be great.



 

RTM

Well-Known Member
Jan 26, 2014
mikexrv said:
I'm setting up an HPC network with 4 nodes.

I set up an HPC cluster with 10G Ethernet (10GBASE-T) a year ago and want to delve into 40GbE for a new test environment. I will connect ASUS

I'm looking for a simple and quiet solution for home use.
Any recommendations or advice would be great.
A few more details will help you get an answer. What is the workload you want to put on the network? Are we talking iSCSI, NFS, etc., something like webserver traffic, or something else entirely? Likewise, some information about your "HPC cluster" would be nice. Also, is this setup your lab environment, or will you be using it for production (meaning it'll be used to generate income)?

What do you mean by "I will connect ASUS"? Are all your machines/servers from ASUS?

In any case, there are many affordable options if you are open to buying used gear, and on eBay you can find it rather cheaply too. The quiet criterion might be difficult to fulfill, but that also depends on what you mean by it (quiet enough to work next to it? quiet enough to be put in a closet somewhere?).

Here are a few suggestions:
  • Don't buy a switch: if you install 2 dual-port NICs (4 ports total, of which 3 will be used), you can connect a cable to each machine. It won't give you much (if any) in the way of redundancy, but it'll be cheap and probably quiet; see the addressing sketch after this list.
  • Buy a (relatively) cheap used Brocade/Ruckus switch, like those mentioned in the ICX mega thread. I can't tell you which ones to get or whether they'll be quiet enough, but it is a popular option here.
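To make the first option concrete, here is a minimal sketch of the addressing such a switchless full mesh could use on Linux; the node names, the interface placeholders (p1..p3) and the 10.40.x.0/30 point-to-point subnets are assumptions to swap for your real NIC names.

Code:
#!/usr/bin/env python3
# Sketch of switchless full-mesh addressing: every node carries 2 dual-port
# 40GbE NICs and uses 3 of the 4 ports for a direct cable to each other node.
# Interface names (p1..p3) and the 10.40.x.0/30 ranges are placeholders.
from itertools import combinations

nodes = ["head", "node1", "node2", "node3"]
port = {n: 0 for n in nodes}          # next free NIC port per node

for subnet, (a, b) in enumerate(combinations(nodes, 2), start=1):
    ip_a, ip_b = f"10.40.{subnet}.1", f"10.40.{subnet}.2"
    port[a] += 1; port[b] += 1
    print(f"link: {a} p{port[a]} <-> {b} p{port[b]}")
    print(f"  on {a}: ip addr add {ip_a}/30 dev p{port[a]} && ip link set p{port[a]} up")
    print(f"  on {b}: ip addr add {ip_b}/30 dev p{port[b]} && ip link set p{port[b]} up")

Running it only prints the ip commands for the six links; nothing is configured until those commands are run on the respective machines.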
 

mikexrv

New Member
Feb 28, 2020
RTM said:
A few more details will help you get an answer. What is the workload you want to put on the network? Are we talking iSCSI, NFS, etc., something like webserver traffic, or something else entirely? Likewise, some information about your "HPC cluster" would be nice. Also, is this setup your lab environment, or will you be using it for production (meaning it'll be used to generate income)?

What do you mean by "I will connect ASUS"? Are all your machines/servers from ASUS?
The cluster is used for finite element (FEM) and fluid flow (CFD) simulations. It is used for research in my own environment.

The cluster is built from four workstations, each based on the ASUS Prime Z390-A motherboard. The whole cluster is very quiet, even under full load. I suppose InfiniBand would create a lot of noise, so I would have to find a special room for a new switch :)

I never thought about a direct connection between the head node and the compute nodes, but if it is possible, that would be great. I will have to read about it.

I am also taking used hardware from eBay into account. I put the question here to ask you for some recommendations.
 

mikexrv

New Member
Feb 28, 2020
RTM said:
  • Don't buy a switch: if you install 2 dual-port NICs (4 ports total, of which 3 will be used), you can connect a cable to each machine. It won't give you much (if any) in the way of redundancy, but it'll be cheap and probably quiet.
You are right. Now I have to find 2 dual-port NICs compatible with the ASUS Prime Z390-A motherboard and CentOS Linux.
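For the CentOS side, the in-kernel drivers for ConnectX-3 class cards are mlx4_core / mlx4_en, so a quick sketch like the one below can confirm they are present on a given install (the module names are the stock upstream ones; a Mellanox OFED install would bring its own).

Code:
#!/usr/bin/env python3
# Check that the in-kernel ConnectX-3 drivers are available before buying cards.
import subprocess

for mod in ("mlx4_core", "mlx4_en"):
    r = subprocess.run(["modinfo", mod], capture_output=True, text=True)
    print(f"{mod}: {'available' if r.returncode == 0 else 'NOT found'}")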
 

RTM

Well-Known Member
Jan 26, 2014
That's great. I actually intended to write that you could install two NICs in each machine and have direct connections from each machine to each of the others.
But of course, if you only need a topology where the slaves connect to the master, there is only a reason to install that many NICs in the master node.

I don't know much about your application, but you should be aware that pushing 40G is not necessarily easy to do. If the software you use isn't tuned for it, you may be unable to get anywhere near that bandwidth.

In terms of NICs, a popular option here is to buy Mellanox ConnectX-3 cards. I am not that into the specific models and all the details, so I can't recommend a specific part, but generally speaking I am sure you can find a decent deal on eBay. You will probably want to search the forum first, though, as these cards are frequently recommended and the threads will help you figure out which models to look for.
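To get a feel for how far from 40G a link actually is, a tuned tool like iperf3 is the usual choice; as a rough single-stream sanity check with nothing but Python, a sketch like the one below would do (the port number, buffer size and duration are arbitrary assumptions, and a single Python stream will not saturate 40G).

Code:
#!/usr/bin/env python3
# Rough single-stream TCP throughput check between two nodes.
# Needs Python 3.8+ for socket.create_server.
import socket, sys, time

PORT = 5201
CHUNK = 4 * 1024 * 1024          # 4 MiB per send/recv call
DURATION = 10                    # seconds the client keeps sending

def server():
    with socket.create_server(("0.0.0.0", PORT)) as srv:
        conn, addr = srv.accept()
        with conn:
            total, start = 0, time.time()
            while True:
                data = conn.recv(CHUNK)
                if not data:
                    break
                total += len(data)
            secs = time.time() - start
            print(f"received {total / 1e9:.2f} GB in {secs:.1f}s "
                  f"= {total * 8 / secs / 1e9:.2f} Gbit/s from {addr[0]}")

def client(host):
    buf = b"\x00" * CHUNK
    with socket.create_connection((host, PORT)) as sock:
        end = time.time() + DURATION
        while time.time() < end:
            sock.sendall(buf)

if __name__ == "__main__":
    server() if sys.argv[1] == "server" else client(sys.argv[1])

Run it with "server" as the argument on one node and the other node's link address as the argument on the second node; the number it prints is only a baseline to compare before and after MTU/NIC tuning.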
 

Cixelyn

Researcher
Nov 7, 2018
San Francisco
Just as an FYI: ConnectX-3 cards only support HW acceleration of RoCEv1, while the ConnectX-3 Pro and later support RoCEv2.

This may be a consideration if you are looking to test various HPC networking setups; as RTM says -- pushing 40G is rather nontrivial. It's also possible your clustered FEM/CFD software already supports RDMA-based communication; definitely worth looking into.
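A quick way to confirm the RDMA stack actually sees a card before testing RoCE is something like the sketch below; it assumes the libibverbs-utils package (which provides ibv_devinfo) is installed, and the filtering is only illustrative. Real RoCE bandwidth numbers would then come from a tool like ib_write_bw in the perftest package.

Code:
#!/usr/bin/env python3
# Print the interesting lines from ibv_devinfo (device, firmware, port state,
# link layer). Requires libibverbs-utils; a ConnectX-3 in Ethernet mode should
# report "link_layer: Ethernet".
import subprocess

try:
    out = subprocess.run(["ibv_devinfo"], capture_output=True, text=True).stdout
except FileNotFoundError:
    raise SystemExit("ibv_devinfo not found -- install libibverbs-utils")

for line in out.splitlines():
    line = line.strip()
    if line.startswith(("hca_id:", "fw_ver:", "transport:", "state:", "link_layer:")):
        print(line)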
 

mikexrv

New Member
Feb 28, 2020
RTM said:
I don't know much about your application, but you should be aware that pushing 40G is not necessarily easy to do. If the software you use isn't tuned for it, you may be unable to get anywhere near that bandwidth.
The software supports 40G, so I will test it :)

RTM said:
In terms of NICs, a popular option here is to buy Mellanox ConnectX-3 cards. I am not that into the specific models and all the details, so I can't recommend a specific part, but generally speaking I am sure you can find a decent deal on eBay. You will probably want to search the forum first, though, as these cards are frequently recommended and the threads will help you figure out which models to look for.
Probably I will buy Mellanox ConnectX-3 cards, but first I have to verify which motherboards support these connections. I plan to use an AMD 3960X or 3950X with the ASUS Pro WS X570-ACE.
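Once a card is in, it is also worth confirming that the slot negotiated the full PCIe 3.0 x8 link a 40GbE ConnectX-3 expects, since multi-slot boards often run a slot at reduced width depending on how the other slots are populated. A small sketch, assuming pciutils is installed and the script is run as root so lspci exposes LnkSta:

Code:
#!/usr/bin/env python3
# Show the negotiated PCIe link speed/width for any Mellanox device.
import re, subprocess

out = subprocess.run(["lspci", "-vv"], capture_output=True, text=True).stdout
for block in out.split("\n\n"):
    if "Mellanox" not in block:
        continue
    print(block.splitlines()[0])
    lnk = re.search(r"LnkSta:\s*(Speed [^,]+, Width x\d+)", block)
    print("  ", lnk.group(1) if lnk else "LnkSta not visible (run as root)")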