Xeon E-2xxx vs Xeon D-2xxx for ML mule


Styp

Member
Aug 1, 2018
Hey,

I am planning to upgrade some hardware, and I have come to the conclusion that I need an ML training mule separate from my current workstation.

Can anyone explain the advantages and disadvantages of going with either platform for this use case? I am considering running 2x 1080 Ti, which for now are in my workstation.

Thanks!

Martin
 

Patrick

Administrator
Staff member
Dec 21, 2010
If I were doing 2x 1080 Ti, I would look at Intel Xeon E5 (V4).

The Intel Xeon E-2100 series does not have enough PCIe lanes for two PCIe 3.0 x16 links. The Xeon D-2100 does. Intel Xeon E5 will allow you to put both GPUs on a single root complex, which is good for NVIDIA NCCL when you distribute training across more than one GPU. Xeon E5 pricing is good since you can pick up used gear. You can also fit more RAM and still have PCIe lanes left over for NICs.
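For reference, a quick way to see whether two GPUs sit under one root complex is `nvidia-smi topo -m`. A rough sketch of how to read the GPU-to-GPU link codes it prints (the code strings come from `nvidia-smi`'s own legend; the "single root complex" grouping is my interpretation):

```python
# Rough interpretation of the GPU-to-GPU link codes printed by `nvidia-smi topo -m`.
# PIX/PXB/PHB mean traffic stays under one CPU's root complex (good for NCCL P2P);
# NODE/SYS mean it crosses host bridges or the inter-socket QPI/UPI link.
LINK_CODES = {
    "PIX": "single PCIe bridge",
    "PXB": "multiple PCIe bridges, same root complex",
    "PHB": "PCIe host bridge (same CPU)",
    "NODE": "crosses PCIe host bridges within a NUMA node",
    "SYS": "crosses the inter-socket link (QPI/UPI)",
}

def single_root_complex(link: str) -> bool:
    """True if the given topo link code keeps GPU P2P traffic on one root complex."""
    return link in ("PIX", "PXB", "PHB")

print(single_root_complex("PHB"))  # True: both GPUs hang off one CPU
print(single_root_complex("SYS"))  # False: GPUs split across sockets
```

On a dual-socket board, a GPU in a slot wired to the other CPU will show up as SYS, which is what you want to avoid for NCCL.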
 

Styp

Member
Aug 1, 2018
Interestingly, I never considered PCIe lanes while looking at the different platforms.
How big is the impact of 2x x8 vs. 2x x16? How much CPU do you usually recommend in your builds?
 

Patrick

Administrator
Staff member
Dec 21, 2010
It can be a decent-sized impact. Using the E5 V3/V4 would allow you to have a single PCIe root complex. That means you can use NCCL for your two GPUs, which would yield a ~20-30% speedup over two GPUs on a Xeon D platform.

On 2x x8 vs. 2x x16, what you will run into is having to pass data from one GPU to the other. I would recommend x16 if possible. Also, consider that you may want to add a network card and/or NVMe storage.
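To put rough numbers on the x8 vs. x16 question: PCIe 3.0 carries roughly 0.985 GB/s of payload per lane after encoding overhead. A back-of-the-envelope sketch of what that means for a gradient exchange (the 100 MB buffer size is a hypothetical, roughly 25M float32 parameters; in practice some of this overlaps with compute):

```python
# Back-of-the-envelope PCIe 3.0 transfer times for a gradient exchange.
# ~0.985 GB/s usable per lane, so x16 is ~15.8 GB/s and x8 is ~7.9 GB/s.
GBPS_PER_LANE = 0.985

def transfer_ms(size_gb: float, lanes: int) -> float:
    """Milliseconds to move size_gb over a PCIe 3.0 link with `lanes` lanes."""
    return size_gb / (GBPS_PER_LANE * lanes) * 1000.0

grad_gb = 0.1  # hypothetical 100 MB of float32 gradients (~25M parameters)
print(f"x16: {transfer_ms(grad_gb, 16):.2f} ms per exchange")
print(f"x8:  {transfer_ms(grad_gb, 8):.2f} ms per exchange")
```

The x8 link simply takes twice as long per exchange, which is why it matters most when gradients are synchronized every step.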
 

Styp

Member
Aug 1, 2018
You might move this to the DL section...

How can I check whether NCCL is used with TensorFlow? I am using an E5 V3 quad-GPU rig at work and often do multi-GPU training, but I never questioned the optimization potential. Async augmentation and the like, yes, but only as far as I can control it from the software engineering side.
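One way to check is NCCL's own debug logging: if you set `NCCL_DEBUG=INFO` before the framework initializes NCCL, you will see `NCCL INFO` lines (ring setup, transport choices) on stderr whenever NCCL is actually used. A sketch, with the TensorFlow part commented out since it depends on your setup:

```python
# NCCL honors the NCCL_DEBUG environment variable; it must be set before the
# framework initializes NCCL. If multi-GPU training actually goes through NCCL,
# "NCCL INFO ..." lines appear on stderr during the first training step.
import os

os.environ["NCCL_DEBUG"] = "INFO"
os.environ["NCCL_DEBUG_SUBSYS"] = "INIT,P2P"  # optional: narrow the output

# import tensorflow as tf   # import *after* setting the variables
# ... build and run your multi-GPU training as usual ...
print(os.environ["NCCL_DEBUG"])
```

If no NCCL lines show up at all, the gradient exchange is going through some other path and the single-root-complex advantage isn't being exercised.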

I am only asking because the 2x 1080 Ti could be replaced with a single 2080 Ti down the road. I don't want the fastest rig at home, I just need a 'little' compute for personal research...
 

Styp

Member
Aug 1, 2018
To come back to this topic, @Patrick. Do you have a recommendation for a GPU-focused mainboard? I don't need tons of storage, but an E5 V4 would be nice. I would have to opt for a PCIe NVMe drive for dataset handling.

Cheers!
Martin
 

Deslok

Well-Known Member
Jul 15, 2015
Supermicro has the X10SRA and X10SRL, which would both offer you a ton of PCIe.
The SRA offers up to x16/x8/x8/x8 of PCIe 3.0 bandwidth, or x16/x16/-/x8 (use the last slot for NVMe?). The SRL unfortunately only does x8 to any of its slots, but offers more slots total depending on how you approach NVMe storage. (Unfortunately, neither offers M.2, but carrier cards are cheap as long as you don't need a switch chip on them.) Also, cooling-wise the SRL is more appropriate for a server chassis, the SRA for a desktop.
X10SRA | Motherboards | Products | Super Micro Computer, Inc.
X10SRL-F | Motherboards | Products | Super Micro Computer, Inc.