We're looking to purchase two 4U nodes for deep learning: one with 8 or 10 RTX 2080 Ti GPUs and the other with 8 or 10 Quadro RTX 6000 GPUs. I'm having trouble figuring out which architecture to go with.
We're debating between single-root and dual-root complex systems, and between E5 and Cascade Lake SP CPUs. From reading on the site, it sounds like a single-root complex with E5 CPUs is the best option. However, I'd like to make sure we're not giving up other advantages of the more recent SP CPUs. The pages I found discussing the merits of the various systems are from 2018, and things may have changed since.
Some notes:
- The 2080 Ti node is to host 10 cards that we already have (Zotac, blower type), so a 10-GPU node would be better, but we can consider 8-GPU configs if the speed is significantly better.
- We're planning to use NVLink bridges on each pair of GPUs in both systems.
- We are looking to buy machines based on Supermicro 4028 or 4029 systems.