A100 vs A6000 vs 3090 for DL and FP32/FP64

junewang05

New Member
Feb 5, 2022
Hello, I'm currently looking for a workstation for deep learning on computer vision tasks: image classification, depth prediction, and pose estimation. I need at least 80 GB of VRAM, with the potential to add more in the future, but I'm struggling to choose among the GPU options.

Assuming power consumption isn't a problem, the GPUs I'm comparing are A100 80G PCIe*1 vs. 3090*4 vs. A6000*2. They all meet my memory requirement; however, the A100's FP32 throughput is about half that of the other two, although its FP64 is impressive.
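To make the memory comparison concrete, here is a minimal sketch that totals VRAM for the three configurations. The per-card capacities (80, 24, and 48 GB) are the published figures for these cards; the `Config` class and `summarize` helper are hypothetical names used only for illustration. Note that total VRAM across cards is not the same as memory available to a single model, so the sketch also reports whether one card alone meets the 80 GB requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One candidate GPU configuration (hypothetical helper type)."""
    name: str
    cards: int
    gb_per_card: int

    @property
    def total_gb(self) -> int:
        return self.cards * self.gb_per_card


# The three options from the post, with published per-card VRAM sizes.
CONFIGS = [
    Config("A100 80G PCIe", cards=1, gb_per_card=80),
    Config("RTX 3090", cards=4, gb_per_card=24),
    Config("RTX A6000", cards=2, gb_per_card=48),
]

REQUIRED_GB = 80  # requirement stated in the post


def summarize(configs, required_gb):
    """Map name -> (total GB, total meets requirement, one card meets it)."""
    return {
        c.name: (c.total_gb, c.total_gb >= required_gb, c.gb_per_card >= required_gb)
        for c in configs
    }


summary = summarize(CONFIGS, REQUIRED_GB)
for name, (total, meets_total, fits_one_card) in summary.items():
    print(f"{name}: {total} GB total, "
          f"meets {REQUIRED_GB} GB in aggregate: {meets_total}, "
          f"on a single card: {fits_one_card}")
```

All three options reach 80 GB in aggregate, but only the A100 does so on a single card, which matters if a single model or batch has to fit in one GPU's local memory.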

A100 vs. A6000
Based on my findings, we don't really need FP64 unless it's for certain medical applications. But "The Best GPUs for Deep Learning in 2020 — An In-depth Analysis" suggests the A100 outperforms the A6000 by ~50% in DL. Also, the StyleGAN project (GitHub - NVlabs/stylegan: StyleGAN - Official TensorFlow Implementation) used an NVIDIA DGX-1 with 8 Tesla V100 16G (FP32 = 15 TFLOPS) to train on a dataset of high-res 1024*1024 images, so I'm a bit uncertain whether my specific tasks would require FP64, since my dataset is also high-res images. If not, can I assume A6000*5 (total 120G) could provide similar results for StyleGAN?

A6000 vs. 3090
3090*4 should be a little better than A6000*2 based on "RTX A6000 vs RTX 3090 Deep Learning Benchmarks | Lambda", but the A6000 has more memory per card, so it might be a better fit for adding more cards later without changing the setup much.

How would you choose among the three GPUs?
 

Patriot

Moderator
Apr 18, 2011
While NVLink technically allows shared VRAM, there are no 4-way bridges outside of DGX boxes.
Also, all of your performance metrics would be thrown out the window, as they assume each GPU's task fits in its local VRAM. You would be cutting the memory performance to 1/8th. It could be done, but you would need to adjust your batches so that they don't need 80 GB per card.
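For a rough sense of where a figure like 1/8th could come from, the sketch below compares local VRAM bandwidth with NVLink bridge bandwidth. The numbers are assumptions taken from approximate spec-sheet values for an RTX 3090 (about 936 GB/s local memory bandwidth, about 112.5 GB/s total bidirectional over the NVLink bridge); the post does not say which figures were used, and real throughput depends on the access pattern.

```python
# Assumed, approximate spec-sheet figures for an RTX 3090 (not from the thread).
LOCAL_VRAM_GBPS = 936.0   # local GDDR6X memory bandwidth
NVLINK_GBPS = 112.5       # NVLink bridge, total bidirectional bandwidth


def remote_slowdown(local_gbps: float, link_gbps: float) -> float:
    """How many times slower peer memory over the link is than local VRAM."""
    return local_gbps / link_gbps


ratio = remote_slowdown(LOCAL_VRAM_GBPS, NVLINK_GBPS)
print(f"Peer memory over NVLink is roughly {ratio:.1f}x slower than local VRAM")
```

Under these assumptions the ratio lands a little above 8, in line with the 1/8th estimate, which is why benchmarks that assume everything fits in local VRAM stop applying once a workload spills across cards.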
 

iceisfun

Member
Jul 19, 2014
Our 3090s are running in a server room, so heat is not much of an issue for us, and they have the lowest latency for bs1 (batch size 1) inference.