Renewing my DL hardware.

Discussion in 'Machine Learning, Deep Learning, and AI' started by balnazzar, Oct 27, 2019.

  1. balnazzar

    balnazzar New Member

    Mar 6, 2019
    Hi fellas.

    I'm in the process of renewing my deep learning hardware, particularly the GPUs. My host machine has 40 x16 lanes and 3 slots wired x16/x16/x8, so I think I'll leave it untouched. Presently, I work with two 1080 Tis.

    Now, achieving a good price/performance ratio when buying GPUs is a little harder than one might think, essentially because of the de facto NVIDIA monopoly, which lets them practice relentless customer segmentation.

    Suppose I manage to sell my two 1080 Tis and make one grand out of them. I can add some 1500-2000 USD, for a grand total of at most 3000 USD. What can I buy?

    Option 1: a Titan RTX.
    Badly overpriced: 24 GB of memory is actually not that much for $2500. And a single card won't let me experiment with parallel training, which is one of my preferred areas of research.

    Option 2: two 2080 Tis.
    Almost as costly, and only marginally better than my two 1080 Tis in FP32; some 40% better in FP16, provided you use all the boring tricks needed for mixed-precision training to converge (and that only on Volta/Turing: Pascal seldom converges in FP16, even with NVIDIA apex).
    The main con here is that if you work with something not parallelizable, you are stuck with 11 GB of VRAM.
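    To be concrete, the main "boring trick" is loss scaling: multiply the loss before the backward pass so small FP16 gradients don't underflow, then unscale the gradients before the optimizer step. Apex automates this with dynamic scaling; below is a minimal manual sketch in plain PyTorch on CPU, purely illustrative, with a hypothetical static scale factor:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
scale = 1024.0  # static loss scale; apex chooses this dynamically

x = torch.randn(8, 4)
y = torch.randn(8, 1)

loss = nn.functional.mse_loss(model(x), y)
opt.zero_grad()
(loss * scale).backward()      # scaled backward keeps tiny gradients representable in FP16
for p in model.parameters():
    p.grad.div_(scale)         # unscale before the optimizer step
opt.step()
```

    In real mixed-precision training the forward pass runs in FP16 while an FP32 master copy of the weights takes the step; this sketch only shows the scaling/unscaling bookkeeping.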

    Option 3: three 2060 Supers.
    The cost-efficient solution: lower power consumption and a very good price/performance ratio IF the model is parallelizable (say, anything you can run with PyTorch DataParallel). Otherwise it suffers from the memory limitation mentioned above, but worse, with 8 GB per card.
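    For clarity, by "parallelizable" I mean plain data parallelism: DataParallel replicates the model on every visible GPU and splits each batch among them. A minimal sketch (on a machine with no GPUs the wrapper simply runs the module unchanged, so this also executes on CPU):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
# Replicates the model across visible GPUs and scatters the batch among them;
# with no GPUs available it falls back to running the wrapped module directly.
model = nn.DataParallel(model)

x = torch.randn(16, 32)
out = model(x)
print(out.shape)  # torch.Size([16, 10])
```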

    Option 4: V100s (16 GB) in SXM2.
    Theoretically a great option: they are dirt cheap on eBay (~$700), and I would be content with 2-3 of them. But I'm unable to find used SXM2 boards/servers, and even if I somehow found one, installing an SXM2 card without breaking something is no small task.

    Option 5: three Radeon VIIs with ROCm.
    No, I don't even want to talk about that! I don't want to ruin my mental health, seriously.

    Option 6: wait some six months...
    ...and see what NVIDIA launches in 2020. Maybe by then I'll be able to find a Quadro RTX 8000 (48 GB) at a fair price on eBay.

    I urge you to notice how the VRAM amount is becoming critical. For example, with a big transformer NLP model, my two 1080 Tis ran out of memory. To fine-tune EfficientNet-B7 I had to use an (expensive!) cloud instance with four Tesla V100s (32 GB each), and memory use was over 100 GB despite a modest batch size of 24 images at 600 px. On top of that, NVIDIA left the VRAM amounts unchanged as it moved from Pascal to Turing (6 GB / 8 GB / 11 GB, with the only notable exception of the Titan RTX).
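    For a rough sense of scale (illustrative numbers: EfficientNet-B7 has about 66M parameters, and FP32 Adam keeps a weight, a gradient and two moment buffers per parameter):

```python
# Back-of-envelope FP32 memory budget per parameter with Adam:
# weight + gradient + two Adam moment buffers = 4 tensors * 4 bytes each.
def optimizer_state_gb(n_params, bytes_per_value=4, tensors=4):
    return n_params * bytes_per_value * tensors / 1024**3

print(round(optimizer_state_gb(66e6), 2))  # 0.98
```

    So weights plus optimizer state come to only about 1 GB: it's the activations at 600 px that blow past 100 GB.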

    I'd like to know your opinion, thanks.
    vv111y likes this.
  2. BeTeP

    BeTeP Active Member

    Mar 23, 2019
    For a brief moment I was going to google which current platform has 640 PCIe lanes, but then I noticed you were dealing with budgets in the single-digit thousands, and that helped me regain my sanity.
    Evan, balnazzar and Dbuffed like this.
