Buy Tesla P100s, or train in the cloud and run inference on a cheaper GPU like the Tesla P4?


ndklabs

New Member
Hi everyone,
For hobby purposes, I want to train models for object detection or stable diffusion. Should I choose the P100? I can get them for around $140 each, and I plan to buy two of them, which is still cheaper than buying one RTX 3060 12GB.
Or should I train models in the cloud and deploy them locally on cheap inference cards like the Tesla P4, at about $45 each?

Or something else?

Thanks in advance.
 

unwind-protect

Active Member
GPUs in the cloud didn't work out for me. The cheaper spot instances become unavailable in the evening when everybody is doing GPU work.
 

ndklabs

New Member
I was just looking at vast.ai, and the cost of running a VM with a single P100 for a full month was almost as much as buying a Tesla P100 outright. On the other hand, the drop in whole-system power consumption from doing inference on a P4 alone, which peaked at about 75 W compared to more than 250 W on a single P100, may be the deciding factor. The comparison gave me headaches.
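To make the rent-vs-buy question concrete, here is a minimal back-of-the-envelope sketch in Python. The hourly cloud rate and electricity price below are assumptions for illustration, not quotes; plug in your own numbers.

```python
# Rough break-even estimate: buying a used P100 vs. renting one in the cloud.
# Assumed figures (not quotes): $140 used P100 (from the thread), $0.25/hr
# for a 1x P100 instance, $0.15/kWh electricity, ~250 W full-load draw.
P100_PRICE = 140.0   # USD, used card (from the thread)
CLOUD_RATE = 0.25    # USD per hour, assumed marketplace/spot rate
POWER_COST = 0.15    # USD per kWh, assumed local electricity price
P100_WATTS = 250     # approximate full-load board power

def local_cost(hours: float) -> float:
    """Purchase price plus electricity for the given GPU-hours."""
    return P100_PRICE + (P100_WATTS / 1000.0) * hours * POWER_COST

def cloud_cost(hours: float) -> float:
    """Pure rental cost for the same GPU-hours."""
    return CLOUD_RATE * hours

for hours in (100, 300, 600, 1000):
    print(f"{hours:5d} h   local: ${local_cost(hours):7.2f}   "
          f"cloud: ${cloud_cost(hours):7.2f}")
```

With these assumed rates the purchase pays for itself somewhere around 650 GPU-hours; the crossover moves a lot with the electricity price and the rental rate you actually get.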
 

CyklonDX

Well-Known Member
P100 performance is around that of a 2070 in games, and a 2080 Ti in AI workloads like stable diffusion. Depending on your funds, I would advise getting an A4000 instead.

P100 has no power states (it always runs at full clock rate, as the clock generator was bound to NVLink), but it has the fastest memory of the three.
P40 has more VRAM (24 GB) and does have power states; better for AI, as you are less likely to run into out-of-memory scenarios.
A4000 is faster than both of the above and has power states, but sadly only 16 GB of VRAM. (If you want to check power states and draw on your own cards, see the NVML sketch below.)
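A minimal sketch for polling the power state, draw, and VRAM use over NVML, using the pynvml bindings (pip package nvidia-ml-py); the GPU index is assumed to be 0:

```python
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU; adjust as needed

name = pynvml.nvmlDeviceGetName(handle)        # bytes on older pynvml builds
pstate = pynvml.nvmlDeviceGetPerformanceState(handle)  # 0 = P0 (max clocks)
power_mw = pynvml.nvmlDeviceGetPowerUsage(handle)      # current draw, mW
limit_mw = pynvml.nvmlDeviceGetEnforcedPowerLimit(handle)
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)

print(f"{name}: P{pstate}, {power_mw / 1000:.0f} W of {limit_mw / 1000:.0f} W, "
      f"{mem.used / 2**30:.1f}/{mem.total / 2**30:.1f} GiB VRAM used")

pynvml.nvmlShutdown()
```

A card with working power states should drop to a high P-state (e.g. P8) and low wattage at idle; per the above, a P100 will sit at P0 regardless.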


P100 can only realistically utilize around 65% of its memory bandwidth (even less under full load), so the faster memory makes little difference in the grand scheme of things.
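If you want to sanity-check the effective-bandwidth claim on your own card, a crude device-to-device copy microbenchmark is enough. A minimal PyTorch sketch, assuming a CUDA build (compare the result against the P100's 732 GB/s spec-sheet number):

```python
import torch

def effective_bandwidth(gib: float = 1.0, iters: int = 20) -> float:
    """Time repeated device-to-device copies and report GB/s.
    Each copy reads and writes every byte once, hence the factor of 2."""
    n = int(gib * 1024**3)
    src = torch.empty(n, dtype=torch.uint8, device="cuda")
    dst = torch.empty_like(src)
    torch.cuda.synchronize()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        dst.copy_(src)
    end.record()
    torch.cuda.synchronize()
    seconds = start.elapsed_time(end) / 1000.0  # elapsed_time is in ms
    return 2 * n * iters / seconds / 1e9

print(f"effective bandwidth: {effective_bandwidth():.0f} GB/s")
```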


In Stable Diffusion 1.4, the same workload of creating an image with k_euler:

512x512 + upscale with RealESRGAN:
P100: 24 sec
P40: 26 sec
A4000: 12 sec

1920x1080 + upscale with RealESRGAN:
P100: out of memory
P40: 49 sec
A4000: out of memory

(In theory 1280x768 is the max you can get with k_euler on 16 GB of VRAM, but with some optimizations you could probably get closer to 1080p; see the sketch below.)
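For anyone wanting to reproduce numbers like these, here is a minimal sketch with Hugging Face diffusers, assuming the CompVis/stable-diffusion-v1-4 weights and the Euler scheduler. The upscaling step is not included, so this times generation only; attention slicing is the kind of optimization that helps push past 1280x768 on 16 GB.

```python
import time
import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

# Load SD 1.4 in fp16 to save VRAM; on cards with weak fp16 throughput
# (e.g. the P40) switch to torch.float32 at roughly double the memory.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
)
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")
pipe.enable_attention_slicing()  # lower peak VRAM at a small speed cost

prompt = "a castle on a hill, detailed matte painting"  # placeholder prompt
torch.cuda.synchronize()
t0 = time.time()
image = pipe(prompt, height=512, width=512, num_inference_steps=30).images[0]
torch.cuda.synchronize()
print(f"512x512 in {time.time() - t0:.1f} s")
image.save("512.png")
```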
 

CyklonDX

Well-Known Member
This is for stable diffusion:

[benchmark chart screenshot omitted]

with the following models:

[model list screenshot omitted]

Creating images such as: [example images omitted]