SXM2 over PCIe


gsrcrxsi

Active Member
Dec 12, 2018
303
103
43
Yeah, they've been around that price for quite a while. I bought a bunch of them; 15 are in service now.

Still trying to replace them with V100 SXM2 setups, though. The supply of SXMV boards seems to have dried up quickly.
 
  • Like
Reactions: CyklonDX

piranha32

Active Member
Mar 4, 2023
246
178
43

Underscore

New Member
Oct 21, 2023
6
0
1
Just FYI in case someone is looking: the Titan V has recently gone cheap. It can be bought for around 400-500 USD on eBay.

For those interested, here's a list of the cards I had for my own research (non-SXM2, compute/FP16 oriented).
Are Titan V's even remotely worth it anymore? 12GB of Volta is notably worse than 11GB of Turing (the 2080 Ti), even with the HBM, and at that price point you could get the modded 22GB variant per @bayleyw's suggestion. Yes, you lose FP64, but you get INT4 support instead, and since you mentioned FP16, the 2080 Ti is about on par there. Longer driver support and RT cores are a nice plus.

So the V100 seems to be the better option, all in all.
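
If you want to check where a given card falls on these feature lines, here's a minimal sketch of mine (assumes PyTorch with CUDA installed) that maps each device's compute capability to the tensor-core dtypes discussed above. The feature strings are rough shorthand, not an exhaustive NVIDIA spec:

```python
# Map compute capability to the dtype features discussed in this thread.
# Summaries are rough shorthand, not an exhaustive spec.
import torch

FEATURES = {
    (7, 0): "Volta (Titan V/V100): FP16 tensor cores, 1:2-rate FP64",
    (7, 5): "Turing (2080 Ti/RTX 8000): FP16 + INT8/INT4 tensor cores",
    (8, 0): "Ampere (A100): FP16/BF16/TF32 tensor cores",
    (8, 6): "Ampere (3080/3090/A6000): FP16/BF16/TF32 tensor cores",
    (8, 9): "Ada (4090): adds FP8 (Transformer Engine)",
}

for i in range(torch.cuda.device_count()):
    cc = torch.cuda.get_device_capability(i)
    name = torch.cuda.get_device_name(i)
    print(f"{name}: sm_{cc[0]}{cc[1]} - {FEATURES.get(cc, 'unknown')}")
```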
 
Last edited:

CyklonDX

Well-Known Member
Nov 8, 2022
857
282
63
The 22G variant or a Quadro RTX 5000-8000 is definitely the better deal in terms of VRAM; the INT8/tensor performance is a small price to pay for the capacity.
On NVIDIA cards, INT8 has almost always been 4x FP32 since Volta, if I recall correctly, though it doesn't scale that well in reality: the Titan V only produced 3.9x FP32 with ECC disabled on the VRAM; with ECC it was more like 3.4x, rendering it slower than a 2080.
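
If you want to sanity-check that scaling on your own card, here's a rough harness of mine (assumes PyTorch with CUDA; 2*n^3 is the standard matmul FLOP estimate). It measures FP32 vs FP16; an INT8 GEMM would slot into the same loop wherever your framework exposes one:

```python
# Rough matmul throughput harness: measures achieved TFLOP/s per dtype
# so you can check dtype scaling claims on your own card.
import time
import torch

def tflops(dtype, n=4096, iters=50):
    a = torch.randn(n, n, device="cuda").to(dtype)
    b = torch.randn(n, n, device="cuda").to(dtype)
    for _ in range(5):                      # warm-up, excluded from timing
        a @ b
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        a @ b
    torch.cuda.synchronize()
    dt = time.perf_counter() - t0
    return 2 * n**3 * iters / dt / 1e12     # 2*n^3 FLOPs per matmul

for dt in (torch.float32, torch.float16):
    print(dt, f"{tflops(dt):.1f} TFLOP/s")
```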

While the Titan V is weak, the 2080/Ti is even weaker out of the box in terms of runtime, though the difference is a couple of seconds at most.
If one is looking for FP16/INT8, the 3080 is a much better and cheaper option; the performance gap is so large that it makes both the Titan V and the 2080 look like outdated retro-ware - nice cards to hang on the wall.

(The same goes when you compare them to the 7900 XTX: it blows things out of the water, as long as there's some AMD support for your workload. I tested a few language models in LM Studio a few days ago and was quite surprised at the performance. A year ago I ran similar models on a 3080 Ti and each response took about 1.2 s, so I was stunned that on the 7900 XTX they were practically instantaneous. If I ever get into it again I may write up my results for the 3080 Ti and the 7900 XTX, and for a 40-series or some other GPU if I pick one up - for now I have too many.)
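
For anyone wanting to reproduce that kind of comparison, here's a quick sketch of mine (not from the thread) that times a single response from LM Studio's local OpenAI-compatible server. It assumes the server is running on its default port with a model already loaded:

```python
# Time one chat completion against LM Studio's local server
# (default endpoint http://localhost:1234/v1).
import time
import requests

t0 = time.perf_counter()
r = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "local-model",  # placeholder; LM Studio uses the loaded model
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 64,
    },
    timeout=120,
)
dt = time.perf_counter() - t0
reply = r.json()["choices"][0]["message"]["content"]
print(f"{dt:.2f}s  {reply!r}")
```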
 
Last edited:

bayleyw

Active Member
Jan 8, 2014
305
102
43
The Titan V and Quadro GV100 are the last FP64-capable cards with display outputs, so they have some value for scientific simulations, especially if you're a researcher running commercial software that doesn't like living in the cloud. For language modeling, in rough order of viability:

  • 3090/3090 Ti 24GB: probably all you ever need; for bs=1 inference the only faster cards are the A100 and H100, which are orders of magnitude more expensive. Also supports NVLINK'ed pairs for an improved training experience (see the P2P check sketched after this list) - get 48GB for half the price of an A6000, and faster too.
  • A6000 48GB: for the rich among us (or small startups). 2x the VRAM for 4x the price. Actually slower than a 3090 because it uses GDDR6, not GDDR6X. Build NVLINK'ed pairs and get 96GB for $7,000 - save seven grand over an A100 80GB!
  • RTX 8000 48GB: the poor man's version of the A6000, but Turing is not as well supported by frameworks as Ampere
  • 4090: so fast it's a weapon, and also supports Transformer Engine. Thanks to our dear friend George, also supports peermem on Linux. Not worth the extra money for batch size 1 inference, but might be worth it for training because it supports fp8
  • 2080 Ti 22GB: slow as balls, but feature rich: int8 tensor cores, NVLINK, two-slot blowers available. Not worth it for anything less than four cards, but really convenient in 4x/8x configs since you don't need to jump through hoops with risers.
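
Before counting on any of the NVLINK'ed-pair setups above, it's worth confirming the cards can actually reach each other over P2P. A minimal check, assuming PyTorch with CUDA:

```python
# Verify peer-to-peer access between every GPU pair (NVLink, or
# peermem on patched 4090s) before relying on fast pooled transfers.
import torch

n = torch.cuda.device_count()
for i in range(n):
    for j in range(n):
        if i != j:
            ok = torch.cuda.can_device_access_peer(i, j)
            print(f"GPU{i} -> GPU{j}: {'P2P ok' if ok else 'no P2P'}")
```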
 
  • Like
Reactions: piranha32

CyklonDX

Well-Known Member
Nov 8, 2022
857
282
63
For FP64 it might be worth looking at AMD with ZLUDA; there's been plenty of development on the AMD/Windows side for running on DirectML (the MI100 and MI210 might see new life with those).
(Just last night I managed to run Stable Diffusion on a 7900 XTX under Windows 10. I don't have a good comparison against NVIDIA cards at this time, as the workflow isn't ComfyUI-like.)
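
For reference, getting a tensor onto the card via DirectML is nearly a one-liner with Microsoft's torch-directml package. A minimal sketch (assumes `pip install torch-directml` on Windows; API names per that package's docs):

```python
# Smoke test: confirm a DirectML adapter is visible and run a matmul on it.
import torch
import torch_directml

dml = torch_directml.device()             # first DirectML adapter
print(torch_directml.device_name(0))      # e.g. the 7900 XTX, if installed
x = torch.randn(1024, 1024, device=dml)   # allocate on the GPU
print((x @ x).abs().mean().item())        # matmul executes via DirectML
```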
 
Last edited:
  • Like
Reactions: piranha32