Automotive A100 SXM2 for FSD? (NVIDIA DRIVE A100)


Underscore

New Member
Oct 21, 2023
I recently found a new A100 listing on eBay. What caught my eye is that it really does appear to be an SXM2 model (plus the very low price for an A100).
[attached screenshot of the eBay listing]
Has anyone heard of this? The PCIe DRIVE A100 is documented on TechPowerUp, so I wonder if anybody has tried one of these on a non-FSD project.
 

bayleyw

Active Member
Jan 8, 2014
These are nothing close to 'very cheap': they are engineering samples that only have 4 stacks of HBM (32GB, reduced bandwidth). I've seen them work fine on third-party PCIe adapters, but it's unclear whether they have NVLink, and presumably if you're buying an SXM2 A100 instead of an SXM4-based one it's because you have some SXM2-based trays you would like to upgrade.

If they have NVLink I'd pay about $2K for them; if they don't, maybe $1K. After the adapter and cooler you're looking at a $1,500 project for a six-slot-thick (!) PCIe GPU that is basically a 4090 with 8GB of extra memory.
 

Underscore

New Member
Oct 21, 2023
PCI-e GPU that is basically a 4090 with 8GB of extra memory.
Oh, definitely not worth it as it is, but I'm just curious whether this will ever be an option in, say, three or so years, when the A100 will be about as old as the V100 is now. As you know, the V100 is an extremely viable option as a GPU, especially when NVLinked, and its price has dropped steeply over six years, as you'd expect.

The standard A100 and any SXM4 boards are far from consumer-friendly even after some years (private single-GPU adapters aside), whereas an engineering sample, especially an SXM2 one, could be a good option on a standard AOM-SXMV or 0CTHR board in the (relatively) near future. Whether it can be NVLinked is of course an important question too, as you mention.
 

bayleyw

Active Member
Jan 8, 2014
In three years' time you'll be able to buy HGX A100s for cheap, and the Redstone boards will be much easier to deal with since there is only one implementation instead of half a dozen DGX-like flavors.
 

MilkyWeight

New Member
Mar 15, 2024
These are nothing close to 'very cheap': they are engineering samples that only have 4 stacks of HBM (32GB, reduced bandwidth). I've seen them work fine on third-party PCIe adapters, but it's unclear whether they have NVLink, and presumably if you're buying an SXM2 A100 instead of an SXM4-based one it's because you have some SXM2-based trays you would like to upgrade.

If they have NVLink I'd pay about $2K for them; if they don't, maybe $1K. After the adapter and cooler you're looking at a $1,500 project for a six-slot-thick (!) PCIe GPU that is basically a 4090 with 8GB of extra memory.
I bought two of these to test out. I tried them on a server with an SXM2 GPU board/module; it didn't even recognize them. So I got a PCIe adapter and tested again. Same thing, they aren't recognized at all. Is there some trick to using them?
 

kedzior

Active Member
Mar 21, 2018
I bought two of these to test out. I tried them on a server with an SXM2 GPU board/module; it didn't even recognize them. So I got a PCIe adapter and tested again. Same thing, they aren't recognized at all. Is there some trick to using them?
Did you ever manage to get them to work?
 

xdever

Member
Jun 29, 2021
I'd also be interested in whether this actually works. Does somebody have hands-on experience getting them to work, and maybe an nvidia-smi screenshot? I asked a seller on eBay, and they claim it won't work in standard servers, which is strange given that SXM2 is just PCI Express with a different connector. Is it possible that it uses the same connector, but the pinout is actually not SXM2?
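One way to narrow this down, independent of the NVIDIA driver, is to check whether the card shows up in PCI enumeration at all. A minimal sketch, assuming a Linux host; the helper function and the canned `lspci -nn` line are illustrative, but vendor ID `10de` really is NVIDIA's:

```python
import re

NVIDIA_VENDOR_ID = "10de"  # PCI vendor ID assigned to NVIDIA

def find_nvidia_devices(lspci_output: str) -> list[str]:
    """Return bus addresses of NVIDIA devices found in `lspci -nn` output.

    If the GPU doesn't appear here, the failure is at PCIe enumeration,
    i.e. before the driver or any firmware loading is involved.
    """
    devices = []
    for line in lspci_output.splitlines():
        # lspci -nn prints IDs as [vendor:device], e.g. [10de:20b0]
        if re.search(rf"\[{NVIDIA_VENDOR_ID}:[0-9a-f]{{4}}\]", line):
            devices.append(line.split()[0])  # bus address, e.g. "65:00.0"
    return devices

# Example against a canned line (the device ID here is illustrative):
sample = "65:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:20b0] (rev a1)"
print(find_nvidia_devices(sample))  # -> ['65:00.0']
```

On a live system you would feed it the stdout of `lspci -nn` (e.g. via `subprocess.run`); if nothing matches, the card never enumerated, so no driver or VBIOS question even comes into play yet.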
 

bayleyw

Active Member
Jan 8, 2014
I think the firmware is wonky, since DRIVE is a somewhat odd ARM platform. I also don't know how NVLink on these would work: A100 NVLink is clocked higher than V100 NVLink, and the baseboards are not going to like the higher data rates.
 

xdever

Member
Jun 29, 2021
But the VBIOS (stored in an EEPROM on the GPU) should be enough for them to show up on the PCI bus, right? @MilkyWeight was saying it's not recognized for him. PCIe enumeration must happen before the driver can even load firmware. I'm personally not (yet) interested in NVLink.

That is a good point about the frequencies. But it also means that either there is a signal telling the GPU that this is a "special SXM2", or these are downclocked NVLinks, because the impedance of the traces on the board must be matched to the frequency they operate at. That could also explain why they won't start on the standard boards.

The Chinese adapters have ~22 unconnected pins on the PCIe side, and all the NVLink pins are unpopulated, leaving a lot of room for additional unknown signals that might be needed for power-up. Unless somebody has access to a DRIVE kit, this seems a bit hopeless.
 

xdever

Member
Jun 29, 2021
Another thing I forgot to mention: on my Chinese adapter, the onboard 5 V SMPS couldn't provide enough current for my V100. I had to desolder it and feed the 5 V rail directly from the PC's power supply, and then the card was recognized. @MilkyWeight, you might also want to check the output of the 3.3 V linear regulator for the same reason.
 

xdever

Member
Jun 29, 2021
This reinforces my suspicion about the power supply issue. I found this, where they ran it in an adapter, and they posted some screenshots.
 

Leiko

Member
Aug 15, 2021
This reinforces my suspicion about the power supply issue. I found this, where they ran it in an adapter, and they posted some screenshots.
Found some pics on the Chinese second-hand market of people running them on adapters / standalone PCIe SXM2 boards. [attached photo]
^ This is from a Xianyu listing for a DRIVE A100.
 

gsrcrxsi

Active Member
Dec 12, 2018
If these come down a lot in price, they might be compelling. But they need to come down A LOT; the current cheapest one is ~$2,500 on eBay.

I haven't really seen any real performance numbers, but looking at some of the other info online, it's not a full A100: it's cut down to 96 SMs (from the full A100's 108) and 32GB of VRAM.

If you scale the FP32/FP64 specs by that, you land around 17.32 TFLOPS (FP32) and 8.66 TFLOPS (FP64). That's only about 10% better than a V100 32GB (~$800 on eBay, and dropping). If your application can benefit from Ampere's better tensor cores, or from the ability to do stuff like BF16 or TF32, it might scale more than that 10%.

I'd love to see some real numbers across a variety of workloads: HPC, raw FP compute, and other stuff.
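The scaling described above is easy to reproduce: clock-for-clock, peak non-tensor throughput is proportional to SM count. A quick sketch of that arithmetic (the equal-clocks assumption is mine; an engineering sample may not sustain full-A100 clocks, and the base figures are the commonly quoted full-A100 peaks):

```python
# Estimate the cut-down DRIVE A100 (96 SMs) by scaling full-A100 (108 SMs)
# peak non-tensor throughput linearly with SM count.
FULL_A100_SMS = 108
FULL_FP32_TFLOPS = 19.49   # full A100 non-tensor FP32 peak
FULL_FP64_TFLOPS = 9.746   # full A100 non-tensor FP64 peak

def scale_by_sms(full_tflops: float, sms: int, full_sms: int = FULL_A100_SMS) -> float:
    """Scale a peak-throughput figure linearly with SM count."""
    return full_tflops * sms / full_sms

fp32 = scale_by_sms(FULL_FP32_TFLOPS, 96)
fp64 = scale_by_sms(FULL_FP64_TFLOPS, 96)
print(f"FP32 ~{fp32:.2f} TFLOPS, FP64 ~{fp64:.2f} TFLOPS")
# -> FP32 ~17.32 TFLOPS, FP64 ~8.66 TFLOPS
```

Real-world throughput would also depend on memory bandwidth, which the 4-stack HBM configuration cuts as well, so treat these as upper bounds.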
 

bayleyw

Active Member
Jan 8, 2014
Ampere was never a compelling FP64 card to begin with; the full A100 was only ~9.7 TFLOPS, an incremental improvement over V100. But Ampere has almost 3x the tensor-core throughput of Volta. More importantly, it supports modern data formats (BF16) and libraries (FlashAttention-2), which makes it much easier to run DL frameworks out of the box. We are reaching the point where a lot of research codebases only call flash2 because the devs built their models on Ampere, which means a lot of surgery is needed to get them running on Volta.

The problem is it seems like the SXM2 A100s don't run on SXM2 NVLink baseboards. Without NVLink you are better off with 4090s, which now support peermem and are about as fast.
 

gsrcrxsi

Active Member
Dec 12, 2018
Yeah, it would be cool if the A100 SXM2 "DRIVE" units worked on the AOM-SXMV boards, but I 100% expected they wouldn't.

Maybe it *could* be made to work if there were a way to disable all the NVLink links and use them as a compute cluster of separate GPUs instead (which is how I use the V100s anyway), but I won't hold my breath for anything like that.

The individual SXM2-to-PCIe adapters are pretty pricey per GPU versus the AOM-SXMV, about 4x more expensive: ~$250 each vs ~$250 for a board that takes four. But that might be the only way forward if you want to use these A100s.
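On the "separate GPUs" idea: even without touching the links in hardware, NCCL-based frameworks can at least be told not to attempt peer-to-peer transfers. These are real NCCL/CUDA environment variables, but whether this would help a DRIVE part come up on an NVLink baseboard at all is pure speculation on my part:

```python
import os

# Tell NCCL to skip P2P (NVLink / PCIe peer) transfers and stage traffic
# through host memory instead. Must be set before the framework
# initializes NCCL.
os.environ["NCCL_P2P_DISABLE"] = "1"

# Pin this process to a single GPU so each worker treats its device as an
# independent card (the index here is illustrative).
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
```

Each worker in a multi-GPU job would get its own `CUDA_VISIBLE_DEVICES` value; that mirrors the "cluster of separate GPUs" usage described above.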
 

matrix1

New Member
Aug 11, 2024
Yeah, it would be cool if the A100 SXM2 "DRIVE" units worked on the AOM-SXMV boards, but I 100% expected they wouldn't.

Maybe it *could* be made to work if there were a way to disable all the NVLink links and use them as a compute cluster of separate GPUs instead (which is how I use the V100s anyway), but I won't hold my breath for anything like that.

The individual SXM2-to-PCIe adapters are pretty pricey per GPU versus the AOM-SXMV, about 4x more expensive: ~$250 each vs ~$250 for a board that takes four. But that might be the only way forward if you want to use these A100s.
Yes, they can be used on the Supermicro AOM-SXMV board.
 

matrix1

New Member
Aug 11, 2024
Ampere was never a compelling FP64 card to begin with; the full A100 was only ~9.7 TFLOPS, an incremental improvement over V100. But Ampere has almost 3x the tensor-core throughput of Volta. More importantly, it supports modern data formats (BF16) and libraries (FlashAttention-2), which makes it much easier to run DL frameworks out of the box. We are reaching the point where a lot of research codebases only call flash2 because the devs built their models on Ampere, which means a lot of surgery is needed to get them running on Volta.

The problem is it seems like the SXM2 A100s don't run on SXM2 NVLink baseboards. Without NVLink you are better off with 4090s, which now support peermem and are about as fast.
I've seen one run on a Supermicro SXM2 baseboard, but NVLink doesn't work properly and it communicates over PCIe 3.0 x16.
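That link state is something `nvidia-smi` can report directly via `--query-gpu=pcie.link.gen.current,pcie.link.width.current --format=csv,noheader`. A small sketch that parses such output; the canned string is mine, mirroring the PCIe 3.0 x16 fallback described above:

```python
def parse_pcie_link(csv_output: str) -> tuple[int, int]:
    """Parse one row of `nvidia-smi --query-gpu=pcie.link.gen.current,
    pcie.link.width.current --format=csv,noheader` into (generation, width)."""
    gen, width = (field.strip() for field in csv_output.split(","))
    return int(gen), int(width)

# Canned output for a card that fell back to PCIe 3.0 x16:
gen, width = parse_pcie_link("3, 16")
print(f"PCIe {gen}.0 x{width}")  # -> PCIe 3.0 x16
```

On a working NVLink setup you would additionally check `nvidia-smi nvlink --status`, which should list active links per GPU; on these DRIVE boards it apparently would not.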