Automotive A100 SXM2 for FSD? (NVIDIA DRIVE A100)


Underscore

New Member
Oct 21, 2023
I recently found a new A100 listing on eBay. What caught my eye is that it really does appear to be an SXM2 model (plus the very low price for an A100).
[attached screenshot of the eBay listing]
Has anyone heard of this? The PCIe DRIVE A100 is documented on TechPowerUp, so I wonder if anybody has tried one of these on a non-FSD project.
 

bayleyw

Active Member
Jan 8, 2014
These are nothing close to 'very cheap': they are engineering samples that only have 4 stacks of HBM (32GB, reduced bandwidth). I've seen them work fine on third-party PCIe adapters, but it's unclear whether they have NVLink, and presumably if you're buying an SXM2 A100 instead of an SXM4-based one it's because you have some SXM2-based trays you would like to upgrade.

If they have NVLink I'd pay about $2K for them; if they don't, maybe $1K. After the adapter and cooler you're looking at a $1,500 project for a six-slot-thick (!) PCIe GPU that is basically a 4090 with 8GB of extra memory.
 

Underscore

New Member
Oct 21, 2023
PCI-e GPU that is basically a 4090 with 8GB of extra memory.
Oh, definitely not worth it as it is, but I'm just curious whether this will ever be an option in, say, three or so years, when the A100 will be about as old as the V100 is now. As you know, the V100 is an extremely viable option as a GPU, especially when NVLinked, and its price has dropped steeply over six years, as you'd expect.

The standard A100 and any SXM4 boards are far from consumer-friendly even after some years (private single-GPU adapters aside), whereas an engineering sample, especially an SXM2 one, could be a good option on a standard AOM-SXMV or 0CTHR board in the (relatively) near future. Whether it can be NVLinked is of course an important question too, as you mention.
 

bayleyw

Active Member
Jan 8, 2014
In three years' time you'll be able to buy HGX A100s for cheap, and the Redstone boards will be much easier to deal with since there is only one implementation instead of half a dozen DGX-like flavors.
 

MilkyWeight

New Member
Mar 15, 2024
These are nothing close to 'very cheap': they are engineering samples that only have 4 stacks of HBM (32GB, reduced bandwidth). I've seen them work fine on third-party PCIe adapters, but it's unclear whether they have NVLink, and presumably if you're buying an SXM2 A100 instead of an SXM4-based one it's because you have some SXM2-based trays you would like to upgrade.

If they have NVLink I'd pay about $2K for them; if they don't, maybe $1K. After the adapter and cooler you're looking at a $1,500 project for a six-slot-thick (!) PCIe GPU that is basically a 4090 with 8GB of extra memory.
I bought two of these to test out. I tried them on a server with an SXM2 GPU board/module; it didn't even recognize them. So I got a PCIe adapter and tested again. Same thing, they aren't recognized at all. Is there some trick to using them?
 

kedzior

Active Member
Mar 21, 2018
I bought two of these to test out. I tried them on a server with an SXM2 GPU board/module; it didn't even recognize them. So I got a PCIe adapter and tested again. Same thing, they aren't recognized at all. Is there some trick to using them?
Did you ever manage to get them to work?
 

xdever

Member
Jun 29, 2021
I'd also be interested in whether this actually works. Does somebody have hands-on experience getting them to work, and maybe an nvidia-smi screenshot? I asked a seller on eBay, and they claim it won't work in standard servers, which is strange given that SXM2 is just PCI Express with a different connector. Is it possible that it uses the same connector, but the pinout is actually not SXM2?
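One way to narrow this down, independent of the NVIDIA driver, is to check whether the card shows up in PCI enumeration at all. A minimal sketch, assuming a Linux host; the helper function and the canned `lspci -nn` line are illustrative, but vendor ID `10de` really is NVIDIA's:

```python
import re

NVIDIA_VENDOR_ID = "10de"  # PCI vendor ID assigned to NVIDIA

def find_nvidia_devices(lspci_output: str) -> list[str]:
    """Return bus addresses of NVIDIA devices found in `lspci -nn` output.

    If the GPU doesn't appear here, the failure is at PCIe enumeration,
    i.e. before the driver or any firmware loading is involved.
    """
    devices = []
    for line in lspci_output.splitlines():
        # lspci -nn prints IDs as [vendor:device], e.g. [10de:20b0]
        if re.search(rf"\[{NVIDIA_VENDOR_ID}:[0-9a-f]{{4}}\]", line):
            devices.append(line.split()[0])  # bus address, e.g. "65:00.0"
    return devices

# Example against a canned line (the device ID here is illustrative):
sample = "65:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:20b0] (rev a1)"
print(find_nvidia_devices(sample))  # -> ['65:00.0']
```

On a live system you would feed it the stdout of `lspci -nn` (e.g. via `subprocess.run`); if nothing matches, the card never enumerated, so no driver or VBIOS question even comes into play yet.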
 

bayleyw

Active Member
Jan 8, 2014
I think the firmware is wonky, since DRIVE is a somewhat odd ARM platform. I also don't know how NVLink on these would work: A100 NVLink is clocked higher than V100 NVLink, and the baseboards are not going to like the higher data rates.
 

xdever

Member
Jun 29, 2021
But the VBIOS (stored in an EEPROM on the GPU) should be enough for them to show up on the PCI bus, right? @MilkyWeight was saying it's not recognized for him. PCIe enumeration must happen before the driver can even load firmware. I'm personally not (yet) interested in NVLink.

That is a good point about the frequencies. But it also means that either there is a signal telling the GPU that this is a "special SXM2", or these are downclocked NVLinks, because the impedance of the traces on the board must be matched to the frequency they operate at. That could also explain why they won't start on the standard boards.

The Chinese adapters have ~22 unconnected pins on the PCIe side, and all the NVLink pins are unpopulated, leaving a lot of room for additional unknown signals that might be needed for power-up. Unless somebody has access to a DRIVE kit, this seems a bit hopeless.
 

xdever

Member
Jun 29, 2021
Another thing I forgot to mention: on my Chinese adapter, the onboard 5 V SMPS couldn't provide enough current for my V100. I had to desolder it and feed the 5 V rail directly from the PC's power supply, and then the card was recognized. @MilkyWeight, you might also want to check the output of the 3.3 V linear regulator for the same reason.
 

xdever

Member
Jun 29, 2021
This reinforces my suspicion about the power supply issue. I found this, where they ran it in an adapter, and they posted some screenshots.
 

Leiko

Member
Aug 15, 2021
This reinforces my suspicion about the power supply issue. I found this, where they ran it in an adapter, and they posted some screenshots.
Found some pics on the Chinese second-hand market of people running them on adapters / standalone PCIe SXM2 boards. [attached photo]
^ This is from a Xianyu listing for a DRIVE A100.
 

gsrcrxsi

Active Member
Dec 12, 2018
If these come down a lot in price, they might be compelling. But they need to come down A LOT; the current cheapest one is ~$2,500 on eBay.

I haven't really seen any real performance numbers, but looking at some of the other info online, it's not a full A100: it's cut down to 96 SMs (from the full A100's 108) and 32GB of VRAM.

If you scale the FP32/FP64 specs by that, you land around 17.32 TFLOPS (FP32) and 8.66 TFLOPS (FP64). That's only about 10% better than a V100 32GB (~$800 on eBay, and dropping). If your application can benefit from Ampere's better tensor cores, or from the ability to do stuff like BF16 or TF32, it might scale more than that 10%.

I'd love to see some real numbers across a variety of workloads: HPC, raw FP compute, and other stuff.
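The scaling described above is easy to reproduce: clock-for-clock, peak non-tensor throughput is proportional to SM count. A quick sketch of that arithmetic (the equal-clocks assumption is mine; an engineering sample may not sustain full-A100 clocks, and the base figures are the commonly quoted full-A100 peaks):

```python
# Estimate the cut-down DRIVE A100 (96 SMs) by scaling full-A100 (108 SMs)
# peak non-tensor throughput linearly with SM count.
FULL_A100_SMS = 108
FULL_FP32_TFLOPS = 19.49   # full A100 non-tensor FP32 peak
FULL_FP64_TFLOPS = 9.746   # full A100 non-tensor FP64 peak

def scale_by_sms(full_tflops: float, sms: int, full_sms: int = FULL_A100_SMS) -> float:
    """Scale a peak-throughput figure linearly with SM count."""
    return full_tflops * sms / full_sms

fp32 = scale_by_sms(FULL_FP32_TFLOPS, 96)
fp64 = scale_by_sms(FULL_FP64_TFLOPS, 96)
print(f"FP32 ~{fp32:.2f} TFLOPS, FP64 ~{fp64:.2f} TFLOPS")
# -> FP32 ~17.32 TFLOPS, FP64 ~8.66 TFLOPS
```

Real-world throughput would also depend on memory bandwidth, which the 4-stack HBM configuration cuts as well, so treat these as upper bounds.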
 

bayleyw

Active Member
Jan 8, 2014
Ampere was never a compelling FP64 card to begin with; the full A100 was only ~9.7 TFLOPS, an incremental improvement over V100. But Ampere has almost 3x the tensor-core throughput of Volta. More importantly, it supports modern data formats (BF16) and libraries (FlashAttention-2), which makes it much easier to run DL frameworks out of the box. We are reaching the point where a lot of research codebases only call flash2 because the devs built their models on Ampere, which means a lot of surgery is needed to get them running on Volta.

The problem is it seems like the SXM2 A100s don't run on SXM2 NVLink baseboards. Without NVLink you are better off with 4090s, which now support peermem and are about as fast.
 

gsrcrxsi

Active Member
Dec 12, 2018
Yeah, it would be cool if the A100 SXM2 "DRIVE" units worked on the AOM-SXMV boards, but I 100% expected they wouldn't.

Maybe it *could* be made to work if there were a way to disable all the NVLink links and use them as a compute cluster of separate GPUs instead (which is how I use the V100s anyway), but I won't hold my breath for anything like that.

The individual SXM2-to-PCIe adapters are pretty pricey per GPU versus the AOM-SXMV, about 4x more expensive: ~$250 each vs ~$250 for a board that takes four. But that might be the only way forward if you want to use these A100s.
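On the "separate GPUs" idea: even without touching the links in hardware, NCCL-based frameworks can at least be told not to attempt peer-to-peer transfers. These are real NCCL/CUDA environment variables, but whether this would help a DRIVE part come up on an NVLink baseboard at all is pure speculation on my part:

```python
import os

# Tell NCCL to skip P2P (NVLink / PCIe peer) transfers and stage traffic
# through host memory instead. Must be set before the framework
# initializes NCCL.
os.environ["NCCL_P2P_DISABLE"] = "1"

# Pin this process to a single GPU so each worker treats its device as an
# independent card (the index here is illustrative).
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
```

Each worker in a multi-GPU job would get its own `CUDA_VISIBLE_DEVICES` value; that mirrors the "cluster of separate GPUs" usage described above.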
 

matrix1

New Member
Aug 11, 2024
Yeah, it would be cool if the A100 SXM2 "DRIVE" units worked on the AOM-SXMV boards, but I 100% expected they wouldn't.

Maybe it *could* be made to work if there were a way to disable all the NVLink links and use them as a compute cluster of separate GPUs instead (which is how I use the V100s anyway), but I won't hold my breath for anything like that.

The individual SXM2-to-PCIe adapters are pretty pricey per GPU versus the AOM-SXMV, about 4x more expensive: ~$250 each vs ~$250 for a board that takes four. But that might be the only way forward if you want to use these A100s.
Yes, they can be used on the Supermicro AOM-SXMV board.
 

matrix1

New Member
Aug 11, 2024
Ampere was never a compelling FP64 card to begin with; the full A100 was only ~9.7 TFLOPS, an incremental improvement over V100. But Ampere has almost 3x the tensor-core throughput of Volta. More importantly, it supports modern data formats (BF16) and libraries (FlashAttention-2), which makes it much easier to run DL frameworks out of the box. We are reaching the point where a lot of research codebases only call flash2 because the devs built their models on Ampere, which means a lot of surgery is needed to get them running on Volta.

The problem is it seems like the SXM2 A100s don't run on SXM2 NVLink baseboards. Without NVLink you are better off with 4090s, which now support peermem and are about as fast.
I've seen one run on a Supermicro SXM2 baseboard, but NVLink doesn't work properly and it communicates over PCIe 3.0 x16.
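That link state is something `nvidia-smi` can report directly via `--query-gpu=pcie.link.gen.current,pcie.link.width.current --format=csv,noheader`. A small sketch that parses such output; the canned string is mine, mirroring the PCIe 3.0 x16 fallback described above:

```python
def parse_pcie_link(csv_output: str) -> tuple[int, int]:
    """Parse one row of `nvidia-smi --query-gpu=pcie.link.gen.current,
    pcie.link.width.current --format=csv,noheader` into (generation, width)."""
    gen, width = (field.strip() for field in csv_output.split(","))
    return int(gen), int(width)

# Canned output for a card that fell back to PCIe 3.0 x16:
gen, width = parse_pcie_link("3, 16")
print(f"PCIe {gen}.0 x{width}")  # -> PCIe 3.0 x16
```

On a working NVLink setup you would additionally check `nvidia-smi nvlink --status`, which should list active links per GPU; on these DRIVE boards it apparently would not.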