Automotive A100 SXM2 for FSD? (NVIDIA DRIVE A100)


blackcat1402

New Member
Dec 10, 2024
26
6
3
Maybe yes, depending on your use case, but I am happy with them together with exllamav2 for Qwen series inference; super fast :D
 

blackcat1402

New Member
Dec 10, 2024
26
6
3
NVIDIA DRIVE A100 Automotive SXM2 GPU (Model: 900-6G199-0000-C00) for Autonomous Vehicles Specifications
Key Features:
  • Model Number: 900-6G199-0000-C00
  • Product Type: Automotive GPU
  • Architecture: NVIDIA Ampere
  • Form Factor: SXM2 (for integration into automotive systems)
  • Target Application: Autonomous Vehicles, AI, Machine Learning, Edge Computing
Performance Specifications:
  • CUDA Cores: 6,912 CUDA cores
  • GPU Memory: 40GB HBM2 (High Bandwidth Memory 2)
  • Memory Bandwidth: 1.6 TB/s
  • FP32 (Single Precision): Up to 312 TOPs (Tera Operations per Second)
  • Tensor Operations (Tensor Cores): Up to 1,248 TOPs (AI and deep learning performance)
  • RT Cores (Ray Tracing): None. The A100's Ampere compute die has no dedicated RT cores.
  • Processing Power:
    • FP16 Performance: 624 TOPs (half precision)
    • INT8 Performance: 2,496 TOPs (ideal for AI inference tasks)
Key Technologies & Features:
  • NVIDIA Ampere Architecture: The latest GPU architecture built for high-performance AI, deep learning, and autonomous systems. It delivers significant improvements in performance, efficiency, and scale compared to previous generations.
  • Deep Learning & AI Acceleration:
    • Tensor Cores provide high-efficiency processing for matrix operations, accelerating deep learning models, neural network inference, and training.
    • The NVIDIA DRIVE platform supports autonomous driving, enabling processing of high-resolution sensor data, object detection, and decision-making in real time.
  • Safety & Reliability:
    • Designed and validated for automotive-grade applications, with certifications for ISO 26262 and other safety standards.
    • High-reliability components ensure safe operation in harsh automotive environments.
    • Built for 24/7 operation in dynamic conditions, including extreme temperatures and vibration.
  • Connectivity:
    • Integrates with NVIDIA DRIVE AGX platforms, offering flexible and scalable solutions for in-vehicle AI systems.
    • PCIe Gen 4 support allows for fast data transfer between the GPU and other system components.
    • Gigabit Ethernet and high-speed interconnects support communication with external systems for continuous data streaming and synchronization.
  • Multi-Modal Sensor Fusion:
    • Optimized to handle sensor data from cameras, LiDAR, radar, and other sensors for autonomous driving.
    • Real-time processing of video streams, radar data, and depth information to enable accurate perception and decision-making.
Computational Features:
  • Ray Tracing: The A100 has no dedicated RT cores, so it does not hardware-accelerate ray tracing for the simulations used in autonomous vehicle testing and development.
  • AI Inference: Specialized Tensor Cores allow for high throughput in AI inference tasks, including object recognition, segmentation, and classification, key for autonomous driving.
  • High Precision Computations: Capable of performing floating-point and integer operations at extreme speeds, including FP64 (double precision) for scientific and engineering tasks in simulation and autonomous vehicle systems.
Power and Thermal:
  • Power Consumption: Typically around 300W (specific power requirements depend on usage and system configuration).
  • Thermal Design Power (TDP): Optimized for automotive environments with robust cooling solutions. The SXM2 form factor integrates well with thermal designs, ensuring stable performance under high load.
Software and Ecosystem:
  • NVIDIA DRIVE Software: Integrated with the NVIDIA DRIVE OS, a complete platform for autonomous vehicle development that includes libraries, tools, and frameworks for AI-based applications.
  • CUDA, cuDNN, TensorRT Support: Software libraries that provide accelerated computing frameworks for deep learning, neural networks, and AI workloads, ensuring the GPU is used to its full potential.
Automotive Integration:
  • Form Factor: SXM2 form factor is designed for seamless integration into automotive systems, providing scalability for high-performance computing within the vehicle.
  • Automotive Certifications: Built to meet rigorous automotive standards, including ISO 26262 functional safety and AEC-Q100 automotive-grade reliability requirements.
Applications and Use Cases:
  • Autonomous Vehicles: Real-time sensor fusion, perception, and decision-making for fully autonomous driving.
  • Driver Assistance Systems: Enhanced driver assistance features, including adaptive cruise control, lane-keeping assistance, and emergency braking.
  • AI and Machine Learning: High-throughput processing for AI inference tasks such as object detection, pedestrian tracking, and obstacle avoidance.
  • Simulation and Testing: Used in simulation environments to model, train, and test autonomous driving algorithms, offering high accuracy and fast processing speeds.
Conclusion:
The NVIDIA DRIVE A100 Automotive SXM2 GPU (Model: 900-6G199-0000-C00) is a cutting-edge solution for autonomous vehicles, designed to handle complex AI workloads, sensor fusion, and real-time decision-making. With its powerful Ampere architecture, 40GB of HBM2 memory, and Tensor Cores, the A100 delivers top-tier performance for autonomous driving systems. Its reliability, safety features, and ability to process large volumes of sensor data in real time make it an ideal choice for next-generation autonomous vehicles, offering a scalable platform for both current and future automotive AI applications.
 

wheat_field

New Member
Nov 23, 2024
7
2
3
Going to try my hand at delidding one of these soon with a heat gun and Xacto knife; will send notes on my experience if I'm successful. Before I try, I wanted to check whether anybody has any resources besides the image in the Xianyu listing, whether personal experience or an online post.
 

pingyuin

New Member
Oct 30, 2024
14
8
3
Wow! You gotta be careful, cuz it looks like the supply of these modules is thinning. As for the heat gun, try not to overheat anything; it's much better if you have a thermocouple or something similar to monitor the lid temperature while heating. Some glues are sensitive to solvents like rubbing alcohol, acetone, xylol and so on, and using one of those instead of heating the lid supposedly won't do any harm.

As for that photo, it looks like a thin metal spudger was used, and an Xacto knife is probably not the best tool for the job. There is also a thin metal wire in the photo, which was probably used to pry the lid as well.

If you are going to run it as a bare chip after you delid it, you'll need to think about the height of the cooling solution, which must contact the chiplets and the DrMOS at the same time without a big gap between the chiplets and the metal base. A phase-change thermal interface is a must. Because there is no bezel around the chiplets, the risk of chipping is pretty high. If you are going to use liquid metal, then make sure the lid is not too prone to LM corrosion, which in most cases means that it's nickel plated.
 

blackcat1402

New Member
Dec 10, 2024
26
6
3
Can I ask why you want to disassemble this SXM2? My current application is mainly LLM inference. With the latest MoE architectures such as Qwen3 or GLM, the heat problem has been greatly reduced. I used to run three fans for air cooling at 6 W each, but now the three fans only use 1 W and still keep the heat in check; the biggest advantage of doing this is silence. The PG199 sits right under my feet, connected through an eGPU to my thin-and-light HP notebook for local LLM inference, and the temperature stays below 70 °C. If you are also doing LLM inference, there is really no need to disassemble this chip, which is in short supply. Combined with the latest LLM technology, it is really great!
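For anyone who wants to check a temperature figure like the one above on their own card, here is a minimal sketch that polls `nvidia-smi`. It assumes the NVIDIA driver is installed and `nvidia-smi` is on the PATH; the 70 °C limit is just the number quoted in this post, and the helper names are made up for illustration.

```python
import subprocess
from typing import List, Optional, Tuple

# Fields requested from nvidia-smi; both are standard --query-gpu properties.
QUERY = "temperature.gpu,power.draw"


def parse_smi_line(line: str) -> Tuple[float, Optional[float]]:
    """Parse one 'temperature, power' CSV row (csv,noheader,nounits format)."""
    temp_field, power_field = (field.strip() for field in line.split(","))
    try:
        power_w: Optional[float] = float(power_field)
    except ValueError:
        # Some boards report "[N/A]" for power draw.
        power_w = None
    return float(temp_field), power_w


def read_gpu_stats() -> List[Tuple[float, Optional[float]]]:
    """Return (temperature in C, power in W or None) for each visible GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_smi_line(row) for row in out.splitlines() if row.strip()]


def gpus_over_limit(stats, limit_c: float = 70.0) -> List[int]:
    """Indices of GPUs at or above the temperature limit."""
    return [i for i, (temp_c, _) in enumerate(stats) if temp_c >= limit_c]

# Example (needs a working driver): print(gpus_over_limit(read_gpu_stats()))
```

Running `read_gpu_stats()` in a loop during a long generation gives a rough picture of how fan settings affect the steady-state temperature.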
 

wheat_field

New Member
Nov 23, 2024
7
2
3
I wanna satiate my curiosity; not a very good reason to be honest :p

I'm also (too) interested in the possibility of cooling this with a single 120mm radiator, like on the Radeon R9 295X2, and I'd wanna give myself as much thermal headroom as possible with so little cooling.
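A rough way to put a number on "thermal headroom" is the steady-state relation T_die = T_ambient + P × R_total. The sketch below is only that arithmetic. The 300 W and 70 °C inputs are figures quoted earlier in this thread, the 25 °C ambient is an assumption, and the function names are made up for illustration.

```python
def required_thermal_resistance(power_w: float, t_max_c: float, t_ambient_c: float) -> float:
    """Largest total die-to-air thermal resistance (C/W) that keeps the die at or below t_max_c."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return (t_max_c - t_ambient_c) / power_w


def steady_state_temp(power_w: float, r_total_c_per_w: float, t_ambient_c: float) -> float:
    """Die temperature (C) for a given load and total thermal resistance."""
    return t_ambient_c + power_w * r_total_c_per_w


# 300 W load, 70 C target, 25 C room (ambient is an assumed value)
r_needed = required_thermal_resistance(300.0, 70.0, 25.0)
print(f"needed R_total <= {r_needed:.2f} C/W")
```

Every interface in the stack (die to TIM, TIM to cold plate, coolant to radiator, radiator to air) adds to R_total, which is where removing the IHS would buy some margin.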
 

wheat_field

New Member
Nov 23, 2024
7
2
3
Managed to get the lid off without any evident damage to the GPU.

Materials: Xacto knife with a flat blade, cheap Amazon heat gun, oven mitt and soldering hand. No thermocouple is needed; the temperature needed to weaken the glue isn't very high.

The GPU is not soldered onto the IHS; I feel like I've seen this information online before but I can't find the source. After delidding I can confirm that this is true.

I took this process pretty slowly. It took me around an hour as I slowly increased the heat I applied to the IHS, but you could probably do this faster. One side of my IHS had a gap that I could just fit the Xacto blade under, so I began on that side. I estimated the temperature of the board by quickly touching the IHS; I think the ideal temperature is about as hot as you can still briefly touch.

I used the flat blade as a lever rather than trying to cut through the glue. It took roughly 45 minutes to see any progress. I went along the IHS side and tried prying gently a few times with the blade to weaken the glue before switching to a new location.

Eventually I could fit a significant portion of the blade under the IHS. Afterwards, I continued widening the gap until the IHS popped off. At this point, there's no need to keep the GPU hot.

Side note: I imagine that doing this can lead to some clearance issues. If you're using the watercooling adapter from eBay, though, you won't have any issues.

Very happy that I didn't break the GPU :)
 


blackcat1402

New Member
Dec 10, 2024
26
6
3
The NVIDIA A100 32G SXM2 PG199+FA2 can support local agentic coding tools like Roo, Cline, and Kilo at acceptable output speeds. Taking Qwen3 Coder 30b as an example, chat performance reaches 80-90 TPS but drops to 40-50 TPS when used with an agent because of the long context window. With contexts exceeding 60K tokens, performance decreases by approximately 5 TPS.

The motherboard, CPU, and RAM also affect agent output speeds. For instance, a newer HP EliteBook 840 laptop with DDR5 memory and a PG199 card can achieve over 50 TPS, while an X99 motherboard with an E5 processor and DDR4 memory only reaches about 40 TPS.

Here are the speeds achieved on different models using the X99 platform (context length > 128K guaranteed by switching the KV cache between FP16 and Q4) with the same prompt, "generate python snakegame with pygame":
No. 1, Single PG199+Roo with Qwen3 Coder 30b: ~40 TPS
No. 2, Dual PG199+Roo with GLM4.5 Air 106b: ~18 TPS
No. 3, Single PG199+Roo with Seed OSS 36b: ~15 TPS
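To see why the KV cache precision matters for fitting a context above 128K, here is a back-of-the-envelope sizing sketch. The layer count, KV head count and head dimension are assumed illustrative values for a GQA model in the Qwen3 30B class (check the model's own config before relying on them), and the Q4 figure ignores the small overhead of quantization scales.

```python
GIB = 2 ** 30


def kv_cache_bytes(tokens: int, layers: int, kv_heads: int, head_dim: int,
                   bytes_per_elem: float) -> int:
    """Approximate KV cache size: keys + values for every layer and token."""
    # The factor 2 accounts for storing both K and V.
    return int(2 * layers * kv_heads * head_dim * tokens * bytes_per_elem)


# Assumed model shape (hypothetical values for illustration only).
ASSUMED_SHAPE = dict(layers=48, kv_heads=4, head_dim=128)
CONTEXT_TOKENS = 128 * 1024

for name, bytes_per_elem in (("FP16", 2.0), ("Q4", 0.5)):
    size = kv_cache_bytes(CONTEXT_TOKENS, bytes_per_elem=bytes_per_elem, **ASSUMED_SHAPE)
    print(f"{name} KV cache at 128K tokens: {size / GIB:.1f} GiB")
# FP16 KV cache at 128K tokens: 12.0 GiB
# Q4 KV cache at 128K tokens: 3.0 GiB
```

Under these assumptions the Q4 cache is a quarter of the FP16 size, which is the memory that gets freed up for weights or longer contexts on a single card.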
 

wheat_field

New Member
Nov 23, 2024
7
2
3
Has anyone managed to get Vulkan working? Installing Windows drivers is pretty easy with NVCleanstall, but it doesn't give support for Vulkan/DX12 when loading Google Cloud's public vGPU drivers using templates from other A100 models.
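One quick way to tell whether the driver exposes the card to Vulkan at all is to look at what the Vulkan loader enumerates. This is a minimal sketch that assumes `vulkaninfo` (from the Vulkan SDK or vulkan-tools) is installed and that its `--summary` output contains `deviceName = ...` lines; the helper names are made up for illustration.

```python
import re
import shutil
import subprocess
from typing import List


def parse_device_names(summary_text: str) -> List[str]:
    """Extract device names from `vulkaninfo --summary` style output."""
    matches = re.findall(r"^\s*deviceName\s*=\s*(.+)$", summary_text, flags=re.MULTILINE)
    return [name.strip() for name in matches]


def vulkan_devices() -> List[str]:
    """Return the device names the Vulkan loader reports, or [] if none."""
    exe = shutil.which("vulkaninfo")
    if exe is None:
        return []  # vulkaninfo is not installed or not on PATH
    result = subprocess.run([exe, "--summary"], capture_output=True, text=True)
    return parse_device_names(result.stdout)

# Example: print(vulkan_devices())
```

If the list is empty or only shows another adapter, the installed driver package is not registering a Vulkan ICD for the card.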