SXM2 over PCIe

CyklonDX

Member
Nov 8, 2022
Hi,
With many V100 SXM2 cards popping up on eBay for 400-700 USD, I believe it's time to share a potential solution for making those SXM2 cards work over PCIe in any suitable workstation/server. Supermicro's initial approach for SXM was a stand-alone board, the AOM-SXMV.


(Do note there are multiple models; it's best to get the simplest one, as it will reduce the number of OCuLink connections required.)

This board has four x8 OCuLink connectors (you only need two of them plugged in, i.e. 2 x8). In Supermicro systems these connect to a normal PCIe riser (RSC-GN2-A68) via its OCuLink connector, plus a single PCIe x8 port that is wired directly to the CPU's PCIe lanes (like any dedicated PCIe slot).



Unlike many other systems, the Supermicro motherboard itself has no NVLink chip or anything else special that the AOM-SXMV needs in order to work. The AOM-SXMV has no manufacturer/mobo lock either, so it can be connected to any system. (The rest of the connectors in the picture are standard 8-pin power cables.)

You would be looking for 2 of those (or a single 16i if you can find it.)

(Do note that you don't really need it to be PCIe 4.0; the link will work just fine on PCIe 3.0 x8. The only downside is lower throughput between your system and the SXM GPUs.)
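
To put a rough number on that throughput difference, here is a minimal sketch of the arithmetic. The per-lane rates (8 GT/s for PCIe 3.0, 16 GT/s for PCIe 4.0) and the 128b/130b encoding come from the PCIe specs; the function name `link_gb_per_s` is just made up for this example, and the result is a theoretical one-direction ceiling before protocol overhead, not a measurement from this board.

```python
# Theoretical one-direction PCIe bandwidth, before protocol (TLP/DLLP) overhead.
# Per-lane signalling rates in GT/s for each generation.
GT_PER_S = {"3.0": 8.0, "4.0": 16.0}

# PCIe 3.0 and 4.0 both use 128b/130b line encoding.
ENCODING = 128 / 130


def link_gb_per_s(gen: str, lanes: int) -> float:
    """Return the theoretical bandwidth in GB/s for a link of the given width."""
    # GT/s * encoding efficiency = usable Gbit/s per lane; divide by 8 for bytes.
    return GT_PER_S[gen] * ENCODING * lanes / 8


for gen in ("3.0", "4.0"):
    print(f"PCIe {gen} x8: {link_gb_per_s(gen, 8):.2f} GB/s")
# prints:
# PCIe 3.0 x8: 7.88 GB/s
# PCIe 4.0 x8: 15.75 GB/s
```

So a 3.0 x8 link tops out at roughly half of what a 4.0 x8 link could carry, which only matters for host-to-GPU transfers.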

The last thing would be cooling. I would recommend building a small 1U cage for the AOM-SXMV PCB, with 2x 40mm 16k-24k RPM fans for each GPU.

Regards
 

bayleyw

Active Member
Jan 8, 2014
Does this...actually work?! The OCuLink ports on the AOM-SXMV are indeed routed to the PLX switches, but the other end goes into the bottom slot of the riser and is designed to interface with an InfiniBand card plugged into that slot. "Officially" each of the card edge connectors is a PCIe x16 routed to the pair of PLX on the carrier board that handles the GPUs.

The AOM-SXMV is actually a simple beast: each pair of GPUs is connected by x16 to a PLX, which connects to the CPUs over a single x16. The NVLinks are just passive traces on the carrier that connect the GPUs to each other; there are no routers.
 

CyklonDX

I haven't mounted the AOM-SXMV board in a PC, or in anything other than a Supermicro v4 server, myself, but this is a brief summary of a few years spent troubleshooting SYS-1029GQ-TVRT systems at work (the support package expired...).

We ran multiple tests and were able to use just 2 OCuLinks (CPU1), and to connect the single PCIe link not through the port by the CPU but through one on the CPU1 riser. We were also able to attach the SXMV board to another Supermicro server of the same class that only had PCIe slots for PCIe GPUs (just to test whether the board had failed or our port by the CPU had failed; it worked). Thus my conclusion is that it's just a standard PCIe 3.0 x8 link.
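
If anyone tries this on Linux, one way to check what the link actually negotiated is to look at the `LnkSta` line in `lspci -vv` output. Below is a minimal sketch that pulls speed and width out of such text. The `SAMPLE` string is hand-written in the usual `lspci` format purely for illustration; it is not output captured from an AOM-SXMV, and `parse_link_status` is a made-up helper name.

```python
import re

# Matches e.g. "LnkSta: Speed 8GT/s (ok), Width x8 (downgraded)" and the
# older format without the parenthesised status after the speed.
LNKSTA = re.compile(
    r"LnkSta:\s*Speed\s+([\d.]+)GT/s(?:\s*\([^)]*\))?,\s*Width\s+x(\d+)"
)


def parse_link_status(lspci_text: str) -> list[tuple[float, int]]:
    """Return (speed in GT/s, lane width) for every LnkSta line found."""
    return [(float(speed), int(width)) for speed, width in LNKSTA.findall(lspci_text)]


# Illustrative sample only, NOT real output from this hardware.
SAMPLE = """\
        LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM not supported
        LnkSta: Speed 8GT/s (ok), Width x8 (downgraded)
"""

print(parse_link_status(SAMPLE))
# prints: [(8.0, 8)]
```

A result of 8 GT/s at x8 would match the "standard PCIe 3.0 x8 link" conclusion; in practice you would feed the function the text from `sudo lspci -vv` for the relevant device.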

[Attached image: 1668329051985.png]
(Our ports by the CPU on those systems kept dying, which forced us to get creative: we used a long riser cable instead.)

I cannot 100% guarantee it will work for everyone (especially since power delivery itself will be a killer/dead end for many).
If my company ever decommissions one of those boxes, I'll try it, but at the moment I'm a bit short on $$$ to buy the parts and do it myself.
(So I'm throwing the ball to other people.)