SR-IOV on AMD Instinct MI25


iGene

Member
Jun 15, 2014
74
23
8
Taiwan
Hi all,

I recently got an AMD Instinct MI25 and am trying to get SR-IOV working on it. It is one of the cards listed on AMD's website as supporting MxGPU (SR-IOV).

However, AMD seems to have stopped updating their open source GIM driver for SR-IOV, and the only card it supports is the very old S7150. I tried modifying the driver, but there are too many hardware changes between the two cards and I couldn't get vBIOS patching to work on the MI25.

Has anyone successfully used an MI25 with SR-IOV? Or is the feature only available to large cloud companies?
 

CyklonDX

Well-Known Member
Nov 8, 2022
1,177
404
83
Your best bet for making this work is running VMware, as there are VMware drivers for ESXi 7.x.
(There are potentially newer ones for other cards too. Keep in mind that while it's the MI250 page, the drivers are for all MI-series cards.) https://www.amd.com/en/support/serv...nct/amd-instinct-mi-series/amd-instinct-mi250
(*Here are the ones for the guest.)

Beyond VMware, you are out of luck. (Hyper-V may potentially allow you to do it.)
 

iGene

Thanks for the reply. I didn't know the MI250 driver might work.

I also tried Hyper-V at first, as I heard that Azure uses Hyper-V + GPU-PV for their MI25 cards, but I couldn't find a driver for that either.
Let me see whether VMware works.

Personally I would prefer to use KVM, but it seems there is no luck going down that path.
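
For anyone going the KVM route, a first sanity check on a Linux host is whether the kernel sees the SR-IOV capability on the card at all. Below is a minimal Python sketch, assuming a Linux host with the standard PCI sysfs layout (`sriov_totalvfs` and `sriov_numvfs` under `/sys/bus/pci/devices/<address>/`). The function name and the `root` parameter are made up for illustration. It only reports what the physical function advertises and says nothing about whether a host driver can actually bring the VFs up.

```python
from pathlib import Path


def list_sriov_devices(root="/sys/bus/pci/devices"):
    """Return (address, vendor, device, total_vfs, enabled_vfs) for each
    PCI device under `root` that exposes the SR-IOV sysfs attributes."""
    found = []
    base = Path(root)
    if not base.is_dir():
        return found  # not a Linux host, or sysfs is not mounted

    for dev in sorted(base.iterdir()):
        # Only SR-IOV capable physical functions have this attribute.
        if not (dev / "sriov_totalvfs").is_file():
            continue

        def read(name):
            return (dev / name).read_text().strip()

        found.append((
            dev.name,                     # PCI address, e.g. 0000:03:00.0
            read("vendor"),               # hex string such as 0x1002
            read("device"),               # hex string
            int(read("sriov_totalvfs")),  # maximum VFs the PF advertises
            int(read("sriov_numvfs")),    # VFs currently enabled
        ))
    return found
```

Calling `list_sriov_devices()` with no arguments scans the real sysfs tree. The `root` parameter exists only so the function can be pointed at a test directory.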
 

iGene

The Instinct MI250 drivers don't work; the VMware driver map only lists the PCI device IDs for the MI200 series.
The VMware VIB for the V340 does include the PCI device ID for the Instinct MI25, and the VF did show up. However, driver initialization on the host fails (code 43 in Windows, and the logs in the screenshot for Linux). The V340 guest driver couldn't be installed because the PCI device ID doesn't match.

Probably no luck with this card o_O
Attachments: 1669640050063.png, 1669640067061.png
 

CyklonDX

Did you try those guest drivers? Have you tried the normal Pro drivers for Vega 10? (You could also try the automated detection tool, as some drivers do not appear in the driver dropdown on the AMD site.)




Note that the GPU IDs match for the following GPUs:

Your guest should match one of:
686C, 00, Radeon Instinct MI25 MxGPU
686C, 01, Radeon Instinct MI25 MxGPU
686C, 02, Radeon Instinct MI25 MxGPU
686C, 03, Radeon Pro V340 MxGPU
686C, 04, Radeon Instinct MI25x2 MxGPU
686C, 05, Radeon Pro V340L MxGPU
686C, 06, Radeon Instinct MI25 MxGPU
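
The list above appears to be keyed on PCI device ID plus revision, which gives a quick way to see what a guest is actually being presented with. Below is a minimal Python sketch, assuming the second column is the PCI revision ID, that the input is a line of `lspci -nn` output from pciutils (`[vendor:device]`, optionally followed by `(rev xx)`), and that AMD's PCI vendor ID is 1002. The table is copied from the list above. The function name is made up for illustration.

```python
import re

AMD_VENDOR_ID = "1002"

# (device ID, revision) -> name, copied from the list above.
MXGPU_NAMES = {
    ("686c", "00"): "Radeon Instinct MI25 MxGPU",
    ("686c", "01"): "Radeon Instinct MI25 MxGPU",
    ("686c", "02"): "Radeon Instinct MI25 MxGPU",
    ("686c", "03"): "Radeon Pro V340 MxGPU",
    ("686c", "04"): "Radeon Instinct MI25x2 MxGPU",
    ("686c", "05"): "Radeon Pro V340L MxGPU",
    ("686c", "06"): "Radeon Instinct MI25 MxGPU",
}

# Matches "[1002:686c]" optionally followed by " (rev 03)".
_ID_RE = re.compile(
    r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\](?:\s+\(rev ([0-9a-fA-F]{2})\))?"
)


def identify_mxgpu(lspci_line):
    """Return the MxGPU name for one `lspci -nn` line, or None if unknown."""
    match = _ID_RE.search(lspci_line)
    if match is None:
        return None
    vendor = match.group(1).lower()
    device = match.group(2).lower()
    # lspci leaves out the "(rev ..)" suffix when the revision is 00.
    revision = (match.group(3) or "00").lower()
    if vendor != AMD_VENDOR_ID:
        return None
    return MXGPU_NAMES.get((device, revision))
```

Feeding it each line of `lspci -nn` from inside the guest shows which entry, if any, the VF matches. A result of `None` for the GPU would point at an ID mismatch rather than a driver bug.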


(All in all, these should work: https://www.amd.com/en/support/kb/release-notes/rn-pro-win-18-q4-v340)

Potentially try these too (this page lists drivers for VMware guests and an MxGPU Setup Script; I'm not sure what that is).
 

iGene

I've tried the drivers from Microsoft Azure (they have instances based on the MI25) and got code 43. From the logs on Ubuntu, I believe it's not a driver issue but a PF/VF issue on the host.

Have you managed to make it work before?
 

CyklonDX

At my work we had a trial MI50 box for testing (it ran VMware ESXi 6.7). I never had the pleasure of setting it up, but I have seen it working.
I was under the impression you were trying to get it running under VMware; there are no MxGPU guest drivers for non-VMware use.
Potentially, if you fake a Hyper-V/Azure environment somehow, you may get it working using the MxGPU drivers. (Depending on how deep your Azure access goes, you may be able to grab their drivers.)

You could also try faking the PCIe IDs. (Maybe that would allow it to work.)
 

iGene

I was previously trying on ESXi 7.0u3; I'm probably going to try 6.7 in the future, as that's the version the driver is for.
 

CyklonDX


Try reaching out to them; they potentially have very rare drivers. It's 90% likely their drivers will also work for you (if you wanted to do Hyper-V).
 

iGene


I watched the full video; it seems the dev node shipped with Debian. The seller probably reinstalled the workstation with Windows, so the driver isn't available. I therefore don't think they have the drivers.

I did some searching on Google and found out that the Stadia node kernel is actually open source, but the amd-cloudgpu module is externally referenced. I'm probably going to give up on this card if ESXi 6.7 with the V340 VIB driver doesn't work.