[Solved] Missing instruction sets on SPR 8461v E3 EVQS (on X13SEM/X13DEI vs. X13SEI)


sam55todd

Active Member
May 11, 2023
115
28
28
How about Win10? Does it handle the instructions as expected? (Although we already know AMX doesn't show up in HWiNFO64 on Win10.)

Sorry, I can't run the requested tests/checks on my machine yet. I'm still waiting for MCIO cables, since on the X13DEI the M.2 slots are implemented as PCIe x2, not x4 as on the X13SEI (so I'll likely use those two M-keyed M.2 slots for WiFi via an adapter plus U.FL/RP-SMA cable, like the one below). I'll also have to rebuild the whole case with a riser installation, etc. I'm already constrained on lanes: the GPU takes x16, the sound card needs only x1 but sits in an x4 slot anyway, and I hit the limit on USB ports so I had to add another USB3 adapter in an x4 bifurcated slot (despite needing only x1 for this card; I probably should have used a hub instead), and so on. So much waste.

1696619651972.png

Also, the X13DEI doesn't implement all 80 PCIe lanes per CPU, only 64 on CPU0 and, even worse, 48 on CPU1
(abandoning Supermicro's marketing "commitment" to always implement max base functionality).
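
To put numbers on that, here is a quick tally. It is only a sketch using the lane counts quoted above (80 available per CPU, 64 wired on CPU0, 48 on CPU1); the names are made up for illustration.

```python
# Lane counts as quoted in the post above; names are illustrative only.
LANES_PER_CPU = 80
wired = {"CPU0": 64, "CPU1": 48}


def unused_lanes(wired_by_cpu, lanes_per_cpu=LANES_PER_CPU):
    """Return the number of lanes left unconnected on each CPU."""
    return {cpu: lanes_per_cpu - n for cpu, n in wired_by_cpu.items()}


unused = unused_lanes(wired)
total_available = LANES_PER_CPU * len(wired)
total_unused = sum(unused.values())

print(unused)
print(f"{total_unused} of {total_available} lanes unused "
      f"({100 * total_unused / total_available:.0f}%)")
```

On a single-CPU build only the CPU0 entry matters, of course.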

So while I'm on a single CPU (for which, again, I can't express enough gratitude), for some time I will be creatively focusing on other build priorities given the physical limitations.
 

sam55todd

Sorry, I meant for a specific user/CPU/motherboard combination (BIOS and system config), i.e. the situation Syr is in as per the initial pictures.
So the statement is true, but not for everyone.
 

Syr

Member
Sep 10, 2017
55
20
8
I think sam55todd was referring to what is happening w/ AMX on my system?

  • Win10, no hypervisor running: AMX did show up in HWINFO. The demo code failed.
  • Win10, hypervisor running: AMX didn't show up in HWINFO. The demo code failed.
  • Win11, no hypervisor running: AMX did show up in HWINFO. The demo code almost always worked.
  • Win11, hypervisor running: I haven't yet had a chance to check.
 

Syr

Ok, I'm back home from dinner and got the new chip in the house.
Just did the test on Win11 w/ hypervisor (on the original CPU), but no dice, which is worrying... Fingers crossed that it works with the new chip, because it would be a serious disappointment if these chips can either do AMX or run a hypervisor, but not both at the same time.
Time to go swap the chips now.
 

Syr

Success! In Windows 11 at least.
  • Without a hypervisor, it's basically the same, except that it no longer has the first-run corrupt output (I did not change or recompile the executable, I just used the one I created earlier, so it's not initializing the buffers any differently), and it executes in both msys2 AND cmd without errors. I also have not been able to get it to spit out a blank-run failure, but I don't know whether that was just a fluke or some OS-level bug, because it happened so infrequently.
  • With a hypervisor, it still works in both cmd & msys2! Also, here is what HWINFO reports for it. Note that SGX and TME are now also marked as working when Hyper-V is enabled, which was not the case before for me in either Win10 or Win11:
photo_2023-10-06_16-54-26.jpg

Windows desktop is also much more responsive now - I made the assumption that it was being laggy because windows switchable graphics was trying to render non-3d workloads on the BMC GPU, but it seems that was not the issue - I have no idea why it was lagging before I swapped the CPUs (no, the previous one wasn't thermally throttling, at least not according to any HWINFO sensors), but the lag is thankfully gone now.
I'm happy, these are "good enough" results for me, but I do want to continue doing some more tests. I'm hoping that if any other devs out there are trying to play around w/ AMX on their systems, they can at least find something potentially useful on Google if they encounter issues similar to the ones I have had.

Next up is win 10 (w/ and w/o a hypervisor), then a fresh win 11 installation (to see if the drives falling off the bus issue was due to the previous cpu, or if it was a MS installer or board problem), and lastly I'll put the linux drive back in, throw a fresh install on and run the tests there too.
 

Syr

Update on testing: (Win10)
With the new CPU, I was still unable to get the AMX demo code working on Win10, regardless of hypervisor on/off, but at least it wasn't giving me Illegal Instruction crashes; it was just quietly exiting. It's likely that Win10, like Linux, requires special buffer initialization.
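
For comparison, the Linux-side step can be sketched like this. It assumes the "special initialization" on Linux means the kernel's per-process permission request for AMX tile data (arch_prctl with ARCH_REQ_XCOMP_PERM); whether Win10 needs anything equivalent is not confirmed here. The function name and the injectable `syscall` parameter are made up for illustration.

```python
import ctypes
import platform
import sys

SYS_ARCH_PRCTL = 158          # arch_prctl syscall number on x86-64 Linux
ARCH_REQ_XCOMP_PERM = 0x1023  # arch_prctl option: request permission for an XSTATE feature
XFEATURE_XTILEDATA = 18       # XSTATE feature number of the AMX tile data


def request_amx_permission(syscall=None):
    """Ask the kernel to let this process use AMX tile data.

    Returns True if the request succeeded. `syscall` is a hypothetical
    injection point so the logic can be exercised without touching the kernel.
    """
    if syscall is None:
        if sys.platform != "linux" or platform.machine() != "x86_64":
            return False  # the syscall number above is only valid on x86-64 Linux
        libc = ctypes.CDLL(None, use_errno=True)

        def syscall(*args):
            # Pass every argument as a C long, as syscall(2) expects.
            return libc.syscall(*(ctypes.c_long(a) for a in args))

    return syscall(SYS_ARCH_PRCTL, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA) == 0
```

Calling `request_amx_permission()` with no arguments makes the real request; on a kernel or CPU without AMX support it should simply return False.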

I need to head out for now but hopefully I'll at least be able to do one of either the ubuntu or win11 installation testing tonight.
 

sam55todd

...
sam55todd, are you running any hypervisors? (ex, hyper-v, virtualbox) Sounds like you have set one up already.
...
I normally do enable these features on my primary workstation for various test scenarios (these days Hyper-V; 4 years ago it was more VMware), or at least I used to before moving completely to cloud DBs, so hardly at all these days, unless "Windows Sandbox" counts too (though I've heard Windows virtualizes/containerizes many things in its core anyway).
With the current SPR build I haven't enabled anything: just a clean USB install and drivers, nothing else, then base checks and a stress test for a couple of hours, that's it. My M.2 drive sits on x2 PCIe lanes, so there's no point installing anything properly (and migrating data/drives) until the MCIO / SFF-TA-1016 cables arrive (if they arrive at all: the order status on Amazon has been "not dispatched" for almost a week, and they often cancel automatically after that, so I might have to re-order from someone else with another month of waiting).

Despite my interest, I can't commit to following your detailed test/validation guide this weekend. I'll try, but I have lots of stuff in the pipeline; I wish I had a spare week to dedicate to finally completing the PC build.
 

Syr

Good luck with the build, and it would be interesting to know if you get consistent behavior.

Anyways I just got back a bit earlier than expected, going to reinstall my ubuntu drive (as well as do a fresh install of ubuntu on it) and do that testing next. I'll hopefully have time after to get to the windows 11 install test.
 

Syr

Ok! So the intel code sample & AMX code detection is working just fine in ubuntu too, with and without a hypervisor. (Kernel version of my fresh install is 6.2)

Two issues I did encounter though:
  • I had to disable TME to get the Ubuntu installer to work, otherwise it would throw an error about TME and simply halt. It might be fine to re-enable it after the install, but I have not yet tested that.
  • I got the idea to also try the code sample in a VM (a case I had not really investigated before, because it wasn't even working on the host). Well, it turns out AMX *does not* work inside VMs on most hypervisors. This is even the case for Intel's dev cloud: Is It Possible to Use Intel® Advanced Matrix Extensions... . ESXi 8.0u1 is the only one for which I could find a statement indicating that it supports AMX in VMs.
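
A quick way to compare host and guest on Linux is to look at the CPU flags the kernel reports. This is only a sketch: it assumes the usual /proc/cpuinfo layout and the flag names amx_tile, amx_bf16 and amx_int8, plus the hypervisor flag that the kernel reports when it sees it is running as a guest. The function names are made up for illustration.

```python
AMX_FLAGS = ("amx_tile", "amx_bf16", "amx_int8")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def amx_report(cpuinfo_text):
    """Map each AMX flag, plus 'hypervisor', to whether it is present."""
    flags = parse_cpu_flags(cpuinfo_text)
    report = {name: name in flags for name in AMX_FLAGS}
    report["hypervisor"] = "hypervisor" in flags
    return report


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(amx_report(f.read()))
    except OSError:
        print("/proc/cpuinfo not available on this system")
```

Running it on the host and then inside the VM should show whether the hypervisor passes the AMX flags through to the guest.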

I'll go see if TME works on the actual ubuntu install, and then give fresh win11 installation a try (With all drives enabled) and see if that works fine now.
 

Syr

Ran the final tests I wanted to get in:

TME works when enabling it after installing ubuntu.

Win11 installed just fine w/ all the drives enabled during installation on this new cpu.

--

I guess with that, I can consider my issues resolved. They all either got fixed by the CPU swap or turned out to be other limitations (W10 needing some form of specific buffer initialization for AMX, and AMX not being available inside VMs unless using specific hypervisors).

[Edit]: I've updated the original post w/ the final findings & marked the thread as solved now
 

RolloZ170

Well-Known Member
Apr 24, 2016
5,426
1,641
113
Syr said: "Windows desktop is also much more responsive now ... I have no idea why it was lagging before I swapped the CPUs ... but the lag is thankfully gone now."

This is a known issue with core P/C-states, also reported with the W9-3495X. As long as the cores have some load it's fine, but if they are all parked they take a long time to come back up.
 

sam55todd

..Well, it turns out AMX *does not* work inside of VMs running on most hypervisors it turns out - this is even the case for intel's dev cloud: Is It Possible to Use Intel® Advanced Matrix Extensions... . ESXI 8.0u1 is the only one I could find a statement indicating that it supports AMX in VMs..

..(W10 needing some form of specific buffer initialization for AMX, AMX not being available inside of VMs (unless using specific hypervisors))..
Cloud providers are already using Sapphire Rapids in some of their hosting/compute offerings, e.g. the 8475B is used in Alibaba Cloud servers (such as the g8i series), and AMX (and the QAT/IAA/DSA accelerators) are mentioned in the product descriptions (with the same CPU model). So I assume they managed to make it work in the end, or made the required adjustments for production units. No matter how hard our suppliers try to test and pick the best CPU out of those available in the basket, some things are beyond their control, since these are truly QS items with all the implications (memory channel loss after some time, PCIe lane loss, abrupt instability/errors/BSODs, etc.; one can die within, say, a couple of months and there won't be any warranty from the manufacturer). QS is a gamble after all. Investing unreasonable money (even $1K) in those (e.g. looking at current offerings on eBay) would be quite a mistake, unless you can extract enough personal value out of it (but those would be very specific circumstances).
 

RolloZ170

sam55todd said: "some things are beyond their control since these are truly QS items with all the implications {memory channel loss after some time, PCIe lane loss, abrupt instability/errors/BSODs, etc., ..."

Intel QS parts are identical to production units, except that the ES bit is set to 1.
They have no warranty because they are free gifts from Intel (under NDA) to qualified customers.
 

bayleyw

Active Member
Jan 8, 2014
306
102
43
So in retrospect I think there was an important piece of information missing from the OP's original post:

"Windows desktop is also much more responsive now ... I have no idea why it was lagging before I swapped the CPUs ... but the lag is thankfully gone now."

This is a telltale sign of either bad RAM, a bad IMC, or, occasionally, bad PCIe. I didn't suggest reseating the CPU because you'd have to be unimaginably unlucky for bad RAM to break the same bits of CPUID every time, and the missing ISA flags ended up being an OS-related issue anyway. I had an octal-channel system go unresponsive on Windows the other day (nonsense like "Task Manager takes five minutes to open") and it ended up requiring a memory reseat.

Octal-channel systems are tricky to get stable, even more so when they are built around a first-gen IMC like the one on SPR. This is especially true if you have a large air cooler: server boards are designed to run horizontally with at most a 4U cooler on them, and the torque on the socket from a consumer tower cooler in a vertical chassis is enough to break a lot of stuff. For example, I have a rig that needed a lot of PCIe 3.0 lanes, so I grabbed one of the Chinese 2S 2011-3 boards off Amazon. It worked great until I added the second CPU and tower cooler, at which point it threw MCEs galore and often refused to boot. In the end, it was the coolers, and a combination of careful reseating and supporting the board fixed it.
 

sam55todd

Yeah, the weight of the cooler (along with its height and mass distribution, since they increase the bending moment) does become critical in vertical installations, because it puts extra deforming stress on the PCB and bends it into a wave shape (as per the red line).
1696760871198.png
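
As a rough feel for the numbers, the static bending moment at the socket is about mass x gravity x distance from the board to the cooler's centre of mass. The sketch below just evaluates that formula; the two coolers are hypothetical examples, not measurements from anyone's build, and it ignores how the mounting hardware spreads the load.

```python
G = 9.81  # standard gravity, m/s^2


def bending_moment_nm(mass_kg, lever_arm_m):
    """Static moment (N*m) at the socket from a cooler hanging off a vertical board.

    lever_arm_m is the distance from the board surface to the cooler's
    centre of mass.
    """
    return mass_kg * G * lever_arm_m


# Hypothetical coolers, not measurements from this thread.
low_profile = bending_moment_nm(mass_kg=0.5, lever_arm_m=0.03)
tall_tower = bending_moment_nm(mass_kg=1.2, lever_arm_m=0.08)

print(f"low-profile: {low_profile:.2f} N*m")
print(f"tall tower:  {tall_tower:.2f} N*m")
```

Both mass and height multiply in, which is why a tall, heavy tower cooler is so much worse than a low-profile one.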
 

RolloZ170

Syr said: "Windows desktop is also much more responsive now - I made the assumption that it was being laggy because windows switchable graphics was trying to render non-3d workloads on the BMC GPU ..."

Have you installed the ASPEED driver?