I've spent a lot of time over the last two to three weeks troubleshooting this, and this will be my last attempt. I've already tried several things, but nothing worked (in fact, it only got worse). If you've got a reliable 25G or faster network connection working on Windows 11, please let me know how you did it.
I appreciate everyone reading this and giving any input that might point me in the right direction. This could get lengthy:
I've been running a 10G network since 2011 or so, and it always worked well with W7 and W10. I decided it was time to upgrade, so the first thing I did was get a Mikrotik CRS510-8XS-2XQ-IN, which served as a drop-in replacement for my existing 10G switch, and that worked well. It had been running for a couple of months when I decided it was time to upgrade my desktop and, with that, also move to 25G or 40G networking.
Unlike in years prior, I opted for my first consumer hardware system in over a decade, because while getting a new Threadripper 7000 system was tempting, it would also be expensive and fairly wasteful. I chose an Asus TUF X670E-Plus and a Ryzen 9 7950X3D instead, knowing that with the GPU in place, the number of remaining PCIe lanes would be (very) limited.
I wanted to run Linux on it, but reluctantly went for Windows 11 Pro instead because it made things easier at that moment (at least at first...). The slot for the NIC is a PCIe 4.0 x4 slot connected to one of the PROM21 chips of the X670. I still don't know which one because Asus refused to give me a block diagram and I couldn't be bothered to go through the PCIe topology myself yet.
The first NIC I tried was a Mellanox ConnectX-3 40G, connected to the switch via a DAC. It's an HPE 10/40 card flashed with an original 40/56 firmware and set to Ethernet mode. Because this is an older generation card, I knew it would not be able to reach 40G in a x4 slot, since PCIe 3.0 x4 tops out well short of 40 Gbit/s, but that's OK. It didn't even get close, managing to transmit barely 8 Gbit/s and receive no more than 4 Gbit/s. I used iperf3, which I understand is a bit wonky on Windows, but real-world speeds matched its results.
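For reference, the ceiling the x4 slot imposes can be worked out from the PCIe 3.0 signalling rate. This is just a back-of-the-envelope sanity check (it ignores TLP/DLLP protocol overhead, which shaves off a few more Gbit/s in practice):

```python
# Rough bandwidth ceiling of a PCIe 3.0 x4 link
# (assumption: the ConnectX-3 negotiates Gen3 in this slot)
lanes = 4
raw_gbps = 8 * lanes               # 8 GT/s per lane -> 32 Gbit/s of symbols
usable = raw_gbps * 128 / 130      # 128b/130b encoding overhead
print(f"{usable:.1f} Gbit/s")      # before TLP/DLLP packet overhead
```

So even in the best case, the x4 slot caps the card around 31.5 Gbit/s, which makes the observed 8/4 Gbit/s all the more disappointing.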
Even though I thought it was unlikely to be the problem, I also tried a direct connection, without the switch, to the TrueNAS on the other end, which was equipped with an identical ConnectX-3, and I also tried the NIC in the x16 slot of the desktop, but neither gave me any improvement. For a very short time, I managed to reach ~15 Gbit/s, which I have never been able to reproduce on this machine since. I learned that the driver for these cards that ships with Windows 11 (I was surprised one even ships with it) is supposedly not the best, so I installed the latest WinOF for Windows clients as well as the one for Windows servers, and neither changed anything. I also learned that the Mellanox cards aren't necessarily the first choice for TrueNAS (Core, BSD-based) because of the BSD driver, but testing against the Linux-based TrueNAS Scale gave the same results.
I also went through the driver settings on the Windows 11 desktop. Interrupt Moderation is still on by default, and I have no idea why (it already severely limited my network throughput on my first 10G connections way back in the day), so I turned it off. I also increased the Rx and Tx buffers. No difference.
At this point, I was stuck. The performance was worse than with the previous 10G network, and I couldn't really narrow down where the problem was. However, I have another TrueNAS Core installed on a different machine with a 10G Intel NIC. It and the other TrueNAS with the ConnectX-3 were able to transfer data between them at 10G, while my Windows desktop wasn't even getting close to 10G to either of these machines. I started to suspect that there was nothing wrong with either of the TrueNAS servers, their drivers, or their NICs.
The next thing I tried was an Intel XL710-QDA1 in my desktop, which made things worse. Instead of 8/4 Gbit/s, it was now more like 6-7 Gbit/s transmit and 2-3 Gbit/s receive. I swapped the DAC for optical transceivers, and that changed nothing.
Then I set up another Windows 11 Pro machine to test with, on an Asus WS C621E with two CPUs, so plenty of PCIe lanes there, and I gave it one of the ConnectX-3 cards, a ConnectX-4 Lx (HPE 640SFP28 flashed with the latest original Mellanox firmware), and an Intel XL710-QDA1. Nothing; they all performed equally badly, on the same level as the NICs in my desktop. For the 40G cards, I again tried both DAC and optical transceivers, and again it made no difference. For the 25G ConnectX-4 Lx, I tried an AOC and optical transceivers, also with no difference. At this point, I ran all these tests against the TrueNAS Core that previously had the 10G Intel NIC, now equipped with one of the Intel XL710-QDA1 cards. Since I had nothing to lose, I also tried Windows 11 Pro for Workstations and Windows Server 2022 on that test machine; both performed equally badly.
My desktop then got an Intel E810-XXVDA2 25G NIC, the first PCIe 4.0 NIC of the ones mentioned, but that also made no difference.
At that point, I was out of hardware to test. I booted Linux (Mint 21) from USB on both my desktop and the test machine. The test machine got 23.4 Gbit/s using the ConnectX-4 Lx 25G, and my desktop got 23.5 Gbit/s using the E810-XXVDA2, both against the TrueNAS Core with the Intel XL710-QDA1. I didn't test real-world performance, but I'm fairly confident that 20+ Gbit/s would be possible.
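Those Linux numbers are essentially line rate. With a standard 1500-byte MTU, the theoretical TCP goodput ceiling on a 25G link works out to roughly 23.7 Gbit/s (a back-of-the-envelope calculation, assuming no jumbo frames and no TCP options beyond the base header):

```python
# TCP goodput ceiling on 25GbE with a standard 1500-byte MTU
# (assumption: no jumbo frames, base IP/TCP headers only)
wire_frame = 1500 + 14 + 4 + 8 + 12   # payload + Ethernet hdr + FCS + preamble + IFG
tcp_payload = 1500 - 20 - 20          # minus IP and TCP headers
goodput = 25e9 * tcp_payload / wire_frame
print(f"{goodput / 1e9:.2f} Gbit/s")  # ~23.73 Gbit/s
```

So 23.4-23.5 Gbit/s is about as good as a 25G link gets, which rules out the NICs, cables, switch, and servers entirely.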
I'm now also confident in saying that Windows 11 is the problem here. Which brings me back to the start of this post: what does it take to make Windows 11 do 25G networking? I've seen the STH article from January 2023 about getting E810 100G NICs to work on Windows 11, which the driver doesn't officially support. After installing it unofficially, is 100G even possible with Windows 11?
I'm open to almost any suggestion. I'm close to building a second desktop to run Windows on, so that I can run Linux on this one. The Windows desktop could then also get the GPU, which would give me fast networking on Linux and a free x16 slot for some NVMe storage. Not the worst idea ever.
Also, don't be gentle. If I'm the fool who went through all this and completely missed the obvious, let me know.