There have been a few posts here recently from people seeing slower-than-expected network performance when trying to get 10Gb Ethernet up and running. I may have stumbled across a possible cause of (and solution to) some of these issues.
My main Windows 10 LTSC box (and my gaming rig) is an AMD 5900X machine with a run-of-the-mill Gigabyte B550 motherboard. It has three x16-size PCIe slots: the top one holds my Nvidia 3070, and the bottom one my Mellanox ConnectX-4 NIC. In this system the top slot gets 16 lanes, the middle slot 4 lanes, and the bottom slot only 2 lanes. However, the middle slot shares PCIe lanes with one of my M.2 slots (which is occupied), and it's also right next to the GPU, so I don't use it.
My switch is a cheap but effective Horaco 8-port SFP+ switch, using a Realtek RTL9303 chipset.
With the Mellanox I get 10Gb to my Linux file server all day, no problems at all. iperf3 (even though it's not 100% reliable on Windows) shows consistent 9.5 Gb/s, and an SMB copy to my file server holds 1.00 to 1.05 GB/s, even with large 50GB+ ISO files.
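Quick sanity check on those numbers, since gigabits and gigabytes differ by a factor of 8, and mixing them up is a classic way to think a link is broken when it isn't:

```shell
# 9.5 Gb/s from iperf3, divided by 8 bits per byte, gives the
# theoretical ceiling in GB/s for a file copy over that link:
awk 'BEGIN { printf "%.2f\n", 9.5 / 8 }'
```

That works out to about 1.19 GB/s, so holding 1.00 to 1.05 GB/s over SMB (which adds its own protocol overhead) means the link is running about as fast as it realistically can.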
I decided to swap the Mellanox for an Intel X710-DA2 I had sitting around collecting dust, for the simple reason of hoping to cut my power consumption by a few watts (the X710s are known to be pretty low-power cards). As soon as I rebooted with the new card, installed the latest drivers, and got it working, I re-ran iperf3 and my 50GB ISO copy test. Performance dropped considerably, down to about 6.5 Gb/s in both tests. Knowing my X710 is a known-good card (I've had no problem getting 10 Gb/s out of it in other machines), I figured something else had to be going on.
I booted my gaming box into Parted Magic (a USB-based Linux distro) and ran iperf3 again. Same results as Windows. So that rules out a driver, iperf-on-Windows wonkiness, or an OS issue.
Next I grabbed an Intel X520, plugged it into my gaming box, and booted into Parted Magic. The X520 gave exactly the same performance as the X710, which definitely rules out the X710 being defective.
Now I was beginning to think it was an issue with my gaming box. So, on to the test box.
This is an i7-9700K machine with a Gigabyte Z370-based motherboard and two PCIe slots (one providing 16 lanes, the other 4). There's no GPU in this machine; I just use the Intel integrated graphics. I plugged the X710 into the top slot, booted into Win 10 LTSC, and repeated the same tests. This time I got the same results as the Mellanox in my gaming box: consistent 10Gb transfer speeds. For reasons unknown, the X710 will NOT work in the bottom slot of my test box, but the X520 does. I tested the X520 in the top slot as a baseline and got 10Gb no problem; moving it to the bottom slot and "restricting" it to 4 lanes still gave me 10Gb.
My guess is that the Mellanox ConnectX-4 cards don't mind getting only 2 PCIe lanes when running at 10Gb, while the X710 gets "grumpy" if it's not getting 4 lanes, even though 2 lanes of PCIe 3.0 should be more than fast enough for 10Gb. (Worth noting: the X520 is a PCIe 2.0 card, so two gen-2 lanes top out around 8 Gb/s of usable bandwidth, which alone could explain its numbers in that slot; the X710 is a gen-3 card, so it has no such excuse.) My test box can give the X710 or X520 all the lanes they want, and I suspect that's why both cards deliver the expected performance in that machine.
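Back-of-the-envelope math on the lanes, using the usual usable-bandwidth figures (roughly 985 MB/s per PCIe 3.0 lane after 128b/130b encoding, roughly 500 MB/s per PCIe 2.0 lane after 8b/10b):

```shell
# usable GB/s per lane * lane count * 8 bits/byte = usable Gb/s
awk 'BEGIN { printf "PCIe 3.0 x2: %.1f Gb/s\n", 0.985 * 2 * 8 }'
awk 'BEGIN { printf "PCIe 2.0 x2: %.1f Gb/s\n", 0.500 * 2 * 8 }'
```

So a gen-3 x2 link has roughly 15.8 Gb/s to play with, comfortably above 10Gb, while a gen-2 x2 link is capped around 8 Gb/s before any overhead. On Linux you can see what a slot actually negotiated with `sudo lspci -vv` and checking the NIC's LnkSta line for the trained speed and width.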
Moral of the story: these Intel NICs appear to have a design quirk/bug where they don't perform well unless they're getting at least 4 lanes from the PCIe bus.