Windows 10 10GbE keeps freezing machine


lyris1

New Member
Jul 7, 2020
Hi all, this problem has been driving me nuts for a few years now.

I've had two 10GbE-enabled NAS units - a Synology and now a QNAP TS-1677x - and the current one has a RAID 6 config with about 90 TB of space.
The problem is that sometimes, when accessing files and folders stored on the NAS, Windows will freeze up and the accessing application will go into the "Not Responding" state. It does eventually respond, but that can take 30 minutes or more.

During this time, other machines can access the QNAP's other 10GbE interface just fine, and on the problem machine, the 1GbE interface still works fine. Pulling out the 10GbE network cable (or disabling the 10GbE NIC in Device Manager) instantly brings the machine back to life (although, of course, whatever file I was processing errors out since I pulled the connection).

I've swapped (okay, upgraded) the NAS, and during that same time, I swapped out all the drives in it. I've swapped the NICs (using both RJ45 and SFP+). I've moved to an entirely new CPU and motherboard. It still happens - the problem has survived just about everything.

I have two machines each connected to the QNAP NAS directly via its 10GbE RJ45 ports. Is this a bad idea? Should I be going through a 10GbE router/switch instead?

Any pointers would be much appreciated.
 

lyris1

New Member
Jul 7, 2020
On the QNAP end, it's using its own built-in 10GbE ports (no separate add-on card). Those use the Aquantia AQC107 controller.
On the PC end, I have a card that uses the Intel X550-T chipset. I think it's a StarTech card.
 

Bjorn Smith

Well-Known Member
Sep 3, 2019
r00t.dk
Check the event logs to see if anything fishy is happening - I am guessing it's something that is blocking I/O on the Windows machine.

And also check that you have the latest Intel drivers - I recently had issues in Windows with unstable connections, and it turned out that updating the driver to the one from Intel - i.e. not using Microsoft's version - fixed my issue.
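
If you want a quick way to pull the recent errors, something like this rough sketch works - it just shells out to the built-in wevtutil tool. Run it right after a freeze and look for entries from the NIC driver or the storage stack:

```python
import subprocess

# Rough sketch: dump the 20 most recent Critical/Error/Warning entries
# from the Windows System log using the built-in wevtutil tool.
# Event levels: 1 = Critical, 2 = Error, 3 = Warning.
QUERY = "*[System[(Level=1 or Level=2 or Level=3)]]"

result = subprocess.run(
    ["wevtutil", "qe", "System", "/c:20", "/rd:true", "/f:text", f"/q:{QUERY}"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)
```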

I have two machines each connected to the QNAP NAS directly via its 10GbE RJ45 ports. Is this a bad idea? Should I be going through a 10GbE router/switch instead?
It can cause issues if the two machines are also connected via a switch - but I think you should be able to prevent any issues if you turn off routing between network cards on the QNAP machine. If that is not possible, then a switch might be the correct solution. I have also seen issues on my own LAN when I accidentally created a "loop", causing traffic to get stuck in the switches, just going round and round.
 

lyris1

New Member
Jul 7, 2020
Thanks Bjorn - what is "routing between network cards" on the QNAP machine? I have a Virtual Adapter and a Virtual Switch on the QNAP side; these allow the virtual machines running on the NAS to access the internet.

I am using the latest Intel drivers.
 

Bjorn Smith

Well-Known Member
Sep 3, 2019
r00t.dk
Try making a drawing of your network topology - I know that you have two machines connected to the QNAP directly, and my guess is that this is somehow the culprit, but please upload a drawing that includes your internet connection.

And how many network cards does the QNAP machine have?

You wrote that you have two machines connected to the QNAP directly, but also virtual machines running on the QNAP - I don't see that working unless you have three or more network cards in the QNAP machine.
 

mervincm

Active Member
Jun 18, 2014
Have you tried a fan on your PC's 10GbE NIC? I know mine was flaky till I put a little 40mm fan on its heatsink. It was only about $10, and plugged into a spare fan header on the motherboard.
 
Reactions (Like): lyris1 and Dreece

Dreece

Active Member
Jan 22, 2019
The list of things to try, to check, is huge...
If I were in your shoes I'd just hit eBay and order in a Solarflare or Mellanox 10G card and take it from there... if it all suddenly works, I'd just throw the Intel into a box and write on the outside, using a black marker, "to be investigated...", and on the other side of the box I'd write "...never"... lol
 

madbrain

Active Member
Jan 5, 2019
FYI, there are new firmware and Windows drivers for Aquantia chipsets to be found at

You need a Windows box to flash the firmware. The combination of the latest drivers and firmware seems to be giving very good improvements. Make sure to disable interrupt moderation in the Windows drivers to get the best performance, and enable jumbo frames if your switch supports them.

I'm getting about 9 Gb/s in either direction in iperf between two Windows boxes running AQN-107, with a single stream. One of the two boxes has a very old FX-8120 CPU (2011 vintage). These are going through a Trendnet 7080-ES switch. I have never gotten the full 10 Gb/s with a single TCP stream with Aquantia before; previously I was always between 3-5 Gb/s. Now approaching full line rate, finally... Well, I was using two CAT5E cables + a coupler on one machine, so that didn't help things... Switched to CAT6 yesterday. Have some CAT8 on order for all 4 boxes and 2 switches running 10 gig...
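
If you do enable jumbo frames, it's worth verifying that the 9000-byte MTU actually works end to end. A quick sketch - the NAS address is a placeholder, and 8972 is the largest ICMP payload that fits in a 9000-byte MTU once you subtract 28 bytes of IP/ICMP headers:

```python
import subprocess

# End-to-end jumbo frame check on Windows: -f sets Don't Fragment,
# -l sets the ICMP payload size, -n the number of echo requests.
# 9000-byte MTU - 20 (IP header) - 8 (ICMP header) = 8972-byte payload.
NAS_IP = "192.168.1.10"  # placeholder - substitute your NAS address

result = subprocess.run(
    ["ping", "-f", "-l", "8972", "-n", "2", NAS_IP],
    capture_output=True, text=True,
)
print(result.stdout)
# "Packet needs to be fragmented but DF set" means jumbo frames are not
# enabled end to end (NIC, switch, and NAS must all be set to MTU 9000).
```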

There are still some bugs in the Aquantia Windows drivers, though. One of them is that the Windows box responds to ping when using WOL. It shouldn't.
Linux drivers don't have that bug. I have reported the bug to Aquantia ...

I have just finished the flash and driver updates on 3 Aquantia cards, about to do the 4th ...
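
For reference, the single-stream numbers above come straight from iperf3. A minimal wrapper that runs the test and parses the JSON output - the server address is a placeholder, and you need "iperf3 -s" running on the other box first:

```python
import json
import subprocess

# Single TCP stream for 10 seconds; -J makes iperf3 emit JSON.
SERVER = "192.168.1.20"  # placeholder - the box running "iperf3 -s"

result = subprocess.run(
    ["iperf3", "-c", SERVER, "-t", "10", "-P", "1", "-J"],
    capture_output=True, text=True, check=True,
)
data = json.loads(result.stdout)
bps = data["end"]["sum_received"]["bits_per_second"]
print(f"Throughput: {bps / 1e9:.2f} Gb/s")
```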
 

lyris1

New Member
Jul 7, 2020
Thanks for all the replies, everyone.

However:
Have you tried a fan on your PC's 10GbE NIC? I know mine was flaky till I put a little 40mm fan on its heatsink. It was only about $10, and plugged into a spare fan header on the motherboard.
As soon as I read this, I had a "why on earth did I not think of that" moment - I know from swapping the NICs around that they're extremely hot even a few minutes after powering the machine off. In fact, I ran this test just now, and the heatsink on the PC's 10GbE NIC was so hot that it was painful to touch after a few seconds.

If this is the culprit, it would also explain why the problem is so elusive and hard to reproduce when I'm showing tech support people, and why the hangs come in clusters. So, a 40mm Noctua fan is on the way from Amazon! Fingers crossed.
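
In the meantime, to actually catch the hangs when they happen, I'm thinking of a little watchdog that reads from the NAS on a timer and logs how long each read takes - a rough sketch, with the UNC path as a placeholder:

```python
import time

# Rough watchdog sketch: read 1 MiB from a file on the NAS every 10
# seconds and log the elapsed time. A hang shows up as a huge spike,
# which can then be correlated with load, time of day, or NIC heat.
TEST_FILE = r"\\nas\share\testfile.bin"  # placeholder - any file on the NAS

while True:
    start = time.monotonic()
    try:
        with open(TEST_FILE, "rb") as f:
            f.read(1024 * 1024)
        status = "ok"
    except OSError as exc:
        status = f"error: {exc}"
    elapsed = time.monotonic() - start
    print(f"{time.strftime('%Y-%m-%d %H:%M:%S')}  {elapsed:7.2f}s  {status}")
    time.sleep(10)
```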
 

mervincm

Active Member
Jun 18, 2014
It comes from using cards designed for high-airflow servers in low-airflow systems. I have even had to mount a 120mm fan blowing down on my 10 gig Intel NIC and my LSI HBA. Using a DAC cable or SFP+ fibre transceivers will generate less heat than any Cat 5/6 option. Using an old/hot chipset with standard Cat 6 cabling is really a worst-case scenario for heat generation on a NIC. In any case, your 40mm fan, with reasonable case airflow, will take care of any overheating issues.
 
Reactions (Like): lyris1

acquacow

Well-Known Member
Feb 15, 2017
I've got X550s in my FreeNAS box and an Asus XG-C100C (Aquantia AQC107, I believe) 10 gig NIC in my Windows 10 box.

It's the opposite of what you have OS-wise, but the two play nicely together.

I transfer several TB each way at 800-900 MB/s sustained on a regular basis.
 
Reactions (Like): mervincm