Fix Intel I219-V Detected Hardware Unit Hang


René!

When you're using VLANs on your Intel I219-V, you might run into the following issue:
Code:
May 22 20:31:21 x759100 kernel: [7294020.176745] e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
May 22 20:31:21 x759100 kernel: [7294020.176745]   TDH                  <e8>
May 22 20:31:21 x759100 kernel: [7294020.176745]   TDT                  <f>
May 22 20:31:21 x759100 kernel: [7294020.176745]   next_to_use          <f>
May 22 20:31:21 x759100 kernel: [7294020.176745]   next_to_clean        <e7>
May 22 20:31:21 x759100 kernel: [7294020.176745] buffer_info[next_to_clean]:
May 22 20:31:21 x759100 kernel: [7294020.176745]   time_stamp           <16caf8cc5>
May 22 20:31:21 x759100 kernel: [7294020.176745]   next_to_watch        <e8>
May 22 20:31:21 x759100 kernel: [7294020.176745]   jiffies              <16caf91c8>
May 22 20:31:21 x759100 kernel: [7294020.176745]   next_to_watch.status <0>
May 22 20:31:21 x759100 kernel: [7294020.176745] MAC Status             <40080083>
May 22 20:31:21 x759100 kernel: [7294020.176745] PHY Status             <796d>
May 22 20:31:21 x759100 kernel: [7294020.176745] PHY 1000BASE-T Status  <3800>
May 22 20:31:21 x759100 kernel: [7294020.176745] PHY Extended Status    <3000>
May 22 20:31:21 x759100 kernel: [7294020.176745] PCI Status             <10>
May 22 20:31:21 x759100 kernel: [7294020.400450] e1000e 0000:00:1f.6 eno1: Reset adapter unexpectedly
May 22 20:31:22 x759100 kernel: [7294020.491058] vmbr0: port 1(eno1) entered disabled state
This issue resets networking and might also cause kernel panics. I didn't dig too deep into it, but it seems that with VLANs and lots of bridges the hardware device exhausts its memory and crashes.

To fix this issue I've applied the following change to `/etc/network/interfaces`:
Code:
    post-up /sbin/ethtool -K $IFACE tso off gso off
This disables TCP Segmentation Offload (TSO) and Generic Segmentation Offload (GSO), so there will probably be some performance hit on the CPU. However, after this change I'm no longer able to trigger the bug. The hardware is a Lenovo m720q and I'm running Debian 11 (bullseye).
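To confirm the setting actually took effect after the interface comes up, something like the following could be used. It is only a sketch: the function name `offloads_off` is made up here, and it assumes the usual `ethtool -k` feature names (`tcp-segmentation-offload`, `generic-segmentation-offload`).

```shell
#!/bin/sh
# Reads `ethtool -k <iface>` output on stdin and succeeds only if both
# TSO and GSO are reported as "off". The function name is hypothetical.
offloads_off() {
    awk '
        /^tcp-segmentation-offload:/     { tso = $2 }
        /^generic-segmentation-offload:/ { gso = $2 }
        END { exit !(tso == "off" && gso == "off") }
    '
}

# Example (eno1 is the interface name from the log above):
#   ethtool -k eno1 | offloads_off && echo "TSO/GSO are off"
```

If the check fails after a reboot, the `post-up` line is probably attached to a different interface stanza than the one that brings up the NIC.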
 

gb00s

If I'm not mistaken, this is an I219-V issue with TSO and GSO per se, known at least since 2019. I thought it was fixed by a kernel update around 5.4 LTS.
 

René!

Well, I'm running kernel version 5.10 and it doesn't seem to be solved unless I disable TSO and GSO.
 

Stephan

IMHO it's not solved on kernels newer than 5.4 even if you disable TSO and GSO. I was still getting a hung interface on anything newer and had to downgrade. So I recommend watching the kernel log like a hawk for a bit. Fortunately 5.4 is supported by upstream until Dec 2025, at which point I will either throw out all machines that have this chip or get a separate PCIe NIC. The last solid NICs Intel brought out were the i210, i211, X520 and X540, and those are all 10 years old now. I wasted 1-2 full days debugging this and won't buy a machine with an i219 ever again.
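For keeping an eye on the kernel log, a rough sketch like this might do. The function name `count_hangs` is made up, and it simply counts lines containing the message text from the log excerpt at the top of the thread.

```shell
#!/bin/sh
# Counts e1000e hang reports in kernel log text read from stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function usable in scripts that run with `set -e`.
count_hangs() {
    grep -c 'Detected Hardware Unit Hang' || true
}

# Examples:
#   journalctl -k --no-pager | count_hangs
#   dmesg | count_hangs
```

Running it from cron or a monitoring check and alerting on any non-zero count would catch a regression after a kernel upgrade.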

Udev rule for /etc/udev/rules.d/99-intel.rules:

SUBSYSTEM=="net", ACTION=="add", ATTRS{vendor}=="0x8086", ATTRS{device}=="0x15b7", RUN+="/usr/bin/ethtool -K $name tso off gso off"
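The device ID in the rule may not match every I219 variant, so it is worth checking what your own NIC reports before copying it. A minimal sketch, assuming a PCI NIC whose IDs are exposed under sysfs; the function name `nic_pci_id` and the `SYSFS_NET` override are made up for illustration:

```shell
#!/bin/sh
# Prints "<vendor> <device>" for a network interface, e.g. "0x8086 0x15b7".
# SYSFS_NET is a hypothetical override that exists only to make the
# function testable; by default the real sysfs path is used.
nic_pci_id() {
    iface=$1
    base=${SYSFS_NET:-/sys/class/net}/$iface/device
    [ -r "$base/vendor" ] && [ -r "$base/device" ] || return 1
    printf '%s %s\n' "$(cat "$base/vendor")" "$(cat "$base/device")"
}

# Example (eno1 is the interface name from the log above):
#   nic_pci_id eno1
```

The two values printed are what `ATTRS{vendor}` and `ATTRS{device}` in the rule are compared against.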
 