Fix Intel I219-V Detected Hardware Unit Hang


René!

New Member
Jan 1, 2018
If you're using VLANs on your Intel I219-V, you might run into the following issue:
Code:
May 22 20:31:21 x759100 kernel: [7294020.176745] e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
May 22 20:31:21 x759100 kernel: [7294020.176745]   TDH                  <e8>
May 22 20:31:21 x759100 kernel: [7294020.176745]   TDT                  <f>
May 22 20:31:21 x759100 kernel: [7294020.176745]   next_to_use          <f>
May 22 20:31:21 x759100 kernel: [7294020.176745]   next_to_clean        <e7>
May 22 20:31:21 x759100 kernel: [7294020.176745] buffer_info[next_to_clean]:
May 22 20:31:21 x759100 kernel: [7294020.176745]   time_stamp           <16caf8cc5>
May 22 20:31:21 x759100 kernel: [7294020.176745]   next_to_watch        <e8>
May 22 20:31:21 x759100 kernel: [7294020.176745]   jiffies              <16caf91c8>
May 22 20:31:21 x759100 kernel: [7294020.176745]   next_to_watch.status <0>
May 22 20:31:21 x759100 kernel: [7294020.176745] MAC Status             <40080083>
May 22 20:31:21 x759100 kernel: [7294020.176745] PHY Status             <796d>
May 22 20:31:21 x759100 kernel: [7294020.176745] PHY 1000BASE-T Status  <3800>
May 22 20:31:21 x759100 kernel: [7294020.176745] PHY Extended Status    <3000>
May 22 20:31:21 x759100 kernel: [7294020.176745] PCI Status             <10>
May 22 20:31:21 x759100 kernel: [7294020.400450] e1000e 0000:00:1f.6 eno1: Reset adapter unexpectedly
May 22 20:31:22 x759100 kernel: [7294020.491058] vmbr0: port 1(eno1) entered disabled state
This issue resets the network interface and can also cause kernel panics. I didn't dig too deep into the issue, but it seems that with VLANs and lots of bridges the hardware exhausts its memory and crashes.

To fix this issue I've applied the following change to `/etc/network/interfaces`:
Code:
    post-up /sbin/ethtool -K $IFACE tso off gso off
This disables TCP Segmentation Offload (TSO) and Generic Segmentation Offload (GSO), so there will probably be some performance hit on the CPU. However, after this change I'm no longer able to trigger the bug. The hardware is a Lenovo M720q and I'm running Debian 11 (bullseye).
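For context, a fuller stanza might look like this. This is only a sketch: it assumes the interface from the log above (eno1) bridged into vmbr0 as in a typical Proxmox setup, and the address and gateway are placeholders you'd replace with your own:

```
auto eno1
iface eno1 inet manual
    # disable TSO/GSO before the bridge comes up
    post-up /sbin/ethtool -K $IFACE tso off gso off

auto vmbr0
# placeholder address/gateway; adapt to your network
iface vmbr0 inet static
    address 192.168.1.10/24
    gateway 192.168.1.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
```

ifupdown exports the interface name as `IFACE` to `post-up` scripts, so the same line works unchanged on any stanza.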
 

gb00s

Well-Known Member
Jul 25, 2018
Poland
If I'm not mistaken this is an I219-V issue with TSO & GSO per se, known at least since 2019. I thought it was fixed by a kernel update around 5.4 LTS.
 

René!

New Member
Jan 1, 2018
Well, I'm running kernel 5.10 and it doesn't seem to be solved unless I disable TSO and GSO.
 

Stephan

Well-Known Member
Apr 21, 2017
Germany
IMHO it's not solved even if you disable TSO and GSO on kernels > 5.4. I was still getting a hung interface on anything newer and had to downgrade, so I recommend watching the kernel log like a hawk for a bit. Fortunately 5.4 is supported upstream until Dec 2025, at which point I will either throw out all machines that have this chip or get a separate PCIe NIC. The last solid NICs Intel brought out were the i210, i211, X520 and X540, and those are all 10 years old now. I wasted 1-2 full days debugging this and won't buy a machine with an i219 ever again.

Udev rule for /etc/udev/rules.d/99-intel.rules (note: udev only reads files ending in .rules):

Code:
SUBSYSTEM=="net", ACTION=="add", ATTRS{vendor}=="0x8086", ATTRS{device}=="0x15b7", RUN+="/usr/bin/ethtool -K $name tso off gso off"
 

diskdiddler

Member
Mar 3, 2017
IMHO it's not solved even if you disable TSO and GSO on kernels > 5.4. I was still getting a hung interface on anything newer and had to downgrade, so I recommend watching the kernel log like a hawk for a bit. Fortunately 5.4 is supported upstream until Dec 2025, at which point I will either throw out all machines that have this chip or get a separate PCIe NIC. The last solid NICs Intel brought out were the i210, i211, X520 and X540, and those are all 10 years old now. I wasted 1-2 full days debugging this and won't buy a machine with an i219 ever again.

Udev rule for /etc/udev/rules.d/99-intel.rules (note: udev only reads files ending in .rules):

Code:
SUBSYSTEM=="net", ACTION=="add", ATTRS{vendor}=="0x8086", ATTRS{device}=="0x15b7", RUN+="/usr/bin/ethtool -K $name tso off gso off"
Sorry, can I just clarify for a moment and make sure I'm reading this correctly:
are you saying that kernels AFTER 5.4 are actually worse for this bug than ones before 5.4?

I am currently knee-deep in a file corruption issue which I believe was caused by this.
I'm currently using proxmox -
Linux 6.8.8-2-pve (2024-06-24T09:00Z)

I'm performing testing now that I've finally learnt the correct command as mentioned here:
How To Fix Proxmox Detected Hardware Unit Hang On Intel NICs - First2Host

Will update if that has not fixed my file corruption problems.
 

Stephan

Well-Known Member
Apr 21, 2017
Germany
@diskdiddler I've been running 6.6.x kernels for a while and haven't seen the e1000e driver drop the link. I still use tso off gso off in udev though, because I mistrust any acceleration on this product. Back when I wrote this, the 5.4 LTS kernel was the only reliable kernel I could find.

The latest bug I have seen on 12xxx/13xxx/14xxx CPU systems with this NIC was the ME (Intel Management Engine) producing IPv6 multicast (i.e. broadcast) storms in the megabit/s range while Windows is asleep or shut down. If you google "HBH ICMP6 multicast listener storm" you can see this affecting all sorts of vendors. The only workaround is to disable AMT in the BIOS. Or wait a little for the Intel CPU to die due to (my last info) corroded vias on the chip.

Hire engineers, fire suits and DEI. Unpopular, until the damage to the company becomes life-threateningly large.
 

diskdiddler

Member
Mar 3, 2017
I can confirm that turning TSO and GSO off entirely has fixed my problem. It's been a heck of a thing, a heck of a thing, and it really caused me some problems this last week.

I couldn't disable AMT, for a homelab guy it's my life, it's incredibly useful!

Thanks for your reply. I wish Intel or someone could nail this long term, OR that the driver shipped with these flags disabled unless manually enabled, not the other way around.
 

TopQuark

New Member
Mar 7, 2018
I recently updated my VMware install and all my VMs no longer work. It was an install going all the way back to ESXi 6.7, upgraded along the way to 8.0U3. It took me days to figure out, until I saw this thread.

It was an Ethernet Connection (7) I219-LM to be exact.
 

Stephan

Well-Known Member
Apr 21, 2017
Germany
Maybe after a takeover of Intel by Qualcomm the problem will be 'solved' for good. Or by following the example of Broadcom after the VMware takeover.
 

Apachez

Member
Jan 8, 2025
It seems that there are multiple possible fixes to this issue which seems to have been going on for at least 11 years...

1) Disable TSO.
2) Disable TSO and GSO.
3) Disable TSO, GSO and GRO.
4) Disable all offloading you can find (probably the worst workaround).
5) Disable EEE (energy efficient ethernet).
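The workarounds above map onto ethtool like this. A sketch only: "eno1" is a placeholder interface name, and the commands are echoed rather than executed so the mapping is visible without touching hardware; drop the `echo` to actually apply one.

```shell
# "eno1" is a placeholder; each line is an independent workaround level.
iface=eno1
echo "ethtool -K $iface tso off"                                       # 1) TSO only
echo "ethtool -K $iface tso off gso off"                               # 2) TSO + GSO
echo "ethtool -K $iface tso off gso off gro off"                       # 3) TSO + GSO + GRO
echo "ethtool -K $iface rx off tx off sg off tso off gso off gro off"  # 4) all offloads
echo "ethtool --set-eee $iface eee off"                                # 5) disable EEE
```

Note that EEE goes through `ethtool --set-eee` rather than the `-K` feature flags.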

Regarding EEE, I recognise this from being forced to disable EEE on switch ports towards network hosts that use Apple hardware. I wouldn't be surprised if these are/were using Intel NICs.

EEE comes in two flavours. One is to introduce microsleeps between sent packets in order to lower the power consumption of the PHY being used.

The other is to autodetect the TP cable length and use less power to send packets over a 2 meter cable vs. one that's, say, 95 meters.

What puts EEE on the suspect list are these lines from an Intel driver update back in 2019 (the note may exist earlier):


82573(V/L/E) TX Unit Hang Messages
----------------------------------
Several adapters with the 82573 chipset display "TX unit hang" messages during
normal operation with the e1000e driver. The issue appears both with TSO enabled
and disabled and is caused by a power management function that is enabled in
the EEPROM. Early releases of the chipsets to vendors had the EEPROM bit that
enabled the feature. After the issue was discovered newer adapters were
released with the feature disabled in the EEPROM.

If you encounter the problem in an adapter, and the chipset is an 82573-based
one, you can verify that your adapter needs the fix by using ethtool:

# ethtool -e eth0

Offset Values
------ ------
0x0000 00 12 34 56 fe dc 30 0d 46 f7 f4 00 ff ff ff ff
0x0010 ff ff ff ff 6b 02 8c 10 d9 15 8c 10 86 80 de 83
                                                 ^^

The value at offset 0x001e (de) has bit 0 unset. This enables the problematic
power saving feature. In this case, the EEPROM needs to read "df" at offset
0x001e.

A one-time EEPROM fix is available as a shell script. This script will verify
that the adapter is applicable to the fix and if the fix is needed or not. If
the fix is required, it applies the change to the EEPROM and updates the
checksum. The user must reboot the system after applying the fix if changes
were made to the EEPROM.

Example output of the script:

# bash fixeep-82573-dspd.sh eth0
eth0: is a "82573E Gigabit Ethernet Controller"
This fixup is applicable to your hardware executing command:
# ethtool -E eth0 magic 0x109a8086 offset 0x1e value 0xdf
Change made. You *MUST* reboot your machine before changes take effect!
The script can be downloaded at
.
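The bit test in the quoted readme can be sketched in shell; the byte value 0xde is the one from the example dump above (run with bash, since the `(( ))` arithmetic needs it):

```shell
# Bit 0 of the byte at EEPROM offset 0x001e decides whether the
# problematic power-saving feature is active: unset = enabled (bad),
# set = disabled (good). 0xde has bit 0 unset; the fix writes 0xdf.
byte=0xde
if (( byte & 0x01 )); then
    echo "bit 0 set: power-saving feature disabled, no fix needed"
else
    echo "bit 0 unset: feature enabled, fix should write 0xdf"
fi
```

For 0xde this takes the second branch, i.e. it reports that the EEPROM fix is needed.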
There are also multiple ways to implement the "disable offloading" workaround, depending on which distro you use.

I prefer adding this to /etc/network/interfaces (example from Proxmox PVE 9.0):

Code:
iface ETH1 inet manual
        post-up /usr/bin/logger -p debug -t ifup "Disabling segmentation offload for ${IFACE}" && /sbin/ethtool -K ${IFACE} tso off gso off && /usr/bin/logger -p debug -t ifup "Disabled offload for ${IFACE}"
#FRONTEND
 

JohnBrown13

New Member
Nov 28, 2025
I’ve been running into the same issue with the I219-V when using VLANs — random hangs, “Hardware Unit Hang” messages, and occasional interface resets. I didn’t know it could be related to TSO/GSO. I’ll try disabling them with ethtool and see if it stabilizes things.