Some random RTO on my Ubuntu router


hansyulian

New Member
Feb 16, 2023
6
0
1
I am building a home server on Ubuntu Server 22.04 that serves as:
- router (isc-dhcp-server)
- ZFS file server
- sync backup server (Syncthing)
- file sharing server (Samba)
- torrent client (transmission-daemon)

Hardware:
- motherboard: Gigabyte X79-UD3
- RAM: 32 GB Corsair Dominator 1600 CL9
- boot drive: SanDisk USB 3.2 64 GB
- WAN network port: RTL8111 on an additional PCIe card
- internal network port: onboard NIC on the motherboard

I am experiencing random connection drops and wonder why, so I tried the following:
1. ping 8.8.8.8 from the server
2. ping 8.8.8.8 from my main PC
3. ping 172.16.0.1 (my server) from my main PC
4. ping 172.16.0.3 (my wife's PC) from my main PC

I noticed request timeouts at random times on (2) and (3) only, while (1) and (4) stayed connected whenever those timeouts happened. This causes issues when gaming.

Based on the ping behaviour, I believe the connection between the switch and my server's internal port may be the problem.

I would like to know if anyone has experience with this and can help me troubleshoot this random connection issue.
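
The elimination reasoning behind the four ping tests can be written down as a rough sketch. The link names and the topology (both PCs on a switch, the switch on the server's internal port, the server's WAN port toward the internet) are assumptions made for illustration; the post only mentions that a switch sits in front of the server's internal port.

```python
# Each ping test crosses a set of links. The link names are hypothetical
# labels for an assumed topology: PCs -> switch -> server LAN -> server WAN.
PATHS = {
    "1: server -> 8.8.8.8": {"server-wan"},
    "2: main pc -> 8.8.8.8": {"pc-switch", "switch-server-lan", "server-wan"},
    "3: main pc -> 172.16.0.1": {"pc-switch", "switch-server-lan"},
    "4: main pc -> 172.16.0.3": {"pc-switch", "switch-wifepc"},
}


def suspect_links(paths, failing):
    """Return links shared by every failing test and used by no passing test."""
    failing_paths = [links for name, links in paths.items() if name in failing]
    passing_paths = [links for name, links in paths.items() if name not in failing]
    if not failing_paths:
        return set()
    shared = set.intersection(*failing_paths)  # links common to all failures
    cleared = set().union(*passing_paths)      # links proven good by a passing test
    return shared - cleared


failing_tests = {"2: main pc -> 8.8.8.8", "3: main pc -> 172.16.0.1"}
print(suspect_links(PATHS, failing_tests))  # {'switch-server-lan'}
```

Here "switch-server-lan" stands for everything on that hop: the cable, the switch port, and the server's internal NIC, so the result narrows the search but does not say which of those is at fault.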
 

hansyulian

I am attaching the iperf test between my server and my PC. For some reason the 1st and 2nd runs are slow and come with request timeouts in the ping, but the remaining runs, at 900+ Mbps, don't get any request timeouts. After I leave it idle for some time, it goes back to about 50 Mbps and the timeouts return, so I think something is wrong with power saving or something similar.

------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (default)
------------------------------------------------------------
[ 1] local 172.16.0.1 port 5001 connected with 172.16.0.2 port 58112
[ ID] Interval Transfer Bandwidth
[ 1] 0.0000-10.0031 sec 231 MBytes 194 Mbits/sec
[ 2] local 172.16.0.1 port 5001 connected with 172.16.0.2 port 58126
[ ID] Interval Transfer Bandwidth
[ 2] 0.0000-12.7377 sec 92.1 MBytes 60.7 Mbits/sec
[ 3] local 172.16.0.1 port 5001 connected with 172.16.0.2 port 58127
[ ID] Interval Transfer Bandwidth
[ 3] 0.0000-10.0026 sec 1.11 GBytes 949 Mbits/sec
[ 4] local 172.16.0.1 port 5001 connected with 172.16.0.2 port 58128
[ ID] Interval Transfer Bandwidth
[ 4] 0.0000-10.0020 sec 1.10 GBytes 947 Mbits/sec
[ 5] local 172.16.0.1 port 5001 connected with 172.16.0.2 port 58129
[ ID] Interval Transfer Bandwidth
[ 5] 0.0000-10.0027 sec 1.10 GBytes 942 Mbits/sec
[ 6] local 172.16.0.1 port 5001 connected with 172.16.0.2 port 58130
[ ID] Interval Transfer Bandwidth
[ 6] 0.0000-10.0024 sec 1.11 GBytes 949 Mbits/sec
[ 7] local 172.16.0.1 port 5001 connected with 172.16.0.2 port 58131
[ ID] Interval Transfer Bandwidth
[ 7] 0.0000-10.0030 sec 1.09 GBytes 939 Mbits/sec
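
To spot the slow runs without reading every line, the summary lines can be filtered with a short script. This is only a sketch: the regular expression is written against the summary lines as pasted above, it only handles results reported in Mbits/sec, and the 500 Mbits/sec threshold is an arbitrary assumption for a gigabit link.

```python
import re

# Summary lines copied from the iperf server output above.
IPERF_OUTPUT = """\
[ 1] 0.0000-10.0031 sec 231 MBytes 194 Mbits/sec
[ 2] 0.0000-12.7377 sec 92.1 MBytes 60.7 Mbits/sec
[ 3] 0.0000-10.0026 sec 1.11 GBytes 949 Mbits/sec
[ 4] 0.0000-10.0020 sec 1.10 GBytes 947 Mbits/sec
[ 5] 0.0000-10.0027 sec 1.10 GBytes 942 Mbits/sec
[ 6] 0.0000-10.0024 sec 1.11 GBytes 949 Mbits/sec
[ 7] 0.0000-10.0030 sec 1.09 GBytes 939 Mbits/sec
"""

# Matches "[ id] start-end sec <amount> <unit> <rate> Mbits/sec".
SUMMARY = re.compile(
    r"\[\s*(\d+)\]\s+[\d.]+-\s*[\d.]+\s+sec\s+[\d.]+\s+\w+\s+([\d.]+)\s+Mbits/sec"
)


def slow_runs(text, threshold_mbps=500.0):
    """Return (run id, Mbits/sec) for every run below the threshold."""
    runs = []
    for match in SUMMARY.finditer(text):
        run_id, mbps = int(match.group(1)), float(match.group(2))
        if mbps < threshold_mbps:
            runs.append((run_id, mbps))
    return runs


print(slow_runs(IPERF_OUTPUT))  # [(1, 194.0), (2, 60.7)]
```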
 

dazgluk

New Member
Jan 25, 2023
10
1
3
Looks like you are dropping packets somewhere.
Are you using the same NIC for both "LAN" and "WAN", or are those different ones?
Start by checking the "ethtool" statistics and see if there are any errors/discards there.
For example:
ethtool -S enp2s0
NIC statistics:
tx_packets: 397076871
rx_packets: 306445113
tx_errors: 0
rx_errors: 0
rx_missed: 5140
align_errors: 0
tx_single_collisions: 0
tx_multi_collisions: 0
unicast: 305185764
broadcast: 81761
multicast: 1177588
tx_aborted: 0
tx_underrun: 0

So this NIC missed 5140 packets because of performance (rx_missed), and zero packets because of physical errors.
Let's see what you have.
 

hansyulian

These are my NIC statistics:

NIC statistics:
rx_packets: 112487543
tx_packets: 40955547
rx_bytes: 146900129135
tx_bytes: 19643886367
rx_broadcast: 14230
tx_broadcast: 7394
rx_multicast: 2204
tx_multicast: 4918
rx_errors: 0
tx_errors: 0
tx_dropped: 0
multicast: 2204
collisions: 0
rx_length_errors: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
rx_no_buffer_count: 0
rx_missed_errors: 0
tx_aborted_errors: 0
tx_carrier_errors: 0
tx_fifo_errors: 0
tx_heartbeat_errors: 0
tx_window_errors: 0
tx_abort_late_coll: 0
tx_deferred_ok: 0
tx_single_coll_ok: 0
tx_multi_coll_ok: 0
tx_timeout_count: 0
tx_restart_queue: 0
rx_long_length_errors: 0
rx_short_length_errors: 0
rx_align_errors: 0
tx_tcp_seg_good: 92071
tx_tcp_seg_failed: 0
rx_flow_control_xon: 0
rx_flow_control_xoff: 0
tx_flow_control_xon: 0
tx_flow_control_xoff: 0
rx_csum_offload_good: 112455955
rx_csum_offload_errors: 0
rx_header_split: 0
alloc_rx_buff_failed: 0
tx_smbus: 0
rx_smbus: 0
dropped_smbus: 0
rx_dma_failed: 0
tx_dma_failed: 0
rx_hwtstamp_cleared: 0
uncorr_ecc_errors: 0
corr_ecc_errors: 0
tx_hwtstamp_timeouts: 0
tx_hwtstamp_skipped: 0
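
A long counter list like this is easier to check with a small filter that prints only the non-zero problem counters. The sketch below is a heuristic, not an official ethtool feature: the keyword list is an assumption, counter names differ between drivers, and the two sample texts are excerpts of the statistics posted in this thread.

```python
# Substrings that usually mark an error-type counter (assumed heuristic).
PROBLEM_KEYWORDS = ("error", "drop", "miss", "fail", "coll", "abort", "timeout")


def parse_counters(stats_text):
    """Parse 'name: value' lines from `ethtool -S` output into a dict."""
    counters = {}
    for line in stats_text.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and value.strip().isdigit():  # skips the "NIC statistics:" header
            counters[name.strip()] = int(value.strip())
    return counters


def nonzero_problem_counters(stats_text, keywords=PROBLEM_KEYWORDS):
    """Return only the counters that look like error counters and are non-zero."""
    return {
        name: value
        for name, value in parse_counters(stats_text).items()
        if value and any(word in name for word in keywords)
    }


# Excerpt of the earlier example output in this thread.
EXAMPLE_NIC = """NIC statistics:
tx_packets: 397076871
rx_packets: 306445113
tx_errors: 0
rx_errors: 0
rx_missed: 5140
align_errors: 0
"""

# Excerpt of the statistics posted above.
MY_NIC = """NIC statistics:
rx_packets: 112487543
tx_packets: 40955547
rx_errors: 0
tx_dropped: 0
rx_crc_errors: 0
rx_missed_errors: 0
tx_timeout_count: 0
"""

print(nonzero_problem_counters(EXAMPLE_NIC))  # {'rx_missed': 5140}
print(nonzero_problem_counters(MY_NIC))       # {}
```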
 

hansyulian

OK, I have some findings. I swapped the WAN port and the local port, and now the internal network works perfectly without any timeouts, but the connection from my Ubuntu server router to the internet sometimes drops. I also found that it only happens when the interface negotiates 1 Gbps with my ISP router. The motherboard is a Gigabyte X79-UD3 from 2011 and the interface is an Intel Corporation 82579V Gigabit Network Connection (rev 05).

My current hypothesis is that the connection drops are caused by auto-negotiation on that network interface. I still need to research why, and I'm trying to find out which network standard the interface uses to see whether it is outdated or has a similar problem.
 

dazgluk

Usually drops show up in ethtool on one of the ends.
It may be either your NIC or the router's end.
The fact that it only happens when you negotiate 1 Gb/s may also point to the patch cord, since 1 Gb/s requires four copper pairs, while 100 Mb/s requires only two.

Your previous ethtool output looks good, so it's probably the other end.
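
To see which speed the link actually negotiated, the "Speed", "Duplex" and "Auto-negotiation" lines from `ethtool <interface>` can be pulled out with a sketch like the one below. The sample text is illustrative only and was not taken from this thread; a link that keeps falling back to 100Mb/s would be consistent with the bad-pair theory, but this check alone does not prove it.

```python
# Fields of interest in the output of `ethtool <interface>`.
WANTED = ("Speed", "Duplex", "Auto-negotiation", "Link detected")


def link_settings(ethtool_text):
    """Extract the negotiated link fields from `ethtool <interface>` output."""
    settings = {}
    for line in ethtool_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key in WANTED:
            settings[key] = value.strip()
    return settings


def negotiated_mbps(settings):
    """Return the speed in Mb/s, or None if it is missing or unknown."""
    speed = settings.get("Speed", "")
    digits = speed[:-4]  # strip the trailing "Mb/s"
    if speed.endswith("Mb/s") and digits.isdigit():
        return int(digits)
    return None


# Illustrative sample, NOT output from the hardware discussed in this thread.
SAMPLE = """Settings for enp2s0:
\tSpeed: 100Mb/s
\tDuplex: Full
\tAuto-negotiation: on
\tLink detected: yes
"""

settings = link_settings(SAMPLE)
mbps = negotiated_mbps(settings)
if mbps is not None and mbps < 1000:
    print(f"Link negotiated only {mbps} Mb/s, check the cable and both ports")
```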
 

hansyulian

I bought another Realtek 8111 and kept everything else the same, including the router, switch, and patch cord, so now the WAN and LAN use two identical Realtek 8111 NICs on two separate cards, and the decade-old Intel 82579 is no longer in use. The problem is now gone, so I'd say the Intel 82579 was somehow the problem; maybe it doesn't have a proper, up-to-date negotiation protocol.