TCP connections are silently failing in production, and you’re seeing intermittent packet loss or connection resets that don’t make sense.
This usually happens because a network device in the path is silently dropping or mangling TCP packets, and the application layer is too high-level to see it.
Common Causes and Fixes
- Firewall State Table Exhaustion
  - Diagnosis: Check firewall logs for "state table full" or similar messages. On Linux, you can check conntrack usage with conntrack -S. Look for rising drop or insert_failed counters, and compare the current entry count (net.netfilter.nf_conntrack_count) with the maximum (net.netfilter.nf_conntrack_max).
  - Fix: Increase the state table size on the firewall. For example, on a Cisco ASA, you might use show memory detail to check usage and show running-config | include timeout to see current timeouts. Shortening idle timeouts (e.g., timeout tcp established 3600, though the exact syntax varies by platform) can also help by freeing up entries faster, but increasing the table size is the direct fix.
  - Why it works: The firewall can’t track new connections if its memory for tracking existing ones is full, so it drops new SYN packets, and it may also drop data packets for flows it is no longer tracking.
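To make "close to the maximum" concrete on a Linux host, here is a minimal sketch that turns the two conntrack counters into a percentage. The function name conntrack_usage_pct is hypothetical, and the /proc paths assume the nf_conntrack module is loaded; on other hosts the live check is simply skipped.

```shell
#!/bin/sh
# Hypothetical helper: print conntrack table usage as a whole-number percentage.
# Usage: conntrack_usage_pct COUNT MAX
conntrack_usage_pct() {
    ct_count=$1
    ct_max=$2
    if [ "$ct_max" -le 0 ]; then
        echo "max must be positive" >&2
        return 1
    fi
    # Integer division, so the result is rounded down.
    echo $(( ct_count * 100 / ct_max ))
}

# Assumption: these files exist only when the nf_conntrack module is loaded.
count_file=/proc/sys/net/netfilter/nf_conntrack_count
max_file=/proc/sys/net/netfilter/nf_conntrack_max
if [ -r "$count_file" ] && [ -r "$max_file" ]; then
    pct=$(conntrack_usage_pct "$(cat "$count_file")" "$(cat "$max_file")")
    echo "conntrack table is ${pct}% full"
fi
```

A value that stays above roughly 80 to 90 percent during peak traffic is a reasonable trigger for raising the limit; that threshold is a rule of thumb, not a figure from any vendor.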
- MTU Mismatch / Black Hole
  - Diagnosis: Perform a ping with the "do not fragment" (DF) flag set and progressively larger packet sizes. On Linux:
        ping -M do -s 1472 <destination_ip>
    If you can’t reach a certain size (e.g., 1472 bytes, which is a 1500-byte MTU minus the 20-byte IP header and the 8-byte ICMP header), but smaller sizes work, you have an MTU issue. You might see errors such as "Frag needed and DF set" or "Packet needs to be fragmented but DF set"; in a true black hole, the large pings simply time out.
  - Fix: Set the MTU on the sending interface to match the smallest MTU in the path. For example, on a Linux server’s Ethernet interface:
        ip link set dev eth0 mtu 1400
    Alternatively, make sure Path MTU Discovery (PMTUD) works end to end (do not filter ICMP "Fragmentation Needed" messages), or use TCP MSS clamping on firewalls.
  - Why it works: If a packet is too large for a link in the path and the router cannot fragment it (because DF is set), the router drops it and sends back an ICMP "Fragmentation Needed" message. When that ICMP message is filtered somewhere along the way, the sender never learns the smaller MTU and keeps resending packets that vanish. Lowering the MTU or clamping the MSS keeps packets small enough to traverse the path without fragmentation.
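The size arithmetic behind the DF ping can be sketched as follows: an IPv4 ICMP echo carries 28 bytes of headers (20 IP + 8 ICMP), and a TCP segment carries 40 (20 IP + 20 TCP, without options). The function names and the list of candidate MTUs are illustrative assumptions; probe_path_mtu relies on the -M do, -c, -W and -s options of the Linux iputils ping.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Largest TCP payload (MSS) for the same MTU, assuming no IP or TCP options.
tcp_mss_for_mtu() {
    echo $(( $1 - 20 - 20 ))
}

# Try a few common MTU values from largest to smallest and print the first
# one that gets a reply with DF set. The candidate list is an assumption.
# Usage: probe_path_mtu HOST
probe_path_mtu() {
    probe_host=$1
    for candidate_mtu in 1500 1492 1460 1400 1280; do
        payload=$(icmp_payload_for_mtu "$candidate_mtu")
        if ping -M do -c 1 -W 2 -s "$payload" "$probe_host" >/dev/null 2>&1; then
            echo "$candidate_mtu"
            return 0
        fi
    done
    return 1
}
```

For example, probe_path_mtu <destination_ip> prints the largest candidate that passes, and tcp_mss_for_mtu applied to that value gives a starting point for MSS clamping. A host that blocks ICMP echo entirely will fail every candidate, so a non-zero exit does not by itself prove an MTU problem.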
- TCP Window Scaling Issues
  - Diagnosis: Use tcpdump on both ends of the connection and analyze the win (window size) and wscale (window scale, shown as WS in Wireshark) values. The window scale option is only carried in SYN and SYN-ACK segments, so the capture must include the handshake. Look for connections where the advertised window size is consistently small, or where the window scale option is missing or very low.
        tcpdump -i eth0 -s 0 -w tcp_window.pcap 'tcp port 80'
    Then analyze with Wireshark.
  - Fix: Ensure TCP window scaling is enabled and correctly negotiated on both client and server. On Linux, check sysctl net.ipv4.tcp_window_scaling. If it’s 0, enable it:
        sysctl -w net.ipv4.tcp_window_scaling=1
    Also, ensure net.ipv4.tcp_rmem and net.ipv4.tcp_wmem are set to reasonable values (e.g., 4096 87380 6291456).
  - Why it works: Without window scaling, the maximum TCP window size is 65,535 bytes, which is insufficient for high-bandwidth, high-latency links ("long fat networks"). If scaling is disabled, or a middlebox strips the option during the handshake, throughput is capped at roughly one window per round trip and plummets.
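To judge whether 65,535 bytes is actually the bottleneck on a given link, it helps to compute the bandwidth-delay product (BDP), the amount of data that must be in flight to fill the pipe. The sketch below uses hypothetical function names and takes bandwidth in Mbit/s and round-trip time in milliseconds; the cap of 14 on the shift count is the limit set by the TCP window scale option.

```shell
#!/bin/sh
# Bandwidth-delay product in bytes.
# (Mbit/s * 1,000,000 / 8) bytes/s * (ms / 1000) s  ==  Mbit/s * ms * 125
# Usage: bdp_bytes MBITS_PER_SEC RTT_MS
bdp_bytes() {
    echo $(( $1 * $2 * 125 ))
}

# Smallest window scale shift whose scaled 16-bit window covers TARGET bytes.
# The shift is capped at 14, the maximum the option allows.
# Usage: min_window_shift TARGET_BYTES
min_window_shift() {
    target_bytes=$1
    shift_count=0
    while [ $(( 65535 << shift_count )) -lt "$target_bytes" ] && [ "$shift_count" -lt 14 ]; do
        shift_count=$(( shift_count + 1 ))
    done
    echo "$shift_count"
}
```

For a 1 Gbit/s path with an 80 ms round trip, bdp_bytes 1000 80 gives 10000000 bytes, far above 65,535, and min_window_shift 10000000 gives 8. If the handshake in your capture shows a smaller scale factor than the BDP calls for, or none at all, the window is a plausible cause of the low throughput.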
- TCP Keepalive Issues
  - Diagnosis: If connections are dropping after a period of inactivity, check if keepalives are enabled and configured appropriately. Look for application logs indicating connections being closed unexpectedly.
  - Fix: Enable TCP keepalives and set appropriate intervals. On Linux:
        sysctl -w net.ipv4.tcp_keepalive_time=3600   # 1 hour
        sysctl -w net.ipv4.tcp_keepalive_intvl=60    # 1 minute
        sysctl -w net.ipv4.tcp_keepalive_probes=5    # 5 probes
    This means after 1 hour of idle time, the OS will send probes every minute, and if 5 probes go unanswered, the connection is considered dead. These sysctls only set defaults: they apply to sockets on which the application has enabled SO_KEEPALIVE. Choose a tcp_keepalive_time shorter than the shortest idle timeout in the path, otherwise the device will have dropped the connection before the first probe is sent.
  - Why it works: Network devices (like firewalls or load balancers) often have idle connection timeouts. TCP keepalives send small packets to keep the connection alive in these devices’ state tables.
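The interaction between the three keepalive settings and a device's idle timeout is simple arithmetic, sketched below with hypothetical function names. The first function gives the time until a silent peer is declared dead; the second checks that the first probe goes out before a given idle timeout expires. The /proc paths are the standard Linux locations for these sysctls.

```shell
#!/bin/sh
# Seconds of silence before the kernel gives up on a dead peer:
# idle time before the first probe, plus one interval per unanswered probe.
# Usage: keepalive_detect_seconds TIME INTVL PROBES
keepalive_detect_seconds() {
    echo $(( $1 + $2 * $3 ))
}

# Succeeds when the first probe is sent before the device's idle timeout.
# Usage: keepalive_beats_idle_timeout KEEPALIVE_TIME DEVICE_IDLE_TIMEOUT
keepalive_beats_idle_timeout() {
    [ "$1" -lt "$2" ]
}

# Report the current defaults when running on Linux.
ka_dir=/proc/sys/net/ipv4
if [ -r "$ka_dir/tcp_keepalive_time" ] && [ -r "$ka_dir/tcp_keepalive_intvl" ] \
    && [ -r "$ka_dir/tcp_keepalive_probes" ]; then
    ka_time=$(cat "$ka_dir/tcp_keepalive_time")
    ka_intvl=$(cat "$ka_dir/tcp_keepalive_intvl")
    ka_probes=$(cat "$ka_dir/tcp_keepalive_probes")
    echo "dead peer detected after $(keepalive_detect_seconds "$ka_time" "$ka_intvl" "$ka_probes") seconds"
fi
```

With the values 3600, 60 and 5, detection takes 3900 seconds. Against a firewall that expires idle flows after 3600 seconds (an example figure, not a measured one), keepalive_beats_idle_timeout 3600 3600 fails, which signals that the keepalive time needs to come down.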
- ECN (Explicit Congestion Notification) Misconfiguration
  - Diagnosis: Look for repeated TCP Retransmission and TCP Dup ACK events in Wireshark or other packet captures, especially if the network is not saturated. If ECN is enabled, you might see ECE (ECN-Echo) flags in TCP segments; a device in the path that doesn’t support ECN might drop packets marked with ECN bits.
  - Fix: Disable ECN on endpoints if it’s causing issues and not fully supported by the network path. On Linux, you can disable it via sysctl:
        sysctl -w net.ipv4.tcp_ecn=0
    Alternatively, configure network devices to properly handle ECN markings.
  - Why it works: ECN is designed to signal congestion without dropping packets. However, if intermediate devices drop packets marked for ECN, it leads to packet loss that looks like a standard congestion event but can be harder to trace.
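Before disabling ECN, it is worth confirming what the host is currently doing, since net.ipv4.tcp_ecn has three documented settings rather than a plain on/off switch. The sketch below uses a hypothetical helper name, and the descriptions paraphrase the Linux kernel's ip-sysctl documentation.

```shell
#!/bin/sh
# Hypothetical helper: explain a net.ipv4.tcp_ecn value.
# Usage: describe_tcp_ecn VALUE
describe_tcp_ecn() {
    case "$1" in
        0) echo "disabled: ECN is neither requested nor accepted" ;;
        1) echo "enabled: requested on outgoing connections and accepted on incoming ones" ;;
        2) echo "passive: accepted on incoming connections, not requested on outgoing ones" ;;
        *) echo "unknown value: $1"; return 1 ;;
    esac
}

# Report the live setting when running on Linux.
ecn_file=/proc/sys/net/ipv4/tcp_ecn
if [ -r "$ecn_file" ]; then
    ecn_value=$(cat "$ecn_file")
    echo "net.ipv4.tcp_ecn=$ecn_value ($(describe_tcp_ecn "$ecn_value"))"
fi
```

A host at 2 only negotiates ECN when the remote side asks for it, so ECN-related drops on connections that this host initiates are unlikely in that mode.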
- TCP Selective Acknowledgement (SACK) Issues
  - Diagnosis: If you see high retransmission rates without obvious packet loss, analyze tcpdump output for a lack of SACK information or inconsistent SACK blocks. This can happen if SACK is enabled but poorly implemented by a device or OS.
  - Fix: Ensure SACK is enabled and functioning correctly. On Linux, it’s usually enabled by default. You can check sysctl net.ipv4.tcp_sack (1 means enabled). If you suspect a specific device mishandles SACK, you might need to disable it on the endpoints as a last resort.
  - Why it works: SACK allows the receiver to acknowledge non-contiguous blocks of received data. This significantly improves performance when multiple packets are lost in a single window, as the sender only needs to retransmit the missing segments. If SACK is broken, the sender falls back to retransmitting data the receiver already has, and performance degrades severely.
After fixing these, you might encounter TCP Zero Window issues if your application isn’t consuming data fast enough, leading to sender pauses.