This whole mess starts when your packets are too big for some hop on the network and get dropped, not because the network is broken, but because a piece of it is silently saying "nope, too big."

The core problem is that TCP, by default, doesn’t know the smallest Maximum Transmission Unit (MTU) along the entire path to its destination. If a TCP segment exceeds this smallest MTU, it has to be fragmented at the IP layer. Routers along the path might be configured to drop these fragmented packets, especially if they’re trying to enforce security policies or conserve resources. This leads to lost packets, retransmissions, and a connection that grinds to a halt or fails entirely.
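The arithmetic behind fragmentation is worth having at your fingertips. A minimal sketch, assuming a standard 1500-byte Ethernet MTU and option-free IPv4 and TCP headers:

```shell
# Assumed: standard Ethernet MTU, minimal IPv4 and TCP headers (no options).
MTU=1500
IP_HDR=20      # IPv4 header without options
TCP_HDR=20     # TCP header without options
MSS=$((MTU - IP_HDR - TCP_HDR))
echo "Largest TCP payload per segment: $MSS bytes"
```

Any segment larger than this forces IP-layer fragmentation somewhere on the path.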

Here’s how to dig in and fix it:

  • The "DF" Bit is Your Friend (and Foe): The "Don’t Fragment" (DF) bit in the IP header tells routers that if a packet is too big, they should drop it and send back an ICMP "Fragmentation Needed" message. This message is supposed to tell the sender the path MTU. The problem is, firewalls often block these ICMP messages, leaving the sender in the dark.

    • Diagnosis: On the sending host, try to ping the destination with the DF bit set and a large packet size.
      ping -M do -s 1472 <destination_ip>
      
      (With -s 1472, the total IPv4 packet is 1500 bytes: 1472 bytes of payload plus a 20-byte IP header and an 8-byte ICMP header. Adjust as needed.) If this fails with "Frag needed" or gets no response, and you suspect ICMP is blocked, you’ve found a symptom.
    • Fix: Enable Path MTU Discovery (PMTUD) on your operating system and ensure intermediate firewalls allow ICMP Type 3, Code 4 messages.
      • Linux: net.ipv4.tcp_mtu_probing = 1 (1 probes only after an ICMP black hole is detected; 2 probes on every connection). This is usually set in /etc/sysctl.conf or a drop-in under /etc/sysctl.d/.
      • Windows: PMTUD is generally enabled by default. On older systems it is controlled by the EnablePMTUDiscovery registry value under HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters; you shouldn’t need to touch it unless it has been explicitly disabled.
      • Why it works: tcp_mtu_probing enables Packetization Layer PMTUD (RFC 4821): the TCP stack probes the path itself, shrinking the segment size after losses and then probing upward with progressively larger segments until one is acknowledged. Because it relies on TCP acknowledgments rather than ICMP, it works even when "Fragmentation Needed" messages are blocked.
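As a quick reference for the Linux setting (the sysctl name is real; the drop-in filename is just an example):

```shell
# Enable PLPMTUD after an ICMP black hole is detected (use 2 to always probe).
sudo sysctl -w net.ipv4.tcp_mtu_probing=1
# Persist across reboots (example filename):
echo 'net.ipv4.tcp_mtu_probing = 1' | sudo tee /etc/sysctl.d/90-mtu-probing.conf
```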
  • Blackholed ICMP: This is the most common reason PMTUD fails. Firewalls or routers silently drop the "Fragmentation Needed" ICMP messages.

    • Diagnosis: Use traceroute or mtr to the destination. If you see a sudden increase in packet loss or timeouts at a specific hop, especially one that looks like a firewall or edge router, that’s a prime suspect.
      mtr <destination_ip>
      
    • Fix: The ideal fix is to configure firewalls to permit ICMP Type 3, Code 4 (Fragmentation Needed). If this isn’t possible (e.g., a managed service provider), you might have to resort to manually setting a lower MTU on your interface.
      ip link set dev eth0 mtu 1400
      
      (This sets the MTU for eth0 to 1400, which is smaller than the common 1500 byte Ethernet MTU and should avoid fragmentation on most paths.) Why it works: By forcing a smaller MTU on your end, you ensure packets don’t need fragmentation in the first place, bypassing the need for PMTUD to discover the path MTU and thus avoiding the problematic ICMP messages.
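Before hard-coding a number like 1400, you can binary-search the real path MTU with DF-bit pings. A sketch of the logic: probe_ok here is a stand-in that simulates a 1400-byte path MTU so the script is self-contained; in real use it would run ping -M do -c 1 -s "$1" <destination_ip> and check the exit status.

```shell
# Sketch: binary-search the largest ICMP payload that passes with DF set.
# probe_ok is a stand-in simulating a path MTU of 1400 (payload limit 1372);
# in real use: ping -M do -c 1 -s "$1" <destination_ip> >/dev/null 2>&1
PATH_PAYLOAD_LIMIT=1372
probe_ok() { [ "$1" -le "$PATH_PAYLOAD_LIMIT" ]; }

lo=0; hi=1472                     # 1472 = payload for a full 1500-byte packet
while [ "$lo" -lt "$hi" ]; do
  mid=$(( (lo + hi + 1) / 2 ))    # round up so the loop always makes progress
  if probe_ok "$mid"; then lo=$mid; else hi=$((mid - 1)); fi
done
echo "Largest unfragmented payload: $lo bytes (path MTU = $((lo + 28)))"
```

The path MTU is the discovered payload plus 28 bytes (20-byte IP header + 8-byte ICMP header); that’s the value to feed to ip link set.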
  • Jumbo Frames Misconfiguration: If you’re in an environment that supports Jumbo Frames (MTU > 1500, commonly 9000), a mismatch can cause issues. If one link supports them and another doesn’t, packets might be fragmented unexpectedly.

    • Diagnosis: Check the MTU settings on all network interfaces involved in the path, from the source to the destination, including any virtual interfaces or tunnels.
      ip addr show eth0 | grep mtu
      
      (On Linux)
    • Fix: Ensure all devices on the entire path consistently support and are configured for the same MTU size, or ensure intermediate devices correctly handle fragmentation if MTUs differ. If you can’t guarantee consistency, disable Jumbo Frames and stick to the standard 1500 MTU. Why it works: Consistency in MTU size across the path eliminates the need for fragmentation or the potential for fragmentation to occur on a link that cannot handle it.
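A quick way to audit this on a Linux host is to scan interface MTUs for outliers. A sketch of the logic: the captured sample standing in for `ip -o link show` output (and its interface names) is hypothetical, so the pipeline runs end to end.

```shell
# Sketch: flag interfaces whose MTU differs from the expected jumbo size.
# In real use, pipe `ip -o link show` in; a captured sample stands in here.
EXPECTED=9000
sample='1: lo: <LOOPBACK,UP> mtu 65536 qdisc noqueue
2: eth0: <BROADCAST,UP> mtu 9000 qdisc mq
3: eth1: <BROADCAST,UP> mtu 1500 qdisc mq'

mismatches=$(printf '%s\n' "$sample" | awk -v want="$EXPECTED" '
  $2 != "lo:" {                              # loopback MTU is irrelevant here
    for (i = 1; i <= NF; i++) if ($i == "mtu") m = $(i + 1)
    if (m != want) printf "%s mtu %s (want %s)\n", $2, m, want
  }')
echo "$mismatches"
```

Any line printed is an interface that will fragment (or drop) jumbo-sized packets.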
  • VPN Tunneling MTU Overheads: VPNs add their own headers, effectively reducing the usable MTU for your actual data. If PMTUD isn’t working correctly across the VPN, you’ll hit fragmentation issues.

    • Diagnosis: Inside the VPN tunnel, try pinging the destination with a large packet size, but subtract the expected overhead of your VPN. For example, OpenVPN often adds 40-60 bytes.
      ping -M do -s 1400 <destination_ip_inside_vpn>
      
      (With a 60-byte VPN overhead, the inner packet must fit in 1500 - 60 = 1440 bytes, so the largest safe ICMP payload is 1440 - 28 = 1412 bytes; testing with -s 1400 leaves a little margin.)
    • Fix: Configure the VPN client or server to set a lower MTU on the virtual tunnel interface, or ensure PMTUD is functional and not being blocked by VPN security policies. For OpenVPN, you might use tun-mtu 1400 in the client/server config. Why it works: By pre-emptively reducing the MTU on the tunnel interface, you account for the VPN overhead, ensuring that the resulting packets are small enough to traverse the underlying network without fragmentation.
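The tunnel MTU falls out of simple subtraction. A sketch, assuming a 1500-byte outer MTU and a worst-case 60-byte overhead (the real overhead depends on cipher, HMAC, and transport, so measure yours):

```shell
# Assumed: 1500-byte outer MTU, worst-case 60-byte tunnel overhead.
OUTER_MTU=1500
VPN_OVERHEAD=60
TUN_MTU=$((OUTER_MTU - VPN_OVERHEAD))      # largest inner packet that fits
PING_PAYLOAD=$((TUN_MTU - 20 - 8))         # minus inner IP + ICMP headers
echo "tun-mtu $TUN_MTU; verify with: ping -M do -s $PING_PAYLOAD <destination>"
```

A tun-mtu of 1400, as in the OpenVPN example, is simply this result rounded down for extra safety margin.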
  • TCP MSS Clamping: Network devices (especially firewalls and routers) can be configured to "clamp" the Maximum Segment Size (MSS) advertised by endpoints. This is often done to prevent large TCP segments from causing fragmentation even if PMTUD is working. If this is set too low, it can unnecessarily limit throughput.

    • Diagnosis: Use packet captures (e.g., tcpdump or Wireshark) on the sending host. Look at the TCP SYN packet. The MSS option is advertised in the TCP header.
      tcpdump -i eth0 'tcp[tcpflags] & tcp-syn != 0 and src host <source_ip> and dst host <destination_ip>' -s 0 -w syn_capture.pcap
      
      Analyze syn_capture.pcap for the MSS value. A common value is 1460 (for 1500 MTU). If it’s significantly lower, MSS clamping might be the culprit.
    • Fix: Configure the firewall or router to allow a higher MSS value, or disable MSS clamping if it’s not strictly necessary. The MSS value should generally be MTU - 40 (for IPv4) or MTU - 60 (for IPv6). So, for an MTU of 1500, the MSS should be 1460. Why it works: By allowing a higher MSS, you permit the TCP endpoints to negotiate larger segment sizes, which can improve throughput, provided the network path can handle it.
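On a Linux router you control, MSS clamping is implemented with netfilter’s TCPMSS target; configured correctly, the same mechanism that causes this problem is also the standard workaround for blackholed ICMP. A config sketch (chain placement depends on your ruleset):

```shell
# Clamp the MSS of forwarded SYN packets to the discovered path MTU.
sudo iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN \
  -j TCPMSS --clamp-mss-to-pmtu
# Or pin an explicit value instead (e.g. 1360 for a 1400-byte path MTU):
# sudo iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN \
#   -j TCPMSS --set-mss 1360
```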
  • Stale ARP Cache: While less common as a direct cause of MTU issues, an outdated ARP cache can point to the wrong MAC address for a gateway, so frames are delivered to the wrong device or dropped entirely, which can manifest as strange connectivity problems that feel like MTU issues.

    • Diagnosis: Check the ARP cache on the sending host.
      arp -n
      
      Look for the gateway IP and its corresponding MAC address.
    • Fix: Clear the ARP cache for the specific entry.
      sudo ip neigh flush dev eth0 nud stale
      
      (This flushes stale entries on eth0 on Linux. A reboot also clears it.) Why it works: Re-populating the ARP cache ensures that the host is using the correct MAC address for the next-hop router, which is fundamental for proper link-layer communication.

Once you’ve addressed these, the next thing you’ll likely encounter is the subtler performance degradation caused by TCP Zero-Window situations: flow-control stalls that tend to become visible only once MTU problems are no longer masking them.
