The SSH connection is being terminated abruptly, either by the SSH daemon on the server (`sshd`), for example because it received unexpected data, or because the underlying network path is unstable or has dropped the session.
Common Causes and Fixes

SSH Keepalives Not Configured or Too Infrequent
- Diagnosis: On the client, check `~/.ssh/config` or `/etc/ssh/ssh_config` for `ServerAliveInterval` and `ServerAliveCountMax`. On the server, check `/etc/ssh/sshd_config` for `ClientAliveInterval` and `ClientAliveCountMax`. If the interval options are absent or set to `0` (the default, which disables them), or the interval is longer than the idle timeout of a device on the path, an idle session carries no traffic and an intermediate device can drop it before either end notices.
- Fix: On the client, add or modify these lines in `~/.ssh/config`:

      ServerAliveInterval 60
      ServerAliveCountMax 3

  On the server, add or modify these lines in `/etc/ssh/sshd_config` and restart `sshd` (`sudo systemctl restart sshd`; the unit is named `ssh` on Debian and Ubuntu):

      ClientAliveInterval 60
      ClientAliveCountMax 3

- Why it works: `ServerAliveInterval 60` tells the client to send a keepalive message through the encrypted channel whenever it has received nothing from the server for 60 seconds. `ServerAliveCountMax 3` means the client disconnects after 3 consecutive keepalives go unanswered, so an unresponsive server is detected after roughly 180 seconds. The regular traffic keeps the connection active in the eyes of intermediate network devices (like firewalls or NAT gateways) that might otherwise time out idle TCP sessions. The server-side `ClientAlive` directives do the same from the server’s perspective, so the server notices when a client has gone silent.
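
The arithmetic behind these settings can be sketched in a few lines of Python. This is a rough check rather than a real `ssh_config` parser: the function names and the sample fragment are invented for illustration, and `Host`/`Match` blocks are ignored.

```python
"""Estimate how quickly SSH keepalives detect an unresponsive peer."""

KEEPALIVE_KEYWORDS = {
    "serveraliveinterval", "serveralivecountmax",
    "clientaliveinterval", "clientalivecountmax",
}


def parse_keepalive_options(config_text):
    """Return {lowercase keyword: int} for the keepalive keywords found.

    Handles only plain "Keyword value" and "Keyword=value" lines.
    """
    found = {}
    for raw_line in config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.replace("=", " ", 1).split()
        if len(parts) >= 2 and parts[0].lower() in KEEPALIVE_KEYWORDS:
            # OpenSSH keeps the first value it obtains for a keyword.
            found.setdefault(parts[0].lower(), int(parts[1]))
    return found


def detection_time(interval, count_max=3):
    """Seconds until the peer is declared dead, or None if keepalives are off."""
    if interval <= 0:
        return None  # an interval of 0 disables SSH-level keepalives
    return interval * count_max


# Hypothetical ~/.ssh/config fragment used only for this example.
sample = """
ServerAliveInterval 60
ServerAliveCountMax 3
"""
opts = parse_keepalive_options(sample)
print(detection_time(opts.get("serveraliveinterval", 0),
                     opts.get("serveralivecountmax", 3)))  # 180
```

With the values from the fix above, an unresponsive server is given up on after about three minutes.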

Firewall/NAT Gateway State Table Timeout
- Diagnosis: This is most common when connecting through a corporate firewall or a home router with aggressive stateful inspection. If the connection is idle for longer than the firewall’s TCP idle timeout (often between 5 and 30 minutes), the firewall discards its state entry for that connection. When the client or server next sends a packet, the firewall no longer recognizes it as part of an established connection. It either drops the packet silently, so the session hangs, or answers with a TCP RST, which the sender reports as "Connection reset by peer".
- Fix: Apply the SSH keepalive settings (`ServerAliveInterval` and `ClientAliveInterval`) described in the first cause above, with an interval shorter than the firewall’s idle timeout. If that’s insufficient, you might need to configure the firewall/NAT device to increase its TCP connection timeout for established SSH sessions, though this is often not user-configurable on consumer-grade equipment.
- Why it works: Every keepalive counts as traffic and resets the firewall’s idle timer, so the state table entry never times out.
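
A toy model makes the interaction between idle timeouts and keepalives concrete. The sketch below is a simplification with invented function names: it treats the firewall as a single idle timer and places keepalives at fixed intervals, which only approximates how SSH schedules them.

```python
def state_entry_survives(packet_times, idle_timeout):
    """True if no gap between consecutive packets exceeds idle_timeout.

    packet_times: sorted timestamps in seconds at which a packet crosses
    the firewall. idle_timeout: seconds of silence after which the
    firewall forgets the connection.
    """
    for earlier, later in zip(packet_times, packet_times[1:]):
        if later - earlier > idle_timeout:
            return False
    return True


def with_keepalives(packet_times, interval):
    """Fill every silent gap with a keepalive each `interval` seconds."""
    if not packet_times:
        return []
    result = []
    for earlier, later in zip(packet_times, packet_times[1:]):
        result.append(earlier)
        t = earlier + interval
        while t < later:
            result.append(t)
            t += interval
    result.append(packet_times[-1])
    return result


# A session that is idle for 45 minutes behind a firewall with a
# 15-minute idle timeout (both numbers are made up for the example).
idle_session = [0, 2700]
print(state_entry_survives(idle_session, 900))                       # False
print(state_entry_survives(with_keepalives(idle_session, 60), 900))  # True
```

The same model shows why the keepalive interval has to be shorter than the timeout: with an interval of 1200 seconds the gaps are still longer than 900 seconds and the entry expires.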

MTU Mismatch or Path MTU Discovery (PMTUD) Issues
- Diagnosis: If large SSH packets, especially during data transfer (like SCP or SFTP), exceed the MTU of some hop, and the resulting fragments or ICMP "Fragmentation Needed" messages are dropped by a network device that doesn’t handle them properly (e.g., some VPNs, older routers), the connection can stall or be reset. You typically see this when transferring large files, while login and short commands still work.
- Fix: OpenSSH has no option for setting the MTU of a connection. `MtuBytes` is not a valid `ssh_config` keyword, so `ssh -o "MtuBytes=1400" user@host` is rejected as a bad configuration option. Lower the MTU on the client’s network interface or VPN tunnel instead (1400 bytes is a common test value). The `IPQoS` keyword in `~/.ssh/config` is sometimes suggested for this problem:

      # 0x08 is the legacy ToS "maximize throughput" value, not "minimize delay".
      IPQoS 0x08

  It changes only the ToS/DSCP marking on SSH packets, not their size, so it helps when a device on the path mishandles marked packets rather than when the MTU is wrong. A more robust approach is to ensure PMTUD works correctly end-to-end, which means ICMP "Fragmentation Needed" messages must not be blocked by any network device.
- Why it works: A lower interface MTU makes the TCP stack advertise a smaller maximum segment size, so packets fit through the narrowest hop without fragmentation. Proper PMTUD allows the sender to discover the smallest MTU along the path and adjust its packet sizes accordingly, avoiding fragmentation altogether.
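
When PMTUD is suspected, the usable path MTU can be found by probing with don't-fragment pings of different sizes. The sketch below shows the search logic only: `probe` is a caller-supplied function (an assumption of this example) that would, on Linux, wrap something like `ping -M do -s <payload> <host>`, and the 576 to 1500 byte bounds are illustrative defaults.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def ping_payload_for_mtu(mtu):
    """Largest ping payload that fits one unfragmented IPv4 packet of size mtu."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def find_path_mtu(probe, low=576, high=1500):
    """Binary-search the largest MTU for which probe(mtu) returns True.

    Assumes probe(low) succeeds and that any size below a working size
    also works.
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always progresses
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


def fake_probe(mtu):
    """Stand-in for a real ping: pretend the path passes at most 1400 bytes."""
    return mtu <= 1400


print(find_path_mtu(fake_probe))   # 1400
print(ping_payload_for_mtu(1400))  # 1372
```

Note that the payload size is 28 bytes smaller than the packet on the wire, so `ping -s 1400` actually sends 1428-byte IPv4 packets.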

Underlying Network Instability (Packet Loss/Jitter)
- Diagnosis: High packet loss or significant jitter on the network path between the client and server forces TCP to retransmit packets repeatedly. If the retransmissions keep failing, TCP gives up and declares the connection dead, which often surfaces as a "reset by peer" error on the other side once it tries to use the connection again. Use tools like `mtr` (My Traceroute) or `ping` with large packet sizes (`ping -s 1400 <host>`) to test for packet loss and latency.
- Fix: This is a network problem, not an SSH configuration problem, so you’ll need to troubleshoot the network path. This might involve contacting your ISP, your network administrator, or reconfiguring your local network equipment.
- Why it works: SSH relies on a stable TCP connection. TCP recovers from occasional loss by retransmitting, but sustained loss exhausts its retries and the connection is terminated. Fixing the network path restores the reliability TCP needs.
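
If you collect `ping` results over time, the loss figure can be pulled out of the summary line with a small helper. This is a minimal sketch: the function name is invented, and the sample line is an illustrative string in the iputils (Linux) format, not a real measurement.

```python
import re

LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def packet_loss_percent(ping_output):
    """Return the packet-loss percentage from ping's summary, or None if absent."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None


# Illustrative summary line, not captured from a real host.
sample = "20 packets transmitted, 17 received, 15% packet loss, time 19027ms"
print(packet_loss_percent(sample))  # 15.0
```

Any sustained non-zero loss on an otherwise idle path is worth investigating before changing SSH settings.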

SSH Client or Server Version Incompatibility / Bugs
- Diagnosis: Although rare, specific versions of `sshd` or the SSH client can have bugs that lead to connection resets under certain conditions (e.g., specific ciphers, complex authentication methods, or unusual session activity). Check `ssh -V` on the client and `sshd -V` on the server to see your versions. Older OpenSSH releases do not accept `sshd -V`, but they still print their version string in the resulting usage message.
- Fix: Update both your SSH client and server to the latest stable versions. Consult the release notes for known issues related to connection stability.
- Why it works: Software bugs can manifest in unpredictable ways, including premature connection termination. Updates often contain bug fixes that resolve these underlying issues.
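
To compare versions across many hosts, the banner printed by `ssh -V` (which goes to standard error) can be reduced to a comparable tuple. The helper names, the sample banner, and the `(9, 0)` threshold below are all illustrative assumptions rather than a recommendation for a specific minimum version.

```python
import re

VERSION_RE = re.compile(r"OpenSSH_(\d+)\.(\d+)")


def openssh_version(banner):
    """Extract (major, minor) from an OpenSSH version banner, or None."""
    match = VERSION_RE.search(banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def older_than(banner, minimum):
    """True if the banner's version is below `minimum`, a (major, minor) tuple."""
    version = openssh_version(banner)
    if version is None:
        raise ValueError("no OpenSSH version found in banner")
    return version < minimum


# Illustrative banner text, not taken from a specific system.
sample = "OpenSSH_8.9p1 Ubuntu-3ubuntu0.1, OpenSSL 3.0.2 15 Mar 2022"
print(openssh_version(sample))     # (8, 9)
print(older_than(sample, (9, 0)))  # True
```

Comparing tuples rather than strings avoids mistakes such as treating 8.10 as older than 8.9.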

Resource Exhaustion on the Server
- Diagnosis: If the server running `sshd` is under heavy load (e.g., too many concurrent connections, high CPU usage, or running out of memory), it might start dropping connections. Check server load using `top`, `htop`, or `uptime`. Look for errors in `/var/log/auth.log` or `/var/log/secure` related to `sshd` or resource limits.
- Fix: Optimize server performance, increase server resources (CPU, RAM), or implement connection limits in `/etc/ssh/sshd_config` using `MaxSessions` and `MaxStartups`:

      # Example in /etc/ssh/sshd_config
      MaxSessions 10
      # Format is start:rate:full. From 50 unauthenticated connections, refuse
      # 30% of new ones, rising linearly to 100% at 100 unauthenticated connections.
      MaxStartups 50:30:100

  Restart `sshd` after changes.
- Why it works: `MaxSessions` limits the number of simultaneous open sessions per network connection. `MaxStartups` limits the number of concurrent unauthenticated connections, which keeps a flood of connection attempts from overwhelming the `sshd` process before authentication. By managing resource consumption, the server becomes more stable.
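
The `start:rate:full` form of `MaxStartups` describes a linear ramp, which is easier to see as a function. The sketch below follows the behavior documented for `sshd_config`; the function names are invented, integer percentages are an assumption of this example, and the single-number form of `MaxStartups` is not handled.

```python
def parse_max_startups(value):
    """Split a "start:rate:full" string into three integers."""
    start, rate, full = (int(part) for part in value.split(":"))
    return start, rate, full


def drop_percent(unauthenticated, start, rate, full):
    """Percent chance that a new unauthenticated connection is refused."""
    if unauthenticated < start:
        return 0
    if unauthenticated >= full:
        return 100
    # Linear ramp from `rate` percent at `start` to 100 percent at `full`.
    return rate + (100 - rate) * (unauthenticated - start) // (full - start)


start, rate, full = parse_max_startups("50:30:100")
for pending in (49, 50, 75, 100):
    print(pending, drop_percent(pending, start, rate, full))
```

With `50:30:100`, nothing is refused below 50 pending connections, 30% are refused at 50, 65% at 75, and everything at 100.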

`TCPKeepAlive` vs. `ClientAliveInterval`/`ServerAliveInterval`
- Diagnosis: It’s important to distinguish TCP-level keepalives from the SSH-specific `ClientAliveInterval`/`ServerAliveInterval`. The `TCPKeepAlive` option in `ssh_config` and `sshd_config` (default `yes`) only switches TCP keepalive probes on for the SSH socket. Their timing comes from the operating system, on Linux via `sysctl` settings like `net.ipv4.tcp_keepalive_time`, whose default is 7200 seconds (two hours). If the OS timers are set very high or `TCPKeepAlive` is `no`, and the SSH keepalives described in the first cause are also misconfigured, a dead connection can go undetected for a long time.
- Fix: Ensure both OS-level TCP keepalives and SSH keepalives are reasonably configured. For Linux, you might set:

      # In /etc/sysctl.conf or a file in /etc/sysctl.d/
      net.ipv4.tcp_keepalive_time = 600
      net.ipv4.tcp_keepalive_intvl = 60
      net.ipv4.tcp_keepalive_probes = 5

  Apply with `sudo sysctl -p` for `/etc/sysctl.conf`, or `sudo sysctl --system` to also load the files in `/etc/sysctl.d/`.
- Why it works: The OS-level TCP keepalives are a final safety net. If the SSH keepalives fail or are absent, these will still probe the connection. With 600 seconds (10 minutes) for the initial delay, 60 seconds between probes, and 5 probes, the OS detects a dead connection after about 15 minutes of inactivity (600 + 5 × 60 = 900 seconds), which is generally acceptable.
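
What `TCPKeepAlive yes` asks of the kernel can be shown on an ordinary socket. The sketch below is an illustration, not what OpenSSH itself runs: the function names are invented, and the per-socket timer options (`TCP_KEEPIDLE`, `TCP_KEEPINTVL`, `TCP_KEEPCNT`) are set only where Python's `socket` module exposes them, as it does on Linux.

```python
import socket


def enable_keepalive(sock, idle=600, interval=60, probes=5):
    """Turn on TCP keepalive for one socket and, where supported, set its timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Where these per-socket options are missing, the system-wide defaults
    # (the sysctl values on Linux) stay in effect.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


def dead_peer_detection_time(idle, interval, probes):
    """Seconds of silence before the kernel gives up on the connection."""
    return idle + interval * probes


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    enable_keepalive(s)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)

print(dead_peer_detection_time(600, 60, 5))  # 900
```

The sysctl values apply only to sockets that have `SO_KEEPALIVE` enabled, which is why `TCPKeepAlive` and the OS timers have to be considered together.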

After implementing these fixes, the connection should stay open cleanly. If the instability was masking other issues, the next error you encounter is likely to be an authentication failure.