When the SSH agent fails to sign a request, either the agent cannot access the private key material or the client cannot reach the agent. The usual causes are incorrect file permissions or a stale or corrupted agent socket.
Common Causes and Fixes
1. Incorrect Permissions on ~/.ssh Directory or Key Files
The SSH agent, and SSH in general, is strict about file permissions for security reasons. If your private key files (id_rsa, id_ed25519, etc.) are readable by other users, SSH will refuse to use them, and an overly permissive ~/.ssh directory undermines that protection.
Diagnosis: Check permissions with:
ls -ld ~/.ssh
ls -l ~/.ssh/id_*
The ~/.ssh directory should be 700 (drwx------), and private key files should be 600 (-rw-------). Public keys (.pub) can be more permissive, like 644.
Fix: Correct permissions with:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/id_*
Why it works: This prevents other users on your system from reading or modifying your sensitive private keys, a fundamental security requirement for SSH.
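The permission check above can be wrapped in a small read-only helper. This is a minimal sketch, assuming GNU stat (stat -c '%a'; BSD and macOS use stat -f '%Lp' instead). The function name check_mode is illustrative and not part of OpenSSH. It reports mismatches and never changes anything.

```shell
# check_mode TARGET EXPECTED: succeed if TARGET has exactly the octal mode
# EXPECTED (for example 700 or 600). Assumes GNU coreutils stat.
# check_mode is a hypothetical helper name, not an SSH tool.
check_mode() {
    local target="$1" expected="$2" actual
    actual=$(stat -c '%a' -- "$target") || return 2
    if [ "$actual" != "$expected" ]; then
        printf '%s: mode %s, expected %s\n' "$target" "$actual" "$expected" >&2
        return 1
    fi
}

# Usage (not run automatically):
#   check_mode ~/.ssh 700
#   for key in ~/.ssh/id_*; do
#       case $key in *.pub) continue ;; esac   # skip public keys
#       check_mode "$key" 600
#   done
```

Running the check before chmod shows which files were actually wrong, which helps when the same problem keeps coming back.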
2. SSH Agent Not Running or Not Properly Initialized
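The ssh-agent process needs to be running, and SSH_AUTH_SOCK must point to its socket in your current shell session for ssh and ssh-add to find it. SSH_AGENT_PID is only used to manage the agent (for example by ssh-agent -k) and is normally empty for forwarded or desktop-managed agents.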
Diagnosis: Check if the agent is running and its variables are set:
echo $SSH_AUTH_SOCK
echo $SSH_AGENT_PID
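If SSH_AUTH_SOCK is empty, the agent is not initialized for your session. An empty SSH_AGENT_PID on its own is not conclusive, because forwarded and desktop-managed agents do not set it. You can also try to list keys in the agent: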
ssh-add -l
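If this command hangs or reports that it cannot connect to the agent (exit status 2), the agent isn’t accessible. An exit status of 1 with "The agent has no identities." means the agent is reachable but holds no keys.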
Fix: Start a new agent and set the environment variables:
eval $(ssh-agent -s)
Then add your keys:
ssh-add ~/.ssh/id_rsa
ssh-add ~/.ssh/id_ed25519
Why it works: eval $(ssh-agent -s) starts a new agent process in the background and outputs shell commands that set SSH_AUTH_SOCK and SSH_AGENT_PID in your current shell, allowing ssh and ssh-add to communicate with the agent. ssh-add then loads your private keys into the running agent’s memory.
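To avoid starting a second agent when one is already reachable, the start-up step can be made conditional on the exit status that ssh-add documents (0 on success, 1 if the command fails, 2 if the agent cannot be contacted). This is a minimal sketch for a Bourne-style shell. The names agent_state and ensure_agent are hypothetical helpers, not OpenSSH commands.

```shell
# agent_state RC: translate the exit status of `ssh-add -l` into a word.
agent_state() {
    case "$1" in
        0) echo "has-keys" ;;      # agent reachable, at least one key loaded
        1) echo "no-keys" ;;       # agent reachable, command failed (typically no identities)
        2) echo "unreachable" ;;   # no agent could be contacted
        *) echo "unknown" ;;       # e.g. 127 when ssh-add is not installed
    esac
}

# ensure_agent: start a new agent only when the current one cannot be reached.
ensure_agent() {
    local rc=0
    ssh-add -l >/dev/null 2>&1 || rc=$?
    if [ "$(agent_state "$rc")" = "unreachable" ]; then
        eval "$(ssh-agent -s)" >/dev/null
    fi
}
```

Note that ensure_agent only reuses an agent whose SSH_AUTH_SOCK is already present in the environment. A shell that did not inherit the variable will still start a new agent.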
3. Agent Socket File Permissions/Ownership
The SSH_AUTH_SOCK environment variable points to a Unix domain socket file that the ssh-agent process creates. If the permissions or ownership of this socket file are incorrect, other processes (like your ssh client) won’t be able to connect to it. This can happen if the agent is run by a different user or if the socket file itself has been tampered with.
Diagnosis:
Find the socket file using SSH_AUTH_SOCK:
echo $SSH_AUTH_SOCK
Then check its permissions and ownership:
ls -l "$SSH_AUTH_SOCK"
The socket file should be owned by your user and closed to everyone else (OpenSSH typically creates it as srw-------, inside a directory with mode drwx------).
Fix: If permissions are wrong, try restarting the agent as described in cause #2. If the socket file is missing or corrupted, you might need to kill the old agent process and start a new one.
eval $(ssh-agent -k) # Kills the current agent
eval $(ssh-agent -s) # Starts a new one
ssh-add ~/.ssh/id_rsa
Why it works: By killing and restarting the agent, a new socket file with correct permissions and ownership is created, allowing your ssh client to establish a connection.
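Before killing anything, it can help to confirm that SSH_AUTH_SOCK really names a socket you own. A minimal sketch follows. socket_ok is a hypothetical helper name, and the -O test (owned by the effective user) is a widely supported extension in bash, dash, and zsh rather than strict POSIX. Passing this check does not prove that an agent is still listening on the socket.

```shell
# socket_ok SOCKET: succeed if SOCKET is non-empty, is a Unix domain socket,
# and is owned by the current effective user.
socket_ok() {
    [ -n "$1" ] && [ -S "$1" ] && [ -O "$1" ]
}

# Usage (not run automatically):
#   socket_ok "$SSH_AUTH_SOCK" ||
#       echo "stale or foreign agent socket: ${SSH_AUTH_SOCK:-<unset>}" >&2
```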
4. Corrupted SSH Agent Keyring or Key File
Sometimes, the private key file itself might become corrupted, or the way it was added to the agent might have been interrupted, leading to an invalid entry in the agent’s keyring.
Diagnosis:
Try to list keys and then re-add them. If ssh-add -l still fails after verifying permissions and agent setup, or if ssh-add reports errors when adding the key, the key itself might be the issue.
Fix: Remove the problematic key from the agent and re-add it:
ssh-add -d ~/.ssh/id_rsa
ssh-add ~/.ssh/id_rsa
If this still fails, try generating a new key pair and using that.
Why it works: ssh-add -d removes the specific key from the agent’s memory. Re-adding it forces the agent to re-parse and load the key, potentially fixing issues caused by a bad prior load.
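One way to separate a damaged key file from an agent problem is to ask ssh-keygen to derive the public key from the private key file, which only succeeds if the file can be parsed. This is a minimal sketch. key_parses is a hypothetical helper name. ssh-keygen prompts for the passphrase if the key is protected, and it may also reject a key whose permissions are too open, so run it after the permission fix in cause #1.

```shell
# key_parses KEYFILE: succeed if ssh-keygen can read KEYFILE as a private key
# and derive its public half. Error messages from ssh-keygen stay visible on
# stderr; the derived public key itself is discarded.
key_parses() {
    ssh-keygen -y -f "$1" >/dev/null
}

# Usage (not run automatically):
#   key_parses ~/.ssh/id_rsa || echo "id_rsa could not be parsed" >&2
```

If the file parses here but ssh-add still rejects it, the problem is more likely in the agent than in the key.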
5. SELinux or AppArmor Restrictions
Security-Enhanced Linux (SELinux) or AppArmor can sometimes prevent the ssh-agent or ssh client from accessing necessary files or sockets, especially in hardened environments or on systems where these security modules are aggressively configured.
Diagnosis: Check system logs for SELinux or AppArmor denials. On systems with SELinux:
sudo ausearch -m avc -ts recent
On systems with AppArmor:
sudo journalctl -xe | grep apparmor
Look for messages related to ssh-agent, ssh, or your home directory.
Fix:
This is highly system-specific. For SELinux, you might need to adjust contexts or booleans. For example, if ssh-agent is blocked from accessing home directories:
sudo semanage fcontext -a -t ssh_home_t "$HOME/.ssh(/.*)?"
sudo restorecon -Rv "$HOME/.ssh"
Note: Modifying SELinux/AppArmor policies requires caution and understanding of your system’s security posture. Consult your distribution’s documentation.
Why it works: SELinux/AppArmor enforce mandatory access control policies. Adjusting these policies allows the legitimate processes (ssh-agent, ssh) to perform their required operations (accessing keys, sockets).
6. ssh-agent Process Halted or Crashed
The ssh-agent process itself might have unexpectedly terminated due to a bug, system resource exhaustion, or a signal.
Diagnosis:
Check if the ssh-agent process is still running:
pgrep -u "$(id -u)" ssh-agent
If no PID is returned, the agent has stopped.
Fix: Restart the agent as described in cause #2:
eval $(ssh-agent -s)
ssh-add ~/.ssh/id_rsa
Why it works: A fresh ssh-agent process is started, providing a new, functional agent and socket for ssh to connect to.
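When SSH_AGENT_PID is set, the shell can also test that specific PID directly instead of searching by process name. A minimal sketch follows. agent_pid_alive is a hypothetical helper name. kill -0 delivers no signal and only reports whether the process exists and may be signalled by you, so it cannot tell whether a reused PID still belongs to ssh-agent.

```shell
# agent_pid_alive PID: succeed if PID is a positive number naming a live
# process that the current user is allowed to signal.
agent_pid_alive() {
    case "$1" in
        ''|0|*[!0-9]*) return 1 ;;   # empty, zero, or non-numeric
    esac
    kill -0 "$1" 2>/dev/null
}

# Usage (not run automatically):
#   agent_pid_alive "$SSH_AGENT_PID" || echo "agent process is gone" >&2
```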
After fixing these issues, ssh should be able to sign requests through the agent again. If authentication then fails with "Permission denied (publickey)", the agent is working and the remaining problem is on the remote side: your public key isn’t correctly configured on the server.