You can use tcpdump to measure traffic by host, but it is easy to overlook that tcpdump itself can become a bottleneck if you’re not careful with your filtering.
Let’s see it in action. Imagine you have a server and you want to know which of your clients is hogging the network. You’d typically run something like this on the server:
sudo tcpdump -i eth0 -w - 'host 192.168.1.100 or host 192.168.1.101 or host 192.168.1.102' > client_traffic.pcap
This captures all traffic going to or from those three client IPs on the eth0 interface and writes it to a file. Then, you’d analyze client_traffic.pcap by reading it back with tcpdump -r or opening it in Wireshark.
But how do you get from raw packets to "who’s using the bandwidth"? You need to aggregate the data. A common way is to use tcpdump to filter and then awk to sum up the packet sizes for each host.
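As a minimal sketch of that aggregation step for a saved capture, the shell function below sums bytes per host from tcpdump’s text output. The name sum_by_host is made up for illustration, and it assumes IPv4 lines in the one-line format tcpdump prints with -n. The value it sums is the "length" that tcpdump reports on each line (the TCP payload size for TCP packets), not the full frame size.

```shell
# Hypothetical helper: read tcpdump -n text lines on stdin and print
# "host total" pairs, largest total first.
sum_by_host() {
    awk '
    # Turn "a.b.c.d.port:" into "a.b.c.d"; leave port-less addresses alone.
    function host(addr) {
        sub(/:$/, "", addr)
        if (addr ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/)
            sub(/\.[0-9]+$/, "", addr)
        return addr
    }
    # Assumed line shape: TIME IP SRC.PORT > DST.PORT: ..., length N
    $2 == "IP" && $4 == ">" {
        src = host($3)
        dst = host($5)
        len = 0
        # "length N" is not always the last field, so search for the keyword.
        for (i = 6; i < NF; i++)
            if ($i == "length") { len = $(i + 1) + 0; break }
        bytes[src] += len
        bytes[dst] += len
    }
    END {
        for (ip in bytes) printf "%s %d\n", ip, bytes[ip]
    }' | sort -k2,2nr -k1,1
}
```

With the capture from earlier, this could be run as tcpdump -n -r client_traffic.pcap | sum_by_host. Each packet is credited to both endpoints, so the server’s own address shows up in the list as well.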
Here’s a more direct approach to get a real-time breakdown:
sudo tcpdump -i eth0 -n -l 'ip and tcp' | gawk '
# Strip the trailing colon and the port from an "a.b.c.d.port:" token.
function host(addr) {
    sub(/:$/, "", addr);
    sub(/\.[0-9]+$/, "", addr);
    return addr;
}
BEGIN {
    interval = 10;               # seconds between reports
    last_report = systime();     # systime() is a gawk extension
}
# Expected line shape: TIME IP SRC.PORT > DST.PORT: Flags [...], ..., length N
$2 == "IP" && $4 == ">" {
    src_ip = host($3);
    dst_ip = host($5);
    # tcpdump prints the TCP payload size as "length N". It is not always the
    # last field (a decoded protocol can follow it), so search for the keyword.
    packet_len = 0;
    for (i = 6; i < NF; i++) {
        if ($i == "length") { packet_len = $(i + 1) + 0; break; }
    }
    # Credit the bytes to any endpoint in the client subnet, so each total
    # covers traffic both to and from that host.
    if (dst_ip ~ /^192\.168\.1\./) {    # traffic to this host
        host_traffic[dst_ip] += packet_len;
    }
    if (src_ip ~ /^192\.168\.1\./) {    # traffic from this host
        host_traffic[src_ip] += packet_len;
    }
}
# The timer is only checked when a line arrives, so an idle link delays the report.
systime() - last_report >= interval {
    print "--- Bandwidth Report (Last " interval "s) ---";
    for (ip in host_traffic) {
        printf "%-15s: %10d bytes\n", ip, host_traffic[ip];
    }
    delete host_traffic;         # reset for the next interval
    last_report = systime();
    print "-----------------------------------\n";
}
'
This command pipes tcpdump’s text output into gawk. The -n flag prevents tcpdump from resolving hostnames, which keeps DNS lookups from slowing the capture, and -l makes the output line-buffered so gawk sees each packet as it arrives. There is deliberately no -w - here: that flag writes binary pcap data, which awk cannot parse as text. -s 0 is dropped as well, because recent tcpdump versions already default to a snapshot length of 262144 bytes. gawk strips the port from each address, reads the length value, and maintains a running total for each IP address in 192.168.1.x. Every 10 seconds, it prints the current totals and resets them. The script calls gawk explicitly because systime() is a gawk extension, and its comments avoid apostrophes because one would end the single-quoted program early. Keep in mind that the length value is the TCP payload size, so the totals exclude Ethernet, IP, and TCP header bytes, and a server in the same subnet appears in the report alongside its clients.
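The report prints byte counts per interval, which are easier to compare against link capacity once converted to a rate. The helper below is a small sketch of that arithmetic; to_mbps is a hypothetical name, and it uses decimal megabits (1 Mbit = 1,000,000 bits).

```shell
# Hypothetical helper: convert a byte count over an interval to Mbit/s.
# Usage: to_mbps BYTES SECONDS
to_mbps() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.2f\n", bytes * 8 / secs / 1000000 }'
}

# 12,500,000 bytes in a 10-second interval is 10 Mbit/s.
to_mbps 12500000 10    # prints 10.00
```

Because the byte counts come from payload lengths, the resulting rate slightly understates what is actually on the wire.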
The core problem tcpdump helps you solve here is visibility into network traffic patterns. Without it, you’re blind to which hosts are consuming bandwidth, making troubleshooting or capacity planning incredibly difficult. You can see protocols, ports, and the sheer volume of data moving between specific IP addresses.
The mental model is straightforward: tcpdump acts as a highly efficient packet tap. You apply filters (host X, port Y, tcp, udp) to reduce the noise. The output is a stream of packet metadata. Tools like awk or Wireshark then process this stream to aggregate, analyze, and visualize the data. The key is that tcpdump itself is just the collector; the real analysis happens downstream.
A common pitfall is trying to capture everything and then filter in Wireshark. On a busy network, the kernel’s capture buffer can overflow and packets get dropped before they ever reach your analysis tool, or the capture itself can consume so much CPU that it distorts your measurements. tcpdump compiles its filter expression into a BPF program that runs in the kernel, so packets that do not match are never copied to tcpdump at all. The more specific your filter, the less work the capture has to do, and the more accurate your data will be. For instance, instead of tcpdump -i eth0 -w all.pcap, use tcpdump -i eth0 -w client_a.pcap 'host 192.168.1.100'.
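When several clients are of interest, the filter expression can be assembled from a list of addresses instead of being typed by hand. This is only a sketch: build_host_filter is a hypothetical helper name, and it does no validation of the addresses it is given.

```shell
# Hypothetical helper: join its arguments into "host A or host B or ...".
build_host_filter() {
    filter=""
    for ip in "$@"; do
        if [ -z "$filter" ]; then
            filter="host $ip"
        else
            filter="$filter or host $ip"
        fi
    done
    printf '%s\n' "$filter"
}

# Example use (needs root, so it is left commented out here):
#   sudo tcpdump -i eth0 -n "$(build_host_filter 192.168.1.100 192.168.1.101)"
```

Passing the result as a single quoted argument keeps the whole expression together, the same way the hand-written filters above are quoted.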
When you’re dealing with very high-speed interfaces (10Gbps+), even tcpdump with aggressive filtering can struggle. In such scenarios, dedicated network monitoring appliances or kernel-level tapping mechanisms that offload packet processing might be necessary.
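One way to tell whether a capture is struggling is the summary tcpdump prints to standard error when it exits, which includes a "packets dropped by kernel" line. The sketch below pulls that number out of a saved copy of the summary. kernel_drops and capture.log are illustrative names, not part of tcpdump.

```shell
# Hypothetical helper: read tcpdump's exit summary on stdin and print the
# "dropped by kernel" count (0 if the line is missing).
kernel_drops() {
    awk '/packets? dropped by kernel/ { n = $1 } END { print n + 0 }'
}

# Example use (needs root, so it is left commented out here):
#   sudo tcpdump -i eth0 -n -w client_a.pcap 'host 192.168.1.100' 2> capture.log
#   kernel_drops < capture.log
```

A non-zero count means the per-host totals from that capture are an undercount, which is a hint to narrow the filter or move to one of the heavier-duty options above.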
The next concept you’ll run into is analyzing the payload of the traffic, not just the volume, which requires deeper packet inspection and often tools designed for application-layer analysis.