SIP is a surprisingly fragile protocol, and when calls go sideways, the problem usually isn’t with the audio itself but with the handshake that sets it up.

Let’s watch a SIP call unfold in real time.

Imagine two SIP phones, 10.0.0.1 and 10.0.0.2, trying to establish a call. The Session Initiation Protocol (SIP) is the signaling language they use, and tcpdump is our eavesdropper. We’ll filter for just SIP traffic, usually UDP port 5060 (though sometimes TCP 5060 or other ports like 5061 for TLS).

sudo tcpdump -i eth0 -n -s0 -w sip_capture.pcap udp port 5060

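Here, -i eth0 specifies the network interface, udp port 5060 filters for SIP traffic, -n prevents DNS lookups (keeping IPs visible), -s0 captures the full packet, and -w sip_capture.pcap writes it to a file for later analysis. Because -w writes raw packets instead of printing them, read the file back with tcpdump -r sip_capture.pcap -A (or open it in Wireshark) to see the SIP messages as text.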

When 10.0.0.1 initiates a call, it sends an INVITE request.

10.0.0.1 -> 10.0.0.2:

INVITE sip:bob@10.0.0.2 SIP/2.0
Via: SIP/2.0/UDP 10.0.0.1:5060
From: <sip:alice@10.0.0.1>
To: <sip:bob@10.0.0.2>
Call-ID: 12345@10.0.0.1
CSeq: 1 INVITE
Contact: <sip:alice@10.0.0.1:5060>
Content-Type: application/sdp
Content-Length: 200

v=0
o=alice 12345 12345 IN IP4 10.0.0.1
s=SIP Call
c=IN IP4 10.0.0.1
t=0 0
m=audio 4000 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000

This INVITE contains the Session Description Protocol (SDP) payload, detailing the media types (audio in this case), codecs (PCMU/PCMA), and the IP address and port (4000) where the caller expects to receive the RTP audio stream.
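
To make the SDP fields concrete, here is a minimal Python sketch that pulls the connection address, audio port, and codec map out of an SDP body. The function name parse_sdp and the returned dictionary layout are assumptions made for illustration, and the sketch only handles a single session-level c= line and one audio m= line, as in the example above.

```python
def parse_sdp(body):
    """Extract the connection address, audio port, and codec map from an SDP body."""
    info = {"address": None, "port": None, "payload_types": [], "codecs": {}}
    for line in body.splitlines():
        line = line.strip()
        if line.startswith("c="):
            # c=<nettype> <addrtype> <connection-address>
            info["address"] = line[2:].split()[2]
        elif line.startswith("m=audio"):
            # m=<media> <port> <proto> <fmt> ...
            fields = line[2:].split()
            info["port"] = int(fields[1])
            info["payload_types"] = [int(pt) for pt in fields[3:]]
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <encoding name>/<clock rate>
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            info["codecs"][int(pt)] = encoding
    return info


# SDP body copied from the INVITE above.
INVITE_SDP = """v=0
o=alice 12345 12345 IN IP4 10.0.0.1
s=SIP Call
c=IN IP4 10.0.0.1
t=0 0
m=audio 4000 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
"""

print(parse_sdp(INVITE_SDP))
```

Running the same function on the SDP from each side of a call gives you the two address and port pairs that RTP should flow between.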

10.0.0.2 may first send a 100 Trying to show that it has received the INVITE and is processing it, then a 180 Ringing once it is alerting the user (the phone is ringing but has not been answered yet).

10.0.0.2 -> 10.0.0.1:

SIP/2.0 180 Ringing
Via: SIP/2.0/UDP 10.0.0.1:5060
From: <sip:alice@10.0.0.1>
To: <sip:bob@10.0.0.2>;tag=abcde
Call-ID: 12345@10.0.0.1
CSeq: 1 INVITE
Contact: <sip:bob@10.0.0.2:5060>
Content-Length: 0

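Notice the tag=abcde in the To header. The callee adds it in its response, and together with the Call-ID and the caller’s From tag it uniquely identifies this dialog, which matters when a forked INVITE reaches several phones and each one answers with its own tag.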

Once 10.0.0.2 answers, it sends a 200 OK. This response also contains an SDP payload, but this time it specifies the IP address and port (5000) where 10.0.0.2 expects to receive audio.

10.0.0.2 -> 10.0.0.1:

SIP/2.0 200 OK
Via: SIP/2.0/UDP 10.0.0.1:5060
From: <sip:alice@10.0.0.1>
To: <sip:bob@10.0.0.2>;tag=abcde
Call-ID: 12345@10.0.0.1
CSeq: 1 INVITE
Contact: <sip:bob@10.0.0.2:5060>
Content-Type: application/sdp
Content-Length: 200

v=0
o=bob 54321 54321 IN IP4 10.0.0.2
s=SIP Call
c=IN IP4 10.0.0.2
t=0 0
m=audio 5000 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000

The caller (10.0.0.1) acknowledges this with an ACK.

10.0.0.1 -> 10.0.0.2:

ACK sip:bob@10.0.0.2:5060 SIP/2.0
Via: SIP/2.0/UDP 10.0.0.1:5060
From: <sip:alice@10.0.0.1>
To: <sip:bob@10.0.0.2>;tag=abcde
Call-ID: 12345@10.0.0.1
CSeq: 1 ACK
Content-Length: 0

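At this point, the signaling is complete. The phones now know each other’s IP addresses and RTP ports (4000 and 5000) and can start sending audio directly to each other. The actual voice data travels as RTP in UDP packets on these ports. A capture filter that only matches port 5060 will not show them, so widen it to see the media, for example udp port 5060 or udp port 4000 or udp port 5000.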
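
If you want to confirm that the UDP packets on the media ports really are RTP, you can decode the 12-byte fixed RTP header at the start of each UDP payload. The following Python sketch does that with the standard struct module. The function name parse_rtp_header and the sample packet are hypothetical, and header extensions and CSRC lists are ignored.

```python
import struct

# Static payload types offered in this call's SDP.
PAYLOAD_NAMES = {0: "PCMU", 8: "PCMA"}


def parse_rtp_header(packet):
    """Decode the 12-byte fixed RTP header at the start of a UDP payload."""
    if len(packet) < 12:
        raise ValueError("too short to be RTP")
    first, second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,          # top two bits, 2 for RTP
        "marker": bool(second & 0x80),  # top bit of the second byte
        "payload_type": second & 0x7F,  # lower seven bits
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Hypothetical packet: version 2, payload type 0, sequence 1, timestamp 160,
# followed by 160 bytes standing in for 20 ms of PCMU audio.
sample = struct.pack("!BBHII", 0x80, 0, 1, 160, 0xDEADBEEF) + b"\xff" * 160
header = parse_rtp_header(sample)
print(header["version"], PAYLOAD_NAMES.get(header["payload_type"]), header["sequence"])
```

A version other than 2, or a payload type that neither side offered in its SDP, is a hint that the traffic on that port is not the stream you expected.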

The most surprising thing about SIP signaling is how much information is packed into plain text headers, making it incredibly easy to debug with tools like tcpdump or Wireshark, provided you know what to look for.

When a call fails, the tcpdump trace is invaluable for pinpointing where the handshake broke. Did the INVITE not arrive? Did the 200 OK never get sent or received? Was the SDP information incorrect, leading to no audio path? Examining the sequence of requests and responses, and crucially, the presence and content of the SDP, is the key.
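
Those questions can be turned into a simple checklist. The Python sketch below takes the start lines of the SIP messages seen in a capture, in order, and names the first step that is missing. The function name diagnose and its return strings are made up for this example, and it assumes that every line belongs to the same call, which a real tool would verify through Call-ID and CSeq.

```python
def diagnose(start_lines):
    """Name the first missing step of a basic INVITE / 200 OK / ACK exchange."""
    if not any(line.startswith("INVITE ") for line in start_lines):
        return "no INVITE in capture"

    final_code = None
    for line in start_lines:
        if line.startswith("SIP/2.0 "):
            code = int(line.split()[1])
            if code >= 200:  # 1xx responses are provisional, keep looking
                final_code = code
                break

    if final_code is None:
        return "INVITE seen, but no final response"
    if final_code >= 300:
        return f"call rejected or redirected with {final_code}"
    if not any(line.startswith("ACK ") for line in start_lines):
        return "2xx answer seen, but no ACK"
    return "handshake complete"


# Start lines from a capture in which the ACK never shows up.
trace = [
    "INVITE sip:bob@10.0.0.2 SIP/2.0",
    "SIP/2.0 180 Ringing",
    "SIP/2.0 200 OK",
]
print(diagnose(trace))
```

Running the check on captures taken at both ends of the call shows whether a message was never sent or was lost on the way.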

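The Call-ID header, combined with the From and To tags, uniquely identifies a dialog (a specific call leg). The example messages above are simplified and omit the From tag and the Via branch parameter that real SIP traffic carries. If you see responses whose Call-ID or CSeq does not match any request in the capture, that points to a serious signaling problem, often packets dropped in the network or a misconfigured SIP proxy.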
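
As a rough illustration of that matching, the Python sketch below builds a dialog key from a raw SIP message so that requests and responses can be grouped. The helper names are hypothetical, the header parsing is deliberately naive (no folded lines, no compact header names), and the From tag in the sample is an invented value added for the example.

```python
import re


def parse_headers(message):
    """Return a dict of lower-cased header names to values (last one wins)."""
    headers = {}
    for line in message.splitlines()[1:]:  # skip the start line
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def tag_of(header_value):
    """Extract the tag parameter from a From or To header value, if present."""
    match = re.search(r";tag=([^;>\s]+)", header_value)
    return match.group(1) if match else None


def dialog_key(message):
    """Call-ID plus From tag plus To tag identifies the dialog."""
    headers = parse_headers(message)
    return (
        headers.get("call-id"),
        tag_of(headers.get("from", "")),
        tag_of(headers.get("to", "")),
    )


# The 180 Ringing from above, with a hypothetical From tag added.
ringing = """SIP/2.0 180 Ringing
Via: SIP/2.0/UDP 10.0.0.1:5060
From: <sip:alice@10.0.0.1>;tag=xyz789
To: <sip:bob@10.0.0.2>;tag=abcde
Call-ID: 12345@10.0.0.1
CSeq: 1 INVITE
Content-Length: 0

"""
print(dialog_key(ringing))
```

Grouping every message in a capture by this key makes stray responses stand out, because they end up in a group with no matching request.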

If you’re not seeing RTP traffic after a successful 200 OK and ACK, the problem is most likely in the SDP or in the network path between the media addresses. Check that the IP addresses and ports in the c= and m= lines of the SDP are correct and reachable, and that no firewall blocks those UDP ports. A common mistake is advertising the SIP signaling IP address in the SDP when the media should flow through a different network interface or a NAT-ed public IP.
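
One quick sanity check is to compare the address in the SDP c= line with the source address the SIP packet actually arrived from. The Python sketch below flags the classic NAT symptom, a private (RFC 1918) address advertised in a message that came from a non-private source. The function names are hypothetical and 203.0.113.5 is a documentation address used only for the example.

```python
import ipaddress

# Private IPv4 ranges defined in RFC 1918.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918)


def sdp_connection_address(sdp):
    """Return the address from the first c= line, or None if there is none."""
    for line in sdp.splitlines():
        if line.startswith("c="):
            return line.split()[-1]
    return None


def media_address_warnings(packet_source, sdp):
    """Compare the SDP media address with the IP the SIP packet came from."""
    advertised = sdp_connection_address(sdp)
    if advertised is None:
        return ["SDP has no c= line"]
    warnings = []
    if is_rfc1918(advertised) and not is_rfc1918(packet_source):
        warnings.append(
            f"SDP advertises private address {advertised}, but the packet "
            f"came from {packet_source}: media sent to the SDP address "
            "will probably not arrive"
        )
    return warnings


print(media_address_warnings("203.0.113.5", "c=IN IP4 10.0.0.1\nm=audio 4000 RTP/AVP 0 8\n"))
```

An empty list does not prove the media path works. It only rules out this one mismatch.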

The OPTIONS method, often overlooked, is used by SIP devices to query each other’s capabilities. If a device isn’t responding to OPTIONS requests, it might indicate it’s down, firewalled, or misconfigured, preventing it from even participating in call setup.
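
To see what such a probe looks like on the wire, here is a Python sketch that builds a minimal OPTIONS request as text. The function name build_options, the probe user name, and the randomly generated branch, tag, and Call-ID values are all assumptions for illustration. The header set follows what a SIP request needs (Via with a branch, Max-Forwards, From with a tag, To, Call-ID, CSeq), but a given device may expect more, such as a Contact or User-Agent header.

```python
import uuid


def build_options(target_host, local_host, local_port=5060, user="probe"):
    """Build a minimal SIP OPTIONS request as a string with CRLF line endings."""
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]  # branch IDs start with this magic cookie
    lines = [
        f"OPTIONS sip:{target_host} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{user}@{local_host}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{target_host}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",
        "",  # the two empty entries produce the blank line that ends the headers
    ]
    return "\r\n".join(lines)


message = build_options("10.0.0.2", "10.0.0.1")
print(message)
```

Sending these bytes to UDP port 5060 on the target while tcpdump is running shows whether any response comes back, which separates a dead or firewalled device from one that is reachable but rejecting calls.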

The Via header is critical for routing responses back to the correct client. If a proxy modifies the Via header incorrectly or if packets are arriving on a different interface than advertised, responses can get lost.
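
To check where a response should be heading, it helps to split the Via value into its parts. The Python sketch below does this and then applies the usual rule that the received and rport parameters, which a server adds when the packet’s real source differs from what the Via claims, override the advertised host and port. The function names are hypothetical, and IPv6 literals and multiple comma-separated Via values are not handled.

```python
def parse_via(value):
    """Split a Via header value into transport, sent-by host/port, and parameters."""
    sent_protocol, _, rest = value.strip().partition(" ")
    transport = sent_protocol.split("/")[-1].upper()
    sent_by, *raw_params = [part.strip() for part in rest.split(";")]
    host, _, port = sent_by.partition(":")
    default_port = 5061 if transport == "TLS" else 5060
    # Each parameter is name=value; a bare flag such as "rport" gets an empty value.
    params = dict(p.partition("=")[::2] for p in raw_params if p)
    return {
        "transport": transport,
        "host": host,
        "port": int(port) if port else default_port,
        "params": params,
    }


def response_target(via):
    """Address a response is sent to: received/rport override the sent-by values."""
    host = via["params"].get("received", via["host"])
    rport = via["params"].get("rport")
    return (host, int(rport) if rport else via["port"])


via = parse_via("SIP/2.0/UDP 10.0.0.1:5060;branch=z9hG4bKabc;received=192.0.2.7;rport=40000")
print(via["transport"], response_target(via))
```

If the target computed here is not the address the client is actually listening on, the response will be sent somewhere the client never sees it.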

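Finally, remember that SIP can run over TCP or TLS, not just UDP. If you’re not seeing traffic on UDP 5060, check for TCP 5060 or TCP 5061 (the default port for SIP over TLS, used with sips: URIs). The tcpdump filter would then change to tcp port 5060 or tcp port 5061. Keep in mind that TLS traffic is encrypted, so a capture on 5061 shows the connection but not readable SIP headers.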
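
Since the right filter depends on which transports and media ports are in play, a small helper can assemble it. This Python sketch builds a capture filter expression that covers SIP on any transport plus a list of RTP ports. The function name capture_filter is hypothetical, and the interface and file name in the command simply reuse the values from earlier in this article.

```python
def capture_filter(sip_ports=(5060,), rtp_ports=()):
    """Build a tcpdump filter for SIP signaling plus optional RTP media ports."""
    # "port N" without a protocol keyword matches both UDP and TCP.
    parts = [f"port {port}" for port in sip_ports]
    # RTP runs over UDP, so restrict the media ports to that protocol.
    parts += [f"udp port {port}" for port in rtp_ports]
    return " or ".join(parts)


expression = capture_filter(sip_ports=(5060, 5061), rtp_ports=(4000, 5000))

# Argument list suitable for subprocess.run(); options come before the filter.
command = ["tcpdump", "-i", "eth0", "-n", "-s0", "-w", "sip_capture.pcap", expression]
print(" ".join(command))
```

The RTP ports change from call to call, so in practice you either read them from the SDP first or capture a wider UDP port range.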

The next hurdle you’ll face is understanding how SIP proxies and NAT traversal complicate this seemingly simple exchange.
