The most surprising thing about Wireshark’s VoIP analysis is how much raw data you can glean about call quality directly from packet captures, even without specialized monitoring tools.

Let’s walk through a SIP and RTP session, using traffic captured during a choppy phone call.

# Packet 10: SIP INVITE
SIP/SDP: Request: INVITE sip:alice@example.com SIP/2.0
  Via: SIP/2.0/UDP 192.168.1.100:5060;branch=z9hG4bK12345
  From: <sip:bob@192.168.1.100>;tag=9876
  To: <sip:alice@example.com>
  Call-ID: abcdef123456@192.168.1.100
  CSeq: 1 INVITE
  Content-Type: application/sdp
  Content-Length: 150

  v=0
  o=bob 12345 67890 IN IP4 192.168.1.100
  s=VoIP Call
  c=IN IP4 192.168.1.100
  t=0 0
  m=audio 16456 RTP/AVP 0 8 101
  a=rtpmap:0 PCMU/8000
  a=rtpmap:8 PCMA/8000
  a=rtpmap:101 telephone-event/8000
  a=fmtp:101 0-16

This INVITE initiates the call. The Via header shows the originating IP address and port (192.168.1.100:5060), and the From header identifies the caller. The Call-ID is crucial for correlating subsequent packets in the same dialog. The SDP body is Bob’s media offer: m=audio 16456 RTP/AVP 0 8 101 tells us that Bob expects to receive RTP audio on UDP port 16456, and that he supports PCMU (G.711 mu-law), PCMA (G.711 A-law), and telephone-event for DTMF tones.
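To make the SDP fields concrete, here is a minimal Python sketch that pulls the connection address, audio port, and codec map out of the offer shown above. The function name parse_sdp_audio and the returned dictionary layout are illustrative assumptions, not part of Wireshark or any SIP library, and the parser only handles the simple single-stream case from this example.

```python
def parse_sdp_audio(sdp_text):
    """Return address, audio port, payload types, and codec map from an SDP body."""
    info = {"address": None, "port": None, "payload_types": [], "codecs": {}}
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if line.startswith("c="):
            # c=<nettype> <addrtype> <address>
            info["address"] = line.split()[-1]
        elif line.startswith("m=audio"):
            # m=audio <port> <proto> <payload type list>
            parts = line.split()
            info["port"] = int(parts[1])
            info["payload_types"] = [int(p) for p in parts[3:]]
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <encoding>/<clock rate>
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            info["codecs"][int(pt)] = encoding
    return info


# SDP body copied from the INVITE in packet 10.
INVITE_SDP = """\
v=0
o=bob 12345 67890 IN IP4 192.168.1.100
s=VoIP Call
c=IN IP4 192.168.1.100
t=0 0
m=audio 16456 RTP/AVP 0 8 101
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
"""

offer = parse_sdp_audio(INVITE_SDP)
print(offer["address"], offer["port"], offer["codecs"])
```

The address and port returned here are where Bob wants to receive media, which is the pair to look for as the destination of Alice’s RTP packets.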

# Packet 15: SIP 200 OK
SIP/SDP: Status: 200 OK
  Via: SIP/2.0/UDP 192.168.1.100:5060;branch=z9hG4bK12345;received=192.168.1.100
  From: <sip:bob@192.168.1.100>;tag=9876
  To: <sip:alice@example.com>;tag=54321
  Call-ID: abcdef123456@192.168.1.100
  CSeq: 1 INVITE
  Content-Type: application/sdp
  Content-Length: 145

  v=0
  o=alice 98765 43210 IN IP4 192.168.1.50
  s=VoIP Call
  c=IN IP4 192.168.1.50
  t=0 0
  m=audio 49170 RTP/AVP 0 8 101
  a=rtpmap:0 PCMU/8000
  a=rtpmap:8 PCMA/8000
  a=rtpmap:101 telephone-event/8000
  a=fmtp:101 0-16

This 200 OK confirms that Alice has answered. The received parameter in the Via header records the source IP address from which the server (or peer) saw the INVITE arrive, here Bob’s address, 192.168.1.100. The To tag is now present; together with the Call-ID and the From tag, it identifies the dialog. Crucially, the SDP answer shows where Alice expects to receive audio: m=audio 49170 RTP/AVP on her IP address, 192.168.1.50.
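With both SDP bodies in hand, the two receive endpoints can be turned into a display filter. The sketch below is one possible way to do that in Python; rtp_display_filter is a hypothetical helper name, while ip.dst and udp.dstport are standard Wireshark display filter fields.

```python
def rtp_display_filter(endpoints):
    """Build a display filter matching UDP packets sent to any advertised endpoint."""
    clauses = [
        f"(ip.dst == {addr} && udp.dstport == {port})" for addr, port in endpoints
    ]
    return " || ".join(clauses)


# Each SDP body advertises where that party wants to RECEIVE media.
bob = ("192.168.1.100", 16456)    # c= and m=audio lines of the INVITE
alice = ("192.168.1.50", 49170)   # c= and m=audio lines of the 200 OK

display_filter = rtp_display_filter([bob, alice])
print(display_filter)
```

Filtering on the destination rather than on either port keeps the two directions of the call distinguishable, which helps when only one direction sounds bad.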

Now the actual voice data flows via RTP. We can filter for the RTP packets and then open Wireshark’s RTP stream analysis.

To see the RTP stream:

  1. Filter for udp.port == 16456 (or the port specified in the 200 OK).
  2. Select one of the RTP packets in the packet list.
  3. Choose "Telephony" -> "RTP" -> "RTP Stream Analysis" (labelled "Stream Analysis" in older releases).

This opens a window that lists each packet in the stream with its sequence number, the delta from the previous packet, and the running jitter, along with summary statistics such as the number of lost packets.

# RTP Packet (Bob to Alice)
Frame 25: UDP: Source Port: 16456  Destination Port: 49170
RTP: Payload type: 0 (PCMU)  Sequence number: 100  Timestamp: 123456789
  (Audio data - raw bytes)
# RTP Packet (Alice to Bob)
Frame 28: UDP: Source Port: 49170  Destination Port: 16456
RTP: Payload type: 0 (PCMU)  Sequence number: 200  Timestamp: 987654321
  (Audio data - raw bytes)
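The fields Wireshark shows for each RTP packet come from the 12-byte fixed header defined in RFC 3550. As a rough illustration, the Python sketch below decodes that header from raw bytes. The sample packet is synthetic: its sequence number and timestamp mirror frame 25 above, but the SSRC value and the payload bytes are made up for the example.

```python
import struct


def parse_rtp_header(packet):
    """Decode the 12-byte fixed RTP header (RFC 3550, section 5.1)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,            # should be 2
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,     # 0 = PCMU, 8 = PCMA
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Synthetic packet: version 2, payload type 0, then 160 bytes of payload
# (20 ms of G.711). The SSRC 0x1234ABCD is a hypothetical value.
sample = struct.pack("!BBHII", 0x80, 0, 100, 123456789, 0x1234ABCD) + b"\xff" * 160

header = parse_rtp_header(sample)
print(header["payload_type"], header["sequence"], header["timestamp"])
```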

Wireshark’s RTP Stream Analysis window is invaluable. It reports packet loss and jitter (variation in packet arrival time), and the RTP Player can attempt to play back the audio for codecs it can decode, such as G.711. Look for:

  • Sequence Number Gaps: These indicate packet loss. If sequence number 100 is followed by 102, packet 101 is missing (or arrived out of order).
  • Jitter: Wireshark calculates this per stream. High jitter means packets are arriving erratically, which causes choppy audio once it exceeds what the receiver’s jitter buffer can absorb.
  • Inter-arrival Time: The time between consecutive packets. Deviations from the expected interval (e.g., 20 ms for G.711 at 50 packets/sec) point to network congestion or processing delays.
  • Payload Type Mismatch: If one side sends PCMU and the other decodes it as PCMA, audio will be garbled.
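The loss and jitter checks in the list above can be sketched in a few lines of Python. This is a simplified illustration, assuming the interarrival jitter estimator from RFC 3550 and packets supplied in capture order; it is not Wireshark’s own code, and it does not treat reordered or duplicated packets specially. The function name analyze_stream and the sample values are hypothetical.

```python
def analyze_stream(packets, clock_rate=8000):
    """Estimate loss and jitter for one RTP stream.

    packets: list of (arrival_seconds, sequence, rtp_timestamp) in capture order.
    Returns (lost_packets, jitter_in_milliseconds).
    """
    lost = 0
    jitter = 0.0  # running estimate, in RTP timestamp units
    prev = None
    for arrival, seq, ts in packets:
        if prev is not None:
            p_arrival, p_seq, p_ts = prev
            # Sequence numbers are 16 bits wide and wrap around.
            gap = (seq - p_seq) & 0xFFFF
            if 1 < gap < 0x8000:
                lost += gap - 1
            # Signed 32-bit difference between the RTP timestamps.
            ts_delta = ((ts - p_ts + 0x80000000) & 0xFFFFFFFF) - 0x80000000
            # D = (arrival spacing) - (timestamp spacing), in timestamp units.
            d = (arrival - p_arrival) * clock_rate - ts_delta
            # RFC 3550: J = J + (|D| - J) / 16
            jitter += (abs(d) - jitter) / 16
        prev = (arrival, seq, ts)
    return lost, jitter / clock_rate * 1000


# Hypothetical G.711 stream: 160 timestamp units per 20 ms packet.
# Sequence 102 is missing and sequence 103 arrives 5 ms late.
stream = [
    (0.000, 100, 123456789),
    (0.020, 101, 123456949),
    (0.065, 103, 123457269),
]

lost, jitter_ms = analyze_stream(stream)
print(lost, round(jitter_ms, 4))
```

The division by 16 smooths the estimate, so a single late packet only nudges the jitter value, while sustained irregular arrival drives it up steadily.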

The overall workflow is to dissect the SIP signaling to identify the RTP streams (IP addresses, ports, codecs) and then analyze the RTP packets themselves. SIP sets up the call parameters, and RTP carries the actual media. Wireshark’s strength is that it can follow these streams and present metrics like packet loss and jitter directly from the captured packets.

One detail that is often overlooked is the impact of NAT. If your VoIP endpoints are behind Network Address Translation (NAT), the IP addresses and ports in the SIP Via header and the SDP c= line may be private (e.g., 192.168.1.x). In a capture taken on the public side of the NAT device, however, the RTP packets carry the device’s public IP address as their source, and often a translated port as well. Wireshark relies largely on the addresses and ports advertised in the SDP to recognize RTP, so in that situation it may not link the media to the call automatically, and you may need to mark the UDP flow as RTP yourself with "Decode As". Either way, keep in mind that the source IP in the packet header may not be the end user’s actual address.
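A quick way to spot this situation is to compare the address advertised in the SDP with the source address actually seen on the RTP packets. The Python sketch below uses the standard ipaddress module for that check; the helper name check_nat_hint and the observed address 203.0.113.10 (from a documentation-only range) are assumptions made for illustration.

```python
import ipaddress


def check_nat_hint(sdp_address, observed_source):
    """Compare the SDP c= address with the source IP seen on the RTP packets."""
    sdp_ip = ipaddress.ip_address(sdp_address)
    src_ip = ipaddress.ip_address(observed_source)
    return {
        "sdp_is_private": sdp_ip.is_private,  # e.g. an RFC 1918 address
        "mismatch": sdp_ip != src_ip,         # a mismatch suggests translation
    }


# Bob advertises 192.168.1.100 in his SDP; the capture (hypothetically taken
# outside his NAT device) shows his RTP arriving from 203.0.113.10.
hint = check_nat_hint("192.168.1.100", "203.0.113.10")
print(hint)
```

A private SDP address combined with a mismatch is a strong hint that NAT sits between the capture point and the endpoint.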

Beyond basic playback and loss, Wireshark can also dissect RTCP (RTP Control Protocol) packets. RTCP reports carry statistics about the RTP session, including packet loss and jitter, plus timing fields from which round-trip delay can be derived, offering even deeper insight into call quality.
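To show where those statistics live, here is a hedged Python sketch that decodes a single 24-byte RTCP report block as laid out in RFC 3550, section 6.4.1. The function name parse_report_block and every value in the sample block are invented for the example, and the 8000 Hz clock rate assumes a G.711 stream.

```python
import struct


def parse_report_block(block, clock_rate=8000):
    """Decode one 24-byte RTCP report block (RFC 3550, section 6.4.1)."""
    if len(block) < 24:
        raise ValueError("RTCP report block must be at least 24 bytes")
    ssrc, lost_word, highest_seq, jitter, lsr, dlsr = struct.unpack(
        "!IIIIII", block[:24]
    )
    cumulative_lost = lost_word & 0xFFFFFF
    if cumulative_lost & 0x800000:
        cumulative_lost -= 1 << 24  # the field is a signed 24-bit integer
    return {
        "ssrc": ssrc,
        "fraction_lost": (lost_word >> 24) / 256,  # loss since the last report
        "cumulative_lost": cumulative_lost,
        "highest_sequence": highest_seq,
        "jitter_ms": jitter / clock_rate * 1000,   # jitter is in timestamp units
        "last_sr": lsr,
        "delay_since_last_sr_s": dlsr / 65536,     # DLSR is in 1/65536 seconds
    }


# Hypothetical report: 26/256 of packets lost since the last report, 5 lost in
# total, jitter of 20 timestamp units, DLSR of half a second.
sample_block = struct.pack(
    "!IIIIII", 0x1234ABCD, (26 << 24) | 5, 103, 20, 0, 32768
)

report = parse_report_block(sample_block)
print(report["fraction_lost"], report["cumulative_lost"], report["jitter_ms"])
```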
