UDP can be made reliable, and a reliability layer built on top of it is a common choice for high-performance systems that need guaranteed delivery without TCP’s latency costs.
Let’s see UDP in action, but not in a way you’d expect. Imagine a real-time multiplayer game. Players are moving, shooting, and their actions need to be reflected on everyone else’s screen immediately. If a packet saying "Player X moved left" gets lost, waiting for a retransmission would cause a stutter, a lag spike, a visible jump. Instead, the game engine sends the latest position update. The lost packet is irrelevant because a newer one is already on its way. This is the "unreliable" part, and it’s a feature, not a bug, for this use case.
```python
import socket

server_address = ('localhost', 12345)

# Receiver (e.g., game client): bind first, so the datagram has somewhere to land
receiver_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver_socket.bind(server_address)
receiver_socket.settimeout(1.0)  # don't block forever if the datagram is lost

# Sender (e.g., game server)
sender_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
message = b"Player 1: x=10, y=25, orientation=90"
sender_socket.sendto(message, server_address)
print(f"Sent: {message.decode()}")
sender_socket.close()

# Back on the receiver: read one datagram of up to 1024 bytes
data, addr = receiver_socket.recvfrom(1024)
print(f"Received: {data.decode()} from {addr}")
receiver_socket.close()
```
This simple example shows the core of UDP: fire-and-forget. The sender sends data, and that’s it. There’s no confirmation, no guarantee it arrived. This is UDP’s fundamental characteristic.
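For the game scenario, the receiver usually needs one small guard: a delayed old update must not overwrite a newer one. Here is a minimal sketch of that idea, assuming a hypothetical 4-byte sequence number prepended to each update. The header layout and the names `pack_update` and `LatestState` are illustrative and not taken from any particular engine; sequence-number wraparound is ignored.

```python
import struct

HEADER = struct.Struct("!I")  # hypothetical header: 4-byte big-endian sequence number


def pack_update(seq: int, payload: bytes) -> bytes:
    """Prepend the sequence number to the payload."""
    return HEADER.pack(seq) + payload


class LatestState:
    """Keeps only the newest update; stale or duplicate datagrams are ignored."""

    def __init__(self) -> None:
        self.last_seq = -1
        self.state = b""

    def on_datagram(self, datagram: bytes) -> bool:
        (seq,) = HEADER.unpack_from(datagram)
        if seq <= self.last_seq:
            return False  # older than what we already have: drop it
        self.last_seq = seq
        self.state = datagram[HEADER.size:]
        return True


client = LatestState()
# Update 1 is delayed in the network and shows up after update 2.
arrivals = [pack_update(0, b"x=10"), pack_update(2, b"x=12"), pack_update(1, b"x=11")]
applied = [client.on_datagram(d) for d in arrivals]
print(applied, client.state)
```

Nothing is retransmitted here. The late update is simply discarded, which is the behavior the game example relies on.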
Now, how do we make this "unreliable" protocol reliable for things where data must arrive, like financial transactions or critical control signals? You build reliability on top of UDP. This is where the "reliable UDP" patterns come in.
The most common pattern is a sequence number and acknowledgment (ACK) system.
- Sequence Numbers: Every UDP packet sent is assigned a unique, monotonically increasing sequence number. This is added as a header to the actual payload.
- Sender Buffering: The sender doesn’t discard the packet immediately after sending. It keeps a copy in a buffer until it receives an acknowledgment for it.
- Receiver Buffering & Ordering: The receiver also buffers incoming packets. If a packet arrives out of order (e.g., packet 3 arrives before packet 2), it’s held in a buffer until the missing packet arrives, and then both are delivered to the application in sequence.
- Acknowledgments: When the receiver successfully processes a packet (e.g., it’s the expected sequence number, or it’s a packet it can use even if out of order), it sends back an ACK packet to the sender. This ACK packet includes the sequence number of the packet it’s acknowledging.
- Retransmission: If the sender doesn’t receive an ACK for a buffered packet within a certain timeout period, it assumes the packet was lost and retransmits it.
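The pieces in the list above can be sketched as a stop-and-wait exchange over loopback. This is a simplified illustration, not a production protocol: the header layout (`!BI`), the function names, the timeout values, and the deliberately dropped first copy of packet 1 are all assumptions made for the example. Because stop-and-wait keeps only one packet in flight, the receiver never needs to buffer out-of-order data here.

```python
import socket
import struct
import threading

HEADER = struct.Struct("!BI")  # hypothetical header: type byte, then sequence number
DATA, ACK = 0, 1


def receiver(sock: socket.socket, expected: int, delivered: list) -> None:
    """Deliver packets in order and ACK them; drop the first copy of seq 1 to simulate loss."""
    dropped_once = False
    next_seq = 0
    while next_seq < expected:
        datagram, addr = sock.recvfrom(2048)
        kind, seq = HEADER.unpack_from(datagram)
        if kind != DATA:
            continue
        if seq == 1 and not dropped_once:
            dropped_once = True  # simulated loss: no delivery, no ACK
            continue
        if seq == next_seq:
            delivered.append(datagram[HEADER.size:])
            next_seq += 1
        if seq < next_seq:
            # ACK new packets and duplicates alike, in case an earlier ACK was lost.
            sock.sendto(HEADER.pack(ACK, seq), addr)


def send_reliable(sock, dest, seq, payload, timeout=0.2, max_tries=5) -> int:
    """Send one packet and wait for its ACK; return how many transmissions it took."""
    packet = HEADER.pack(DATA, seq) + payload  # kept around until acknowledged
    sock.settimeout(timeout)
    for attempt in range(1, max_tries + 1):
        sock.sendto(packet, dest)
        try:
            while True:
                reply, _ = sock.recvfrom(2048)
                kind, ack_seq = HEADER.unpack_from(reply)
                if kind == ACK and ack_seq == seq:
                    return attempt
        except socket.timeout:
            continue  # no ACK in time: assume loss and retransmit
    raise TimeoutError(f"no ACK for seq {seq} after {max_tries} tries")


rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

messages = [b"debit 100", b"credit 100", b"commit"]
delivered: list = []
worker = threading.Thread(target=receiver, args=(rx, len(messages), delivered), daemon=True)
worker.start()

attempts = [send_reliable(tx, rx.getsockname(), seq, m) for seq, m in enumerate(messages)]
worker.join(timeout=5)
tx.close()
rx.close()
print(attempts, delivered)
```

Packet 1 needs at least two transmissions because its first copy is discarded, yet the application still sees all three messages exactly once and in order. Real implementations keep many packets in flight (a sliding window) rather than waiting for each ACK.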
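This creates a reliable stream over UDP. It’s usually called "reliable UDP" (RUDP) or simply a custom reliability layer. It should not be confused with UDP-Lite, which is a separate protocol that allows partial checksum coverage and adds no reliability. Think of protocols like QUIC (which underlies HTTP/3) or some older multiplayer game networking libraries that implemented their own TCP-like reliability on UDP.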
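The problem this solves is the overhead of TCP. TCP delivers a single ordered byte stream, so it suffers from head-of-line blocking: if one segment is lost, everything behind it stalls until the retransmission arrives, even data for unrelated requests multiplexed over the same connection. Its three-way handshake also costs a full round trip before any data flows, and more if TLS is layered on top. Building reliability on UDP lets you design around both issues. A single ordered stream over UDP would still block on loss; the gain comes from giving each logical stream its own ordering. For example, in QUIC, if a packet containing data for stream A is lost, streams B and C can continue to make progress without waiting.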
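To make the stream A / B / C example concrete, here is a small in-memory sketch of per-stream reordering. The class name `StreamReassembler` and the per-stream sequence numbers are assumptions for illustration; QUIC itself tracks byte offsets within each stream, so this is a simplification of the idea rather than a description of QUIC's wire format.

```python
from collections import defaultdict


class StreamReassembler:
    """Per-stream reorder buffers: a gap in one stream does not hold back the others."""

    def __init__(self) -> None:
        self.next_seq = defaultdict(int)   # stream id -> next sequence number to deliver
        self.pending = defaultdict(dict)   # stream id -> {seq: payload} held out of order

    def on_packet(self, stream: str, seq: int, payload: str) -> list:
        """Return the payloads that become deliverable on this stream, in order."""
        if seq < self.next_seq[stream]:
            return []  # duplicate of something already delivered
        self.pending[stream][seq] = payload
        ready = []
        while self.next_seq[stream] in self.pending[stream]:
            ready.append(self.pending[stream].pop(self.next_seq[stream]))
            self.next_seq[stream] += 1
        return ready


r = StreamReassembler()
log = []
# A0 is lost at first. B and C keep delivering; A1 waits until A0 is retransmitted.
arrivals = [("A", 1, "a1"), ("B", 0, "b0"), ("C", 0, "c0"), ("B", 1, "b1"), ("A", 0, "a0")]
for stream, seq, payload in arrivals:
    log.append((f"{stream}{seq}", r.on_packet(stream, seq, payload)))
print(log)
```

Only stream A is held up by its own gap. With a single shared sequence space, as in TCP, the packets for B and C would have waited behind the missing packet too.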
You control the reliability by tuning parameters like the retransmission timeout, the size of the sender and receiver buffers, and the logic for handling out-of-order packets. For instance, a shorter retransmission timeout means faster detection of loss and retransmission, but can lead to more spurious retransmissions if network latency is variable. Larger buffers can smooth out bursts of incoming packets but increase memory usage and latency.
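One common way to pick the retransmission timeout is to derive it from measured round-trip times instead of hard-coding it. The sketch below borrows the smoothing formulas TCP uses (RFC 6298); applying them to a custom UDP protocol is an assumption of this example, and the `min_rto` and `max_rto` bounds are placeholder values to tune per application.

```python
class RtoEstimator:
    """Smoothed RTT estimator in the style of RFC 6298 (TCP's RTO computation)."""

    ALPHA = 1 / 8  # gain for the smoothed RTT
    BETA = 1 / 4   # gain for the RTT variance

    def __init__(self, min_rto: float = 0.05, max_rto: float = 2.0) -> None:
        self.srtt = None
        self.rttvar = None
        self.min_rto = min_rto  # assumed bounds, in seconds
        self.max_rto = max_rto
        self.rto = max_rto      # conservative until the first sample arrives

    def on_rtt_sample(self, rtt: float) -> float:
        """Fold one RTT measurement (seconds) into the estimate and return the new RTO."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.rto = min(self.max_rto, max(self.min_rto, self.srtt + 4 * self.rttvar))
        return self.rto

    def on_timeout(self) -> float:
        """Back off exponentially after a retransmission timeout."""
        self.rto = min(self.max_rto, self.rto * 2)
        return self.rto


stable = RtoEstimator()
for _ in range(20):
    stable.on_rtt_sample(0.100)          # steady 100 ms path

jittery = RtoEstimator()
for i in range(20):
    jittery.on_rtt_sample(0.050 if i % 2 == 0 else 0.150)  # same mean, high variance

print(f"stable RTO:  {stable.rto:.3f}s")
print(f"jittery RTO: {jittery.rto:.3f}s")
```

Both paths average 100 ms, but the jittery one ends up with a noticeably longer timeout, which is what protects against spurious retransmissions when latency is variable. RTT samples should come only from packets that were not retransmitted, since an ACK for a retransmitted packet is ambiguous.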
The most surprising thing about building reliability on UDP is that you can also implement flow control and congestion control, mechanisms traditionally associated with TCP, in user space and tailor them to the application. For example, instead of relying on the kernel’s TCP stack to manage congestion, an application can use its own measurements (how quickly ACKs return, or how many packets the receiver reports dropping from its buffer) to adjust its sending rate. This allows finer-grained control and faster iteration, because the algorithm ships with the application rather than with the operating system. QUIC is a prime example: its congestion control typically runs in user space and can be updated independently of the operating system.
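As a rough illustration of user-space congestion control, here is a toy additive-increase / multiplicative-decrease (AIMD) window. The class name `AimdWindow` and its parameters are hypothetical, and the controllers used by QUIC implementations are considerably more elaborate; this shows only the classic core idea of growing slowly on ACKs and backing off sharply on loss.

```python
class AimdWindow:
    """Toy AIMD congestion window, counted in packets."""

    def __init__(self, initial: float = 1.0, minimum: float = 1.0) -> None:
        self.cwnd = initial
        self.minimum = minimum

    def on_ack(self) -> None:
        # Additive increase: roughly +1 packet per round trip (1/cwnd per ACK).
        self.cwnd += 1.0 / self.cwnd

    def on_loss(self) -> None:
        # Multiplicative decrease: halve the window, but never below the minimum.
        self.cwnd = max(self.minimum, self.cwnd / 2)

    def can_send(self, in_flight: int) -> bool:
        """True if another packet may be sent given the number currently unacknowledged."""
        return in_flight < int(self.cwnd)


window = AimdWindow(initial=4.0)
for _ in range(8):
    window.on_ack()
before_loss = window.cwnd
window.on_loss()
print(f"before loss: {before_loss:.2f} packets, after loss: {window.cwnd:.2f} packets")
```

A sender would call `can_send` before each transmission, `on_ack` when an acknowledgment arrives, and `on_loss` when a retransmission timer fires. Because this lives in application code, swapping in a different policy is a normal software update rather than a kernel change.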
The next step is often understanding how these reliable UDP protocols carry multiple independent streams of data over a single UDP flow (UDP itself has no notion of a connection), a concept known as multiplexing.