ZeroMQ’s multicast with PGM/EPGM is not just about sending messages to many; it’s about sending messages to many reliably without knowing who they are and without wasting bandwidth on those who aren’t listening.
Let’s watch it in action. Imagine we have a publisher pgm_pub sending market data and multiple subscribers pgm_sub receiving it.
Publisher (pgm_pub.py):
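import time
import zmq
context = zmq.Context()
socket = context.socket(zmq.PUB)
# The documented endpoint form is pgm://interface;multicast_address:port.
# "eth0" is a placeholder: substitute your own interface name or local IP.
# Raw pgm:// usually needs elevated privileges; epgm:// (PGM over UDP) does not.
socket.bind("pgm://eth0;239.0.0.1:7000")
print("Publisher bound to pgm://eth0;239.0.0.1:7000")
message_count = 0
while True:
    message = f"Market Update {message_count}"
    print(f"Sending: {message}")
    socket.send_string(message)
    message_count += 1
    time.sleep(0.5)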
Subscriber (pgm_sub.py):
import zmq
context = zmq.Context()
socket = context.socket(zmq.SUB)
# Connecting to a PGM endpoint joins the multicast group.
# "eth0" is a placeholder: substitute your own interface name or local IP.
socket.connect("pgm://eth0;239.0.0.1:7000")
# A SUB socket receives nothing until it subscribes to at least one prefix.
# An empty string subscribes to all messages.
socket.setsockopt_string(zmq.SUBSCRIBE, "")
print("Subscriber connected to pgm://eth0;239.0.0.1:7000 and subscribed to all.")
while True:
    message = socket.recv_string()
    print(f"Received: {message}")
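Run pgm_pub.py in one terminal and several instances of pgm_sub.py in others. You’ll see the publisher send messages and each subscriber receive them. A subscriber that starts after the publisher picks up the stream from the point at which it joins; it does not receive messages sent before it joined.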
This setup solves the problem of broadcasting data to a dynamic, potentially very large, and unknown set of consumers. Traditional point-to-point messaging would require the publisher to maintain a connection to each subscriber, which scales poorly. Multicast shifts this burden to the network. The publisher sends a single datagram to a multicast group address, and routers and switches that are configured for multicast replicate it as needed to reach every subscriber that has joined the group. PGM then adds a layer of reliability on top of the network’s best-effort multicast delivery.
The core of PGM (Pragmatic General Multicast, RFC 3208) is reliable, ordered delivery over an unreliable datagram network. It achieves this with sequence numbers, negative acknowledgements (NAKs), and retransmissions. The sender numbers every data packet. A receiver that sees a gap in the sequence waits a short random interval and then sends a NAK for the missing packet back toward the source. A multicast NAK confirmation tells other receivers that the loss has already been reported, so they suppress their own NAKs for the same packet. The source (or a designated repairer closer to the receiver) then retransmits the data. Recovery is only possible while the packet is still inside the sender’s transmit window, so the reliability is bounded rather than absolute. In ZeroMQ, EPGM stands for Encapsulated PGM: the same protocol with its packets carried inside UDP datagrams instead of directly over IP.
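The receiver-side half of that recovery loop, noticing a gap in the sequence numbers and deciding what to NAK, can be sketched in a few lines. This is a toy model for illustration only: `GapTracker` is a hypothetical name, and the real protocol stack also handles timers, NAK suppression, and window expiry.

```python
class GapTracker:
    """Toy receiver-side tracker: records sequence numbers and reports gaps."""

    def __init__(self):
        self.next_expected = 0  # one past the highest sequence number seen
        self.missing = set()    # sequence numbers skipped and not yet repaired

    def on_packet(self, seq):
        """Handle an arriving packet; return the sequence numbers to NAK."""
        if seq >= self.next_expected:
            # Everything between the expected number and this one was lost.
            new_gaps = list(range(self.next_expected, seq))
            self.missing.update(new_gaps)
            self.next_expected = seq + 1
            return new_gaps
        # An earlier sequence number: a repair, or a duplicate.
        self.missing.discard(seq)
        return []


tracker = GapTracker()
for seq in (0, 1, 4):
    naks = tracker.on_packet(seq)
    if naks:
        print(f"packet {seq} arrived, NAK for {naks}")
tracker.on_packet(2)  # a retransmission fills one of the gaps
print(f"still missing: {sorted(tracker.missing)}")
```

Note that the receiver says nothing at all while packets arrive in order, which is what keeps feedback traffic low when the network is healthy.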
ZeroMQ exposes PGM through the pgm:// and epgm:// transports, which work only with PUB and SUB sockets. The documented endpoint form is the transport, a local interface (name or IP address), a semicolon, the multicast group address, a colon, and the port, for example epgm://eth0;239.0.0.1:7000. The group address must fall in the IPv4 multicast range 224.0.0.0/4; within it, 239.0.0.0/8 is the administratively scoped block intended for private use. Publishers and subscribers use the same group and port, and for these transports bind() and connect() can be used interchangeably. The critical difference for subscribers is that attaching to the endpoint also joins the multicast group. ZeroMQ handles this implicitly, so no separate group-join call is needed.
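Because a malformed endpoint tends to fail only at bind or connect time, it can help to sanity-check the pieces first. The sketch below assumes the interface;group:port endpoint form given in the zmq_pgm documentation; `pgm_endpoint` is a hypothetical helper, and "eth0" is a placeholder interface name.

```python
import ipaddress


def pgm_endpoint(interface, group, port, transport="epgm"):
    """Build a ZeroMQ PGM endpoint string after basic sanity checks."""
    if transport not in ("pgm", "epgm"):
        raise ValueError(f"unknown transport: {transport}")
    # ip_address() raises ValueError itself if the text is not an IP address.
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"{transport}://{interface};{group}:{port}"


print(pgm_endpoint("eth0", "239.0.0.1", 7000))
```

The returned string can be passed straight to `socket.bind()` or `socket.connect()`. The helper checks only the address syntax; it cannot tell whether the interface exists or whether the network forwards multicast.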
The setsockopt_string(zmq.SUBSCRIBE, "") call on the subscriber is still required. PGM handles joining the group at the network level, but the ZeroMQ SUB socket applies its own publish-subscribe filter to the multicast stream and delivers nothing until at least one subscription is set. With multicast, every group member receives every datagram, so this filter runs on the subscriber side. The empty string subscribes to all messages published on the group. To filter, provide a topic prefix instead.
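ZeroMQ subscriptions are plain byte-prefix matches against the start of each message. The following pure-Python model shows the matching rule only; `matches` is a hypothetical function and the ticker-style topics are made up for illustration.

```python
def matches(subscriptions, message):
    """Return True if the message starts with any subscribed prefix."""
    return any(message.startswith(prefix) for prefix in subscriptions)


# No subscriptions at all: nothing is delivered.
print(matches([], b"AAPL 189.20"))

# The empty prefix matches every message.
print(matches([b""], b"AAPL 189.20"))

# A topic prefix selects only messages that begin with it.
print(matches([b"AAPL"], b"AAPL 189.20"))
print(matches([b"AAPL"], b"MSFT 411.05"))
```

On a real SUB socket the equivalent of the third case would be `socket.setsockopt_string(zmq.SUBSCRIBE, "AAPL")`, with the publisher placing the topic at the start of each message.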
A key distinction is how the packets travel. The pgm:// transport sends PGM directly over IP as its own protocol (IP protocol number 113), while epgm:// wraps the same packets in UDP datagrams. Neither variant crosses firewalls or routers by default: multicast traffic is forwarded only where the network has been set up for it. In practice that means IGMP must work between hosts and their switches, and multicast routing (for example PIM) must be enabled on any routers in the path between publishers and subscribers. Without it, the datagrams simply won’t be forwarded to all intended destinations. Because PGM still delivers in order, a receiver waiting on a repair can stall until the retransmission arrives or the recovery window expires.
The distinction between pgm:// and epgm:// in ZeroMQ is about encapsulation, not about protocol features. Raw pgm:// needs access to raw IP sockets, which on most operating systems requires elevated privileges. epgm:// runs over ordinary UDP sockets and needs no special privileges, which makes it the more common choice in practice. ZeroMQ does not choose between them or fall back from one to the other: you select the transport explicitly in the endpoint string, and publishers and subscribers must use the same one.
When you use pgm:// or epgm://, ZeroMQ delegates to a PGM protocol implementation, in practice the OpenPGM library, and libzmq must have been built with PGM support for these transports to be available. The reliability mechanisms (the NAKs, the retransmissions, the ordering guarantees) are therefore handled at a lower level than typical ZeroMQ message routing. The send() call effectively hands the message to the PGM stack, and the recv() call waits for the PGM stack to deliver one.
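One practical consequence is that the sender has to keep recently sent data in memory so it can answer repair requests. libzmq exposes two socket options that govern this for multicast transports: ZMQ_RATE (maximum send rate in kilobits per second, default 100) and ZMQ_RECOVERY_IVL (recovery interval in milliseconds, default 10000). A rough estimate of the buffer this implies is rate multiplied by interval. The helper below is a hypothetical back-of-envelope calculation, assuming 1 kilobit is 1000 bits; it is not how the library itself sizes its window.

```python
def recovery_buffer_bytes(rate_kbps, recovery_ivl_ms):
    """Estimate bytes held for recovery: rate (kbit/s) x interval (ms) / 8."""
    # kbit/s * ms = bits (the factors of 1000 cancel); divide by 8 for bytes.
    return rate_kbps * recovery_ivl_ms // 8


# libzmq defaults: 100 kbit/s and a 10 second recovery interval.
print(recovery_buffer_bytes(100, 10_000))

# 1 Gbit/s with a 1 minute recovery interval.
print(recovery_buffer_bytes(1_000_000, 60_000))
```

The second case comes to several gigabytes, which is why long recovery intervals at high rates need care. In pyzmq the two options are set with `socket.setsockopt(zmq.RATE, ...)` and `socket.setsockopt(zmq.RECOVERY_IVL, ...)` before binding or connecting.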
The most surprising thing about ZeroMQ’s PGM/EPGM is how it achieves reliability without a central broker and without positive acknowledgements from every receiver. Receivers stay silent while data arrives intact and speak up only when they detect a loss. NAK suppression then keeps many receivers that missed the same packet from all reporting it at once. The sender is an active part of recovery, retransmitting from its transmit window, but the feedback it has to process grows with the amount of loss rather than with the number of subscribers. That is what lets a single publisher serve a very large receiver population without being overwhelmed by acknowledgement traffic.
The next hurdle you’ll likely encounter is managing multicast group membership and ensuring proper network configuration for multicast routing.