Threads and epoll are two fundamentally different approaches to building concurrent network servers, and the "better" choice depends entirely on the workload and operational constraints.

Let’s watch a simple thread-per-connection server in action. Imagine a server that simply echoes back whatever it receives.

import socket
import threading

def handle_client(client_socket):
    request = client_socket.recv(1024)
    if request:
        print(f"Received: {request.decode()}")
        client_socket.sendall(request)  # sendall retries until every byte is written
    client_socket.close()

server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_socket.bind(('0.0.0.0', 8080))
server_socket.listen(5)
print("Server listening on port 8080")

while True:
    client_socket, addr = server_socket.accept()
    print(f"Accepted connection from {addr}")
    client_handler = threading.Thread(target=handle_client, args=(client_socket,))
    client_handler.start()

When a client connects, server_socket.accept() blocks until a connection arrives. Then a new threading.Thread is created to handle that specific client, and the main loop immediately goes back to accept(), ready for the next connection. Each thread has its own stack and its own execution context; file descriptors, by contrast, are shared, since they belong to the process, so every thread sees the same underlying sockets. This makes the logic within handle_client very straightforward: block on recv, process, block on send, close.
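A common refinement is to cap the number of threads with a pool, so a connection flood cannot spawn threads without bound. The sketch below reuses the echo logic from the example above; the pool size of 32 is an arbitrary illustration, not a recommendation.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

def handle_client(client_socket):
    # Same echo logic as above, with sendall to cover partial writes.
    request = client_socket.recv(1024)
    if request:
        client_socket.sendall(request)
    client_socket.close()

def serve_with_pool(server_socket, max_workers=32):
    # A fixed pool reuses worker threads instead of paying thread
    # start-up cost per connection; when all workers are busy,
    # submitted connections queue inside the pool.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while True:
            client_socket, addr = server_socket.accept()
            pool.submit(handle_client, client_socket)
```

This bounds memory and context-switch overhead at the price of queueing delay when the pool is saturated.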

Now, consider an epoll-based server. This is significantly more complex to write but scales differently.

import socket
import select

server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_socket.setblocking(False)
server_socket.bind(('0.0.0.0', 8080))
server_socket.listen(5)

epoll = select.epoll()
epoll.register(server_socket.fileno(), select.EPOLLIN)

connections = {}
buffers = {}

try:
    while True:
        events = epoll.poll(1)  # Poll with a timeout of 1 second
        for fd, event in events:
            if fd == server_socket.fileno():
                client_socket, addr = server_socket.accept()
                client_socket.setblocking(False)
                epoll.register(client_socket.fileno(), select.EPOLLIN)
                connections[client_socket.fileno()] = client_socket
                buffers[client_socket.fileno()] = b''
            elif event & select.EPOLLIN:
                client_socket = connections[fd]
                data = client_socket.recv(1024)
                if data:
                    buffers[fd] += data
                    # In a real server, you'd process the data here
                    # For echo, we'd prepare to send it back
                    print(f"Received from {fd}: {data.decode()}")
                    # To echo back, we'd register for EPOLLOUT:
                    # epoll.modify(fd, select.EPOLLIN | select.EPOLLOUT)
                else:  # Client closed connection
                    epoll.unregister(fd)
                    client_socket.close()
                    del connections[fd]
                    del buffers[fd]
            elif event & (select.EPOLLHUP | select.EPOLLERR):
                # Peer hung up or the socket errored with nothing to read
                epoll.unregister(fd)
                connections.pop(fd).close()
                del buffers[fd]
            # elif event & select.EPOLLOUT:  # Would handle sending data
            #     client_socket = connections[fd]
            #     sent_bytes = client_socket.send(buffers[fd])
            #     buffers[fd] = buffers[fd][sent_bytes:]
            #     if not buffers[fd]:
            #         epoll.modify(fd, select.EPOLLIN)  # Go back to listening for input
finally:
    epoll.unregister(server_socket.fileno())
    epoll.close()
    server_socket.close()

The core idea here is that a single thread manages many connections. Instead of creating a new thread per connection, the server registers the server socket’s file descriptor with an epoll instance, asking to be notified when the socket is ready for reading (select.EPOLLIN). When epoll.poll() returns, it reports which file descriptors are ready. If it’s the server socket, we accept() a new connection and register the new client socket’s file descriptor with epoll for reading. If a client socket is ready for reading, we recv() from it. Crucially, recv() on a non-blocking socket never parks the thread: it returns whatever data is queued, returns an empty byte string once the client has performed an orderly shutdown, and raises BlockingIOError in the rare case that no data is actually available. The single thread then loops back to epoll.poll(), efficiently waiting for any of the registered file descriptors to become ready.
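The non-blocking behavior described above is easy to observe with socket.socketpair(), a connected pair of sockets used here purely for illustration:

```python
import socket

# Demonstration: recv on a non-blocking socket never parks the thread.
a, b = socket.socketpair()
a.setblocking(False)

try:
    a.recv(1024)  # nothing has been sent yet
except BlockingIOError:
    print("no data yet; recv returned immediately")

b.sendall(b'ping')
# Data is now queued in the kernel, so recv succeeds without blocking.
assert a.recv(1024) == b'ping'

b.close()                   # orderly shutdown from the peer...
assert a.recv(1024) == b''  # ...shows up as an empty byte string
a.close()
```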

The massive difference lies in resource utilization. Threads have overhead: memory for their stacks (on Linux, 8 MB of virtual address space is reserved per thread by default, though only the pages actually touched consume physical RAM), context-switch costs when the OS schedules between them, and kernel data structures. A server handling 100,000 concurrent connections with one thread each would reserve close to a terabyte of address space for stacks, consume real memory for every stack page those threads touch, and thrash the CPU with context switches. An epoll server, by contrast, uses a single thread (or a small pool of threads) and manages thousands of connections by monitoring their readiness. The epoll kernel data structure is very efficient, and the single thread only does work when data is actually available.
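A quick back-of-envelope calculation makes the stack numbers concrete, assuming the common 8 MiB default reservation:

```python
# Address space reserved by per-thread stacks: 8 MiB is the common Linux
# default (see `ulimit -s`); resident memory is far lower because stack
# pages are allocated lazily, only when touched.
stack_bytes = 8 * 1024 * 1024
threads = 100_000
reserved = stack_bytes * threads
print(f"{reserved / 2**30:.0f} GiB of address space reserved")  # 781 GiB
```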

The true power of epoll (and its counterparts: kqueue on BSD/macOS, or IOCP on Windows, which is a completion-based rather than readiness-based model) is I/O multiplexing at the kernel level. The kernel efficiently tracks the state of all registered file descriptors and only wakes up the user-space process when an event occurs on one of them. This event-driven model avoids the per-connection thread overhead, allowing a single process to manage a vast number of connections with minimal resource consumption.
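In Python you rarely need to pick the mechanism by hand: the standard-library selectors module wraps the best readiness API the platform offers (epoll on Linux, kqueue on BSD/macOS, a poll/select fallback elsewhere) behind one register/select interface. A minimal sketch, using a socketpair in place of real clients:

```python
import selectors
import socket

# DefaultSelector picks the most efficient readiness API available.
sel = selectors.DefaultSelector()
print(type(sel).__name__)  # EpollSelector on Linux

# Same register/wait pattern as the raw epoll example:
a, b = socket.socketpair()
sel.register(a, selectors.EVENT_READ)

b.sendall(b'ready')
for key, mask in sel.select(timeout=1):
    assert key.fileobj is a and mask & selectors.EVENT_READ
    print(key.fileobj.recv(1024))  # b'ready'

sel.unregister(a)
a.close()
b.close()
sel.close()
```

The same code runs unchanged on a kqueue system, which is why most portable Python servers build on selectors rather than select.epoll directly.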

What most people miss about epoll is that while it’s great for I/O-bound tasks (waiting for network data), it doesn’t magically make CPU-bound tasks fast. If your per-connection logic involves heavy computation, that computation will still block the single thread managing all connections, stalling every other client. For CPU-bound workloads, you’d typically combine epoll for I/O management with a thread pool or process pool to offload the heavy computation.
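The offloading pattern can be sketched as below. busy_work is a hypothetical stand-in for real computation; a ThreadPoolExecutor is used to keep the example self-contained, though for pure-Python number crunching you would reach for ProcessPoolExecutor instead, since the GIL keeps Python bytecode on one core at a time.

```python
from concurrent.futures import ThreadPoolExecutor

def busy_work(n):
    # Hypothetical CPU-heavy task: sum of squares below n.
    total = 0
    for i in range(n):
        total += i * i
    return total

def offload(pool, n):
    # In the epoll loop you would submit and keep polling; the future's
    # add_done_callback (or a self-pipe) hands the result back to the loop.
    return pool.submit(busy_work, n)

pool = ThreadPoolExecutor(max_workers=4)
future = offload(pool, 100_000)
# ... the event loop keeps servicing sockets while busy_work runs ...
print(future.result())
pool.shutdown()
```

The event-loop thread never computes; it only submits work and collects results, so epoll.poll() keeps being called and other connections stay responsive.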

The next hurdle you’ll face is handling graceful shutdown and managing the state transitions for sending data.
