HTTP, despite its name, isn’t a transport protocol; it’s an application-layer protocol that uses TCP for reliable, ordered delivery of messages.

Let’s watch HTTP talk to a web server. We’ll use curl with -v for verbose output, showing us the TLS handshake and the HTTP requests/responses.

Imagine you’re requesting https://example.com.

curl -v https://example.com

Here’s what happens, simplified:

  1. TCP Connection Establishment (The Three-Way Handshake):

    • Your client (browser or curl) sends a SYN packet to the server’s IP address on port 443 (for HTTPS).
    • The server receives the SYN, acknowledges it with a SYN-ACK, and allocates resources for the connection.
    • Your client receives the SYN-ACK and sends an ACK to confirm. Now, a stable TCP connection exists. This is the foundation.
  2. TLS Handshake:

    • Since it’s HTTPS, your client and the server now negotiate a secure channel. This involves exchanging certificates, agreeing on encryption algorithms, and generating session keys. This is why you see a lot of back-and-forth before any HTTP data is sent.
  3. HTTP Request:

    • Once the TLS tunnel is established, your client sends the actual HTTP request, like GET / HTTP/1.1. It includes headers like Host: example.com, User-Agent: curl/7.68.0, and Accept: */*.
  4. HTTP Response:

    • The server processes the request, fetches the resource (the HTML for the homepage), and sends back an HTTP response. This starts with a status line like HTTP/1.1 200 OK, followed by headers (Content-Type: text/html, Content-Length: 1256), and finally the HTML body.
  5. TCP Connection Closure:

    • After the response is sent and received, either the client or the server can initiate closing the connection by sending a FIN packet. The other side acknowledges (ACK), and eventually, both sides send their own FINs and ACKs to gracefully tear down the TCP connection. This is the "four-way handshake" for closing.

This entire dance, from TCP SYN to TCP FIN, is the lifecycle of a single HTTP request-response pair over a single TCP connection.
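The same dance can be sketched in Python with a raw socket. This is a hedged illustration: it runs against a throwaway local server instead of example.com, and it skips step 2 (TLS) by speaking plain HTTP, so only the TCP and HTTP parts of the lifecycle are visible.

```python
import socket
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"
    def do_GET(self):
        body = b"<html>hello</html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):  # silence request logging
        pass

# Throwaway local server standing in for example.com (port 0 = pick a free port).
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Step 1: connect() performs the TCP three-way handshake (SYN, SYN-ACK, ACK).
sock = socket.create_connection(("127.0.0.1", server.server_address[1]))
# Step 2 (TLS) is skipped here: this is plain HTTP.
# Step 3: the HTTP request is just bytes on the stream.
sock.sendall(b"GET / HTTP/1.1\r\nHost: 127.0.0.1\r\nConnection: close\r\n\r\n")
# Step 4: read the response until the server closes its side.
response = b""
while chunk := sock.recv(4096):
    response += chunk
# Step 5: close() starts our half of the FIN/ACK teardown.
sock.close()
server.shutdown()

print(response.decode().splitlines()[0])  # prints: HTTP/1.1 200 OK
```

Running the real `curl -v` command shows the same stages, with the TLS negotiation added between steps 1 and 3.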

The Problem HTTP Tries to Solve: Latency

Every request-response cycle involves:

  • TCP handshake (1 RTT)
  • TLS handshake (1-2 more RTTs, depending on the TLS version)
  • HTTP request transmission
  • HTTP response transmission

If you needed to fetch 10 resources (HTML, CSS, JS, images), and each required a new TCP and TLS handshake, you’d be waiting for dozens of round trips. This is exactly what happened with HTTP/1.0, which by default closed the connection after every response.
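A back-of-the-envelope model makes the cost concrete. Every number below (a 50 ms RTT, a 2-RTT TLS 1.2 handshake) is an illustrative assumption, not a measurement:

```python
# Illustrative cost model; every number here is an assumption.
rtt = 0.05            # 50 ms round trip
tcp_handshake = 1     # SYN / SYN-ACK / ACK: 1 RTT before data flows
tls_handshake = 2     # full TLS 1.2 handshake: ~2 RTTs (TLS 1.3 needs 1)
request_response = 1  # 1 RTT to send the request and get the response
resources = 10

per_connection = (tcp_handshake + tls_handshake + request_response) * rtt

# HTTP/1.0 style: a brand-new connection for every resource.
no_reuse = resources * per_connection
# Keep-alive: pay the handshakes once, then 1 RTT per extra resource.
keep_alive = per_connection + (resources - 1) * request_response * rtt

print(f"new connection each time: {no_reuse:.2f}s")   # 2.00s
print(f"with keep-alive:          {keep_alive:.2f}s")  # 0.65s
```

Three quarters of the naive total is pure handshake overhead, which is the gap the HTTP/1.1 features below attack.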

HTTP/1.1: The Introduction of Keep-Alive and Pipelining

HTTP/1.1 introduced two key features to combat this latency:

  • Connection Keep-Alive: By default, HTTP/1.1 connections are kept open after a request-response cycle. Persistence is the protocol default, so the explicit Connection: keep-alive header (an HTTP/1.0-era opt-in) is rarely needed; a peer that wants to hang up sends Connection: close instead. The open connection tells the server, "Don’t close this TCP connection yet; I might have more requests," and eliminates repeated TCP and TLS handshakes for subsequent requests to the same origin.

  • HTTP Pipelining: This is where it gets interesting. With pipelining, a client could send multiple HTTP requests over a single, persistent TCP connection without waiting for the responses to the preceding requests.
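Here is a minimal sketch of keep-alive reuse with Python's http.client, again against a throwaway local server; the paths are made up, and the point is that all three exchanges ride one TCP connection:

```python
import threading
from http.client import HTTPConnection
from http.server import HTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # persistent connections are the default
    def do_GET(self):
        body = self.path.encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = HTTPConnection("127.0.0.1", server.server_address[1])
sockets_seen = set()
for path in ("/page.html", "/style.css", "/script.js"):
    conn.request("GET", path)
    resp = conn.getresponse()
    resp.read()                      # drain the body so the conn is reusable
    sockets_seen.add(id(conn.sock))  # which underlying socket served this?
    print(path, resp.status)

print("distinct TCP connections used:", len(sockets_seen))  # prints: 1
conn.close()
server.shutdown()
```

Note that the client still waits for each response before sending the next request; keep-alive removes handshakes, not the request-at-a-time rhythm. Pipelining tries to remove that, too.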

Let’s visualize pipelining. Imagine you need GET /page.html, GET /style.css, and GET /script.js.

Without Pipelining (Sequential):

  1. Client: SYN (TCP)
  2. Server: SYN-ACK
  3. Client: ACK
  4. Client: TLS handshake…
  5. Client: GET /page.html
  6. Server: 200 OK ... page.html
  7. Client: GET /style.css
  8. Server: 200 OK ... style.css
  9. Client: GET /script.js
  10. Server: 200 OK ... script.js
  11. Client/Server: FIN (TCP close)

Notice how request 2 (/style.css) can only be sent after response 1 (/page.html) is fully received.

With Pipelining:

  1. Client: SYN (TCP)
  2. Server: SYN-ACK
  3. Client: ACK
  4. Client: TLS handshake…
  5. Client: GET /page.html
  6. Client: GET /style.css (sent immediately after #5, before server responds to #5)
  7. Client: GET /script.js (sent immediately after #6)
  8. Server: 200 OK ... page.html (response for #5)
  9. Server: 200 OK ... style.css (response for #6)
  10. Server: 200 OK ... script.js (response for #7)
  11. Client/Server: FIN (TCP close)

The key here is that requests 5, 6, and 7 are sent back-to-back. The server receives them all, processes them, and sends the responses back in the same order they were received.
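Since pipelined requests are just bytes written back-to-back, the exchange above can be reproduced with a raw socket. This sketch uses a throwaway local server (Python's stdlib handler happens to process pipelined requests in order); real servers and proxies often do not cope this gracefully:

```python
import socket
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"
    def do_GET(self):
        body = (self.path + "\n").encode()  # echo the path as the body
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

sock = socket.create_connection(("127.0.0.1", server.server_address[1]))
# Steps 5-7: all requests go out before any response comes back.
for path in ("/page.html", "/style.css"):
    sock.sendall(f"GET {path} HTTP/1.1\r\nHost: x\r\n\r\n".encode())
sock.sendall(b"GET /script.js HTTP/1.1\r\nHost: x\r\nConnection: close\r\n\r\n")

# Steps 8-10: read everything; responses arrive strictly in request order.
data = b""
while chunk := sock.recv(4096):
    data += chunk
sock.close()
server.shutdown()

bodies = [line for line in data.decode().splitlines() if line.startswith("/")]
print(bodies)  # prints: ['/page.html', '/style.css', '/script.js']
```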

The Mechanical Problem with Pipelining

The fatal flaw of HTTP/1.1 pipelining is that responses must be delivered in the same order that the requests were sent. HTTP/1.1 gives a response no identifier tying it to a particular request; the only way a client can match responses to requests is their order on the connection, and TCP offers a single ordered byte stream with no way to interleave separate messages on it. If the server takes a long time to generate the response for the first request, it can’t send the response for the second request (which might be ready immediately) until the first one is done. This leads to a phenomenon called head-of-line (HOL) blocking at the application layer.

Imagine this:

  • Request 1: GET /slow_resource (takes 5 seconds to generate)
  • Request 2: GET /fast_resource (takes 0.1 seconds to generate)

With pipelining, the client sends both. The server receives them. It starts generating /slow_resource. While it’s doing that, it cannot send the response for /fast_resource, even though it’s ready. The client waits for /slow_resource to finish, then gets /fast_resource. The entire pipeline is blocked by the slowest request.
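A tiny model of the queue shows the damage. Assume both responses become ready at the times above but the in-order rule holds (all numbers are the made-up ones from the example):

```python
# When each response is ready on the server (seconds, from the example above).
ready = {"/slow_resource": 5.0, "/fast_resource": 0.1}
order = ["/slow_resource", "/fast_resource"]  # pipelined request order

# HTTP/1.1 pipelining: a response also waits for every response before it.
delivered = {}
last_sent = 0.0
for path in order:
    last_sent = max(last_sent, ready[path])
    delivered[path] = last_sent

# Multiplexing (HTTP/2 style): each response goes as soon as it is ready.
for path in order:
    print(f"{path}: pipelined t={delivered[path]}s, multiplexed t={ready[path]}s")
# /slow_resource: pipelined t=5.0s, multiplexed t=5.0s
# /fast_resource: pipelined t=5.0s, multiplexed t=0.1s
```

The fast response is penalized by 4.9 seconds for no reason other than its position in the queue.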

This HOL blocking made pipelining brittle and difficult to implement correctly across all servers and clients. Many servers and intermediate proxies didn’t support it, or had buggy implementations. Consequently, most browsers abandoned pipelining in favor of opening multiple TCP connections simultaneously (typically 6-8 per origin) to achieve parallelism, even though it meant more TCP/TLS handshakes. This is why you often see multiple GET requests happening concurrently in network traces.
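The browsers' workaround, parallelism via several connections, can be sketched like this. The 6-worker cap mirrors a typical per-origin limit, and the server is again a local stand-in:

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from http.client import HTTPConnection
from http.server import ThreadingHTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")
    def log_message(self, *args):
        pass

# ThreadingHTTPServer so concurrent connections are served in parallel.
server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

def fetch(path):
    # One fresh connection per request: extra handshakes, but no shared
    # queue for a slow response to block.
    conn = HTTPConnection("127.0.0.1", port)
    conn.request("GET", path)
    status = conn.getresponse().status
    conn.close()
    return status

with ThreadPoolExecutor(max_workers=6) as pool:  # ~ a browser's per-origin cap
    statuses = list(pool.map(fetch, [f"/res{i}" for i in range(10)]))
print(statuses)  # ten 200s
server.shutdown()
```

It works, but every worker pays the TCP and TLS setup cost that keep-alive was supposed to amortize, which is exactly the waste multiplexing was later designed to eliminate.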

HTTP/2 solves this at the application layer by multiplexing many streams over a single TCP connection, so responses can be interleaved and returned out of order. HTTP/3 goes further: it runs over QUIC instead of TCP, so a lost packet on one stream no longer stalls the others.

The most surprising thing about HTTP/1.1 pipelining is that it was designed to reduce latency by sending requests back-to-back over a single connection, yet it introduced its own form of head-of-line blocking that made it practically unusable.
