
Below is a set of hierarchical notes summarizing the lecture content on HTTP, from HTTP/1.x through HTTP/2 and HTTP/3. Each major header is phrased as a question (e.g., "Intro to X – Why do we use X?") to help guide your study and understanding.

---

# HTTP Protocols Overview Notes

---

## 1. Intro to HTTP/1.x – Why Do We Still Use It?

- **What is HTTP/1.x and why does it persist?**
  - **Simplicity & Legacy:**  
    HTTP/1.0 and HTTP/1.1 have been around for decades and remain in use because they are simple to implement and work reliably in many cases.
  - **User Inertia:**  
    Many users and systems stick with HTTP/1.x rather than upgrading to HTTP/2, even when improved performance is available.
  - **Trade-offs:**  
    HTTP/1.x carries limitations (such as a lack of efficient buffering and heavy connection overhead), yet its simplicity and historical compatibility keep it relevant.

---

## 2. Intro to HTTP Requests – What Are Its Parts and Why Are They Important?

- **Q: What comprises an HTTP request and why is each part crucial?**

  - **Method:**  
    - **GET:** For reading data. Generally no body is attached.
    - **POST:** Often used to update or submit data—even when PUT might be conceptually more appropriate.
    - **Other Methods:** PUT, DELETE, HEAD, etc., although GET and POST are most common in web usage.
    
  - **Path and URL:**  
    - **Structure:**  
      The URL consists of the domain (e.g., `www.example.com`) and the path (e.g., `/about`). If nothing is specified, it defaults to `/`.
    - **Query Parameters:**  
      Appended after a `?` (e.g., `/search?q=test&sort=asc`) and fully integrated into the URL.
    - **URL Limitations:**  
      Each server may impose its own maximum length limit (e.g., 2000 or 8000 characters), which is especially important for GET requests.

  - **Protocol Version:**  
    - Typically indicated as HTTP/1.1, ensuring the client and server agree on how to interpret the request.
    
  - **Headers:**  
    - **Key-Value Pairs:**  
      Examples include `Content-Type`, `Content-Length`, `Cookie`, `User-Agent`, and especially the `Host` header.
    - **Host Header Importance:**  
      It became crucial as multiple domains began sharing the same IP address (virtual hosting). This header tells the server which website is intended, and HTTP/1.1 requires it on every request.

  - **Body:**  
    - **Usage:**  
      Mainly present in POST requests; GET requests typically do not carry a body.
    - **Determination:**  
      The `Content-Length` header specifies the size of the body in bytes. When the size is not known in advance, HTTP/1.1 can use `Transfer-Encoding: chunked` instead.
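
To make the parts listed above concrete, here is a minimal sketch that serializes a request by hand. The function name `build_request` and the `User-Agent` value are made up for illustration; a real client would normally rely on a library rather than assemble bytes like this.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def build_request(method: str, url: str, body: bytes = b"",
                  headers: Optional[Dict[str, str]] = None) -> bytes:
    """Serialize a minimal HTTP/1.1 request (illustrative, not a full client)."""
    parts = urlsplit(url)
    # The request target is the path plus any query string; an empty path becomes "/".
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    # Host comes from the URL so a server hosting several sites can route the request.
    all_headers = {"Host": parts.netloc, "User-Agent": "notes-example/0.1"}
    all_headers.update(headers or {})
    if body:
        # Content-Length tells the server how many body bytes follow the blank line.
        all_headers["Content-Length"] = str(len(body))

    lines = [f"{method} {target} HTTP/1.1"]  # method, target, protocol version
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"   # a blank line ends the header section
    return head.encode("ascii") + body


print(build_request("GET", "http://www.example.com/search?q=test&sort=asc").decode())
```

The request line carries the method, path with query parameters, and protocol version; the headers follow one per line; and the body, if any, comes after the blank line.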

---

## 3. Intro to HTTP Responses – How Is Data Sent Back and Why Does Format Matter?

- **Q: How is an HTTP response structured and what is the significance of its elements?**

  - **Status Line:**  
    - **Protocol & Status Code:**  
      E.g., `HTTP/1.1 200 OK` where `200` indicates success.
    - **Reason Phrase:**  
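      The reason phrase (e.g., "OK") is a human-readable description that is redundant with the status code; clients are expected to ignore it, and HTTP/2 dropped it entirely. (The lecture mentions a Chrome bug involving inconsistent handling of these descriptions across HTTP versions.)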
    
  - **Headers:**  
    - **Metadata:**  
      Includes information on content type, content length, redirection (via the `Location` header), and other control data.
    - **Role:**  
      They define how the response body should be processed by the client.
    
  - **Body:**  
    - **Content Delivery:**  
      Contains the actual data being returned (HTML content, JSON, images, etc.).
    - **Defined by Headers:**  
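      The `Content-Length` header is used to determine when the body ends. Without it, an HTTP/1.1 response is delimited by chunked transfer encoding or by the server closing the connection.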
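
The response structure described above can be illustrated with a small parser. This is a sketch only: `parse_response` is a hypothetical helper that handles the `Content-Length` case and ignores chunked encoding, repeated headers, and malformed input.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.x response into status, headers and body (Content-Length only)."""
    head, _, rest = raw.partition(b"\r\n\r\n")  # a blank line separates headers from body
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: protocol version, numeric code, then an optional reason phrase.
    version, code, *reason = status_line.split(" ", 2)

    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive

    # Content-Length marks where the body ends; later bytes belong to the next message.
    length = int(headers.get("content-length", 0))
    body = rest[:length]
    return version, int(code), (reason[0] if reason else ""), headers, body


raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/plain\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"helloEXTRA")
print(parse_response(raw))
```

In the sample input, the trailing `EXTRA` bytes are not part of the body because `Content-Length` is 5, which is how a client reusing a connection knows where one response stops and the next begins.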

---

## 4. Intro to Connection Management – How Do TCP and TLS Shape HTTP/1.x Communications?

- **Q: Why is TCP used for HTTP/1.x and what roles do TLS and persistent connections play?**

  - **TCP Connection:**  
    - **Setup:**  
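      Every HTTP/1.x connection starts with a TCP three-way handshake, which costs one network round trip before any request can be sent. TCP gives reliable, ordered delivery, but repeating this setup for every request is expensive.
    - **Statelessness & Connection Lifetime:**  
      HTTP is stateless: each request is interpreted on its own, without relying on earlier requests. Separately, HTTP/1.0 closes the connection after a single request/response by default, whereas HTTP/1.1 keeps it open unless `Connection: close` is sent.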
      
  - **Persistent (Keep-Alive) Connections:**  
    - **Purpose:**  
      To reuse the same TCP connection for multiple HTTP requests. This reduces the overhead of establishing new connections repeatedly.
    - **Benefits:**  
      Lower latency, reduced CPU and memory overhead since fewer connections need to be managed.
    - **Drawbacks:**  
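      Without pipelining, only one request can be in flight on the connection at a time, so a page with many resources leaves the connection idle while it waits for each response. Browsers work around this by opening several connections per host (commonly six).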
      
  - **TLS Handshake (Security):**  
    - **Sequence:**  
      After the TCP handshake, if encryption is required, a TLS handshake occurs.
    - **Key Exchange:**  
      Uses Diffie–Hellman (or RSA key exchange in TLS 1.2 and earlier) to agree on a symmetric key; TLS 1.3 removed RSA key exchange.
    - **Symmetric Encryption:**  
      Used for the actual data transfer because it is much faster than asymmetric cryptography, which is why the key must first be agreed securely during the handshake.
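
The effect of persistent connections described above can be observed with the Python standard library alone. The sketch below starts a throwaway local server (the handler `EchoPortHandler` and the helper `fetch_ports` are hypothetical names) that replies with the client's source port; if every reply shows the same port, the requests shared one TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 makes the stdlib server keep the connection open between requests.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Reply with the client's source port: same port means same TCP connection.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


def fetch_ports(paths):
    """Send several GETs over one connection; return the client port seen for each."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPortHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        ports = []
        for path in paths:
            conn.request("GET", path)
            # Each response must be read fully before the connection can be reused.
            ports.append(conn.getresponse().read().decode())
        conn.close()
        return ports
    finally:
        server.shutdown()
        server.server_close()


ports = fetch_ports(["/", "/about", "/search?q=test"])
print(ports, "-> one connection" if len(set(ports)) == 1 else "-> several connections")
```

Only the first request pays for the TCP handshake here; the later ones reuse the open socket, which is the saving that keep-alive is meant to provide.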

---

## 5. Intro to HTTP Pipelining – Can Multiple Requests Improve Performance?

- **Q: What is HTTP pipelining and why is it not widely used?**

  - **Concept of Pipelining:**  
    - **Definition:**  
      Pipelining allows a client to send multiple HTTP requests over a single TCP connection without waiting for each individual response.
    - **Analogy:**  
      Comparable to a CPU pipeline where different stages of multiple instructions run concurrently.
  
  - **Challenges with Pipelining:**  
    - **Response Order:**  
      Responses must be delivered in the same order as the requests. If a response (e.g., a large HTML page) takes longer, it blocks subsequent responses (e.g., small images).
    - **Proxy & Intermediate Devices:**  
      Proxies and other intermediaries may not support pipelining correctly or may not preserve response order, so a client can end up matching a response to the wrong request.
    - **Security Risks (HTTP Smuggling):**  
      When a front-end proxy and a back-end server disagree on where a request ends (e.g., duplicate `Content-Length` headers with different values, or `Content-Length` combined with `Transfer-Encoding`), an attacker can hide a second request inside the first, creating a vulnerability.
  
  - **Outcome:**  
    Pipelining is disabled by default in most HTTP/1.1 implementations due to these reliability and security challenges.
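
The response-order problem can be shown with a toy model. The sketch below assumes the server prepares responses in parallel and ignores transfer time; the function name `pipelined_delivery` and the timings are invented for illustration.

```python
from itertools import accumulate


def pipelined_delivery(ready_times):
    """Delivery time of each response when responses must leave in request order.

    ready_times[i] is when the server has finished preparing response i.
    Response i cannot be sent before every earlier response has been sent,
    so its delivery time is the running maximum of the ready times.
    """
    return list(accumulate(ready_times, max))


# Hypothetical workload: a slow HTML page requested first, then two small images.
ready = [300, 20, 25]  # milliseconds until each response is ready
print("ready at:    ", ready)
print("delivered at:", pipelined_delivery(ready))
```

Under these assumptions the two images are ready after 20 ms and 25 ms but cannot be delivered until the 300 ms page ahead of them has gone out, which is the head-of-line blocking described above.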

---

## 6. Intro to HTTP/2 – How Does It Improve Upon HTTP/1.x?

- **Q: What are the key improvements of HTTP/2 over HTTP/1.x, and how do these changes enhance performance and security?**

  - **Multiplexing:**  
    - **Parallel Requests:**  
      Multiple HTTP requests and responses can be interleaved over a single TCP connection using stream identifiers.
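    - **Reduced Head-of-Line Blocking:**  
      Unlike HTTP/1.1 pipelining, multiplexing lets responses be delivered as soon as they are ready, in any order. This removes head-of-line blocking at the HTTP layer; because all streams still share one TCP connection, a lost packet can still stall every stream until it is retransmitted.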
    
  - **Header Compression:**  
    - **Compression Benefits:**  
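      HTTP/2 compresses headers with HPACK, which shrinks the repetitive headers sent on every request. Bodies could already be compressed in HTTP/1.x via `Content-Encoding` (e.g., gzip).
    - **Security Concerns:**  
      Earlier compression schemes (TLS-level compression and the header compression in SPDY, HTTP/2's predecessor) were disabled because of the "CRIME" attack; HPACK was designed to avoid that weakness.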
    
  - **Server Push:**  
    - **Proactive Resource Delivery:**  
      Servers can “push” resources to the client in anticipation of its needs.  
    - **Real-World Adoption:**  
      This feature did not become widely used or supported.
    
  - **Security by Default:**  
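    - **TLS in Practice:**  
      The HTTP/2 specification allows cleartext use, but browsers only support HTTP/2 over TLS. Encryption also guards against protocol ossification (where traditional network devices block or interfere with new protocols), since those devices cannot inspect traffic they cannot read.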
    - **Protocol Negotiation:**  
      Uses TLS extensions like ALPN (Application-Layer Protocol Negotiation) to negotiate HTTP/2 usage before any application data is sent.
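
The multiplexing idea from the start of this section can be sketched with plain tuples. This is a deliberate simplification: real HTTP/2 frames carry a binary header with length, type, flags, and stream identifier, and the helper names below (`to_frames`, `interleave`, `reassemble`) are hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest


def to_frames(stream_id, payload, size):
    """Cut one message into (stream_id, chunk) frames of at most `size` bytes."""
    return [(stream_id, payload[i:i + size]) for i in range(0, len(payload), size)]


def interleave(*streams):
    """Round-robin frames from several streams onto one shared connection."""
    wire = []
    for group in zip_longest(*streams):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire):
    """Receiver side: group frames by stream ID to rebuild each message."""
    messages = defaultdict(bytes)
    for stream_id, chunk in wire:
        messages[stream_id] += chunk
    return dict(messages)


# Client-initiated HTTP/2 streams use odd identifiers, hence 1 and 3.
html = to_frames(1, b"<html>a large page</html>", 8)
image = to_frames(3, b"PNGDATA", 8)
wire = interleave(html, image)
print(wire)
print(reassemble(wire))
```

The small image on stream 3 is complete after the second frame on the wire, even though the page on stream 1 is still being sent; the stream identifier is what lets the receiver sort the frames back into separate responses.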

---

## 7. Intro to HTTP/3 – What Problems Does It Address and How?

- **Q: How does HTTP/3 address the limitations found in HTTP/2 and previous versions?**

  - **Transition to QUIC:**  
    - **Underlying Transport:**  
      Unlike HTTP/1.x and HTTP/2, which run on top of TCP, HTTP/3 is designed to run over QUIC—a protocol built on top of UDP with built-in congestion control.
    - **Benefits over TCP:**  
      QUIC reduces the latency involved in connection establishment and avoids the head-of-line blocking issues inherent in TCP.
    
  - **Enhanced Performance:**  
    - **Faster Handshakes:**  
      QUIC combines the transport handshake with the TLS 1.3 handshake, so a new connection is typically ready after a single round trip.
    - **Robustness in Packet Loss:**  
      Because QUIC recovers from loss per stream, a lost packet delays only the stream it belongs to while the other streams keep flowing.
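
The handshake savings can be put into rough numbers. The round-trip counts below are the commonly cited figures for a fresh connection without session resumption, and the 50 ms round-trip time is an arbitrary example value, so treat this as back-of-the-envelope arithmetic rather than a measurement.

```python
# Round trips spent on handshakes before the first HTTP request can be sent
# (fresh connection, no session resumption; commonly cited figures).
HANDSHAKE_RTTS = {
    "HTTP/1.1 or HTTP/2 over TCP + TLS 1.2": 1 + 2,  # TCP handshake + two TLS round trips
    "HTTP/1.1 or HTTP/2 over TCP + TLS 1.3": 1 + 1,  # TCP handshake + one TLS round trip
    "HTTP/3 over QUIC": 1,                            # transport and TLS 1.3 setup combined
}


def setup_delay_ms(rtt_ms):
    """Handshake delay for each protocol stack at a given network round-trip time."""
    return {stack: rtts * rtt_ms for stack, rtts in HANDSHAKE_RTTS.items()}


for stack, delay in setup_delay_ms(50).items():  # 50 ms RTT is an example value
    print(f"{stack}: {delay} ms")
```

The gap grows with the round-trip time, which is why the combined handshake matters most on high-latency links such as mobile networks.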

---

## 8. Summary – What Are the Key Takeaways?

- **HTTP/1.x:**  
  - Simple, straightforward, and historically reliable.
  - HTTP/1.0 uses a separate TCP connection for each request unless keep-alive is requested; HTTP/1.1 keeps connections open by default.
  - Suffers from inefficiencies like high overhead per connection and limited support for pipelining.

- **Connection Management:**  
  - TCP’s 3-way handshake and subsequent TLS handshake (if used) create latency.
  - Persistent connections (via keep-alive) help mitigate these delays, while pipelining showed potential but was limited by reliability challenges.

- **HTTP/2 Enhancements:**  
  - Multiplexing allows parallel requests on a single connection.
  - Header compression (HPACK) reduces per-request overhead; body compression was already available in HTTP/1.x.
  - Security is improved by making TLS the norm and using ALPN for protocol negotiation.

- **HTTP/3 Evolution:**  
  - Moves away from TCP to QUIC, offering lower latency and better handling of packet loss.
  - Represents a significant step forward in addressing the head-of-line blocking issues that have long hampered HTTP performance.

---

These notes provide a detailed yet intuitive summary of how HTTP has evolved from version 1.x to HTTP/3, highlighting the reasons for each change and the trade-offs involved. Use this structure as a reference guide when studying the protocol’s nuances and its practical implications for modern web communication.