2.2 The Web and HTTP (part 1)
Overview of HTTP and Web Connections
Introduction to HTTP
- The session begins with an overview of the web and HTTP, focusing on persistent and non-persistent connections, major messages (request and response), and cookies for state persistence.
- This is the first part of a two-part series on HTTP, indicating that there will be more detailed discussions in subsequent sections.
Understanding Web Pages
- A web page consists of a base HTML file along with referenced objects (e.g., images, audio files), each addressable by a URL which includes a hostname and path.
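To make the hostname/path split concrete, here is a minimal sketch using Python's standard `urllib.parse`. The URL and the helper name `split_url` are made up for illustration and do not come from the lecture.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return the (hostname, path) pair that identifies a web object."""
    parts = urlsplit(url)
    return parts.hostname, parts.path


# Hypothetical URL, used only as an example.
hostname, path = split_url("http://www.example.com/someDept/pic.gif")
print(hostname)  # the server to contact
print(path)      # the object to request from that server
```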
Client-Server Model
- HTTP operates on the client-server model: the client (a web browser or an embedded device) requests web objects, and the server sends objects in response.
- An example is given where a PC running Firefox makes an HTTP request to a server, demonstrating how client and server communicate using the HTTP protocol.
HTTP Transactions
TCP Connection Establishment
- An HTTP transaction involves opening a TCP connection to a web server on port 80, exchanging one or more messages before closing the connection.
Stateless Nature of HTTP
- HTTP is described as stateless: the server keeps no information about past client requests, so each request-response cycle is independent.
- The simplicity of being stateless avoids complications related to multi-step transactions and cleanup issues when failures occur.
Types of HTTP Connections
Persistent vs Non-Persistent Connections
- There are two types of connections: persistent (multiple objects transferred over one TCP connection) and non-persistent (one object per TCP connection).
Non-Persistent Connections
- In non-persistent connections, downloading several objects requires a separate TCP connection for each one, so every object pays the connection-setup round-trip time (RTT) again.
Persistent Connections
- Persistent connections allow multiple objects to be sent serially over one established TCP connection. This is the behavior introduced with HTTP version 1.1.
Example Workflow in Non-Persistent HTTP
Step-by-Step Process
- An example illustrates how non-persistent HTTP works when requesting a webpage containing text and references to JPEG images:
- Step 1a: Client initiates TCP connection at port 80.
- Step 1b: Server accepts the connection but no requests have been made yet.
Requesting Objects
- After establishing the connection, the client sends an HTTP request message for the base HTML file. The server processes this request and sends back the response message containing requested content.
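The steps above can be sketched end to end on one machine. This is a toy illustration, not a real web server: it listens on a loopback port chosen by the operating system (instead of port 80, which usually needs privileges), serves exactly one hard-coded object, and then closes the connection. The function names `serve_once`, `fetch`, and `demo` are hypothetical.

```python
import socket
import threading


def serve_once(listener):
    """Accept one TCP connection, answer one request, then close it."""
    conn, _ = listener.accept()                # step 1b: server accepts the connection
    with conn:
        request = b""
        while b"\r\n\r\n" not in request:      # read until the blank line that ends the headers
            chunk = conn.recv(1024)
            if not chunk:
                break
            request += chunk
        body = b"<html>hello</html>"           # stand-in for the base HTML file
        head = f"HTTP/1.0 200 OK\r\nContent-Length: {len(body)}\r\n\r\n"
        conn.sendall(head.encode("ascii") + body)
    # Leaving the with-block closes the connection: one object per connection.


def fetch(host, port, path):
    """Open a TCP connection, send one GET request, read until the server closes."""
    with socket.create_connection((host, port), timeout=5) as sock:  # step 1a
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        data = b""
        while True:
            chunk = sock.recv(1024)
            if not chunk:                      # empty read: server closed the connection
                break
            data += chunk
    return data


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))            # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    server.start()
    raw = fetch("127.0.0.1", port, "/index.html")
    server.join(timeout=5)
    listener.close()
    return raw


if __name__ == "__main__":
    print(demo().decode("ascii"))
```

Fetching each referenced JPEG in the non-persistent style would repeat `fetch` with a fresh connection every time.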
Response Time Analysis
Components of Response Time
- The response time for non-persistent HTTP includes:
- One RTT for initiating the TCP connection,
- Another RTT for sending the request and receiving the first bytes of the response,
- Plus the time needed to transmit the file itself.
Understanding HTTP Connections and Messages
Non-Persistent vs. Persistent HTTP Connections
- The non-persistent HTTP response time is two round-trip times (RTTs) plus the file transmission time, so every web object fetched costs at least two RTTs of latency.
- Multiple objects can be retrieved in parallel over separate connections; even so, reducing the per-object latency from two RTTs to one RTT is desirable for faster information retrieval.
- Persistent connections, introduced in HTTP/1.1, allow the server to keep the connection open after sending a response, enabling subsequent messages without waiting for a new TCP connection.
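As a rough worked comparison, the two models can be written as small functions. This is a simplified sketch under stated assumptions: objects are fetched one after another (no parallel connections, no pipelining), every object has the same transmission time, and `num_objects` counts the base HTML file as well. The function names and the numbers in the example are illustrative only.

```python
def non_persistent_time(rtt, transmit_time, num_objects):
    """Every object pays 2 RTTs (TCP setup + request/response) plus its transmission time."""
    return num_objects * (2 * rtt + transmit_time)


def persistent_time(rtt, transmit_time, num_objects):
    """One RTT opens the connection once; each object then costs 1 RTT plus transmission."""
    return rtt + num_objects * (rtt + transmit_time)


# Example in milliseconds: base HTML file plus 10 images, RTT of 100 ms,
# 10 ms to transmit each object (made-up values).
print(non_persistent_time(100, 10, 11))  # 2310
print(persistent_time(100, 10, 11))      # 1310
```

For a single object both formulas give the same 2 RTTs plus transmission time; the saving appears only from the second object onward.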
Structure of HTTP Messages
- There are two types of HTTP messages: request messages and response messages. The protocol defines their format and the order in which they are exchanged.
Request Messages
- A request message starts with a request line containing the method (e.g., GET), URL, and HTTP version, terminated by a carriage return and line feed (CRLF).
- Header lines provide additional information such as host name, browser type (e.g., Firefox), accepted object types, preferred language (e.g., US English), and whether the connection should be kept open.
- The header section ends with an empty line (a CRLF on its own). The whole message is plain ASCII text, designed to be human-readable.
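The layout described above can be assembled by hand in a few lines. This is a minimal sketch; the helper `build_request`, the host name, and the header values are illustrative assumptions, not taken from the lecture.

```python
def build_request(method, path, headers):
    """Assemble a request line, header lines, and the terminating empty line."""
    lines = [f"{method} {path} HTTP/1.1"]                       # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Each line ends with CRLF; one extra CRLF forms the empty line that ends the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


message = build_request(
    "GET",
    "/index.html",
    {
        "Host": "www.example.com",        # hypothetical host
        "User-Agent": "Firefox",          # browser type
        "Accept": "text/html",            # accepted object types
        "Accept-Language": "en-US",       # preferred language
        "Connection": "keep-alive",       # ask for a persistent connection
    },
)
print(message)
```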
Types of Request Methods
- Common methods include:
- GET: Retrieve data from the server.
- POST: Upload form data to the server.
- PUT: Upload a new object to the server, or replace the existing object, at a specified URL.
- HEAD: Similar to GET but retrieves only headers without body content.
Response Messages Overview
- A response message begins with a status line that includes the HTTP version used (e.g., 1.1), followed by a status code (e.g., 200) and a status phrase (e.g., OK).
- Important components include:
- Status code indicating success or failure of requests (e.g., 200 for success).
- Short status phrases providing context about the response.
Response Headers
- Following the status line are header lines that give additional details like:
- Date and time when the response was sent.
- Server type (e.g., Apache version).
- Last modified date of documents and content length/type.
Common Response Status Codes
- Examples include:
- 200 OK: Indicates successful request processing.
- 404 Not Found: Indicates requested document was not found on the server.
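To show how the status line and header lines fit together, here is a small parsing sketch. The sample response text is invented for illustration, and `parse_response` is a hypothetical helper that handles only this simple, well-formed case.

```python
def parse_response(raw):
    """Split a response into version, status code, status phrase, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")          # empty line separates headers from body
    status_line, *header_lines = head.split("\r\n")
    version, code, phrase = status_line.split(" ", 2)  # the phrase may contain spaces
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")           # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body


# Illustrative response; the values are made up.
sample = (
    "HTTP/1.1 404 Not Found\r\n"
    "Date: Tue, 08 Sep 2020 00:53:20 GMT\r\n"
    "Server: Apache\r\n"
    "Content-Length: 9\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "Not here."
)
version, code, phrase, headers, body = parse_response(sample)
print(version, code, phrase)
print(headers["server"], headers["content-length"])
```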
Statelessness in HTTP
- Although HTTP is stateless, servers can maintain user state through cookies which store user-related information between transactions.
Cookies Mechanism
Understanding Cookies in HTTP Interactions
How Cookies Function in HTTP Requests
- A cookie is first sent from server to client in a Set-Cookie header line of the HTTP response; the client then returns it in a Cookie header line on later requests, allowing the server to remember user interactions across multiple requests.
- In an example scenario, a client makes several requests to an Amazon server, which stores cookie-related information in its backend database.
- Initially, the client sends a request without cookies; upon receiving this request, the Amazon server creates and stores a cookie before responding with it.
- The subsequent request includes the cookie value, enabling the server to tailor responses based on previous interactions (e.g., offering deals on items viewed).
- This mechanism allows for personalized experiences; for instance, if a user returns after some time, the server can remind them of past interests.
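The exchange above can be simulated without any network. The sketch below is a toy model: `ToyServer`, the cookie name `id`, the starting ID 1678, and the paths are all assumptions made for illustration, and the "backend database" is just a dictionary. Cookie parsing uses the standard `http.cookies.SimpleCookie`.

```python
from http.cookies import SimpleCookie


class ToyServer:
    """Hypothetical server that keeps per-user request history keyed by a cookie ID."""

    def __init__(self):
        self.next_id = 1678          # arbitrary starting ID
        self.database = {}           # cookie ID -> list of paths requested so far

    def handle(self, path, cookie_header=None):
        """Process one request; return (response_headers, history_before_this_request)."""
        response_headers = {}
        user_id = None
        if cookie_header:
            jar = SimpleCookie()
            jar.load(cookie_header)  # parse the client's Cookie header value
            if "id" in jar:
                user_id = jar["id"].value
        if user_id not in self.database:
            # First visit (or unknown cookie): create an ID and hand it to the client.
            user_id = str(self.next_id)
            self.next_id += 1
            self.database[user_id] = []
            response_headers["Set-Cookie"] = f"id={user_id}"
        history = list(self.database[user_id])
        self.database[user_id].append(path)
        return response_headers, history


server = ToyServer()
first_headers, first_history = server.handle("/books")            # no cookie yet
cookie = first_headers["Set-Cookie"]                              # client stores this value
second_headers, second_history = server.handle("/music", cookie)  # cookie sent back
print(cookie, second_history)
```

HTTP itself stays stateless here: all of the memory lives in the server's own storage, and the cookie only carries the key used to look it up.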
Uses and Privacy Concerns of Cookies
- Cookies help maintain state about users between HTTP transactions—useful for remembering logins or shopping cart contents.
- However, there are significant privacy concerns associated with cookies; they can track user behavior across different websites through third-party cookies.
- The EU's GDPR mandates that non-essential cookies require explicit consent from users before activation and data collection.
- Many websites now prompt users to agree to cookie policies before accessing their services due to these regulations.
Summary of Key Learnings
- The discussion covered various aspects of web interactions including types of HTTP connections and how cookies function as state management tools on servers.