The Browser Workflow, part 1
Step 1. User Types URL
The UI handles user input and triggers the navigation process.
Component Involved: User Interface
Step 2. URL Parsing
The browser engine parses the URL into its components: protocol (HTTP/HTTPS), domain, path, query parameters, and fragment.
Component Involved: Browser Engine
URL Structure: scheme://username:password@hostname:port/path?query#fragment
More details on URL:
- RFC 3986: Uniform Resource Identifier (URI): Generic Syntax
- RFC 1738: Uniform Resource Locators (URL)
Components of a URL
- Scheme: Specifies the protocol used to access the resource (e.g., http, https, ftp). Example:
https
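- Userinfo (optional): A username and, optionally, a password used for authentication. RFC 3986 deprecates the user:password form because it exposes credentials in clear text. Example: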
username:password
- Hostname: Specifies the domain name or IP address where the resource is hosted. Example:
www.example.com
- Port (optional): Specifies the port number to connect to the host. Example:
:8080
- Path: Specifies the specific resource within the host that the client wants to access. Example:
/path/to/resource
- Query (optional): Provides additional parameters to the resource, often used in the form of key-value pairs. Example:
?key1=value1&key2=value2
- Fragment (optional): Refers to a subsection within the resource. Example:
#section1
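The components listed above can be pulled apart with Python's standard urllib.parse module. This is a minimal sketch, not the parser a browser engine actually uses; the URL is a made-up example assembled from the component examples in this section.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL that exercises every component described above.
url = ("https://username:password@www.example.com:8080"
       "/path/to/resource?key1=value1&key2=value2#section1")

parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.username)  # username
print(parts.password)  # password
print(parts.hostname)  # www.example.com
print(parts.port)      # 8080
print(parts.path)      # /path/to/resource
print(parts.query)     # key1=value1&key2=value2
print(parts.fragment)  # section1

# parse_qs maps each key to a list of values, because a key may repeat.
params = parse_qs(parts.query)
print(params)          # {'key1': ['value1'], 'key2': ['value2']}
```

Note that urlsplit returns the port as an integer and strips the leading "?" and "#" from the query and fragment.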
A URI is a more general term that encompasses both URLs and URNs (Uniform Resource Names). It can be used to identify a resource either by location (URL) or by name (URN).
Components of a URI:
- Scheme: Defines the namespace of the URI (e.g., http, https, urn). Example:
urn
- Path: Defines the specific resource within the namespace. Example:
path/to/resource
A URI can be either a URL or a URN, depending on how it identifies the resource:
- URL: A URI that provides a means of locating the resource. Example:
https://www.example.com/path/to/resource
- URN: A URI that names the resource but does not specify its location. Example:
urn:isbn:0451450523
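As a rough illustration of the URL/URN distinction, the sketch below labels a URI by looking only at its scheme. The helper name describe_uri is an assumption made for this example, and checking the scheme alone is a simplification of what the RFCs define.

```python
from urllib.parse import urlsplit

def describe_uri(uri: str) -> str:
    """Label a URI as 'URN' or 'URL' based only on its scheme (a simplification)."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme == "urn":
        return "URN"
    if scheme:
        return "URL"
    return "relative reference"

# For a URN, everything after "urn:" ends up in the path component.
urn_parts = urlsplit("urn:isbn:0451450523")
print(urn_parts.scheme, urn_parts.path)

print(describe_uri("https://www.example.com/path/to/resource"))
print(describe_uri("urn:isbn:0451450523"))
```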
Step 3. DNS Resolution
The browser checks its DNS cache. If the IP address is not cached, it performs a DNS query to resolve the domain.
Component Involved: Networking
1. User Types URL
The user types a URL (e.g., www.example.com) in the browser’s address bar. The browser’s User Interface (UI) captures the input and initiates the navigation process.
2. URL Parsing
The browser engine parses the URL to extract its components: protocol (HTTP/HTTPS), domain (example.com), path (/index.html), query parameters, and fragment identifiers.
3. Check Browser Cache
The browser checks its local DNS cache to see if the IP address for the domain is already stored.
If the IP address is found, the browser uses it, skipping further DNS resolution steps.
4. Check OS Cache
If the IP address is not in the browser cache, the browser queries the operating system’s DNS cache.
The OS maintains a DNS cache for all applications. If the IP address is found here, it is returned to the browser.
5. Query the DNS Resolver
If the IP address is not found in the OS cache, the OS forwards the request to the configured DNS resolver.
This resolver is usually provided by the user’s Internet Service Provider (ISP) or a public DNS service like Google DNS (8.8.8.8) or Cloudflare DNS (1.1.1.1).
6. Recursive DNS Resolution
Root DNS Servers: The DNS resolver queries a root server, which responds with the address of a TLD (Top-Level Domain) server responsible for .com domains.
TLD DNS Servers: The resolver then queries the TLD server, which responds with the address of an authoritative DNS server for example.com.
Authoritative DNS Servers: The resolver queries the authoritative server, which responds with the final IP address for www.example.com. If the authoritative server does not hold a record for the name, it either refers the resolver to another server or returns a negative answer.
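The root, TLD, and authoritative hops can be simulated with a few dictionaries. This is a toy model only: the server names are invented for the example, the address comes from the 192.0.2.0/24 documentation range, and a real resolver speaks the DNS wire protocol rather than doing dictionary lookups.

```python
from typing import List, Optional, Tuple

# Toy delegation data. All names and the address are assumptions for illustration.
ROOT_SERVER = {"com": "tld-server-for-com"}
TLD_SERVERS = {"tld-server-for-com": {"example.com": "ns-for-example-com"}}
AUTHORITATIVE_SERVERS = {"ns-for-example-com": {"www.example.com": "192.0.2.10"}}

def resolve(hostname: str) -> Tuple[Optional[str], List[str]]:
    """Walk root -> TLD -> authoritative and return (address, trace of hops)."""
    name = hostname.rstrip(".").lower()
    labels = name.split(".")
    trace: List[str] = []

    # 1. The root server points to the server responsible for the TLD.
    tld_server = ROOT_SERVER.get(labels[-1])
    trace.append(f"root: referral for .{labels[-1]} -> {tld_server}")
    if tld_server is None:
        return None, trace

    # 2. The TLD server points to the authoritative server for the domain.
    domain = ".".join(labels[-2:])
    auth_server = TLD_SERVERS[tld_server].get(domain)
    trace.append(f"tld: referral for {domain} -> {auth_server}")
    if auth_server is None:
        return None, trace

    # 3. The authoritative server gives the final answer (or a negative one).
    address = AUTHORITATIVE_SERVERS[auth_server].get(name)
    trace.append(f"authoritative: answer for {name} -> {address}")
    return address, trace

address, hops = resolve("www.example.com")
print(address)
for hop in hops:
    print(hop)
```

Returning None stands in for a negative answer; the trace makes each referral visible.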
7. Caching the Response
The IP address is returned to the DNS resolver, which caches it to optimize future queries. The resolver forwards the IP address to the OS, which also caches it. Finally, the OS returns the IP address to the browser, which caches it as well.
Each cached record carries a Time To Live (TTL) value, set in the domain’s DNS records, that determines how long a cache may keep the IP address before it must be resolved again.
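The caching behaviour can be sketched as a small TTL cache placed in front of a slower lookup. The class name TtlDnsCache and the lookup callback are hypothetical, and real browser and OS caches are more involved (size limits, negative caching, and so on).

```python
import time
from typing import Callable, Dict, Tuple

class TtlDnsCache:
    """Cache hostname -> IP answers until their TTL expires (illustrative only)."""

    def __init__(self,
                 lookup: Callable[[str], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup    # returns (ip_address, ttl_seconds)
        self._clock = clock      # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}  # hostname -> (ip, expiry)

    def resolve(self, hostname: str) -> str:
        now = self._clock()
        cached = self._entries.get(hostname)
        if cached is not None and cached[1] > now:
            return cached[0]                     # cache hit, still fresh
        ip, ttl = self._lookup(hostname)         # miss or expired: ask upstream
        self._entries[hostname] = (ip, now + ttl)
        return ip

# Usage with a stand-in upstream resolver (address from the documentation range).
def upstream(hostname: str) -> Tuple[str, float]:
    return "192.0.2.10", 300.0

cache = TtlDnsCache(upstream)
print(cache.resolve("www.example.com"))  # first call goes upstream
print(cache.resolve("www.example.com"))  # second call is served from the cache
```

Stacking several such caches (browser, OS, resolver) gives the layered lookup described in steps 3 to 7.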
8. Using the IP Address
The browser uses the resolved IP address to establish a TCP connection with the web server.
For HTTPS connections, an SSL/TLS handshake is performed to secure the connection. Once the connection is established, the browser sends an HTTP request to retrieve the desired web page or resource.
Key Components in DNS Resolution
Browser Cache:
– Stores recently resolved domain names and their corresponding IP addresses for quick retrieval.
– Reduces the need for repeated DNS lookups, improving performance.
Operating System Cache:
– Maintains a cache of DNS lookups for all applications running on the system.
– Acts as a secondary layer of caching after the browser cache.
DNS Resolver:
– A server that handles DNS queries for client machines.
– Performs recursive lookups by querying other DNS servers on behalf of the client.
– Caches responses to optimize future queries.
Root DNS Servers:
– The highest level in the DNS hierarchy.
– Direct queries to the appropriate Top-Level Domain (TLD) servers.
TLD DNS Servers:
– Servers responsible for specific top-level domains (e.g., .com, .org, .net).
– Direct queries to the appropriate authoritative DNS servers for the requested domain.
Authoritative DNS Servers:
– Servers that have the final answer for DNS queries.
– Store the DNS records for specific domains and provide the IP address associated with the requested domain name.
DNS Spoofing/Cache Poisoning
– Attackers can insert false DNS records into the cache of a resolver, redirecting traffic to malicious sites.
– Mitigation: Use DNSSEC (DNS Security Extensions) to ensure DNS responses are authenticated.
DNS over HTTPS (DoH) and DNS over TLS (DoT)
– Encrypt DNS queries to prevent eavesdropping and manipulation.
– Implementation: Modern browsers and operating systems are increasingly supporting these protocols to enhance security and privacy.
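To make DoH a little more concrete, the sketch below builds a DNS query in wire format and encodes it the way a DoH GET request carries it (base64url without padding in a "dns" parameter, per RFC 8484). It performs no network traffic, and the resolver URL is a placeholder, not a real service.

```python
import base64
import struct

def build_dns_query(hostname: str, qtype: int = 1) -> bytes:
    """Build a minimal DNS query: one question, recursion desired, ID 0."""
    # Header: ID, flags (0x0100 = RD), QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A record) and QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)

def doh_get_url(resolver_url: str, hostname: str) -> str:
    """Return the GET URL a DoH client would request over HTTPS."""
    query = build_dns_query(hostname)
    encoded = base64.urlsafe_b64encode(query).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={encoded}"

# "https://dns.example/dns-query" is a hypothetical resolver endpoint.
print(doh_get_url("https://dns.example/dns-query", "www.example.com"))
```

Because the query travels inside an ordinary HTTPS request, an on-path observer sees only encrypted traffic to the resolver.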
Step 4. Establishing a Connection
The browser establishes a TCP connection to the server using the resolved IP address.
Component Involved: Networking
1. Client Sends SYN
The client sends a TCP segment with the SYN flag set to initiate the connection to the server at the resolved IP address and port (typically port 80 for HTTP or port 443 for HTTPS).
2. Server Responds with SYN-ACK
The server responds with a TCP segment that has both SYN and ACK flags set, acknowledging the client’s request to establish a connection.
3. Client Sends ACK
The client sends a final acknowledgment (ACK) to the server, completing the TCP three-way handshake.
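Application code never sends SYN or ACK segments itself; the operating system performs the three-way handshake when a program calls connect. The sketch below shows this on the loopback interface so it needs no external server. The function name loopback_handshake is an assumption for this example.

```python
import socket

def loopback_handshake() -> bool:
    """Open a TCP connection to a local listener; the kernel does SYN/SYN-ACK/ACK."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    try:
        # connect() returns only after the handshake has completed.
        client = socket.create_connection(listener.getsockname(), timeout=5)
        # accept() hands over the connection the handshake already established.
        conn, peer = listener.accept()
        try:
            # The server sees the client's address and ephemeral port.
            return peer == client.getsockname()
        finally:
            conn.close()
            client.close()
    finally:
        listener.close()

print("connection established:", loopback_handshake())
```

A browser does the same thing against the resolved IP address on port 80 or 443.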
4. Client Sends ClientHello (for HTTPS)
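The client sends a ClientHello message to the server, specifying the SSL/TLS versions, cipher suites, and compression methods it supports. Steps 4 to 15 describe the handshake as used up to TLS 1.2; TLS 1.3 shortens the exchange and drops several of these messages.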
5. Server Responds with ServerHello
The server responds with a ServerHello message, choosing the SSL/TLS version and cipher suite from the options provided by the client.
6. Server Sends Certificate
The server sends its digital certificate to the client. This certificate includes the server’s public key and is issued by a trusted Certificate Authority (CA).
7. Server Sends ServerKeyExchange (if needed)
If the chosen cipher suite requires additional key exchange information (e.g., for Diffie-Hellman key exchange), the server sends this information.
8. Server Requests Client Certificate (if needed):
If the server requires client authentication, it sends a CertificateRequest message.
9. Server Sends ServerHelloDone
The server sends a ServerHelloDone message, indicating it has finished its initial handshake messages.
10. Client Sends Client Certificate (if needed)
If requested by the server, the client sends its digital certificate.
11. Client Sends ClientKeyExchange
The client sends a ClientKeyExchange message. Depending on the cipher suite, it carries either a pre-master secret encrypted with the server’s public key (RSA) or the client’s Diffie-Hellman public value.
12. Client Sends CertificateVerify (if needed)
If client authentication is used, the client sends a CertificateVerify message: a signature over the handshake messages that proves the client holds the private key matching its certificate.
13. Client Sends ChangeCipherSpec
The client sends a ChangeCipherSpec message, indicating that future messages will be encrypted using the negotiated cipher suite.
14. Client Sends Finished
The client sends a Finished message, encrypted with the session key, to verify that the handshake was successful.
15. Server Sends ChangeCipherSpec and Finished
The server responds with its own ChangeCipherSpec and Finished messages, also encrypted with the session key.
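In practice a TLS library exchanges all of these messages on the application's behalf. The sketch below uses Python's ssl module; the individual handshake messages are not visible at this level, only the negotiated result. The helper names are assumptions, and negotiated_parameters needs network access, so it is defined here but not called.

```python
import socket
import ssl
from typing import Optional, Tuple

def make_client_context() -> ssl.SSLContext:
    """Client-side context that verifies the server certificate and hostname."""
    ctx = ssl.create_default_context()            # loads the system's trusted CAs
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    return ctx

def negotiated_parameters(hostname: str, port: int = 443,
                          timeout: float = 5.0) -> Tuple[Optional[str], str]:
    """Connect, run the TLS handshake, and return (protocol version, cipher suite)."""
    ctx = make_client_context()
    with socket.create_connection((hostname, port), timeout=timeout) as raw:
        # wrap_socket performs the whole handshake, including certificate checks.
        with ctx.wrap_socket(raw, server_hostname=hostname) as tls:
            return tls.version(), tls.cipher()[0]

ctx = make_client_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

Calling negotiated_parameters with a reachable HTTPS host would return the protocol version and cipher suite chosen in the ServerHello.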
Post-Handshake Data Transfer
Secure Data Transmission: Once the SSL/TLS handshake is complete, the client and server can securely exchange data over the established encrypted connection.
HTTP Request: The client sends an HTTP request (e.g., GET /index.html) over the established TCP connection. The server processes the request and sends back an HTTP response with the requested content.
Connection Termination
Client Sends FIN: The client sends a TCP segment with the FIN (finish) flag set to indicate it wants to terminate the connection.
Server Sends ACK and FIN: The server acknowledges the client’s FIN with an ACK and sends its own FIN to indicate it is ready to close the connection.
Client Sends Final ACK: The client sends a final ACK to acknowledge the server’s FIN, completing the termination process.
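The termination sequence is also driven by the operating system, but a program can observe it: after one side closes its sending direction, the other side's read returns an empty result. The loopback sketch below illustrates this; graceful_close_demo is a name chosen for the example.

```python
import socket
from typing import Tuple

def read_until_closed(sock: socket.socket) -> bytes:
    """Read until recv() returns b'', which signals that the peer sent FIN."""
    data = b""
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            return data
        data += chunk

def graceful_close_demo() -> Tuple[bytes, bytes]:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=5)
    conn, _ = listener.accept()
    conn.settimeout(5)
    try:
        client.sendall(b"bye")
        client.shutdown(socket.SHUT_WR)     # client is done sending: FIN goes out
        received = read_until_closed(conn)  # server reads the data, then sees the FIN

        conn.sendall(b"ok")                 # server may still send after the client's FIN
        conn.close()                        # server's own FIN
        reply = read_until_closed(client)
        return received, reply
    finally:
        client.close()
        conn.close()
        listener.close()

print(graceful_close_demo())
```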
TCP Packet Structure
A TCP packet, more precisely called a TCP segment, consists of a header and a payload. The header carries the control information TCP needs to manage the transmission, while the payload carries the application data.
- The TCP header occupies 20 to 60 bytes: 20 bytes of fixed fields plus up to 40 bytes of options.
- The payload has variable length. Its size is the total length of the IP packet minus the IP header length and the TCP header length.
Fields in the TCP Header:
Source Port (16 bits):
- The port number of the sender (source) application.
Destination Port (16 bits):
- The port number of the receiver (destination) application.
Sequence Number (32 bits):
- A sequence number for the first byte of data in this segment.
- Used to ensure data is received in order and to facilitate retransmissions.
Acknowledgment Number (32 bits):
- If the ACK flag is set, this number contains the value of the next sequence number that the sender is expecting to receive.
- Acknowledges receipt of data.
Data Offset (4 bits):
- Also known as the header length.
- Specifies the size of the TCP header in 32-bit words.
- The minimum value is 5 (indicating a 20-byte header), and the maximum value is 15 (indicating a 60-byte header).
Reserved (3 bits):
- Reserved for future use.
- Must be set to zero.
Flags (9 bits):
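- NS (1 bit): ECN-nonce concealment protection (experimental, defined in RFC 3540). The current TCP specification, RFC 9293, no longer defines this flag and treats the bit as reserved.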
- CWR (1 bit): Congestion Window Reduced flag. Set by the sender to indicate that it received a TCP segment with the ECE flag set.
- ECE (1 bit): ECN-Echo. Indicates that the IP packet had the CE (congestion experienced) flag set.
- URG (1 bit): Urgent pointer field significant.
- ACK (1 bit): Acknowledgment field significant.
- PSH (1 bit): Push function. Asks to push the buffered data to the receiving application.
- RST (1 bit): Reset the connection.
- SYN (1 bit): Synchronize sequence numbers to initiate a connection.
- FIN (1 bit): No more data from the sender (finish).
Window Size (16 bits):
- Specifies the size of the sender’s receive window (flow control).
- It indicates the number of bytes the sender is willing to accept.
Checksum (16 bits):
- Used for error-checking the header and data.
Urgent Pointer (16 bits):
- If the URG flag is set, this 16-bit field is an offset from the sequence number indicating the last urgent data byte.
Options (Variable, usually 0-40 bytes):
- Optional fields that can extend the header.
- Options include maximum segment size (MSS), window scale, timestamp, and more.
Padding:
- Extra bytes added to ensure the TCP header is a multiple of 32 bits in length.
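The field layout above maps directly onto a fixed 20-byte structure, so it can be unpacked with Python's struct module. The sketch follows the layout described in this section (4-bit data offset, 3 reserved bits, 9 flag bits); the sample segment is hand-built for the example rather than captured from a network.

```python
import struct

# Flag names ordered from the least significant bit upward.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR", "NS"]

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header and split off options and payload."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    data_offset = offset_flags >> 12          # header length in 32-bit words
    header_len = data_offset * 4              # header length in bytes
    flags = [name for bit, name in enumerate(FLAG_NAMES)
             if offset_flags & (1 << bit)]
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "window": window, "checksum": checksum, "urgent": urgent,
        "options": segment[20:header_len],
        "payload": segment[header_len:],
    }

# Hand-built SYN segment: data offset 5 (20 bytes), only the SYN flag set.
sample = struct.pack("!HHIIHHHH", 54321, 443, 1000, 0, (5 << 12) | 0x002, 65535, 0, 0)
header = parse_tcp_header(sample)
print(header["src_port"], header["dst_port"], header["header_len"], header["flags"])
# 54321 443 20 ['SYN']
```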
Step 5. HTTP Request
The browser constructs an HTTP request with headers, cookies, and other relevant data and sends it to the server.
Component Involved: Networking
An HTTP request is a message that a client sends to a server using the Hypertext Transfer Protocol (HTTP) in order to retrieve or manipulate a resource. The sections below break the request down into its components.
Structure of an HTTP Request
An HTTP request consists of several components:
- Request Line
- Headers
- Body (optional)
1. Request Line
The request line is the first line of the HTTP request and contains three elements:
- HTTP Method: Specifies the action to be performed (e.g., GET, POST, PUT, DELETE).
- Request-URI: The Uniform Resource Identifier that identifies the resource on the server.
- HTTP Version: Indicates the HTTP protocol version (e.g., HTTP/1.0, HTTP/1.1). HTTP/2 and later carry the same information in binary frames instead of a text line.
Example:
GET /index.html HTTP/1.1
2. Headers
Headers provide additional information about the request, such as metadata about the client, the resource being requested, and how the client expects the server to handle the request.
Common headers
Host:
- Specifies the domain name of the server and (optionally) the port number.
- Example:
Host: www.example.com
User-Agent:
- Contains information about the client software (e.g., browser type and version).
- Example:
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36
Accept:
- Specifies the media types the client can handle.
- Example:
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language:
- Specifies the preferred languages for the response.
- Example:
Accept-Language: en-US,en;q=0.5
Accept-Encoding:
- Indicates the content encoding (e.g., gzip, deflate) the client can handle.
- Example:
Accept-Encoding: gzip, deflate, br
Connection:
- Controls whether the connection remains open after the request/response cycle.
- Example:
Connection: keep-alive
Cookie:
- Sends stored cookies to the server.
- Example:
Cookie: sessionId=abc123; username=johndoe
Authorization:
- Contains credentials for authenticating the client with the server.
- Example:
Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l
Content-Type:
- Indicates the media type of the body of the request (used with POST and PUT methods).
- Example:
Content-Type: application/json
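The value in the Authorization example above is simply "username:password" encoded with Base64, which is an encoding and not encryption. The short sketch below reproduces it; the helper names are chosen for this example.

```python
import base64
from typing import Tuple

def basic_auth_header(username: str, password: str) -> str:
    """Build an Authorization header line using the Basic scheme."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Authorization: Basic {token}"

def decode_basic_token(token: str) -> Tuple[str, str]:
    """Recover (username, password) from a Basic token; anyone can do this."""
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password

print(basic_auth_header("Aladdin", "OpenSesame"))
# Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l
print(decode_basic_token("QWxhZGRpbjpPcGVuU2VzYW1l"))
# ('Aladdin', 'OpenSesame')
```

Because the token is trivially reversible, Basic authentication is only reasonable over HTTPS.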
3. Body (Optional)
The body of an HTTP request is optional and is typically used with methods like POST and PUT to send data to the server. The body contains the actual data being sent to the server.
POST /api/users HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Cookie: sessionId=abc123; username=johndoe
Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l
Content-Type: application/json
Content-Length: 49

{
"username": "johndoe",
"password": "s3cr3t"
}
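A request like the one above can be assembled by hand to show how the pieces fit together: a request line, header lines, an empty line, then the body. This is a simplified sketch (build_request is a hypothetical helper), and real clients handle many more details. Content-Length is computed from the actual body bytes rather than hard-coded.

```python
import json
from typing import Dict

def build_request(method: str, path: str, host: str,
                  headers: Dict[str, str], body: bytes = b"") -> bytes:
    """Serialize an HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    if body:
        # Content-Length counts the bytes of the body, not its characters.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"   # the empty line ends the header section
    return head.encode("ascii") + body

body = json.dumps({"username": "johndoe", "password": "s3cr3t"}).encode("utf-8")
request = build_request(
    "POST", "/api/users", "www.example.com",
    {"Accept-Encoding": "gzip, deflate, br",
     "Connection": "keep-alive",
     "Content-Type": "application/json"},
    body,
)
print(request.decode("utf-8"))
```

Lines are separated by CRLF, and the body starts right after the first empty line.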
Step 6. Server Response
The server responds with an HTTP response, including status code, headers, and body (HTML content).
When a server receives an HTTP request from a client, it processes the request and sends back an HTTP response. The sections below break the response down into its components.
Structure of an HTTP Server Response
An HTTP server response consists of several components:
- Status Line
- Headers
- Body (optional)
1. Status Line
The status line is the first line of the HTTP response and contains three elements:
- HTTP Version: Indicates the HTTP protocol version (e.g., HTTP/1.0, HTTP/1.1). HTTP/2 and later carry the status in binary frames instead of a text line.
- Status Code: A three-digit code indicating the result of the request (e.g., 200 for success, 404 for not found).
- Reason Phrase: A textual description of the status code.
HTTP/1.1 200 OK
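The three elements of the status line can be separated with a single split, and the first digit of the status code gives its class. The sketch below is illustrative; the function names are assumptions, and the standard http.HTTPStatus enum supplies the usual reason phrases.

```python
from http import HTTPStatus
from typing import Tuple

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Split 'HTTP/1.1 200 OK' into (version, status code, reason phrase)."""
    version, code, *rest = line.strip().split(" ", 2)
    reason = rest[0] if rest else ""   # the reason phrase may be absent
    return version, int(code), reason

def status_class(code: int) -> str:
    """Classify a status code by its first digit."""
    return STATUS_CLASSES[code // 100]

print(parse_status_line("HTTP/1.1 200 OK"))         # ('HTTP/1.1', 200, 'OK')
print(status_class(404), HTTPStatus(404).phrase)    # client error Not Found
```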
2. Headers
Headers provide additional information about the response, such as metadata about the server, the content being sent, and instructions for the client.
Common HTTP Headers:
Date:
- The date and time the response was generated.
- Example:
Date: Fri, 26 Jul 2024 12:28:53 GMT
Server:
- Information about the server software.
- Example:
Server: Apache/2.4.1 (Unix)
Content-Type:
- The media type of the response body.
- Example:
Content-Type: text/html; charset=UTF-8
Content-Length:
- The length of the response body in bytes.
- Example:
Content-Length: 138
Connection:
- Options for the connection (e.g., keep-alive, close).
- Example:
Connection: keep-alive
Set-Cookie:
- Used to send cookies from the server to the client.
- Example:
Set-Cookie: sessionId=abc123; Path=/
Cache-Control:
- Directives for caching mechanisms in both requests and responses.
- Example:
Cache-Control: max-age=3600
Location:
- Used in redirection responses to indicate the URL to redirect to.
- Example:
Location: https://www.example.com/newpage
Content-Encoding:
- The type of encoding used on the data (e.g., gzip).
- Example:
Content-Encoding: gzip
Transfer-Encoding:
- The form of encoding used to safely transfer the payload body to the user (e.g., chunked).
- Example:
Transfer-Encoding: chunked
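With Transfer-Encoding: chunked, the body arrives as a series of chunks, each preceded by its size in hexadecimal, and a zero-size chunk marks the end. The decoder below is a minimal sketch of that framing: it ignores trailers and does no error handling, and decode_chunked is a name chosen for the example.

```python
def decode_chunked(data: bytes) -> bytes:
    """Reassemble a chunked HTTP/1.1 body into the original bytes."""
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with "<size in hex>[;extensions]\r\n".
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";", 1)[0]
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:          # the last chunk has size zero
            break
        body += data[pos:pos + size]
        pos += size + 2        # skip the chunk data and its trailing CRLF
    return bytes(body)

encoded = b"5\r\nHello\r\n7\r\n, World\r\n0\r\n\r\n"
print(decode_chunked(encoded))  # b'Hello, World'
```

Chunking lets a server start sending a response before it knows the total length, which is why such responses carry no Content-Length header.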
3. Body (Optional)
The body of the HTTP response contains the actual content being sent to the client, such as HTML, images, JSON data, etc. The body is optional and depends on the nature of the request and response.
HTTP/1.1 200 OK
Date: Fri, 26 Jul 2024 12:28:53 GMT
Server: Apache/2.4.1 (Unix)
Content-Type: text/html; charset=UTF-8
Content-Length: 138
Connection: keep-alive
Set-Cookie: sessionId=abc123; Path=/
Cache-Control: max-age=3600

<!DOCTYPE html>
<html>
<head>
<title>Example Page</title>
</head>
<body>
<h1>Hello, World!</h1>
<p>This is an example page.</p>
</body>
</html>
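On the receiving side, a client splits the response at the first empty line: everything before it is the status line and headers, everything after it is the body. The parser below is a simplified sketch. parse_response is a hypothetical helper, the sample response is shortened, and repeated headers such as Set-Cookie would overwrite each other in the plain dictionary used here.

```python
from typing import Dict, Tuple

def parse_response(raw: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Split a raw HTTP/1.1 response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")       # the empty line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers: Dict[str, str] = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # header names are case-insensitive
    return int(code), headers, body

page = b"<h1>Hello, World!</h1>"
raw = (
    b"HTTP/1.1 200 OK\r\n"
    + b"Content-Type: text/html; charset=UTF-8\r\n"
    + f"Content-Length: {len(page)}\r\n".encode("ascii")
    + b"Set-Cookie: sessionId=abc123; Path=/\r\n"
    + b"\r\n"
    + page
)

status, headers, body = parse_response(raw)
print(status, headers["content-type"], len(body))
```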
See next item for Step 7.