The Browser Workflow, part 1

Step 1. User Types URL

The UI handles user input and triggers the navigation process.

Component Involved: User Interface


Step 2. URL Parsing

The browser engine parses the URL into its components: protocol (HTTP/HTTPS), domain, path, query parameters, and fragment.

Component Involved: Browser Engine

URL Structure: scheme://username:password@hostname:port/path?query#fragment

More details on URL:

Components of a URL

  • Scheme: Specifies the protocol used to access the resource (e.g., http, https, ftp). Example: https
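  • User info (optional): Credentials used for authentication, written as username:password before the @ sign. Embedding a password in a URL is deprecated and should be avoided. Example: username:password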
  • Hostname: Specifies the domain name or IP address where the resource is hosted. Example: www.example.com
  • Port (optional): Specifies the port number to connect to the host. Example: :8080
  • Path: Specifies the specific resource within the host that the client wants to access. Example: /path/to/resource
  • Query (optional): Provides additional parameters to the resource, often used in the form of key-value pairs. Example: ?key1=value1&key2=value2
  • Fragment (optional): Refers to a subsection within the resource. Example: #section1
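
As a rough illustration of the breakdown above, Python's standard urllib.parse module splits a URL into the same components. The URL below is a made-up example assembled from the sample values in the list, not a real address.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL that exercises every component listed above.
url = ("https://username:password@www.example.com:8080"
       "/path/to/resource?key1=value1&key2=value2#section1")

parts = urlsplit(url)

print(parts.scheme)           # https
print(parts.username)         # username
print(parts.password)         # password
print(parts.hostname)         # www.example.com
print(parts.port)             # 8080
print(parts.path)             # /path/to/resource
print(parse_qs(parts.query))  # {'key1': ['value1'], 'key2': ['value2']}
print(parts.fragment)         # section1
```

Note that the fragment is used only by the client; browsers do not send it to the server.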

A URI is a more general term that encompasses both URLs and URNs (Uniform Resource Names). It can be used to identify a resource either by location (URL) or by name (URN).

Components of a URI:

  • Scheme: Defines the namespace of the URI (e.g., http, https, urn). Example: urn
  • Path: Defines the specific resource within the namespace. Example: path/to/resource

A URI can be either a URL or a URN, depending on how it identifies the resource:

  • URL: A URI that provides a means of locating the resource. Example: https://www.example.com/path/to/resource
  • URN: A URI that names the resource but does not specify its location. Example: urn:isbn:0451450523
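
The difference can be seen by splitting both examples with Python's urllib.parse. This is only a small sketch: the URL has a network location telling the client where to go, while the URN has just a scheme and a name.

```python
from urllib.parse import urlsplit

url = urlsplit("https://www.example.com/path/to/resource")
urn = urlsplit("urn:isbn:0451450523")

# A URL carries a network location (netloc) that says where to fetch the resource.
print(url.scheme, url.netloc, url.path)        # https www.example.com /path/to/resource

# A URN has a scheme and a name, but no network location.
print(urn.scheme, repr(urn.netloc), urn.path)  # urn '' isbn:0451450523
```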

Step 3. DNS Resolution

The browser checks its DNS cache. If the IP address is not cached, it performs a DNS query to resolve the domain.

Component Involved: Networking

1. User Types URL

The user types a URL (e.g., www.example.com) in the browser’s address bar. The browser’s User Interface (UI) captures the input and initiates the navigation process.

2. URL Parsing

The browser engine parses the URL to extract its components: protocol (HTTP/HTTPS), domain (example.com), path (/index.html), query parameters, and fragment identifiers.

3. Check Browser Cache

The browser checks its local DNS cache to see if the IP address for the domain is already stored.

If the IP address is found, the browser uses it, skipping further DNS resolution steps.

4. Check OS Cache

If the IP address is not in the browser cache, the browser queries the operating system’s DNS cache.

The OS maintains a DNS cache for all applications. If the IP address is found here, it is returned to the browser.

5. Query the DNS Resolver

If the IP address is not found in the OS cache, the OS forwards the request to the configured DNS resolver.

This resolver is usually provided by the user’s Internet Service Provider (ISP) or a public DNS service like Google DNS (8.8.8.8) or Cloudflare DNS (1.1.1.1).
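
From an application's point of view, steps 4 and 5 are hidden behind a single call to the operating system. The sketch below uses Python's socket.getaddrinfo, which asks the OS resolver (and therefore its cache and configured DNS resolver) for the addresses of a host. The helper name resolve is made up for this example.

```python
import socket

def resolve(host: str, port: int = 443) -> list[str]:
    """Return the distinct IP addresses the OS resolver reports for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    for *_, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses

# An IP literal needs no DNS lookup at all and is returned unchanged.
print(resolve("127.0.0.1", 80))  # ['127.0.0.1']

# Example that needs network access (the result depends on the resolver):
# resolve("www.example.com")
```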

6. Recursive DNS Resolution

Root DNS Servers: The DNS resolver queries a root server, which responds with the address of a TLD (Top-Level Domain) server responsible for .com domains.

TLD DNS Servers: The resolver then queries the TLD server, which responds with the address of an authoritative DNS server for example.com.

Authoritative DNS Servers: The resolver queries the authoritative server, which responds with the final IP address for www.example.com. If the authoritative server does not hold the record, it either refers the resolver to another server (for example, for a delegated subdomain) or returns a negative answer such as NXDOMAIN.
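
The referral chain can be sketched as a toy simulation. Everything here is an assumption made for illustration: the three dictionaries stand in for real servers, the server names are invented, and 192.0.2.10 is an address from a range reserved for documentation.

```python
# Toy, hard-coded stand-ins for the three server tiers (not real DNS data).
ROOT = {"com": "tld-server.example"}
TLD = {"tld-server.example": {"example.com": "ns1.example.com"}}
AUTHORITATIVE = {"ns1.example.com": {"www.example.com": "192.0.2.10"}}

def resolve_iteratively(hostname: str) -> tuple[str, list[str]]:
    """Walk root -> TLD -> authoritative and return (ip, servers asked)."""
    labels = hostname.split(".")
    tld = labels[-1]                 # "com"
    domain = ".".join(labels[-2:])   # "example.com"

    trail = ["root"]
    tld_server = ROOT[tld]                     # root refers us to the TLD server
    trail.append(tld_server)
    auth_server = TLD[tld_server][domain]      # TLD refers us to the authoritative server
    trail.append(auth_server)
    ip = AUTHORITATIVE[auth_server][hostname]  # authoritative server gives the answer
    return ip, trail

ip, trail = resolve_iteratively("www.example.com")
print(ip)     # 192.0.2.10
print(trail)  # ['root', 'tld-server.example', 'ns1.example.com']
```

A real resolver does the same walk with network queries, and usually skips the upper tiers because their answers are already cached.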

7. Caching the Response

The IP address is returned to the DNS resolver, which caches it to optimize future queries. The resolver forwards the IP address to the OS, which also caches it. Finally, the OS returns the IP address to the browser, which caches it as well.

Each DNS record carries a Time To Live (TTL) value, set by the domain’s authoritative server, that determines how long each cache level may store the IP address.
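
The layered lookup and TTL expiry can be sketched as follows. This is a simplified model, not how any particular browser or OS implements its cache: the class TTLCache, the function lookup, the fake resolver, and the 300-second TTL are all invented for the example, and a controllable clock replaces real time so the expiry is easy to see.

```python
import time

class TTLCache:
    """Minimal DNS-style cache: entries expire after their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # hostname -> (ip, expiry time)

    def put(self, hostname, ip, ttl_seconds):
        self._entries[hostname] = (ip, self._clock() + ttl_seconds)

    def get(self, hostname):
        entry = self._entries.get(hostname)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[hostname]  # expired: force a fresh lookup
            return None
        return ip

def lookup(hostname, caches, resolve):
    """Check caches in order (browser, then OS); fall back to the resolver."""
    for cache in caches:
        ip = cache.get(hostname)
        if ip is not None:
            return ip
    ip, ttl = resolve(hostname)
    for cache in caches:
        cache.put(hostname, ip, ttl)
    return ip

# Demo with a fake clock and a fake resolver that records each call.
now = [0.0]
clock = lambda: now[0]
browser_cache, os_cache = TTLCache(clock), TTLCache(clock)
resolver_calls = []

def fake_resolver(hostname):
    resolver_calls.append(hostname)
    return "192.0.2.10", 300  # (ip, ttl in seconds), placeholder values

lookup("www.example.com", [browser_cache, os_cache], fake_resolver)  # miss
lookup("www.example.com", [browser_cache, os_cache], fake_resolver)  # cache hit
now[0] = 301.0                                                       # TTL elapsed
lookup("www.example.com", [browser_cache, os_cache], fake_resolver)  # miss again
print(len(resolver_calls))  # 2
```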

8. Using the IP Address

The browser uses the resolved IP address to establish a TCP connection with the web server.

For HTTPS connections, an SSL/TLS handshake is performed to secure the connection. Once the connection is established, the browser sends an HTTP request to retrieve the desired web page or resource.

Key Components in DNS Resolution

Browser Cache

– Stores recently resolved domain names and their corresponding IP addresses for quick retrieval.

– Reduces the need for repeated DNS lookups, improving performance.

OS Cache

– Maintains a cache of DNS lookups for all applications running on the system.

– Acts as a secondary layer of caching after the browser cache.

DNS Resolver

– A server that handles DNS queries for client machines.

– Performs recursive lookups by querying other DNS servers on behalf of the client.

– Caches responses to optimize future queries.

Root DNS Servers

– The highest level in the DNS hierarchy.

– Direct queries to the appropriate Top-Level Domain (TLD) servers.

TLD DNS Servers

– Servers responsible for specific top-level domains (e.g., .com, .org, .net).

– Direct queries to the appropriate authoritative DNS servers for the requested domain.

Authoritative DNS Servers

– Servers that have the final answer for DNS queries.

– Store the DNS records for specific domains and provide the IP address associated with the requested domain name.

DNS Spoofing/Cache Poisoning

– Attackers can insert false DNS records into the cache of a resolver, redirecting traffic to malicious sites.

– Mitigation: Use DNSSEC (DNS Security Extensions) to ensure DNS responses are authenticated.

DNS over HTTPS (DoH) and DNS over TLS (DoT)

– Encrypt DNS queries to prevent eavesdropping and manipulation.

– Implementation: Modern browsers and operating systems are increasingly supporting these protocols to enhance security and privacy.


Step 4. Establishing a Connection

The browser establishes a TCP connection to the server using the resolved IP address.

Component Involved: Networking

1. Client Sends SYN

The client sends a TCP segment with the SYN flag set to initiate the connection to the server at the resolved IP address and port (typically port 80 for HTTP or port 443 for HTTPS).

2. Server Responds with SYN-ACK

The server responds with a TCP segment that has both SYN and ACK flags set, acknowledging the client’s request to establish a connection.

3. Client Sends ACK

The client sends a final acknowledgment (ACK) to the server, completing the TCP three-way handshake.
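
The sequence and acknowledgment numbers exchanged in these three steps can be modeled in a few lines. This is a sketch of the arithmetic only, not a network implementation; the Segment class and the initial sequence numbers 1000 and 5000 are arbitrary values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Segment:
    flags: str
    seq: int
    ack: Optional[int] = None

def three_way_handshake(client_isn: int, server_isn: int) -> list[Segment]:
    """Return the three segments exchanged, given both initial sequence numbers."""
    wrap = 2 ** 32  # sequence numbers are 32-bit and wrap around
    syn = Segment("SYN", seq=client_isn)
    # A SYN consumes one sequence number, so the server acknowledges client_isn + 1.
    syn_ack = Segment("SYN-ACK", seq=server_isn, ack=(syn.seq + 1) % wrap)
    # The client does the same for the server's SYN.
    ack = Segment("ACK", seq=(syn.seq + 1) % wrap, ack=(syn_ack.seq + 1) % wrap)
    return [syn, syn_ack, ack]

for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
# Segment(flags='SYN', seq=1000, ack=None)
# Segment(flags='SYN-ACK', seq=5000, ack=1001)
# Segment(flags='ACK', seq=1001, ack=5001)
```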

4. Client Sends ClientHello (for HTTPS)

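The client sends a ClientHello message to the server, specifying the TLS versions, cipher suites, and compression methods it supports. The message sequence in steps 4 to 15 follows TLS 1.2; TLS 1.3 shortens the handshake to one round trip and drops several of these messages, such as ServerHelloDone and ClientKeyExchange.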

5. Server Responds with ServerHello

The server responds with a ServerHello message, choosing the SSL/TLS version and cipher suite from the options provided by the client.

6. Server Sends Certificate

The server sends its digital certificate to the client. This certificate includes the server’s public key and is issued by a trusted Certificate Authority (CA).

7. Server Sends ServerKeyExchange (if needed)

If the chosen cipher suite requires additional key exchange information (e.g., for Diffie-Hellman key exchange), the server sends this information.

8. Server Requests Client Certificate (if needed)

If the server requires client authentication, it sends a CertificateRequest message.

9. Server Sends ServerHelloDone

The server sends a ServerHelloDone message, indicating it has finished its initial handshake messages.

10. Client Sends Client Certificate (if needed)

If requested by the server, the client sends its digital certificate.

11. Client Sends ClientKeyExchange

The client sends a ClientKeyExchange message. With RSA key exchange it carries the pre-master secret encrypted with the server’s public key; with Diffie-Hellman it carries the client’s public key-exchange value.

12. Client Sends CertificateVerify (if needed)

If client authentication is used, the client sends a CertificateVerify message containing a signature over the handshake messages, which proves that it holds the private key matching its certificate.

13. Client Sends ChangeCipherSpec

The client sends a ChangeCipherSpec message, indicating that future messages will be encrypted using the negotiated cipher suite.

14. Client Sends Finished

The client sends a Finished message, encrypted with the session key, to verify that the handshake was successful.

15. Server Sends ChangeCipherSpec and Finished

The server responds with its own ChangeCipherSpec message, followed by a Finished message encrypted with the session key.
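
Application code rarely drives these messages by hand; a TLS library performs the whole exchange. As a minimal sketch using Python's standard ssl module, the function below opens a TCP connection, lets wrap_socket run the handshake, and reports what was negotiated. The function name open_tls_connection is made up for this example, and calling it requires network access.

```python
import socket
import ssl

# The default context verifies the server's certificate chain and hostname.
context = ssl.create_default_context()

def open_tls_connection(host: str, port: int = 443) -> tuple[str, str]:
    """Connect, complete the TLS handshake, and report what was negotiated."""
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        # wrap_socket performs the handshake; server_hostname is sent to the
        # server (SNI) and checked against the certificate.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            protocol = tls_sock.version()       # for example "TLSv1.3"
            cipher_name = tls_sock.cipher()[0]  # name of the negotiated cipher suite
            return protocol, cipher_name

# Example (needs network access; output depends on the server):
# print(open_tls_connection("www.example.com"))
```

If the certificate is not issued by a trusted CA or does not match the hostname, wrap_socket raises an ssl.SSLCertVerificationError instead of returning.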

Post-Handshake Data Transfer

Secure Data Transmission: Once the SSL/TLS handshake is complete, the client and server can securely exchange data over the established encrypted connection.

HTTP Request: The client sends an HTTP request (e.g., GET /index.html) over the established TCP connection. The server processes the request and sends back an HTTP response with the requested content.

Connection Termination

Client Sends FIN: The client sends a TCP segment with the FIN (finish) flag set to indicate it wants to terminate the connection.

Server Sends ACK and FIN: The server acknowledges the client’s FIN with an ACK and, once it has no more data to send, sends its own FIN. The ACK and FIN may travel in one segment or in two separate segments.

Client Sends Final ACK: The client sends a final ACK to acknowledge the server’s FIN, completing the termination process.
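
From a program's perspective, a FIN shows up as an end-of-stream marker. The sketch below runs both ends of a TCP connection over the loopback interface in one process; the function name half_close_demo and the message are invented for the example. After the client calls shutdown(SHUT_WR), its TCP stack sends a FIN, and the server's recv returns an empty byte string.

```python
import socket

def half_close_demo() -> bytes:
    """Send data over loopback TCP, then close the client's sending side."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname(), timeout=5)
    server_side, _ = listener.accept()
    server_side.settimeout(5)

    client.sendall(b"last request")
    client.shutdown(socket.SHUT_WR)  # the client's TCP stack sends FIN

    received = b""
    while True:
        chunk = server_side.recv(1024)
        if not chunk:                # b"" means the peer's FIN has arrived
            break
        received += chunk

    server_side.close()              # the server side sends its own FIN
    client.close()
    listener.close()
    return received

print(half_close_demo())  # b'last request'
```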

TCP Packet Structure

A TCP segment, often loosely called a TCP packet, consists of a header and data. The header carries the control information TCP needs to manage data transmission, while the data part carries the actual payload:

  • The TCP header occupies 20 to 60 bytes: 20 bytes of fixed fields plus up to 40 bytes of options.
  • The payload (variable length) is the actual data being transported by the TCP segment. Its length is the total length of the IP packet minus the IP header length and the TCP header length.

Fields in the TCP Header:

  • Source Port (16 bits):

    • The port number of the sender (source) application.
  • Destination Port (16 bits):

    • The port number of the receiver (destination) application.
  • Sequence Number (32 bits):

    • A sequence number for the first byte of data in this segment.
    • Used to ensure data is received in order and to facilitate retransmissions.
  • Acknowledgment Number (32 bits):

    • If the ACK flag is set, this number contains the value of the next sequence number that the sender is expecting to receive.
    • Acknowledges receipt of data.
  • Data Offset (4 bits):

    • Also known as the header length.
    • Specifies the size of the TCP header in 32-bit words.
    • The minimum value is 5 (indicating a 20-byte header), and the maximum value is 15 (indicating a 60-byte header).
  • Reserved (3 bits):

    • Reserved for future use.
    • Must be set to zero.
  • Flags (9 bits):

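    • NS (1 bit): ECN-nonce concealment protection, defined by the experimental RFC 3540. That experiment has since been retired, and the current TCP specification (RFC 9293) treats this bit as reserved, which gives 4 reserved bits and 8 flags.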
    • CWR (1 bit): Congestion Window Reduced flag. Set by the sender to indicate that it received a TCP segment with the ECE flag set.
    • ECE (1 bit): ECN-Echo. Indicates that the IP packet had the CE (congestion experienced) flag set.
    • URG (1 bit): Urgent pointer field significant.
    • ACK (1 bit): Acknowledgment field significant.
    • PSH (1 bit): Push function. Asks to push the buffered data to the receiving application.
    • RST (1 bit): Reset the connection.
    • SYN (1 bit): Synchronize sequence numbers to initiate a connection.
    • FIN (1 bit): No more data from the sender (finish).
  • Window Size (16 bits):

    • Specifies the receive window of the segment’s sender (flow control).
    • It indicates the number of bytes the sender of this segment is currently willing to accept.
  • Checksum (16 bits):

    • Used for error-checking the header and data.
  • Urgent Pointer (16 bits):

    • If the URG flag is set, this 16-bit field is an offset from the sequence number indicating the last urgent data byte.
  • Options (Variable, usually 0-40 bytes):

    • Optional fields that can extend the header.
    • Options include maximum segment size (MSS), window scale, timestamp, and more.
  • Padding:

    • Extra bytes added to ensure the TCP header is a multiple of 32 bits in length.
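
The field layout above maps directly onto a fixed binary format. As a sketch, the function below unpacks the 20 fixed header bytes with Python's struct module, following the 9-flag layout listed above. The function name parse_tcp_header and the sample segment are invented for the example, and the sample's checksum is a zero placeholder rather than a valid checksum.

```python
import struct

# Flag names ordered from the least significant bit upward.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR", "NS"]

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header; options and payload stay raw."""
    (src_port, dst_port, seq, ack,
     offset_and_flags, window, checksum, urgent) = struct.unpack(
        "!HHIIHHHH", segment[:20])   # "!" selects network (big-endian) byte order

    data_offset = offset_and_flags >> 12   # top 4 bits: header length in 32-bit words
    flag_bits = offset_and_flags & 0x1FF   # low 9 bits: NS down to FIN
    flags = [name for bit, name in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)]
    header_length = data_offset * 4        # convert words to bytes

    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_length": header_length, "flags": flags,
        "window": window, "checksum": checksum, "urgent_pointer": urgent,
        "options": segment[20:header_length],
        "payload": segment[header_length:],
    }

# Hand-built SYN-ACK: data offset 5 (20 bytes), SYN (0x002) and ACK (0x010) set.
sample = struct.pack("!HHIIHHHH", 443, 51000, 5000, 1001,
                     (5 << 12) | 0x012, 65535, 0, 0)
header = parse_tcp_header(sample)
print(header["flags"])          # ['SYN', 'ACK']
print(header["header_length"])  # 20
```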

Step 5. HTTP Request

The browser constructs an HTTP request with headers, cookies, and other relevant data and sends it to the server.

Component Involved: Networking

An HTTP request is a message sent by a client to a server using HTTP (Hypertext Transfer Protocol). The request aims to retrieve or manipulate resources on a web server. The sections below break the request down into its components.

Structure of an HTTP Request

An HTTP request consists of several components:

  1. Request Line
  2. Headers
  3. Body (optional)

1. Request Line

The request line is the first line of the HTTP request and contains three elements:

  • HTTP Method: Specifies the action to be performed (e.g., GET, POST, PUT, DELETE).
  • Request-URI: The Uniform Resource Identifier that identifies the resource on the server.
  • HTTP Version: Indicates the HTTP protocol version (e.g., HTTP/1.1). HTTP/2 and HTTP/3 carry the same information in binary frames rather than in a text request line.

Example:

GET /index.html HTTP/1.1

2. Headers

Headers provide additional information about the request, such as metadata about the client, the resource being requested, and how the client expects the server to handle the request.

Common headers

  • Host:

    • Specifies the domain name of the server and (optionally) the port number.
    • Example: Host: www.example.com
  • User-Agent:

    • Contains information about the client software (e.g., browser type and version).
    • Example: User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36
  • Accept:

    • Specifies the media types the client can handle.
    • Example: Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
  • Accept-Language:

    • Specifies the preferred languages for the response.
    • Example: Accept-Language: en-US,en;q=0.5
  • Accept-Encoding:

    • Indicates the content encoding (e.g., gzip, deflate) the client can handle.
    • Example: Accept-Encoding: gzip, deflate, br
  • Connection:

    • Controls whether the connection remains open after the request/response cycle.
    • Example: Connection: keep-alive
  • Cookie:

    • Sends stored cookies to the server.
    • Example: Cookie: sessionId=abc123; username=johndoe
  • Authorization:

    • Contains credentials for authenticating the client with the server.
    • Example: Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l
  • Content-Type:

    • Indicates the media type of the body of the request (used with POST and PUT methods).
    • Example: Content-Type: application/json

3. Body (Optional)

The body of an HTTP request is optional and is typically used with methods like POST and PUT to send data to the server. The body contains the actual data being sent to the server.

POST /api/users HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Cookie: sessionId=abc123; username=johndoe
Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l
Content-Type: application/json
Content-Length: 51

{
  "username": "johndoe",
  "password": "s3cr3t"
}

Step 6. Server Response

The server responds with an HTTP response, including status code, headers, and body (HTML content).

When a server receives an HTTP request from a client, it processes the request and sends back an HTTP response. The sections below break the response down into its components.

Structure of an HTTP Server Response

An HTTP server response consists of several components:

  1. Status Line
  2. Headers
  3. Body (optional)

1. Status Line

The status line is the first line of the HTTP response and contains three elements:

  • HTTP Version: Indicates the HTTP protocol version (e.g., HTTP/1.1). HTTP/2 and HTTP/3 send the status code in binary frames and have no text status line.
  • Status Code: A three-digit code indicating the result of the request (e.g., 200 for success, 404 for not found).
  • Reason Phrase: A textual description of the status code.

Example:

HTTP/1.1 200 OK

2. Headers

Headers provide additional information about the response, such as metadata about the server, the content being sent, and instructions for the client.

Common HTTP Headers:

  • Date:

    • The date and time the response was generated.
    • Example: Date: Fri, 26 Jul 2024 12:28:53 GMT
  • Server:

    • Information about the server software.
    • Example: Server: Apache/2.4.1 (Unix)
  • Content-Type:

    • The media type of the response body.
    • Example: Content-Type: text/html; charset=UTF-8
  • Content-Length:

    • The length of the response body in bytes.
    • Example: Content-Length: 138
  • Connection:

    • Options for the connection (e.g., keep-alive, close).
    • Example: Connection: keep-alive
  • Set-Cookie:

    • Used to send cookies from the server to the client.
    • Example: Set-Cookie: sessionId=abc123; Path=/
  • Cache-Control:

    • Directives for caching mechanisms in both requests and responses.
    • Example: Cache-Control: max-age=3600
  • Location:

    • Used in redirection responses to indicate the URL to redirect to.
    • Example: Location: https://www.example.com/newpage
  • Content-Encoding:

    • The type of encoding used on the data (e.g., gzip).
    • Example: Content-Encoding: gzip
  • Transfer-Encoding:

    • The form of encoding used to safely transfer the payload body to the user (e.g., chunked).
    • Example: Transfer-Encoding: chunked
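
To make the chunked transfer encoding concrete: each chunk is a hexadecimal size line, the chunk data, and a CRLF, and a chunk of size zero ends the body. The decoder below is a minimal sketch with an invented name, decode_chunked; it ignores trailer headers and assumes the whole body is already in memory.

```python
def decode_chunked(data: bytes) -> bytes:
    """Reassemble a body sent with Transfer-Encoding: chunked."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size is hexadecimal; anything after ";" is a chunk extension.
        size = int(data[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:      # a zero-length chunk marks the end of the body
            break
        body += data[pos:pos + size]
        pos += size + 2    # skip the chunk data and its trailing CRLF
    return bytes(body)

encoded = b"5\r\nHello\r\n7\r\n, World\r\n0\r\n\r\n"
print(decode_chunked(encoded))  # b'Hello, World'
```

Chunking lets a server start sending a response before it knows the total length, which is why a chunked response carries no Content-Length header.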

3. Body (Optional)

The body of the HTTP response contains the actual content being sent to the client, such as HTML, images, JSON data, etc. The body is optional and depends on the nature of the request and response.

HTTP/1.1 200 OK
Date: Fri, 26 Jul 2024 12:28:53 GMT
Server: Apache/2.4.1 (Unix)
Content-Type: text/html; charset=UTF-8
Content-Length: 155
Connection: keep-alive
Set-Cookie: sessionId=abc123; Path=/
Cache-Control: max-age=3600

<!DOCTYPE html>
<html>
<head>
    <title>Example Page</title>
</head>
<body>
    <h1>Hello, World!</h1>
    <p>This is an example page.</p>
</body>
</html>
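
Reading a response is the reverse of the structure described above: split at the first empty line, then take the status line and headers apart. The parser below is a simplified sketch with an invented name, parse_response. It assumes an HTTP/1.1 text response held fully in memory, and it keeps only the last value when a header such as Set-Cookie appears more than once.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into status line parts, headers, and body."""
    # The first empty line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")

    # The reason phrase may contain spaces, so split at most twice.
    version, status_code, reason = status_line.split(" ", 2)

    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, int(status_code), reason, headers, body

raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/html; charset=UTF-8\r\n"
       b"Content-Length: 22\r\n"
       b"\r\n"
       b"<h1>Hello, World!</h1>")

version, status, reason, headers, body = parse_response(raw)
print(version, status, reason)   # HTTP/1.1 200 OK
print(headers["content-type"])   # text/html; charset=UTF-8
print(body)                      # b'<h1>Hello, World!</h1>'
```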

See next item for Step 7.