Understanding HTTPS: Comprehensive Overview of Key Aspects
Chapter 1: Introduction to HTTPS
HTTPS addresses the security vulnerabilities associated with HTTP's plain text transmission, particularly the risk of man-in-the-middle (MITM) attacks. HTTPS (HTTP Secure, originally HTTP over SSL) runs HTTP on top of an encrypted channel. That channel was first provided by Secure Sockets Layer (SSL), which has since been superseded by Transport Layer Security (TLS). Here, we will outline the essential elements of HTTPS.
The first video, titled "How HTTPS Works (...and SSL/TLS too)," provides an insightful explanation of the underlying principles of HTTPS and its relationship with SSL/TLS.
HTTPS Versions
Although the terms SSL and TLS are often used interchangeably, in contemporary discussions they almost always refer to the TLS protocol. TLS has four versions: 1.0, 1.1, 1.2, and 1.3. TLS 1.0 and 1.1 are deprecated. TLS 1.2 was long the standard and remains widely deployed, but TLS 1.3 is now the recommended version, offering enhanced security and efficiency through improvements in the Handshake and Record protocols.
TLS 1.3 removes several outdated and insecure algorithms that TLS 1.2 still permitted, such as the RC4, DES, 3DES, and AES-CBC ciphers and the MD5 hash. This change significantly reduces the attack surface. Additionally, TLS 1.3 improves performance by cutting down the number of round trips needed during the handshake phase. In the best case it completes a full handshake in one round trip (1-RTT), compared with two for a full TLS 1.2 handshake, and it offers an optional 0-RTT mode for resumed connections.
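The protocol is also designed to stay compatible with existing deployments: instead of changing the version field of the hello message, TLS 1.3 negotiates its version through an extension (supported_versions) carried in the ClientHello. The details will not be covered here.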
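As a rough illustration of the version policy described above, the following sketch uses Python's standard ssl module to build a client context that refuses anything older than TLS 1.2, or anything older than TLS 1.3 on request. The function name make_client_context and its require_tls13 parameter are made up for this example, and it assumes a Python build linked against an OpenSSL that supports TLS 1.3.

```python
import ssl


def make_client_context(require_tls13: bool = False) -> ssl.SSLContext:
    """Return a client context that refuses legacy protocol versions.

    Hypothetical helper for illustration; not part of any library.
    """
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Set the floor: TLS 1.2 by default, TLS 1.3 only when requested.
    ctx.minimum_version = (
        ssl.TLSVersion.TLSv1_3 if require_tls13 else ssl.TLSVersion.TLSv1_2
    )
    return ctx
```

Once a socket has been wrapped with such a context, calling version() on the wrapped socket reports which protocol version was actually negotiated.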
Core Processes of HTTPS
While the specific processes may differ across versions, the general operation of HTTPS can be summarized as follows.
The accompanying diagram from ByteByteGo effectively illustrates the fundamental interactions and encryption processes involved in HTTPS. Key steps include establishing a TCP connection, agreeing on symmetric session keys by means of asymmetric cryptography (RSA key transport in older versions, ephemeral Diffie-Hellman key exchange in TLS 1.3), and then carrying the actual communication under symmetric encryption.
The structure of HTTPS, or more accurately TLS, is intricately designed. Key components include the Record Layer, which serves as the data transport channel, and various sub-protocols that operate on it. In TLS, the Record serves as the basic data transmission unit, akin to TCP segments and IP packets.
The Handshake protocol is critical within this framework, and its activity can be monitored using tools like Wireshark.
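To make the Record Layer a little more concrete without a packet capture, here is a minimal sketch that asks Python's ssl module to produce a ClientHello into an in-memory buffer and then decodes the 5-byte record header in front of it (content type, legacy version, payload length). The helper names are hypothetical, no network connection is made, and the exact bytes depend on the local OpenSSL build.

```python
import ssl
import struct
from typing import NamedTuple


class RecordHeader(NamedTuple):
    content_type: int    # 20 change_cipher_spec, 21 alert, 22 handshake, 23 application_data
    legacy_version: int  # version field of the record, e.g. 0x0301 or 0x0303
    length: int          # number of payload bytes that follow the header


def parse_record_header(data: bytes) -> RecordHeader:
    """Decode the fixed 5-byte header that starts every TLS record."""
    if len(data) < 5:
        raise ValueError("a TLS record header is 5 bytes long")
    content_type, version, length = struct.unpack("!BHH", data[:5])
    return RecordHeader(content_type, version, length)


def capture_client_hello(hostname: str = "example.com") -> bytes:
    """Return the bytes a client would send first, without touching the network."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the client now waits for a ServerHello that never arrives
    return outgoing.read()


first_flight = capture_client_hello()
header = parse_record_header(first_flight)
print(header.content_type, hex(header.legacy_version), header.length)
```

The same three header fields are what Wireshark shows at the top of each TLS record in a capture.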
HTTPS SNI Extension
In the early days of the Internet, servers were typically single-machine setups, and older protocols like SSL v2 exhibited design flaws. The assumption was that a single server with a unique IP would host only one domain service. Consequently, once DNS resolution occurred, the client could connect directly to that IP and the server would present the one certificate configured for its domain. However, with the rise of cloud computing, virtual hosting, and IPv4 address scarcity, it became common for a single server to host multiple domains. Because the certificate is sent during the handshake, before any HTTP request (and its Host header) is visible, the server had no way of knowing which domain the client wanted and therefore which certificate to present. This led to the development of the Server Name Indication (SNI) extension.
SNI is a TLS protocol extension that enables the client to convey the desired hostname to the server during the handshake. This allows a server to manage multiple domains' HTTPS services on a single IP address and provide the appropriate certificate for each.
Although this issue may seem straightforward, it posed significant challenges during the early widespread adoption of HTTPS, particularly as many CDN providers did not support SNI at that time. However, as of 2024, major software ecosystems like Nginx and various vendors now fully support this feature.
To utilize SNI, the -servername option can be employed in the OpenSSL s_client command:
openssl s_client -connect example.com:443 -servername example.com
Alternatively, if you're using the OpenSSL library directly, you can set the SNI hostname in code with SSL_set_tlsext_host_name. It is typically used alongside BIO_set_conn_hostname, which sets the host and port that the connection BIO connects to.
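The effect of SNI can also be observed from Python, whose standard ssl module takes the hostname through the server_hostname argument. The sketch below generates a ClientHello into an in-memory buffer (no network access) and checks that the requested hostname appears in it. The helper name client_hello_bytes is hypothetical, and the check assumes the hostname is sent unencrypted, which is how plain SNI works.

```python
import ssl


def client_hello_bytes(server_hostname: str) -> bytes:
    """Produce the first handshake bytes a client would send for this hostname."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    # server_hostname is what populates the SNI extension.
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: there is no server on the other end of the buffer
    return outgoing.read()


hello = client_hello_bytes("example.com")
# The hostname travels inside the ClientHello, so the server can read it
# before choosing which certificate to present.
sni_visible = b"example.com" in hello
print("hostname present in ClientHello:", sni_visible)
```

With a real connection, the equivalent is to pass server_hostname to wrap_socket on an ssl.SSLContext after opening the TCP socket.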
HTTPS Certificate Mechanism
HTTPS employs a combination of asymmetric, symmetric, and hash algorithms within a public key system to achieve encryption, decryption, signing, verification, and more, thereby fulfilling four primary security objectives: confidentiality, integrity, authentication, and non-repudiation. It also offers defenses against typical MITM attacks.
To address public key trust issues, a certificate and trust chain mechanism is utilized. Certificates are issued by third-party Certificate Authorities (CAs) and typically stored in files with extensions like .crt, .cer, or .pem. These files adhere to specific standards, such as X.509, and contain essential information, including the public key, certificate holder details, issuing authority, validity period, and digital signature.
Well-known CAs include DigiCert, VeriSign, Entrust, and Let's Encrypt, with the certificates they issue categorized into Domain Validated (DV), Organization Validated (OV), and Extended Validation (EV), each reflecting a different level of identity checking. The CAs themselves must also be trusted: a smaller or intermediate CA has its certificate signed by a larger CA, and the chain ultimately ends at a root certificate, which is self-signed.
Most operating systems and browsers come pre-installed with root certificates from major CAs, and during HTTPS communication, the certificate chain is verified step-by-step until reaching the root certificate.
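A minimal sketch of what this looks like from a client, assuming Python's standard ssl module: create_default_context() loads the root certificates installed on the system and turns on chain and hostname verification, so the handshake fails if the chain does not lead to a trusted root. The function names are hypothetical, and describe_peer_certificate needs network access to a real HTTPS host.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Client context that trusts the system's pre-installed root certificates."""
    # Enables CERT_REQUIRED and hostname checking, and loads the default roots.
    return ssl.create_default_context()


def describe_peer_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect, let the library verify the chain, and summarize the leaf certificate."""
    ctx = make_verifying_context()
    with socket.create_connection((host, port), timeout=timeout) as tcp:
        # wrap_socket raises ssl.SSLCertVerificationError if the chain
        # cannot be verified or the hostname does not match.
        with ctx.wrap_socket(tcp, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return {
        "subject": cert.get("subject"),
        "issuer": cert.get("issuer"),
        "notAfter": cert.get("notAfter"),
    }
```

Calling describe_peer_certificate("example.com") would return the holder, issuing authority, and expiry date of the server's certificate, which correspond to the X.509 fields listed earlier.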
HTTPS Software Ecosystem
While the ecosystem surrounding HTTPS or TLS is extensive, OpenSSL remains the dominant player. OpenSSL supports nearly all publicly accessible encryption algorithms and protocols, establishing itself as the de facto standard. Many applications, such as Apache and Nginx, utilize it as the foundational library to implement TLS capabilities.
Originating from SSLeay, OpenSSL has spawned several forks, including Google's BoringSSL and OpenBSD's LibreSSL. Most of its functionality is exposed through the openssl command-line tool, and those seeking specific details may consult resources like ChatGPT.
HTTPS Acceleration Solutions
Despite its advantages, HTTPS can introduce latency, and optimizing a full-site HTTPS deployment is a large topic in its own right. Here are a few highlights:
- RTT Optimization: Particularly vital in IO-intensive environments, this involves protocol upgrades, such as transitioning to HTTP/3 and TLS 1.3, each of which reduces the number of round trips in a different way.
- Single-Step Performance Optimization: Strategies include deploying TLS acceleration cards, establishing dedicated TLS clusters or modules, and focusing on concepts like TLS session resumption.
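The round-trip savings mentioned above can be put into rough numbers. The sketch below counts the round trips spent on connection setup before the first HTTP request can be sent, then multiplies by the network RTT. It is a back-of-the-envelope model only: the stack labels are made up for this example, and it ignores DNS lookups, packet loss, server processing time, and optimizations such as TCP Fast Open.

```python
# Round trips spent on connection setup before the first HTTP request is sent.
SETUP_ROUND_TRIPS = {
    "tcp+tls1.2": 1 + 2,       # TCP handshake + full TLS 1.2 handshake
    "tcp+tls1.3": 1 + 1,       # TCP handshake + full TLS 1.3 handshake
    "tcp+tls1.3-0rtt": 1 + 0,  # TCP handshake + resumed TLS 1.3 session with early data
    "quic": 1,                 # HTTP/3: transport and TLS 1.3 handshakes combined
    "quic-0rtt": 0,            # HTTP/3 with a resumed session and early data
}


def setup_latency_ms(stack: str, rtt_ms: float) -> float:
    """Estimate connection-setup delay for a given stack and network RTT."""
    return SETUP_ROUND_TRIPS[stack] * rtt_ms


for stack in SETUP_ROUND_TRIPS:
    print(f"{stack:>16}: {setup_latency_ms(stack, 50):.0f} ms at 50 ms RTT")
```

Session resumption matters for the same reason: a resumed connection skips most of the handshake work, and in TLS 1.3 it is the prerequisite for sending 0-RTT data.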
The second video, "SSL, TLS, HTTPS Explained," offers a deeper understanding of the distinctions and functions of these protocols.