Going Online. A look into how domain names turn to IP addresses.

We have been exploring the internet for ages: we type in a domain name and the website somehow magically opens. This Saturday I’ll try to demystify most of that and cram in as much information as I can about how your domain name is resolved into a website.

DNS records (getting your website live)

Getting a domain is easy. A few clicks, a few dollars, and bam, you have your fancy new website. After that come a few technical things you should know before you can host your website and email and show them off to the world. Let’s go through the DNS records you need to know to set up your server.

  1. NS Records (Nameserver Records): NS resource records are records in the DNS database that determine which authoritative name servers are used for the domain. The DNS database is used to convert (sub)domain names to IP addresses; it works like a distributed telephone book. Records are kept in caches for a time (often up to 24 hours), which is why a change to a domain can take a day or so to be visible everywhere: until then the old value can still be stored in a cache somewhere. Suppose you buy your domain from GoDaddy; then by default GoDaddy’s name servers will be the authoritative servers for your domain. NS records exist SOLELY to define WHICH NAMESERVERS are responsible for a particular domain.
  2. A Records (Address Mapping records): Once your machine finds the authoritative server that holds the details of your website, it queries that server for the domain’s A records. An A record maps a hostname to the IP address of the server where your files/website are hosted. It consists of the name of the record, the address of the server, and the TTL (time to live), which specifies how long a resolver is allowed to cache the record. A records store IPv4 addresses.
  3. AAAA Records (IP Version 6 Address record): Same functionality as A records, but they store IPv6 addresses.
  4. CNAME Records (Canonical Name record): A CNAME record is used to point a domain to another domain. When a DNS client requests a record that contains a CNAME, which points to another hostname, the DNS resolution process is repeated with the new hostname.
  5. MX Records (Mail Exchange records): Specify the SMTP mail servers for the domain; sending servers use them to route email addressed to the domain to the right mail server.
  6. PTR Records (Reverse-lookup Pointer records): A DNS pointer record (PTR for short) provides the domain name associated with an IP address. A DNS PTR record is exactly the opposite of the ‘A’ record, which provides the IP address associated with a domain name. They are used for reverse domain lookups. Some email anti-spam filters use reverse DNS to check the domain names of email addresses and see if the associated IP addresses are likely to be used by legitimate email servers. If a domain has no PTR record, or if the PTR record contains the wrong domain, email services may block all emails from that domain.
  7. CERT Records (Certificate record): Stores encryption certificates — PKIX, SPKI, PGP, and so on. CERT records are used for generically storing certificates within DNS and are most commonly used by systems for email encryption. To create a CERT record, you must specify the certificate type, the key tag, the algorithm, and then the certificate, which is either the certificate itself, the CRL, a URL of the certificate, or fingerprint and a URL.
  8. SRV Records (Service Location records): A record that advertises a service and how to connect with it. SRV records help with service discovery. For example, SRV records are used in Internet Telephony to define where a SIP service may be found. An SRV record typically defines a symbolic name and the transport protocol used as part of the domain name. It defines the priority, weight, port, and target for the service in the record content.
  9. TXT Records: Text records were originally intended to hold human-readable notes, but today they mostly carry machine-readable metadata, for example to verify domain ownership.
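
To make the record types above a little more concrete, here is a minimal sketch of a zone as a plain Python dictionary, with a lookup that follows CNAMEs the way the CNAME entry describes. Every name and address in it is made up for illustration (the addresses come from ranges reserved for documentation), and a real DNS server stores and serves records very differently.

```python
# Toy in-memory zone; every name and address below is hypothetical.
ZONE = {
    ("example.com.", "NS"): ["ns1.example-dns.net."],
    ("example.com.", "A"): ["192.0.2.10"],
    ("example.com.", "AAAA"): ["2001:db8::10"],
    ("www.example.com.", "CNAME"): ["example.com."],
    ("example.com.", "MX"): ["10 mail.example.com."],
    ("mail.example.com.", "A"): ["192.0.2.25"],
    ("example.com.", "TXT"): ["v=spf1 mx -all"],
}


def lookup(name, rtype, max_hops=8):
    """Return the values for (name, rtype), following CNAMEs along the way."""
    for _ in range(max_hops):
        values = ZONE.get((name, rtype))
        if values is not None:
            return values
        alias = ZONE.get((name, "CNAME"))
        if alias is None:
            return []          # no such record in this toy zone
        name = alias[0]        # repeat the lookup with the canonical name
    raise RuntimeError("CNAME chain too long")


print(lookup("www.example.com.", "A"))   # ['192.0.2.10']
print(lookup("example.com.", "MX"))      # ['10 mail.example.com.']
```

The lookup for www.example.com. finds no A record, sees the CNAME, and starts over with example.com., which is the "resolution process is repeated" step in miniature.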

DNS Lookup

  1. A client sends a recursive query to a DNS name server to request the IP address that corresponds to the name ftp.contoso.com. A recursive query indicates that the client wants a definitive answer to its query. The response to the recursive query must be a valid address or a message indicating that the address cannot be found.
  2. Because the DNS server is not authoritative for the name and does not have the answer in its cache, the DNS server uses root hints to find the IP address of the DNS root server.
  3. The DNS name server uses an iterative query to ask the DNS root server to resolve the name ftp.contoso.com. An iterative query indicates that the server will accept a referral to another server in place of a definitive answer to the query. The query lands on one of the 13 DNS root servers. To clear things up: each root server is really a cluster of many machines, but the world reaches them through only 13 root server identities (a.root-servers.net through m.root-servers.net). Because the name ftp.contoso.com ends with the label com, the DNS root server returns a referral to the Com server that hosts the com zone.
  4. The DNS server uses an iterative query to ask the Com server (the Top Level Domain server for the com domain) to resolve the name ftp.contoso.com. Because the name ftp.contoso.com ends with the name contoso.com, the Com server returns a referral to the Contoso server that hosts the contoso.com zone.
  5. The DNS server uses an iterative query to ask the Contoso server (the authoritative name server) to resolve the name ftp.contoso.com. The Contoso server finds the answer in its zone data and then returns the answer to the server.
  6. The server then returns the result to the client.
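
The six steps above can be sketched as a small simulation. This is only a toy model under stated assumptions: the server names, the referral table, and the final address are all hypothetical, and no real network traffic is involved. It just shows the difference between an iterative query (answer or referral) and the recursive work the client’s DNS server does on its behalf.

```python
# Toy referral data; server names and the final address are hypothetical.
SERVERS = {
    "root": {"referrals": {"com.": "com-tld"}},
    "com-tld": {"referrals": {"contoso.com.": "contoso-ns"}},
    "contoso-ns": {"answers": {"ftp.contoso.com.": "192.0.2.44"}},
}


def iterative_query(server, name):
    """Ask one server; it returns an answer or a referral and never recurses."""
    data = SERVERS[server]
    if name in data.get("answers", {}):
        return ("answer", data["answers"][name])
    for zone, next_server in data.get("referrals", {}).items():
        if name == zone or name.endswith("." + zone):
            return ("referral", next_server)
    return ("nxdomain", None)


def recursive_resolve(name, cache, max_referrals=10):
    """What the client's DNS server does: chase referrals until it has an answer."""
    if name in cache:
        return cache[name], []            # answered from cache, nobody was asked
    server, path = "root", []
    for _ in range(max_referrals):
        path.append(server)
        kind, value = iterative_query(server, name)
        if kind == "answer":
            cache[name] = value
            return value, path
        if kind == "nxdomain":
            return None, path
        server = value                    # follow the referral
    return None, path


cache = {}
print(recursive_resolve("ftp.contoso.com.", cache))
# ('192.0.2.44', ['root', 'com-tld', 'contoso-ns'])
print(recursive_resolve("ftp.contoso.com.", cache))
# ('192.0.2.44', [])
```

The second call returns without contacting any server, which is the caching behaviour mentioned in step 2.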

OSI Model

Now that we know how to tell the name servers where our website lives, and how our PC asks them where it is, let’s talk a bit about what happens during the connection. For that, we’ll first take a glance at the OSI Model.

A quick mnemonic I use to remember the order of the layers (Application, Presentation, Session, Transport, Network, Data link, Physical) is “A Penguin Said That Nobody Drinks Pepsi”. Just a handy tip. Since most of the details are covered in the diagram, I’ll try to explain OSI with a short story.

Suppose you are the CEO of a French company and want to send a document to an English-speaking company. You, the CEO, sit at the application layer: you are the point of contact and are involved in all the communication. Suppose you draft a 400-page message to be sent to the other company.

It’s then the role of the presentation layer to convert your document from French to English. As the translator, the presentation layer converts the data sent by the application layer of the transmitting node into an acceptable and compatible data format based on the applicable network protocol and architecture. Upon arrival at the receiving computer, the presentation layer translates the data into a format usable by the application layer. Its functions include character-code translation, data conversion, data compression, and data encryption and decryption.

Then the document is passed on to the session layer, whose job is to call the other company and establish that a document will be sent to them. The session layer tracks the dialogues between systems, which are also called sessions. It manages a session by initiating the opening and closing of sessions between end-user application processes.

At the transport layer, the document is broken down into chunks called segments before being handed to layer 3, so that if one chunk is misplaced the whole document is not lost. The transport layer is also responsible for flow control and error control. Flow control determines an optimal speed of transmission to ensure that a sender with a fast connection doesn’t overwhelm a receiver with a slow connection. The transport layer performs error control on the receiving end by ensuring that the data received is complete, and requesting retransmission if it isn’t.

Each segment is then handed to the network layer, which decides the route to take to reach the other company. Finding the best path for the data to reach its destination is known as routing. The network layer is also responsible for creating packets, adding the source and destination addresses to them. It deals with IP addresses and the routing of packets to achieve end-to-end communication.

The data link layer deals with Ethernet, WiFi, Bluetooth, etc.: network cards, the links between them, and the creation of local networks (via dedicated or shared media). In the OSI model the network layer is responsible for the source-to-destination delivery of a packet, possibly across multiple networks (links), whereas the data link layer oversees delivery between two systems on the same network. Finally, the physical layer converts the data into electrical signals, and the data is sent.
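
The transport-layer part of the story (break the document into segments, notice what went missing, ask for it again) can be sketched in a few lines. This is a simplified illustration and not how TCP is actually implemented; the function names and the 8-byte segment size are assumptions made for the example.

```python
def segment(data: bytes, size: int):
    """Split data into numbered segments, as the transport layer does."""
    return [(seq, data[i:i + size])
            for seq, i in enumerate(range(0, len(data), size))]


def reassemble(segments, total):
    """Rebuild the data, or report which sequence numbers need retransmission."""
    received = dict(segments)
    missing = [seq for seq in range(total) if seq not in received]
    if missing:
        return None, missing
    return b"".join(received[seq] for seq in range(total)), []


document = b"Bonjour, voici notre document."
sent = segment(document, 8)

arrived = [s for s in sent if s[0] != 2]   # pretend segment 2 was lost
arrived.reverse()                          # and the rest arrived out of order

print(reassemble(arrived, len(sent)))      # (None, [2])
data, missing = reassemble(arrived + [sent[2]], len(sent))
print(data == document)                    # True
```

Because every segment carries a sequence number, losing one chunk only costs a retransmission of that chunk, and arrival order stops mattering.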

TLS Handshake

The TLS Handshake involves the following steps.

  1. Client Hello: The client expresses its interest in connecting to the server and presents the list of cipher suites it supports. The server decides which cipher suite will ultimately be used for the communication, generally the most recent and strongest suite that it also supports. A cipher suite has several parts.
    1. Protocol: TLS 1.3, TLS 1.2, SSL V3, SSL V2
    2. Key Exchange: Diffie-Hellman or RSA
    3. Authentication (used to authenticate the server): RSA, Elliptic Curve Digital Signature Algorithm (ECDSA)
    4. Cipher (dictates the algorithm that will be used to encrypt the data): Advanced Encryption Standard(GCM/CBC), Camellia
    5. Message Authentication Code (dictates the method the connection will use to carry out data integrity checks.): Secure Hash Algorithm, MD5
    So a sample cipher suite might look like TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, which says that the protocol is TLS, the key exchange method is ephemeral elliptic-curve Diffie-Hellman (ECDHE), RSA is used to authenticate the server, the data is encrypted with AES using a 128-bit key in GCM mode, and SHA-256 is the hash used for integrity checks.
  2. Server Hello: The server replies with the cipher suite it has chosen, along with its certificate. The certificate contains the server’s public key, which enables the asymmetric encryption needed to set up the symmetric encryption. The client uses that public key to encrypt what it sends to the server during the handshake, and it also checks the certificate’s validity.
    The certificate contains the following fields.

1. Version Number: The version of the X.509 standard the certificate follows. X.509 is a standard format for public key certificates, digital documents that securely associate cryptographic key pairs with identities such as websites, individuals, or organizations.
2. Serial Number: Unique number assigned by the certificate authority
3. Signature Algo: The algorithm used by the certificate authority to sign the certificate, for example RSA with SHA-256
4. Signature Hash Algorithm: The certificate is hashed down to a small digest before it is signed, and the Signature Hash Algo names the hash function used for that (e.g. SHA-256)
5. Issuer: Certificate Authority
6. Valid from and Valid to
7. Subject: Common Name, State, Locality etc
8. Public Key: The key that will be used by the client to communicate with the server
And more…

The server then also responds with a Server Hello Done message, saying that all its information has been communicated to the client.

3. Change Cipher Spec (Client Finished): At this point, the client has the server’s public key. The client generates the pre-master secret, encrypts it with that public key, and sends it to the server. From the pre-master secret the client also computes the symmetric key and keeps it for bulk encryption later. It then sends a Client Finished message saying that it has sent everything the server needs.

4. Change Cipher Spec (Server Finished): The server decrypts the pre-master secret with its private key, computes the same symmetric key, and saves it for further communication.

5. Bulk Data Transfer: The encrypted communication can now begin.
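
Going back to the cipher suite name from the Client Hello step, here is a rough sketch of how such a name breaks down into the five parts listed there. The function name parse_cipher_suite is hypothetical, and the parser assumes the older PROTOCOL_KEYEXCHANGE_AUTH_WITH_CIPHER_MAC naming layout; TLS 1.3 suite names such as TLS_AES_128_GCM_SHA256 no longer include the key exchange and authentication parts, so it rejects them.

```python
def parse_cipher_suite(name: str) -> dict:
    """Split a TLS 1.2-style suite name into its parts (illustrative only)."""
    left, _, right = name.partition("_WITH_")
    protocol, *kx_auth = left.split("_")
    if not right or not kx_auth:
        raise ValueError("not a KEYEXCHANGE_AUTH_WITH_CIPHER_MAC style name")
    *cipher, mac = right.split("_")
    return {
        "protocol": protocol,
        "key_exchange": kx_auth[0],
        # Names like TLS_RSA_WITH_... use RSA for both key exchange and auth.
        "authentication": kx_auth[-1],
        "cipher": "_".join(cipher),
        "mac": mac,
    }


print(parse_cipher_suite("TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"))
# {'protocol': 'TLS', 'key_exchange': 'ECDHE', 'authentication': 'RSA',
#  'cipher': 'AES_128_GCM', 'mac': 'SHA256'}
```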

SSL Certificate Verification

During SSL/TLS connections, the server authenticates according to the handshake and record protocols. When initiating the handshake protocol, the server presents a signed X.509 certificate to the client. The X.509 certificate’s signature must be verified by the client before establishing an HTTPS connection. The required format and information contained in an X.509 certificate enable the client to confidently authenticate and verify the integrity of the certified identity.

Client browsers and applications rely heavily on their trust in Certificate Authorities (CAs) for proper validation of X.509 certificates. Every client application and operating system (OS) maintains a list of trusted root CA certificates; this list is called a “Trust Store.” For example, at the time of writing, the Firefox trust store holds 150 root certificates that the browser trusts automatically.

As part of the X.509 verification process, each certificate must be signed by the same issuer CA named in its certificate. The client must be able to follow a hierarchical path of certification that recursively links back to at least one root CA listed in the client’s trust store.

There are three basic types of entities that comprise a valid chain of trust: Root, Intermediate, and End-entity.

  1. Root Certificate (Trust Anchor): A Root certificate is a self-signed certificate that follows the standards of the X.509 certificate.
  2. Intermediate Certificate (The Issuing CA): At least one intermediate certificate will almost always be present in an SSL certificate chain. Intermediates provide a vital link that lets the Root CA extend its trustworthy reputation to otherwise untrusted end-entities. The issuing CA functions as a middleman between the secure root and the server certificate. This allows the Root CA to remain securely stored offline, providing an extra level of security. Trust in the root CA is always explicit: operating systems, third-party web browsers, and custom applications ship with over 100 pre-installed trusted root CA certificates. In contrast, non-root certificates are implicitly trusted and are not required to be shipped with an OS, web browser, or certificate-aware application.
  3. Server Certificate (The End Entity): The end-entity provides critical information to the issuing CA via a Certificate Signing Request form. The certificate is then signed and issued by a trusted CA, attesting that the information provided was correct at the issuance time. The SSL connection to a server will fail if the certificate has not been verified and signed.
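
The idea of following a path from the server certificate up to a trusted root can be sketched like this. It is a deliberately stripped-down model: the certificates are hypothetical and reduced to subject and issuer names, and the sketch only follows the issuer links. A real client additionally verifies every signature cryptographically and checks validity dates and revocation, none of which is shown here.

```python
# Hypothetical certificates reduced to subject -> issuer.
# Real verification also checks signatures, validity dates and revocation.
TRUST_STORE = {"Example Root CA"}

CERTS = {
    "www.example.com": {"issuer": "Example Issuing CA"},
    "Example Issuing CA": {"issuer": "Example Root CA"},
    "Example Root CA": {"issuer": "Example Root CA"},   # self-signed
}


def build_chain(subject, certs=CERTS, trust_store=TRUST_STORE, max_depth=10):
    """Follow issuer links from an end-entity up to a trusted root, else None."""
    chain = []
    for _ in range(max_depth):
        cert = certs.get(subject)
        if cert is None:
            return None                  # a link in the chain is missing
        chain.append(subject)
        if cert["issuer"] == subject:    # self-signed: top of the chain
            return chain if subject in trust_store else None
        subject = cert["issuer"]
    return None


print(build_chain("www.example.com"))
# ['www.example.com', 'Example Issuing CA', 'Example Root CA']
print(build_chain("www.example.com", trust_store=set()))
# None
```

With an empty trust store the same chain is rejected, which is why trust in the root is the explicit part and everything below it is implied.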

Perfect Forward Secrecy and Diffie-Hellman

In normal traffic encryption, the client uses the server’s public key to encrypt a pre-master secret, and the server uses its private key to decrypt it and derive the session secret. If the server’s private key is compromised in the future, all communication, past and future, can be decrypted and read by the malicious actor. Perfect forward secrecy is an attempt to solve this problem.

In cryptography, forward secrecy (FS), also known as perfect forward secrecy (PFS), is a feature of specific key agreement protocols that gives assurances that session keys will not be compromised even if long-term secrets used in the session key exchange are compromised.

For HTTPS, the long-term secret is typically the Private signing key of the server. Forward secrecy protects past sessions against future compromises of keys or passwords. By generating a unique session key for every session a user initiates, the compromise of a single session key will not affect any data other than that exchanged in the specific session protected by that particular key.

One of the widely used algorithms for this is Diffie-Hellman. Let’s look at how it works.

Suppose Alice and Bob want to talk to each other securely without Eve knowing what they are talking about. And the communication needs to take place through symmetric cryptography.

Let’s try to understand the above diagram with a bit of math.

Alice and Bob agree to use the prime p = 941 and the primitive root g = 627. Alice chooses the secret key a = 347 and computes A = 390 ≡ 627^347 (mod 941). Similarly, Bob chooses the secret key b = 781 and computes B = 691 ≡ 627^781 (mod 941). Alice sends Bob the number 390 and Bob sends Alice the number 691. Both of these transmissions are done over an insecure channel, so both A = 390 and B = 691 should be considered public knowledge. The numbers a = 347 and b = 781 are not transmitted and remain secret. Then Alice and Bob are both able to compute the shared secret key 470 ≡ 691^347 ≡ 390^781 (mod 941).
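
The numbers in this example are small enough to check directly. The sketch below reproduces the exchange with Python’s built-in three-argument pow, which does modular exponentiation; the variable names are mine. Real deployments use far larger primes or elliptic curves, so treat this purely as a check of the arithmetic.

```python
p, g = 941, 627          # public prime and primitive root from the example
a, b = 347, 781          # Alice's and Bob's secret keys (never transmitted)

A = pow(g, a, p)         # Alice sends this to Bob
B = pow(g, b, p)         # Bob sends this to Alice

alice_shared = pow(B, a, p)   # Alice combines Bob's public value with her secret
bob_shared = pow(A, b, p)     # Bob does the same with Alice's public value

print(A, B)                       # 390 691
print(alice_shared, bob_shared)   # 470 470
```

Eve sees p, g, A and B, but to get 470 she would have to recover a or b from them, which is the hard part once the numbers are large.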

Extras: DNS Poisoning

As explained above, DNS answers are cached every time your ISP or your PC resolves a name. A cached entry remains valid until its TTL (time to live) expires.
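
As a rough illustration of that caching behaviour, here is a minimal TTL cache. The class name TTLCache and the injectable clock are assumptions made for the example (the clock is only there so that expiry can be demonstrated without waiting); it is not how any particular resolver is implemented.

```python
import time


class TTLCache:
    """Tiny resolver-style cache: entries are dropped once their TTL has passed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}            # name -> (address, expiry time)

    def put(self, name, address, ttl):
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]   # expired: the resolver must query again
            return None
        return address


now = [0.0]                                   # fake clock for the demo
cache = TTLCache(clock=lambda: now[0])
cache.put("example.com.", "192.0.2.10", ttl=300)
print(cache.get("example.com."))              # 192.0.2.10
now[0] = 301
print(cache.get("example.com."))              # None
```

Note that the cache has no idea whether an entry is genuine: whatever was stored is served until the TTL runs out, which is exactly what makes poisoning a cache worthwhile for an attacker.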

Instead of using TCP, which requires both communicating parties to perform a ‘handshake’ to initiate communication and verify the identity of the devices, DNS requests and responses use UDP or the User Datagram Protocol. With UDP, there is no guarantee that a connection is open, that the recipient is ready to receive, or that the sender is who they say they are. UDP is vulnerable to forging for this reason — an attacker can send a message via UDP and pretend it’s a response from a legitimate server by forging the header data. If a DNS resolver receives a forged response, it accepts and caches the data uncritically because there is no way to verify if the information is accurate and comes from a legitimate source. Despite these major points of vulnerability in the DNS caching process, DNS poisoning attacks are not easy. Because the DNS resolver does actually query the authoritative nameserver, attackers have only a few milliseconds to send the fake reply before the real reply from the authoritative nameserver arrives.

Extras: How to shut down the internet and DNSSEC

DNS lookups happen over UDP. The main downside of UDP is that, unlike TCP, it does not require a handshake before communicating, which makes it much more vulnerable to man-in-the-middle attacks. Suppose your recursive resolver asks the authoritative server for the location of medium.com. A man in the middle who is fast enough to read the packet can quickly reply with some other IP, and the resolver will trust it.

To protect us from this impersonation, the Domain Name System Security Extensions (DNSSEC) were introduced. DNSSEC ensures that the source is actually the source it’s claiming to be. To understand how it works I’ll introduce one more term to you: the Chain of Trust.

As mentioned in the section above, your query goes from your ISP’s resolver to the root server, which refers it to the respective top-level domain (TLD) server, which in turn refers it to the correct authoritative server to resolve the IP of a website. For this to work correctly, the ISP must trust the root servers to give the correct answer, the root servers must trust the TLD servers, and the TLDs must trust your server. This is how the chain is established. Since it is such a crucial part of the internet, an organization called the Internet Corporation for Assigned Names and Numbers (ICANN) oversees the process to see that it happens smoothly.

In 2010 ICANN signed the DNS root zone, the data that the 13 root servers publish, which makes the root the starting point of trust and lets it vouch for the zones below it. Each TLD in turn has its public key vouched for by a signed record in the root zone, so anyone who trusts the root can trust the TLD based on that signature. The TLDs then do the same for the authoritative name servers below them, which sign their own DNS records.

To sum up, when the recursive server asks the root server for the location of google.com, the root server replies with a referral to the TLD server together with signed information identifying the TLD’s public key. Since we trust the root server, we trust that key. The TLD, when it sends the referral to the authoritative name server, vouches for the authoritative server’s public key in the same way. When the final answer arrives, we can use the keys we collected along the way to verify that the referral really came from the TLD and that the answer really came from google.com’s authoritative name server.

Now with that out of the way, how do we shut off the internet then? Remember ICANN, the one that signs the root zone? For signing it they use asymmetric cryptography. The public key is distributed to everyone, and the private key is stored by ICANN in a very, very, very safe location, in a safe with FBI-grade security. The private keys are stored in 4 hardware security modules (HSMs): 2 at one site and the other 2 at a second site 2500 miles away, just for redundancy. To get to the HSMs you need, apart from getting past military-grade security, smart cards, which are held by 7 people in the world (actually 14, with 7 more for redundancy). If the DNS is compromised, 5 of the 7 people are required to visit the facility to open the HSMs and reset the private key or, in the worst case, shut down the DNS, which would invalidate the root servers and all the other servers down the chain.

References

  1. Big diagram of the blog
  2. Cloudflare blog
  3. Hussein Nasser
  4. F5 DevCentral

Sometimes it is the people no one can imagine anything of, do the things no one can imagine.
