DNS: the Domain Name System:
In the passing notes metaphor above, I said that the name of the recipient had to be on the outside of the note. This is true for HTTP requests too... they need to say who they are going to.
But you can't use a name for them. None of the routers would know who you were talking about. Instead, you have to use an IP address. That's how the routers in between know which server you want to send your request to.
This causes a problem. You don't want users to have to remember your site's IP address. Instead, you want to be able to give your site a catchy name... something that users can remember.
This is why we have the domain name system (DNS). Your browser uses DNS to convert the site name to an IP address. This process - converting the domain name to an IP address - is called domain name resolution.
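To make this concrete, here is a minimal Python sketch of a program asking for a name to be resolved. It hands the job to the operating system through the standard library's `socket.getaddrinfo`. The helper name `resolve_ipv4` is made up for this example, and the addresses you get back depend on the network you run it on.

```python
import socket

def resolve_ipv4(hostname):
    """Ask the operating system to resolve a host name to IPv4 addresses."""
    results = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each result is (family, type, proto, canonname, (address, port)).
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses
```

Calling `resolve_ipv4("en.wikipedia.org")` on a connected machine would return a list of one or more IPv4 addresses as strings.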
How does the browser know how to do this?
One option would be to have a big list, like a phone book in the browser. But as new web sites came online, or as sites moved to new servers, it would be hard to keep that list up-to-date.
So instead of having one list which keeps track of all of the domain names, there are lots of smaller lists that are linked to each other. This allows them to be managed independently.
In order to get the IP address that corresponds to a domain name, you have to find the list that contains that domain name. Doing this is kind of like a treasure hunt.
What would this treasure hunt look like for a site like the English version of Wikipedia, en.wikipedia.org?
We can split this domain into parts: the subdomain (en), the second-level domain (wikipedia), and the top-level domain (org).
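As a small sketch, splitting the name on its dots gives those parts. The hunt starts from the most general part, so the hypothetical helper `split_domain` below returns them in that order.

```python
def split_domain(name):
    """Return the parts of a domain name, most general part first."""
    labels = name.rstrip(".").split(".")
    return list(reversed(labels))

parts = split_domain("en.wikipedia.org")
print(parts)  # ['org', 'wikipedia', 'en']
```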
With these parts, we can hunt for the list that contains the IP address for the site. We need some help in our quest, though. The tool that will go on this hunt for us and find the IP address is called a resolver.
First, the resolver talks to a server called the Root DNS. It knows of a few different Root DNS servers, so it sends the request to one of them. The resolver asks the Root DNS where it can find more info about addresses in the .org top-level domain.
The Root DNS will give the resolver an address for a server that knows about .org addresses.
This next server is called a top-level domain (TLD) name server. The TLD server knows about all of the second-level domains that end with .org.
It doesn't know anything about the subdomains under wikipedia.org, though, so it doesn't know the IP address for en.wikipedia.org.
The TLD name server will tell the resolver to ask Wikipedia's name server.
The resolver is almost done now. Wikipedia's name server is what's called the authoritative server. It knows about all of the domains under wikipedia.org. So this server knows about en.wikipedia.org, and other subdomains like the German version, de.wikipedia.org. The authoritative server tells the resolver which IP address has the HTML files for the site.
The resolver will return the IP address for en.wikipedia.org to the operating system.
This process is called recursive resolution, because the resolver has to go back and forth asking different servers what's basically the same question.
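The whole hunt can be sketched with a few dictionaries standing in for the servers. This is a toy model, not how a real resolver is implemented: the server names are invented, and 192.0.2.10 is a placeholder from an address range reserved for documentation, not Wikipedia's real address.

```python
# Toy stand-ins for the three kinds of servers (all values are placeholders).
ROOT = {"org": "org-tld-server"}
TLD_SERVERS = {"org-tld-server": {"wikipedia.org": "wikipedia-name-server"}}
AUTHORITATIVE = {"wikipedia-name-server": {"en.wikipedia.org": "192.0.2.10"}}

def resolve(name):
    """Follow the referrals from the root, to the TLD server, to the authoritative server."""
    labels = name.split(".")
    tld = labels[-1]
    second_level = ".".join(labels[-2:])
    steps = []

    # Step 1: the Root DNS points at the server for the top-level domain.
    tld_server = ROOT[tld]
    steps.append(("root", tld_server))

    # Step 2: the TLD server points at the name server for the second-level domain.
    name_server = TLD_SERVERS[tld_server][second_level]
    steps.append(("tld", name_server))

    # Step 3: the authoritative server knows the address itself.
    address = AUTHORITATIVE[name_server][name]
    steps.append(("authoritative", address))
    return address, steps

address, steps = resolve("en.wikipedia.org")
for server, answer in steps:
    print(f"{server} server answered: {answer}")
```

Each step asks about the same name, but only the last server can answer with an address. The others can only say who to ask next.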
I said we need a resolver to help us in our quest. But how does the browser find this resolver? In general, it asks the computer's operating system to set it up with a resolver that can help.
How does the operating system know which resolver to use? There are two possible ways.
You can configure your computer to use a resolver you trust. But very few people do this.
Instead, most people just use the default. And by default, the OS will just use whatever resolver the network told it to. When the computer connects to the network and gets its IP address, the network recommends a resolver to use.
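On many Unix-like systems, the resolver the OS ends up with is listed on `nameserver` lines in a file called /etc/resolv.conf (though some setups point that file at a local helper instead). The sketch below parses text in that format. The sample text and its address are made up for illustration.

```python
def parse_nameservers(resolv_conf_text):
    """Collect the addresses on 'nameserver' lines of resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers

# Made-up sample of what a network might hand to your computer.
sample = """# generated when the computer joined the network
nameserver 192.0.2.53
search example.net
"""
print(parse_nameservers(sample))  # ['192.0.2.53']
```

To look at a real machine, you could pass in the contents of /etc/resolv.conf, assuming the file exists on your system.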
This means that the resolver that you're using can change multiple times per day. If you head to the coffee shop for an afternoon work session, you're probably using a different resolver than you were in the morning. And this is true even if you have configured your own resolver, because there's no security in the DNS protocol.
How can DNS be exploited?:
So how can this system make users vulnerable?
Usually a resolver will tell each DNS server what domain you are looking for. This request sometimes includes your full IP address. Or if not your full IP address, increasingly often the request includes most of your IP address, which can easily be combined with other information to figure out your identity.
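To get a feel for what "most of your IP address" means, here is a sketch that keeps only the first 24 bits of an address. The 24-bit prefix length is an assumption for illustration, and the address comes from a range reserved for documentation.

```python
import ipaddress

def truncate_client_ip(ip, prefix_len=24):
    """Keep only the first prefix_len bits of an address, zeroing the rest."""
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network)

print(truncate_client_ip("203.0.113.57"))  # 203.0.113.0/24
```

Only 256 addresses share that prefix, which is why it narrows down who is asking so well.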
This means that every server that you ask to help with domain name resolution sees what site you're looking for. But more than that, it also means that anyone on the path to those servers sees your requests, too.
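To see why anyone on the path can read the request, it helps to look at the bytes. The sketch below builds a minimal, classic DNS query by hand (header plus one question, with no extensions). The function name and query ID are arbitrary choices for this example.

```python
import struct

def build_dns_query(name, query_id=0x1234):
    """Build a minimal DNS query asking for the A record (IPv4 address) of name."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is sent as length-prefixed labels, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    question = qname + struct.pack("!HH", 1, 1)
    return header + question

packet = build_dns_query("en.wikipedia.org")
print(packet)
```

The printed bytes contain `wikipedia` and `org` as readable text. Nothing in the message is encrypted, so anyone who can see the packet can see the site name.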
There are a few ways that this system puts users' data at risk. The two major risks are tracking and spoofing.
Like I said above, it's easy to take the full or partial IP address info and figure out who's asking for that web site. This means that the DNS server and anyone along the path to that DNS server - called on-path routers - can create a profile of you. They can create a record of all of the web sites that they've seen you look up.
And that data is valuable. Many people and companies will pay lots of money to see what you are browsing for.
Even if you didn't have to worry about the possibly nefarious DNS servers or on-path routers, you still risk having your data harvested and sold. That's because the resolver itself - the one that the network gives to you - could be untrustworthy.
Even if you trust your network's recommended resolver, you're probably only using that resolver when you're at home. Like I mentioned before, whenever you go to a coffee shop or hotel or use any other network, you're probably using a different resolver. And who knows what its data collection policies are?
Beyond having your data collected and then sold without your knowledge or consent, there are even more dangerous ways the system can be exploited.
With spoofing, someone on the path between the DNS server and you changes the response. Instead of telling you the real IP address, a spoofer will give you the wrong IP address for a site. This way, they can block you from visiting the real site or send you to a scam one.
Again, this is a case where the resolver itself might act nefariously.
For example, let's say you're shopping for something at Megastore. You want to do a price check to see if you can get it cheaper at a competing online store, big-box.com.
But if you're on Megastore WiFi, you're probably using their resolver. That resolver could hijack the request to big-box.com and lie to you, saying that the site is unavailable.
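A toy sketch of the difference between an honest resolver and one that lies. Everything here is hypothetical: the function names, the block list, and the address, which is a placeholder from a range reserved for documentation.

```python
# Placeholder record for the competing store.
REAL_RECORDS = {"big-box.com": "198.51.100.7"}

# Names this made-up store resolver has decided to hide from shoppers.
BLOCKED = {"big-box.com"}

def honest_resolver(name):
    """Return the real address, or None if the name is unknown."""
    return REAL_RECORDS.get(name)

def hijacking_resolver(name):
    """Behave like the honest resolver, except for names on the block list."""
    if name in BLOCKED:
        return None  # claim the site can't be found
    return honest_resolver(name)

print(honest_resolver("big-box.com"))
print(hijacking_resolver("big-box.com"))
```

From the browser's side, both answers look equally official. Nothing in the response reveals which resolver was telling the truth.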
This tutorial is taken from "A cartoon intro to DNS over HTTPS".