NAT notes - fordsfords/fordsfords.github.io GitHub Wiki

A Network Address Translation (NAT) router is a type of router that ... wait for it ... translates addresses.

(Ba dum)

There are several different kinds of NATs, depending on what you are trying to do.

One-to-many

The vast majority of NATs deployed on planet Earth are one-to-many NATs. This is what your Wi-Fi router is. It is what "saved the Internet" by allowing a local network with many hosts to use private IP addresses (10.x.x.x, 172.16.x.x through 172.31.x.x, or 192.168.x.x), while those local hosts can still access sites on the public Internet using TCP.
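
To make "private IP address" concrete, here is a minimal sketch that checks whether an address falls in one of the three private IPv4 ranges reserved by RFC 1918. The function name `is_rfc1918` is made up for illustration; the `ipaddress` module is in Python's standard library.

```python
import ipaddress

# The three private IPv4 ranges reserved by RFC 1918.
PRIVATE_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(addr: str) -> bool:
    """Return True if addr is inside one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in PRIVATE_NETS)


print(is_rfc1918("192.168.1.2"))   # the laptop's address from the example
print(is_rfc1918("140.82.114.4"))  # the public address used for GitHub
```

Note that only part of the 172. and 192. space is private, so "starts with 172." is not a sufficient test.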

Imagine for a moment that the NAT router was just a plain old router. It just forwards traffic. You could imagine the laptop trying to establish a TCP connection to github.com. The TCP "SYN" packet has 140.82.114.4 as its destination address and 192.168.1.2 as its source address. Let's say that SYN packet made it to Github and let's even say that Github wanted to accept the connection. So it wants to send a "SYN/ACK" packet back to 192.168.1.2. OOPS! Too bad. That address is not a publicly routed address. Github's ISP will refuse to route that packet. (In reality, the original SYN packet would probably never make it to Github.) For Github to be able to communicate, the packet needs to come from a host with a public IP address. So you could just ask your ISP for a bunch of public IP addresses for your laptop, phone, desktop, thermostat, refrigerator, can opener, etc. Good luck with that - the Internet ran out of IPv4 addresses quite a while ago.

Instead, the router is a one-to-many NAT router. It has a single public IP address, 157.130.92.19, on the Internet-facing interface. When the laptop sends its SYN packet, its source address is 192.168.1.2, but when the NAT forwards it to the public Internet, it translates the source address to the NAT's public IP address, 157.130.92.19. Now when Github gets the SYN packet, it can properly respond with "SYN/ACK" back to 157.130.92.19, and the NAT router will get it.

But now what? The NAT just got a SYN/ACK from Github. The destination address of that packet is the NAT's own address. The NAT might be thinking, "Hmm ... I bet one of my LAN devices asked for this connection, and I translated it as I forwarded it. I wonder which device sent the original SYN. Phone? Desktop?"

This gets us to the root of the one-to-many NAT - it maintains state. Each time a private-side host sends a SYN packet, the NAT remembers the source IP:Port. It then allocates a new source port on its public side and overwrites (translates) the SYN packet's source IP and port with its own public IP and that new port. That gets forwarded to GitHub. When GitHub's response comes back to the NAT's IP:Port, the NAT uses that port to look up the original source IP:Port, overwrites that info onto GitHub's response packet's destination IP:Port, and forwards it to the private LAN.
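
The bookkeeping described above can be sketched in a few lines. This is a toy model, not how any particular router implements it: the class and field names are invented, the starting port 40000 is arbitrary, and it keys the state only on the private IP:Port (real NATs typically also track the protocol and the remote endpoint).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class Packet:
    src: Endpoint
    dst: Endpoint


class OneToManyNat:
    """Toy one-to-many NAT: remembers which private IP:Port owns each public port."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.by_private: Dict[Endpoint, int] = {}      # private IP:Port -> public port
        self.by_public_port: Dict[int, Endpoint] = {}  # public port -> private IP:Port

    def outbound(self, pkt: Packet) -> Packet:
        """LAN -> Internet: rewrite the source to the NAT's public IP:Port."""
        port = self.by_private.get(pkt.src)
        if port is None:
            port = self.next_port
            self.next_port += 1
            self.by_private[pkt.src] = port
            self.by_public_port[port] = pkt.src
        return Packet(src=(self.public_ip, port), dst=pkt.dst)

    def inbound(self, pkt: Packet) -> Optional[Packet]:
        """Internet -> LAN: rewrite the destination back, or drop if no state."""
        private = self.by_public_port.get(pkt.dst[1])
        if private is None:
            return None  # nobody on the LAN asked for this
        return Packet(src=pkt.src, dst=private)


nat = OneToManyNat("157.130.92.19")
syn = nat.outbound(Packet(src=("192.168.1.2", 51000), dst=("140.82.114.4", 443)))
syn_ack = nat.inbound(Packet(src=("140.82.114.4", 443), dst=syn.src))
print(syn.src)      # what GitHub sees as the source
print(syn_ack.dst)  # where the NAT forwards the reply
```

The laptop's source port 51000 is just an example value. The point is that the public port is the lookup key on the way back in.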

The NAT needs to maintain this state over time. Every packet sent on that TCP connection needs to have that translation done, both on the outgoing path and the incoming path. It's only when the TCP connection is terminated (with FIN or RST) that the NAT can free up the state entry and the port. And since TCP connections can die without a proper FIN or RST, the NAT will have an internal activity timeout after which it will free up the state. This can cause problems if you run applications with long-lived TCP connections that have stretches of inactivity - the NAT can disconnect you without your knowledge.
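
Here is a rough sketch of that cleanup logic: an entry goes away on FIN or RST, or when it has been idle for too long. The class name and the 300-second timeout are assumptions made up for this example (real routers pick their own values), and a real NAT follows the TCP close handshake more carefully than this.

```python
from typing import Dict, List


class NatStateTable:
    """Toy state table: public port -> time of the last packet seen."""

    def __init__(self, idle_timeout_s: float = 300.0):  # timeout value is hypothetical
        self.idle_timeout_s = idle_timeout_s
        self.last_seen: Dict[int, float] = {}

    def touch(self, port: int, now: float) -> None:
        """Any packet on the connection, in either direction, counts as activity."""
        self.last_seen[port] = now

    def close(self, port: int) -> None:
        """FIN or RST seen: free the entry (and its port) right away."""
        self.last_seen.pop(port, None)

    def expire_idle(self, now: float) -> List[int]:
        """Free entries that have been silent longer than the timeout."""
        stale = [p for p, t in self.last_seen.items() if now - t > self.idle_timeout_s]
        for p in stale:
            del self.last_seen[p]
        return stale


table = NatStateTable(idle_timeout_s=300.0)
table.touch(40000, now=0.0)    # a connection that then goes quiet
table.touch(40001, now=200.0)  # a connection with more recent traffic
print(table.expire_idle(now=400.0))  # only the quiet connection is dropped
```

From the application's point of view, the connection on port 40000 still looks open; it only finds out when its next packet goes nowhere.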

Note that this puts an upper limit on the number of simultaneous TCP connections that can be maintained by the NAT. Since each connection needs to have a NAT port allocated for it, and a port number is only 16 bits, the NAT can never support more than 65535 active connections on its one public IP address. (Older consumer-grade routers probably can't even handle that many; I don't know if new ones can.) This might not seem like a big restriction, but remember that ALL connections from the LAN side are aggregated by the NAT. So the laptop, phone, desktop, thermostat, refrigerator, can opener, etc. need to share the 65535 (or fewer) possible active connections.
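
The limit is easy to see with a toy port allocator. This is a sketch under the one-port-per-connection model described above; the `PortPool` class is invented for illustration, and the three-port pool is deliberately tiny so that it runs out quickly.

```python
class PortPool:
    """Hands out public-side ports; one port per active connection."""

    def __init__(self, low: int = 1, high: int = 65535):
        # Stored in reverse so that pop() hands out the lowest port first.
        self.free = list(range(high, low - 1, -1))

    def allocate(self) -> int:
        if not self.free:
            raise RuntimeError("NAT port pool exhausted")
        return self.free.pop()

    def release(self, port: int) -> None:
        self.free.append(port)


print(len(PortPool().free))  # 65535 ports at the very most

# A deliberately tiny pool shared by every device on the LAN.
pool = PortPool(low=40000, high=40002)
for device in ("laptop", "phone", "desktop"):
    print(device, pool.allocate())
try:
    pool.allocate()  # the thermostat is out of luck
except RuntimeError as err:
    print(err)
```

Once a connection closes and its port is released, the next device can use it.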

Routing vs. Forwarding

There is a subtle point that's worth emphasizing. When an internal host, say the laptop, wants to send a packet to an external host, say GitHub, it uses that external host's IP address as the destination address. Since the laptop is not directly connected to the 140... network, the laptop's IP layer will decide on a gateway to use and will put that gateway's MAC address as the Ethernet destination of the Ethernet frame. But the IP destination address is still GitHub's 140... address. As the packet traverses any number of intermediary networks on its way to GitHub, the packet's destination Ethernet address will change from hop to hop, but the destination IP address will always be GitHub's 140... address.

In contrast, when GitHub sends a reply, that destination IP address will be the NAT's 157... public IP address. Again, as the packet traverses the intermediary networks, the Ethernet destination will change with each hop, but the destination IP address will always be the NAT's 157... address. Once it finally arrives at the NAT, the NAT will modify the destination IP address to the laptop's 192... internal IP address.

This difference is important. With outgoing packets, IP routing is used end-to-end. Therefore, the NAT's inward-facing interface has to appear in the laptop's routing table (in this case, as the default route). I.e. the NAT is listed as a gateway to other networks. With returning packets, the packet arrives at its initial destination, the NAT, only to be forwarded to the final destination by the NAT. Public machines do NOT list the NAT's public IP as a gateway to the internal network.

Limitations

The one-to-many NAT has some limitations. Let's say you want to make a service whereby when somebody checks something into a Github repo, Github will establish a TCP connection back to your laptop and inform you. Sounds straightforward - just have Github send a SYN packet to your laptop's IP, 192.168.1.2. OOPS! Too bad. That's a private IP address and isn't routed. In point of fact, there are probably millions of devices in the world with that IP address. So Github will have to send its SYN to the NAT's public address, 157.130.92.19. But what's the NAT going to do with that packet? It could look in its state for the port, but it won't find anything. The NAT's state only has entries for established TCP connections.

The fact of the matter is that Github can't establish a TCP connection directly to one of your private machines, at least not with the minimally-configured NAT that we've been talking about. All connections have to be initiated from the private side. To accomplish the goal of being informed when somebody checks into a Github repo, your private host will have to periodically initiate a TCP connection to Github and ask it if it has anything to say.

This is not a bug, it's a feature. It's why a NAT router is also considered to be a firewall. Hackers on the public internet cannot initiate connections to hosts on your private network.

But there are some users (especially gamers) who want their applications to be able to do true peer-to-peer communication. NAT routers will let you define additional rules to allow incoming packets that are not associated with an already-established TCP connection. Typically what you do is define a port (either TCP or UDP) that will forward to a specific host on the private side. For example, you could have TCP port 80 forwarded to the desktop, 192.168.1.3. This will allow your desktop to present a web server to the public internet.
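
A port-forwarding rule is just a static entry that gets consulted when an incoming packet matches no connection state. A minimal sketch, with a made-up rule table and function name (real routers configure this through their own admin interface):

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)

# (protocol, public port) -> private host to forward to.
FORWARD_RULES: Dict[Tuple[str, int], Endpoint] = {
    ("tcp", 80): ("192.168.1.3", 80),  # the desktop's web server
}


def route_unsolicited(proto: str, dst_port: int) -> Optional[Endpoint]:
    """Where should an unsolicited incoming packet go? None means drop it."""
    return FORWARD_RULES.get((proto, dst_port))


print(route_unsolicited("tcp", 80))  # forwarded to the desktop
print(route_unsolicited("tcp", 22))  # no rule, so the packet is dropped
```

Anything without a rule is still dropped, so the firewall behavior stays in place for every other port.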

Cascading NATs

It is probable that the public internet-facing interface on your home router is not assigned an actual public IP address. Instead, your router is probably itself on the private side of a larger NAT (often called carrier-grade NAT). Consumer ISPs typically choose to conserve their limited sets of public IP addresses by defining NATs that more than one customer will share. (Example)

Basic NAT

We've talked about the one-to-many NAT, which is what the vast majority of NATs worldwide are.

A Basic NAT is much simpler and does not involve the NAT maintaining any state. This just has the NAT translating IP addresses, not ports. I've also heard this called a "one-to-one" NAT and even a "two-way NAT".

Here's a possible application of a basic NAT: let's say that you have two independent private networks, maintained by independent organizations. The two orgs agree that they should set up a direct link between Host1b and Host2b.

You can't just bridge these two networks together - they both have a 10.1.1.10 host for one thing. And the two organizations probably want to limit the visibility of their respective networks to just the hosts involved in the transactions. So they want to use a NAT router.

In this NAT, there is no state held, no TCP connection table, no ports being re-written. Only IP addresses. And the NAT's two IP addresses are NOT used as gateways to other networks and therefore do not appear in any routing tables.

When Host1b wants to send something to Host2b, it sends a packet to the local IP address 10.1.1.251, which will be received by the NAT. The NAT will replace the destination with 10.1.1.11 and the source address with 10.1.1.249 and forward it to Host2b. If Host2b replies, it will send it to 10.1.1.249. The NAT router will replace the destination IP with 10.1.1.23 and the source with 10.1.1.251 and forward the packet to Host1b. (BTW, the NAT also has to recalculate the IP header checksum after the addresses are modified, and the TCP or UDP checksum too, since those also cover the IP addresses.)
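
Here is a minimal sketch of that stateless translation, including the IP header checksum fix-up. Assumptions: a plain 20-byte IPv4 header with no options, rules keyed only on the destination address (a real NAT would also consider which interface the packet arrived on), and made-up function names. It does not touch the TCP or UDP checksum, which also covers the IP addresses and would need updating in a real NAT.

```python
import socket
import struct
from typing import Optional

# Static rules from the example: destination seen by the NAT -> (new source, new destination).
RULES = {
    "10.1.1.251": ("10.1.1.249", "10.1.1.11"),  # Host1b -> Host2b
    "10.1.1.249": ("10.1.1.251", "10.1.1.23"),  # Host2b -> Host1b
}


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words, then complemented."""
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_addresses(header: bytes, new_src: str, new_dst: str) -> bytes:
    """Copy of a 20-byte IPv4 header with new addresses and a fresh checksum."""
    zeroed = (header[:10] + b"\x00\x00"
              + socket.inet_aton(new_src) + socket.inet_aton(new_dst))
    return zeroed[:10] + struct.pack("!H", ipv4_checksum(zeroed)) + zeroed[12:]


def translate(header: bytes) -> Optional[bytes]:
    """Apply the matching static rule, or return None if no rule matches."""
    rule = RULES.get(socket.inet_ntoa(header[16:20]))
    return None if rule is None else rewrite_addresses(header, *rule)


# Host1b (10.1.1.23) sends to the NAT's address on its own network.
hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0,
                  socket.inet_aton("10.1.1.23"), socket.inet_aton("10.1.1.251"))
out = translate(hdr)
print(socket.inet_ntoa(out[12:16]), "->", socket.inet_ntoa(out[16:20]))
```

Notice there is no table that grows or shrinks: the same two rules handle every packet, which is why no state is needed.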

Note that with this configuration, no other communication paths are possible; just Host1b with Host2b.

(FYI - the IP addresses recognized by the NAT router are chosen by the two orgs' network administrators to be unused IP addresses in their respective 10.1.1.0 subnets.)