tags : Networking, peer-to-peer
Important links
Links
- How NAT traversal works · Tailscale : Lot of what’s in this page comes from this post.
- Peer-to-Peer Communication Across Network Address Translators
- how does linux nat a ping?
- NAT is inevitable even with IPv6 | Lobsters
- Others: pwnat
RFCs
- NAT
- TCP
- https://datatracker.ietf.org/doc/html/rfc5382 (NAT for TCP)
- Usually we prefer UDP when using NAT
- UDP
- https://datatracker.ietf.org/doc/html/rfc4787 (NAT for UDP)
- https://datatracker.ietf.org/doc/html/rfc6888 (update on 4787)
- https://datatracker.ietf.org/doc/html/rfc7857 (update on 4787)
- TCP
- STUN(Session Traversal Utilities for NAT)
- TURN(Traversal Using Relays around NAT)
FAQ
What external servers can be necessary when doing NAT traversal?
Coordination/Signaling/Rendezvous/side channel server
- It solves discovery
- This is usually managed by the application developer themselves
- WebRTC tells you to go home and bring your own signaling channel
- Eg. Mainline DHT, Browsers usually use websockets etc, DCUtR.
- See Bittorrent
STUN server
- It tells peers what they look like on the internet
- RFC 5389: Session Traversal Utilities for NAT (STUN)
- Eg. STUN/ICE, AutoNAT
Relay server
- Basically something in between that delivers messages from one client to another.
- It is usually used as a fallbak when peers absolutely can’t talk to each other p2p
- Eg. TURN, simple WebSocket-based relay, Syncthing Relay Server, DERP Servers (tailscale), Circuit Relay
Can coordination/signaling/rendezvous server & STUN server be same?
They are not the same, but I guess you could in a way mix them into one if your application/protocol needs be like that. I am not aware of things that do it atm.
Does NAT only do port mapping?
- It has to create a mapping, may not only be port mapping
- See How does a NAT server forward ping ICMP echo reply packets
What is the problem?
- NAT by itself is cool and all, decent solution for ipv4 exhaustion when ipv6 not already there imo
- But problem comes when we want to use it for p2p stuff, cuz w p2p we have this deadlock
- Both sides have to speak first
- Need to open all the FW on both sides
- But neither side knows to whom to speak
- Need remote
ip:port
to create a proper5-tuple
for FW
- Need remote
- And can’t know remote
ip:port
until the other side speaks first- Can’t know if remote is never able to share its
ip:port
because it needs ourip:port
too.
- Can’t know if remote is never able to share its
- Both sides have to speak first
- So the solution(and also the problem) to this is “NAT traversal”. And it does not work 100% of the time. But we can try our best + there are fallbacks.
What do we need?
Final goal: Have UDP packets flowing bidirectionally between 2 devices.
For this we have:
- 1 socket
- Initially we’ll using the socket to NAT traversal stuff
- After that we can send out actual data and do whatever
- It needs to be the same socket, otherwise this cannot happen
For this we need:
- Protocol needs to be UDP
- UDP simply has more success rate
- TCP can work but is a hassle because additional handshakes and holepunching needs timing to be right. If you really want streaming data, might go w QUIC. Bittorrent uses something called uTCP which is wrapper around UDP.
- Direct control over socket that’ll do network I/O
- Data transfer (main protocol) will need to happen through the same socket as the udp socket via which nat travesal will happen
What do we need to overcome?
Stateful Firewalls
- See Firewalls
- These maintain a state of
5-tuple
(s_ip:s_port:d_ip:d_port:TCP/UDP
)
Common scenario for home network
- This is about how you’re able to connect to any website behind a firewall. Same for connecting via VPN. i.e You’re behind a firewall and your destination is public.
- All inbound connections are blocked
- Once there is a outbound for
d_ip:d_port
froms_ip:s_port
, packets fromd_ip:d_port
will be allowed. - But there are timeouts to this allowing of inbound packets so you need to take measures to like periodically probing etc.
P2P scenario
- In this case both of the parties are behind a firewall
- From the common scenario, we know if the timeout has not expired and there exists a
5-tuple
to thed_ip:d_port
,d_ip:d_port
will be able to connect to us. We can exploit this. - If both the peers know the
ip:port
of each other and use it to open respective firewalls for each other, we have a W. - For both peers to know the
ip:port
of each other- We need: Coordination server / Signaling channel
- Example: DHT, Tailscale CP
- We might try sending packets continuously or at some exact specific time etc. Basically if both packets reach each other before FW timeout we are done here.
NAT Box
- Usually NAT box is combination of
Stateful Firewall + NAT Box
- Nat Box: A Device that modifies L3 header info(
ip:port
) and keeps a record of it. - It can do SNAT and DNAT among other things, but we’re not worried about DNAT atm. (See Selfhosting for info in DNAT). For NAT traversal we’re talking only about
SNAT
. - We basically do “port-mapping” which is stored in something called “NAT table”.
- Basically
internal_ip:internal_port
mapped toexternal_ip:external_port
. external_port
might be same asinternal_port
but we assume it’s not to keep things simple and more realistic.
- Basically
NAT Box(Filtering+Mapping) Types
- In EDM/EDF/EIM/EIF, “endpoint” always refers to destination endpoint.
- For NAT traversal(Hole Punching based), EIF/EDF doesn’t really matter, if it’s EIM, we’d be able to do NAT traversal using STUN.
- Additionally, if one side is using Easy NAT, other side is using Hard NAT, or both using easy NAT etc. all make things different. i.e Remote could be behind a NAT aswell.
Easiness | NAT Type I | EDF/EIF? | EDM/EIM? | (NAT internal) <= [(NAT external)] <- Remote | NAT Type II (older rfc) | STUN | Eg. |
---|---|---|---|---|---|---|---|
0 | Easy(1:N) | EIF | EIM | (INT_ADDR, INT_PORT) <= [ (EXT_ADDR, INT_PORT) <- (*, *)] | Full Cone | WORK | |
1 | Easy(1:N) | EDF | EIM | (INT_ADDR, INT_PORT) <= [ (EXT_ADDR, INT_PORT) <- (REM_ADDR, *)] | Address-Restricted Cone | WORK | |
2 | Easy(1:N) | EDF | EIM | (INT_ADDR, INT_PORT) <= [ (EXT_ADDR, INT_PORT) <- (REM_ADDR, REM_PORT)] | Port-Restricted Cone | WORK(most common) | Home NATs |
4 | Hard(1:1) | EDF | EDM | (INT_ADDR, INT_PORT) <= [ (EXT_ADDR, EXT_PORT1) <- (REM_ADDR, REM_PORT1)] | Symmetric/Bi-Directional | NO WORK(we need TURN ) | Cloud NATs |
(INT_ADDR, INT_PORT) <= [ (EXT_ADDR, EXT_PORT2) <- (REM_ADDR, REM_PORT2)] | |||||||
3 | N/A Hard(1:1) | EDF | EDM | (INT_ADDR, INT_PORT) <= [ (EXT_ADDR, EXT_PORT1) <- (REM_ADDR, REM_PORT1)] | Another variant of sym (port only changes for different addr) | ||
(INT_ADDR, INT_PORT) <= [ (EXT_ADDR, EXT_PORT1) <- (REM_ADDR, REM_PORT2)] | |||||||
- | N/A | EIF | EDM | - | - | - |
Easy and Hard is what we care for p2p nat traversal(hole punching)
Easy (Destination Endpoint-Independent Mapping)
A dog looks like a dog to both you and me. We can say yes that’s a dog.
external_ip:external_port
forinternal_ip:internal_port
is the same for everyone on the internet.
Hard(Destination Endpoint-and-Port-Dependent Mapping)
A dog looks like a dog to me and a cat to you. But atleast we know it’s an animal. So we can guess
external_ip:external_port
forinternal_ip:internal_port
is different for different entities on the internet.
-
Guessing port in Hard NAT
Some large Enterprise NATs use an IP address pooling behavior of “Arbitrary” as a means of hiding the IP address assigned to specific endpoints by making their assignment less predictable. Other NATs use the same external IP address mapping for all sessions associated with the same internal IP address. These NATs have an “IP address pooling” behavior of “Paired”.
STUN
does not work.- The ext
NAT IP:PORT
given by STUN will not be the one we try to connect to the remote because now it differs. - But even if we have Hard NATS on either side, because the spec recommends IP pooling of
"Paired"
, the external IP most probably will be fixed. - We can keep guessing the port address for a while, so in 30mins we should get a port based on birthdy. (See tailscale blog for more info)
- The ext
NAT Box and P2P
- Main purpose of NAT was for the IPv4 exhaustion
- But using NAT meant issues for P2P as with NAT things are no longer really P2P
- Now there are different ways you can setup NAT which makes it more or less easier to do P2P
EDM(Hard)
is basically bad news for “NAT traversal”.
We know
- How to deal w FW for normal and P2P connections
- That NAT box = FW + L3 header modifier + port mapper
- Different types of NAT and they port map differently
How NAT causes problems for P2P
- Now
internal_ip:internal_port
is not how we connect to other systems on the internet when using NAT.- We need to know
external_ip:external_port
so that we can let our peer know about it via the coordination server.
- We need to know
- Problem is, we do not know what is our
external_ip:external_port
because it’s per socket and sort of out of our control(in many cases).- This is where solutions such as a STUN server can help.
Guidelines
- Symmetric to Symmetric: TURN.
- Symmetric to Port-Restricted Cone: TURN.
- Symmetric to Address-Restricted Cone: STUN (but probably not reliable).
- Symmetric to Full Cone: STUN.
- Everything else: STUN.
Some solutions to P2P NAT traversal
Manual/Dynamic port mapping
- If the NAT is in our control we can try manually mapping the
external_ip:external_port
using port mapping software like upnp, NAT-PMP, PCP(NAT-PMPv2) etc. - This is usually not possible if you’re behind a
CGNAT
Use a STUN server
- A
STUN server
tells theinternal host
what it looks like on the internet. It needs to live outside the NATed network for this. Internal host
can then provide this info to thecoordination server
so thatother hosts
can use this info to send packets.- This generally only works with
EMI NAT
When STUN server is not enough (TURN
)
- With EDM NAT, the info we get from the
STUN server
will be incorrect because the sourceip:port
mapping will differ by destination andSTUN server
andthe peer
we want to talk to are different destinations. - Now we need a
Relay server
.TURN
is an example ofRelay server
.
What else?
NAT hairpinning
CGNAT
- NATs can be good and bad on how they are implemented by the ISPs
- These are unlike your home NATs.
- They usually
- Are big
- Don’t allow custom port mapping
- Have policies (because carriers run it)
- Can block stuff at their will
- Limit customers on use of the addresses and ports
- f’around with timeouts
More info
- Carrier-grade NAT - Wikipedia
- RFC 3519 - Mobile IP Traversal of Network Address Translation (NAT) Devices
- Multi-level Architecture for P2P Services in Mobile Networks | SpringerLink
- What is Carrier-grade NAT (CGN/CGNAT)? | Glossary | A10 Networks
- networking - Why 3G network NAT cannot be bypassed? - Stack Overflow
Okay, how to implement?
Custom solution
You can always come up with a custom implementation from scratch if you want but there are already some recommended stuff
STUN/TURN and ICE
- What are STUN, TURN, and ICE?
- webrtc - ICE vs STUN vs TURN - Stack Overflow
- ICE always tastes better when it trickles!
- What type of NAT combinations requires a TURN server?
- under what scenarios does SERVER REFLEXIVE and PEER REFLEXIVE addresses/candidates differ from each other?
libp2p
- This is what IPFS uses under the hood for p2p mechanisms
- They also have this one: libp2p + webrtc, I am not sure how that is supposed to work.
- See Hole Punching - libp2p
- Components
- Coordination server: DCUtR
- DCUtR is interesting because it claims to not require centralized signalling server unlike most other implementations.
- STUN server: AutoNAT
- Relay server: Circuit Relay
- Coordination server: DCUtR
WebRTC
See WebRTC