A p2p Vision for QUIC
This article was co-authored by Christian Huitema. You can find his blog at privateoctopus.com.
Over the years, the IETF has standardized numerous protocols for establishing IP packet flows through NATs and firewalls, including STUN, ICE, and TURN.
This is an inherently messy topic, and I can highly recommend reading Eric Rescorla’s blog post series about NATs (part 1, part 2, part 3). I won’t go into details about how exactly NATs work (again, read ekr’s blog posts!), but in a nutshell, they rewrite the source IP address (and port) of packets passing through the NAT.
[Sequence diagram: the client (192.168.1.10) sends a packet with source address 192.168.1.10:1234; the NAT rewrites the source to its public address 203.0.113.5:4321 and forwards the packet to the server (198.51.100.20); the server’s response, addressed to 203.0.113.5:4321, is rewritten by the NAT back to 192.168.1.10:1234 and delivered to the client.]
This allows multiple clients behind that NAT to share the same external IP address. Clients are able to reach any server on the internet, but it doesn’t allow nodes on the internet to reach the client, since the NAT won’t forward packets to the client, unless it determines that they belong to an existing flow. In that sense, the NAT acts as a firewall.
Although simple in theory, there are a lot of different ways to implement a NAT. For our purposes, the main difference lies in how the port numbers for outgoing packets are allocated. Depending on the port allocation logic, it might or might not be possible to establish a direct connection between two peers.
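To make the difference concrete, here is a purely illustrative Go sketch (the type and its flag are ours, not taken from any RFC) contrasting what RFC 4787 calls endpoint-independent mapping with address-and-port-dependent mapping:

```go
package main

import "fmt"

// natMapping decides which external port an internal flow gets.
// RFC 4787 distinguishes (among others):
//   - Endpoint-Independent Mapping: the external port depends only on the
//     internal source address/port, so every destination sees the same port.
//   - Address and Port-Dependent Mapping: a fresh external port is allocated
//     per destination, which is what makes hole punching unreliable.
type natMapping struct {
	endpointIndependent bool
	table               map[string]int // mapping key -> external port
	nextPort            int
}

func (n *natMapping) externalPort(src, dst string) int {
	key := src
	if !n.endpointIndependent {
		key = src + "->" + dst // the allocation also depends on the destination
	}
	if p, ok := n.table[key]; ok {
		return p
	}
	n.nextPort++
	n.table[key] = n.nextPort
	return n.nextPort
}

func main() {
	for _, independent := range []bool{true, false} {
		nat := &natMapping{endpointIndependent: independent, table: map[string]int{}, nextPort: 40000}
		p1 := nat.externalPort("192.168.1.10:1234", "198.51.100.20:443")
		p2 := nat.externalPort("192.168.1.10:1234", "203.0.113.99:443")
		fmt.Printf("endpoint-independent=%v: external ports %d and %d\n", independent, p1, p2)
	}
}
```

With an endpoint-independent NAT, both destinations see the same external port, which is exactly what makes the hole punching described below feasible.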
The problem that p2p network engineers hope to solve is the following: how can two nodes, each behind its own NAT, connect to each other, no matter the kind of NAT?
In this post, we explore how QUIC can be leveraged to provide a comprehensive solution for NAT traversal, encompassing everything from address discovery to UDP proxying, potentially simplifying and improving upon traditional p2p networking approaches.
The Traditional Way
Address Discovery using STUN
Depending on the deployment scenario, a new node joining the network might or might not know its (public) IP address. Traditionally, applications use STUN (RFC 8489) to discover their public IP address. In a nutshell, a STUN client sends a “Binding Request” to a STUN server. The “Binding Response” of the server encodes the IP source address and the source port that the server observed on the client’s request.
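For illustration, here is a minimal Go sketch that sends a STUN Binding Request over UDP and extracts the XOR-MAPPED-ADDRESS attribute from the response. It is deliberately bare-bones (IPv4 only, no retransmissions, hardly any error handling), and the STUN server address is just an example:

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
	"net"
	"time"
)

const magicCookie = 0x2112A442

func main() {
	// Any STUN server works here; this public one is just an example.
	conn, err := net.Dial("udp", "stun.l.google.com:19302")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Build a STUN Binding Request (RFC 8489): type, length, magic cookie, transaction ID.
	req := make([]byte, 20)
	binary.BigEndian.PutUint16(req[0:2], 0x0001) // Binding Request
	binary.BigEndian.PutUint16(req[2:4], 0)      // message length: no attributes
	binary.BigEndian.PutUint32(req[4:8], magicCookie)
	rand.Read(req[8:20]) // 96-bit transaction ID

	if _, err := conn.Write(req); err != nil {
		panic(err)
	}
	conn.SetReadDeadline(time.Now().Add(3 * time.Second))
	buf := make([]byte, 1500)
	n, err := conn.Read(buf)
	if err != nil {
		panic(err)
	}

	// Walk the attributes and look for XOR-MAPPED-ADDRESS (0x0020, IPv4 only here).
	for pos := 20; pos+4 <= n; {
		attrType := binary.BigEndian.Uint16(buf[pos : pos+2])
		attrLen := int(binary.BigEndian.Uint16(buf[pos+2 : pos+4]))
		if pos+4+attrLen > n {
			break
		}
		val := buf[pos+4 : pos+4+attrLen]
		if attrType == 0x0020 && len(val) >= 8 && val[1] == 0x01 {
			// Port and address are XOR'ed with the magic cookie (RFC 8489, Section 14.2).
			port := binary.BigEndian.Uint16(val[2:4]) ^ uint16(magicCookie>>16)
			addr := binary.BigEndian.Uint32(val[4:8]) ^ magicCookie
			ip := net.IPv4(byte(addr>>24), byte(addr>>16), byte(addr>>8), byte(addr))
			fmt.Printf("reflexive address: %s:%d\n", ip, port)
			return
		}
		pos += 4 + (attrLen+3)/4*4 // attribute values are padded to 4-byte boundaries
	}
	fmt.Println("no XOR-MAPPED-ADDRESS found in the response")
}
```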
The client can infer from the responses to the STUN requests if it is located behind a NAT. The client might even compare responses from different STUN servers and attempt to infer the type of NAT, although this is notoriously difficult to get right. The IETF has pretty much given up on this approach (see section 2 of RFC 5389).
The STUN protocol can run over UDP, TCP, DTLS (RFC 7350) or TLS.
Hole Punching Coordination using ICE
As we’ve seen above, once the NAT has seen the first packet to a remote server pass through, the NAT opens up the return path, allowing packets from the outside world to reach the node. The idea behind hole punching is to have both peers send packets simultaneously, each of them “punching” a hole in their respective firewall, establishing a direct flow of packets between the two nodes.
The traditional hole punching process is specified by ICE RFC 8445. ICE starts with an address gathering phase, in which the two peers separately contact STUN servers to obtain lists of candidate IP addresses and ports. They may add to that list a set of TURN addresses (see next section, “relaying”).
One of the endpoints creates a list of available addresses, ordered by priority, and sends it to its peer. The peer compares that to its own list, establishes a list of “candidate pairs”, and sends it back. At that point, both endpoints have the same list of candidate pairs, and start the “connectivity” check. Each host will try to send a STUN binding request from its selected address to the paired address, and respond to STUN requests that it might receive from the peer. If at least one of these trials succeeds, the peers have established a new connection over this address pair. If several succeed, they keep the preferred pair.
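To give a flavor of the pairing logic, here is a simplified Go sketch that forms candidate pairs and orders them using the priority formulas from RFC 8445. The addresses and preference values are made-up examples; real ICE agents track far more state:

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a stripped-down ICE candidate: just an address and a priority
// computed as in RFC 8445, Section 5.1.2.1.
type candidate struct {
	addr     string
	priority uint32
}

func candidatePriority(typePref, localPref, componentID uint32) uint32 {
	return typePref<<24 | localPref<<8 | (256 - componentID)
}

// pairPriority implements the formula from RFC 8445, Section 6.1.2.3, where g
// is the controlling agent's candidate priority and d is the controlled one's.
// The MIN/MAX construction means both peers compute the same value.
func pairPriority(g, d uint32) uint64 {
	lo, hi := uint64(g), uint64(d)
	if lo > hi {
		lo, hi = hi, lo
	}
	var tiebreak uint64
	if g > d {
		tiebreak = 1
	}
	return lo<<32 + 2*hi + tiebreak
}

func main() {
	local := []candidate{
		{"192.168.1.10:1234", candidatePriority(126, 65535, 1)}, // host candidate
		{"203.0.113.5:4321", candidatePriority(100, 65535, 1)},  // server-reflexive candidate
	}
	remote := []candidate{
		{"198.51.100.77:5678", candidatePriority(100, 65535, 1)},
	}
	type pair struct {
		local, remote string
		prio          uint64
	}
	var pairs []pair
	for _, l := range local {
		for _, r := range remote {
			pairs = append(pairs, pair{l.addr, r.addr, pairPriority(l.priority, r.priority)})
		}
	}
	// Both peers end up with the same ordering, so connectivity checks run in the same order.
	sort.Slice(pairs, func(i, j int) bool { return pairs[i].prio > pairs[j].prio })
	for _, p := range pairs {
		fmt.Printf("%s <-> %s (priority %d)\n", p.local, p.remote, p.prio)
	}
}
```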
Relaying using TURN
Unfortunately, no matter how hard you try, there is a certain percentage of nodes for whom hole punching will never work. This is because their NAT behaves in an unpredictable way. While most NATs are well-behaved, some aren’t. This is one of the sad facts of life that network engineers have to deal with.
The only solution for this problem is to employ the help of a third party, i.e. a server that is not located behind a NAT, and therefore can be reached by peers directly, without any hole punching. This server can then relay traffic between the two peers.
Of course, this comes at a cost. The path via the relay might be slower (in terms of latency and / or bandwidth) than a (hypothetical) direct path would have been. And relaying traffic is not free for the operator of the relay: both processing resources and bandwidth cost money. However, we don’t really have a choice here, and despite these shortcomings, having a relayed connection might be preferable to having no connectivity at all.
The traditional solution relies on TURN servers, specified in RFC 5766, to provide this “last resort” connectivity. The node behind a NAT can ask the TURN server to open a UDP or TCP port. This “TURN port” is usually specialized: the client specifies the address of the peer that will be able to send data to that port, or to which data will be sent. The corresponding IP address and port number will be sent to the peer, and will be the basis for some last-resort “candidate pairs” used in the coordinated hole punching.
The previous section mentioned that in some cases more than one tried address pair will succeed. This is particularly true for pairs that include a TURN-provided address. This is why the endpoints collect all the working pairs and pick the highest-priority one. If both a “hole punching” pair and a “TURN” pair succeed, they will typically only retain the “hole punching” pair.
The QUIC Way
QUIC Connection Migration
RFC 9000 defines how clients can migrate an existing QUIC connection to a different IP:port tuple. When we designed this mechanism, the primary use case we envisioned was solving the “parking lot problem”. Imagine you have a mobile phone, and you walk from your office (where the phone has WiFi) to the parking lot (with no / bad WiFi coverage). In this case, the client could detect that the WiFi connection is worsening, and migrate the connection to its cellular interface. Crucially, this would keep all connection state (e.g. open streams, datagram flows, etc.) intact, and would therefore be transparent to the application.
On detecting a network interface change (e.g. when leaving the office), QUIC connection migration works by first sending a so-called probing packet to the server. The purpose of this packet is to probe the new path. The client includes a PATH_CHALLENGE frame in this packet, to which the server responds with a PATH_RESPONSE frame. This makes sure that the new path actually works (for example, that it doesn’t block UDP packets) and supports QUIC (for example, allows packets that satisfy QUIC’s MTU requirements).
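Conceptually, the probing logic looks like the sketch below. The pathSender interface is a hypothetical stand-in for the internals of a QUIC stack, not a real API; the fake implementation only exists so the example runs:

```go
package main

import (
	"bytes"
	"crypto/rand"
	"errors"
	"fmt"
	"time"
)

// pathSender is a hypothetical stand-in for the part of a QUIC stack that can
// send frames on a specific (yet unvalidated) path and wait for frames on it.
type pathSender interface {
	SendPathChallenge(data [8]byte) error
	ReceivePathResponse(timeout time.Duration) ([8]byte, error)
}

// probePath follows the path validation logic of RFC 9000, Section 8.2: send a
// PATH_CHALLENGE with random data on the new path and wait for a PATH_RESPONSE
// echoing exactly that data.
func probePath(p pathSender) error {
	var challenge [8]byte
	if _, err := rand.Read(challenge[:]); err != nil {
		return err
	}
	if err := p.SendPathChallenge(challenge); err != nil {
		return err
	}
	resp, err := p.ReceivePathResponse(3 * time.Second)
	if err != nil {
		return fmt.Errorf("path validation failed: %w", err)
	}
	if !bytes.Equal(resp[:], challenge[:]) {
		return errors.New("PATH_RESPONSE does not echo the PATH_CHALLENGE data")
	}
	return nil // the path works in both directions; it is safe to migrate to it
}

// fakePath echoes the challenge immediately; it only exists so the sketch runs.
type fakePath struct{ last [8]byte }

func (f *fakePath) SendPathChallenge(d [8]byte) error                  { f.last = d; return nil }
func (f *fakePath) ReceivePathResponse(time.Duration) ([8]byte, error) { return f.last, nil }

func main() {
	fmt.Println(probePath(&fakePath{})) // prints <nil>: validation succeeded
}
```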
On the wire, this path probing procedure looks pretty similar to a hole punch attempt. We just need a tiny modification to make this work in the p2p use case: If we could get the server to send probe packets as well, we could kill two birds with one stone: We’d punch a hole through the firewall, and at the same time verify connectivity on the path.
Of course this is not the only thing needed to achieve hole punching. Before the nodes can even send probe packets, we need to learn about the peer’s reflexive address, and be able to coordinate the timing. We’ll come to this in a bit, but first we’ll describe how we can replace STUN to discover our reflexive addresses.
QUIC Address Discovery
Typically nodes use STUN to discover their reflexive addresses. In essence, STUN is a request-response protocol here, where the client asks the server to report the source address it observed on the client’s request packet.
In principle, we could achieve the same inside of a QUIC connection: The server could report the address of the client using, for example, a newly defined QUIC frame, and vice versa. This is exactly what the QUIC Address Discovery draft specifies.
The mechanism is really simple: every time a new path is established (incl. the path used for the handshake), endpoints inform each other of the observed address. This is a very efficient mechanism: Since the OBSERVED_ADDRESS frame is defined as a probing frame, it can be bundled with the PATH_CHALLENGE and PATH_RESPONSE frames used to probe a new path.
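To give a sense of how lightweight such an address report is, here is a small Go sketch. The field set (a sequence number plus the observed address and port) mirrors the idea of the draft, but the byte layout below is purely illustrative and not the draft’s actual wire format:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net/netip"
)

// observedAddress carries the address an endpoint saw its peer's packets
// arrive from on a given path. The fields mirror what the QUIC Address
// Discovery draft conveys; the serialization below is purely illustrative
// and NOT the draft's exact wire format.
type observedAddress struct {
	seq  uint64         // increases with every newly observed address
	addr netip.AddrPort // the peer's address as seen on this path
}

// encode appends an illustrative serialization: sequence number, port,
// and the 4- or 16-byte IP address.
func (o observedAddress) encode(b []byte) []byte {
	b = binary.BigEndian.AppendUint64(b, o.seq)
	b = binary.BigEndian.AppendUint16(b, o.addr.Port())
	return append(b, o.addr.Addr().AsSlice()...)
}

func main() {
	o := observedAddress{seq: 1, addr: netip.MustParseAddrPort("203.0.113.5:4321")}
	encoded := o.encode(nil)
	fmt.Printf("%d bytes (illustrative encoding): %x\n", len(encoded), encoded)
}
```

A report this small piggybacks easily on the packets that carry the PATH_CHALLENGE and PATH_RESPONSE frames.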
Performing address discovery over QUIC comes with multiple advantages:
- QUIC packets are encrypted. An on-path observer cannot see the exchange of OBSERVED_ADDRESS frames, nor interfere with it (e.g. by tampering with the frame contents).
- It doesn’t require running any additional services (i.e. a STUN server / client). It’s sufficient to enable the Address Discovery extension on a large enough number of nodes.
Of course, in either case the client has to trust that the server is sending honest responses. A misbehaving server could respond with spoofed addresses, causing the “hole punching” packets to later be sent to these addresses. This is not hard to defend against: Clients can obtain some protection against such attacks by contacting several servers and comparing their responses.
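A minimal cross-check could look like the sketch below, which simply takes a majority vote over the addresses reported by different servers. The threshold and the choice of servers are policy decisions, not part of any spec:

```go
package main

import (
	"fmt"
	"net/netip"
)

// majorityAddress returns the address reported by most servers, along with
// how many of them agree. A client might only trust the result if, say, a
// clear majority of independently chosen servers report the same address.
func majorityAddress(reported []netip.AddrPort) (netip.AddrPort, int) {
	counts := make(map[netip.AddrPort]int)
	var best netip.AddrPort
	bestCount := 0
	for _, a := range reported {
		counts[a]++
		if counts[a] > bestCount {
			best, bestCount = a, counts[a]
		}
	}
	return best, bestCount
}

func main() {
	reports := []netip.AddrPort{
		netip.MustParseAddrPort("203.0.113.5:4321"),
		netip.MustParseAddrPort("203.0.113.5:4321"),
		netip.MustParseAddrPort("198.51.100.99:1111"), // a lying (or broken) server
	}
	addr, n := majorityAddress(reports)
	fmt.Printf("%s reported by %d of %d servers\n", addr, n, len(reports))
}
```

Note that a NAT with address-and-port-dependent mapping may legitimately report different ports to different servers, so this check only provides a rough sanity signal.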
Hole Punching Coordination
The node has now learned its reflexive addresses, and we know how to use QUIC’s connection migration mechanism to establish the NAT port mappings required to allow the establishment of a direct path. We now want to establish a connection to another node, behind its respective NAT. For the moment, we’ll assume that the two nodes are able to communicate via a (proxied) QUIC connection. We’ll see how this works in detail in the next section. The only thing that matters for now is that the nodes are able to communicate with each other.
The NAT traversal draft defines how two nodes can negotiate hole punching attempts with each other. Out of convenience, the process is driven almost entirely by the client (i.e. the node that initiated the QUIC connection). This is not because the roles of the peers are fundamentally different (they are both peers in the same p2p network), but because it leads to significant simplifications of the protocol. It also keeps the design close to RFC 9000, where connection migration can only be initiated by the client.
The server informs the client about its reflexive address using ADD_ADDRESS frames. Multiple ADD_ADDRESS frames can be sent if the server has multiple reflexive addresses.
The ICE RFC goes into great detail on how to form candidate pairs from both nodes’ reflexive addresses, because both nodes need to agree on the ordering of the candidate pairs. Since the client is driving this process, we don’t need to specify any address matching logic that client and server would need to agree on.
Once the client has formed address pairs (and once it feels like it’s the right time to start a hole punch attempt), it sends a PUNCH_ME_NOW frame to the server. The PUNCH_ME_NOW contains both the client’s and the server’s reflexive addresses.
Immediately after sending the PUNCH_ME_NOW frame, the client starts path probing on the path formed by these two addresses. Equivalently, as soon as the server receives the PUNCH_ME_NOW frame, it starts path probing the path from its end. Timing is crucial here: As we’ve seen above, the path probing packets create the NAT binding required to allow the other side’s packets to make it through the NAT.
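Sketched in Go, the client-side coordination could look roughly like this. The conn interface and its methods are hypothetical stand-ins for what a QUIC stack implementing the NAT traversal draft would do internally; they are not a real API:

```go
package main

import (
	"fmt"
	"net/netip"
	"time"
)

// conn is a hypothetical stand-in for a QUIC connection that supports the
// NAT traversal extension. None of these methods exist in a real stack today.
type conn interface {
	// SendPunchMeNow asks the peer to start probing the given address pair.
	SendPunchMeNow(local, remote netip.AddrPort) error
	// ProbePath sends PATH_CHALLENGE frames towards remote and reports
	// whether a matching PATH_RESPONSE arrived in time.
	ProbePath(local, remote netip.AddrPort, timeout time.Duration) bool
}

// punchPairs tries the candidate address pairs in priority order and returns
// the first pair on which path validation succeeds.
func punchPairs(c conn, pairs [][2]netip.AddrPort) (netip.AddrPort, netip.AddrPort, bool) {
	for _, p := range pairs {
		local, remote := p[0], p[1]
		// Tell the peer to start probing from its side...
		if err := c.SendPunchMeNow(local, remote); err != nil {
			continue
		}
		// ...and immediately start probing from ours, so that both NATs see
		// outgoing packets at roughly the same time.
		if c.ProbePath(local, remote, 3*time.Second) {
			return local, remote, true
		}
	}
	return netip.AddrPort{}, netip.AddrPort{}, false
}

// fakeConn pretends every probe succeeds; it exists only so the sketch runs.
type fakeConn struct{}

func (fakeConn) SendPunchMeNow(local, remote netip.AddrPort) error                   { return nil }
func (fakeConn) ProbePath(local, remote netip.AddrPort, timeout time.Duration) bool  { return true }

func main() {
	pairs := [][2]netip.AddrPort{{
		netip.MustParseAddrPort("203.0.113.5:4321"),   // our reflexive address
		netip.MustParseAddrPort("198.51.100.77:5678"), // the peer's reflexive address
	}}
	local, remote, ok := punchPairs(fakeConn{}, pairs)
	fmt.Println(local, remote, ok)
}
```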
Both the client and the server will send PATH_CHALLENGE frames on a new path when sending or responding to PUNCH_ME_NOW. They will need to allocate a yet-unused Connection ID to the new path that they are trying to establish. This implies that the number of parallel attempts is limited by the number of available Connection IDs. This has both upsides and downsides. On the one hand, having a limit reduces the amount of resources that a peer can be forced to consume, which makes the protocol more stable. On the other hand, if the limit is reached, the next attempt will only be possible after one of the previous challenges has been abandoned, and the peer has provided a replacement Connection ID. This might be a slow process.
Whether that process is too slow is debatable. Endpoints that plan to engage in p2p hole punching may be configured to provide a number of Connection IDs sufficient for most practical attempts. Also, the initial path remains available until the migration succeeds, which means the application endpoints do not need to wait for the negotiation to succeed before starting to exchange data.
Adding new QUIC frames like ADD_ADDRESS or PUNCH_ME_NOW is somewhat controversial. Misbehaving peers could send spoofed addresses in these frames, causing the peer to send hole punching packets to third parties. This is similar to the request forgery attacks described in the security section of RFC 9000, and calls at least for the same kind of defenses. This is something we will keep in mind when evolving the NAT traversal draft.
Relaying UDP packets over HTTP
RFC 9298 defines how UDP packets can be proxied in HTTP. The exchange starts with a regular HTTP request: the client sends a so-called Extended CONNECT request to the proxy on a QUIC stream, instructing the proxy to open a UDP socket and proxy a flow of UDP packets to a target server.
Once the proxy has accepted the proxying request, UDP packets are sent in HTTP Datagrams (RFC 9297), which themselves are a thin wrapper around QUIC DATAGRAM frames. The DATAGRAM frame is a QUIC extension defined in RFC 9221; DATAGRAM frames are sent in QUIC packets exchanged after completion of the QUIC handshake, and are therefore encrypted the same way as any other data exchanged over the QUIC connection. However, if a packet containing a DATAGRAM frame is lost, the frame is not retransmitted. This makes DATAGRAMs suitable for proxying unreliable packets, such as UDP packets.
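To illustrate the layering, here is a small Go sketch of the RFC 9298 datagram payload format: a Context ID (a QUIC variable-length integer, with 0 meaning “UDP payload”) followed by the raw UDP payload. The varint helpers are written out by hand to keep the sketch self-contained:

```go
package main

import (
	"errors"
	"fmt"
)

// appendVarint encodes v as a QUIC variable-length integer (RFC 9000, Section 16).
func appendVarint(b []byte, v uint64) []byte {
	switch {
	case v < 1<<6:
		return append(b, byte(v))
	case v < 1<<14:
		return append(b, 0x40|byte(v>>8), byte(v))
	case v < 1<<30:
		return append(b, 0x80|byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
	default:
		return append(b, 0xc0|byte(v>>56), byte(v>>48), byte(v>>40), byte(v>>32),
			byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
	}
}

// wrapUDPPayload builds the HTTP Datagram payload used by connect-udp
// (RFC 9298): a Context ID followed by the raw UDP payload. Context ID 0
// means "this is a UDP packet".
func wrapUDPPayload(udpPayload []byte) []byte {
	return append(appendVarint(nil, 0), udpPayload...)
}

// unwrapUDPPayload is the inverse: it strips the Context ID and returns the
// UDP payload, rejecting contexts it does not understand.
func unwrapUDPPayload(datagram []byte) ([]byte, error) {
	if len(datagram) == 0 {
		return nil, errors.New("empty datagram")
	}
	length := 1 << (datagram[0] >> 6) // the varint length is encoded in the two top bits
	if len(datagram) < length {
		return nil, errors.New("truncated Context ID")
	}
	contextID := uint64(datagram[0] & 0x3f)
	for _, b := range datagram[1:length] {
		contextID = contextID<<8 | uint64(b)
	}
	if contextID != 0 {
		return nil, fmt.Errorf("unknown context ID %d", contextID)
	}
	return datagram[length:], nil
}

func main() {
	wire := wrapUDPPayload([]byte("a QUIC packet between the two peers"))
	payload, _ := unwrapUDPPayload(wire)
	fmt.Printf("wrapped: %d bytes, unwrapped: %q\n", len(wire), payload)
}
```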
Multiple UDP flows to different target servers can be proxied in the same QUIC connection.
Proxying UDP packets is almost what we need to make relaying work in the p2p scenario, but not quite: While the client can reach any IP via the proxy, it’s still not possible for other nodes to communicate with the client (unless contacted first by the client).
Fortunately, there’s already a draft describing how to Proxy UDP Listeners in HTTP. The primary use case for this draft is running WebRTC over CONNECT-UDP. This is a very similar problem to the one we’re trying to solve: WebRTC peers actually use the ICE protocol to establish a direct connection, and for that they need to know their reflexive transport addresses.
The mechanism is pretty straightforward: the proxy allocates a new IP:port for the client, and forwards all UDP packets received on this socket to the client. Of course, it also has to include the 2-tuple that each packet originated from.
The simplicity of this approach is at the same time its biggest limitation: since there are only 65,535 port numbers (many of which are reserved), a proxy can only handle a limited number of clients at the same time. To be clear, this still allows tens of thousands of concurrent clients, and many deployment scenarios will never run into this limit.
It might be possible to work around this limit in the future by using a similar approach as the QUIC-aware proxying draft.
Preparing for Multipath
The QUIC Working Group is finalizing the Multipath Extension for QUIC. As the name suggests, this extension allows multiple paths to be used simultaneously. For the p2p use case, this means that endpoints could keep the initial path available, even after a direct path was created by NAT traversal. These paths could either be used for load sharing or as a backup.
To get these benefits, we will need minor adaptations of the mechanism described here – effectively, managing connection IDs and path IDs in a multipath version of the PUNCH_ME_NOW frame. We should work on that once the Multipath Extension for QUIC has made more progress in the IETF.
Putting All the Pieces Together
Now that we’ve explored all the components, let’s put them together and build a small p2p application running on top of QUIC.
When the node boots up, it first connects to a few hard-coded boot nodes. The majority of these nodes support the QUIC Address Discovery extension, so the node is able to learn that it’s behind a NAT, and what the NAT’s public addresses are.
It then connects to a relay and reserves an IP:port tuple with the relay. The node can now advertise this address to other peers in the network, for example by registering in some kind of peer directory, or by registering itself with the p2p network’s DHT.
At this point, other nodes can connect to the relay at this port, and have all their packets relayed. We’ve achieved the first goal: we have established connectivity. The relayed connections can immediately be used to exchange application data. Now the goal is to lighten the load on the relay server, and to obtain a direct (and potentially lower-latency, higher-throughput) connection to the peer.
The nodes employ the mechanism described in the NAT traversal draft to punch holes through their respective NATs. This hole punching procedure might take a few attempts, depending on the number of candidate pairs (and whether hole punching attempts are run in parallel), but should generally only take a few seconds.
Most importantly, this is entirely transparent to the application: the application can start using the relayed connection right away, and keep using it while the QUIC stack tries to establish the direct path.
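In code, the whole bootstrap flow might look roughly like the following sketch. Every function here is a hypothetical placeholder for the building blocks described in this post; no QUIC stack exposes this exact API today, and the host names are made up:

```go
package main

import (
	"fmt"
	"net/netip"
)

// Hypothetical placeholders for the building blocks described in this post.
func discoverReflexiveAddrs(bootNodes []string) []netip.AddrPort { /* QUIC Address Discovery */ return nil }
func reserveListener(relay string) netip.AddrPort                { /* Proxy UDP Listeners in HTTP */ return netip.AddrPort{} }
func advertise(addrs []netip.AddrPort)                           { /* DHT, peer directory, ... */ }
func connectViaRelay(peer string) any                            { /* CONNECT-UDP through the relay */ return nil }
func tryHolePunch(conn any)                                      { /* NAT traversal extension, in the background */ }

func main() {
	// 1. Learn our reflexive addresses from boot nodes that support Address Discovery.
	reflexive := discoverReflexiveAddrs([]string{"boot1.example.net:443", "boot2.example.net:443"})

	// 2. Reserve a relayed listener so that peers can always reach us.
	relayed := reserveListener("relay.example.net:443")

	// 3. Advertise both the reflexive and the relayed addresses.
	advertise(append(reflexive, relayed))

	// 4. Connect to a peer via the relay and start exchanging data immediately;
	//    hole punching runs in the background and, if it succeeds, the QUIC
	//    connection migrates to the direct path transparently.
	conn := connectViaRelay("peer-id-from-the-dht")
	tryHolePunch(conn)
	fmt.Println("connected; application traffic can flow while NAT traversal proceeds")
}
```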
Where are we on this?
So far, there’s no production implementation of this protocol, but a lot of the underlying documents have made their way through the IETF process and have now become widely deployed RFCs.
Specifically, the remaining pieces of the puzzle are:
- The Proxy UDP Listeners in HTTP draft, which allows clients to reserve an IP:port tuple on a relay server.
- The QUIC Address Discovery draft, which allows endpoints to learn about their public addresses. The current version of this draft is implemented by two different QUIC stacks: picoquic and a fork of quinn.
- The NAT Traversal for QUIC draft, which defines how to coordinate hole punching attempts between peers.