4 APR, 2024

How to discover and pair networked components: adar

Please read the introductory and overview post if you haven't yet!

So you've got two computers on a local network - how does each one know that the other exists?

The complex simple way

Just make it the user's problem. This has been the traditional method for local Network Attached Storage (NAS) devices: figure out the assigned IP address(es), and manually type them into each connecting device. Or even worse, configure your DHCP server to hand out a static address if the address isn't already kept stable.

This is the equivalent of collecting all your friends' phone numbers to communicate with them - it's an unnecessary detail in today's world. And regarding the bit about static addresses? That's like if your friends' numbers changed every time they restarted their phones!

A possible solution

We don't remember IP addresses for all the websites we visit each day. We probably don't even remember a time when we needed to do so! It's almost as if we figured out how to map human-friendly names to IP addresses in the 70s, before the internet even became available to the public. For entirely unknown reasons, this has never quite gotten popular with home networks, despite being just as convenient and supported. Hostnames are a thing y'all.

But it's only a half-solution, as it still requires the user to set and type in the hostnames. That's not convenient. What we need is some way for each device to ask about other devices that it knows how to talk to...

Configuration is boring, make the machines do it

This is not a new problem. In fact, it's been solved so many times it's got at least 4 different names!

Avahi, Bonjour, DNS-SD, mDNS

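These are all names you'll run into for the same basic technology and principles, generally categorised as zeroconf. Strictly speaking, mDNS and DNS-SD are the protocols, while Bonjour and Avahi are implementations of them.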

zeroconf

All of our devices and gadgets can host a variety of services. As such, zeroconf enables each host to advertise some services as well as associated metadata.

For example, if you are a printer called "PrinterName", you would broadcast as: PrinterName._ipp._tcp.local.

Let's break down that address: it works very similarly to a regular website address, where components are separated by a dot. But why is there a dot on the end? That's not how it works! Well, depending on how you read into the actual DNS specification, it... does exist. Everybody just moves on and doesn't show it or have the user interact with it, and that's fine according to the specifications. In this case though, it is definitely necessary.

Next up (reading right-to-left) is the qualifier for the local network, simply called... "local"! Following on from that (to the left) are two more identifiers, one for transport protocol, and another for service type. These are prefixed by underscores for historical reasons (see the next section).

Lastly is the actual human-friendly name of the device itself - this is the only bit that the user would see and interact with.
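
To make the breakdown concrete, here is a minimal Python sketch that splits such a name into its parts by reading right-to-left. The ServiceName type and parse_service_name function are hypothetical names made up for this illustration, not part of any zeroconf library.

```python
from typing import NamedTuple


class ServiceName(NamedTuple):
    instance: str   # human-friendly device name, e.g. "PrinterName"
    service: str    # service type without the underscore, e.g. "ipp"
    protocol: str   # transport protocol without the underscore, e.g. "tcp"
    domain: str     # "local" for the local network


def parse_service_name(full_name: str) -> ServiceName:
    """Split '<instance>._<service>._<protocol>.<domain>.' into its parts."""
    if not full_name.endswith("."):
        raise ValueError("expected a fully qualified name ending in a dot")
    # Read right-to-left: domain, protocol, service; the rest is the instance.
    instance, service, protocol, domain = full_name[:-1].rsplit(".", 3)
    if not (service.startswith("_") and protocol.startswith("_")):
        raise ValueError("service and protocol labels must start with an underscore")
    return ServiceName(instance, service[1:], protocol[1:], domain)


print(parse_service_name("PrinterName._ipp._tcp.local."))
```

Splitting from the right means a device name that itself contains dots still ends up whole in the instance field.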

Historical reasons

The underscore. Why the underscore?

The underlying technology is mDNS, which merely specifies a way for a group of hosts to provide the minimal functionality of a single DNS server between them. The proper basis for discovery is DNS-SD (Service Discovery), which in turn only formalises a combination of existing DNS technologies for use in zeroconf. So the document we actually want is the DNS Resource Record specification for SRV (service) records, which was initially published as an experimental protocol in 1996 as RFC2052.

But that's not all: RFC2052 doesn't actually say that underscores need to exist! It was obsoleted in 2000 by the proposed standard RFC2782, which specifies that the service and protocol names are as you'd expect, but also prepended by an underscore to "avoid collisions with DNS labels that occur in nature".

What this all means is that they want to make sure that nothing gets confused between a service using the TCP protocol, and the website https://tcp.com. This works because hostnames aren't allowed to contain underscores, only letters, digits and hyphens.

How does this work for adar?

adar uses the host device's name for interaction with the users, such that it might broadcast on the network as desktop-PC._adar._tcp.local.

Furthermore, it will advertise all the IP addresses that it is available on, though peers will initially use just the name to try to connect. In this way, adar plugs into industry-standard technologies to advertise and connect peers together using human-friendly device names. Next up, how does this connection process actually work?
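
For the curious, here is roughly what "asking the network who speaks adar" looks like on the wire. This is a minimal sketch of a standard DNS-SD browse query (a PTR question for the service type) built with only the Python standard library. It is not adar's actual code, and encode_name and build_ptr_query are hypothetical helper names.

```python
import struct

MDNS_GROUP = ("224.0.0.251", 5353)  # well-known IPv4 mDNS multicast address and port
TYPE_PTR = 12   # record type DNS-SD uses to list instances of a service
CLASS_IN = 1    # the "internet" class


def encode_name(name: str) -> bytes:
    """Encode a dotted name as length-prefixed DNS labels, ending with a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("utf-8")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"invalid label length: {label!r}")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_ptr_query(service_type: str) -> bytes:
    """Build a DNS query packet asking which instances offer service_type."""
    # Header: id=0, flags=0 (standard query), 1 question, no other records.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    question = encode_name(service_type) + struct.pack("!2H", TYPE_PTR, CLASS_IN)
    return header + question


packet = build_ptr_query("_adar._tcp.local.")
print(packet.hex())
# Sending it would be a UDP sendto(packet, MDNS_GROUP); omitted so the sketch
# stays free of network side effects.
```

In practice you would let an existing implementation such as Avahi or Bonjour do this for you, rather than building packets by hand.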

Pairing

Everyone knows how to connect to a Wi-Fi network. Everyone knows how to pair to a Bluetooth accessory. So I assumed that whatever devices adar will be used on are already connected to the network, and I copied the Bluetooth pairing process. Okay, not literally, but in spirit it's quite similar.

Once the new peer has been discovered (as described above), we check that there is version compatibility between the two peers. If the peers have at least one version in common, we pick the most recent mutual version, and then check with the user to see if we should try to pair.
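
Picking "the most recent mutual version" boils down to a set intersection. A minimal sketch, assuming versions are plain integers (how adar really represents its versions isn't covered in this post) and using a hypothetical pick_version helper:

```python
from typing import Iterable, Optional


def pick_version(ours: Iterable[int], theirs: Iterable[int]) -> Optional[int]:
    """Return the most recent version both peers support, or None if there is none."""
    mutual = set(ours) & set(theirs)
    return max(mutual) if mutual else None


print(pick_version([1, 2, 3], [2, 3, 4]))
print(pick_version([1], [2]))
```

A result of None is the "no compatible version" case, where there is no point asking the user about pairing at all.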

If the user gives the go-ahead, we create a new TCP connection and send a command to the peer to request to pair. Once the peer confirms that pairing is good to go, we can start connecting!

Creating a connection

All commands are exchanged over a TCP connection with the peer. Once the connection is established, each peer seeks to identify the other, and ensure that they are connected directly to one of the advertised addresses, such that there is no machine-in-the-middle (which would be a terrible security hole!).
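
The address check can be sketched in a few lines of Python. This is only an illustration of the idea, assuming the advertised addresses arrive as strings from discovery. The is_advertised_address helper is a made-up name, not adar's API.

```python
import ipaddress
from typing import Iterable


def is_advertised_address(remote_ip: str, advertised: Iterable[str]) -> bool:
    """True if the address we are connected to is one the peer advertised."""
    remote = ipaddress.ip_address(remote_ip)
    # Compare parsed addresses so different spellings of the same IPv6
    # address (e.g. "::1" and "0:0:0:0:0:0:0:1") still match.
    return any(remote == ipaddress.ip_address(addr) for addr in advertised)


advertised = ["192.168.1.20", "fe80::1"]
print(is_advertised_address("192.168.1.20", advertised))
print(is_advertised_address("192.168.1.99", advertised))
```

With a real TCP socket, the remote_ip would come from something like sock.getpeername()[0].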

The version is formally agreed upon between each peer, each time they connect. I know, I know, we just checked before pairing! But you pair once and connect many times thereafter, and updates may change the situation (for better or worse), so it is essential to confirm and actually agree on the version to use.

Lastly, a Diffie-Hellman key exchange occurs to set up a mutual secret key to use for encrypting data between that specific pair of peers. This exchange creates a 1024-bit key that is not known to any other machine on the network, even if a machine were listening in on all the communications. It's really clever maths, but I can't explain it all here. To confirm that there was no machine-in-the-middle (yes, again!), the user can confirm that a sequence of 6 digits (in two groups of 3) is identical between the two peers that they are pairing. You might be familiar with this process from Bluetooth pairing, and indeed it's the same, and it serves the same purpose! These digits are actually the first two bytes (16 bits) of the shared key, so they directly relate to the cryptographic security of the system, but are presented in an accessible way so that the user can confirm it for themselves.
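
Here is a toy Python sketch of the two ideas in this step: both sides arriving at the same key, and turning the first two bytes of that key into something a human can compare. The parameters are deliberately small stand-ins (not adar's real 1024-bit group), the function names are hypothetical, and showing each byte as three decimal digits is my assumption for this illustration.

```python
import secrets
from typing import Tuple

# Toy parameters, NOT adar's: 2**127 - 1 is a known (Mersenne) prime, which
# keeps the example short. A real exchange would use a vetted, much larger group.
P = 2**127 - 1
G = 3


def make_keypair() -> Tuple[int, int]:
    """Return (private, public); only the public half is sent to the peer."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_key(own_private: int, peer_public: int) -> bytes:
    """Combine our private value with the peer's public value."""
    secret = pow(peer_public, own_private, P)
    return secret.to_bytes((P.bit_length() + 7) // 8, "big")


def confirmation_code(key: bytes) -> str:
    """Show the first two bytes of the key as two groups of three digits."""
    return f"{key[0]:03d} {key[1]:03d}"


a_priv, a_pub = make_keypair()
b_priv, b_pub = make_keypair()
key_a = shared_key(a_priv, b_pub)   # computed on the first peer
key_b = shared_key(b_priv, a_pub)   # computed on the second peer
print(confirmation_code(key_a), "|", confirmation_code(key_b))
```

Both peers print the same code because each side raises the other's public value to its own private value, which lands on the same number without the private values ever crossing the network.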

Transferring data between peers

Now that we've found the peers, paired with them, and securely connected to them, we can finally get started with sharing some data around!

Read the next post in the series on data transfer across a network.