Localhost

As software engineers, we all know about localhost, the hostname for the local machine. And it's the same as 127.0.0.1, right?

Not so fast.

First, 127.0.0.1 does mean localhost, but it actually means something more specific: IPv4 localhost. Of course, there's another localhost for IPv6, and that's ::1. So we know that 127.0.0.1 means localhost, and we know that ::1 means localhost. But what does localhost mean? Well, it could refer to either the IPv4 loopback address or the IPv6 loopback address.

Here's what's in /etc/hosts on my machine (Fedora 24):

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

As you can see, localhost matches both 127.0.0.1 and ::1. So what will glibc do in this situation when resolving localhost? What will Python do when connecting? Will getaddrinfo(3) prefer the IPv4 address or the IPv6 address, and will it even be consistent? There's a lot of ambiguity here.
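One way to see what your own machine does is to ask the resolver directly. The following is a minimal Python sketch, not a statement of what glibc guarantees; the helper name resolve_order is made up for illustration, and the port number is arbitrary.

```python
import socket


def resolve_order(host: str, port: int = 80) -> list:
    """Return (family, address) pairs in the order getaddrinfo reports them.

    resolve_order is a hypothetical helper; it only wraps socket.getaddrinfo.
    """
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [
        (socket.AddressFamily(family).name, sockaddr[0])
        for family, _type, _proto, _canonname, sockaddr in infos
    ]
```

Calling resolve_order("localhost") on a machine with a hosts file like the one above may return an IPv4 entry, an IPv6 entry, or both. The order depends on the resolver's configuration (on glibc, /etc/gai.conf influences the sorting), so treat whatever you see as a property of that one machine.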

Ideally, all of your software works seamlessly with both IPv4 and IPv6 and this isn't an issue. But if you rely on localhost, know that there is ambiguity, and that you can't count on getting one protocol or the other.

The other issue to be wary of is how software represents remote addresses. Most programs will happily resolve localhost when making outgoing connections, but typically they won't try to map an incoming address of 127.0.0.1 or ::1 back to a human-readable name like localhost. So if you're used to using localhost, there's going to be an asymmetry: your logs are filled with IP addresses, but your configs use "localhost". This can lead to nasty surprises in unexpected situations, such as a Varnish rule that matches against the regex ^localhost failing because Varnish represents the remote peer name as 127.0.0.1.
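The mismatch is easy to reproduce outside Varnish. This sketch is Python, not VCL, and the names NAIVE_RULE and is_loopback_peer are invented for illustration. It shows the name-based regex missing a literal peer address, and one possible alternative that checks the address itself.

```python
import ipaddress
import re

# The kind of rule that looks right in a config file but never fires,
# because the peer shows up as an IP literal rather than a name.
NAIVE_RULE = re.compile(r"^localhost")


def is_loopback_peer(peer: str) -> bool:
    """Return True if peer is a literal IPv4 or IPv6 loopback address.

    Hypothetical helper: anything that is not an IP literal (including the
    string "localhost") is treated as not loopback, since no lookup is done.
    """
    try:
        return ipaddress.ip_address(peer).is_loopback
    except ValueError:
        return False
```

With these definitions, NAIVE_RULE.match("127.0.0.1") finds nothing, while is_loopback_peer accepts both 127.0.0.1 and ::1. Matching on the address rather than the name is a design choice that sidesteps the asymmetry instead of papering over it.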

I recommend that you pick either 127.0.0.1 or ::1 and stick to it consistently. By being explicit about the protocol, and by using the same format that the machine uses when displaying socket addresses, you'll run into far fewer surprises.
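As a rough illustration of that advice, the sketch below binds and connects using the literal 127.0.0.1 and reports how the server side sees the client. The function name peer_seen_by_server is hypothetical, the port is chosen by the OS, and the 5 second timeout is an arbitrary safety net.

```python
import socket


def peer_seen_by_server(host: str = "127.0.0.1") -> str:
    """Connect to a throwaway listener on host and return the client
    address exactly as the accepting side reports it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((host, 0))  # port 0 asks the OS for any free port
        server.listen(1)
        port = server.getsockname()[1]
        with socket.create_connection((host, port), timeout=5):
            conn, peer = server.accept()
            conn.close()
            return peer[0]
```

Because both ends use the literal address, the string the server reports is the same string you wrote in the client, which is what makes configs and logs line up.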