As software engineers we all know about localhost, the hostname for the local
machine. And as we all know, this is the same as 127.0.0.1, right?
Not so fast.
First, 127.0.0.1 does mean localhost, but it actually means something more
specific: IPv4 localhost. Of course there's another localhost for IPv6, and
that's ::1. So we know that 127.0.0.1 means localhost, and we know that ::1
means localhost. But what does localhost mean? Well, it could refer to
either the IPv4 loopback address or the IPv6 loopback address.
Here's what's in /etc/hosts on my machine (Fedora 24):
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
As you can see, localhost matches both 127.0.0.1 and ::1. So what are the
rules for what glibc will do in this situation when resolving localhost? What
are the rules for what Python will do when connecting? Will getaddrinfo(3)
prefer the IPv4 address or the IPv6 address, and will it even be consistent? As
you can see, there's a lot of ambiguity here.
Ideally all of your software works seamlessly with both IPv4 and IPv6, and this isn't an issue. But if you rely on localhost, know that the name is ambiguous and that you can't count on getting one protocol or the other.
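To see what the resolver actually hands back on a given machine, a small Python sketch like the one below can help. The helper name resolve_families is made up for this illustration. It only wraps socket.getaddrinfo, and the order and contents of the result depend on your /etc/hosts and resolver configuration, so treat the output as a local observation and not a rule.

```python
import socket

def resolve_families(host):
    """Return the (family, address) pairs getaddrinfo reports for host, in order."""
    seen = []
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        host, None, type=socket.SOCK_STREAM
    ):
        entry = (family.name, sockaddr[0])
        if entry not in seen:  # drop duplicates but keep resolver order
            seen.append(entry)
    return seen

if __name__ == "__main__":
    # "localhost" may yield AF_INET, AF_INET6, or both, in either order.
    for name in ("localhost", "127.0.0.1"):
        print(name, resolve_families(name))
```

An explicit literal such as 127.0.0.1 always comes back as a single AF_INET entry, which is exactly the predictability the bare name lacks.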
The other issue to be wary of is how software represents remote addresses. Most
programs will happily resolve localhost when making outgoing
connections, but typically they won't try to map an incoming address of
127.0.0.1 or ::1 back to a human-readable name like localhost. So if you're
used to using localhost there's going to be an asymmetry, where your logs are
filled with IP addresses but your configs use "localhost". This can lead to
nasty surprises in unexpected situations, such as having a Varnish rule that
matches against the regex ^localhost fail because Varnish represents the remote
peer name as 127.0.0.1.
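The asymmetry is easy to reproduce without Varnish. The sketch below opens a throwaway listener on the IPv4 loopback address, connects to it, and looks at the peer address the server side sees. The function name observed_peer_address is hypothetical, and this only illustrates how sockets report peers. It is not a description of how Varnish is implemented.

```python
import re
import socket

def observed_peer_address():
    """Connect to a throwaway loopback listener and return the peer IP the server sees."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        with socket.create_connection(("127.0.0.1", port)):
            conn, peer = server.accept()
            conn.close()
            return peer[0]  # numeric address, never a hostname

peer = observed_peer_address()
print(peer)                                  # 127.0.0.1
print(bool(re.match(r"^localhost", peer)))   # False
```

The server is handed a numeric address, so a pattern written against the name localhost has nothing to match.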
I recommend that you pick either 127.0.0.1 or ::1 and stick to it
consistently. By being explicit about the protocol and using the same format
that the machine uses when displaying socket addresses, you'll run into a lot
fewer surprises.
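One way to enforce that habit in your own tooling is to reject hostnames in configuration and accept only literal loopback addresses. Here is a minimal sketch using Python's standard ipaddress module. The helper name require_loopback_literal is invented for this example, and whether such a strict check suits your configs is a judgment call.

```python
import ipaddress

def require_loopback_literal(value):
    """Return the parsed address if value is a literal loopback IP, else raise ValueError."""
    # ip_address() raises ValueError for names such as "localhost".
    address = ipaddress.ip_address(value)
    if not address.is_loopback:
        raise ValueError(f"{value!r} is not a loopback address")
    return address

if __name__ == "__main__":
    for candidate in ("127.0.0.1", "::1", "localhost"):
        try:
            parsed = require_loopback_literal(candidate)
            print(candidate, "-> IPv%d" % parsed.version)
        except ValueError as exc:
            print(candidate, "-> rejected:", exc)
```

With a check like this, the address in your config is the same string that shows up in your logs.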