Previously, the prefix was limited to [8,16,24,32] for IPv4 and
to multiples of 4 for IPv6. This patch also makes the prefix-length optional
for --rev-server.
Inspired by a patch from DL6ER <dl6er@dl6er.de>, but completely
re-written by srk. All bugs are his.
Domain patterns in --address, --server and --local have, for many years,
matched complete labels only, so
--server=/google.com/1.2.3.4
will apply to google.com and www.google.com but NOT supergoogle.com
This commit introduces an optional '*' at the LHS of the domain string which
changes this behaviour so as to include substring matches _within_ labels. So,
--server=/*google.com/1.2.3.4
applies to google.com, www.google.com AND supergoogle.com.
This should be largely transparent, but it drastically
improves performance and reduces memory foot-print when
configuring large numbers domains of the form
local=/adserver.com/
or
local=/adserver.com/#
Lookup times now grow as log-to-base-2 of the number of domains,
rather than greater than linearly, as before.
The change makes multiple addresses associated with a domain work
address=/example.com/1.2.3.4
address=/example.com/5.6.7.8
It also handles multiple upstream servers for a domain better; using
the same try/retry alogrithms as non domain-specific servers. This
also applies to DNSSEC-generated queries.
Finally, some of the oldest and gnarliest code in dnsmasq has had
a significant clean-up. It's far from perfect, but it _is_ better.
Previously, without min-port or max-port configured, dnsmasq would
default to the compiled in defaults for those, which are 1024 and
65535. Now, when neither are configured, it defaults instead to
the kernel's ephemeral port range, which is typically
32768 to 60999 on Linux systems. This change eliminates the
possibility that dnsmasq may be using a registered port > 1024
when a long-running daemon starts up and wishes to claim it.
This change does likely slighly reduce the number of random ports
and therefore the protection from reply spoofing. The older
behaviour can be restored using the min-port and max-port config
switches should that be a concern.
CVE-2021-3448 applies.
It's possible to specify the source address or interface to be
used when contacting upstream nameservers: server=8.8.8.8@1.2.3.4
or server=8.8.8.8@1.2.3.4#66 or server=8.8.8.8@eth0, and all of
these have, until now, used a single socket, bound to a fixed
port. This was originally done to allow an error (non-existent
interface, or non-local address) to be detected at start-up. This
means that any upstream servers specified in such a way don't use
random source ports, and are more susceptible to cache-poisoning
attacks.
We now use random ports where possible, even when the
source is specified, so server=8.8.8.8@1.2.3.4 or
server=8.8.8.8@eth0 will use random source
ports. server=8.8.8.8@1.2.3.4#66 or any use of --query-port will
use the explicitly configured port, and should only be done with
understanding of the security implications.
Note that this change changes non-existing interface, or non-local
source address errors from fatal to run-time. The error will be
logged and communiction with the server not possible.
MTU were obtained early during iface_allowed check. But often it
returned from the function without ever using it. Because calls to
kernel might be costy, move fetching it only when it would be assigned.
Request only one re-read of addresses and/or routes
Previous implementation re-reads systemd addresses exactly the same
number of time equal number of notifications received.
This is not necessary, we need just notification of change, then re-read
the current state and adapt listeners. Repeated re-reading slows netlink
processing and highers CPU usage on mass interface changes.
Continue reading multicast events from netlink, even when ENOBUFS
arrive. Broadcasts are not trusted anyway and refresh would be done in
iface_enumerate. Save queued events sent again.
Remove sleeping on netlink ENOBUFS
With reduced number of written events netlink should receive ENOBUFS
rarely. It does not make sense to wait if it is received. It is just a
signal some packets got missing. Fast reading all pending packets is required,
seq checking ensures it already. Finishes changes by
commit 1d07667ac7.
Move restart from iface_enumerate to enumerate_interfaces
When ENOBUFS is received, restart of reading addresses is done. But
previously found addresses might not have been found this time. In order
to catch this, restart both IPv4 and IPv6 enumeration with clearing
found interfaces first. It should deliver up-to-date state also after
ENOBUFS.
Read all netlink messages before netlink restart
Before writing again into netlink socket, try fetching all pending
messages. They would be ignored, only might trigger new address
synchronization. Should ensure new try has better chance to succeed.
ENOBUFS error handling was improved. Netlink is correctly drained before
sending a new request again. It seems ENOBUFS supression is no longer
necessary or wanted. Let kernel tell us when it failed and handle it a
good way.
The initial call to enumerate_interfaces() happens before the
logging subsystem in initialised and the startup banner logged.
It's not intended that syslog be written at this point.
Save listening address into listener. Use it to find existing listeners
before creating new one. If it exist, increase just used counter.
Release only listeners not already used.
Duplicates family in listener.
If interface is recreated with the same address but different index, it
would not change any other parameter.
Test also address family on incoming TCP queries.
Log in debug mode listening on interfaces. They can be dynamically
found, include interface number, since it is checked on TCP connections.
Print also addresses found on them.
This was the source of a large number of #ifdefs, originally
included for use with old embedded libc versions. I'm
sure no-one wants or needs IPv6-free code these days, so this
is a move towards more maintainable code.
If we're talking to upstream servers from a fixed port, specified by query-port
we create the fds to do this once, before dropping root, so that ports <1024 can be used.
But we call check_servers() before reading /etc/resolv.conf, so if the only servers
are in resolv.conf, at that point there will be no servers, and the fds get garbage
collected away, only to be recreated (but without root) after we read /etc/resolv.conf
Make pre-allocated server fds immortal, to avoid this problem.
If query-port is set, we create sockets bound to the wildcard address and the query port for
IPv4 and IPv6, but the IPv6 one fails, because is covers IPv4 as well, and an IPv4 socket
already exists (it gets created first). Set V6ONLY to avoid this.
dnsmasq allows to specify a interface for each name server passed with
the -S option or pushed through D-Bus; when an interface is set,
queries to the server will be forced via that interface.
Currently dnsmasq uses SO_BINDTODEVICE to enforce that traffic goes
through the given interface; SO_BINDTODEVICE also guarantees that any
response coming from other interfaces is ignored.
This can cause problems in some scenarios: consider the case where
eth0 and eth1 are in the same subnet and eth0 has a name server ns0
associated. There is no guarantee that the response to a query sent
via eth0 to ns0 will be received on eth0 because the local router may
have in the ARP table the MAC address of eth1 for the IP of eth0. This
can happen because Linux sends ARP responses for all the IPs of the
machine through all interfaces. The response packet on the wrong
interface will be dropped because of SO_BINDTODEVICE and the
resolution will fail.
To avoid this situation, dnsmasq should only restrict queries, but not
responses, to the given interface. A way to do this on Linux is with
the IP_UNICAST_IF and IPV6_UNICAST_IF socket options which were added
in kernel 3.4 and, respectively, glibc versions 2.16 and 2.26.
Reported-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Beniamino Galvani <bgalvani@redhat.com>
By default 30 first servers are listed individually to system log, and
then a count of the remaining items. With e.g. a NXDOMAIN based adblock
service, dnsmasq lists 30 unnecessary ad sites every time when dnsmasq
evaluates the list. But the actual nameservers in use are evaluated last
and are not displayed as they get included in the "remaining items" total.
Handle the "local addresses only" separately and list only a few of them.
Remove the "local addresses only" from the general count.