Fix bug which caused dnsmasq to lose track of processes forked
to handle TCP DNS connections under heavy load. The code
checked that at least one free process table slot was
available before listening on TCP sockets, but didn't take
into account that more than one TCP connection could
arrive, so that check was not sufficient to ensure that
there would be slots for all new processes. It compounded
this error by silently failing to store the process when
it did run out of slots. Even when this bug is triggered,
all the right things happen, and answers are still returned.
Only under very exceptional circumstances, does the bug
manifest itself: see
https://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2021q2/014976.html
Thanks to Tijs Van Buggenhout for finding the conditions under
which the bug manifests itself, and then working out
exactly what was going on.
If two queries arrive a second or so apart, they cannot be a try and
a retry from the same client (retries are at least three seconds apart.)
It's therefore safe not to forward the second query, but answer them
both when the reply arrives for the first.
This changes the behaviour introduced in
141a26f979
We re-introduce the distinction between a query
which is retried from the same source, and one which is
repeated from different sources.
In the later case, we still forward the query, to avoid
problems when the reply to the first query is lost
(see f8cf456920) but we suppress the behaviour
that's used on a retry, when the query is sent to
all available servers in parallel.
Retry -> all servers.
Repeat -> next server.
This avoids a significant increase in upstream traffic on
busy instances which see lots of queries for common names.
It does mean the clients which repeat queries from new source ports,
rather than retrying them from the same source port, will see
different behaviour, but it in fact restores the pre-2.83 behaviour,
so it's not expected to be a practical problem.
Check sender of all received packets, as specified in RFC 1350 para 4.
My understanding of the example in the RFC is that it in fact only
applies to server-to-client packets, and packet loss or duplication
cannot result in a client sending from more than one port to a server.
This check is not, therefore, strictly needed on the server side.
It's still useful, and adds a little security against packet
spoofing. (though if you're running TFTP on a public network with
bad actors, nothing can really save you.)
RHEL/CentOS 7 does not compile with DNSSEC enabled, because older
version is not supported. Add few defines to compile also on older
nettle versions.
Adds also major version 4 check, taking into account higher major
version.
One change to server_test_type forgot to set SERV_DO_DNSSEC. One new
place still can be reused.
Fixes commit e10a9239e1, thanks to
Xingcong Li for spotting it.
Previously, without min-port or max-port configured, dnsmasq would
default to the compiled in defaults for those, which are 1024 and
65535. Now, when neither are configured, it defaults instead to
the kernel's ephemeral port range, which is typically
32768 to 60999 on Linux systems. This change eliminates the
possibility that dnsmasq may be using a registered port > 1024
when a long-running daemon starts up and wishes to claim it.
This change does likely slighly reduce the number of random ports
and therefore the protection from reply spoofing. The older
behaviour can be restored using the min-port and max-port config
switches should that be a concern.
One part in dnssec retry path did not dump sent retry into dump file.
Make sure it is dumped all times it is sent by common function shared on
multiple places. Reduce a bit also server sending.
CVE-2021-3448 applies.
It's possible to specify the source address or interface to be
used when contacting upstream nameservers: server=8.8.8.8@1.2.3.4
or server=8.8.8.8@1.2.3.4#66 or server=8.8.8.8@eth0, and all of
these have, until now, used a single socket, bound to a fixed
port. This was originally done to allow an error (non-existent
interface, or non-local address) to be detected at start-up. This
means that any upstream servers specified in such a way don't use
random source ports, and are more susceptible to cache-poisoning
attacks.
We now use random ports where possible, even when the
source is specified, so server=8.8.8.8@1.2.3.4 or
server=8.8.8.8@eth0 will use random source
ports. server=8.8.8.8@1.2.3.4#66 or any use of --query-port will
use the explicitly configured port, and should only be done with
understanding of the security implications.
Note that this change changes non-existing interface, or non-local
source address errors from fatal to run-time. The error will be
logged and communiction with the server not possible.
At least on Fedora 32 with GCC 10.2.1, dnsmasq compilation emits warning:
tftp.c: In function ‘tftp_request’:
tftp.c:754:3: warning: ‘strcpy’ source argument is the same as
destination [-Wrestrict]
754 | strcpy(daemon->namebuff, file);
And indeed it is the same source always on line 477, sometimes also on
571 in tftp.c
Attached patch fixes the warning and possible undefined behaviour on
tftp error.
MTU were obtained early during iface_allowed check. But often it
returned from the function without ever using it. Because calls to
kernel might be costy, move fetching it only when it would be assigned.
netlink_multicast used 3 calls to fcntl in order to set O_NONBLOCK on
socket. It is possible to pass MSG_DONTWAIT flag just to recvmsg function,
without setting it permanently on socket. Save few kernel calls and use
recvmsg flags.
It is supported since kernel 2.2, should be fine for any device still
receiving updates.
Previously we were always using <sys/poll.h> since
HAVE_POLL_H is never set. This looks like an autoconfism
that has crept in, but we don't use autoconf.
poll.h is the correct header file, as far as I can tell.
Request only one re-read of addresses and/or routes
Previous implementation re-reads systemd addresses exactly the same
number of time equal number of notifications received.
This is not necessary, we need just notification of change, then re-read
the current state and adapt listeners. Repeated re-reading slows netlink
processing and highers CPU usage on mass interface changes.
Continue reading multicast events from netlink, even when ENOBUFS
arrive. Broadcasts are not trusted anyway and refresh would be done in
iface_enumerate. Save queued events sent again.
Remove sleeping on netlink ENOBUFS
With reduced number of written events netlink should receive ENOBUFS
rarely. It does not make sense to wait if it is received. It is just a
signal some packets got missing. Fast reading all pending packets is required,
seq checking ensures it already. Finishes changes by
commit 1d07667ac7.
Move restart from iface_enumerate to enumerate_interfaces
When ENOBUFS is received, restart of reading addresses is done. But
previously found addresses might not have been found this time. In order
to catch this, restart both IPv4 and IPv6 enumeration with clearing
found interfaces first. It should deliver up-to-date state also after
ENOBUFS.
Read all netlink messages before netlink restart
Before writing again into netlink socket, try fetching all pending
messages. They would be ignored, only might trigger new address
synchronization. Should ensure new try has better chance to succeed.
ENOBUFS error handling was improved. Netlink is correctly drained before
sending a new request again. It seems ENOBUFS supression is no longer
necessary or wanted. Let kernel tell us when it failed and handle it a
good way.
Remove distinction between retry with same QID/SP and
retry for same query with different QID/SP. If the
QID/SP are the same as an existing one, simply retry,
if a new QID/SP is seen, add to the list to be replied to.
The new logic in 2.83/2.84 which merges distinct requests for the
same domain causes problems with clients which do retries as distinct
requests (differing IDs and/or source ports.) The retries just get
piggy-backed on the first, failed, request.
The logic is now changed so that distinct requests for repeated
queries still get merged into a single ID/source port, but they now
always trigger a re-try upstream.
Thanks to Nicholas Mu for his analysis.
If identical queries from IPv4 and IPv6 sources are combined by the
new code added in 15b60ddf93 then replies
can end up being sent via the wrong family of socket. The ->fd
should be per query, not per-question.
In bind-interfaces mode, this could also result in replies being sent
via the wrong socket even when IPv4/IPV6 issues are not in play.
Unlike COPTS=-DHAVE_DNSSEC, allow usage of just sha256 function from
nettle, but keep DNSSEC disabled at build time. Skips use of internal
hash implementation without support for validation built-in.
If we add the EDNS client subnet option, or the client's
MAC address, then the reply we get back may very depending on
that. Since the cache is ignorant of such things, it's not safe to
cache such replies. This patch determines when a dangerous EDNS
option is being added and disables caching.
Note that for much the same reason, we can't combine multiple
queries for the same question when dangerous EDNS options are
being added, and the code now handles that in the same way. This
query combining is required for security against cache poisoning,
so disabling the cache has a security function as well as a
correctness one.
Previously, such queries would all be forwarded
independently. This is, in theory, inefficent but in practise
not a problem, _except_ that is means that an answer for any
of the forwarded queries will be accepted and cached.
An attacker can send a query multiple times, and for each repeat,
another {port, ID} becomes capable of accepting the answer he is
sending in the blind, to random IDs and ports. The chance of a
succesful attack is therefore multiplied by the number of repeats
of the query. The new behaviour detects repeated queries and
merely stores the clients sending repeats so that when the
first query completes, the answer can be sent to all the
clients who asked. Refer: CERT VU#434904.