Timeouts for TCP connections to non-responive servers are very long.
This in not appropriate for DNS connections.
Set timeouts for connection setup, sending data and recieving data.
The timeouts for connection setup and sending data are set at 5 seconds.
For recieving the reply this is doubled, to take into account the
time for usptream to actually get the answer.
Thanks to Petr Menšík for pointing out this problem, and finding a better
and more portable solution than the one in place heretofore.
In the post 2020 flag-day world, we limit UDP packets to 1232 bytes
which can go anywhere, so the dodgy code to try and determine the
functional maxmimum packet size on the path from upstream servers
is obsolete.
When doing DNSSEC validation, a single downstream query may
trigger many upstream queries. On an unreliable network, there
may not be enough downstream retries to ensure that all these
queries complete.
This was kinda strange before, with a lot of cargo-cult copied code,
and no clear strategy.
Now it works like this:
When talking upstream we always add a pseudoheader, and set the
UDP packet size to --edns-packet-max unless we've had problems
talking to a server, when it's reduced to 1280 if that fixes things.
Answering queries from downstream, we get the answer (either from
upstream or local data) If local data won't fit the advertised size
(or 512 if there's not pseudoheader) return truncated. If upstream
returns truncated, do likewise. If upstream is OK, but the answer is
too big for downstream, truncate the answer.
We only need to order two server records on the ->serial field.
Literal address records are smaller and don't have
this field and don't need to be ordered on it.
To actually provoke this bug seems to need the same server-literal
to be repeated twice, eg --address=/a/1.1.1.1 --address-/a/1.1.1.1
which is clearly rare in the wild, but if it did exist it could
provoke a SIGSEV. Thanks to Daniel Rhea for fuzzing this one.
Now we're always saving the query, this is no longer necessary,
and allows the removal of a lot of quite hairy code.
Much more code removal to come, once the TCP code path is also purged.
A relatively common situation is that the reply to a downstream query
will fit in a UDP packet when no DNSSEC RRs are present, but overflows
when the RRSIGS, NSEC ect are added. This extends the automatic
move from UDP to TCP to downstream queries which get truncated replies,
in the hope that once stripped of the DNSSEC RRs, the reply can be returned
via UDP, nwithout making the downstream retry with TCP.
If the downstream sets the DO bit, (ie it wants the DNSSEC RRs, then
this path is not taken, since the downstream will have to get a truncated
repsonse and retry to get a correct answer.
Heretofore, when a validating the result of an external query triggers
a DNSKEY or DS query and the result of that query is truncated, dnsmasq
has forced the whole validation process to move to TCP by returning a
truncated reply to the original requestor. This forces the original
requestor to retry the query in TCP mode, and the DNSSEC subqueries
also get made via TCP and everything works.
Note that in general the actual answer being validated is not large
enough to trigger truncation, and there's no reason not to return that
answer via UDP if we can validate it successfully. It follows that
a substandard client which can't do TCP queries will still work if the
answer could be returned via UDP, but fails if it gets an artifically
truncated answer and cannot move to TCP.
This patch teaches dnsmasq to move to TCP for DNSSEC queries when
validating UDP answers. That makes the substandard clients mentioned
above work, and saves a round trip even for clients that can do TCP.
The msg_controllen field passed to sendmsg is computed using the
CMSG_SPACE(), which is correct, but CMSG_SPACE() can include
padding bytes at the end of the passed structure if they are required
to align subsequent structures in the buffer. Make sure these
bytes are zeroed to avoid passing uninitiased memory to the kernel,
even though it will never touch these bytes.
Also tidy up the mashalling code in send_from to use pointers to
the structure being filled out, rather than a temporary copy which
then gets memcpy()'d into place. The DHCP sendmsg code has always
worked like this.
Thanks to Dominik Derigs for running Memcheck as submitting the
initial patch.
CAP_NET_ADMIN is needed in the DHCPv4 code to place entries into
the ARP cache. If it's configured to unconditionally broadcast
to unconfigured clients, it never touches the ARP cache and
doesn't need CAP_NET_ADMIN.
Thanks to Martin Ivičič <max.enhanced@gmail.com> for prompting this.
In generalising the RR filter code, the Dbus methods
controlling filtering A and AAAA records
got severely broken. This, and the previous commit,
fixes things.
Replies from upstream with a REFUSED rcode can result in
log messages stating that a resource limit has been exceeded,
which is not the case.
Thanks to Dominik Derigs and the Pi-hole project for
spotting this.