If dnsmasq is configured to add an EDNS client subnet to a query,
it is careful to suppress use of the cache, since a cached answer may
not be valid for a query with a different client subnet.
Extend this behaviour to queries which arrive a dnsmasq
already carrying an EDNS client subnet.
This change is rather more involved than may seem necessary at first sight,
since the existing code relies on all queries being decorated by dnsmasq
and therefore not cached, so there is no chance that an incoming query
might hit the cache and cache lookup don't need to be suppressed, just
cache insertion. When downstream queries may be a mix of client-subnet
bearing and plain vanilla, it can't be assumed that the answers are never
in the cache, and queries with subnets must not do lookups.
I am attaching an incremental git-am ready patch to go on top your Git HEAD,
to fix all sorts of issues and make this conforming C99 with default
options set,
and fix another load of warnings you receive when setting the compiler
to pick the nits,
-pedantic-errors -std=c99 (or c11, c18, c2x).
It changes many void * to uint8_t * to make the "increment by bytes"
explicit.
You can't do:
void *foo;
// ...
foo += 2.
We can't answer and shouldn't forward non-QUERY DNS requests.
This patch fixes handling such requests from TCP connections; before
the connection would be closed without reply.
It also changes the RCODE in the answer from REFUSED to NOTIMP and
provides clearer logging.
When DNSEC validation is enabled, but a query is not validated
because it gets forwarded to a non-DNSEC-capable upstream
server, the rr_status array is not correctly cleared, with
the effect that the answer may be maked as DNSSEC validated
if the immediately preceding query was DNS signed and validated.
Timeouts for TCP connections to non-responive servers are very long.
This in not appropriate for DNS connections.
Set timeouts for connection setup, sending data and recieving data.
The timeouts for connection setup and sending data are set at 5 seconds.
For recieving the reply this is doubled, to take into account the
time for usptream to actually get the answer.
Thanks to Petr Menšík for pointing out this problem, and finding a better
and more portable solution than the one in place heretofore.
In the post 2020 flag-day world, we limit UDP packets to 1232 bytes
which can go anywhere, so the dodgy code to try and determine the
functional maxmimum packet size on the path from upstream servers
is obsolete.
This was kinda strange before, with a lot of cargo-cult copied code,
and no clear strategy.
Now it works like this:
When talking upstream we always add a pseudoheader, and set the
UDP packet size to --edns-packet-max unless we've had problems
talking to a server, when it's reduced to 1280 if that fixes things.
Answering queries from downstream, we get the answer (either from
upstream or local data) If local data won't fit the advertised size
(or 512 if there's not pseudoheader) return truncated. If upstream
returns truncated, do likewise. If upstream is OK, but the answer is
too big for downstream, truncate the answer.
Now we're always saving the query, this is no longer necessary,
and allows the removal of a lot of quite hairy code.
Much more code removal to come, once the TCP code path is also purged.
A relatively common situation is that the reply to a downstream query
will fit in a UDP packet when no DNSSEC RRs are present, but overflows
when the RRSIGS, NSEC ect are added. This extends the automatic
move from UDP to TCP to downstream queries which get truncated replies,
in the hope that once stripped of the DNSSEC RRs, the reply can be returned
via UDP, nwithout making the downstream retry with TCP.
If the downstream sets the DO bit, (ie it wants the DNSSEC RRs, then
this path is not taken, since the downstream will have to get a truncated
repsonse and retry to get a correct answer.
Heretofore, when a validating the result of an external query triggers
a DNSKEY or DS query and the result of that query is truncated, dnsmasq
has forced the whole validation process to move to TCP by returning a
truncated reply to the original requestor. This forces the original
requestor to retry the query in TCP mode, and the DNSSEC subqueries
also get made via TCP and everything works.
Note that in general the actual answer being validated is not large
enough to trigger truncation, and there's no reason not to return that
answer via UDP if we can validate it successfully. It follows that
a substandard client which can't do TCP queries will still work if the
answer could be returned via UDP, but fails if it gets an artifically
truncated answer and cannot move to TCP.
This patch teaches dnsmasq to move to TCP for DNSSEC queries when
validating UDP answers. That makes the substandard clients mentioned
above work, and saves a round trip even for clients that can do TCP.
The msg_controllen field passed to sendmsg is computed using the
CMSG_SPACE(), which is correct, but CMSG_SPACE() can include
padding bytes at the end of the passed structure if they are required
to align subsequent structures in the buffer. Make sure these
bytes are zeroed to avoid passing uninitiased memory to the kernel,
even though it will never touch these bytes.
Also tidy up the mashalling code in send_from to use pointers to
the structure being filled out, rather than a temporary copy which
then gets memcpy()'d into place. The DHCP sendmsg code has always
worked like this.
Thanks to Dominik Derigs for running Memcheck as submitting the
initial patch.
Replies from upstream with a REFUSED rcode can result in
log messages stating that a resource limit has been exceeded,
which is not the case.
Thanks to Dominik Derigs and the Pi-hole project for
spotting this.
By calculating the hash of a DNSKEY once for each digest algo,
we reduce the hashing work from (no. DS) x (no. DNSKEY) to
(no. DNSKEY) x (no. distinct digests)
The number of distinct digests can never be more than 255 and
it's limited by which hashes we implement, so currently only 4.
An attacker can create DNSSEC signed domains which need a lot of
work to verfify. We limit the number of crypto operations to
avoid DoS attacks by CPU exhaustion.
In extract_addresses() the "secure" argument is only set if the
whole reply is validated (ie the AD bit can be set). Even without
that, some records may be validated, and should be marked
as such in the cache.
Related, the DNS doctor code has to update the flags for individual
RRs as it works, not the global "secure" flag.
Now we can cache arbirary RRs, give more correct answers when
replying negative answers from cache.
To implement this needed the DNS-doctor code to be untangled from
find_soa(), so it should be under suspicion for any regresssions
in that department.
answer_request() builds answers in the same packet buffer
as the request. This means that any EDNS0 header from the
original request is overwritten. If the answer is in cache, that's
fine: dnsmasq adds its own EDNS0 header, but if the cache lookup fails
partially and the request needs to be sent upstream, it's a problem.
This was fixed a long, long time ago by running the cache
lookup twice if the request included an EDNS0 header. The first time,
nothing would be written to the answer packet, nad if the cache
lookup failed, the untouched question packet was still available
to forward upstream. If cache lookup succeeded, the whole thing
was done again, this time writing the data into the reply packet.
In a world where EDNS0 was rare and so was memory, this was a
reasonable solution. Today EDNS0 is ubiquitous so basically
every query is being looked up twice in the cache. There's also
the problem that any code change which makes successive cache lookups
for a query possibly return different answers adds a subtle hidden
bug, because this hack depends on absence of that behaviour.
This commit removes the lookup-twice hack entirely. answer_request()
can now return zero and overwrite the question packet. The code which
was previously added to support stale caching by saving a copy of the
query in the block-storage system is extended to always be active.
This handles the case where answer_request() returns no answer OR
a stale answer and a copy of the original query is needed to forward
upstream.