The proximate cause for doing this is to fix a bug that
causes replies to PTR queries with more than one answer to have the
second and subsequent answers ignored.
The fix turned into a small re-write which removed a very old hack.
When caching reponses which include CNAME records, the cache system
stores the CNAME with a link to the record representing the target of
the CNAME. This isn't possible for PTR records representing IP
addresses since the name stored is the target of the PTR, record and
its name is inferred from the address in the cache record. Such
cache records have the F_REVERSE flag set. To get
around this, long ago, the code which stores such records elided the
CNAME entirely, so
4.3.2.1.in-addr.arpa CNAME 18/3.2.1.in-addr.arpa
18/3.2.1.in-addr.arpa PTR myhost.example.com
would be stored as
4.3.2.1.in-addr.arpa PTR myhost.example.com
and returned from the cache to subsequent requestor in that form.
Since that hack was committed, dnsmasq has learned to cache arbitrary
RRs. So now we can store the PTR records for all the no-trivial cases.
The means the CNAME chains ending in PTR records don't get mangled,
and we can store PTR records whose name in not w.x.y.x.in-addr.arpa
or the IPv6 equivalent.
If DNS is happening over TCP, the query is handled by a forked
process. Of ipset ot nftset is configured, this might include
inserting addresses in the *sets. Before this update, that
was done by the forked process using handles inherited from the
parent "master" process.
This is inherently racy. If the master process or another
child process tries to do updates at the same time, the
updates can clash and fail.
To see this, you need a busy server doing lots of DNS
queries over TCP, and ipset or nftset configured.
Going forward, we use the already established pipe to send the
updates from the child back to the master process, which
serialises them.
There's no point in checking if the query has edns0 headers _after_
adding our own.
This has the affect that if --add-cpe-id or --add-subnet or their friends
are configured, a query via TCP without EDNS0 will get an answer with EDNS0.
It's highly unlikely that this breaks anything, but it is incorrect.
Thanks to Tijs Van Buggenhout for spotting this.
This is a regession introduced in 3b6df06fb8.
When dnsmasq is started without upstreams (yet), but a
DNS query comes in that needs forwarding dnsmasq now potentially crashes as
the value for "first" variable is undetermined.
A segmentation violation occurs when the index
is out of bounds of serverarray.
Credits go to pedro0311 <pedro@freshtomato.org>
To complement the previous one, which fixed the retry path
when the query is retried from a different id/source address, this
fixes retries from the same id/source address.
A retry to upstream DNS servers triggered by the following conditions
1) A query asking for the same data as a previous query which has not yet been answered.
2) The second query arrives more than two seconds after the first.
3) Either the source of the second query or the id field differs from the first.
fails to set the case of the retry to the same pattern as the first attempt.
However dnsmasq expects the reply from upstream to have the case
pattern of the first attempt.
If the answer to the retry arrives before the answer to the first
query, dnsmasq will notice the case mismatch, log an error, and
ignore the answer.
The worst case scenario would be the first upstream query or reply is
lost and there would follow a short period where all queries for that
particular domain would fail.
This is a 2.91 development issue, it doesn't apply to previous stable releases.
Fix a case sensitivity problem which has been lurking for a long while.
When we get example.com and Example.com and combine them, we send whichever
query arrives first upstream and then later answer it, and we also
answer the second with the same answer. That means that if example.com
arrives first, it will get the answer example.com - good - but Example.com
will _also_ get the answer example.com - not so good.
In theory, fixing this is simple without having to keep seperate
copies of all the queries: Just use the bit-vector representation
of case flipping that we have for 0x20-encoding to keep the
differences in case. The complication comes from the fact that
the existing bit-vector code only holds data on the first 32 alpha
letters, because we only flip that up to many for 0x20 encoding.
In practise, the delta between combined queries can almost always
be represented with that data, since almost all queries are
all lower case and we only purturb the first 32 letters with
0x20 encoding. It's therefore worth keeping the existing,
efficient data structure for the 99.9% of the time it works.
For the 0.1% it doesn't, however, one needs an arbitrary-length data
structure with the resource implications of that.
Thanks to Peter Tirsek for the well researched bug report which set me
on to these problems.
We must only compare case when mapping an answer from upstream
to a forwarding record, not when checking a query to see if it's a
duplicate. Since the saved query name is scrambled, that ensures
that almost all such checks will wrongly fail.
Thanks to Peter Tirsek for an exemplary bug report for this.
The "bit 0x20 encoding" implemented in 995a16ca0c
can interact badly with (hopefully) rare broken upstream servers. Provide
an option to turn it off and a log message to give a clue as to why DNS service
is non-functional.
This provides extra protection against reply-spoof attacks.
Since DNS queries are case-insensitive, it's possible to randomly flip
the case of letters in a query and still get the correct answer back.
This adds an extra dimension for a cache-poisoning attacker to guess
when sending replies in-the-blind since it's expected that the
legitimate answer will have the same pattern of upper and lower case
as the query, so any replies which don't can be ignored as
malicious.
The amount of extra entropy clearly depends on the number
of a-z and A-Z characters in the query, and this implementation puts a
hard limit of 32 bits to make rescource allocation easy. This about
doubles entropy over the standard random ID and random port
combination.
When checking that an answer is the answer to the question that
we asked, compare the name in a case-sensitive manner.
Clients can set the letters in a query to a random pattern of
uppercase and lowercase to add more randomness as protection against
cache-poisoning attacks, and we don't want to nullify that.
This actually restores the status quo before
commit ed6d29a784
since matching questions and answers using a checksum
can't help but be case sensitive.
This patch is a preparation for introducing DNS-0x20
in the dnsmasq query path.
When dnsmasq is configured to act as an authoritative server and has
an authoritative zone configured, and recieves a query for
that zone _as_forwarder_ it answers the query directly rather
than forwarding it. This doesn't affect the answer, but it
saves dnsmasq forwarding the query to the recusor upstream,
whch then bounces it back to dnsmasq in auth mode. The
exception should be when the query is for the root of zone, for a DS
RR. The answer to that has to come from the parent, via the
recursor, and will typically be a proof-of-nonexistence since
dnsmasq doesn't support signed zones. This patch suppresses
local answers and forces forwarding to the upstream recursor
for such queries. It stops breakage when a DNSSEC validating
client makes queries to dnsmasq acting as forwarder for a zone
for which it is authoritative.
If we get a duplicate answer for a query via UDP which we have
either already received and started DNSSEC validation, or was
truncated and we've passed to TCP, then just ignore it.
The code was already in place, but had evolved wonky and
only worked for error replies which would otherwise prompt
a retransmit.
If dnsmasq is configured to add an EDNS client subnet to a query,
it is careful to suppress use of the cache, since a cached answer may
not be valid for a query with a different client subnet.
Extend this behaviour to queries which arrive a dnsmasq
already carrying an EDNS client subnet.
This change is rather more involved than may seem necessary at first sight,
since the existing code relies on all queries being decorated by dnsmasq
and therefore not cached, so there is no chance that an incoming query
might hit the cache and cache lookup don't need to be suppressed, just
cache insertion. When downstream queries may be a mix of client-subnet
bearing and plain vanilla, it can't be assumed that the answers are never
in the cache, and queries with subnets must not do lookups.
I am attaching an incremental git-am ready patch to go on top your Git HEAD,
to fix all sorts of issues and make this conforming C99 with default
options set,
and fix another load of warnings you receive when setting the compiler
to pick the nits,
-pedantic-errors -std=c99 (or c11, c18, c2x).
It changes many void * to uint8_t * to make the "increment by bytes"
explicit.
You can't do:
void *foo;
// ...
foo += 2.
We can't answer and shouldn't forward non-QUERY DNS requests.
This patch fixes handling such requests from TCP connections; before
the connection would be closed without reply.
It also changes the RCODE in the answer from REFUSED to NOTIMP and
provides clearer logging.
When DNSEC validation is enabled, but a query is not validated
because it gets forwarded to a non-DNSEC-capable upstream
server, the rr_status array is not correctly cleared, with
the effect that the answer may be maked as DNSSEC validated
if the immediately preceding query was DNS signed and validated.
Timeouts for TCP connections to non-responive servers are very long.
This in not appropriate for DNS connections.
Set timeouts for connection setup, sending data and recieving data.
The timeouts for connection setup and sending data are set at 5 seconds.
For recieving the reply this is doubled, to take into account the
time for usptream to actually get the answer.
Thanks to Petr Menšík for pointing out this problem, and finding a better
and more portable solution than the one in place heretofore.
In the post 2020 flag-day world, we limit UDP packets to 1232 bytes
which can go anywhere, so the dodgy code to try and determine the
functional maxmimum packet size on the path from upstream servers
is obsolete.