Fix a case sensitivity problem which has been lurking for a long while.
When we get example.com and Example.com and combine them, we send whichever
query arrives first upstream and then later answer it, and we also
answer the second with the same answer. That means that if example.com
arrives first, it will get the answer example.com - good - but Example.com
will _also_ get the answer example.com - not so good.
In theory, fixing this is simple without having to keep seperate
copies of all the queries: Just use the bit-vector representation
of case flipping that we have for 0x20-encoding to keep the
differences in case. The complication comes from the fact that
the existing bit-vector code only holds data on the first 32 alpha
letters, because we only flip that up to many for 0x20 encoding.
In practise, the delta between combined queries can almost always
be represented with that data, since almost all queries are
all lower case and we only purturb the first 32 letters with
0x20 encoding. It's therefore worth keeping the existing,
efficient data structure for the 99.9% of the time it works.
For the 0.1% it doesn't, however, one needs an arbitrary-length data
structure with the resource implications of that.
Thanks to Peter Tirsek for the well researched bug report which set me
on to these problems.
The "bit 0x20 encoding" implemented in 995a16ca0c
can interact badly with (hopefully) rare broken upstream servers. Provide
an option to turn it off and a log message to give a clue as to why DNS service
is non-functional.
This provides extra protection against reply-spoof attacks.
Since DNS queries are case-insensitive, it's possible to randomly flip
the case of letters in a query and still get the correct answer back.
This adds an extra dimension for a cache-poisoning attacker to guess
when sending replies in-the-blind since it's expected that the
legitimate answer will have the same pattern of upper and lower case
as the query, so any replies which don't can be ignored as
malicious.
The amount of extra entropy clearly depends on the number
of a-z and A-Z characters in the query, and this implementation puts a
hard limit of 32 bits to make rescource allocation easy. This about
doubles entropy over the standard random ID and random port
combination.
When checking that an answer is the answer to the question that
we asked, compare the name in a case-sensitive manner.
Clients can set the letters in a query to a random pattern of
uppercase and lowercase to add more randomness as protection against
cache-poisoning attacks, and we don't want to nullify that.
This actually restores the status quo before
commit ed6d29a784
since matching questions and answers using a checksum
can't help but be case sensitive.
This patch is a preparation for introducing DNS-0x20
in the dnsmasq query path.
If the client asks for DNSSEC RRs via the do bit, and
we have an answer cached, we can only return the cached
answer if the RR was not validated. This is because
we don't the extra info (RRSIGS, NSECs) for a complete
validated answer. In that case we have to forward again.
This bug was that the "is the cache entry validated" test was
in an outer loop rather than an inner one. A cache hit on
a different RRtype that wasn't validated would satify the
condition to use the cache, even if the cache entry for
the required RRtype didn't. The only time when there can be a mix
of validated and non validated cache entries for the same domain
is when most are not validated, but one is a negative cache for
a DS record.
This bug took a long time to find.
If dnsmasq is configured to add an EDNS client subnet to a query,
it is careful to suppress use of the cache, since a cached answer may
not be valid for a query with a different client subnet.
Extend this behaviour to queries which arrive a dnsmasq
already carrying an EDNS client subnet.
This change is rather more involved than may seem necessary at first sight,
since the existing code relies on all queries being decorated by dnsmasq
and therefore not cached, so there is no chance that an incoming query
might hit the cache and cache lookup don't need to be suppressed, just
cache insertion. When downstream queries may be a mix of client-subnet
bearing and plain vanilla, it can't be assumed that the answers are never
in the cache, and queries with subnets must not do lookups.
We can't answer and shouldn't forward non-QUERY DNS requests.
This patch fixes handling such requests from TCP connections; before
the connection would be closed without reply.
It also changes the RCODE in the answer from REFUSED to NOTIMP and
provides clearer logging.
When DNSEC validation is enabled, but a query is not validated
because it gets forwarded to a non-DNSEC-capable upstream
server, the rr_status array is not correctly cleared, with
the effect that the answer may be maked as DNSSEC validated
if the immediately preceding query was DNS signed and validated.
This was kinda strange before, with a lot of cargo-cult copied code,
and no clear strategy.
Now it works like this:
When talking upstream we always add a pseudoheader, and set the
UDP packet size to --edns-packet-max unless we've had problems
talking to a server, when it's reduced to 1280 if that fixes things.
Answering queries from downstream, we get the answer (either from
upstream or local data) If local data won't fit the advertised size
(or 512 if there's not pseudoheader) return truncated. If upstream
returns truncated, do likewise. If upstream is OK, but the answer is
too big for downstream, truncate the answer.
In extract_addresses() the "secure" argument is only set if the
whole reply is validated (ie the AD bit can be set). Even without
that, some records may be validated, and should be marked
as such in the cache.
Related, the DNS doctor code has to update the flags for individual
RRs as it works, not the global "secure" flag.
Now we can cache arbirary RRs, give more correct answers when
replying negative answers from cache.
To implement this needed the DNS-doctor code to be untangled from
find_soa(), so it should be under suspicion for any regresssions
in that department.
By design, dnsmasq forwards queries for RR-types it has no data
on, even if it has data for the same domain and other RR-types.
This can lead to an inconsitent view of the DNS when an upstream
server returns NXDOMAIN for an RR-type and domain but the same domain
but a different RR-type gets an answer from dnsmasq. To avoid this,
dnsmasq converts NXDOMAIN answer from upstream to NODATA answers if
it would answer a query for the domain and a different RR-type.
An oversight missed out --synth-domain from the code to do this, so
--synth-domain=thekelleys.org.uk,192.168.0.0/24
would result in the correct answer to an A query for
192-168.0.1.thekelleys.org.uk and an AAAA query for the same domain
would be forwarded upstream and the resulting NXDOMAIN reply
returned.
After the fix, the reply gets converted to NODATA.
Thanks to Matt Wong for spotting the bug.
If the cache insertion process fails for any reason, any
blockdata storage allocated needs to be freed.
Thanks to Damian Sawicki for spotting the problem and
supplying patches against earlier releases. This patch by SRK,
and any bugs are his.
answer_request() builds answers in the same packet buffer
as the request. This means that any EDNS0 header from the
original request is overwritten. If the answer is in cache, that's
fine: dnsmasq adds its own EDNS0 header, but if the cache lookup fails
partially and the request needs to be sent upstream, it's a problem.
This was fixed a long, long time ago by running the cache
lookup twice if the request included an EDNS0 header. The first time,
nothing would be written to the answer packet, nad if the cache
lookup failed, the untouched question packet was still available
to forward upstream. If cache lookup succeeded, the whole thing
was done again, this time writing the data into the reply packet.
In a world where EDNS0 was rare and so was memory, this was a
reasonable solution. Today EDNS0 is ubiquitous so basically
every query is being looked up twice in the cache. There's also
the problem that any code change which makes successive cache lookups
for a query possibly return different answers adds a subtle hidden
bug, because this hack depends on absence of that behaviour.
This commit removes the lookup-twice hack entirely. answer_request()
can now return zero and overwrite the question packet. The code which
was previously added to support stale caching by saving a copy of the
query in the block-storage system is extended to always be active.
This handles the case where answer_request() returns no answer OR
a stale answer and a copy of the original query is needed to forward
upstream.
Caching an answer which has more that one RR, with at least
one answer being <=13 bytes and at least one being >13 bytes
can screw up the F_KEYTAG flag bit, resulting in the wrong
type of the address union being used and either a bad value
return or a crash in the block code.
Thanks to Dominik Derigs and the Pi-hole project for finding
and characterising this.
the compiler padding it with an extra 8 bytes.
Use the F_KEYTAG flag in a a cache record to discriminate between
an arbitrary RR stored entirely in the addr union and one
which has a point to block storage.
If a NODATA answer is returned instead of actual data for A or AAAA
queries because of the existence of --filter-A or --filter-AAAA
config options, then mark the replies with an EDE "filtered" tag.
Basic patch by Petr Menšík, tweaked by Simon Kelley to apply onto
the preceding caching patches.
If --filter-AAAA is set and we have cached entry for
the domain in question fpr any RR type that allows us to
return a NODATA reply when --filter-AAAA is set without
going upstream. Similarly for --filter-A.
As defined in the C standard:
In all cases the argument is an int, the value of which shall
be representable as an unsigned char or shall equal the value
of the macro EOF. If the argument has any other value, the
behavior is undefined.
This is because they're designed to work with the int values returned
by getc or fgetc; they need extra work to handle a char value.
If EOF is -1 (as it almost always is), with 8-bit bytes, the allowed
inputs to the ctype(3) functions are:
{-1, 0, 1, 2, 3, ..., 255}.
However, on platforms where char is signed, such as x86 with the
usual ABI, code like
char *arg = ...;
... isspace(*arg) ...
may pass in values in the range:
{-128, -127, -126, ..., -2, -1, 0, 1, ..., 127}.
This has two problems:
1. Inputs in the set {-128, -127, -126, ..., -2} are forbidden.
2. The non-EOF byte 0xff is conflated with the value EOF = -1, so
even though the input is not forbidden, it may give the wrong
answer.
Casting char to int first before passing the result to ctype(3)
doesn't help: inputs like -128 are unchanged by this cast. It is
necessary to cast char inputs to unsigned char first; you can then
cast to int if you like but there's no need because the functions
will always convert the argument to int by definition. So the above
fragment needs to be:
char *arg = ...;
... isspace((unsigned char)*arg) ...
This patch inserts unsigned char casts where necessary, and changes
int casts to unsigned char casts where the input is char.
I left alone int casts where the input is unsigned char already --
they're not immediately harmful, although they would have the effect
of suppressing some compiler warnings if the input is ever changed to
be char instead of unsigned char, so it might be better to remove
those casts too.
I also left alone calls where the input is int to begin with because
it came from getc; casting to unsigned char here would be wrong, of
course.
This patch also changes the method of calling querystr() such that
it is only called when logging is enabled, to eliminate any
possible performance problems from searching the larger table.