If the cache insertion process fails for any reason, any
blockdata storage allocated needs to be freed.
Thanks to Damian Sawicki for spotting the problem and
supplying patches against earlier releases. This patch by SRK,
and any bugs are his.
answer_request() builds answers in the same packet buffer
as the request. This means that any EDNS0 header from the
original request is overwritten. If the answer is in cache, that's
fine: dnsmasq adds its own EDNS0 header, but if the cache lookup fails
partially and the request needs to be sent upstream, it's a problem.
This was fixed a long, long time ago by running the cache
lookup twice if the request included an EDNS0 header. The first time,
nothing would be written to the answer packet, nad if the cache
lookup failed, the untouched question packet was still available
to forward upstream. If cache lookup succeeded, the whole thing
was done again, this time writing the data into the reply packet.
In a world where EDNS0 was rare and so was memory, this was a
reasonable solution. Today EDNS0 is ubiquitous so basically
every query is being looked up twice in the cache. There's also
the problem that any code change which makes successive cache lookups
for a query possibly return different answers adds a subtle hidden
bug, because this hack depends on absence of that behaviour.
This commit removes the lookup-twice hack entirely. answer_request()
can now return zero and overwrite the question packet. The code which
was previously added to support stale caching by saving a copy of the
query in the block-storage system is extended to always be active.
This handles the case where answer_request() returns no answer OR
a stale answer and a copy of the original query is needed to forward
upstream.
Caching an answer which has more that one RR, with at least
one answer being <=13 bytes and at least one being >13 bytes
can screw up the F_KEYTAG flag bit, resulting in the wrong
type of the address union being used and either a bad value
return or a crash in the block code.
Thanks to Dominik Derigs and the Pi-hole project for finding
and characterising this.
By default TCP connect takes minutes to fail when trying to
connect a server which is not responding and for which the
network layer doesn't generate HOSTUNREACH errors.
This is doubled because having failed to connect in FASTOPEN
mode, the code then tries again with a call to connect().
We set TCP_SYNCNT to 2, which make the timeout about 10 seconds.
This in an unportable Linux feature, so it doesn't work on other
platforms.
No longer try connect() if sendmsg in fastopen mode fails with
ETIMEDOUT or EHOSTUNREACH since the story will just be the same.
On 15/5/2023 8.8.8.8 was returning SERVFAIL for a query on ec.europa.eu
ec.europa.eu is not a domain cut, that happens at jrc.ec.europa.eu. which
does return a signed proof of non-existance for a DS record.
Abandoning the search for a DS or proof of non existence at ec.europa.eu
renders everything within that domain BOGUS, since nothing is signed.
This code changes behaviour on a SERVFAIL to continue looking
deeper for a DS or proof of its nonexistence.
After replying with stale data, dnsmasq sends the query upstream to
refresh the cache asynchronously and sometimes sends the wrong packet:
packet length can be wrong, and if an EDE marking stale data is added
to the answer that can end up in the query also. This bug only seems
to cause problems when the usptream server is a DOH/DOT proxy. Thanks
to Justin He for the bug report.
If I configure dnsmasq to use dbus and then restart dbus.service with watchers present,
it crashes dnsmasq. The reason is simple, it uses loop to walk over watchers to call
dbus handling code. But from that code the same list can be modified and watchers removed.
But the list iteration continues anyway.
Restart the loop if list were modified.
A NXDOMAIN answer recieved over TCP by a child process would
be correctly sent back to the master process which would then
fail to insert it into the cache.
the compiler padding it with an extra 8 bytes.
Use the F_KEYTAG flag in a a cache record to discriminate between
an arbitrary RR stored entirely in the addr union and one
which has a point to block storage.
If a NODATA answer is returned instead of actual data for A or AAAA
queries because of the existence of --filter-A or --filter-AAAA
config options, then mark the replies with an EDE "filtered" tag.
Basic patch by Petr Menšík, tweaked by Simon Kelley to apply onto
the preceding caching patches.
If --filter-AAAA is set and we have cached entry for
the domain in question fpr any RR type that allows us to
return a NODATA reply when --filter-AAAA is set without
going upstream. Similarly for --filter-A.
Dynamic-host was implemented to ignore interface addresses with /32
(or /128 for IPv6) prefix lengths, since they are not useful for
synthesising addresses.
Due to a bug before 2.88, this didn't work for IPv4, and some have
used --dynamic-host=example.com,0.0.0.0,eth0 to do the equivalent of
--interface-name for such interfaces. When the bug was fixed in 2.88
these uses broke.
Since this behaviour seems to violate the principle of least surprise,
and since the 2.88 fix is breaking existing imstallations, this
commit removes the check on /32 and /128 prefix lengths to solve both
problems.