Now we're always saving the query, this is no longer necessary,
and allows the removal of a lot of quite hairy code.
Much more code removal to come, once the TCP code path is also purged.
A relatively common situation is that the reply to a downstream query
will fit in a UDP packet when no DNSSEC RRs are present, but overflows
when the RRSIGS, NSEC ect are added. This extends the automatic
move from UDP to TCP to downstream queries which get truncated replies,
in the hope that once stripped of the DNSSEC RRs, the reply can be returned
via UDP, nwithout making the downstream retry with TCP.
If the downstream sets the DO bit, (ie it wants the DNSSEC RRs, then
this path is not taken, since the downstream will have to get a truncated
repsonse and retry to get a correct answer.
Heretofore, when a validating the result of an external query triggers
a DNSKEY or DS query and the result of that query is truncated, dnsmasq
has forced the whole validation process to move to TCP by returning a
truncated reply to the original requestor. This forces the original
requestor to retry the query in TCP mode, and the DNSSEC subqueries
also get made via TCP and everything works.
Note that in general the actual answer being validated is not large
enough to trigger truncation, and there's no reason not to return that
answer via UDP if we can validate it successfully. It follows that
a substandard client which can't do TCP queries will still work if the
answer could be returned via UDP, but fails if it gets an artifically
truncated answer and cannot move to TCP.
This patch teaches dnsmasq to move to TCP for DNSSEC queries when
validating UDP answers. That makes the substandard clients mentioned
above work, and saves a round trip even for clients that can do TCP.
The msg_controllen field passed to sendmsg is computed using the
CMSG_SPACE(), which is correct, but CMSG_SPACE() can include
padding bytes at the end of the passed structure if they are required
to align subsequent structures in the buffer. Make sure these
bytes are zeroed to avoid passing uninitiased memory to the kernel,
even though it will never touch these bytes.
Also tidy up the mashalling code in send_from to use pointers to
the structure being filled out, rather than a temporary copy which
then gets memcpy()'d into place. The DHCP sendmsg code has always
worked like this.
Thanks to Dominik Derigs for running Memcheck as submitting the
initial patch.
CAP_NET_ADMIN is needed in the DHCPv4 code to place entries into
the ARP cache. If it's configured to unconditionally broadcast
to unconfigured clients, it never touches the ARP cache and
doesn't need CAP_NET_ADMIN.
Thanks to Martin Ivičič <max.enhanced@gmail.com> for prompting this.
In generalising the RR filter code, the Dbus methods
controlling filtering A and AAAA records
got severely broken. This, and the previous commit,
fixes things.
Replies from upstream with a REFUSED rcode can result in
log messages stating that a resource limit has been exceeded,
which is not the case.
Thanks to Dominik Derigs and the Pi-hole project for
spotting this.
By calculating the hash of a DNSKEY once for each digest algo,
we reduce the hashing work from (no. DS) x (no. DNSKEY) to
(no. DNSKEY) x (no. distinct digests)
The number of distinct digests can never be more than 255 and
it's limited by which hashes we implement, so currently only 4.
An attacker can create DNSSEC signed domains which need a lot of
work to verfify. We limit the number of crypto operations to
avoid DoS attacks by CPU exhaustion.
If there are no dynamic configuration directories configured with
dhcp-hostsdir, dhcp-optsdir and hostsdir then we need to use inotify
only to track changes to resolv-files, but we don't need to do
that when DNS is disabled (port=0) or no resolv-files are configured.
It turns out that inotify slots can be a scarce resource, so not
using one when it's not needed is a Goood Thing.
Patch by HL, description above from SRK.
In extract_addresses() the "secure" argument is only set if the
whole reply is validated (ie the AD bit can be set). Even without
that, some records may be validated, and should be marked
as such in the cache.
Related, the DNS doctor code has to update the flags for individual
RRs as it works, not the global "secure" flag.
Now we can cache arbirary RRs, give more correct answers when
replying negative answers from cache.
To implement this needed the DNS-doctor code to be untangled from
find_soa(), so it should be under suspicion for any regresssions
in that department.
Similar to local-service, but more strict. Listen only on localhost
unless other interface is specified. Has no effect when interface is
provided explicitly. I had multiple bugs fillen on Fedora, because I have
changed default configuration to:
interface=lo
bind-interfaces
People just adding configuration parts to /etc/dnsmasq.d or appending to
existing configuration often fail to see some defaults are already there.
Give them auto-ignored configuration as smart default.
Signed-off-by: Petr Menšík <pemensik@redhat.com>
Do not add a new parameter on command line. Instead add just parameter
for behaviour modification of existing local-service option. Now it
accepts two optional values:
- net: exactly the same as before
- host: bind only to lo interface, do not listen on any other addresses
than loopback.
By design, dnsmasq forwards queries for RR-types it has no data
on, even if it has data for the same domain and other RR-types.
This can lead to an inconsitent view of the DNS when an upstream
server returns NXDOMAIN for an RR-type and domain but the same domain
but a different RR-type gets an answer from dnsmasq. To avoid this,
dnsmasq converts NXDOMAIN answer from upstream to NODATA answers if
it would answer a query for the domain and a different RR-type.
An oversight missed out --synth-domain from the code to do this, so
--synth-domain=thekelleys.org.uk,192.168.0.0/24
would result in the correct answer to an A query for
192-168.0.1.thekelleys.org.uk and an AAAA query for the same domain
would be forwarded upstream and the resulting NXDOMAIN reply
returned.
After the fix, the reply gets converted to NODATA.
Thanks to Matt Wong for spotting the bug.
At startup, the leases file is read by lease_init(), and
in lease_init() undecorated hostnames are expanded into
FQDNs by adding the domain associated with the address
of the lease.
lease_init() happens relavtively early in the startup, party because
if it calls the dhcp-lease helper script, we don't want that to inherit
a load of sensitive file descriptors. This has implications if domains
are defined using the --domain=example.com,eth0 format since it's long
before we call enumerate_interfaces(), so get_domain fails for such domains.
The patch just moves the hostname expansion function to a seperate
subroutine that gets called later, after enumerate_interfaces().
Add the relevant information to the metrics and to the output of
dump_cache() (which is called when dnsmasq receives SIGUSR1).
Hence, users not collecting metrics will still be able to
troubleshoot with SIGUSR1. In addition to the current usage,
dump_cache() contains the information on the highest usage
since it was last called.
In bind-dynamic mode, its OK to fail to bind a socket to an address
given by --listen-address if no interface with that address exists
for the time being. Dnsmasq will attempt to create the socket again
when the host's network configuration changes.
The code used to ignore pretty much any error from bind(), which is
incorrect and can lead to confusing behaviour. This change make ONLY
a return of EADDRNOTAVAIL from bind() a non-error: anything else will be
fatal during startup phase, or logged after startup phase.
Thanks to Petr Menšík for the problem report and first-pass patch.
Bug report here:
https://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2023q4/017332.html
This error probably has no practical effect since even if the hash
is wrong, it's only compared internally to other hashes computed using
the same code.
Understanding the error:
hash-questions.c:168:21: runtime error: left shift of 128 by 24 places
cannot be represented in type 'int'
requires a certain amount of c-lawyerliness. I think the problem is that
m[i] = data[j] << 24
promotes the unsigned char data array value to int before doing the shift and
then promotes the result to unsigned char to match the type of m[i].
What needs to happen is to cast the unsigned char to unsigned int
BEFORE the shift.
This patch does that with explicit casts.