TFTP fails with a timeout when the transfer exceeds 128K blocks.
The block field in an ACK packet is 16 bits, so large transfers
need to deal with this field wrapping. Getting that wrong meant
that it worked for the first wrap, but not for subsequent ones.
Having fixed the block-number problem, a second bug surfaced: file-size
calculations accidentally done in 32 bits rather than 64 broke
transfers larger than 2G or maybe 4G.
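The idea behind the fixes, as an illustrative sketch rather than the
actual dnsmasq code (the function names here are invented):

#include <stdint.h>

/* The wire block number is 16 bits, so compare only the low 16 bits
   of the 64-bit count of blocks sent so far; this survives any number
   of wraps. */
static int ack_matches(uint64_t blocks_sent, uint16_t ack_block)
{
  return (uint16_t)blocks_sent == ack_block;
}

/* Do the offset arithmetic in 64 bits so transfers past 4G work. */
static uint64_t file_offset(uint64_t blocks_acked, unsigned int blocksize)
{
  return blocks_acked * (uint64_t)blocksize;
}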
The first problem was probably introduced in commit
ebef27f321 and has not appeared
in a stable release. The second appears to have been there forever.
Clearly, nobody is mad enough to transfer multi-gigabyte files
over TFTP.
Thanks to Jean-François JUBLIN for spotting this and supplying
useful packet dumps that made it easy to diagnose.
GCC complains that writing the five-character string "ODNS" (four
letters plus the terminating NUL) into a four-element char magic[4]
array truncates the NUL character.
The warning's rationale is that this is incompatible with C++, or
may be unintentional.
GCC 8 added a nonstring variable attribute; clang 20.1 does
not yet support it, but clang's Git head does.
Add an ATTRIBUTE_NONSTRING macro, currently only defined on GCC >= 8
as __attribute__ ((nonstring)). This successfully suppresses
the warning on Fedora Linux 42's default compiler.
The alternative would be to replace the "ODNS" initialiser with {0} and
do memcpy(opt.magic, "ODNS", sizeof(opt.magic)); on the next line, which
is correct and C++-compatible, but less concise.
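A sketch of the macro and its use (the guard and the struct here are
illustrative; the real source may differ in detail):

#if defined(__GNUC__) && __GNUC__ >= 8
# define ATTRIBUTE_NONSTRING __attribute__ ((nonstring))
#else
# define ATTRIBUTE_NONSTRING
#endif

struct opt_header {
  char magic[4] ATTRIBUTE_NONSTRING;  /* holds "ODNS" with no terminating NUL */
};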
The proximate cause for doing this is to fix a bug that
causes replies to PTR queries with more than one answer to have the
second and subsequent answers ignored.
The fix turned into a small re-write which removed a very old hack.
When caching responses which include CNAME records, the cache system
stores the CNAME with a link to the record representing the target of
the CNAME. This isn't possible for PTR records representing IP
addresses, since the name stored is the target of the PTR record, and
its name is inferred from the address in the cache record. Such
cache records have the F_REVERSE flag set. To get
around this, long ago, the code which stores such records elided the
CNAME entirely, so
4.3.2.1.in-addr.arpa CNAME 18/3.2.1.in-addr.arpa
18/3.2.1.in-addr.arpa PTR myhost.example.com
would be stored as
4.3.2.1.in-addr.arpa PTR myhost.example.com
and returned from the cache to subsequent requestors in that form.
Since that hack was committed, dnsmasq has learned to cache arbitrary
RRs. So now we can store the PTR records for all the non-trivial cases.
That means the CNAME chains ending in PTR records don't get mangled,
and we can store PTR records whose name is not w.x.y.z.in-addr.arpa
or the IPv6 equivalent.
The large public DNS services seem not to return proof-of-nonexistence
for DS records at the start of RFC-1918 in-addr.arpa domains and their
IPv6 equivalents: 10.in-addr.arpa, 168.192.in-addr.arpa etc.
Since dnsmasq already has an option which instructs it not to bother
upstream servers with pointless queries about these address ranges,
namely --bogus-priv, we extend that to enable behaviour which allows
dnsmasq to assume that insecure NXDOMAIN replies for these domains
are expected and to assume that the domains are legitimately unsigned.
This behaviour only matters when some address range is directed to
another upstream server using --rev-server. In that case it allows
replies from that server to pass DNSSEC validation. Without such a
server configured, queries are never sent upstream so they are never
validated and the new behaviour is moot.
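A hypothetical configuration showing the combination (not taken from
the patch):

rev-server=192.168.0.0/16,10.0.0.1
bogus-priv

Reverse queries for 192.168.0.0/16 go to 10.0.0.1, and with --bogus-priv
set dnsmasq treats 168.192.in-addr.arpa as legitimately unsigned, so the
replies from that server are not rejected by DNSSEC validation for want
of a DS proof-of-nonexistence.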
If DNS is happening over TCP, the query is handled by a forked
process. If ipset or nftset is configured, this might include
inserting addresses into the *sets. Before this update, that
inserting addresses in the *sets. Before this update, that
was done by the forked process using handles inherited from the
parent "master" process.
This is inherently racy. If the master process or another
child process tries to do updates at the same time, the
updates can clash and fail.
To see this, you need a busy server doing lots of DNS
queries over TCP, and ipset or nftset configured.
Going forward, we use the already established pipe to send the
updates from the child back to the master process, which
serialises them.
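A generic sketch of the serialisation idea (not dnsmasq's actual
structures or helper functions): the forked handler writes a fixed-size
record down the pipe, and only the parent touches the sets.

#include <unistd.h>
#include <string.h>

struct set_update { char setname[64]; char addr[46]; };

/* In the forked TCP handler: queue the update for the parent. Writes
   of less than PIPE_BUF bytes are atomic, so records from different
   children can't interleave. */
static void send_update(int pipe_fd, const char *set, const char *addr)
{
  struct set_update u;

  memset(&u, 0, sizeof(u));
  strncpy(u.setname, set, sizeof(u.setname) - 1);
  strncpy(u.addr, addr, sizeof(u.addr) - 1);
  write(pipe_fd, &u, sizeof(u));
}

/* In the parent's event loop: read one record and do the real
   ipset/nftset insertion here, in the parent only. */
static void recv_update(int pipe_fd)
{
  struct set_update u;

  if (read(pipe_fd, &u, sizeof(u)) == (ssize_t)sizeof(u))
    {
      /* insert u.addr into the set named u.setname */
    }
}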
Consider what happens when the same domain appears in
--address and --server.
This commit fixes the order, I think correctly, like this, from
highest to lowest priority:
--address with an IPv4 or IPv6 address (as long as the query matches the type).
--address with # for all-zeros (as long as the query is A or AAAA).
--address with no address, which returns NXDOMAIN or NOERROR for all types.
--server with address set to # to use the unqualified servers.
--server with matching domain.
--server without domain or from /etc/resolv.conf.
Note that the above only applies when the same domain appears in both.
The domain being matched is determined first, and a longer match has
higher priority, so you can send google.com to a server and force com
to return NXDOMAIN: for google.com the server config will
override the address config, because there's a longer match.
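A hypothetical configuration illustrating the longest-match rule:

server=/google.com/192.0.2.1
address=/com/

Names under com get NXDOMAIN (or NOERROR) locally from the --address
line, but google.com and its subdomains match the longer --server domain
and are forwarded to 192.0.2.1.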
Fix a case sensitivity problem which has been lurking for a long while.
When we get example.com and Example.com and combine them, we send whichever
query arrives first upstream and then later answer it, and we also
answer the second with the same answer. That means that if example.com
arrives first, it will get the answer example.com - good - but Example.com
will _also_ get the answer example.com - not so good.
In theory, fixing this is simple without having to keep separate
copies of all the queries: just use the bit-vector representation
of case flipping that we have for 0x20-encoding to keep the
differences in case. The complication comes from the fact that
the existing bit-vector code only holds data on the first 32 alpha
letters, because we only flip up to that many for 0x20 encoding.
In practice, the delta between combined queries can almost always
be represented with that data, since almost all queries are
all lower case and we only perturb the first 32 letters with
0x20 encoding. It's therefore worth keeping the existing,
efficient data structure for the 99.9% of the time it works.
For the 0.1% it doesn't, however, one needs an arbitrary-length data
structure with the resource implications of that.
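A sketch of the bit-vector idea (illustrative only, not the actual
dnsmasq code): one bit per alphabetic character, set where the two
spellings differ in case, limited to the first 32 such characters.

#include <stdint.h>
#include <ctype.h>

static uint32_t case_delta(const char *stored, const char *incoming)
{
  uint32_t bits = 0;
  int bit = 0;

  for (; *stored && *incoming && bit < 32; stored++, incoming++)
    if (isalpha((unsigned char)*stored))
      {
        if (*stored != *incoming)
          bits |= 1u << bit;
        bit++;
      }

  return bits;
}

The saved delta can then be applied to the answer name, so the second
requestor gets back the case it asked with.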
Thanks to Peter Tirsek for the well researched bug report which set me
on to these problems.
We must only compare case when mapping an answer from upstream
to a forwarding record, not when checking a query to see if it's a
duplicate. Since the saved query name has its case scrambled, comparing
case there makes almost all such checks wrongly fail.
Thanks to Peter Tirsek for an exemplary bug report for this.
The "bit 0x20 encoding" implemented in 995a16ca0c
can interact badly with (hopefully) rare broken upstream servers. Provide
an option to turn it off and a log message to give a clue as to why DNS service
is non-functional.
This provides extra protection against reply-spoof attacks.
Since DNS queries are case-insensitive, it's possible to randomly flip
the case of letters in a query and still get the correct answer back.
This adds an extra dimension for a cache-poisoning attacker to guess
when sending replies in-the-blind since it's expected that the
legitimate answer will have the same pattern of upper and lower case
as the query, so any replies which don't can be ignored as
malicious.
The amount of extra entropy clearly depends on the number
of a-z and A-Z characters in the query, and this implementation puts a
hard limit of 32 bits to make resource allocation easy. This roughly
doubles the entropy over the standard random ID and random port
combination.
If dnsmasq is configured to add an EDNS client subnet to a query,
it is careful to suppress use of the cache, since a cached answer may
not be valid for a query with a different client subnet.
Extend this behaviour to queries which arrive at dnsmasq
already carrying an EDNS client subnet.
This change is rather more involved than may seem necessary at first sight,
since the existing code relies on all queries being decorated by dnsmasq
and therefore not cached, so there is no chance that an incoming query
might hit the cache, and cache lookups don't need to be suppressed, just
cache insertion. When downstream queries may be a mix of
client-subnet-bearing and plain vanilla, it can't be assumed that the answers are never
in the cache, and queries with subnets must not do lookups.
I am attaching an incremental git-am-ready patch to go on top of your Git HEAD,
to fix all sorts of issues and make this conform to C99 with default
options set,
and to fix another load of warnings you receive when setting the compiler
to pick the nits:
-pedantic-errors -std=c99 (or c11, c18, c2x).
It changes many void * to uint8_t * to make the "increment by bytes"
explicit.
You can't do:
void *foo;
// ...
foo += 2;
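Pointer arithmetic on void * is a GNU extension rather than standard C;
with uint8_t * (from <stdint.h>) the same increment is well defined:
uint8_t *foo;
// ...
foo += 2;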
This acts almost exactly like --dhcp-option except that the defined option
is only sent when replying to PXE clients. More importantly, these
options are sent in replies to PXE clients when dnsmasq is acting in PXE
proxy mode. In PXE proxy mode, the set of options sent is defined by
the PXE standard and the normal set of options is not sent. This config
allows arbitrary options in PXE-proxy replies. A typical use-case is
to send option 175 to iPXE. Thanks to Jason Berry for finding the
requirement for this.
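For example, assuming the option introduced here is spelled
dhcp-option-pxe by analogy with --dhcp-option (and with a placeholder
value):

dhcp-option-pxe=175,de:ad:be:ef

This sends option 175 to PXE clients such as iPXE, including in
PXE-proxy replies where the normal --dhcp-option set is not sent.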
A Gentoo Linux bug report, https://bugs.gentoo.org/945183, says that dnsmasq 2.90 fails to compile with GCC 15.
The issue is that while previous versions of GCC defaulted to the C17 standard and C23 could be selected with
"-std=c23" or "-std=gnu23", GCC 15 defaults to C23. In C23 incompatible pointer types are an error instead of
a warning, so the "int (*callback)()" incomplete prototypes cause errors.
For example, compiling dnsmasq 2.90 with gcc 14.2.1 and "-std=gnu23" fails with errors such as:
lease.c: In function `lease_find_interfaces':
lease.c:467:34: warning: passing argument 3 of `iface_enumerate' from incompatible pointer type [-Wincompatible-pointer-types]
467 | iface_enumerate(AF_INET, &now, find_interface_v4);
| ^~~~~~~~~~~~~~~~~
| |
| int (*)(struct in_addr, int, char *, struct in_addr, struct in_addr, void *)
In file included from lease.c:17:
dnsmasq.h:1662:50: note: expected `int (*)(void)' but argument is of type `int (*)(struct in_addr, int, char *, struct in_addr, struct in_addr, void *)'
1662 | int iface_enumerate(int family, void *parm, int (callback)());
| ~~~~~^~~~~~~~~~~
This patch uses a typedef'ed union of pointer types to get type checking of the pointers. If that's too complicated,
another way might be to use (void *) casts to disable type checking.
Also, some of the IPv6 callbacks had "int preferred, int valid" and some had
"unsigned int preferred, unsigned int valid". This patch changes them all to "unsigned int"
so they're the same and to avoid casting "u32" to "int", eg:
u32 preferred = 0xffffffff;
callback(..., (int)preferred, ...)
Even if those cast values aren't used in the callback, casting u32 to "int" feels bad, especially if "int" is 32 bits.
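A sketch of the union approach (the IPv4 signature follows the warning
quoted above; the IPv6 signature and all the names here are
illustrative):

#include <netinet/in.h>

typedef union {
  int (*af_inet)(struct in_addr local, int if_index, char *label,
                 struct in_addr netmask, struct in_addr broadcast, void *vparam);
  int (*af_inet6)(struct in6_addr *local, int prefix, int scope, int if_index,
                  int flags, unsigned int preferred, unsigned int valid, void *vparam);
} iface_callback_t;

int iface_enumerate(int family, void *parm, iface_callback_t callback);

Call sites then pass a fully prototyped function, for example
(iface_callback_t){ .af_inet = find_interface_v4 }, so the compiler can
check the pointer type.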
Timeouts for TCP connections to non-responsive servers are very long.
This is not appropriate for DNS connections.
Set timeouts for connection setup, sending data and receiving data.
The timeouts for connection setup and sending data are set at 5 seconds.
For receiving the reply this is doubled, to take into account the
time for upstream to actually get the answer.
Thanks to Petr Menšík for pointing out this problem, and finding a better
and more portable solution than the one in place heretofore.
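One simple way to bound the send and receive phases, as a sketch only
(not necessarily the mechanism used in the patch, and it doesn't cover
connection setup), is per-socket timeouts:

#include <sys/socket.h>
#include <sys/time.h>

static void set_tcp_timeouts(int fd)
{
  struct timeval snd = { 5, 0 };   /* 5 seconds to send the query */
  struct timeval rcv = { 10, 0 };  /* doubled for receiving the reply */

  setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &snd, sizeof(snd));
  setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &rcv, sizeof(rcv));
}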
In the post 2020 flag-day world, we limit UDP packets to 1232 bytes
which can go anywhere, so the dodgy code to try and determine the
functional maximum packet size on the path from upstream servers
is obsolete.
This was kinda strange before, with a lot of cargo-cult copied code,
and no clear strategy.
Now it works like this:
When talking upstream we always add a pseudoheader, and set the
UDP packet size to --edns-packet-max unless we've had problems
talking to a server, when it's reduced to 1280 if that fixes things.
Answering queries from downstream, we get the answer (either from
upstream or local data). If local data won't fit the advertised size
(or 512 if there's no pseudoheader), return truncated. If upstream
returns truncated, do likewise. If upstream is OK, but the answer is
too big for downstream, truncate the answer.
Now that we're always saving the query, this is no longer necessary,
and that allows the removal of a lot of quite hairy code.
Much more code removal to come, once the TCP code path is also purged.
A relatively common situation is that the reply to a downstream query
will fit in a UDP packet when no DNSSEC RRs are present, but overflows
when the RRSIGs, NSECs etc. are added. This extends the automatic
move from UDP to TCP to downstream queries which get truncated replies,
in the hope that once stripped of the DNSSEC RRs, the reply can be returned
via UDP, without making the downstream retry with TCP.
If the downstream sets the DO bit (ie it wants the DNSSEC RRs), then
this path is not taken, since the downstream will have to get a truncated
response and retry to get a correct answer.
Heretofore, when validating the result of an external query triggers
a DNSKEY or DS query and the result of that query is truncated, dnsmasq
has forced the whole validation process to move to TCP by returning a
truncated reply to the original requestor. This forces the original
requestor to retry the query in TCP mode, and the DNSSEC subqueries
also get made via TCP and everything works.
Note that in general the actual answer being validated is not large
enough to trigger truncation, and there's no reason not to return that
answer via UDP if we can validate it successfully. It follows that
a substandard client which can't do TCP queries will still work if the
answer could be returned via UDP, but fails if it gets an artificially
truncated answer and cannot move to TCP.
This patch teaches dnsmasq to move to TCP for DNSSEC queries when
validating UDP answers. That makes the substandard clients mentioned
above work, and saves a round trip even for clients that can do TCP.
By calculating the hash of a DNSKEY once for each digest algo,
we reduce the hashing work from (no. DS) x (no. DNSKEY) to
(no. DNSKEY) x (no. distinct digests).
The number of distinct digests can never be more than 255, and in
practice it's limited by which hashes we implement, so currently only 4.
Now that we can cache arbitrary RRs, give more correct answers when
replying with negative answers from the cache.
Implementing this needed the DNS-doctor code to be untangled from
find_soa(), so it should be under suspicion for any regressions
in that department.
Similar to local-service, but more strict. Listen only on localhost
unless another interface is specified. Has no effect when an interface is
provided explicitly. I had multiple bugs filed on Fedora, because I had
changed the default configuration to:
interface=lo
bind-interfaces
People just adding configuration snippets to /etc/dnsmasq.d or appending to
an existing configuration often fail to see that some defaults are already
there. Give them an auto-ignored configuration as a smart default.
Signed-off-by: Petr Menšík <pemensik@redhat.com>
Do not add a new command-line parameter. Instead, just modify the
behaviour of the existing local-service option. It now
accepts two optional values:
- net: exactly the same as before
- host: bind only to the lo interface; do not listen on any addresses
other than loopback.
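With that, the two-line Fedora default above can be replaced by a single
line which is automatically ignored once an interface is configured
explicitly:

local-service=host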
At startup, the leases file is read by lease_init(), and
in lease_init() undecorated hostnames are expanded into
FQDNs by adding the domain associated with the address
of the lease.
lease_init() happens relatively early in the startup, partly because
if it calls the dhcp-lease helper script, we don't want that to inherit
a load of sensitive file descriptors. This has implications if domains
are defined using the --domain=example.com,eth0 format since it's long
before we call enumerate_interfaces(), so get_domain fails for such domains.
The patch just moves the hostname expansion function to a separate
subroutine that gets called later, after enumerate_interfaces().