Some of our Openstack users run quite large number of dnsmasq instances
on single host. They started hitting default limit of inotify socket
number on single system after upgrade to more recent version. System
defaults of sysctl fs.inotify.max_user_instances is 128. They reached
limit of 116 dnsmasq instances, then more instances failed to start.
I was surprised they have any use case for such high number of
instances. They use one dnsmasq for one virtual network.
I found simple way to avoid hitting low system limit. They do not use
resolv.conf for name server configuration or any dhcp hosts or options
directory. Created inotify socket is never used in that case. Simple
patch attached.
I know we can raise inotify system limit. I think better is to not waste
resources that are left unused.
The current logic is naive in the case that there is more than
one RRset in an answer (Typically, when a non-CNAME query is answered
by one or more CNAME RRs, and then then an answer RRset.)
If all the RRsets validate, then they are cached and marked as validated,
but if any RRset doesn't validate, then the AD flag is not set (good) and
ALL the RRsets are cached marked as not validated.
This breaks when, eg, the answer contains a validated CNAME, pointing
to a non-validated answer. A subsequent query for the CNAME without do
will get an answer with the AD flag wrongly reset, and worse, the same
query with do will get a cached answer without RRSIGS, rather than
being forwarded.
The code now records the validation of individual RRsets and that
is used to correctly set the "validated" bits in the cache entries.
The logic to determine is an EDNS0 header was added was wrong. It compared
the packet length before and after the operations on the EDNS0 header,
but these can include adding options to an existing EDNS0 header. So
a query may have an existing EDNS0 header, which is extended, and logic
thinks that it had a header added de-novo.
Replace this with a simpler system. Check if the packet has an EDSN0 header,
do the updates/additions, and then check again. If it didn't have one
initially, but it has one laterly, that's the correct condition
to strip the header from a reply, and to assume that the client
cannot handle packets larger than 512 bytes.
dnsmasq allows to specify a interface for each name server passed with
the -S option or pushed through D-Bus; when an interface is set,
queries to the server will be forced via that interface.
Currently dnsmasq uses SO_BINDTODEVICE to enforce that traffic goes
through the given interface; SO_BINDTODEVICE also guarantees that any
response coming from other interfaces is ignored.
This can cause problems in some scenarios: consider the case where
eth0 and eth1 are in the same subnet and eth0 has a name server ns0
associated. There is no guarantee that the response to a query sent
via eth0 to ns0 will be received on eth0 because the local router may
have in the ARP table the MAC address of eth1 for the IP of eth0. This
can happen because Linux sends ARP responses for all the IPs of the
machine through all interfaces. The response packet on the wrong
interface will be dropped because of SO_BINDTODEVICE and the
resolution will fail.
To avoid this situation, dnsmasq should only restrict queries, but not
responses, to the given interface. A way to do this on Linux is with
the IP_UNICAST_IF and IPV6_UNICAST_IF socket options which were added
in kernel 3.4 and, respectively, glibc versions 2.16 and 2.26.
Reported-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Beniamino Galvani <bgalvani@redhat.com>
Further fix to 0549c73b7e
Handles case when RR name is not a pointer to the question,
only occurs for some auth-mode replies, therefore not
detected by fuzzing (?)
Fix out-of-memory Dos vulnerability. An attacker which can
send malicious DNS queries to dnsmasq can trigger memory
allocations in the add_pseudoheader function
The allocated memory is never freed which leads to a DoS
through memory exhaustion. dnsmasq is vulnerable only
if one of the following option is specified:
--add-mac, --add-cpe-id or --add-subnet.
Fix DoS in DNS. Invalid boundary checks in the
add_pseudoheader function allows a memcpy call with negative
size An attacker which can send malicious DNS queries
to dnsmasq can trigger a DoS remotely.
dnsmasq is vulnerable only if one of the following option is
specified: --add-mac, --add-cpe-id or --add-subnet.
Fix information leak in DHCPv6. A crafted DHCPv6 packet can
cause dnsmasq to forward memory from outside the packet
buffer to a DHCPv6 server when acting as a relay.
Fix heap overflow in IPv6 router advertisement code.
This is a potentially serious security hole, as a
crafted RA request can overflow a buffer and crash or
control dnsmasq. Attacker must be on the local network.
Fix heap overflow in DNS code. This is a potentially serious
security hole. It allows an attacker who can make DNS
requests to dnsmasq, and who controls the contents of
a domain, which is thereby queried, to overflow
(by 2 bytes) a heap buffer and either crash, or
even take control of, dnsmasq.
libidn2 strips underscores from international domain names
when encoding them. Indeed, it strips underscores even if
no encoding is necessary, which breaks SRV records.
Don't submit domain names to IDN encoding if they contain
one or more underscores to fix this.
If a DNS server replies REFUSED for a given DNS query in strict order mode
no failover to the next DNS server is triggered as the failover logic only
covers non strict mode.
As a result the client will be returned the REFUSED reply without first
falling back to the secondary DNS server(s).
Make failover support work as well for strict mode config in case REFUSED is
replied by deleting the strict order check and rely only on forwardall being
equal to 0 which is the case in non strict mode when a single server has been
contacted or when strict order mode has been configured.
Commit f77700aa, which fixes a compiler warning, also breaks the
behaviour of prepending ".<layer>" to basenames in --pxe-service: in
situations where the basename contains a ".", the ".<layer>" suffix is
erroneously added, and in situations where the basename doesn't contain
a ".", the ".<layer>" suffix is erroneously omitted.
A patch against the git HEAD is attached that inverts this logic and
restores the expected behaviour of --pxe-service.