To provide feature parity with `bin/tests/system/ans.pl`, add a control
command to allow easy switching between different sequences of
ResponseHandlers.
It saves an indent and brackets on the call sites.
Also sort the handlers alphabetically where their order doesn't matter
and split the fallback handlers into a separate call to signify that
their position at the end matters.
Since there was no 10.53.0.6 server in the test, renumber the remaining
ones so that there's no gap in the server names.
This commit simply moves the ans.py files without any changes and
renumbers the IP addresses in tests.
Previously, the ans8 server had different response modes that applied to
all queries. Replace it with an AsyncDnsServer that serves the different
response modes under different domains, without the need to change the
server behaviour at runtime.
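Selecting behaviour by domain instead of by runtime state can be sketched
as follows. This is only an illustration: the domain names, the mode
labels and `pick_mode()` are hypothetical and are not taken from the
actual test server.

```python
# Hypothetical mapping of domain suffixes to response modes; these names
# are placeholders, not the ones used by the real server.
MODES = {
    "silent.example.": "no-response",
    "truncated.example.": "truncated-response",
}
DEFAULT_MODE = "normal-response"


def pick_mode(qname: str) -> str:
    """Return the response mode for qname based on its domain suffix."""
    name = qname.lower()
    if not name.endswith("."):
        name += "."
    for suffix, mode in MODES.items():
        # Match the domain itself or any name below it, on label boundaries.
        if name == suffix or name.endswith("." + suffix):
            return mode
    return DEFAULT_MODE
```

Because the mode is derived from the query name alone, two tests can
exercise different modes concurrently without reconfiguring the server.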
Add the new queries that require an ns3 fallback to the ns3/example.db
zone.
The server has three modes of operation: no response, a partial AXFR,
or a complete AXFR. To test the fallback behaviour of dig, these
actions are combined in specific sequences. To set up the desired
server behaviour, use the _control queries for the server.
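The combination of actions into switchable sequences could look roughly
like the sketch below. The sequence names, the `ActionSequencer` class and
the fallback used once a sequence is exhausted are all assumptions made
for illustration; they do not describe the real control interface.

```python
from collections import deque

# Hypothetical sequence names; the three actions mirror the three modes
# of operation described above.
SEQUENCES = {
    "partial-then-complete": ["partial_axfr", "complete_axfr"],
    "silent-then-partial": ["no_response", "partial_axfr"],
}


class ActionSequencer:
    """Hands out one action per incoming request from the active sequence."""

    def __init__(self, sequences):
        self._sequences = sequences
        self._pending = deque()

    def control(self, name):
        """Activate a named sequence, as a control query would."""
        self._pending = deque(self._sequences[name])

    def next_action(self):
        """Return the next action; fall back to no_response when exhausted.

        The fallback is an assumption of this sketch.
        """
        if self._pending:
            return self._pending.popleft()
        return "no_response"
```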
The action can be used to close the connection even after some response
was sent, depending on the ordering of actions in the handler that uses
it. Rename it to CloseConnection to use a more fitting name.
If at all possible, all the responses should be created by
AsyncDnsServer's internal methods. To ensure this, mark them with a
magic attribute and check it on send and crash the server if a manually
created response is detected.
Fix the qmin test server which uses `make_response`.
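The marking scheme described above can be sketched like this. The
attribute name, the `Response` stand-in and the `Server` methods are
hypothetical; the sketch only shows the general idea of tagging objects
at creation time and refusing untagged ones on send.

```python
MAGIC_ATTR = "_built_by_server"  # hypothetical marker name


class Response:
    """Stand-in for a DNS response message."""

    def __init__(self, payload):
        self.payload = payload


class Server:
    def make_response(self, payload):
        """Create a response and tag it as server-made."""
        response = Response(payload)
        setattr(response, MAGIC_ATTR, True)
        return response

    def send(self, response):
        """Refuse to send anything that was not created by make_response()."""
        if not getattr(response, MAGIC_ATTR, False):
            raise RuntimeError("manually created response detected")
        return response.payload
```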
As `dns_view_findzonecut()` only returns either ISC_R_SUCCESS or
DNS_R_NXDOMAIN, and since it automatically disassociates the rdatasets
in case of failure, some call sites are simplified.
Currently we add an rrset-order cyclic statement to the default config.
Since the rrset-order allows matching a subset of all names, it must
be implemented with a string comparison against a wildcard, and since
the statement applies per rrset, this can result in millions of
comparisons per second on a busy authoritative server.
This commit removes rrset-order from the default config, but adds back
a code shim in query_setorder to preserve the previous behaviour.
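The effect of such a shim can be illustrated with a small sketch. It is
not the code in query_setorder: `select_order()` is a hypothetical helper,
shell-style matching stands in for the wildcard comparison, and returning
`None` when no rule matches is a simplification of this sketch only.

```python
from fnmatch import fnmatchcase


def select_order(owner, rules):
    """Pick an ordering for an rrset owner name.

    rules is a list of (pattern, order) pairs taken from configuration.
    """
    if not rules:
        # Shim: nothing is configured, so keep the previous default
        # without performing any per-rrset name comparison.
        return "cyclic"
    for pattern, order in rules:
        # One string comparison per rule and per rrset: this is the cost
        # that the default configuration no longer incurs.
        if fnmatchcase(owner, pattern):
            return order
    return None
```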
Now that the configuration options `edns-version`, `edns-udp-size`,
`max-udp-size`, `no-cookie-udp-size` and `padding` have strict boundaries
(configuration failing if they are not respected), remove configuration
loading code which implicitly raises or lowers them.
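The difference between the two approaches is sketched below. The helper
names are hypothetical and the bounds passed in the usage example are
illustrative values, not a statement about the limits named enforces.

```python
def clamp(value, low, high):
    """Old style: silently raise or lower the value into [low, high]."""
    return max(low, min(value, high))


def check_strict(name, value, low, high):
    """New style: reject out-of-range values instead of adjusting them."""
    if not low <= value <= high:
        raise ValueError(f"{name} {value} is out of range [{low}, {high}]")
    return value
```

With illustrative bounds of 512 and 4096, `clamp(100, 512, 4096)` quietly
yields 512, while `check_strict("edns-udp-size", 100, 512, 4096)` raises
an error, so the operator learns about the misconfiguration at load time.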
Ensure that named can handle a situation where the zone is signed with a
truncated, self-signed revoked DNSKEY. The signatures are inevitably
bogus and a SERVFAIL is expected. However, prior to the CVE-2025-8677 fix,
this could trigger an assertion failure.
Create a signed zone file that contains malformed ZSKs with colliding
key tags. The ZSKs don't represent valid ECDSA keys and will cause a
crypto failure when attempting to use them. Sign the zone with KSK, with
the exception of one record which is "signed" with the invalid ZSKs.
Check that the resolver aborts the DNSSEC verification after
encountering the first crypto failure, indicating malformed DNSKEY.
In 6e684d44 I mistakenly set the default for `default_aa` for
`AsyncDnsServer()` to `True` and then explicitly set it to True in
cases where all the `ResponseHandlers` said
`yield DnsResponseSend(..., authoritative=True)` as if the default was
`False`.
Also the rest of `AsyncDnsServer` code (namely `_prepare_responses`)
reads like `default_aa` is `False` by default.
This accidentally changed the behavior of servers which don't set the
`default_aa` and where AA is not set from the zone data
(e.g. `dispatch/ans3`).
Commit c17ac42608 changed some tests to wait for "zone_needdump"
messages instead of "sending notifies", because notifies are rate
limited and "zone_needdump" happens on every change.
However, inspecting the logs shows that "zone_needdump" is logged more
than once (likely because the re-signing is done in batches):
received control channel command 'sign step3.zsk-prepub.manual'
zone_journal: zone step3.zsk-prepub.manual/IN (signed): enter
zone_needdump: zone step3.zsk-prepub.manual/IN (signed): enter
zone_journal: zone step3.zsk-prepub.manual/IN (signed): enter
zone_needdump: zone step3.zsk-prepub.manual/IN (signed): enter
zone_journal: zone step3.zsk-prepub.manual/IN (signed): enter
zone_needdump: zone step3.zsk-prepub.manual/IN (signed): enter
zone step3.zsk-prepub.manual/IN (signed): sending notifies
This means we are running the rollover step checks too fast in some
test runs.
Revert the wait for log change for the rollover-zsk-prepub test.
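The timing problem can be seen by scanning the log excerpt above. The
helper below is an illustrative sketch, not part of the test framework:
it shows that the first "zone_needdump" line appears well before the
"sending notifies" line that marks the end of the batch.

```python
LOG = """\
received control channel command 'sign step3.zsk-prepub.manual'
zone_journal: zone step3.zsk-prepub.manual/IN (signed): enter
zone_needdump: zone step3.zsk-prepub.manual/IN (signed): enter
zone_journal: zone step3.zsk-prepub.manual/IN (signed): enter
zone_needdump: zone step3.zsk-prepub.manual/IN (signed): enter
zone_journal: zone step3.zsk-prepub.manual/IN (signed): enter
zone_needdump: zone step3.zsk-prepub.manual/IN (signed): enter
zone step3.zsk-prepub.manual/IN (signed): sending notifies
"""


def first_index(lines, needle):
    """Return the index of the first line containing needle, or -1."""
    for index, line in enumerate(lines):
        if needle in line:
            return index
    return -1


lines = LOG.splitlines()
needdump_count = sum("zone_needdump" in line for line in lines)
first_needdump = first_index(lines, "zone_needdump")
notifies = first_index(lines, "sending notifies")
```

A check that proceeds as soon as `first_needdump` is seen runs while two
more signing batches are still outstanding.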
Add a type to all dns_zone_(get|set) functions that apply to sending
notifies, so the options can be set and retrieved separately per type.
This affects dns_zone_setnotifydefer, dns_zone_getnotifydefer,
dns_zone_setnotifydelay, dns_zone_getnotifydelay,
dns_zone_setnotifysrc4, and dns_zone_setnotifysrc6.
The functions dns_zone_getnotifysrc4 and dns_zone_getnotifysrc6 are
unused and can be removed.
A single spoofed DNAME answer can impact many names, and because of the
nature of DNAME, the attacker can use randomized query names to get
an unlimited number of tries to spoof the answer. To limit the impact, we
should not accept DNAME over insecure transport, such as UDP without
cookies.
In short, the attacker tries to spoof at least one answer that has the
following form:
opcode QUERY
rcode NOERROR
flags QR AA
;QUESTION
trigger$RANDOM.test. IN A
;ANSWER
trigger$RANDOM.test. 3600 IN CNAME trigger$RANDOM.attacker.net.
test. 3600 IN DNAME attacker.net.
;AUTHORITY
;ADDITIONAL
This has been discovered internally.
Co-authored-by: Michał Kępień <michal@isc.org>
In short, the attacker tries to spoof at least one answer that has the
following form:
rcode NOERROR
flags QR
;QUESTION
trigger$RANDOM.victim. IN TXT
;ANSWER
;AUTHORITY
trigger$RANDOM.victim. 3600 IN NS ns.victim.
;ADDITIONAL
ns.victim. 3600 IN A 10.53.0.3
This attack was originally reported as "test case 2".
Co-authored-by: Michał Kępień <michal@isc.org>
Before the fixes for CVE-2025-40778, an unsolicited in-bailiwick NS
record was accepted from a (spoofed) answer, enabling a single spoofed A
query/response to redirect traffic for a whole delegation.
In short, the attacker tries to spoof at least one answer that has the
following form:
rcode NOERROR
flags QR AA
;QUESTION
trigger$RANDOM.victim. IN TXT
;ANSWER
trigger$RANDOM.victim. 3600 IN TXT "spoofed answer with extra NS"
;AUTHORITY
victim. 3600 IN NS ns.attacker.
;ADDITIONAL
This attack was originally reported as "test case 1".
Co-authored-by: Michał Kępień <michal@isc.org>
Before the fixes for CVE-2025-40778, a positive answer was allowed to
overwrite sibling NS RRs. The answer had to be a positive AA=1 answer
with a fake NS along with it. This combination of conditions avoided
the code path with "unrelated <RRTYPE>" detection logic.
If it were some other answer, named from the main branch would detect
the attempt and log:
DNS format error from 10.53.0.1#16386 resolving trigger/A for <unknown>: unrelated NS victim in trigger authority section
In short, the attacker tries to spoof at least one answer that has the
following form:
opcode QUERY
rcode NOERROR
flags QR AA
;QUESTION
trigger$RANDOM. IN A
;ANSWER
trigger$RANDOM. 3600 IN A 10.53.0.3
;AUTHORITY
victim. 3600 IN NS ns.attacker.
;ADDITIONAL
ns.attacker. 3600 IN A 10.53.0.3
This attack was originally reported as "test case 1c".
Co-authored-by: Michał Kępień <michal@isc.org>
Add bin/tests/system/ans.py, a bare-bones DNS server that can be used in
system tests instead of full-blown named instances when a server is only
required to return zone-based data. Where applicable, this reduces load
on the test host and the amount of generated logs.
Due to the way various asyncio-related objects (tasks, streams,
transports, selectors) are referencing each other, pausing reads for a
TCP transport (which in practice means removing the client socket from
the set of descriptors monitored by a selector) can cause the client
task (AsyncDnsServer._handle_tcp()) to be prematurely garbage-collected,
causing asyncio code to raise a "Task was destroyed but it is pending!"
exception. Who knew that solutions as elegant as the one introduced by
e407888507 could cause unexpected trouble?
Fix by making a horrible hack even more horrible, specifically by
keeping a reference to each incoming TCP connection to protect its
related asyncio objects from getting garbage-collected. This prevents
AsyncDnsServer from closing any of the ignored TCP connections
indefinitely, which is obviously a pretty brain-dead idea for a
production-grade DNS server, but AsyncDnsServer was never meant to be
one and this hack reliably solves the problem at hand.
Only apply this change for the IgnoreAllConnections handler as the
ConnectionReset handler triggers a connection reset immediately after
pausing reads for an incoming TCP connection.
As pointed out in e407888507, the proper
solution would require implementing a custom asyncio transport from
scratch and that is still deemed to be too much work for the purpose at
hand. Let's see how much longer we can limp along with the existing
approach.
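The mechanism behind the hack, holding a strong reference so that the
garbage collector leaves the related objects alone, can be shown with a
self-contained sketch. `FakeStream` and `IgnoredConnections` are
hypothetical stand-ins; the real code deals with asyncio readers, writers
and tasks rather than these placeholder objects.

```python
import gc
import weakref


class FakeStream:
    """Stand-in for one of the asyncio objects tied to a TCP connection."""


class IgnoredConnections:
    """Keeps every ignored connection alive for the lifetime of the server."""

    def __init__(self):
        self._kept = []

    def keep(self, reader, writer):
        # Holding a strong reference prevents garbage collection.
        self._kept.append((reader, writer))


def demo():
    registry = IgnoredConnections()
    kept_reader, kept_writer = FakeStream(), FakeStream()
    dropped = FakeStream()
    kept_ref = weakref.ref(kept_writer)
    dropped_ref = weakref.ref(dropped)
    registry.keep(kept_reader, kept_writer)
    # Drop the local names, as happens when a handler returns.
    del kept_reader, kept_writer, dropped
    gc.collect()
    return kept_ref() is not None, dropped_ref() is None
```

The registry never releases its entries, which matches the trade-off
described above: connections stay open indefinitely.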
Calling asyncio.Future.set_exception() or asyncio.Future.set_result()
more than once for a given Future object raises an
asyncio.InvalidStateError exception.
In the case of AsyncServer:
- it is enough to capture the first exception raised by higher-level
logic as no exceptions at all are expected to be raised in the first
place,
- no distinction is made between SIGINT and SIGTERM; the only purpose
of the signal handler is to make the server exit cleanly.
Given the above, make both AsyncServer._handle_exception() and
AsyncServer._signal_done() idempotent by ignoring
asyncio.InvalidStateError exceptions raised by the relevant
asyncio.Future.set_*() calls.
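A minimal sketch of the idempotent pattern, using only the standard
asyncio API. The helper names are hypothetical and are not the actual
AsyncServer methods.

```python
import asyncio


def set_result_once(future, result):
    """Set a result; do nothing if the future is already done."""
    try:
        future.set_result(result)
    except asyncio.InvalidStateError:
        pass


def set_exception_once(future, exc):
    """Set an exception; do nothing if the future is already done."""
    try:
        future.set_exception(exc)
    except asyncio.InvalidStateError:
        pass


async def _demo():
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    set_result_once(done, "first signal")
    set_result_once(done, "second signal")  # silently ignored
    failed = loop.create_future()
    set_exception_once(failed, RuntimeError("first"))
    set_exception_once(failed, RuntimeError("second"))  # silently ignored
    return await done, str(failed.exception())


def run_demo():
    return asyncio.run(_demo())
```

Only the first call wins in each case, which matches the reasoning above:
the first exception is the interesting one, and any signal means "exit".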
If we change from NSEC3 to NSEC we should not produce a zone with
missing NSEC records.
The code only considered having seen a record if there was previously
a signature present at the owner name. However with opt-out, insecure
delegations don't have a RRSIG record. Reconfiguring to NSEC causes
all insecure delegations to have a missing NSEC record.
Add a DNAME record to the test zone to also cover DNAME delegations.
This reverts commit 21295bc188.
In a sense, the ans6 black hole server, based on asyncserver, "does
nothing". In our case, it won't respond to any query, and if the
IgnoreAllConnections connection handler was installed, it would not read
anything from the client socket.
Previously, sending notifications to an unconfigured address resulted in
no communication from the target (10.53.10.53); hence, the ns3
configuration comment requested a "non-responsive notify recipient (no
reply, no ICMP errors)".
However, examining the PCAP of ans6 reveals some communication from the
10.53.0.6 server to the 10.53.0.3 client, including ICMP Destination
Unreachable (Port Unreachable), and TCP SYN/ACK.
The ans6 communication seems to be sufficiently different to touch
different code paths in named, resulting in the BIND 9.20 backport
failing in the "checking notify retries expire within 30 seconds" test.
But we had better revert it from "main" as well.
The RFC says there MUST NOT be more than one DSYNC record for each
combination of RRtype and Scheme. If we encounter more than one, we
should drop the response, as the DSYNC RRset is invalid.
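The uniqueness rule can be expressed as a short check. This is a sketch,
not the code in named: records are modelled as plain
(rrtype, scheme, port, target) tuples and `dsync_rrset_is_valid()` is a
hypothetical helper.

```python
from collections import Counter


def dsync_rrset_is_valid(records):
    """Return True if no (rrtype, scheme) combination occurs more than once.

    records is an iterable of (rrtype, scheme, port, target) tuples.
    """
    counts = Counter(
        (rrtype, scheme) for rrtype, scheme, _port, _target in records
    )
    return all(count == 1 for count in counts.values())
```

For example, two records that both carry RRtype CDS and scheme 1 make the
whole RRset invalid, even if their ports or targets differ.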
When doing rollover and the CDS/CDNSKEY RRset is updated, test that a
NOTIFY(CDS) message is sent. For other steps in the rollover, prohibit
any dsyncfetch activity.
When starting up the services, send notifies for the existing CDS RRset.
This requires setting up a chain of trust for the test, so the DSYNC
records can be retrieved and validated.
This feature requires enabling 'notify-cds' and 'dnssec-validation'.
In this test, the scanner is pointed to ns2. Since there is no code
for receiving NOTIFY(CDS) messages for delegations, this is treated
as "not authoritative". Checking for this log message confirms that
the NOTIFY(CDS) message was actually sent.
Now that we log the type of the notify, some expected log messages
in the system tests need to be adjusted accordingly.
The bin/tests/system/nsec3/tests_nsec3_retransfer.py log is changed
to zone_needdump because it is more reliable. Other tests were
adjusted similarly in MR !11265, but !11226 introduced a new
"sending notify" log line.
Symlink ns1 and ns2 to rollover/ns1 and rollover/ns2.
Symlink ns3/template.db.j2.manual to rollover/ns3/template.db.j2.manual.
Since the bootstrapping is done before the templates are rendered
automatically, replace @DEFAULT_ALGORITHM@ in ns3/kasp.conf.j2 with
ecdsa256 and rename the file to ns3/kasp.conf.