What happens when a remote server doesn't understand EDNS0?

Question:

What are the situations (timeouts, FORMERR .. etc) that mark a server as unable to speak EDNS0? What happens afterwards?

Answer:

named first tries to send the query with EDNS(0).

If the responding server simply ignores the OPT record, this is fine: the response either fits within the 512-byte message size, or it has the TC (truncated) bit set and named retries the query over TCP.
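As a rough illustration of the truncation check mentioned above, the following Python sketch inspects the TC bit in a raw DNS message. The function name is_truncated is hypothetical. The header layout (12 bytes, with the flags in the second 16-bit field and TC at mask 0x0200) follows RFC 1035. This is not how named itself is implemented.

```python
import struct

TC_MASK = 0x0200  # TC (truncated) bit within the 16-bit DNS header flags field


def is_truncated(message: bytes) -> bool:
    """Return True if the raw DNS message has the TC bit set.

    Hypothetical helper for illustration only.
    """
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes; message is too short")
    # The first two 16-bit fields of the header are the message ID and the flags.
    _msg_id, flags = struct.unpack("!HH", message[:4])
    return bool(flags & TC_MASK)
```

A client that sees the TC bit set would normally repeat the same query over TCP.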

If no response is received, or the response received indicates 'failure', then named will try again with EDNS(0) but with the packet size limited to 512 bytes; if that fails, it will try again without EDNS(0).

Failure, in this context, is anything that prevents named from receiving a query response from the authoritative server, for example:

Returns FORMERR - in this case, named retries without EDNS
Returns BADVER (which suggests some understanding of EDNS) - in this case, named retries without EDNS as clearly the responding server isn't properly EDNS compliant
Returns NOTIMP - in this case, named retries without EDNS
Returns a badly formatted response - in this case, named rejects (and logs) the response, and internally handles it as no response or FORMERR.
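The fallback sequence described above can be sketched as a small decision function. This is one reading of the text, not named's actual code: the names Mode, NO_EDNS_RCODES and next_mode are hypothetical, and treating a malformed response the same as no response is an assumption (the text says it may be handled as either no response or FORMERR).

```python
from enum import Enum


class Mode(Enum):
    EDNS_DEFAULT = "EDNS(0), default UDP size"
    EDNS_512 = "EDNS(0), UDP size limited to 512 bytes"
    PLAIN = "no EDNS"


# Response codes the text lists as leading to a retry without EDNS.
NO_EDNS_RCODES = {"FORMERR", "BADVER", "NOTIMP"}


def next_mode(current, outcome):
    """Return the next query mode to try, or None if there is no retry.

    outcome is "OK", "TIMEOUT", "MALFORMED", or an rcode name
    (hypothetical labels chosen for this sketch).
    """
    if outcome == "OK":
        return None  # a usable answer arrived; no fallback needed
    if current is Mode.PLAIN:
        return None  # already querying without EDNS; nothing left to drop
    if outcome in NO_EDNS_RCODES:
        return Mode.PLAIN  # explicit error rcode: retry without EDNS
    # No response (or, by assumption, a malformed one): step down one level.
    if current is Mode.EDNS_DEFAULT:
        return Mode.EDNS_512
    return Mode.PLAIN
```

For example, next_mode(Mode.EDNS_DEFAULT, "TIMEOUT") yields Mode.EDNS_512, while next_mode(Mode.EDNS_DEFAULT, "FORMERR") goes straight to Mode.PLAIN.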

*For versions of BIND 9.9.x and older*, if named has fallen back all the way to querying without EDNS, and that query succeeds, then named remembers what worked; it won't try to use EDNS(0) again with that server for the duration of the server's TTL, or one day, whichever is shorter.
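The "TTL or one day, whichever is shorter" rule can be illustrated with a minimal sketch. The class NoEdnsMemory and its methods are hypothetical and only model the expiry arithmetic; they do not reflect named's internal data structures.

```python
ONE_DAY = 86400  # seconds


class NoEdnsMemory:
    """Remembers servers that only answered without EDNS (illustrative only)."""

    def __init__(self):
        self._expiry = {}  # server -> time at which the entry expires

    def remember(self, server, ttl, now):
        # Keep the entry for the TTL or one day, whichever is shorter.
        self._expiry[server] = now + min(ttl, ONE_DAY)

    def use_edns(self, server, now):
        """Return True if EDNS(0) should be tried with this server."""
        expiry = self._expiry.get(server)
        if expiry is None:
            return True
        if now >= expiry:
            del self._expiry[server]  # entry has aged out
            return True
        return False

    def forget(self, server):
        # Drop the entry early, loosely analogous to flushing it from cache.
        self._expiry.pop(server, None)
```

With a TTL of 3600 seconds the entry lasts one hour; with a TTL of a week it is capped at one day.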

To remove this internal setting for a server that has incorrectly been remembered as EDNS0-incapable, flush the name from cache:

rndc flushname <name>

Flushing the name may not always work

If the record for the server is keyed by IP address rather than name, it cannot be cleared from cache using rndc flushname. In that case, you will have to wait for the status to time out (5 minutes) or flush the entire cache (rndc flush). Clearing the remembered EDNS setting is only useful if the problem that originally caused it to be set has been addressed.

If you are having repeated or consistent problems with communication between your DNS resolver and one or more authoritative servers, and you are not running a current version of BIND, then you may be affected by the problem described in: Refinements to EDNS fallback behavior can cause different outcomes in Recursive Servers.

This article was originally written for versions of BIND older than BIND 9.10

If you are using BIND 9.10 or newer, please also read: Testing authoritative server support for EDNS and large UDP buffer sizes in BIND 9.10. In those versions, the way BIND recovers after determining that another server does not support EDNS is different, and is more resilient to changing circumstances and transient failures.

For information regarding more recently-introduced EDNS standards, you might also like to read: DNS Cookies in BIND 9.10 and 9.11.

Properly-implemented DNS servers should handle unknown EDNS options and versions correctly, but unfortunately this is not always the case. ISC has been monitoring server compliance - the results are posted here: https://ednscomp.isc.org/.