Serve-stale implementation details

This article describes the serve-stale and prefetch implementations in BIND 9.17.11 and 9.16.13, and discusses how these features interact with fetch-limits and other quota mechanisms. It provides background on the logic as implemented and is not intended to give explicit guidance on how to set these parameters.

Serve-stale

The max-stale-ttl configuration is stored in a per-view cache.
An RRset in any given cache is marked as stale during RRset lookup in that cache, if ALL of the following conditions apply:
1. The RRset's TTL has reached zero, i.e. the RRset is expired.
2. stale-cache-enable is set to yes in the configuration.
3. The RRset expiry is less than max-stale-ttl seconds ago.
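
The three conditions above can be read as a single predicate. The following is a minimal sketch in Python, not BIND's actual code: the CachedRRset class, its expires_at field, and the is_stale function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedRRset:
    # Hypothetical field: absolute time (in seconds) at which the TTL reaches zero.
    expires_at: float


def is_stale(rrset: CachedRRset, now: float,
             stale_cache_enable: bool, max_stale_ttl: int) -> bool:
    """Return True only if ALL three conditions from the list above hold."""
    expired = now >= rrset.expires_at                          # condition 1
    within_window = (now - rrset.expires_at) < max_stale_ttl   # condition 3
    return expired and stale_cache_enable and within_window    # condition 2 in the middle
```

For example, an RRset that expired 50 seconds ago is stale under this model if stale-cache-enable is yes and max-stale-ttl is 3600, but not once 3600 seconds or more have passed since expiry.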
If stale-refresh-time is zero (disabled), then:
1. A lookup of a stale RRset in cache only takes place when a previous attempt to refresh the RRset from authoritative servers has failed.
2. The lookup in cache happens in the same request, right after the failed attempt to refresh the RRset.
3. All subsequent requests for the same RRset follow the same path: try to refresh from name servers, fail, try cache.
Note that since stale-refresh-time was added, the default behavior in BIND is to have it enabled with a value of 30 seconds, so the path above applies only when it is explicitly set to zero.
If stale-refresh-time is non-zero (enabled), then a lookup MAY return a stale RRset from cache before going into recursion if:
1. The RRset is marked as stale.
2. A previous attempt to refresh the RRset has failed.
3. The lookup happens during the period stale-refresh-time after the refresh failure.
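
As a rough illustration of the stale-refresh-time window described above, the sketch below checks whether a lookup is eligible to be answered from stale cache before recursion. The function name and the last_refresh_failure parameter are assumptions made for this example and do not correspond to identifiers in the BIND source.

```python
from typing import Optional


def may_answer_from_stale_cache(marked_stale: bool,
                                last_refresh_failure: Optional[float],
                                now: float,
                                stale_refresh_time: int) -> bool:
    """Model of the three conditions for serving stale data before recursion."""
    if stale_refresh_time == 0:
        # Disabled: stale data is only consulted after a failed refresh
        # within the same request, never before recursion.
        return False
    if not marked_stale or last_refresh_failure is None:
        return False
    # The lookup must fall inside the window that opened at the failure.
    return 0 <= now - last_refresh_failure < stale_refresh_time
```

With the default stale-refresh-time of 30 seconds, a query arriving 10 seconds after a failed refresh is eligible in this model, while one arriving 31 seconds after the failure goes back to recursion.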

Negative Cached Content and Serve-stale

Stale negative cached content (NXDOMAIN or NXRRSET) is handled slightly differently because clients prefer positive answers. If there is a stale NXDOMAIN or NXRRSET in cache, BIND returns it only if the resolver query times out; stale negative data is not returned when stale-answer-client-timeout expires. However, once a refresh attempt for these RRsets has eventually timed out, the stale-refresh-time window is started, so that subsequent client queries receive the stale response immediately.
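
The difference between positive and negative stale data can be summarized in a small decision function. This is a simplified sketch: the trigger names and the stale_answer_allowed function are labels invented for the example, and the treatment of positive data on stale-answer-client-timeout is inferred from the paragraph above rather than stated in it.

```python
# Hypothetical labels for the events that can lead to a stale answer.
TRIGGERS = (
    "stale-answer-client-timeout",  # client-facing timer fired before the fetch finished
    "resolver-timeout",             # the refresh attempt itself timed out
    "stale-refresh-window",         # query arrived inside an open stale-refresh-time window
)


def stale_answer_allowed(is_negative: bool, trigger: str) -> bool:
    """Return True if a stale RRset may be used as the answer for this trigger."""
    if trigger not in TRIGGERS:
        raise ValueError(f"unknown trigger: {trigger}")
    # Stale NXDOMAIN/NXRRSET is never returned early on the client timeout.
    if is_negative and trigger == "stale-answer-client-timeout":
        return False
    return True
```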

Fetch-limits

Fetch-limits include the fetches-per-server and fetches-per-zone quota mechanisms.

The action taken when a query exceeds any of the fetch-limits is not to process the query (that is, not to initiate any new 'fetch' to obtain an answer to send to the client).

The response to the client when such a query is dropped varies depending on the fetch-limit triggered, as follows:
- fetches-per-server: the default action is to return a SERVFAIL to the client.
- fetches-per-zone: no responses are sent to the client; the client observes this as a timeout.

It is possible to change the client response behavior for both fetches-per-zone and fetches-per-server options in named.conf.
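
The default responses described above can be modeled as follows. This is only a sketch: the "fail" and "drop" action names follow the keywords that named.conf accepts after the quota value for these options, while the client_response function and its parameters are hypothetical.

```python
from typing import Optional

# Default action per fetch-limit, as described in the text above.
DEFAULT_ACTION = {
    "fetches-per-server": "fail",  # answer with SERVFAIL
    "fetches-per-zone": "drop",    # send nothing; the client sees a timeout
}


def client_response(limit: str, configured_action: Optional[str] = None) -> Optional[str]:
    """Return the response sent to the client, or None if nothing is sent."""
    action = configured_action or DEFAULT_ACTION[limit]
    if action not in ("fail", "drop"):
        raise ValueError(f"unknown action: {action}")
    return "SERVFAIL" if action == "fail" else None
```

Passing configured_action overrides the default, which mirrors changing the client response behavior in named.conf.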

Prefetch

Prefetching takes place in the late stage of processing a client query, in the response-building phase; more specifically, it occurs during execution of the following functions:

query_respond_any - Build the response for a query for type ANY.
query_addanswer - Fill the ANSWER section of a positive response.
query_cname - Handle CNAME responses.
query_dname - Handle DNAME responses.

Prefetching code performs some quota verification, in the following order:

1. Check if the recursive-clients quota is below the soft clients value. If yes, prefetch attaches to the recursive-clients quota.
2. If there is a fetch context already created for <qname,qtype,qclass> (let's call it curr_fctx), then:
- Let fctx_num_clients = number of clients currently associated with that fetch context.
- If current client address matches one of the addresses currently associated with curr_fctx, drop prefetch and log the query as duplicated.
- Else, if current client address doesn't match any of the addresses currently associated with curr_fctx, then check if fctx_num_clients is less than the current auto-tuned value for 'clients-per-query'; if the check fails, drop the prefetch.
- If none of the checks above abort prefetching, attach to curr_fctx and proceed.
3. If the current number of fetches for the target domain is greater than or equal to the value of fetches-per-zone, then drop the fetch.
4. If the number of current queries exceeds max-recursion-queries, then drop the fetch.
5. Finally, prefetch tries to find a server address on which to send the query, one that isn't over quota, i.e. a server in which the number of current fetches targeted does not exceed the configured fetches-per-server limit.
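
The order of these checks can be sketched as a single function. This is a simplified model under stated assumptions, not the resolver code: the Quotas and FetchContext classes, their fields, and prefetch_check are hypothetical. The sketch also assumes that a prefetch is dropped when the recursive-clients soft limit is reached, that a prefetch attached to an existing fetch context still passes through the later checks, and that failing to find a server within quota results in a drop.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class FetchContext:
    # Addresses of clients already attached to this <qname,qtype,qclass> fetch.
    client_addrs: Set[str] = field(default_factory=set)


@dataclass
class Quotas:
    recursive_clients: int      # current number of recursive clients
    soft_clients: int           # soft limit for recursive-clients
    clients_per_query: int      # current auto-tuned clients-per-query value
    zone_fetches: int           # current fetches for the target domain
    fetches_per_zone: int
    queries_sent: int           # current number of queries for this fetch
    max_recursion_queries: int
    server_available: bool      # True if some server is within fetches-per-server


def prefetch_check(q: Quotas, client: str,
                   curr_fctx: Optional[FetchContext]) -> str:
    """Apply the quota checks in the order listed above."""
    if q.recursive_clients >= q.soft_clients:
        return "drop: recursive-clients"
    if curr_fctx is not None:
        if client in curr_fctx.client_addrs:
            return "drop: duplicate"
        if len(curr_fctx.client_addrs) >= q.clients_per_query:
            return "drop: clients-per-query"
        curr_fctx.client_addrs.add(client)  # attach to the existing fetch
    if q.zone_fetches >= q.fetches_per_zone:
        return "drop: fetches-per-zone"
    if q.queries_sent > q.max_recursion_queries:
        return "drop: max-recursion-queries"
    if not q.server_available:
        return "drop: fetches-per-server"
    return "proceed"
```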

The impact of fetch-limits

How does prefetch interact with fetch-limits?
Prefetch is dropped if either the fetches-per-server or the fetches-per-zone quota is reached.
It is also dropped if any of the following quotas are reached:
- recursive-clients
- clients-per-query (actually, the value used is a self-adjusted one between clients-per-query and max-clients-per-query).
- max-recursion-queries

How does serve-stale interact with fetch-limits when serving of stale answers has been enabled?
- If there is eligible stale content with an active stale-refresh-time window, then no fetch is initiated and the stale answer will be served to the client.
- When a fetch is dropped due to fetch-limits, then before sending SERVFAIL or DROP (depending on what's configured in fetch-limits), we'll look to see if there is stale data we could respond to the client query with instead.
- Because there is no fetch initiated when a query triggers fetch-limits, although we can respond to the client using eligible (within max-stale-ttl) stale data, we will not start a new stale-refresh-time for the stale data we use. A stale-refresh-time window is only opened when a refresh attempt has timed out.

Q. What is the logic path if content has expired and a client query comes in that would normally trigger a fetch (which ought to fail and lead to the content being marked for serving stale), but that fetch never happens because it is dropped due to fetch-limits?
A. If the content (RRset) has expired and a query comes in asking for it, then, assuming the RRset is not yet marked as stale and stale-cache-enable is yes, the following steps take place:

1. A cache lookup is made, but expired entries are ignored (they are now marked as stale).
2. A fetch is initiated which is dropped due to fetch-limits.
3. A new cache lookup is made, now including stale entries. A response is sent to the client with the stale answer (if available).
4. If there is no stale entry, a response is sent to the client (or not), depending on which fetch-limit was triggered (see the behavior described in the beginning of this document).
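
The steps above can be condensed into a short sketch. It assumes the default fetch-limit actions (SERVFAIL for fetches-per-server, no response for fetches-per-zone); the Outcome type and the answer_when_fetch_limited function are hypothetical names used only for illustration.

```python
from typing import NamedTuple, Optional


class Outcome(NamedTuple):
    response: Optional[str]           # None means no response is sent (client times out)
    opens_stale_refresh_window: bool  # whether a stale-refresh-time window starts


def answer_when_fetch_limited(stale_answer: Optional[str], limit: str) -> Outcome:
    """Model the path taken once the fetch has been dropped due to fetch-limits."""
    if limit not in ("fetches-per-server", "fetches-per-zone"):
        raise ValueError(f"unknown fetch-limit: {limit}")
    # The second cache lookup includes stale entries.
    if stale_answer is not None:
        # The fetch was never attempted, so no refresh failure is recorded
        # and no stale-refresh-time window is opened.
        return Outcome(stale_answer, False)
    # No stale entry: fall back to the default action of the triggered limit.
    if limit == "fetches-per-server":
        return Outcome("SERVFAIL", False)
    return Outcome(None, False)
```

In every branch the second field is False, which reflects that a query dropped due to fetch-limits never starts a stale-refresh-time window.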

Q. For a query dropped in this situation, does BIND initiate a stale-refresh-time window for this RRset?
A. A query dropped due to fetch-limits won't activate stale-refresh-time, as this is not considered a real failure in contacting the name servers in an attempt to refresh the given RRset.
- Although stale-answer-client-timeout will not be initiated when content cannot be refreshed due to fetch-limits, if there is eligible stale data, clients will still receive a prompt response using those stale cached RRsets.

Update

This KB has been updated to reflect the behavior in BIND 9.17.11, 9.16.13, and 9.16.13-S1. For more details of the change, see the GitLab issue.
