Changes to serve-stale option stale-answer-client-timeout in BIND 9.18 and newer
  • 08 Dec 2023

For a broader and more detailed explanation of the BIND 9 serve-stale implementation, see: Serve-stale Implementation Details

What is the change that is being implemented in 9.18.22?

In BIND 9.18.22 and newer, ISC is planning to limit the accepted values for the serve-stale option stale-answer-client-timeout to 0 (zero) and disabled (off is an equivalent spelling).

How does stale-answer-client-timeout work prior to this change?

The stale-answer-client-timeout option was introduced so that operators could choose how long to wait for stale cache content to be refreshed before responding with a stale answer to a client query (but still allowing the ongoing cache refresh to proceed and complete or time out as normal).

Setting this option to 0 (zero) means that when a client query encounters a stale cache entry (one whose TTL has expired), named uses that entry immediately to answer the client, before attempting to refresh the content.

If set to a non-zero value, then the option defines the amount of time (in milliseconds) that named waits before attempting to answer the query with a stale RRset from cache.

Setting this option to disabled or off means that if a stale answer is found in cache, named continues the ongoing fetches, attempting to refresh the RRset in cache until the resolver-query-timeout interval is reached, and only then uses the stale RRset in a query response to the client. Subsequent client queries for the same record(s) use this stale content immediately until the stale-refresh-time window (default 30s) has expired, at which point the next client query has to wait for another stale content refresh attempt.

stale-answer-client-timeout is disabled by default

This option is off by default; off and disabled are equivalent settings. It also has no effect if stale-answer-enable is disabled.
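The three modes described above can be summarised in a small model. This is only an illustrative sketch of the behaviour as this article describes it, not code from named: the function name stale_answer_wait_ms and its parameters are hypothetical, and it covers only the first client query that meets a stale entry (it ignores the stale-refresh-time window).

```python
from typing import Optional, Union


def stale_answer_wait_ms(
    stale_answer_enable: bool,
    client_timeout: Union[int, str],
    resolver_query_timeout_ms: int,
) -> Optional[int]:
    """Return how long a query waits for a refresh before a stale answer is used.

    Hypothetical model of the pre-change behaviour described in the article.
    None means no stale answer is served at all.
    """
    if not stale_answer_enable:
        # The option has no effect when stale answers are disabled.
        return None
    if client_timeout in ("off", "disabled"):
        # Refresh first; fall back to stale data only when the resolver gives up.
        return resolver_query_timeout_ms
    if isinstance(client_timeout, int) and client_timeout >= 0:
        # 0 answers from stale data immediately; N > 0 waits N milliseconds.
        return client_timeout
    raise ValueError(f"unsupported value: {client_timeout!r}")


# Example: compare the three modes with a 10-second resolver timeout.
for setting in (0, 1800, "off"):
    print(setting, "->", stale_answer_wait_ms(True, setting, 10_000))
```

In this model only the middle case (a positive number of milliseconds) introduces a second timer alongside the resolver's own, which is the case the change removes.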

Why are we changing the accepted values for this option?

The internal processing paths within named that handle non-zero stale-answer-client-timeout values are significantly complex. Combined with the interaction between resolver client query handling and other features such as RPZ and Recursive Client Rate Limiting, this complexity has resulted in several issues where unexpected outcomes could occur, depending on the current state of the cache and the client queries being processed.

From what we have observed in configurations where stale-answer-enable has been enabled, stale-answer-client-timeout is either given a value of zero or disabled entirely. These two option values are significantly simpler to manage in source code: effectively they mean either 'answer from stale first, attempt to refresh content afterwards' or 'attempt to refresh first, then use stale content if this times out'. We would therefore like to remove the code complexity required to handle non-zero values.

How will this change be implemented?

In BIND 9.18.22 (ESV) and newer, non-zero values for stale-answer-client-timeout will be silently reduced to 0 (zero).

In BIND 9.19.20 (Development) and newer, and from BIND 9.20.0 (Stable, available Q1 2024), it will be a configuration error in named.conf to specify a non-zero value for stale-answer-client-timeout.
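The version-dependent handling above can be expressed as a short decision function. This is a sketch under the assumption that versions are compared as (major, minor, patch) tuples; the name effective_client_timeout is hypothetical, and the function reflects only the releases named in this article.

```python
from typing import Tuple, Union

Version = Tuple[int, int, int]


def effective_client_timeout(
    version: Version, configured: Union[int, str]
) -> Union[int, str]:
    """Value named would run with, per the rollout described in the article."""
    non_zero = isinstance(configured, int) and configured > 0
    if not non_zero:
        # 0, "off" and "disabled" are accepted unchanged in every release.
        return configured
    if version >= (9, 19, 20):
        # 9.19.20 and newer, including 9.20.0: rejected when the config is loaded.
        raise ValueError("non-zero stale-answer-client-timeout is a configuration error")
    if (9, 18, 22) <= version < (9, 19, 0):
        # 9.18.22 and newer in the 9.18 branch: silently reduced to zero.
        return 0
    # Older releases keep the configured value.
    return configured


print(effective_client_timeout((9, 18, 21), 1800))
print(effective_client_timeout((9, 18, 22), 1800))
```

The first call returns the configured value unchanged and the second returns 0, which is why a 9.18 upgrade can change behaviour without any warning in the logs or configuration check.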

Do I need to do anything?

You do not need to do anything if you are not using the serve-stale feature at all.

The default in BIND 9.18 is not to use stale cache, so if you do not have the stale-cache-enable option in named.conf, or it is disabled (stale-cache-enable no;), then you do not have a stale cache and this change should not affect you.

If you have stale cache enabled (stale-cache-enable yes;) in named.conf, then you are either running with stale answers enabled already (stale-answer-enable yes;) or you have stale answers disabled but could enable them at any time whilst named is running, using the command rndc serve-stale on. In this instance:

  • Operators planning to upgrade to BIND 9.18.22 or newer need to be aware that any non-zero value they have configured for stale-answer-client-timeout will be silently reduced to zero when running named.
  • Operators planning to upgrade to BIND 9.19.20 or 9.20.0 or newer need to review their named.conf file and edit stale-answer-client-timeout as needed, if they were previously using a non-zero value for this timeout.
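One way to carry out the review described in the list above is to scan the configuration text for non-zero settings before upgrading. The following is a rough sketch, not an ISC tool: the helper names and the SAMPLE fragment are illustrative, comment stripping is naive (a '#' or '//' inside a quoted string would be misread), and files pulled in with include statements are not followed.

```python
import re
from typing import List, Tuple

# Matches the option name followed by its value and the terminating semicolon.
OPTION_RE = re.compile(r"\bstale-answer-client-timeout\s+([^\s;]+)\s*;")


def strip_comments(text: str) -> str:
    """Remove /* */, // and # comments while keeping line breaks intact."""
    # Replace each block comment with its newlines so line numbers survive.
    text = re.sub(
        r"/\*.*?\*/",
        lambda m: "\n" * m.group(0).count("\n"),
        text,
        flags=re.DOTALL,
    )
    return re.sub(r"(//|#)[^\n]*", "", text)


def find_nonzero_timeouts(conf_text: str) -> List[Tuple[int, str]]:
    """Return (line number, value) for each setting other than 0, off or disabled."""
    clean = strip_comments(conf_text)
    hits = []
    for match in OPTION_RE.finditer(clean):
        value = match.group(1).strip('"')
        if value not in ("0", "off", "disabled"):
            lineno = clean.count("\n", 0, match.start()) + 1
            hits.append((lineno, value))
    return hits


# Illustrative named.conf fragment (an assumption, not taken from a real server).
SAMPLE = """\
options {
    stale-cache-enable yes;
    stale-answer-enable yes;
    // stale-answer-client-timeout 0;
    stale-answer-client-timeout 1800;
};
"""

print(find_nonzero_timeouts(SAMPLE))  # [(5, '1800')]
```

Any line reported here is one that BIND 9.18.22 would silently treat as 0 and that BIND 9.19.20 or 9.20.0 would reject, so it is worth changing to 0, off, or disabled before the upgrade.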

You can also check your serve-stale status using rndc

The command rndc serve-stale status reports whether caching and serving of stale answers is currently enabled or disabled.