BIND Memory Consumption Explained
  • 18 Jan 2023


NB: a more recent blog post on this issue, which includes some memory measurements, has been published on the ISC web site.

Overview

BIND users upgrading from BIND 9.11 versions to BIND 9.16 may notice increased memory consumption. This article explains in detail how BIND allocates memory in 9.16 and in 9.17/9.18, and how refactoring resulted in increased memory consumption in BIND 9.16 up to and including 9.16.24. There is a change in BIND 9.18.0, partly backported to 9.16.25, that reduces BIND's memory consumption to levels similar to those of 9.11.

BIND 9 (up to and including 9.16.24)

BIND has (and still has in BIND 9.16) its own internal memory manager/allocator (isc_mctx) for obtaining, freeing, and tracking in-use memory.

Supplementing this, BIND also has a facility (isc_mempool) for managing pools of available memory chunks of different sizes that have been requested from the OS. Tasks/processes in BIND that frequently need to use chunks of memory of a specific size will interface with isc_mempool and ask it to create dedicated pools of memory chunks of the specific size that they need. The memory pool manager uses isc_mctx to pre-populate the pools under its care with sufficient chunks of memory that they can be available quickly to those tasks/processes (and without blocking) when needed. The memory pool manager takes care of both topping up the pools when demand is high, and releasing memory chunks back to the memory allocator (isc_mctx) when demand wanes.

In other words, the memory pool manager was created to add a layer of memory chunk buffering so that named processes/tasks with a high rate of demand for memory are not slowed down by the back-end interaction between named and the OS in handling memory requests and releases.

There are two types of memory pool: those that require locks, and those that do not.

Tasks/processes will return memory when they no longer need it - either to isc_mempool (if this is how they obtained it in the first place) or directly to isc_mctx. Memory that is returned directly to the memory manager (isc_mctx) is eligible to be released back to the OS. Memory that is returned to isc_mempool goes back to its matching managed pool, but these pools also have a limit on how many chunks they are allowed to hold; anything beyond that limit is returned to the memory manager (isc_mctx) to give back to the OS.
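
To make the get/put cycle above more concrete, here is a minimal standalone C sketch of the pooling idea. It is illustrative only - it is not BIND's isc_mempool code or API, and all of the names (pool_t, pool_get, pool_put, fillcount, freemax) are invented for this example. In BIND, the top-up and release steps would go through isc_mctx rather than calling malloc() and free() directly, and a "locked" pool would additionally protect pool_get()/pool_put() with a mutex so that multiple threads could share it.

    #include <stdlib.h>

    /*
     * Illustrative sketch only - not BIND's isc_mempool.  A pool hands out
     * fixed-size chunks from a free list, tops the list up from the general
     * allocator when it runs dry ("fillcount"), and only releases chunks
     * back to the allocator once the free list is over its limit ("freemax").
     * The chunk size must be at least sizeof(chunk_t).
     */
    typedef struct chunk {
            struct chunk *next;
    } chunk_t;

    typedef struct pool {
            size_t size;        /* size of every chunk in this pool       */
            size_t fillcount;   /* how many chunks to add on a top-up     */
            size_t freemax;     /* most chunks the free list may hold     */
            size_t nfree;       /* chunks currently on the free list      */
            chunk_t *freelist;
    } pool_t;

    static void *
    pool_get(pool_t *p) {
            if (p->freelist == NULL) {
                    /* Pool is empty: top it up from the general allocator. */
                    for (size_t i = 0; i < p->fillcount; i++) {
                            chunk_t *c = malloc(p->size);
                            if (c == NULL)
                                    break;
                            c->next = p->freelist;
                            p->freelist = c;
                            p->nfree++;
                    }
            }
            chunk_t *c = p->freelist;
            if (c != NULL) {
                    p->freelist = c->next;
                    p->nfree--;
            }
            return (c);
    }

    static void
    pool_put(pool_t *p, void *mem) {
            if (p->nfree >= p->freemax) {
                    /* The pool already holds its limit: release the chunk. */
                    free(mem);
                    return;
            }
            chunk_t *c = mem;
            c->next = p->freelist;
            p->freelist = c;
            p->nfree++;
    }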

Finally, the BIND internal memory manager also maintains (for itself) a small blocks cache behind the scenes. It holds small chunks of free memory - anything less than 1100 bytes. Maintaining a small blocks cache is a performance technique for buffering small memory blocks between the memory allocator and the OS.

Unlike larger chunks of memory that are handed back to it (from isc_mempool or directly from named processes/tasks), the internal memory manager/allocator never returns small chunks of memory to the OS; they remain in the small blocks cache in perpetuity, even if the cache has become larger than needed due to a short-lived increase in named's memory needs.
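
As a rough illustration of why that matters, here is another standalone sketch (again, not BIND's implementation) of a small blocks cache: freed blocks below the threshold are parked on per-size-class free lists and reused later, but they are never handed back to the OS, so the cache only ever grows to its historical peak.

    #include <stdlib.h>

    /*
     * Illustrative sketch only - not BIND's code.  Small freed blocks are
     * kept on per-size-class free lists and reused; they are never returned
     * to the OS, so the cache stays at its peak size.
     */
    #define SMALL_LIMIT 1100    /* "small" threshold described above   */
    #define ALIGN       16      /* size classes in 16-byte steps       */
    #define NBUCKETS    ((SMALL_LIMIT + ALIGN - 1) / ALIGN + 1)

    typedef struct block { struct block *next; } block_t;

    static block_t *small_cache[NBUCKETS];

    static size_t
    size_class(size_t size) {
            return ((size + ALIGN - 1) / ALIGN);    /* >= 1 for size >= 1 */
    }

    static void *
    small_get(size_t size) {
            if (size == 0 || size >= SMALL_LIMIT)
                    return (malloc(size));          /* not cached */
            size_t i = size_class(size);
            if (small_cache[i] != NULL) {           /* reuse a cached block */
                    block_t *b = small_cache[i];
                    small_cache[i] = b->next;
                    return (b);
            }
            return (malloc(i * ALIGN));             /* round up to the class */
    }

    static void
    small_put(void *mem, size_t size) {
            if (size == 0 || size >= SMALL_LIMIT) {
                    free(mem);                      /* large: back to the OS */
                    return;
            }
            block_t *b = mem;                       /* small: cache it forever */
            b->next = small_cache[size_class(size)];
            small_cache[size_class(size)] = b;
    }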

BIND 9.18 onwards

Historically, BIND used its own internal memory management because this was the most effective way to handle its fast-moving memory needs. Creating it significantly improved BIND's performance. Of particular importance was the unlocked isc_mempool feature, which prevented named from blocking when it needed to construct queries and query responses (used by dns_message).

But since BIND was written, system memory allocators have improved to the extent that, for most of BIND's memory needs, there is no difference between named using an OS memory allocator and using its own. The one exception is dns_message, where using an unlocked isc_mempool does deliver faster throughput.

dns_message is the set of procedures responsible for managing DNS queries and responses.
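
The following hypothetical snippet, which reuses the pool sketch from earlier, illustrates why an unlocked pool helps here. The handle_query() function and the 512-byte chunk size are invented for this example and are not part of BIND; the point is simply that each worker thread keeps its own pool, so constructing and releasing a message never takes a lock and rarely touches the general allocator.

    /*
     * Hypothetical usage of the pool sketch above (not BIND's dns_message
     * code).  Each worker thread keeps its own unlocked pool of
     * message-sized chunks; the 512-byte size is arbitrary.
     */
    static _Thread_local pool_t msg_pool = {
            .size = 512, .fillcount = 32, .freemax = 64
    };

    static void
    handle_query(void) {
            void *msg = pool_get(&msg_pool);   /* no lock, usually no malloc */
            if (msg == NULL)
                    return;
            /* ... parse the query and build the response in msg ... */
            pool_put(&msg_pool, msg);          /* back to this thread's pool */
    }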

Therefore in BIND 9.17 (development version, becoming the 9.18 stable version) we have entirely removed the internal memory manager/allocator in favour of relying on an OS-provided solution. isc_mctx remains in the BIND code, but solely as a wrapper around an external allocator, and to keep track of a few more things than the external memory manager provides on its own. External memory managers have their own built-in caching strategies (similar to BIND's internal small blocks cache), so BIND's own small blocks cache is no longer needed and has been removed.
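
For a sense of what "a wrapper around an external allocator" means in practice, here is a minimal hedged sketch (not the real isc_mctx): allocation is delegated to whichever allocator is linked in, and the wrapper only adds bookkeeping such as the number of bytes currently in use.

    #include <stdatomic.h>
    #include <stdlib.h>

    /*
     * Illustrative sketch only - not BIND's isc_mctx.  The wrapper adds
     * statistics on top of whatever allocator is linked in (the system
     * malloc, or jemalloc when it is linked/preloaded).
     */
    typedef struct memctx {
            atomic_size_t inuse;   /* bytes currently allocated via this context */
    } memctx_t;

    static void *
    memctx_get(memctx_t *ctx, size_t size) {
            void *p = malloc(size);
            if (p != NULL)
                    atomic_fetch_add(&ctx->inuse, size);
            return (p);
    }

    static void
    memctx_put(memctx_t *ctx, void *p, size_t size) {
            free(p);
            atomic_fetch_sub(&ctx->inuse, size);
    }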

The isc_mempool feature now only supports unlocked memory pools - this is because locked memory pools don't provide any performance advantage over relying on a modern external memory manager to handle memory requirements.

BIND 9.16.25 and newer

We've back-ported most (but not all) of the BIND 9.17 memory management changes to BIND 9.16.25 (January 2022), as follows:

  • BIND now defaults to using an external memory manager (this can be reverted by running with -M internal, although this is not recommended).

  • Any BIND processes/tasks that had previously been using locked mempools will now find themselves using the memory manager/allocator directly (as mentioned already, locked mempools provide no performance advantage on modern OSs).

  • There are some tuning adjustments to the unlocked mempools that are still being used in BIND 9.16 - another optimisation for memory consumption/performance.

  • Even if BIND is started with the -M internal option, the isc_mempool will no longer use the small blocks cache native to the internal memory allocator.

We do not recommend using BIND 9.16.25 and newer with the -M internal option

This is not recommended because the code changes remove the features that both enhanced the performance of the internal memory manager and caused it to persistently consume more memory.

Why did BIND 9.16 (up to 9.16.24) use more memory than BIND 9.11?

Summary:

BIND's memory management did not change between 9.11 and 9.16, but the use of specific types of memory allocation increased as a result of the refactoring of the network sockets code (the introduction of libuv and netmgr threads), so the existing behaviour of the memory management has a greater impact on named's memory consumption and retention.

Specifically, it is the small blocks cache feature of BIND's internal memory allocator that could now consume more memory than before. This (as explained above in more detail) is a pool of reusable small memory chunks that is topped up as needed (for example if there is a spike in usage), but whose contents are never returned to the OS, even when demand for blocks from this pool has decreased.

In addition to the refactoring changes that affect memory requirements, BIND 9.16 can read inbound queries more efficiently/faster, and as a result can now make greater use of memory for buffering in-progress queries, particularly if there is a spike in in-progress queries, for example on busy servers where other activities such as zone updates temporarily defer processing of inbound queries. (Typically this would be observed by externally monitoring RTT patterns. Our expectation is that the overall/average RTT is lower, but that the intermittent exceptions are further from the mean, although still reasonable response times.)

More detail:

In older versions of BIND (9.14 and older) the persistence of the small blocks cache was mostly 'seen' as the rise in cache memory consumption on resolvers as cache content increased and then plateaued after start-up, and in the step increases during inbound AXFRs when two versions of a zone were temporarily held in memory.

In BIND 9.16 however, as a result of the re-architecting work causing named to use memory very differently, we are seeing a greater use of small memory allocations - and this is what is causing named to obtain, and then not release, more memory than on earlier versions of BIND: it is running with a much bigger small blocks cache than it did before.

Also, for some users of BIND, the larger inbound socket buffers introduced with the new default build option --with-tuning=large will have increased memory consumption, although in most cases this would be less significant than the increased use of the small blocks cache. Servers that have a large number of server (listening) interfaces/sockets may, however, see a more noticeable effect from --with-tuning defaulting to large instead of small.

The greater memory consumption by BIND 9.16 is also influenced by the number of CPUs that named detects on start-up or that is specified as a run-time option. Smaller installations with fewer CPUs and lower query throughput won't encounter such a noticeable difference.

And finally, running with higher throughput capacity and with additional features will inevitably increase the memory consumption of server software - this should be anticipated and considered when upgrading any software (not just BIND).

What do we recommend to users of BIND 9.16?

We recommend upgrading to BIND 9.16.25 (or 9.16.25-S for those entitled-to and using the subscriber-only edition) after it is released in January 2022.

If your server's memory consumption is greater than your server's capacity and is causing outages, then there are some mitigations that you can consider now for BIND 9.16.24:

  • Install the January code changes as a source code patch and rebuild BIND. The code patch can be obtained here:
    https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5637.patch

  • If it's not possible to upgrade (or to rebuild BIND with the patch between now and the release of 9.16.25), then you could instead consider one or more of these potential mitigations:
    -- reduce the number of working threads via the runtime -n option
    -- if using many server (listening) sockets, rebuild BIND with the ./configure option --with-tuning=small (particularly if you had not previously been building BIND with --with-tuning=large)

One last option (for all versions of BIND 9.16, including with the January changes) could be to link named with jemalloc. This external memory manager is much better at handling memory fragmentation and it also provides its own very efficient small object caches. It can be used either at compile time, by setting CFLAGS and LDFLAGS, or by pre-loading the library at runtime with LD_PRELOAD.

To make BIND 9.16 versions older than 9.16.25 use an external memory manager, you will also need to specify -M external at run-time.

Using -M external and jemalloc with older versions of BIND (9.16.24 and older) should be considered experimental and undertaken at your own risk

Changing memory use options must be considered experimental for older/un-patched versions of BIND because it configures named to use an external memory allocator without removing functionality that is already duplicated within BIND. The resulting interaction between un-patched BIND and an external memory manager may improve memory consumption, but at the cost of a reduction in performance. We recommend extensive testing before deployment on business-critical production systems.