What to do if your BIND, Kea DHCP, Stork, or ISC DHCP server has crashed
If your BIND 9, Kea DHCP, Stork, or ISC DHCP server crashes (i.e. the daemon terminates unexpectedly), collecting the available evidence and submitting it to us is vital if we are to help diagnose the problem and provide a solution. Below is a list of files/information to collect after a crash. This includes the process core dump - if you don't have one, core dumps may need to be enabled on your system so that one can be collected if the event happens again.
Things to check if a core file does not get generated:
- Do you have enough available disk space? Core files can be quite large.
- Do you have the correct permissions that allow a core file to be created in the process's working directory?
- Is there already a directory or file named "core" in the working directory? If a file named "core" already exists but has multiple hard links, the kernel will not dump a core file.
- What is the current file size limit?
- Have you used ulimit -a to display a list of current system limits and ulimit -c unlimited to temporarily set the "core file size" value to "unlimited"? Make sure you do not have ulimit -c 0 in any shell startup scripts, e.g., .bashrc. Other files that may change
- Have you verified that /etc/security/limits.conf enables core dumps globally?
- Are you running BIND as a non-root user? In most Linux distributions core file creation is disabled by default for non-root users. When running BIND as a non-root user (-u), use sysctl to allow core dumps: sysctl kern.sugid_coredump=1 (for Debian, use the
- Are you running chrooted? You can test whether or not core dumps are possible by using kill -6 (SIGABRT) against a process at a time when restarting it will not impact production.
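Several of the checks above can be gathered in one pass. The following is a minimal sketch, assuming a Linux host with a POSIX-style shell; check_core_prereqs is a hypothetical helper name, not an ISC tool, and the writability test reflects the user running the script rather than the user the daemon runs as.

```shell
# Report the settings that most often prevent a core file from being written.
# check_core_prereqs is a hypothetical helper; pass the daemon's working
# directory as the first argument (defaults to the current directory).
check_core_prereqs() {
    workdir="${1:-.}"

    # Soft limit on core file size for this shell; 0 means no core is written.
    echo "core file size limit (ulimit -c): $(ulimit -c)"

    # Free space, in KB, on the filesystem holding the working directory.
    df -Pk "$workdir" | awk 'NR == 2 { print "free space (KB): " $4 }'

    # Checked for the current user only, not the user the daemon runs as.
    if [ -w "$workdir" ]; then
        echo "working directory writable: yes"
    else
        echo "working directory writable: no"
    fi

    # An existing file or directory named "core" can block the dump.
    if [ -e "$workdir/core" ]; then
        echo "existing core entry: yes"
    else
        echo "existing core entry: no"
    fi

    # Linux only: shows where the kernel sends core dumps.
    if [ -r /proc/sys/kernel/core_pattern ]; then
        echo "core_pattern: $(cat /proc/sys/kernel/core_pattern)"
    fi
}
```

For example, check_core_prereqs /var/named reports on a working directory of /var/named (an assumed path; substitute your own).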
You can check the details of enabling core files for your environment via the man pages:
Information and files to collect and preserve
Please collect and preserve the details below (in particular the logs and any core files which may be lost if not collected and copied/moved elsewhere right away). We may need some or all of them to diagnose the problem.
Note down and tell us what BIND 9, Kea, Stork, or DHCP was doing at the time - for example, is this a test or a production environment? If testing, how is the test being run, etc.? If in production, was there anything specific happening at the time such as configuration updates or similar maintenance activities?
Look for a core dump. Generally this will be in either the current working directory of the server program, or in the directory of the binary. Depending on how core dumping is managed on that system, it will simply be named core, or it may have a more esoteric name, possibly involving the PID of the process that died.
You can check where a core has come from by using the file command:
$ file core
core: ELF 32-bit MSB core file, SPARC, version 1 (SYSV), SVR4-style, from 'named'
/core and suffixed with the PID of the aborting process (e.g.,
systemd-coredump is enabled by default, which automatically collects and stores core dumps when an application crashes. When systemd-coredump is enabled, you can list the collected core dumps by running coredumpctl list, and a core dump can be saved, if available, by running coredumpctl -o ./named.core dump /usr/sbin/named, where -o specifies the output file, and the last argument is the binary of the relevant application.
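The two coredumpctl invocations above can be wrapped in a small shell function. This is a sketch, assuming systemd-coredump is in use on the host; save_core is a hypothetical helper name, and the binary and output paths are placeholders.

```shell
# List and save the core dump that systemd-coredump holds for a binary.
# save_core is a hypothetical helper; both arguments are placeholders.
save_core() {
    binary="$1"
    output="$2"

    if ! command -v coredumpctl >/dev/null 2>&1; then
        echo "coredumpctl not found; systemd-coredump may not be in use" >&2
        return 1
    fi

    # Show what has been collected for this binary, then write the core out.
    coredumpctl list "$binary"
    coredumpctl -o "$output" dump "$binary"
}
```

For example, save_core /usr/sbin/named ./named.core mirrors the command given above.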
Send us the actual binary that generated the core; we need it in order to read the core on another machine.
Note: If the binary was built without any debug information - particularly if it doesn't have the procedure names available for stack traces - you can sometimes get around this by building a new binary with compile option -g from the same source code bundle. This assumes that all other build/configure options are the same as in the original build, and that the compiler/linker/optimizer software hasn't been updated in the build environment since the original binary was created.
Collect any libraries that the binary loaded dynamically from the run-time environment that produced the core.
Use ldd for a first pass on what's needed (sometimes we need more libs than ldd shows us at first; they're only exposed when we try to read the core for the first time - the most important lib that is usually omitted and that we have to request afterwards is
$ ldd /usr/sbin/named
linux-vdso.so.1 => (0x00007fff11d4c000)
libcrypto.so.0.9.8 => /usr/lib64/libcrypto.so.0.9.8 (0x00007f19d4d19000)
libc.so.6 => /lib64/libc.so.6 (0x00007f19d49c0000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f19d47bc000)
libz.so.1 => /lib64/libz.so.1 (0x00007f19d45a6000)
/lib64/ld-linux-x86-64.so.2 (0x00007f19d5096000)
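To gather the libraries that ldd lists, a short shell function along these lines may help. It is a sketch, assuming a Linux system with ldd and awk available; collect_libs is a hypothetical helper name and the destination directory is a placeholder. It copies only what ldd reports, so further libraries may still be requested later.

```shell
# Copy the shared libraries that ldd reports for a binary into one directory
# so they can be sent along with the core.
# collect_libs is a hypothetical helper; both arguments are placeholders.
collect_libs() {
    binary="$1"
    dest="$2"
    mkdir -p "$dest" || return 1

    # ldd prints either "name => /path (addr)" or "/path (addr)".
    # Keep whichever field is an absolute path; entries with no path on
    # disk (such as linux-vdso.so.1) are skipped.
    ldd "$binary" | awk '
        $2 == "=>" && $3 ~ /^\// { print $3; next }
        $1 ~ /^\//               { print $1 }
    ' | while read -r lib; do
        # -L follows symlinks so the real library file is copied.
        cp -L "$lib" "$dest/"
    done
}
```

For example, collect_libs /usr/sbin/named ./named-libs copies the libraries into ./named-libs, which can then be archived together with the binary and the core.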
There is no ldd command on macOS. Instead, otool -L should provide the same functionality; for example:
$ otool -L /usr/local/sbin/named
Note the environment information - OS and version, hardware, number of CPUs, memory size, BIND/DHCP/Kea version, and configure/compile/link options. For BIND, named -V; for DHCP, dhcpd --version; and for Kea,
Please also collect the following files:
- Configuration files: named.conf / dhcpd.conf / kea.conf (as appropriate). Key material can be obscured if you prefer.
The named-checkconf tool has an option -p that outputs the configuration file in canonical format (after checking it). A second option -x obscures secrets by replacing them with strings of question marks ("?"). This means that the following command parses and outputs your named.conf in a format suitable for sharing without exposing key material:
$ named-checkconf -px
/etc/named.conf; to explicitly apply named-checkconf to a configuration file in another location, use the following format:
$ named-checkconf -px filename
(If you normally run named chrooted, you may need additional permissions to run named-checkconf, and you should make sure you specify the full path to the named.conf file that is used when you launch named.)
- Any include files that are declared in the main configuration files (look for nested includes too!).
- The leases file (for ISC DHCP/Kea only, or possibly a database dump for Kea; see the kea-admin(8) manual page).
- Zone data files (for BIND only) - although these may not be needed initially, so please ask.
Gather all the relevant logfiles leading up to and covering the period of the incident.
Think about what might have changed in your environment prior to experiencing this problem. For example:
- Software upgrade (your BIND 9, Kea DHCP, Stork, or ISC DHCP software)
- Operating system patches
- Configuration changes (OS- or product-related)
- Loading changes - additional clients, for example
- Networking changes - server moves, network topology changes, etc.
- Firewall updates or configuration changes
- Seasonal usage changes
If the failures are recurring, is there any pattern to them? For example:
- Do they occur at the same time of week/day/hour each time? (If so, what else is going on at this time?)
- Do they always occur at server-loading peaks?
- Are they related to a specific server operation such as a zone transfer (BIND 9) or OMAPI update (ISC DHCP)?
- Do they always occur at a specific time (days/hours) since last restarting?
- Are they increasing or decreasing in frequency?
What to include when reporting the problem
If you are submitting a bug report or support ticket, it is vital that you collect and preserve as much of the information and as many of the files requested above as possible. It may not be possible to diagnose the cause of the problem if this level of information is not available to us.
However, in the initial report, we prefer just to receive the basic details, and will let you know what else we need once we've reviewed them. Please include these basics with any initial contact:
- Environment information (see items 5 and 8, above), including the BIND 9/ISC DHCP/Kea server version.
- Frequency and impact to your servers or production services of the incident(s) you have experienced or are continuing to experience.
- What named or dhcpd or the relevant modular kea-* process was doing at the time (see item 1 above).
- Extracts from system and application logfiles leading up to and including the incident/crash.
- If at all possible, a debugger backtrace from the crash from the server that produced it.
If you have a core file and gdb on the server that produced the core, you can obtain a backtrace (a snapshot of what each thread was doing and the nested layers of procedure calls and data that was being used/accessed at the time) by launching gdb as follows:
$ <path-to>/gdb <binary(full path)> <core(full path)>
And then from gdb:
> thread apply all bt full
(Sometimes this will return an error and fail to complete - if that happens, retry it, omitting 'full'.)
Please include all of the output from gdb from when it is launched, not just the output from the thread apply all bt full command.
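If you prefer not to work interactively, the backtrace can be captured to a file in one step. The following is a sketch, assuming a gdb that supports the -batch and -ex options; save_backtrace is a hypothetical helper name and all three arguments are placeholders.

```shell
# Write a full backtrace of every thread to a file without an interactive
# gdb session.
# save_backtrace is a hypothetical helper; all three arguments are placeholders.
save_backtrace() {
    binary="$1"
    core="$2"
    output="$3"

    # -batch makes gdb exit after running the -ex command. Both stdout and
    # stderr are redirected, so the messages gdb prints while loading the
    # core are captured along with the backtrace.
    gdb -batch -ex "thread apply all bt full" "$binary" "$core" > "$output" 2>&1
}
```

For example, save_backtrace /usr/sbin/named ./named.core ./backtrace.txt writes the result to ./backtrace.txt. If gdb reports an error, the same approach can be retried with bt in place of bt full.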
Note: macOS users whose system has not generated a core file may still have access to useful crash data: go to Applications/Console and look at "User Diagnostic Reports," and you may see a report.
Before submitting a bug report, please first ensure that you are running a current version. Also, some packaged versions of BIND 9, Kea DHCP, and ISC DHCP might be built with source code that has been modified by the distributor. In those cases, while it may be useful to report the issue directly to ISC (particularly if it might be a potential security issue), we may not be able to successfully diagnose the root cause of the problem if it cannot be reproduced with binaries that have been built directly from source code downloaded from ISC.
To report a bug, please use our Bug Report Form.
- BIND issues are maintained at gitlab.isc.org/isc-projects/bind9/-/issues.
- Kea issues are maintained at gitlab.isc.org/isc-projects/kea/-/issues.
- Stork issues are maintained at gitlab.isc.org/isc-projects/stork/-/issues.
- ISC DHCP issues are maintained at gitlab.isc.org/isc-projects/dhcp/-/issues.
ISC support customers:
If you have a support subscription with us, please contact us first through our support portal, rather than filing a bug report.
Reporting security issues
DNS and DHCP security are critical to the Internet infrastructure. If you think you may be seeing a potential security vulnerability (for example, a crash with a REQUIRE, INSIST, or ASSERT failure), please visit the Reporting Security Vulnerabilities page on our website for instructions; do not post about it on any public mailing list. Please also see our Security Vulnerability Disclosure Policy for details on how we publish security vulnerabilities.