What to do if your BIND or DHCP server has crashed
- Updated on 07 Sep 2018
If your BIND or DHCP server crashes (i.e. the daemon terminates unexpectedly), collecting the available evidence and submitting it to us is vital if we are to help diagnose the problem and provide a solution. Below is a list of files and information to collect after a crash. This includes the process core dump - if you don't have one, core dumps may need to be enabled on your system so that one can be collected if the event happens again.
The ulimit command handles this in many cases - but you should check that this is appropriate before trying it in your environment:
ulimit -c unlimited
Appropriate write permissions may also need to be set on the directory that the core file is to be written to. (Are you running chrooted? What user does your daemon process run as?) You can test whether core dumps are possible by using gcore or kill -6 (SIGABRT) against a process at a time when restarting it isn't going to impact production.
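As a rough illustration of the test described above, the sketch below aborts a throwaway sleep process instead of a production daemon. The function names are hypothetical, and the limit it reports applies only to the shell it runs in. A daemon started by an init system or service manager may run with different limits.

```shell
# Sketch only: core_limit and abort_test_process are illustrative names.

core_limit() {
    # Prints the soft core-size limit for this shell.
    # "0" means core dumps are disabled; "unlimited" means no size cap.
    ulimit -c
}

abort_test_process() {
    # Start a harmless process, send it SIGABRT (signal 6), and print the
    # exit status that the shell reports for it.
    sleep 60 >/dev/null 2>&1 &
    pid=$!
    kill -ABRT "$pid"
    status=0
    wait "$pid" || status=$?
    # 134 = 128 + 6, i.e. the process was terminated by SIGABRT.
    echo "$status"
}

# Example use:
#   core_limit
#   abort_test_process
#   ls -l core*      # then look for a new core file where you expect one
```

If the status is 134 but no core file appears, the core size limit, the directory permissions, or the system's core file naming settings are the likely places to look.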
You can check the details of enabling core files for your environment in its man pages.
Information and files to collect and preserve:
Please collect and preserve the details below (in particular the logs and any core files, which may be lost if not collected and copied/moved elsewhere right away). We may need some or all of them in order to diagnose the problem.
1. Note down and tell us what BIND, Kea, or DHCP was doing at the time - for example, is this a test or a production environment? If testing, how is the test being run, etc.? If in production, was there anything specific happening at the time such as configuration updates or similar maintenance activities?
2. Look for a core dump. Generally this will be in either the current working directory of the server program, or in the directory of the binary. Depending on how core dumping is managed on that system, it will simply be named 'core' or may have a more esoteric name - possibly involving the PID of the process that died. On MacOS, core files are in directory /core and suffixed with the pid of the aborting process (e.g. /core/core.
You can check where a core has come from by using the file command:
$ file core
core: ELF 32-bit MSB core file, SPARC, version 1 (SYSV), SVR4-style, from 'named'
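The search for a core file can be sketched as a small helper. This is only an illustration: find_cores is a hypothetical name, the directories you pass in are your own best guess, and the name patterns cover only a few common conventions.

```shell
# Sketch only: list candidate core files directly inside the given directories.

find_cores() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # Match 'core' plus common variants such as core.1234 or named.core.
        find "$dir" -maxdepth 1 -type f \
            \( -name 'core' -o -name 'core.*' -o -name '*.core' \) -print
    done
}

# Example use (directories are assumptions; substitute your own):
#   find_cores /var/named /usr/sbin | while read -r f; do file "$f"; done
```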
3. We will need the actual binary that generated the core in order to be able to read it on another machine.
Note: If the binary was built without any debug information - particularly if it doesn't have the procedure names available for stack traces - you can sometimes get around this by building a new binary with compile option -g from the same source code bundle. This implies all other build/configure options are the same as the original build, and that the compiler/linker/optimizer software hasn't been updated in the build environment since the original binary was created.
4. Collect any libraries that the binary loaded dynamically from the run-time environment that produced the core.
Use ldd for a first pass at what's needed (sometimes we need more libraries than ldd shows at first - they're only exposed when we try to read the core for the first time):
$ ldd /usr/sbin/named
linux-vdso.so.1 => (0x00007fff11d4c000)
libcrypto.so.0.9.8 => /usr/lib64/libcrypto.so.0.9.8 (0x00007f19d4d19000)
libc.so.6 => /lib64/libc.so.6 (0x00007f19d49c0000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f19d47bc000)
libz.so.1 => /lib64/libz.so.1 (0x00007f19d45a6000)
/lib64/ld-linux-x86-64.so.2 (0x00007f19d5096000)
There is no ldd command on MacOS. Instead, otool -L should provide the same functionality, for example:
$ otool -L /usr/local/sbin/named
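On Linux, the ldd output can be reduced to a plain list of library paths, which is easier to feed to an archiving tool. The following is a sketch under assumptions: ldd_paths is a hypothetical helper, it expects ldd output in the format shown above, and the tar options in the usage comment assume GNU tar.

```shell
# Sketch only: extract absolute library paths from ldd output on stdin.

ldd_paths() {
    # Lines of the form "name => /path (addr)" yield /path.
    # Lines of the form "/path (addr)" yield /path.
    # Entries with no path on disk (e.g. linux-vdso) are skipped.
    awk '$2 == "=>" && $3 ~ /^\// { print $3; next }
         $1 ~ /^\//               { print $1 }'
}

# Example use (assumes GNU tar; -h stores the files that symlinks point to):
#   ldd /usr/sbin/named | ldd_paths | tar -chf named-libs.tar -T -
```

This only captures what ldd reports, so it remains a first pass. Further libraries may still turn out to be needed once the core is opened.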
5. Note the environment information - OS and version, hardware, number of CPUs, memory size, BIND/DHCP/Kea version and configure/compile/link options. For BIND, use named -V; for DHCP, dhcpd --version; and for Kea, kea-dhcp4 -V.
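One way to gather the environment details in a single pass is sketched below. The function name env_report is hypothetical, the memory check assumes Linux (/proc/meminfo), and the version commands run only if the corresponding binary is found in PATH, which may not include the sbin directories on every system.

```shell
# Sketch only: print basic environment information to stdout.

env_report() {
    echo "== uname -a =="
    uname -a
    echo "== online CPUs =="
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo "unknown"
    echo "== memory =="
    if [ -r /proc/meminfo ]; then
        grep MemTotal /proc/meminfo || true    # Linux only
    else
        echo "unknown (no /proc/meminfo)"
    fi
    echo "== server versions =="
    # Version and build options for whichever servers are installed.
    if command -v named >/dev/null 2>&1; then named -V 2>&1; fi
    if command -v dhcpd >/dev/null 2>&1; then dhcpd --version 2>&1; fi
    if command -v kea-dhcp4 >/dev/null 2>&1; then kea-dhcp4 -V 2>&1; fi
}

# Example use:
#   env_report > environment.txt
```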
6. Configuration files are also needed:
- named.conf / dhcpd.conf / kea.conf (as appropriate) - key material can be obscured if you prefer
Use named-checkconf to obscure your key material from named.conf:
The named-checkconf tool has an option -p that outputs the configuration file in canonical format (after checking it). A second option -x obscures secrets by replacing them with strings of question marks ("?"). This means that the following command will parse and output your named.conf in a format suitable for sharing without exposing key material:
$ named-checkconf -px
The filename to be checked defaults to /etc/named.conf; to explicitly apply named-checkconf to a configuration file in another location, use the format:
$ named-checkconf -px filename
- any include files that are declared in the main configuration files (look for nested includes too!)
- leases file (dhcpd/Kea only, or possibly a database dump for Kea, see the kea-admin(8) manual)
- zone data files (named only) - although these may not be needed initially - please ask!
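To help track down include files, a simple pattern match can list the paths named in include statements. This is a sketch only: list_includes is a hypothetical helper, it assumes each statement sits on its own line in the form include "path"; and it does not understand block comments or resolve relative paths. For nested includes, run it again on each file it reports.

```shell
# Sketch only: print the quoted path from each include statement in a file.

list_includes() {
    # Matches lines that begin (after optional whitespace) with: include "path"
    # Lines commented out with // or # do not match.
    sed -n 's/^[[:space:]]*include[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Example use (path is an assumption; substitute your own):
#   list_includes /etc/named.conf
```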
7. Gather all the relevant logfiles leading up to and covering the period of the incident.
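Because logs and core files can be rotated away or overwritten, it may help to copy them somewhere safe as soon as possible. The sketch below shows one way to do that. The function name preserve_evidence and the paths in the usage comment are illustrative assumptions, not locations your system necessarily uses.

```shell
# Sketch only: copy files into a holding directory, keeping their timestamps.

preserve_evidence() {
    # Usage: preserve_evidence DEST_DIR FILE...
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ -e "$f" ]; then
            cp -p "$f" "$dest"/    # -p keeps modification times and permissions
        else
            echo "missing: $f" >&2
        fi
    done
}

# Example use (all paths are assumptions; substitute your own):
#   preserve_evidence /var/tmp/crash-evidence /var/log/messages /var/named/core
```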
8. Think about what might have changed in your environment prior to experiencing this problem. For example:
- Software upgrade (your BIND, Kea, or ISC DHCP software)
- Operating system patches
- Configuration changes (OS or product related)
- Loading changes - additional clients for example
- Networking changes - server moves, network topology changes etc.
- Firewall updates or configuration changes
- Seasonal usage changes
9. If the failures are recurring, is there any pattern to them? For example:
- Same time of week/day/hour each time (what else is going on at this time?)
- Always occur at server load peaks
- Related to a specific server operation such as a zone transfer (named) or OMAPI update (dhcpd)
- Always occur x days/hours after the last restart
- Seemingly completely random
- Increasing or decreasing in frequency
What to include when reporting the problem:
If you are submitting a bug report or support ticket, it is vital that you collect and preserve as much information and as many as possible of the files requested above. It may not be possible to diagnose the cause of the problem if this level of information is not available to us.
However, in the initial report, we prefer to receive just the basic details (and will let you know what else we need once we've reviewed them).
- Environment information (see 5. and 8.).
- BIND/DHCP/Kea server version (use "named -V" for BIND).
- Frequency and impact to your servers or production services of the incident(s) you have experienced or are continuing to experience.
- What named, dhcpd, or the relevant modular kea-* process was doing at the time (see 1.).
- Extracts from system and application logfiles leading up to and including the incident/crash.
- If at all possible, a debugger backtrace of the crash, generated on the server that produced it.
If you have a core file and gdb on the server that produced the core, you can obtain a backtrace (snapshot of what each thread was doing and the nested layers of procedure calls and data that was being used/accessed at the time) by launching gdb as follows:
$ <path-to>/gdb <binary(full path)> <core(full path)>
And then from gdb, type:
> thread apply all bt full
(Sometimes this will error and fail to complete - if that happens, retry it, omitting 'full')
Please include all of the output from gdb from when it is launched, not just the output from the thread apply all bt full command.
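If you would rather capture the whole gdb session to a file than copy it from a terminal, something along the following lines may work. It is a sketch under assumptions: capture_backtrace is a hypothetical wrapper, and it relies on gdb's -ex option to run the same commands you would otherwise type by hand. Check the resulting file before sending it, and fall back to an interactive session if the output looks incomplete.

```shell
# Sketch only: run gdb non-interactively and save everything it prints.

capture_backtrace() {
    # Usage: capture_backtrace BINARY CORE OUTPUT_FILE
    bin=$1
    core=$2
    out=$3
    if [ ! -r "$bin" ] || [ ! -r "$core" ]; then
        echo "cannot read binary or core file" >&2
        return 1
    fi
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb not found in PATH" >&2
        return 1
    fi
    # Pagination is switched off so gdb does not wait for a keypress.
    # stdin comes from /dev/null so gdb cannot block on a prompt.
    gdb -ex 'set pagination off' \
        -ex 'thread apply all bt full' \
        -ex 'quit' \
        "$bin" "$core" </dev/null >"$out" 2>&1
}

# Example use (paths are assumptions; substitute your own):
#   capture_backtrace /usr/sbin/named /var/named/core.1234 gdb-output.txt
```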
Note: MacOS users whose system has not generated a core file may still have access to useful crash data - go to Applications/Console and look at "User Diagnostic Reports" and you may see a report.
Before submitting a bug report, please first ensure that you are running a current version. Also, some packaged versions of BIND, Kea, and ISC DHCP might be built from source code that has been modified by the distributor. In those cases, while it may be useful to report the issue directly to ISC (particularly if it might be a potential security issue), we may not be able to diagnose the root cause of the problem if it cannot be reproduced with binaries built directly from source code downloaded from ISC.
To report a bug, please use our Bug Report Form (or see the Kea Known Issues List for Kea), or submit via email to email@example.com or to firstname.lastname@example.org. (At this time there is no way to submit a Kea bug report via email.)
If you have a support subscription with us, please report problems through your support channel first rather than filing a bug report.
Reporting security issues
DNS and DHCP security is critical to the Internet Infrastructure. If you think you may be seeing a potential security vulnerability (for example, a crash with a REQUIRE, INSIST, or ASSERT failure), please report it immediately to email@example.com and do not post it on the public mailing list. We provide numerous alternate ways to contact our security officer alias, including via the Bug Report Form (or see the Kea Known Issues List for Kea) and the ISC Contact form. Please also see our Security Vulnerability Disclosure Policy for details on how we publish security vulnerabilities.