Why is rndc dumpdb very slow (taking an unexpectedly long time to complete)?
  • 24 Jun 2021

The control command rndc dumpdb -all causes named to write the contents of its cache and authoritative zones from memory to a text file (named named_dump.db by default).

Ordinarily this takes place very quickly (a matter of seconds), but we have reports that on some servers it can take several minutes.
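Because the symptom is elapsed time, it helps to measure it the same way on every server being compared. The Python sketch below is one possible way to do that, not an ISC tool. It rests on an assumption this article does not state: that named ends a finished dump file with a "; Dump complete" line, which you should confirm against a dump from your own server. The names dump_is_complete and wait_for_dump are illustrative, and the path to the dump file is whatever your configuration uses.

```python
import os
import time

# Assumption: named ends a finished dump file with this marker line.
DUMP_TRAILER = b"; Dump complete"


def dump_is_complete(path, tail_bytes=256):
    """Return True if the file at `path` ends with the assumed trailer line."""
    try:
        with open(path, "rb") as f:
            f.seek(0, os.SEEK_END)
            size = f.tell()
            # Only the end of the file matters, so read a short tail.
            f.seek(max(0, size - tail_bytes))
            tail = f.read()
    except FileNotFoundError:
        return False
    return tail.rstrip().endswith(DUMP_TRAILER)


def wait_for_dump(path, timeout=600.0, poll_interval=0.5):
    """Poll until the dump looks complete.

    Returns the elapsed seconds, or None if `timeout` expires first.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        if dump_is_complete(path):
            return time.monotonic() - start
        time.sleep(poll_interval)
    return None
```

To use it, remove or rename any previous dump file (so that an old trailer is not mistaken for a new one), run rndc dumpdb -all, and then call wait_for_dump with the location of named_dump.db on that server. The result is only as precise as poll_interval.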

We do not yet understand the root cause of this symptom and have not been able to reproduce the problem in our own testing. One site that reported this anomaly to ISC was able to compare more than 20 servers hosting the same set of authoritative zones: some created named_dump.db quickly, and others very slowly. No significant pattern could be identified despite a thorough check of all hardware, firmware, OS, kernel, patch, and other versions.

The dump process operates via an internal 'iterator' that breaks the task up into small quanta, so that it can share processing resources with other named activities. Further debugging demonstrated that on the 'slow' systems, named performed the dump using many small quanta, whereas the 'fast' systems used far fewer, larger quanta. The cause of this difference has not been identified. Our working hypothesis is that each time the dump iterator starts up, it dumps only a small number of nodes and then pauses again, for a reason we have not yet determined.
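The effect of quantum size on total dump time can be illustrated with a toy model. The Python sketch below is not BIND's iterator and does not reflect its internals; simulate_dump, its parameters, and every number in the usage note are hypothetical. It only assumes that each node costs a fixed amount of time to write and that each pause between quanta costs a fixed amount of time before the iterator resumes.

```python
def simulate_dump(total_nodes, nodes_per_quantum, node_cost, pause_cost):
    """Toy model of a quantum-based dump.

    Returns (quanta_used, total_seconds). All costs are in seconds and
    are illustrative values, not measurements from named.
    """
    if nodes_per_quantum <= 0:
        raise ValueError("nodes_per_quantum must be positive")
    quanta = 0
    elapsed = 0.0
    remaining = total_nodes
    while remaining > 0:
        batch = min(nodes_per_quantum, remaining)
        elapsed += batch * node_cost  # time spent writing this batch of nodes
        remaining -= batch
        quanta += 1
        if remaining > 0:
            elapsed += pause_cost  # time yielded to other work before resuming
    return quanta, elapsed
```

With one million nodes, a node cost of one microsecond, and a pause cost of 10 milliseconds, a quantum of 10,000 nodes finishes in 100 quanta and about 2 seconds, while a quantum of 50 nodes needs 20,000 quanta and about 200 seconds. The time spent writing nodes is the same in both cases; the difference comes entirely from the pauses, which is consistent with dumps that take seconds on some servers and minutes on others.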

We were unable to continue our investigation far enough to identify (and fix) the underlying root cause. The next step in troubleshooting this behavior would be to add instrumentation and diagnostic logging to the dump iterator.

If you have encountered unexpectedly slow authoritative/cache server dumping and wish to submit a report, you can use our online form.

We welcome any new information or insight into this problem, particularly if you have a server that reliably demonstrates this issue and are in a position to work with us to add further diagnostic instrumentation to BIND. Even if you are unable to provide any new evidence, submitting a bug report allows us to add you to the list of those experiencing this issue.