Performance testing of recursive servers using queryperf
This is an article about testing recursive DNS server performance.
Testing recursive servers is intrinsically more challenging than testing authoritative server performance, because the server under test isn't providing the DNS query responses from authoritative zone data that it owns - it has to communicate with other servers and it also needs to manage and maintain its cache of zone data that it has previously learned.
- When testing an authoritative server, make sure that your server does not perform any recursion during the test - for advice on why it might perform recursion, see: Why does my authoritative-only nameserver try to query the root nameservers?
- When testing an authoritative server, make sure that your server does not access cache when constructing query responses - in named.conf:
- minimal-responses yes; (this is the default from BIND 9.12 onwards)
- additional-from-cache no; (this option is no longer needed and is deprecated from BIND 9.12 onwards)
Why measure Query Performance?
Many DNS administrators are interested in knowing what the peak capacity of their DNS resolvers is going to be so that they can adequately provision their DNS services. They might also want to deploy independent query rate-limiting tools to protect their servers in a situation where significantly more queries than normal are being received, and before the point at which they expect their resolvers to reach saturation point.
Therefore most query rate testing tools and techniques are stress-based, monitoring the responsiveness of the server under test and increasing the query load until the point that it starts to fail. The maximum QPS is usually defined as the highest query rate that the server can handle, above which the proportion of queries receiving no response starts to increase.
In addition to stress testing, we recommend that server operators also routinely measure the behavior of their servers under normal load so that they know what a typical range is for the proportion of cache hits, query failures (no response and/or SERVFAIL response), and NXDOMAINs returned by their servers. It can also be useful to monitor record types queried, cache contents, and other metrics. (For more information about the statistics and counters provided by BIND, see: Using BIND's XML statistics-channels.)
What tools are available for measuring query performance?
There are several tools available for measuring a DNS server's query performance. Most of them operate by sending a pre-determined stream of queries to the server under test. Some of them self-throttle, monitoring the server's responsiveness and limiting further queries when the tool determines (per its internal metrics) that the server is starting to falter. Different tools use different internal metrics to determine the target server's responsiveness, and thus the maximum query throughput determined by each may vary. The tuning and performance of the testing tool itself can also be critical to the results.
This article discusses the options and tuning of ISC's tool queryperf.
Other tools that the reader might like to explore for comparison include:
- DNSPerf and ResPerf from DNS-OARC: https://www.dns-oarc.net/tools/dnsperf
- perftcpdns - a performance testing tool for DNS over TCP: https://gitlab.isc.org/isc-projects/perftcpdns - not actively maintained
- queryperf++ - an independent open-source framework for testing DNS servers that uses both UDP and TCP: https://github.com/jinmei/queryperfpp - not actively maintained
Setting up a test environment
Testing of recursive server performance is sometimes considered to be an art rather than a science because of all the variables that need to be taken into account. For example:
Can you test your recursive servers using the Internet authoritative servers to resolve names that your test server doesn't have in cache? This is a realistic test, assuming that the query stream is also realistic, but it relies upon the availability of the Internet authoritative servers as well as the Internet infrastructure, much of which is outside of your control. But if you want to see how a server will perform 'in situ', then this is very likely going to be a good test set-up for you, particularly if you can compare different candidate servers and configurations that are situated in the exact same network location. There will be inevitable variations in throughput depending on time of day, fluctuations in network connectivity and throughput, and zone changes, but probably not too much variance (if there is, you may want to look more closely at your own infrastructure or your test query stream).
Should you test your recursive servers using other servers that are under your management to provide authoritative answers? This can be an effective test strategy for producing comparative results because all of the variables will be under your own control. The downfall of this technique is that in normal operation, a recursive server will be handling communication with very many different authoritative servers. Recursive servers can sometimes under-perform when handling only a small number of unique servers.
Is it best to test using a query-generating tool (queryperf being an example of one of these) and a pre-defined query stream? Using a query generator with a pre-defined query stream is good for producing comparative results in both scenarios listed above. It is, however, important to make sure that the query source is recent, or the test may result in a higher proportion of delays and SERVFAIL results from authoritative servers than would be seen in normal operation. If the query stream is too small and is repeated multiple times, this can also result in:
- A higher proportion of cache hits than normal skewing your test results
- Repeating many times the same queries to authoritative servers that respond with short or zero TTLs causing them to deploy rate-limiting techniques against your test server (also skewing your test results)
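One way to gauge whether a query stream is too small or too repetitive is to count how many distinct name/type pairs it contains before using it. The following is a minimal sketch, assuming the one-query-per-line "name type" layout used for queryperf input; the function name and the sample data are illustrative only, and lines starting with ';' or '#' are assumed here to be non-query lines.

```python
from collections import Counter

def query_stream_profile(lines):
    """Return (total, unique, top3) for 'name type' query lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip blank or malformed lines, and lines assumed to be
        # comments or directives rather than queries.
        if len(fields) < 2 or fields[0][0] in ";#":
            continue
        # DNS names are case-insensitive, so normalise before counting.
        key = (fields[0].lower().rstrip("."), fields[1].upper())
        counts[key] += 1
    total = sum(counts.values())
    return total, len(counts), counts.most_common(3)

# Illustrative data; in practice pass an open file object instead.
sample = ["example.com A", "EXAMPLE.com. a", "example.net AAAA", "", "example.org MX"]
total, unique, top = query_stream_profile(sample)
print(f"{total} queries, {unique} unique ({unique / total:.0%})")
```

A low proportion of unique queries suggests that the test will be dominated by cache hits and repeated queries to the same authoritative servers.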
What about testing by replicating a live query stream? One very effective method of testing servers ahead of production deployment (and testing new versions of software prior to upgrading) is to duplicate one or more live client query streams that are being handled by your existing DNS servers and direct this to the server under test. Typically you would create the duplicate query stream on the router or switch that is in-path between your clients and your active resolvers. (Please refer to your devices' manuals for instructions.) Depending on your infrastructure, you could also be duplicating the traffic between your network and the authoritative servers on the Internet. You also need to take into account the client replies that the test server will respond with and discard those!
Which is better - a 'cold start', or a 'running warm' test? In production, recursive servers save previously-learned answers in cache, which means that they can respond immediately (if asked the same question again) or don't have to make as many queries to Internet servers (if queried for names in domains that have been queried before). A realistic run-time performance test against a recursive server should be testing it 'warm' - that is, after running it under normal or test query load for a period. If, however, you want to measure the server's ability to handle pure recursion, then hitting a server that has been restarted, just prior to the test, with a stream of unique queries will achieve this. (Note, however, that in this test, full iteration from the root nameservers will not occur for every query, unless it can be contrived that each query lies in and below domains that have not previously been queried.)
Will enabling recursive client rate limiting (fetches-per-server and fetches-per-zone) affect the test results? It will depend on the value of the settings and the contents of the test query stream whether or not these limits are triggered. If they are triggered and your server is configured to drop rather than send back SERVFAIL, then this will almost certainly result in a lower QPS being recorded because the test tool will interpret the increase in missing responses as being due to your server starting to falter under the test load. On the other hand, measured QPS may be higher with recursive client rate limiting triggered and sending back SERVFAIL for otherwise unresolvable client queries.
How many times should I run the test? We recommend repeating the same test multiple times as you're unlikely to get identical results on each run. The differences will be due to many factors, including underlying operating system 'choices' during each run as well as variations in how named constructs and manages the cache during the test. We anticipate that the majority of your results will cluster closely together, but occasionally there will be outlying anomalies of higher or lower peak performance. ISC runs multiple iterations of in-house performance tests: https://www.isc.org/docs/bellis-oarc-perflab.pdf; https://www.isc.org/blogs/isc-performance-lab/.
Assuming that you're reading this article because you're planning to use the queryperf tool to test your server, here is some information and some guidelines specific to queryperf.
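To compare repeated runs, it can help to summarize them with a median and to flag runs that sit far from it. The following is a minimal sketch, assuming you have noted the reported queries-per-second figure from each run yourself; the function name, the 10% tolerance, and the figures are all made up for illustration.

```python
import statistics

def summarise_runs(qps_values, tolerance=0.10):
    """Return (median, outliers) for a list of per-run qps results.

    A run counts as an outlier when it differs from the median by
    more than `tolerance` (a fraction of the median).
    """
    median = statistics.median(qps_values)
    outliers = [v for v in qps_values if abs(v - median) > tolerance * median]
    return median, outliers

# Hypothetical results from five runs of the same test.
runs = [41200.0, 40850.0, 41510.0, 33900.0, 41030.0]
median, outliers = summarise_runs(runs)
print(f"median {median:.0f} qps, outliers: {outliers}")
```

The median is used rather than the mean so that a single anomalous run does not shift the summary figure.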
Some notes on how queryperf operates
Note that most performance testing tools including queryperf are self-throttling. We could write a tool that sends queries as fast as it can get packets onto the wire, but to reasonably assess the performance of the target server, this tool will still need to match what it sends with what it gets back. This involves local 'keeping track'. Like other performance testing tools, queryperf has to maintain 'state' and counters.
To run an effective queryperf test, therefore, you need the machine sending the queries to be fast enough itself to manage both the sending and the tallying of responses (this is not as CPU-intensive as being a nameserver, but it is still significant), and to have good network bandwidth. If you're firing queries from a slow box and/or you have network latency issues between the two boxes, the test results will be poor.
Internally, with each send, queryperf puts the query in a status buffer for 'outstanding queries'. The number of outstanding queries is limited; by default it is 20 - you can increase this using the -q option - but be aware that this also increases how many sends there are before queryperf pays attention again to inbound responses.
When the test starts, queryperf sends until it reaches the end of the sample file or it fills up the outstanding queries array (thus by default in bursts of 20). Then it processes any replies received (which should empty the array some) followed by 'retiring' any queries that haven't received any response and are now too old. 'Too old' is by default > 5 seconds (you can tweak this one too using the -t parameter).
There's an extra logic piece in here too - while it's working, queryperf calculates the ongoing qps rate and if you have set a target, then the 'retirement' logic will early-retire old entries if it's not reaching the target because it can't send, because the status array is filled with queries that don't have responses. It does this by shortening the timeout period arbitrarily on that 'retirement' run through the status array. This is going to result in it reporting a higher rate of lost replies - be aware that this might happen when using the -T option.
But essentially, if queryperf is not getting responses back, then that is going to limit its outbound query rate. There is some compensation for slow responders if you set a target qps rate, but queryperf still has to wait at least a small amount of time for the query replies before being able to send again. In the situation where you're testing with a server and client queries that are liable to elicit a 'bursty' type of response, then increasing the number of outstanding queries via the -q option may gain you a better reported throughput.
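As a rough illustration of why the outstanding-query limit caps throughput: if at most q queries can be in flight and each reply takes about the same time to arrive, the send rate cannot exceed q divided by that response time. The sketch below is a simplified model of that bound only; it is not queryperf's actual send/receive loop, it ignores timeouts and burst behavior, and the function name and latency figure are illustrative assumptions.

```python
def max_send_rate(outstanding, latency_ms):
    """Upper bound on queries per second with a fixed in-flight limit.

    outstanding: maximum queries in flight (the -q value)
    latency_ms:  typical time for a reply to arrive, in milliseconds
    """
    if outstanding <= 0 or latency_ms <= 0:
        raise ValueError("outstanding and latency_ms must be positive")
    return outstanding * 1000 / latency_ms

# Assumed 10 ms response time, for several values of -q.
for q in (20, 100, 500):
    print(f"-q {q}: at most {max_send_rate(q, 10):.0f} qps")
```

Under this model, the default of 20 outstanding queries with a 10 ms response time caps the rate at 2000 qps regardless of how fast the server is, which is why a server that answers every query but answers slowly can benefit from a larger -q.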
Another significant factor which will impact queryperf testing is the data: if the test file carries an unrealistically high proportion of queries for which the test server itself has to time out, then the results are going to be unusually low too, in comparison with live performance.
When you download and build BIND from the ISC-distributed tarball, queryperf is not built or installed automatically; therefore, you need to follow these additional steps:
- From the top directory of the BIND tarball:
$ cd ./contrib/queryperf
- Run ./configure and make:
$ ./configure
$ make queryperf
- Copy the queryperf binary to the server and location from which you will be running it
queryperf parameters explained
Usage: queryperf [-d datafile] [-s server_addr] [-p port] [-q num_queries] [-b bufsize] [-t timeout] [-n] [-l limit] [-f family] [-1] [-i interval] [-r arraysize] [-u unit] [-H histfile] [-T qps] [-e] [-D] [-R] [-c] [-v] [-h]
  -d specifies the input data file (default: stdin)
  -s sets the server to query (default: 127.0.0.1)
  -p sets the port on which to query the server (default: 53)
  -q specifies the maximum number of queries outstanding (default: 20)
  -t specifies the timeout for query completion in seconds (default: 5)
  -n causes configuration changes to be ignored
  -l specifies how a limit for how long to run tests in seconds (no default)
  -1 run through input only once (default: multiple iff limit given)
  -b set input/output buffer size in kilobytes (default: 32 k)
  -i specifies interval of intermediate outputs in seconds (default: 0=none)
  -f specify address family of DNS transport, inet or inet6 (default: any)
  -r set RTT statistics array size (default: 50000)
  -u set RTT statistics time unit in usec (default: 100)
  -H specifies RTT histogram data file (default: none)
  -T specify the target qps (default: 0=unspecified)
  -e enable EDNS 0
  -D set the DNSSEC OK bit (implies EDNS)
  -R disable recursion
  -c print the number of packets with each rcode
  -v verbose: report the RCODE of each response on stdout
  -h print this usage
While most of the parameters above are self-explanatory, it may help to elaborate on some of them.
You are almost always going to be specifying an input file with the -d option - see the section on How to create the query stream file at the end of this article.
- -q specifies the maximum number of queries outstanding (default: 20). It will almost always be worth experimenting with other values of -q. You don't want queryperf to stop sending queries and to start waiting for replies from the server if none has been received back yet, but similarly, you don't want to set this value too large, because once replies have been received, you want queryperf to process them, and it won't do that while it's sending. Using the -q option may help you to better measure a server that can handle a high query rate but (perhaps because of distance from the queryperf machine, or perhaps because the test data forces it to iterate for a large proportion of client queries) is responding more slowly to each individual query, although it is nevertheless responding to each.
- -t can be used to cause queryperf to 'give up' sooner or to wait longer for replies from the server under test. Timing out too soon is going to cause a greater rate of missing replies in the statistics. Waiting for too long may slow down the test.
- -l is there for you to set a fixed time for the test - this means that queryperf will run through the provided input file as many times as needed to keep on pushing out queries. Under these circumstances, the reported qps (queries per second) may increase because named will start to respond immediately from cache for a proportion of the test queries it receives. The default is that queryperf runs through the input file once and only once (see -1 option).
- -i can be useful if you want to track the progress of the test via intermediate results (you also specify the interval between reports in seconds).
- -R would be used to remove the 'recursion desired' bit from the test queries. This is not useful unless you're using queryperf to test a server that is responding authoritatively.
- -D adds the DNSSEC-OK (EDNS) bit to the test queries. If most of your client queries set this bit, then you should test your server with it too; the adoption of DNSSEC and EDNS is increasing. This will however increase the size of query responses that the server has to assemble when DNSSEC material is available to use in its replies.
- -e enables the use of EDNS0 (and thus larger UDP response sizes). It is implied by setting -D. Your server should support this feature when communicating with its clients, so it makes sense to run performance tests with it enabled.
- -b increases the buffer size used for sending test queries and receiving responses. In a high-performance test environment, if you suspect that buffering might be impeding the test, you could try increasing this value to see if it changes the results.
- -v is unlikely to be an option that you would use during performance testing, as it will generate a large amount of output wherever stdout is being directed, but it might be useful for collecting more information about how the server is behaving during a test run.
queryperf -d qtest -s 192.0.2.27 -l 300 -i 60 -c
This test will send queries (from the file qtest) continuously for 300 seconds to IP address 192.0.2.27, using the default server port, 53. Intermediate output will be printed at intervals of 60 seconds, and the final report will additionally include a summary of the number of packets received with each RCODE.
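Because a time-limited run loops over the input file, it can be worth estimating in advance how many times the file will be replayed, since more replays mean more answers served from cache. The following is a back-of-the-envelope sketch; the function name, the file size, and the expected query rate are hypothetical figures chosen for illustration.

```python
def estimated_passes(queries_in_file, expected_qps, limit_seconds):
    """Estimate how many times the input file is replayed in a timed run."""
    if queries_in_file <= 0:
        raise ValueError("queries_in_file must be positive")
    return expected_qps * limit_seconds / queries_in_file

# Hypothetical: a 100,000-line file, roughly 20,000 qps, and a 300-second limit.
passes = estimated_passes(100_000, 20_000, 300)
print(f"the input file would be replayed about {passes:.0f} times")
```

If the estimate is large, consider a bigger query file or a shorter time limit so that the result reflects recursion rather than cache hits.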
How to create the query stream file
An input query stream file is a text file containing one query per line with the name first, followed by the query type. For example:
1-host.com ANY
1-linknet.com A
1.0-127.35.195.200.in-addr.arpa PTR
1.0-18.104.22.168.in-addr.arpa PTR
22.214.171.124.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.int PTR
126.96.36.199.in-addr.arpa PTR
188.8.131.52.in-addr.arpa PTR
You can sample the queries received by your own servers by turning on BIND query logging for a period of time, and then parsing the output logfile using awk.
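As an alternative to awk, the same extraction can be sketched in Python. This is a minimal example, assuming that each logged query contains the text "query: NAME CLASS TYPE" as in typical BIND query log lines; the exact layout varies between BIND versions, so check it against your own log. The sample lines and the function name are illustrative only.

```python
import re

# Matches the "query: <name> <class> <type>" portion of a query log line.
QUERY_RE = re.compile(r"\bquery: (\S+) (\S+) (\S+)")

def log_to_queryperf(lines):
    """Yield 'name type' strings for each IN-class query found in log lines."""
    for line in lines:
        match = QUERY_RE.search(line)
        if not match:
            continue  # not a query log entry
        name, qclass, qtype = match.groups()
        if qclass != "IN":
            continue  # skip CH/HS queries such as version.bind
        yield f"{name} {qtype}"

# Illustrative log lines in two common layouts, plus lines to be skipped.
sample_log = [
    "01-Jan-2024 00:00:00.000 queries: info: client @0x7f0000000000 192.0.2.10#40000 (www.example.com): query: www.example.com IN A +E(0)K (192.0.2.53)",
    "01-Jan-2024 00:00:01.000 general: info: some other log message",
    "01-Jan-2024 00:00:02.000 queries: info: client 192.0.2.11#40001 (example.net): query: example.net IN MX + (192.0.2.53)",
    "01-Jan-2024 00:00:03.000 queries: info: client 192.0.2.12#40002 (version.bind): query: version.bind CH TXT + (192.0.2.53)",
]
for entry in log_to_queryperf(sample_log):
    print(entry)
```

Writing the yielded lines to a file, one per line, gives a query stream in the format shown above.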
A sample query stream file can also be downloaded from DNS-OARC's DNSPerf and ResPerf site at https://www.dns-oarc.net/tools/dnsperf, but this may not be similar in profile to the query traffic that your servers would be expected to be handling.