DNS Cache snooping - should I be concerned?
  • 15 Oct 2018

Article Summary

DNS cache snooping is a technique for learning which queries clients have made to a recursive DNS server. It can be employed for different purposes by anyone who stands to benefit from that knowledge.

Uses of this information vary, ranging from planning which mis-typed domains are worth registering (for marketing and other purposes) through to determining which domains might be easiest to target for a cache poisoning attack.

How can it be done?

  1. Using non-recursive queries: This is the simplest option. From a client that the recursive server will respond to, a snooper needs to send a non-recursive query (that is, one with the recursion desired bit in the query header set to zero) for the name that the snooper is interested in. For example:
dig +norecurse @192.168.1.1 target.domain.com

; <<>> DiG 9.7.4 <<>> +norecurse @192.168.1.1 target.domain.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2757
;; flags: qr ra; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 0

;; QUESTION SECTION:
;target.domain.com.        IN    A

;; AUTHORITY SECTION:
.            518400    IN    NS    B.ROOT-SERVERS.NET.
.            518400    IN    NS    I.ROOT-SERVERS.NET.
.            518400    IN    NS    A.ROOT-SERVERS.NET.
.            518400    IN    NS    C.ROOT-SERVERS.NET.
.            518400    IN    NS    J.ROOT-SERVERS.NET.
.            518400    IN    NS    M.ROOT-SERVERS.NET.
.            518400    IN    NS    L.ROOT-SERVERS.NET.
.            518400    IN    NS    G.ROOT-SERVERS.NET.
.            518400    IN    NS    D.ROOT-SERVERS.NET.
.            518400    IN    NS    E.ROOT-SERVERS.NET.
.            518400    IN    NS    F.ROOT-SERVERS.NET.
.            518400    IN    NS    K.ROOT-SERVERS.NET.
.            518400    IN    NS    H.ROOT-SERVERS.NET.

;; Query time: 19 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)

In the example above, the record being queried for is not in cache. Now, see below what happens if someone has queried for it in the interim when we try the same dig again:

dig +norecurse @192.168.1.1 target.domain.com

; <<>> DiG 9.7.4 <<>> +norecurse @192.168.1.1 target.domain.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9258
;; flags: qr ra; QUERY: 1, ANSWER: 6, AUTHORITY: 4, ADDITIONAL: 0

;; QUESTION SECTION:
;target.domain.com.        IN    A

;; ANSWER SECTION:
target.domain.com.    294    IN    A    10.1.1.1

;; AUTHORITY SECTION:
domain.com.        172792    IN    NS    ns2.domain.com.
domain.com.        172792    IN    NS    ns1.domain.com.

;; Query time: 7 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)

From the answer above (and with the knowledge that the TTL on the record is 300), I know that 6 seconds ago, someone else queried the server for target.domain.com. Since it's cached, I don't know how often the same query has been made since, but I can get a good idea of how frequently the record is queried by repeating my dig tests and looking at the periods of time when there is no answer available. (This technique also works for learning about queries for domains that don't exist, where the recursive server is caching the NXDOMAIN pseudo-records.)
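
The two ingredients of this first technique can be sketched in Python: a query whose header has the recursion desired (RD) bit cleared, and the subtraction that turns a served TTL into a cache age. This is a minimal illustration only, using the header layout from RFC 1035. The function names build_query and cache_age are hypothetical and not part of dig or BIND, and the sketch builds the packet without sending it.

```python
import struct

RD_FLAG = 0x0100  # "recursion desired" bit in the 16-bit DNS header flags word


def build_query(name: str, recursion_desired: bool, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record in RFC 1035 wire format."""
    flags = RD_FLAG if recursion_desired else 0
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def cache_age(authoritative_ttl: int, observed_ttl: int) -> int:
    """Seconds since the record was cached, assuming the cache counts the
    TTL down from the authoritative value (an assumption of this sketch)."""
    return authoritative_ttl - observed_ttl


snoop_packet = build_query("target.domain.com", recursion_desired=False)
age = cache_age(authoritative_ttl=300, observed_ttl=294)
```

With the values from the dig output above (TTL 300 at the authoritative servers, 294 served), cache_age gives the 6 seconds mentioned in the text.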

  2. Using recursive queries: This is very similar to the above, except that I have to deduce that the recursive server responded from cache by looking at both the time it took for the server to respond to me (although, depending on the server load, this may not be significant) and at the TTL of the answer I'm given. For example:
dig @192.168.1.1 target.domain.com

; <<>> DiG 9.7.4 <<>> @192.168.1.1 target.domain.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37267
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 4, ADDITIONAL: 0

;; QUESTION SECTION:
;target.domain.com.        IN    A

;; ANSWER SECTION:
target.domain.com.    300    IN    A    10.1.1.1

;; AUTHORITY SECTION:
domain.com.        172798    IN    NS    ns2.domain.com.
domain.com.        172798    IN    NS    ns1.domain.com.

;; Query time: 2689 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)

The answer came back with a TTL of 300 and a query time of 2689 msec. I happen to know that the TTL on this record is 300 (though I could easily query the authoritative servers to get this information), so in the example above it looks like the record was not in cache and had to be fetched for me.

But in the example below, the situation is different:

dig @192.168.1.1 target.domain.com

; <<>> DiG 9.7.4 <<>> @192.168.1.1 target.domain.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 19935
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 4, ADDITIONAL: 0

;; QUESTION SECTION:
;target.domain.com.        IN    A

;; ANSWER SECTION:
target.domain.com.       273    IN      A       10.1.1.1

;; AUTHORITY SECTION:
domain.com.             170553  IN      NS      ns1.domain.com.
domain.com.             170553  IN      NS      ns2.domain.com.

;; Query time: 8 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)

From both the TTL of the answer (which is less than 300) and the query time (much, much shorter than before), I can determine that this answer came from cache.
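
The deduction used in this second technique can be written down as a small decision rule. The sketch below is illustrative only: the function name classify_answer and the 50 msec threshold are assumptions made for the example, not values taken from BIND or dig, and a real threshold would depend on the network path and on server load.

```python
def classify_answer(authoritative_ttl: int, observed_ttl: int,
                    query_ms: int, fast_ms: int = 50) -> str:
    """Guess whether a recursive answer was served from cache.

    authoritative_ttl: TTL published by the authoritative servers
    observed_ttl:      TTL on the answer the recursive server returned
    query_ms:          query time reported by the client, in milliseconds
    fast_ms:           hypothetical cut-off for "answered quickly"
    """
    if observed_ttl < authoritative_ttl:
        # The TTL has already counted down, so the record was in cache.
        return "cached"
    if query_ms > fast_ms:
        # Full TTL and a slow answer: the server most likely had to fetch it.
        return "fetched"
    # Full TTL but a fast answer: timing alone is not conclusive.
    return "unclear"


first = classify_answer(300, 300, 2689)   # values from the first dig above
second = classify_answer(300, 273, 8)     # values from the second dig above
```

The TTL is treated as the stronger signal here because, as noted earlier, the response time may not be significant on a loaded server.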

How to protect your servers from snooping

  1. The first, and most important, form of protection is to limit access to your recursive servers. Unless you are intentionally providing open DNS resolution, you should limit access to your known and trusted clients alone.

By default, BIND does not allow open recursive access - these are the settings you will get if allow-query and allow-recursion are not set explicitly:

allow-query { any; };
allow-recursion { localnets; localhost; };

If this is a recursive-only server, then you would most likely want allow-query and allow-recursion to have the same Access Control List (ACL) and for these to both be your trusted clients.

If this is a mixed-use server, with some authoritative zones that are open to all Internet users in combination with recursive services that are for internal clients only, then your options would differ.

Please see the Administrator Reference Manual
Details of both these and other fine-grained access controls can be found in the BIND 9 ARM, which is available both on our website and in the source code distributions. Note also that the defaults above apply to current versions of BIND - older versions (now unsupported) may behave differently.
  2. It is sometimes recommended that you should limit non-recursive access to your recursive servers to prevent cache snooping attempts using the first technique documented in the section above. BIND does not have a configuration option that provides this level of control and there are no current plans to add one. The reasons for this are:
  • Restricting cache snooping via non-recursive queries doesn't prevent would-be attackers from employing the second technique documented above instead. Therefore this would be ineffective as a protective solution.
  • Changing the default behaviour of BIND may break the requirements of some existing users.
  • RFC 1034 strongly implies (even though it doesn't clearly state) that RD=0 queries should be answered from cache data when it's available and the client is permitted to query this server.
  3. One further suggestion that has been made is that even when responding to recursive queries, a recursive server should subtract a random value from the TTL of the record it is going to serve in order to obfuscate the length of time it has been present in cache. This would mean that both cached and non-cached answers would have unpredictable TTLs on the records served. This suggestion does not, however, prevent detection of cached answers based on the time taken by the nameserver to answer. The obvious next suggestion here would be to deliberately delay responses when answering from cache - but that would be buying nebulous security benefits at the expense of recursive query performance.
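
To make the TTL-randomisation suggestion concrete, here is a minimal sketch of what such a scheme might look like. It is purely hypothetical: BIND does not implement this, and the function name obfuscated_ttl, the max_jitter parameter and the floor of 1 second are all assumptions made for illustration.

```python
import random


def obfuscated_ttl(remaining_ttl: int, max_jitter: int, rng: random.Random) -> int:
    """Return the TTL to serve after subtracting a random amount.

    remaining_ttl: TTL the server would otherwise serve (full or counted down)
    max_jitter:    largest number of seconds that may be subtracted (assumed)
    rng:           random source, passed in so results can be reproduced
    """
    jitter = rng.randint(0, max_jitter)
    # Never serve less than 1 second (an assumption of this sketch).
    return max(1, remaining_ttl - jitter)


rng = random.Random()
fresh = obfuscated_ttl(300, max_jitter=60, rng=rng)   # record just fetched
cached = obfuscated_ttl(273, max_jitter=60, rng=rng)  # record cached 27 seconds ago
```

With these example numbers, the fresh answer lands somewhere in 240-300 and the cached one in 213-273, so a served TTL in the overlap no longer says which case applied. The response time, which this sketch does not touch, still gives the cached answer away.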

In summary - the problem appears to be one that requires a judgement on whether or not clients of a recursive server can be trusted. If they cannot be, then the next decision is how significant the risk is in your environment - and to whom the risk applies.

Some security tools may report that a server is vulnerable to snooping attacks
Some security analysis tools may report that a server is responding to non-recursive queries for third-party domains. This article has been created to explain what the risks actually are. If the analysis tools are being run from within your network where your trusted clients reside, then the warning is a false positive, provided that you:
a)  trust your clients; and
b)  do not allow recursive queries from outside your trusted client network.