The ISP Column
A monthly column on things Internet
May 2015

Tech Note: Measuring DNS Behaviour

The DNS is a very simple protocol. It is a simple query/response interaction, in which the client passes a DNS transaction to a server with the query part of the transaction completed. The server fills in the answer part, possibly adds further information in the additional information part, and returns the transaction to the client. All very simple. What could possibly go wrong?

In practice the operation of the DNS is significantly more involved. Indeed, it's so involved that it's difficult to tell whether the operation of the DNS is merely "complicated" or whether the system's interactions verge on what we would describe as "complex". When we have some simple questions about aspects of DNS behaviour, the questions can be extremely challenging to answer.

The question in this instance is extremely simple to phrase: how many DNS resolvers are incapable of receiving a DNS response whose size is just inside the conventional upper limit of an unfragmented IP packet in today's Internet? Or, to put it the other way around: how many resolvers can successfully receive a DNS response when the combined size of the DNS response plus the protocol overheads approaches 1,500 octets in total?

Before looking at how to answer this question, it may be helpful to understand why this is an interesting question right now.

Rolling the Root Zone's Key Signing Key

In June 2010 the Root Zone (RZ) of the DNS was signed using the DNSSEC signing framework. This was a long-anticipated event. Signing DNS zones, and validating signed responses, allows the DNS to be used in such a manner that all responses can be independently verified, adding an essential element of confidence to one of the fundamental building blocks of the Internet.

However, it goes further than that. Today's framework of security in the Internet is built upon some very shaky foundations. The concept of domain name certificates, in use since the mid-1990s, was always a quick and dirty hack, and from time to time we are reminded just how fragile this system truly is when the framework is maliciously broken. We can do better, and the approach is to place cryptographic information in the DNS, signed using DNSSEC ("The DNS-Based Authentication of Named Entities (DANE) Transport Layer Security (TLS) Protocol: TLSA", RFC 6698). So these days DNSSEC is quickly becoming a critical technology.

The cryptographic structure of DNSSEC mirrors the structure of the DNS name space itself, in that it is a hierarchical structure with a single common apex point, which is the anchor point of trust: the key used to sign the Zone Signing Key (ZSK) that, in turn, is used to sign all entries in the root zone of the DNS. This key, the RZ Key Signing Key (KSK), is the key that was created five years ago when the RZ was first signed. One of the common mantras in cryptography is that no key should be used forever, and responsible use of this technology generally includes some regular process to change the key value to a new key. This was envisaged in the DNS KSK procedures, and in a May 2010 document from the Root DNSSEC Design Team ("DNSSEC Practice Statement for the Root Zone KSK Operator", http://www.root-dnssec.org/wp-content/uploads/2010/06/icann-dps-00.txt), the undertaking was made that "Each RZ KSK will be scheduled to be rolled over through a key ceremony as required, or after 5 years of operation".
Five years have elapsed, and it's time to roll the RZ KSK of the DNS. The procedure is familiar: a new key pair is manufactured, and the new public key value is published. After some time the new key is used to sign the zone signing key, in addition to the current signature. Again, after some time, the old signature and the old key are withdrawn, and we are left with the new KSK in operation. A procedure to do this for the RZ KSK was documented in 2007 in "Automated Updates of DNS Security (DNSSEC) Trust Anchors" (RFC 5011).

There are two basic unknowns in this approach. The first is that we do not know how many resolvers implement RFC 5011 and will automatically pick up the new RZ KSK when it is published in the root zone. Resolvers that do not automatically pick up the new RZ KSK via this process will need to do so manually. Those that do not will be left stranded when the old RZ KSK is removed, and will return SERVFAIL due to an inability to validate any signed response, as their copy of the RZ KSK no longer matches the new RZ KSK. The second unknown is the size of the DNS response, and in particular the size of the response in the intermediate stage when both KSK values are used to sign the RZ ZSK. At some stages of this anticipated key roll procedure the DNSSEC-signed response to certain DNS queries of the root, namely the RZ DNSKEY query, may approach 1,500 octets in size.

The first unknown is difficult to quantify in advance. However, the second should be measurable. The query of interest here is a query for the RZ DNSKEY RR. The largest response arises in a scenario where the ZSK is increased to a 2048-bit RSA signature and where both the old and the new KSK values sign the ZSK. In this phase it appears that the size of the DNS response is 1,425 octets. This compares to the current response size of 736 octets, which uses a 1024-bit ZSK and a single 2048-bit KSK. This only applies to those resolvers that request the DNSSEC signature data to be included in the response, which is signalled by setting the DNSSEC OK flag in the query.

How can we measure whether this size of response represents an operational problem for the Internet?

DNS, UDP, Truncation, TCP and Fragmentation

The DNS is allowed to operate over the UDP and TCP transport protocols. In general, UDP is preferred where possible, due to the lower overheads of UDP, particularly in terms of the load placed on the server. However, there is a limitation imposed by this protocol choice. By default, UDP can only handle responses with a DNS payload of 512 octets or less (RFC 1035). EDNS0 (RFC 6891) allows a DNS requestor to inform the DNS server that it can handle UDP response sizes larger than 512 octets.

It should be noted that some clarification of the semantics of this claim is necessary. If a client sets a value of 4096 in the EDNS0 UDP buffer size field of a query, the client is telling the server that it is capable of reassembling a UDP response whose DNS payload is up to 4,096 octets in size. It is likely that, to transit such a large response, the UDP response will be fragmented into a number of packets. What the client cannot control is whether all the fragments of the response will be passed through from the server to the client. Many firewalls regard IP packet fragments as a security risk, and use fragment discard rules. The client is not claiming that no such filters exist on the path from the server to the client.
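To make these mechanics concrete, here is a minimal sketch of a client that advertises an EDNS0 UDP buffer size with the DNSSEC OK flag set, and that falls back to TCP if the response comes back truncated (anticipating the truncation behaviour described below). It uses the Python dnspython library, which is assumed here purely for illustration and is not part of the experiment's tooling:

    import dns.flags
    import dns.message
    import dns.query

    # Query the root zone DNSKEY RRset, advertising a 1,452-octet EDNS0
    # UDP buffer (the unfragmented Ethernet IPv6 limit) and setting the
    # DNSSEC OK flag so that the RRSIG records are included.
    query = dns.message.make_query(".", "DNSKEY",
                                   use_edns=0, payload=1452, want_dnssec=True)
    response = dns.query.udp(query, "198.41.0.4", timeout=5)  # a.root-servers.net

    if response.flags & dns.flags.TC:
        # The answer did not fit within the offered buffer size: the server
        # set the Truncated bit, so the same query is repeated over TCP.
        response = dns.query.tcp(query, "198.41.0.4", timeout=5)

    print("DNS payload size: %d octets" % len(response.to_wire()))

Note that a resolver sitting behind fragment-filtering middleware would see the UDP attempt simply time out rather than fail cleanly, which is precisely the behaviour the experiments described below attempt to quantify.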
The EDNS0 UDP buffer size only refers to the packet reassembly buffer size in the client's internal IP protocol stack. Setting this EDNS0 buffer size is akin to the client saying to the server: "You can try using a UDP response up to this size, but I cannot guarantee that I will get the answer." The requestor places the size of its UDP payload reassembly buffer (not the IP packet size, but the DNS payload reassembly size) in the query, and the server is required to respond with a UDP response where the DNS payload is no larger than the specified buffer size. If this is not possible then the server sets the "truncated" field in the response to indicate that truncation has occurred.

If the truncated response includes a valid response in the initial part of the answer, the requestor may elect to use the truncated response in any case. Otherwise, the intended reaction to receiving a response with the Truncated bit set is that the client then opens a TCP session to the server and presents the same query over TCP. A client may initiate a name resolution transaction in TCP, but common client behaviour is to initiate the transaction in UDP, and to use the Truncated bit in a response as the indication that the client should switch to TCP for the query.

When a UDP response is too large for the network, the packet will be fragmented at the IP level. In this case the trailing fragments use the same IP-level header (including the UDP protocol number field), but the UDP header itself appears only in the initial fragment.

UDP packet fragmentation is treated differently in IPv4 and IPv6. In IPv4, either the original sender or any intermediate router may fragment an IP packet (unless the Don't Fragment IP header flag is set). In IPv6, only the original sender may fragment an IP packet. If an intermediate router cannot forward a packet onto the next hop interface, then the IPv6 router will generate an ICMP6 diagnostic packet containing the MTU size of the next hop interface, append the leading part of the packet, and pass this back to the packet's sender. When the offending packet is a UDP packet, the sender does not maintain a buffer of unacknowledged data, so the IPv6 UDP sender, upon receiving this message, cannot retransmit a fragmented version of the original packet. Some IPv6 implementations generate a host entry in the local IPv6 forwarding table and record the received MTU against the destination address recorded in the appended part of the ICMP6 message, for some locally determined cache entry lifetime. This implies that any subsequent attempt within this cache lifetime to send an IPv6 UDP packet to this destination will use this MTU value to determine how to fragment the outgoing packet. Other IPv6 implementations appear to simply discard this ICMP6 Packet Too Big message when it is received in response to an earlier UDP message.

The interaction of the DNS with intercepting middleware is sometimes quite ugly. Some firewall middleware explicitly blocks port 53 over TCP, so that any attempt by resolvers on the "inside" to use TCP for DNS queries will fail. More insidious is the case where middleware blocks the transmission of UDP fragments, so that when a large DNS response is fragmented the fragments do not reach the resolver. Perhaps one of the hardest cases for the client and server to cope with is the case where middleware blocks ICMP6 Packet Too Big messages.
The sender of the large IPv6 response (TCP or UDP) is unaware that the packet was too large to reach the receiver, as the ICMP6 message has been blocked, while the receiver cannot inform the sender of the problem, as it never received any response at all from the sender. There is no resolution to this deadlock, and ultimately both the resolver and the server need to abandon the query due to local timers.

DNS Response Size Considerations

There are a number of components in a DNS response, and the size considerations are summarized in Table 1.

      8 octets    UDP header size
     20 octets    IPv4 packet header
     40 octets    maximum size of the options in an IPv4 packet header
     40 octets    IPv6 packet header
    512 octets    the minimum DNS payload size that must be supported by DNS
    576 octets    the largest IP packet size (including headers) that must be supported by IPv4 systems
    913 octets    the size of the current root priming response with DNSSEC signature
  1,232 octets    the largest DNS payload size of an unfragmented IPv6 DNS UDP packet (1,280 - 40 - 8)
  1,280 octets    the smallest unfragmented IPv6 packet that must be supported by all IPv6 systems
  1,297 octets    the largest size of a ./IN/DNSKEY response with two 2048-bit KSKs and one 1024-bit ZSK
  1,425 octets    the largest size of a ./IN/DNSKEY response with two 2048-bit KSKs and one 2048-bit ZSK
  1,452 octets    the largest DNS payload size of an unfragmented Ethernet IPv6 DNS UDP packet (1,500 - 40 - 8)
  1,472 octets    the largest DNS payload size of an unfragmented Ethernet IPv4 DNS UDP packet (1,500 - 20 - 8)
  1,500 octets    the largest IP packet size supported on IEEE 802.3 Ethernet networks

Table 1: Size components relevant to DNS

The major consideration here is to minimize the likelihood that a DNS response is lost. What is desired is a situation where a UDP response with a commonly used EDNS0 buffer size can be used to form a response with a very low probability of packet fragmentation in flight for IPv4, and an equally low probability of encountering IPv6 Path MTU issues. If the desired objective is to avoid UDP fragmentation as far as possible, then it appears that the limit on DNS payload size is 1,452 octets. Can we quantify "as far as possible"? In other words, to what extent are those resolvers that ask a DNSSEC-signed root key priming query able to accept a UDP response with a DNS payload size of 1,452 octets?

The Measurement Technique

The technique used for this measurement is to embed URL fetches inside online advertisements, and to use the advertisement network as the distributor of the test rig (this is documented in an earlier article, http://www.potaroo.net/ispcol/2013-05/1x1xIPv6.html, and won't be repeated here). However, this approach is not easily capable of isolating the behaviour of the individual resolvers that query authoritative name servers. User systems typically have two or more resolvers provided in their local environment, and these resolvers may use a number of forwarders. The user system will repeat the query based on its own local timers, and each recursive resolver also has a number of local timers. The internal forwarding structure of the DNS resolver system is also opaque, so the correlation between a user system attempting to resolve a single DNS name and the sequence of queries that are seen at the authoritative name server is often challenging to interpret.
If the user is able to fetch the Web object that is referenced in the URL, then one can observe that at least one of the resolvers was able to successfully resolve the DNS name, but one cannot definitively say which resolver successfully resolved the name when the name was queried by two or more resolvers. Similarly, if the web object was not fetched, then it is possible that none of the resolvers were able to resolve the name, but this is not the only reason why the web fetch failed to occur. Another possibility is that the script's execution was aborted, or the user's local resolution timer expired before the DNS system returned the answer. Hence the necessary caveat that there is a certain level of uncertainty in this form of experimental measurement.

The measurement experiment was designed to measure the capabilities of resolvers that ask questions of authoritative name servers, and in particular to concentrate our attention on those resolvers that use EDNS0 and set the DNSSEC OK flag. This has some parallels with the root server situation, but of course it does not in fact touch the root servers and does not attempt to simulate a root service.

In the first experiment an authoritative server has been configured to respond to requests with a particular response size. The resolvers we are looking for are those that generate A Resource Record queries with the DNSSEC OK bit set. The response to these queries is a DNS response of 1,447 octets, which is just 5 octets under the largest DNS payload of an unfragmented Ethernet IPv6 DNS UDP packet.

Server Platform

The server is running FreeBSD 10.0-RELEASE-p6 (GENERIC). The I/O interface supports TCP segment offloading, which is enabled with 1,500 octet TCP segmentation in both IPv4 and IPv6. The server is running a BIND server:

  BIND 9.11.0pre-alpha
  built by make with '--with-aes' '--with-ecdsa'
  compiled by CLANG 4.2.1 Compatible FreeBSD Clang 3.3 (tags/RELEASE_33/final 183502)
  compiled with OpenSSL version: OpenSSL 1.0.1e-freebsd 11 Feb 2013
  linked to OpenSSL version: OpenSSL 1.0.1e-freebsd 11 Feb 2013
  compiled with libxml2 version: 2.9.1
  linked to libxml2 version: 20901

The BIND server uses an MTU setting of 1,500 octets in IPv4 and IPv6 for TCP and UDP. The server is serving a DNSSEC-signed zone, and the specific response sizes for this experiment have been crafted using additional bogus RRSIG records against the A RR entry, on the basis that the server is unable to strip out any of these RRSIG records in its response, and will be forced to truncate the UDP response if it cannot fit within the offered buffer size. The zone was served in both IPv4 and IPv6, and the choice of IP protocol for the query and response was left to the resolver that was querying the server.

The queries were generated using Google's online advertising network, and the advertising network was set up to deliver no less than 1 million impressions per day. Each ad impression uses a structured name string which consists of a unique left-most name part and a common parent zone name. The use of the unique parts is to ensure that no DNS cache nor any Web proxy is in a position to serve a response from a local cache. All DNS queries for the IP addresses bound to this name are passed to the zone's authoritative name server, and similarly all HTTP GET requests for the associated URL are passed to the web server.
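The cache-busting naming approach can be sketched as follows; the label format and the parent domain are illustrative, not the experiment's actual naming scheme:

    import time
    import uuid

    def unique_query_name(parent="example.com"):
        # A label built from a timestamp and a random UUID has never been
        # queried before, so no recursive resolver cache or web proxy can
        # hold the answer, and every query is forced all the way to the
        # zone's authoritative name server.
        return "u%d-%s.%s" % (int(time.time()), uuid.uuid4().hex, parent)

    print(unique_query_name())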
EDNS0 Buffer Size Distribution

The first distribution is a simple scan of all the offered EDNS0 buffer sizes in queries that have the DNSSEC OK flag set, for the period from 26 March 2015 to 8 May 2015 (Figure 1). Of the 178 million queries, some 31% presented an EDNS0 UDP buffer size of less than 1,500 octets, and 18% presented a size of 512 octets.

http://www.potaroo.net/ispcol/2015-05/fig1.jpg
Figure 1: Cumulative distribution of EDNS0 UDP buffer sizes across all queries

This distribution reflects resolver behaviour where the resolver will vary the EDNS0 buffer size and re-present the query. A second distribution is shown in Figure 2, which shows the cumulative distribution of the maximum EDNS0 buffer sizes across the 1.5 million resolvers observed over this period. The relatively smaller proportion of small EDNS0 buffer sizes in this per-resolver distribution shows that reducing the EDNS0 buffer size in subsequent queries following a timeout is one technique a resolver can use to establish the difference between an unresponsive server and a network problem with the transmission of larger responses.

http://www.potaroo.net/ispcol/2015-05/fig2.jpg
Figure 2: Cumulative distribution of EDNS0 UDP buffer sizes across all resolvers

Some 6.9% of visible resolvers already set their EDNS0 buffer size below the current size of the DNSSEC-signed root zone priming response of 913 octets. A further 1.4% of visible resolvers use a buffer size that is greater than 913 octets and less than 1,452 octets (see Figure 3).

http://www.potaroo.net/ispcol/2015-05/fig3.jpg
Figure 3: Cumulative distribution of EDNS0 UDP buffer sizes across all resolvers, 900 to 1,500 octet range

The DNS Transport Protocol

The conventional operation of the DNS is to prefer UDP for queries. Within this convention, TCP is used as a fallback when the server passes a truncated response to the client in UDP. The distribution of EDNS0 UDP buffer sizes shows that 30% of all UDP queries used an EDNS0 buffer size of less than the 1,447-octet DNS response size used in the experiment, while 8.5% of all visible resolvers used an EDNS0 buffer size of less than 1,447 octets. The difference between the two numbers indicates that the resolvers using small EDNS0 buffer sizes presented a disproportionately high number of UDP queries to the server, indicating that some resolvers evidently do not cleanly switch to TCP upon receipt of a truncated UDP response.

The experiment used an authoritative name server configured with both IPv4 and IPv6 addresses, allowing resolvers to use their local protocol preference rules to determine which protocol to use to pass queries to the name server. In this case the local preference rules appear to be weighted heavily in favour of using IPv4. The breakdown of the use of UDP and TCP, and of IPv4 and IPv6, over a 7 day sample period is shown in Table 2.

  Queries         Total                UDP          TCP
  All             47,826,735           46,559,703   1,267,032
                                       (97%)        (3%)
  IPv4            47,431,919 (99%)     46,173,832   1,258,087
  IPv6               394,816  (1%)        385,871       8,945

Table 2: DNS transport protocol use

Fragmentation Signals

IPv6 IP Reassembly Time Exceeded

When middleware discards the trailing fragments of an IP datagram, the destination host will receive just the initial fragment and will time out in attempting to reassemble the complete IP packet.
Conventional behaviour is for hosts to send an ICMP IP Reassembly Time Exceeded message to the originator to inform the packet's sender of the failure. Not all hosts reliably send this message, and there is no assurance that all forms of middleware allow this ICMP message to be passed back to the sender, but receipt of such messages is some form of indication that middleware exists that intercepts the trailing fragments of fragmented IP datagrams. The server received no ICMP6 "time exceeded in-transit" (reassembly) messages across the 7 day measurement period.

IPv6 Packet Too Big

In IPv6, routers cannot fragment packets to fit into the next hop interface. Instead, the router will place the initial part of the packet into an ICMP6 message, together with a code to say that the packet is too large and the MTU of the next hop interface, and pass this ICMP packet back to the packet's source address. There were 205 ICMP6 Packet Too Big messages received by the server in this 7 day measurement period. Of these, 33 offered a 1280 MTU in the ICMP message, 1 offered 1454, 1 offered 1476, 1 offered 1500 (a bug!) and the remainder offered 1480 (indicative of an IPv6-in-IPv4 tunnel). These ICMP messages were generated by 26 IPv6 routers. The overall majority of these messages came from the two tunnel broker networks that were early providers of IPv6 via IP-in-IP tunnelling (AS109 and AS6939).

IPv4 IP Reassembly Time Exceeded

In this experiment the server received no ICMP IP Reassembly Time Exceeded messages in response to responses for the A RR.

Summary

There is a relatively minor issue with IPv6 ICMP Packet Too Big messages, where 205 messages were received out of a total of 46.5M DNS responses. These were mainly from known IPv6 tunnel broker networks, and the corresponding IPv4 queries were processed without fragmentation. These messages were the outcome of a design decision to run the experiment with a 1,500 octet MTU for IPv6 UDP and TCP. It is likely that running the server with a 1280 octet MTU for outbound IPv6 UDP would have eliminated these IPv6 messages.

Measurement 1: DNS and Web

In this section we report on a measurement approach that combined the server's records of DNS queries and responses with the HTTP queries and responses, to assemble an overall picture of the capability of resolvers that retrieve DNSSEC signatures to successfully process a response when the DNS payload approaches 1,452 octets in size (the maximum DNS payload of an unfragmented Ethernet IPv6 UDP datagram). In this experiment we are using a response size of 1,447 octets.

We observe that if a DNS resolver is unable to receive a response to its query, the original question will be repeated to the original resolver, and the query will also be passed to alternate resolvers, assuming that multiple resolvers have been configured by the end user system, or that multiple resolvers are used in the query forwarding path, or that there is some form of DNS load balancing being used in the resolver path. Ultimately, if all these resolvers are unable to resolve the name within a user-defined query timeout interval, then the user will be unable to fetch the provided URL. This observation implies that the proportion of DNS queries where there is no matching HTTP GET operation is an approximate upper bound on the DNS resolution failure rate when large responses are used.
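This upper bound can be computed with a simple join of the two server logs. The sketch below assumes the logs have already been reduced to sets of the unique per-experiment labels (an assumption about the log processing, which is not described in this form in the experiment):

    def resolution_failure_bound(dns_labels, http_labels):
        # dns_labels: unique experiment labels seen in DNS queries
        # http_labels: unique experiment labels seen in HTTP GET requests
        # Labels that were queried in the DNS but never fetched over HTTP
        # bound the large-response resolution failure rate from above.
        unmatched = set(dns_labels) - set(http_labels)
        return len(unmatched) / len(set(dns_labels))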
However, within the parameters of this test, undertaken as a background script run by a user's browser upon impression of an online advertisement, there is a further factor that creates some noise layered upon the underlying signal. The intended outcome of the experiment is that the user's browser fetches the URL whose DNS name has an associated DNS response of 1,447 octets, and the failure rate is determined by the absence of this fetch. However, this is not the only reason why the user's browser may fail to perform the fetch. The user may switch the browser to a new page, or quit the session, or just skip the ad, all of which cause an early termination of the measurement script, which results in a failure to retrieve the web object.

We can mitigate this noise factor to some extent by observing that we are interested in DNS resolvers that perform queries with EDNS0 and the DNSSEC OK flag. The DNS name used for this measurement is DNSSEC-signed, and DNSSEC-validating resolvers will perform a DS query immediately following successful receipt of the A response. (The resolvers used by many (1 in 3) users who use resolvers that include DNSSEC OK in their A queries perform DNSSEC validation, and fetch the associated DS and DNSKEY RRs.) To identify those experiments that terminate early, the experiment script also performs a further operation: the script will wait for all experiments to complete, or for a 10 second timer to trigger, and at that point the script will perform a final fetch to signal experiment completion.

  Experiments   Timed Out   Completed   Web Seen    DS Seen   Remainder
  8,760,740       329,993   8,430,747   7,615,693   815,054           0
                  (4%)                  (90%)       (10%)          (0%)

Table 3: DNS-to-Web results

Across the 7 day period, with just under 9 million sample points, we observed no evidence that any user who used resolvers that set the DNSSEC OK bit was unable to resolve the DNS name. There are two caveats to this result. The first is that some 4% of users were unable to complete the experiment, and fetched neither a completion object, nor the initial web object, nor a DS resource record. This represents a level of uncertainty in this result. The second caveat is the nature of this experiment: it measures the capability of the collection of resolvers used by each user who performed the test. The experiment does not directly measure the capability of the individual resolvers that attempted to resolve the DNS name on behalf of the user.

Measurement 2: DNS Indirection

Is it possible to refine this experiment and identify individual resolvers that are unable to receive DNS responses when the response size approaches 1,452 octets? One approach is to take a form of DNS delegation that was described in a DNS OARC presentation by Florian Maury of the French systems security agency ANSSI ("The "Indefinitely" Delegated Name Servers (IDNS) Attack", https://indico.dns-oarc.net/event/21/contribution/11/material/slides/0.pdf). The approach described there is to use a "glueless delegation", so that when a resolver is attempting to resolve a particular name it is forced to perform a secondary resolution of the name server's name, as this information is explicitly not provided in the parent zone as glue. For example, to resolve the name "a.b.example.com", the resolver will query the name servers for "example.com" for the name "a.b.example.com". The server will return the name servers for "b.example.com", but will be unable to return their IP addresses, as this is the glue that has been explicitly removed from the zone.
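A sketch of the query sequence that such a glueless delegation forces upon a resolver is shown below, using the names of the example as it continues in the text (dnspython again assumed for illustration; the server addresses are placeholders, and a real resolver would discover them through normal delegation):

    import dns.message
    import dns.query
    import dns.rdatatype

    EXAMPLE_COM_SERVER = "192.0.2.1"    # placeholder: serves example.com
    Z_EXAMPLE_COM_SERVER = "192.0.2.2"  # placeholder: serves z.example.com

    # Step 1: the query for the target name draws a referral that names
    # a name server for b.example.com, with no glue addresses included.
    q = dns.message.make_query("a.b.example.com.", dns.rdatatype.A)
    referral = dns.query.udp(q, EXAMPLE_COM_SERVER, timeout=5)

    # Step 2: the resolver must first resolve the name server's own name.
    # In the experiment this is the response that is padded to 1,444
    # octets when the DNSSEC OK flag is set.
    q = dns.message.make_query("nsb.z.example.com.", dns.rdatatype.A,
                               use_edns=0, payload=4096, want_dnssec=True)
    ns_response = dns.query.udp(q, Z_EXAMPLE_COM_SERVER, timeout=5)
    ns_address = ns_response.answer[0][0].address

    # Step 3: only a resolver that actually received the large Step 2
    # response can go on to query the delegated server for the target.
    q = dns.message.make_query("a.b.example.com.", dns.rdatatype.A)
    answer = dns.query.udp(q, ns_address, timeout=5)

The measurement point is the gap between Steps 2 and 3: a resolver that is seen to ask for the name server's address but is never seen to ask for the target name evidently failed to receive the large response.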
Let's suppose that the name server for this zone is "nsb.z.example.com". In order to retrieve the address of the name server, the resolver will have to query the name server for "z.example.com" for the name "nsb.z.example.com", and then use the result as the address of the server to query for "a.b.example.com". This is shown in Figure 4.

http://www.potaroo.net/ispcol/2015-05/fig4.jpg
Figure 4: Example of "glueless" delegation

When viewed in the context of measuring the properties of DNS resolvers, we note that in the above example, in order to pass a query to the name server of "b.example.com", the resolver must have been able to resolve the name "nsb.z.example.com". If we are interested in identifying those resolvers that are unable to receive a DNS response of 1,444 octets, and we use RRSIG padding to pad the response to the "nsb.z.example.com" query to that size, then in an ideal world the resolvers that query for "nsb.z.example.com" but do not query for "a.b.example.com" are the resolvers that are unable to receive or process a 1,444 octet DNS response.

In this second experiment we configured two domain name families. One was a conventional, unsigned glueless delegation, where the response to the query for the name server address was a 93 octet response; this was used as the control point for the experiment. The other delegation used a DNSSEC-signed record, signed with RSA-2048, where the response to a request for the name server address was a 1,444 octet response if the resolver set the DNSSEC OK flag in the query. The names used in each experiment were structured to be unique, so that in both cases the queries for the names, and for the name server names, were not cached, and the resolution efforts were visible at the authoritative name server. We altered the server to use a 1280 octet IPv6 UDP MTU, leaving all other settings on the server as per the first measurement. In a 109 hour period from 21 May 2015 to 26 May 2015 we recorded the following results:

  Experiments   Resolved NS   Failed to resolve NS   DNSSEC OK Failure   Other
  5,972,519     5,446,535     525,984                474,037             51,947
                (91%)         (8%)                   (7%)                (1%)

Table 4: DNS indirection results

In this case the experiment count is the number of experiments where the resolver successfully queried for both the address of the control name server and the address of the control URL DNS name, and queried for the address of the experiment name server. In 91% of the cases the same resolver was also seen to query for the experiment URL DNS address, indicating that it had successfully received the response to the earlier name server address query. The majority of the failures to receive this response were in response to name server address queries that generated a 1,444 octet response, while a smaller number of failures (slightly less than 1%) were in response to a non-DNSSEC OK query.

The nature of this particular experiment is that it is now possible to isolate the capabilities of each resolver. The statistics of resolver capability are shown in the following table.

  Resolvers   Resolved NS   Failed to resolve NS   DNSSEC OK Failure   Other
  82,954      78,703        4,251                  3,936               315
              (94%)         (6%)                   (5.5%)              (0.5%)

Table 5: Resolver statistics of DNS indirection results

This experiment indicates that some 6% of the resolvers that query authoritative name servers are unable to process a DNS response that is 1,444 octets in length.
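The per-resolver classification behind Table 5 reduces to a set operation over the query logs. A sketch, assuming the logs have been reduced to per-name sets of resolver source addresses (again an assumption about the log processing):

    def classify_resolvers(ns_query_addrs, target_query_addrs):
        # ns_query_addrs: resolver addresses seen querying for the padded
        # name server address record
        # target_query_addrs: resolver addresses seen going on to query
        # for the target name
        resolved = ns_query_addrs & target_query_addrs
        failed = ns_query_addrs - target_query_addrs
        return resolved, failed

    resolved, failed = classify_resolvers({"192.0.2.10", "192.0.2.11"},
                                          {"192.0.2.10"})
    print(len(resolved), len(failed))   # prints: 1 1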
This is a pool of 4,251 resolvers that can successfully follow an NS chain where the NS A response is small, but cannot follow the NS chain when they generate a query with the DNSSEC OK flag set and the response is 1,444 octets. Of this set of potentially problematical resolvers, some 3,110 were only seen to perform (and fail) the experiment a single time, and it is perhaps premature to categorize these resolvers as being unable to retrieve a 1,444 octet payload on the strength of a single data point. The remaining 826 resolvers exhibited this failure behaviour more than once, with the following distribution of occurrences:

  Times Seen   Resolver Count
       1            3,110
       2              533
       3              152
       4               56
       5               28
       6               17
       7                4
       8                4
       9                5
      10                1
      11                2
      12                6
      13                3
      14                1
      16                2
      17                4
      18                1
      23                1
      28                1
      30                1
      37                1
      60                1
     140                1
     370                1

Table 6: Failing resolvers - seen counts

It is reasonable to surmise that if a resolver exhibits the same behaviour two or more times in the context of this experiment, then the possibility that this is a circumstance of experimental conditions is far lower, and the likelihood that there is a problem in attempting to pass a large DNS response to the resolver is higher. In this case some 826 resolvers, or 1% of the total count of seen resolvers, fall into this category.

Of the total number of observed resolvers, some 5,237, or 6% of the total, are using IPv6. Of the 4,251 resolvers that were observed to fail to resolve the NS record, 830, or 21% of the failing resolvers, used IPv6. This indicates that there are some prevalent issues with IPv6 and DNS responses that push the IPv6 packet size beyond the 1,280 octet minimum unfragmented MTU size.

These results indicate that a 1,444 octet DNS response appears to present issues for up to 5% of all visible resolvers, which appear to be incapable of receiving the response. This is most likely the case for 1% of visible resolvers, and is possibly the case for the remaining 4%. This does not necessarily imply that 5%, or even 1%, of all end users are affected. In this experiment 7,261,042 users successfully fetched the control A record, and of these some 7,172,439 successfully fetched the test record, a difference of 1%. This implies that where individual resolvers failed, the user's DNS resolution configuration passed the query to other resolvers that successfully resolved the name.

Some 6.5% of the queries for the test name from resolvers that set the DNSSEC OK flag were made over TCP (1,211,034 queries out of 18,620,589). By comparison, for the control name, 475 queries were made using TCP, from a total of 16,451,812 queries. This 6.5% figure correlates with the EDNS0 UDP buffer size distribution, where 7% of all UDP queries in this experiment had an EDNS0 buffer size of less than 1,444 octets.

Conclusions

What can we expect when the DNSSEC-signed responses to DNS queries start to approach the maximum unfragmented packet size? This will potentially impact most users, as the majority of queries (some 90% of queries in this experiment) set the EDNS0 DNSSEC OK flag, even though a far lower number of resolvers actually perform validation of the answers that they receive. The conclusion from this experiment is that up to 5% of the DNS resolvers that set the DNSSEC OK flag in their queries appear to be unable to receive a DNS response of 1,444 octets, and within this collection IPv6 is disproportionately represented.
It is possible that this is due to the presence of various forms of middleware, but the precise nature of the failures cannot be established from within this experimental methodology. However, the failing resolvers serve a very small proportion of users: the number of users who are unable to resolve a DNS name when DNS responses of this size are involved appears to be no more than 1% of all users, and is likely to be significantly lower.

________________________________________

Author

Geoff Huston B.Sc., M.Sc., is the Chief Scientist at APNIC, the Regional Internet Registry serving the Asia Pacific region. He has been closely involved with the development of the Internet for many years, particularly within Australia, where he was responsible for building the Internet within the Australian academic and research sector in the early 1990s. He is the author of a number of Internet-related books, was a member of the Internet Architecture Board from 1999 until 2005, served on the Board of Trustees of the Internet Society from 1992 until 2001, and has chaired a number of IETF Working Groups. He has worked as an Internet researcher, an ISP systems architect, and a network operator at various times.

www.potaroo.net

________________________________________

Disclaimer

The above views do not necessarily represent the views or positions of the Asia Pacific Network Information Centre.