Network Working Group                                          V. Pappas
Internet-Draft                                                       IBM
Intended status: Standards Track                                B. Zhang
Expires: April 26, 2007                             Colorado State Univ.
                                                            E. Osterweil
                                                                    UCLA
                                                               D. Massey
                                                    Colorado State Univ.
                                                                L. Zhang
                                                                    UCLA
                                                        October 23, 2006

      Improving DNS Service Availability by Using Long TTL Values
                     draft-pappas-dnsop-long-ttl-03

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on April 26, 2007.

Copyright Notice

Copyright (C) The Internet Society (2006).

Pappas, et al.           Expires April 26, 2007                 [Page 1]
Internet-Draft     Improving DNS Service Availability       October 2006

Abstract

Due to the hierarchical tree structure of the Domain Name System [RFC1034][RFC1035], losing all of the authoritative servers that serve a zone can disrupt services to not only that zone but all of its descendants. This problem is particularly severe if all the authoritative servers of the root zone, or of a top level domain's zone, fail.
Although proper placement of secondary servers, as discussed in [RFC2182], can be an effective means against isolated failures, it is insufficient to protect the DNS service against distributed denial of service (DDoS) attacks. This document proposes to mitigate the impact of DDoS attacks against top level DNS servers by setting long TTL values for NS records and the associated A records. Our proposal involves only operational changes and can be deployed incrementally.

Table of Contents

   1.  Introduction
     1.1.  Terminology
     1.2.  Conventions
   2.  Recommendations
     2.1.  Examples
   3.  Considerations
     3.1.  Cache Coherency
     3.2.  Implementation Issues
     3.3.  Security Considerations
   4.  Acknowledgments
   5.  References
     5.1.  Normative References
     5.2.  Informative References
   Appendix A.  Measurements
     A.1.  Frequency of IRR Changes
     A.2.  Effectiveness of Long TTL Values
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

[RFC2182] provides operational guidelines for selecting and operating authoritative servers to maximize a zone's availability.
Proper placement of authoritative servers can be an effective means to guard DNS service against unintentional failures or errors, but it cannot effectively protect DNS services against intentional attacks. A distributed denial of service attack could target all of the authoritative servers for a zone, regardless of where they are placed. By disabling all of a zone's authoritative servers, an attacker can disrupt service for that zone and all the zones below it. In particular, attacks against domains such as the root, generic top level domains (gTLDs), country code top level domains (ccTLDs), and other zones serving popular DNS domains (such as co.uk. or co.jp.) could have a severe global impact.

As a countermeasure to potential DDoS attacks, some of the root, gTLD, and ccTLD servers use shared unicast addresses [RFC3258]. This approach is also called anycast, and can be effective when the number of replicated servers is large and when they are placed in diverse geographic locations. However, the use of anycast adds one entry to the global BGP routing table for each anycast-enabled server or zone. Thus it is not a generic solution that can be easily applied to all DNS zones.

In this document we propose an alternative and much simpler approach to improving DNS availability in the face of denial of service attacks: using longer TTL values for NS records and the associated A resource records (RRs). Our proposal is based on the observation that DNS caching can effectively mitigate the impact of denial of service attacks. A caching resolver consults an authoritative server only when the requested data is not already present in the cache. The cache contains both specific records such as www.example.com and infrastructure records such as the name servers for example.com. In this document, we focus primarily on the caching of infrastructure records (defined formally in the next section) and show how longer TTL values on these records can help mitigate the impact of DDoS attacks.
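
The caching behavior described above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not resolver code: the `TtlCache` and `FakeClock` classes, the zone names, and the one-day outage are all hypothetical and chosen only to show that an NS RRset with a longer TTL is still answerable from the cache while one with a shorter TTL has already expired.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class FakeClock:
    """Hypothetical controllable clock so the example needs no real waiting."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


class TtlCache:
    """Toy resolver cache: an RRset is served until its TTL has elapsed."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # (owner name, RR type) -> (absolute expiry time, record data)
        self._entries: Dict[Tuple[str, str], Tuple[float, List[str]]] = {}

    def put(self, name: str, rtype: str, ttl: int, rdata: List[str]) -> None:
        self._entries[(name, rtype)] = (self._clock() + ttl, list(rdata))

    def get(self, name: str, rtype: str) -> Optional[List[str]]:
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        expires_at, rdata = entry
        if self._clock() >= expires_at:
            # Expired: a real resolver would now have to query upstream.
            del self._entries[(name, rtype)]
            return None
        return rdata


clock = FakeClock()
cache = TtlCache(clock)
cache.put("long.example.", "NS", 432000, ["a.long.example."])   # 5 days
cache.put("short.example.", "NS", 43200, ["a.short.example."])  # 12 hours

# Assumed scenario: one day later the parent zone's servers are unreachable.
clock.now = 86400.0
long_hit = cache.get("long.example.", "NS")
short_hit = cache.get("short.example.", "NS")
print(long_hit, short_hit)
```

In this sketch the resolver can still reach long.example. during the outage because its NS RRset is cached, while short.example. would require a fresh referral from the unreachable parent.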
For example, consider the case of a successful attack against all of the DNS root servers and suppose all root servers are unavailable for some time period P. Despite the attack, resolvers can still access commonly used gTLDs and ccTLDs as long as the NS records of those TLDs and their corresponding A/AAAA resource record sets (RRsets) remain in a locally available cache during the period P. Generally speaking, the root servers are consulted only to look up top level domain entries that are not currently available in the cache. Similar arguments apply to attacks against servers of other top level domains, or any DNS domain for that matter. If the NS and associated A/AAAA RRsets for a domain are cached, an attack against higher level domains will have little or no impact on descendant domains.

1.1.  Terminology

We use the DNS terminology introduced in [RFC1034], [RFC1035], [RFC2181] and [RFC4034]. Furthermore, this section introduces some additional terminology used in this draft:

Data Resource Records (DRRs): The set of RRs that can potentially be queried by a stub resolver. In other words, RRs that are used by end-host applications. These records include almost all types of RRs, such as A, AAAA, MX, CNAME, SRV, etc.

Infrastructure Resource Records (IRRs): The set of RRs that are used only in order to (securely) resolve a zone. NS and DS RRs are by definition IRRs. The A and AAAA RRs are IRRs if and only if the name associated with the A or AAAA RR exactly matches a name in the data portion of some NS RR. An NSEC RR is an IRR if and only if its owner name is a delegation. A DNSKEY RR is an IRR if and only if it matches a DS RR or is configured as a trust anchor in some resolver. An RRSIG RR is an IRR if and only if it signs an infrastructure RRset. All other resource records defined at the time of this draft are data resource records.
Parent Zone (PZ): The zone defined right above the referenced zone.

Child Zone (CZ): A zone defined right below the referenced zone.

Authoritative Copy (AC): Some RRs (or RRsets) are defined in more than one zone. For example, the NS RRset for a zone is defined both at the zone and at its parent zone. We consider the authoritative copy of an RR to be the one that is the most authoritative, based on the rules defined in [RFC2181].

1.2.  Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2.  Recommendations

Measurement studies [IMW01CACHE] show that currently data and infrastructure RRs are treated more or less equally in setting their TTL values. The TTL values of infrastructure RRs (IRRs), NS RRsets specifically, range from as short as 0 seconds (!) to as long as one week, with most TTL values set to 12 hours or shorter. TTL values for data RRs (DRRs) exhibit similar variation, with a smaller mean. These measurement results suggest that many DNS operators do not distinguish the semantic difference between IRRs and DRRs when setting their TTL values.

In this draft we argue that IRRs and DRRs SHOULD be treated differently when setting their TTL values. IRRs are mainly used by the DNS itself and tend to be relatively stable records, while DRRs are primarily used by applications and thus tend to be more dynamic. IRRs can therefore afford to have longer TTL values. More specifically, we propose that IRRs SHOULD have longer TTL values than currently observed (12 hours or less), and we recommend that their TTL value SHOULD be on the order of days.
This specific value recommendation is based on a current measurement study (Appendix A.2), which shows that TTL values of 3 to 7 days can considerably improve the overall availability of the DNS system against denial of service attacks. We recommend this range of TTL values both for the authoritative copy of the IRRs and for the copy of the IRRs stored at the parent zone. These TTL values SHOULD be applied to both copies of the IRRs for the following reasons:

A) If they are applied only to the parent's copy, then a resolver will always replace the parent's copy with the authoritative copy (which carries the lower TTL value) whenever it receives a reply that contains or attaches the authoritative copy.

B) If they are applied only to the authoritative copy, then it is possible that a resolver will only use the parent's copy. For example, most resolvers cache the NS RRs of most TLDs from replies that usually come from the root zone. These RRs are rarely replaced by their authoritative copy, given that TLDs usually reply with referrals for their child zones (and thus they do not attach their own NS records in these replies).

2.1.  Examples

In this section we provide some example configurations for setting the TTL value of IRRs. All examples use a TTL of 432000 seconds (5 days). The first example shows the configuration of a zone (zone1.example.) when all its NS records have names that belong to the same branch of the DNS tree:

   $example.
   zone1.example.          432000  NS  a.zone1.example.
   zone1.example.          432000  NS  b.zone1.example.
   a.zone1.example.        432000  A   10.1.0.1
   b.zone1.example.        432000  A   10.1.0.2

   $zone1.example.
   zone1.example.          432000  NS  a.zone1.example.
   zone1.example.          432000  NS  b.zone1.example.
   a.zone1.example.        432000  A   10.1.0.1
   b.zone1.example.        432000  A   10.1.0.2

The following example shows the configuration of a zone (sub1.zone1.example.) when one of its servers has a name that belongs to a different branch of the DNS tree:

   $zone1.example.
   sub1.zone1.example.     432000  NS  a.sub1.zone1.example.
   sub1.zone1.example.     432000  NS  b.sub2.zone2.example.
   a.sub1.zone1.example.   432000  A   10.1.1.1

   $sub1.zone1.example.
   sub1.zone1.example.     432000  NS  a.sub1.zone1.example.
   sub1.zone1.example.     432000  NS  b.sub2.zone2.example.
   a.sub1.zone1.example.   432000  A   10.1.1.1

   $sub2.zone2.example.
   b.sub2.zone2.example.   432000  A   10.2.2.2

Finally, the following example shows the configuration for a DNSSEC enabled zone (zone2.example.):

   $example.
   zone2.example.          432000  NS      a.zone2.example.
   zone2.example.          432000  NS      b.zone2.example.
   zone2.example.          432000  DS      ...
   zone2.example.          432000  RRSIG   ...
   zone2.example.          432000  NSEC    ...

   $zone2.example.
   zone2.example.          432000  NS      a.zone2.example.
   zone2.example.          432000  NS      b.zone2.example.
   zone2.example.          432000  RRSIG   ...
   zone2.example.          432000  DNSKEY  ...
   zone2.example.          432000  RRSIG   ...
   zone2.example.          432000  NSEC    ...

3.  Considerations

3.1.  Cache Coherency

Increasing the TTL value of IRRs can cause some cached IRRs to be inconsistent with the IRRs provided by the authoritative servers for longer periods of time when these IRRs change. We believe this cache coherency problem is not an issue in most cases, for the following reasons. First, IRRs tend to be relatively stable records (Appendix A.1, [HOT05DNS]), and thus this inconsistency is expected to be a rare event. Second, IRR inconsistencies for all records except NS and the associated A/AAAA RRs can be easily identified and corrected by a resolver (by querying the authoritative zone for these RRs). The only complicated case involves inconsistencies in the NS and A/AAAA IRRs.
In such a case, the resolver is still able to correct the inconsistency as long as the NS and A/AAAA RRs have not changed for at least one name server. The resolver will eventually contact this server and will be able to replace the stale NS and A/AAAA IRRs with the new ones (assuming that these records are attached to the reply). If all NS and A/AAAA RRs have changed, then the resolver may or may not be able to correct the IRR inconsistency, depending on the resolver's implementation. Some resolvers tend to contact the parent zone in such a case (and thus correct the inconsistency), while others return an error code (and thus stay inconsistent until the RRs expire).

Given that different implementations behave differently in the case where all NS and A/AAAA IRRs change at the same time, we recommend the following for zones that implement longer TTL values for IRRs. The zone administrator SHOULD move gradually to the new IRRs by changing only some of the NS and A/AAAA RRs at a time (not all of them at once), or she/he SHOULD keep the old set of servers operational until all the old IRRs have expired at every possible resolver. Another approach that she/he MAY follow is to gradually reduce the TTL value of the IRRs before the change, and restore the longer TTL values after the change. Clearly a zone administrator MAY follow any of the above strategies when changing the NS and A/AAAA records to a completely different set.

3.2.  Implementation Issues

This document does not require or recommend any implementation changes, either in the authoritative server software or in the resolver software. On the other hand, we should point out that some changes may be beneficial when zones implement longer TTL values for IRRs. For example, resolvers can prefetch IRRs before they expire. In this way they can increase the time that these records stay locally cached.
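
A minimal sketch of the prefetch idea follows. This document does not specify when a resolver should prefetch, so the refresh threshold of 90% of the TTL, the `CachedRRset` type, and the `should_prefetch` helper are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedRRset:
    """Hypothetical cache entry for an infrastructure RRset."""
    name: str
    rtype: str
    ttl: int          # original TTL, in seconds
    cached_at: float  # time the RRset was stored, in seconds


def should_prefetch(entry: CachedRRset, now: float, threshold: float = 0.9) -> bool:
    """Return True once `threshold` of the TTL has elapsed but before expiry.

    The 0.9 default is an assumed value, not one taken from this document.
    """
    age = now - entry.cached_at
    return threshold * entry.ttl <= age < entry.ttl


ns = CachedRRset("zone1.example.", "NS", ttl=432000, cached_at=0.0)
early = should_prefetch(ns, now=100000.0)    # well inside the TTL
late = should_prefetch(ns, now=400000.0)     # past 90% of the TTL
expired = should_prefetch(ns, now=432000.0)  # already expired
print(early, late, expired)
```

A resolver that refreshes an entry while `should_prefetch` holds never lets a frequently used IRR expire, provided the authoritative servers are reachable at refresh time.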
Resolvers should also evict some of these records by using a cache replacement policy, such as a Least Frequently Used (LFU) or a Least Recently Used (LRU) policy. Another example of a beneficial implementation change is a mechanism at the authoritative server that gradually reduces the TTL value when there is an anticipated change in the IRRs.

3.3.  Security Considerations

This document prescribes an operational practice that facilitates DNS queries during prolonged outages. Such outages may result from extended DDoS attacks against key servers in the DNS. The use of long TTL values does not reduce the vulnerability of targeted servers to DDoS attacks. However, the use of long TTL values limits the effect of a DDoS attack on the global DNS. While a DDoS attack may disrupt the availability of some critical authoritative servers, the NS records for the zones that are delegated by them will be available in remote caches for much longer. Therefore, while a DDoS attack is no less likely, its scope is dramatically reduced.

4.  Acknowledgments

We would like to express our thanks to Greg Minshall for an early discussion on the feasibility of using long TTLs to improve DNS availability, to Pete Resnick for his support and the suggestion of using one week or even longer TTL values, and to Rob Austin and Patrik Faltstrom, who also provided constructive comments on our proposal.

5.  References

5.1.  Normative References

   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
              STD 13, RFC 1034, November 1987.

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2181]  Elz, R. and R. Bush, "Clarifications to the DNS
              Specification", RFC 2181, July 1997.

   [RFC2182]  Elz, R., Bush, R., Bradner, S., and M. Patton, "Selection
              and Operation of Secondary DNS Servers", BCP 16, RFC 2182,
              July 1997.

   [RFC3258]  Hardie, T., "Distributing Authoritative Name Servers via
              Shared Unicast Addresses", RFC 3258, April 2002.

   [RFC4034]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "Resource Records for the DNS Security Extensions",
              RFC 4034, March 2005.

5.2.  Informative References

   [HOT05DNS]    Handley, M. and A. Greenhalgh, "The Case for Pushing
                 DNS", HotNets, 2005.

   [IMW01CACHE]  Jung, J., Sit, E., Balakrishnan, H., and R. Morris,
                 "DNS Performance and the Effectiveness of Caching",
                 IMW, 2001.

   [SIG88DNS]    Mockapetris, P. and K. Dunlap, "Development of the
                 Domain Name System", SIGCOMM, 1988.

Appendix A.  Measurements

A.1.  Frequency of IRR Changes

To assess the stability of currently deployed DNS servers, we conducted a measurement study. From a crawl over 15 million DNS zones (the crawl was initiated at DMOZ.ORG), we randomly selected 100,000 zones and measured their infrastructure RRsets over a 4-month period. During this 4-month period we queried each of the 100,000 zones twice a day to obtain its infrastructure RRset.

Our data shows that 75% of the measured zones did not change either the NS or the corresponding A RRsets during the entire study period. 11% of the zones showed changes to their NS RRset during this 4-month period, and 5% of the zones made the changes in less than 2 months. The A records of the measured zones' servers changed more often than the NS RRsets: 22% of the zones had their servers' A records changed within 4 months, and 10% of the zones made changes to their servers' A records in less than 2 months.

All in all, our measurement results show that the current DNS servers, in the majority of the zones, are very stable.
Even in the zones that made changes during our measurement period, DNS server changes were rather infrequent.

A.2.  Effectiveness of Long TTL Values

In order to gauge the effectiveness of a longer TTL value for the DNS infrastructure records, we used a real DNS trace that was captured at a UCLA caching server over 2 weeks. Based on this trace, we simulated a DoS attack on all root and TLD servers and measured the percentage of queries that could not be resolved (excluding negative answers from the root and TLD zones), both with the current TTL values and with a hypothetical TTL value of 3, 5, 7, and 9 days for all zones. The attack duration was 3, 6, 12, and 24 hours, and the attack started on the eighth day (in simulation time). The following table shows the absolute number as well as the percentage of queries that were not resolved for each combination of attack duration and TTL value. The counts under each duration are the totals against which the percentages are computed, and the row marked "current" corresponds to the current TTL values:

+---------+---------------------------------------------------------------+
|         |                    Attack Duration (Hours)                    |
|   TTL   +---------------+---------------+---------------+---------------+
| (days)  |       3       |       6       |      12       |      24       |
|         +---------------+---------------+---------------+---------------+
|         |  7776 queries | 13799 queries | 23586 queries | 53636 queries |
+---------+---------------+---------------+---------------+---------------+
| current |  2227 - 28.6% |  3829 - 27.7% |  6807 - 28.8% | 17099 - 31.8% |
|    3    |  1132 - 14.5% |  1884 - 13.6% |  3154 - 13.3% |  7218 - 13.4% |
|    5    |   917 - 11.7% |  1530 - 11.0% |  2562 - 10.8% |  5947 - 11.0% |
|    7    |   767 -  9.8% |  1256 -  9.1% |  2092 -  8.8% |  4766 -  8.8% |
|    9    |   711 -  9.1% |  1165 -  8.4% |  1898 -  8.0% |  4157 -  7.7% |
+---------+---------------+---------------+---------------+---------------+

Clearly, by using a longer TTL value we can increase the overall system availability under denial of service attacks.
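
The relative improvement can be checked directly from the counts in the table above. The short Python sketch below does only that arithmetic; the variable and function names are ours and are not part of the measurement study.

```python
# Unresolved query counts copied from the table, keyed by attack duration (hours).
CURRENT_TTL = {3: 2227, 6: 3829, 12: 6807, 24: 17099}
SEVEN_DAY_TTL = {3: 767, 6: 1256, 12: 2092, 24: 4766}


def relative_reduction(baseline: int, improved: int) -> float:
    """Fraction by which the number of unresolved queries drops."""
    return 1.0 - improved / baseline


reductions = {
    hours: relative_reduction(CURRENT_TTL[hours], SEVEN_DAY_TTL[hours])
    for hours in CURRENT_TTL
}

for hours, reduction in sorted(reductions.items()):
    print(f"{hours:>2} h attack: {reduction:.1%} fewer unresolved queries")
```

For a 7-day TTL the reduction falls between roughly 66% and 72% across the four attack durations, growing slightly as the attack gets longer.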
The table shows that with a TTL value of seven days, the impact of such an attack on the root and TLD servers decreases by roughly 70% (from about 28-32% of queries unresolved to about 9-10%), largely independent of the attack duration. The table also shows that increasing the TTL value makes the system more resilient to attacks. Based on these results we believe that a TTL value of seven days is adequate to considerably improve the resilience of the DNS system against denial of service attacks.

Authors' Addresses

   Vasileios Pappas
   IBM Research, Watson Research Center
   19 Skyline Drive
   Hawthorne, NY  10532
   US

   Email: vpappas@us.ibm.com

   Bin Zhang
   Colorado State University, Department of Computer Science
   Fort Collins, CO  80523-1873
   US

   Email: zhangb@cs.colostate.edu

   Eric Osterweil
   University of California, Los Angeles, Department of Computer Science
   4805 Boelter Hall
   Los Angeles, CA  90095-1596
   US

   Email: eoster@cs.ucla.edu

   Dan Massey
   Colorado State University, Department of Computer Science
   Fort Collins, CO  80523-1873
   US

   Email: massey@cs.colostate.edu

   Lixia Zhang
   University of California, Los Angeles, Department of Computer Science
   3713 Boelter Hall
   Los Angeles, CA  90095-1596
   US

   Email: lixia@cs.ucla.edu

Full Copyright Statement

Copyright (C) The Internet Society (2006).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.
This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgment

Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).