INTERNET-DRAFT                                              Eric A. Hall
Document: draft-hall-dns-data-02.txt                           June 2003
Expires: January, 2004
Category: Informational


                Considerations for DNS Resource Records


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

   This document discusses some common design considerations for DNS
   resource records and data models.

Internet Draft           draft-hall-dns-data-02.txt           June 2003

Table of Contents

   1.  Introduction
   2.  Prerequisites and Terminology
   3.  DNS Architectural Principles
       3.1.  Resource Records
       3.2.  Hierarchical Partitioning
       3.3.  Minimalist Messages
       3.4.  Built-In Record Caching
   4.  Inherent Design Limitations
       4.1.  Domain Name Length
       4.2.  Ambiguity
       4.3.  Incomplete Answer Sets
       4.4.  Lookups Only
       4.5.  Message-Size Restrictions
       4.6.  Unusable Compression
       4.7.  Cache Overflow
       4.8.  Cache Lag
       4.9.  World-Readable Data
   5.  Design Conclusion
   6.  Going Standards-Track
   7.  Security Considerations
   8.  IANA Considerations
   9.  Author's Address
   10. Normative References
   11. Acknowledgments
   12. Full Copyright Statement

1.  Introduction

   In terms of deployment, the Domain Name System (DNS) [STD13] is an
   extremely successful network service, having perhaps the widest
   usage of all Internet services. Unfortunately, the omnipresence of
   DNS makes it a frequent target for well-intentioned efforts to
   extend the service into roles that it is technically unsuited to
   provide, or which would impose excessive burdens on the Internet
   community as a whole if they were widely adopted.

   This document attempts to itemize some of these issues, so that
   planners and developers can try to avoid these concerns during
   their planning cycles.

   However, it should also be recognized that there are several modern
   DNS usage models which violate more than one of the considerations
   listed in this document, but which still provide significant value
   for the Internet community. As such, this document should not be
   considered as a governing device of any kind, and should not be
   used to reject any and all proposals for new usage models.
   Instead, this document is intended to be used to facilitate honest
   discussion about the kinds of problems that a particular proposal
   may be expected to encounter, or the burdens that it may impose on
   the Internet community as a whole if it were to be widely adopted.

2.  Prerequisites and Terminology

   Readers of this document are expected to be familiar with the
   following specifications:

      [RFC1034]   Mockapetris, P., "Domain names - concepts and
                  facilities", STD 13, RFC 1034, November 1987.

      [RFC1035]   Mockapetris, P., "Domain names - implementation and
                  specification", STD 13, RFC 1035, November 1987.

      [RFC1123]   Braden, R., "Requirements for Internet Hosts -
                  Application and Support", STD 3, RFC 1123, October
                  1989.

      [RFC2181]   Elz, R., and R. Bush, "Clarifications to the DNS
                  Specification", RFC 2181, July 1997.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119.

3.  DNS Architectural Principles

   The current collection of DNS specifications defines a lightweight
   and anonymous "lookup-by-name" service, with compact datagrams
   being relayed through a structured network of authoritative servers
   and caches, each of which provides access to specific database
   partitions and/or resource records.

   The Domain Name System is able to fulfill its primary
   responsibility as a fast and robust distributed naming service
   directly as a result of these design principles.

3.1.  Resource Records

   Data stored in DNS uses a common format consisting of six fields:
   "domain name", "type", "class", "time-to-live", "length" and
   "data". The "data" field is further structured according to the
   kind of data being provided by the resource record itself.
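
   As an illustration of the six fields listed above, the following
   Python sketch packs one resource record into the wire layout given
   in [RFC1035]. The function names, the example domain name and the
   address are assumptions chosen for illustration only, and name
   compression and the root name are deliberately ignored.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels plus a zero octet."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("each label must be 1 to 63 octets")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def encode_rr(name: str, rtype: int, rclass: int, ttl: int, rdata: bytes) -> bytes:
    """Lay out the six fields: name, type, class, time-to-live, length, data."""
    # type and class are 16 bits each, time-to-live is 32 bits, and the
    # length field (16 bits) counts the octets of the data that follows.
    fixed = struct.pack("!HHIH", rtype, rclass, ttl, len(rdata))
    return encode_name(name) + fixed + rdata


# Hypothetical example: an address (type 1) record in the Internet class (1).
record = encode_rr("example.com", 1, 1, 3600, bytes([192, 0, 2, 1]))
print(len(record), record.hex())
```

   Only the last field varies in structure from one record type to the
   next; everything before it has the same shape for every record.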

   The domain name, type and class fields collectively form a unique
   identifier for all resource records, and allow clients to
   specifically identify the kind of data they want to retrieve for a
   specific named resource. At a minimum, all queries must explicitly
   identify the domain name of the resource record(s) being requested.
   Queries may request all known types and/or classes associated with
   the named resource, but typically specify those fields as well.

   Multiple resource records may share the same domain name, type
   and/or class values, but those resource records must have different
   data values to be considered unique. If a query results in multiple
   matches, then all of the matching resource records will be
   returned.

3.2.  Hierarchical Partitioning

   From a high-level perspective, the DNS database is distributed
   across multiple partitions called "zones", each of which has
   ownership of a specific subset of domain names. Zones are linked in
   a hierarchical tree, with the top-level zones having zones directly
   beneath them, and with some of those zones having subordinate
   zones, and so forth.

   Although the zones are arranged in a hierarchy, each zone acts as
   an independent entity, and is usually only concerned with the
   records that it controls directly.

   The hierarchical zone structure is traversed whenever a zone which
   is authoritative for a named resource record needs to be located
   (this usually only happens when the answer has not already been
   cached), with this process continuing until either an answer or an
   error is returned. In this regard, the domain name of a resource
   record provides a lookup key which is used by the protocol to
   navigate the zone structure (this is why every query must specify
   the domain name of the resource records desired).

3.3.  Minimalist Messages

   The DNS protocol uses a highly-compact, binary message format which
   is specifically suited for fast and lightweight lookup
   transactions. There are very few spurious bits or fields in the DNS
   message (there is no "version" field, for example).

   The message format also uses a protocol-specific compression
   technique for domain names in the message itself, which further
   reduces message sizes and contributes to greater efficiencies.

   By default, DNS uses UDP to transfer messages, avoiding the latency
   and processing costs that are typically associated with TCP
   sessions. However, there are some situations in which UDP cannot be
   used, and in those cases DNS will use TCP in order to ensure that
   lookups succeed.

3.4.  Built-In Record Caching

   DNS resolvers and servers are allowed to cache resource records
   that they have discovered as part of normal query processing. This
   allows subsequent queries for that information to be answered
   immediately from the cache, without requiring another batch of
   transactions for the same information.

   In turn, this ensures that lookups are answered in the shortest
   amount of time, that servers are not excessively burdened by
   unnecessary queries, and that the total number of transactions is
   kept to a minimum.

4.  Inherent Design Limitations

   As a result of the highly-optimized lookup model, DNS has several
   critical built-in limitations. For example, DNS does not provide
   any functions to "search by value", nor does it provide mechanisms
   for user authentication, access control, cache validation, or most
   of the other services typically associated with general-purpose
   databases or directories.

   Although DNS could be extended to accommodate some of these usages,
   such an effort would require a significant amount of engineering
   labor to preserve compatibility with the existing DNS protocol and
   systems.
   Furthermore, there is a significant danger inherent in overloading
   DNS with excessive features and data such that the service itself
   becomes incapable of performing lightweight lookups quickly and
   efficiently, thereby precluding its primary purpose.

4.1.  Domain Name Length

   Domain names are restricted to a maximum length of 255 octets.
   Since a domain name is the primary identifier for a resource
   record, and since the domain name also identifies the zone where a
   resource record is stored, the length restrictions of a domain name
   can be a significant limitation in some cases.

   For example, a domain name for a resource record in a zone that is
   nested several layers deep in the global hierarchy could face
   significantly tighter space constraints than domain names for
   resource records in a top-level zone, simply because there will be
   fewer octets left to work with in the lower-level zones.

   This can be a significant concern with applications which require
   the use of application-specific domain name sequences, especially
   when those sequences are relatively long. In some cases, it may
   simply be impossible to use those sequences in some zones, given
   the space restrictions. As such, the use of application-specific
   domain name sequences should generally be avoided.

4.2.  Ambiguity

   As stated earlier, only the domain name, type and class fields can
   be specified in a lookup query. If multiple resource records match
   against these fields, then all of those resource records will be
   returned.

   Since it is not possible for a query to be any more specific than
   this, it is therefore not possible to explicitly request an exact
   resource record from among a set, unless only one instance of the
   requested resource record exists at the specified domain name.
   However, it is not possible to guarantee that a particular resource
   record will only exist in the singular form at any given time.

   Although it is possible to demand that administrators "MUST NOT"
   create more than one instance of a particular resource record for
   any domain name, such demands are usually at the mercy of the
   administrators of those systems, and are generally unenforceable.

   In short, it is not possible to guarantee that a newly-defined
   resource record will only exist in the singular form. Data models
   which depend on singular instances of a particular record should be
   designed with this issue in mind.

4.3.  Incomplete Answer Sets

   It is not always possible to be sure that all of the resource
   records will be returned in response to a query. Specifically, the
   original DNS specifications allowed each resource record in a set
   to have different time-to-live values, and this allowed (in theory)
   each resource record to be aged out of a cache at different times.
   Furthermore, there have been some secondary bugs in some
   implementations which have resulted in incomplete answer sets being
   returned and subsequently cached by other nodes.

   Although these problems have mostly been addressed over time, it is
   still not possible to guarantee with absolute certainty that all of
   the records in a set will always be returned. Data models which
   depend on spreading component data over multiple resource records
   in a set should be designed with this in mind.

4.4.  Lookups Only

   As was stated earlier, DNS currently only provides a lookup query,
   using the domain name in the query as the lookup key, and using the
   type and class fields as additional qualifiers. DNS does not
   provide any queries which would allow a resolver to search the
   entire distributed database for every resource record that contains
   a particular data value.
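
   The difference between a lookup and a search can be sketched with a
   small in-memory model. This is an illustration only: the `store`
   dictionary, the function names and the example records are
   assumptions, and a real deployment has no single place where such a
   scan could run.

```python
from collections import defaultdict

# Hypothetical stand-in for one zone's data. The key mirrors the only
# fields a query can specify: domain name, type and class.
store = defaultdict(list)


def add(name, rtype, rclass, data):
    store[(name.lower(), rtype, rclass)].append(data)


def lookup(name, rtype, rclass="IN"):
    """What DNS offers: an exact-key lookup that returns every match."""
    return list(store.get((name.lower(), rtype, rclass), []))


def search_by_value(value):
    """What DNS does not offer: this has to visit every stored record."""
    return [key for key, values in store.items() if value in values]


add("www.example.com", "A", "IN", "192.0.2.1")
add("www.example.com", "A", "IN", "192.0.2.2")
add("mail.example.com", "A", "IN", "192.0.2.1")

print(lookup("www.example.com", "A"))   # both records come back together
print(search_by_value("192.0.2.1"))     # only possible with all data in hand
```

   The lookup touches a single key and cannot select one record out of
   the set it returns, while the search must walk every key, which is
   the operation that a distributed set of independent zones cannot
   provide.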

   Although the original DNS specifications did provide a mechanism
   for searching a specific server for resource records with matching
   data values, this feature was never widely deployed, and the query
   type has since been officially deprecated.

   Similarly, DNS does not provide any means to search for all
   resource records of a particular type or class without the client
   specifying an exact domain name to match against, meaning that a
   client cannot query for all resource records of a specific type.

   In theory, it would be possible to create a super-index of all
   zones in the entire distributed database and issue these kinds of
   searches against that index, although nobody has built such a
   system as of yet. It is also possible to fake these kinds of
   searches on a per-zone basis by transferring the entire zone
   contents, and then performing local searches against all of the
   resource records. However, neither of these scenarios is normal,
   nor would either be representative of typical DNS transactions and
   client processes.

   In the absence of these mechanisms, designers must be aware that
   they can only issue queries against the name of a resource record
   unless they are willing to use something other than DNS.

4.5.  Message-Size Restrictions

   Standard DNS messages sent over UDP have a maximum message size of
   512 bytes. If a lookup results in a response message that exceeds
   the maximum message or datagram sizes, the query process must be
   restarted using TCP. However, standard DNS messages sent over TCP
   are themselves limited to a maximum size of 65,535 bytes, and
   messages which are larger than that cannot be transferred over DNS
   at all. Furthermore, not all DNS servers support the use of TCP,
   and in those cases, messages which overflow the 512-byte limit for
   UDP will also be inaccessible.
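
   A rough feel for these limits can be had from a back-of-envelope
   estimate. The Python sketch below is not a definitive calculation:
   it assumes a response with one question and an answer section only,
   assumes that each answer's owner name compresses to a two-octet
   pointer, and uses hypothetical function names and example sizes.

```python
HEADER_SIZE = 12     # fixed DNS message header, in octets
UDP_LIMIT = 512      # standard DNS over UDP
TCP_LIMIT = 65535    # standard DNS over TCP


def wire_name_length(name: str) -> int:
    """Octets used by an uncompressed name: a length octet per label, plus the root."""
    labels = name.rstrip(".").split(".")
    return sum(1 + len(label) for label in labels) + 1


def estimate_response_size(qname: str, rdata_lengths) -> int:
    """Estimate the size of a response carrying one answer per data length."""
    question = wire_name_length(qname) + 4  # name, then type and class
    # Per answer: 2-octet pointer back to the question name (assumed),
    # 10 octets of type/class/time-to-live/length, then the data itself.
    answers = sum(2 + 10 + n for n in rdata_lengths)
    return HEADER_SIZE + question + answers


def transport_for(size: int) -> str:
    if size <= UDP_LIMIT:
        return "udp"
    if size <= TCP_LIMIT:
        return "tcp"
    return "unusable"


small = estimate_response_size("example.com", [4, 4, 4])
large = estimate_response_size("example.com", [100] * 10)
print(small, transport_for(small))
print(large, transport_for(large))
```

   Under these assumptions, three address-sized records fit easily in
   a UDP response, while ten records of 100 octets each already push
   the same query past the 512-byte limit.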

   In short, messages which are larger than 512 bytes always cause
   performance problems and sometimes trigger catastrophic failures,
   while messages which are larger than 65,535 bytes always trigger
   catastrophic failures.

   Extended DNS (EDNS) [RFC2671] can carry messages up to 65,535 bytes
   over UDP, although the actual payload size is usually limited to
   1280 bytes due to limitations in physical media capacity and
   problems that arise from fragmentation. If the size of the EDNS
   message exceeds the capacity of the end-to-end link, TCP is likely
   to be needed.

   In those cases where TCP works as expected, there can be several
   penalties from its use. For example, TCP session management
   typically consumes more resources than UDP datagrams, which can
   significantly limit the number of queries that a server is able to
   process at any given time. For a particularly busy server,
   processing a significant number of TCP transactions can mean that
   other transactions will have to be rejected. Meanwhile, the use of
   TCP also requires more round-trips, which can sometimes cause
   timers to expire while the query is still being processed,
   resulting in multiple duplicate queries going to that server and
   accelerating the negative effects.

   It is also important to recognize that TCP messages are transferred
   directly between a resolver and a server, and will not use the
   caching infrastructure. As a result, any answers which are returned
   over the TCP connection will not be cached by intermediary nodes,
   meaning that the entire process will need to be repeated for each
   instance of the same query.

   For all of these reasons, planners and developers are strongly
   encouraged to limit resource record data to sizes that will not
   cause UDP messages to overflow. In those cases where this is
   unavoidable, they should be prepared for a variety of problems,
   including performance degradation and outright failure.

4.6.  Unusable Compression

   The DNS specifications provide a compression mechanism which can be
   used to substitute label sequences with pointers to previous
   occurrences of those sequences. However, older caches will not be
   aware of the data structures of new resource records, so the
   compression mechanism cannot be used within the data fields of new
   resource records. Instead, it only works with the domain name part
   of new resource records, and with the data sections of older
   resource records.

   This is an especially important consideration to keep in mind when
   considering large data structures. While it is tempting to believe
   that domain names can be compressed to save room, this simply is
   not true as often as people would like.

4.7.  Cache Overflow

   Another issue related to data size is the amount of memory
   available to a particular cache. All caches have fixed amounts of
   available memory, and when that memory is consumed, some data will
   have to be expired from the cache. If that data is needed again,
   the entire query will have to be reissued, and the cache will have
   to expire some other resource records from the memory pool in order
   to make room for the answer data. In heavily loaded environments
   (such as a very busy ISP), this can result in a constant churning
   of the memory pool.

   This is obviously a good reason to limit the size of the resource
   record data, but it is also a good reason for limiting the total
   number of resource records in a set. Since each entry will have to
   consume memory in a cache somewhere, large resource record data
   blocks and large sets of resource records will both contribute to
   the potential for cache churning.

4.8.  Cache Lag

   Since DNS is optimized for lookups, the use of caching is generally
   considered a positive feature.
   However, caching can also be somewhat hostile towards certain usage
   models, especially since DNS does not provide any mechanisms for
   forcing a system to flush its cache of previously discovered
   records.

   In particular, caches prevent data from being validated against an
   authoritative source. While this is normally beneficial for lookup
   activities, it can be a devastating feature for data models that
   require data-integrity at all times. Although DNS servers can
   dictate the maximum length of time that a resource record is to be
   held in a cache, data models which require the use of low
   time-to-live settings are generally frowned upon by the DNS
   community, as these resource records place a disproportionate
   burden on the infrastructure.

   For these reasons, DNS is generally considered to be inappropriate
   for data models which require full-time and instantaneous data
   integrity, and developers are generally encouraged to look towards
   other services if this kind of service is absolutely needed,
   especially if the application is expected to be widely deployed.

4.9.  World-Readable Data

   DNS does not provide any mechanisms for authenticating users during
   the lookup process, nor does it provide any mechanisms for linking
   access controls to a resource record across the global network of
   servers and caches. Without these features, DNS is unsuitable for
   applications which require authenticated access to private data.

   For example, certain applications require that some kinds of data
   only be made available to "authorized" users. Usually these kinds
   of applications require that servers provide an authentication
   mechanism, and that the database records have access-control
   attributes associated with them. However, DNS does not provide for
   either of these mechanisms, but instead treats all resource records
   and data as world-readable.

   Although some products provide mechanisms for restricting
   query-level access to aggregate ranges of IP addresses, it is
   important to point out that once the resource records get into a
   cache outside of the protected scope, the information is only as
   secure as that system. In this regard, a cache which resides
   outside of a firewall can be just as informative as the DNS servers
   inside the firewall.

   In the end, there is no such thing as "private" data with DNS.
   Developers must treat all data as if it will eventually be made
   public, and are strongly encouraged to use some other service if
   higher levels of security are required.

5.  Design Conclusion

   Due to the architectural tradeoffs inherent in the DNS lookup
   model, some usage models are better suited to DNS than others. In
   particular, DNS is highly efficient at lookups of compact, public
   and relatively stable data. Conversely, DNS is unsuitable for
   value-based queries or searches, restricted-access data,
   highly-dynamic data, or large records and arrays. Applications
   which require access to those kinds of data should investigate
   services such as LDAP or HTTP as being more appropriate.

6.  Going Standards-Track

   Generally speaking, planners and developers can usually define
   their own resource record types as part of a standards-track
   specification without interference from the DNS community, as long
   as the functional scope is limited to defining data-structures for
   those resource record types.

   However, there are some cases where it may be useful or necessary
   for the DNS community to be involved with the standardization of a
   particular resource record type. In particular, if a resource
   record type requires a server to perform some kind of extra
   processing other than piping data from a database into a message,
   then the DNS community should be consulted.
   Similarly, requiring that servers provide additional data outside
   the answer section of the response message should be vetted with
   the community.

   Moreover, if a specification requires special structuring of the
   message for the benefit of a single service, then the DNS community
   should definitely be involved in the discussion, since any changes
   to the highly-optimized message format could be disastrous in
   non-obvious ways.

   Requests to reserve portions of the namespace for the use of a
   single network service should also be brought to the DNS community
   for discussion.

   Finally, if a particular usage goes against more than two of the
   recommendations put forth in this document, then it would probably
   be a good idea to consult with the DNS community over any
   alternatives which may be available.

   In all cases, IANA must be involved in delegating resource record
   type codes and mnemonics.

7.  Security Considerations

   This document does not create any security considerations.

8.  IANA Considerations

   This document does not create any IANA considerations.

9.  Author's Address

   Eric A. Hall
   ehall@ehsco.com

10. Normative References

   [RFC1123]   Braden, R., "Requirements for Internet Hosts -
               Application and Support", STD 3, RFC 1123, October
               1989.

   [RFC2181]   Elz, R., and R. Bush, "Clarifications to the DNS
               Specification", RFC 2181, July 1997.

   [RFC2671]   Vixie, P., "Extension Mechanisms for DNS (EDNS0)", RFC
               2671, August 1999.

   [STD13]     Mockapetris, P., "Domain names - concepts and
               facilities", STD 13, RFC 1034 and "Domain names -
               implementation and specification", STD 13, RFC 1035,
               November 1987.

11. Acknowledgments

   Funding for the RFC editor function is currently provided by the
   Internet Society.

   Significant feedback on this document was provided by Edward Lewis
   and Walt Howard.

12. Full Copyright Statement

   Copyright (C) The Internet Society (2003). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain
   it or assist in its implementation may be prepared, copied,
   published and distributed, in whole or in part, without restriction
   of any kind, provided that the above copyright notice and this
   paragraph are included on all such copies and derivative works.
   However, this document itself may not be modified in any way, such
   as by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the
   procedures for copyrights defined in the Internet Standards process
   must be followed, or as required to translate it into languages
   other than English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on
   an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.