INTERNET DRAFT                                                  A. Kumar
Expiration Date: May 31, 1994                                    S. Hotz
                                                               J. Postel
                                                                 USC/ISI
                                                               Dec. 1993

            Incremental Transfer and Fast Convergence in DNS

Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its Areas,
   and its Working Groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months. Internet-Drafts may be updated, replaced, or obsoleted by
   other documents at any time. It is not appropriate to use
   Internet-Drafts as reference material or to cite them other than as
   a ``working draft'' or ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   1id-abstracts.txt listing contained in the Internet-Drafts Shadow
   Directories on ds.internic.net, nic.nordu.net, ftp.nisc.sri.com, or
   munnari.oz.au.

   This Internet Draft expires May 31, 1994.

Abstract

   This memo proposes extensions to the DNS protocols to provide for an
   incremental zone transfer (IXFR) procedure. A companion mechanism,
   the NOTIFY procedure, is also proposed to allow secondaries to learn
   of changes to the primary database in a timely manner. A new DNS
   Opcode (NEWQUERY) is proposed that will provide the necessary
   upgrades to the DNS packet structure to provide for both these
   mechanisms. Further, it allows for easy upgradability, in the
   future. Two new RR types (CARRIER and ISOA) are proposed.

   This memo provides only a first cut at an attempt to document the
   ideas about these protocols and we invite extensive comments, maybe
   to revamp the entire stream of thought.

1. Introduction

   The last few years have witnessed an exponential growth in the
   number of machines in the internet, and a corresponding dependence
   on DNS. As a result, zone files have grown to near HOSTS.TXT
   proportions.
Kumar, Hotz, Postel                                             [Page 1]

INTERNET DRAFT         DNS Incremental Transfers          December, 1993

   Each zone file is maintained at a primary server. All modifications
   to this file are made at the single site and propagated to secondary
   servers using the Zone Transfer protocol [RFC 1035]. Whenever any
   change is made to the zone file, the zone administrator increments
   the SOA serial number. Secondary servers poll the primary every
   REFRESH interval, and if the serial number has changed, the entire
   zone file is transferred.

   More often than not, the change made to the zone file is a very
   small percentage of the zone file. Thus, an incremental transfer
   protocol that will propagate only the changes to the zone file may
   allow substantial savings of bandwidth overhead.

   In addition, secondaries only check to see if they are consistent
   with the primary every REFRESH period. While setting REFRESH to be a
   relatively large value reduces bandwidth overhead, there can be
   large time intervals during which at least one secondary has data
   that is inconsistent with the primary. The proposed NOTIFY mechanism
   (where the primary sends a message to known secondaries) facilitates
   fast convergence of servers vis-a-vis consistency of data in the
   zones (without requiring the overhead implied by a short REFRESH
   period).

   These two mechanisms can be used to reduce the bandwidth overhead of
   DNS while maintaining server-to-server consistency for any
   particular zone. These mechanisms could prove particularly useful if
   a DNS of the future were required to support dynamic updates (e.g.
   frequent changes to a zone, possibly from multiple entities making
   changes by sending "update packets"). Dynamic updates imply small
   database changes, and a need for fast convergence among
   authoritative servers. This memo does not specifically address a
   Dynamic Update scheme, but the IXFR and NOTIFY mechanisms were
   designed in light of possible requirements for dynamic update
   schemes.

   Three additions to the current DNS protocol (per RFC1034 and
   RFC1035) are proposed to provide for IXFR and NOTIFY, and to ensure
   that future changes to DNS are easier to incorporate: (1) a new
   Opcode "NEWQUERY" which facilitates a new, flexible query packet and
   an extensible response packet structure, (2) a new RR type called
   ISOA (and with it, a new way of defining fields in a resource
   record), and (3) a Serial Number field is added to each resource
   record.

2. Resource Record support for Incremental Transfers

   To support incremental zone transfers, an RR will now have 7 fields:
   NAME, TYPE, CLASS, TTL, a new options-style field that carries the
   serial number, RDLENGTH, and RDATA.

   The current DNS mechanism does not provide a way to identify the
   chronological history of the data in the zone files. To support
   incremental transfers, it is necessary to know when a resource
   record was added to the database (relative to other updates). Thus,
   a serial number is associated with each resource record in a zone
   file.

   The new field follows the structure <ID, LEN, DATA>, much like an IP
   or TCP options field. Any new fields that may later need to be added
   to the Resource Record structure may be added in a similar fashion,
   before the RDLENGTH and RDATA sections. A ZERO byte will be the
   terminating ID for these fields and will mark the beginning of the
   RDLENGTH field. An ID = 1 will be a NOP (it should simply be
   skipped) and will be used to align word boundaries. The data section
   of new RR types should be similarly defined to allow for easy
   addition of fields in the future.
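
   The options-style field described above (an ID octet, a length
   octet, then the data, with ID 0 as terminator and ID 1 as a NOP) can
   be walked with a short loop. The following Python sketch is
   illustrative only: the numeric value chosen for the "S_NO" ID and the
   names parse_options and encode_serial_option are assumptions, since
   the memo has not yet assigned any values.

```python
import struct

END, NOP = 0, 1   # reserved IDs named in the text
S_NO = 2          # hypothetical value; the memo leaves this unassigned


def encode_serial_option(serial):
    """Build ID="S_NO", LEN=4, a 32-bit serial, then NOP and ZERO octets."""
    return bytes([S_NO, 4]) + struct.pack("!I", serial) + bytes([NOP, END])


def parse_options(buf, offset=0):
    """Walk the ID/LEN/DATA fields; return (fields, offset after ZERO)."""
    fields = []
    while True:
        if offset >= len(buf):
            raise ValueError("options not terminated by a ZERO octet")
        opt_id = buf[offset]
        offset += 1
        if opt_id == END:          # ZERO: the next octet starts RDLENGTH
            return fields, offset
        if opt_id == NOP:          # padding octet, no LEN or DATA follows
            continue
        if offset >= len(buf):
            raise ValueError("truncated option length")
        length = buf[offset]
        offset += 1
        data = bytes(buf[offset:offset + length])
        if len(data) != length:
            raise ValueError("truncated option data")
        # Unknown IDs are collected as well; a caller may ignore them.
        fields.append((opt_id, data))
        offset += length
```

   Because every field carries its own length, a receiver that does not
   recognize an ID can still step over it, which is what makes the
   layout extensible.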

   Thus, an RR will now look like:

                                        1  1  1  1  1  1
          0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                                               |
        /                                               /
        /                     NAME                      /
        |                                               |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                     TYPE                      |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                     CLASS                     |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                      TTL                      |
        |                                               |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |      ID = "S_NO"      |     S_NO-DLEN = 4     |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                 SERIAL NUMBER                 |
        |                                               |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |         (NOP)         |        (ZERO)         |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                   RDLENGTH                    |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                                               |
        /                     RDATA                     /
        /                                               /
        |                                               |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

   In addition, a new field will be added to the server's internal
   database, associated with each RR. The "Zflag" (zombie flag) is
   necessary to convey incremental state information with respect to a
   resource record (i.e. should the RR be added to or deleted from a
   secondary's zone information). Section 2.2 details the use of
   "Zflag".

   For backward compatibility reasons, the serial number and "Zflag"
   will be propagated only with IXFR transfers. The information in IXFR
   will be propagated as tuples. A present-day DNS client has no use
   for serial number information at this point and is also not equipped
   to interpret serial numbers. Future DNS clients might want to make
   use of this information, and new query types using the opcode
   NEWQUERY could be defined that would make use of and return serial
   numbers [1].

2.1 IXFR Use of RR Serial Numbers

   The RR serial numbers must be a strictly monotonically increasing
   function. This will allow servers to differentiate between two sets
   of RRs: those added before a certain serial number, and those added
   after a certain serial number.

   To illustrate the basic scheme, for the moment consider only the
   case of adding new RRs to a zone (the more subtle cases of deletion
   and modification are considered in detail below). When an RR is
   added to a zone, a new (higher) serial number is associated with the
   newly added RR. Because RR serial numbers are monotonically
   increasing, servers can distinguish when an RR was added (relative
   to other RRs). A scheme to conserve serial number space is described
   in section 2.4.2.

   The current status of zone information held at a particular server
   is reflected by the highest serial number associated with the RRs of
   the zone. When a secondary requests an incremental zone transfer
   (IXFR), it must send its current status (highest RR serial number)
   as part of this request. The primary server can then transfer all
   resource records that have a higher serial number; consequently, the
   status of the zone information held at the primary and secondary
   will be the same.

2.2 Deleting/Modifying an existing resource record

   A modification will be treated as a deletion followed by an
   addition, thus only the deletion process is described here. [NOTE:
   If there is a requirement for modification atomicity, this would
   require a distinct operation; this could be supported by extending
   the "zombie" mechanism described below.]

   When receiving IXFR updates, a secondary must receive an EXPLICIT
   notification of deleted RRs (unlike a full AXFR scheme where all RRs
   are considered deleted unless refreshed). Hence an RR cannot be
   removed from the primary zone when it is deleted; instead it is
   modified and used as the explicit notification (to the secondaries)
   of removal. The RR is marked as a "ZOMBIE" (using the Zflag) and the
   serial number is updated as described above.
   The zombie RR is kept in memory until the primary is either sure all
   secondaries have updated zones to reflect the deleted RR or until
   the Zombie record has become sufficiently old (as per schemes
   described in the next section); during the interim the primary, of
   course, does not return the deleted RR in response to client
   queries. No server should return a zombie RR in response to a client
   query.

   As described above, the Zombie RR gets a new serial number. Hence,
   the secondary must be careful when deleting the RR from its
   database: the serial numbers on the two RRs will not match. It must,
   therefore, match all other fields of the RR before deleting it (or
   marking it Zombie in its database).

   In a sense, an IXFR contains commands of two types: one that
   specifies a new RR should be added to the zone information, and one
   to delete an RR from the zone. To converge correctly, a server
   receiving an IXFR must apply/process these commands (RRs and zombie
   RRs) in order of the RR serial numbers. Note that once a secondary
   applies a ZOMBIE RR to the zone information it holds, it does not
   need to maintain this Zombie (unless it also serves to update other
   secondaries via IXFR).

   Zombie RRs cannot be maintained indefinitely, because this would
   cause the amount of information maintained for the zone (at the
   primary) to be unnecessarily large (i.e. one does not want to
   maintain some number of ADD/DELETE pairs for a particular RR that
   could theoretically occur over time). Fortunately, there is no need
   to maintain ZOMBIE RRs indefinitely; they can be deleted when all
   servers for a zone have been notified of the deleted RR.

2.2.1 Mechanisms for Deleting "ZOMBIE" Resource Records

   There are multiple mechanisms that could be used to keep track of
   ZOMBIE RRs and when they can be deleted. Of the following schemes,
   scheme (a) might be used by itself. It can be used in conjunction
   with (b) if we do not wish to use the zone expiry time. It would not
   be advisable to use (b) by itself.
   All these alternatives present their own advantages and
   disadvantages and one may choose any of them based on system
   requirements or limitations.

   a. This scheme requires that secondary servers may sometimes be
      "forced" to take an XXFR rather than an IXFR. An XXFR is a full
      zone transfer, using the IXFR mechanism. Complete Zone data is
      sent to the client as tuples. This is described in greater
      detail in section 6.

      A primary maintains all RRs within "N" serial numbers of the
      zone's current serial number (highest valued RR serial number);
      Zombie RRs with serial numbers lower than
      (current_serial_number - N) are deleted. Any secondary server
      that sends a serial number smaller than the primary's
      (current_serial_number - N) must XXFR instead of IXFR.

      The primary will send an XXFR reply (in response to the IXFR)
      and the secondary is expected to be able to parse either IXFR or
      XXFR responses to its original IXFR query. This single simple
      transaction is designed to be more efficient than the
      alternative where the primary refuses the IXFR and the secondary
      is required to initiate a separate request for an XXFR (although
      the alternative is more characteristic of current DNS
      transactions).

   b. The primary can maintain information (state) about all
      secondaries that normally transfer zone from it. This will be
      "soft state", implying that it can be rebuilt from scratch
      should the primary server crash (the only impact being that
      ZOMBIE RRs may not be deleted as soon as they might otherwise).

      Hence, for each secondary server, the primary records the last
      serial number it transferred (on recovering from a crash, this
      number will be set to 0). When the minimum of these serial
      numbers (for all servers of a particular zone) is greater than
      the serial number on a ZOMBIE RR entry, that ZOMBIE RR can be
      removed.

      This scheme has the robustness problem that if a secondary
      crashes and never comes up again, the primary will maintain
      zombie records indefinitely. This can be solved by using this
      scheme and limiting the amount of information kept using scheme
      (a) or the zone expiry time, as described later in this section.

   Either or both of these schemes, (a) and (b), could be used, and
   this can be an implementation-specific decision so long as the
   servers can interoperate. Scheme (b) requires no additional protocol
   interaction, but scheme (a) requires that all secondary servers
   accommodate a "you asked for an IXFR *but* here is an XXFR" response
   if different implementations are to interoperate.

   While a primary could implement either approach, one should see an
   advantage by implementing both. The "soft state" method (b) will
   allow ZOMBIE RRs to be deleted as early as is possible, and the
   "refused IXFR" method (a) will place a bound on the amount of memory
   required by the primary (useful in the case where a secondary is out
   of touch for very long periods of time).

2.2.2 Soft-state and zone EXPIRE time.

   One would worry about implementing scheme (b) by itself since it
   does not allow for an escape hatch in case a secondary dies and
   never comes up again. The problem is that Zombie RRs will never be
   deleted in this situation since, as far as the primary's records
   show, the secondary might come up any time and ask for an IXFR.

   Another mechanism to bound Zombie RR lifetimes can be based on the
   observation that zone data of any kind will be useless at the
   secondary beyond the zone EXPIRE time (as specified in the SOA
   record). We can define the "ENDTIME" for each secondary as the (time
   of last transfer + SOA EXPIRE time). A zombie record can be deleted
   after the max of all secondary ENDTIMEs has passed.
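
   The ENDTIME rule just stated reduces to a maximum over the
   per-secondary soft state. The Python sketch below is a minimal
   illustration, not part of the proposal: the function names and the
   shape of the soft-state table are assumptions. It also assumes the
   EXPIRE value is recorded together with each transfer, so that a
   later change to the SOA record does not move an already computed
   ENDTIME.

```python
def zombie_delete_time(transfers):
    """Return the time after which zombie RRs may be dropped.

    transfers maps a secondary's name to a pair
    (last_transfer_time, soa_expire_at_that_transfer), both in seconds.
    Returns None when no secondary is known (nothing can be concluded).
    """
    return max((last + expire for last, expire in transfers.values()),
               default=None)


def can_delete_zombies(now, transfers):
    """True once the largest ENDTIME of all secondaries has passed."""
    deadline = zombie_delete_time(transfers)
    return deadline is not None and now > deadline
```

   With a single secondary that last transferred 20 days ago and an
   EXPIRE of 21 days, the deadline computed this way lies one day in
   the future.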

   For example, given only one secondary server X and the fact that X
   last transferred the zone 20 days ago and that the zone expiry time
   is 21 days, we can safely delete a zombie RR after one more day. The
   secondary's zone files will have expired by then and it will have to
   do a full zone transfer. However, the implementation will
   necessarily have to store timestamps with transfer records of
   secondary servers, and with zombie records, if this is the chosen
   mechanism.

   It is important to note that the zone EXPIRE time is read from the
   SOA record, and the SOA record itself is capable of changing at any
   time. Consider a situation where the last secondary transferred at
   time 0 and SOA EXPIRE read 100 at that time. Then an RR is deleted
   at t = 25 and SOA EXPIRE at that time read 50. We must keep the RR
   until t = 100 since the secondary would believe its zone file was
   good until t = 100 (0 + 100) and not until t = 75 (25 + 50). Hence,
   if using the "soft-state" scheme, per (b) above, and the zone EXPIRE
   time to bound Zombie lifetimes, then the EXPIRE time at the time of
   a secondary's last transfer must be stored.

   Thus, when an RR is marked Zombie, the delete timer is set to the
   maximum, over all secondaries, of:

      ENDTIME = last transfer time + SOA EXPIRE time at that transfer

   Thus, when computing a secondary's ENDTIME, we always use the SOA
   EXPIRE time that was in effect at that secondary's last transfer,
   not the value in the SOA record at the time of deletion. This
   ensures that a Zombie RR is kept for as long as a secondary can ask
   for it.

   This scheme could be used independently of scheme (b) in 2.2.1, but
   since we will be keeping some sort of soft state anyway, we might
   just as well use scheme (b) for efficiency reasons.

2.3 The Role of the SOA Serial Number

   Since IXFR-capable servers are likely to be used together with older
   servers during a transition period (of unfortunately indefinite
   length), the SOA serial number functionality must be preserved for
   backward compatibility.
   This implies that the SOA serial number must be changed each time
   the zone is updated. The simplest solution is for the SOA serial
   number to reflect the highest RR serial number.

   This issue is not difficult in the context of simply accommodating
   incremental zone transfers, since a zone file (and SOA serial
   number) will necessarily be updated when RRs are added, deleted, or
   modified. However, there is other ongoing work that is addressing
   mechanisms for dynamically updating zone information; it would be an
   advantage if the IXFR scheme considered such mechanisms, and was
   designed to accommodate the additional complexities introduced by
   dynamically updated zone information.

   If the use of dynamic updates is ignored, the SOA serial number
   could be updated in the same manner as it is today. In fact, the
   manually-updated SOA serial number could be assigned to each new,
   modified, or deleted/zombie RR. Other implementation-specific
   schemes could be used to derive SOA serial numbers and maintain the
   relationship between SOA serial number and RR serial numbers.

   Further, incremental information could be entered manually by the
   system administrator, to maintain simplicity of working. Strict
   consistency checks must be made, in such a case, to ensure that data
   follows the logic of serial number space.

2.3.1 SOA Serial Numbers in light of Dynamic Updates

   The IXFR scheme essentially obsoletes the function of the SOA serial
   number, replacing it with finer-granularity serial numbers applied
   to RRs (this is especially true if dynamic updates to the zone are
   made). The highest-valued RR serial number now reflects the status
   of the zone information.

   In any dynamic update scenario, new RRs will be added, old ones
   deleted or modified, on-line. Without going into specifics, the
   crucial point is that zone information will not always change due to
   a zone file update.
   Hence, RR serial numbers will be assigned dynamically and since the
   SOA serial number will reflect some function of the highest RR
   serial number (probably equal), the SOA serial number will also
   change, independent of a zone file update.

   The implication (which may be disturbing to the old guard of DNS
   administrators) that the SOA serial number will not be updated
   exclusively in the zone file raises other issues. To preserve zone
   integrity, the changes to the SOA serial number made dynamically
   must take precedence over manually-updated SOA serial numbers. All
   changes, manual or "automatic", must be properly serialized.

   In summary:

   (a) The function of the SOA serial number is replaced by the highest
       RR serial number. The SOA serial number must also reflect
       changes to the zone for backward compatibility; some mapping
       from RR serial numbers to SOA serial is required (the simplest
       being that they are the same).

   (b) A manual update of a zone file cannot specify an SOA serial
       number which conflicts with (is smaller than) the SOA serial
       number that reflects dynamic changes to the zone. All updates
       need to be properly serialized.

2.4 New RR Serial Number Generation

   This is an implementation-specific issue, as long as the function is
   monotonically increasing, and the constraints imposed by the
   relationship between RR serial numbers and the SOA serial number are
   met (see Section 2.3).

   One obvious function is to maintain a sequence number counter, which
   is incremented each time a new update happens. An update can be the
   addition/deletion of a single RR or a set of RRs. Any set of
   additions/deletions happening together (via a file edit or the
   admission of a dynamic update packet) is termed an "update" and all
   RRs affected by that are assigned the same serial number.

   A simple alternative (especially in light of dynamic updates) is to
   use a timestamp (seconds since the epoch) as determined at the
   primary server.
   The primary advantage (other than simplicity) is speculative; if the
   actual time of update is available to clients, the DNS system, or
   network administrators, it might serve some useful function in the
   future (e.g. perhaps network administrators can identify
   irregularities in a zone by examining RR serial numbers and applying
   a heuristic that takes into account the expected turn-over rate for
   the zone). As with the counter scheme described in the previous
   paragraph, we propose that a single timestamp value (possibly the
   time at the beginning of the operation) be used for one "update".

   Section 2.3 implies that manually-updated SOA serial numbers may not
   be possible in the future, hence the semantics attached to them by
   many administrators today (i.e. YYMMDDHH) may also be less clear.
   The use of timestamp-based serial numbers might be an appropriate
   replacement.

   The disadvantage of timestamps is that they might not be compatible
   with the existing serial number space. A future timestamp might be
   much smaller (numerically) than a serial number associated with a
   given zone today. This could, of course, be resolved by the [one
   time] shut down of all servers for a domain, deleting the zone
   backup files from all secondaries and starting afresh, with a new
   serial number space based on timestamps. Of course, secondaries
   might be restarted from scratch before they actually attempt to
   transfer zone from the primary; it does not have to be simultaneous.

2.4.1 Atomic actions and serial numbers.

   In a large zone, even an incremental transfer might involve a few
   hundred (possibly thousand?) operations. Unless we can break the
   stream of RRs being added and deleted at intermediate, consistent
   states, the server might be out of commission for a long time while
   processing an IXFR. Thus, there is a need to identify checkpoints,
   while processing IXFRs, where the database is in a consistent state.
   We claim that when we finish processing RRs that bear the same
   serial number, we are in a consistent state. This is entirely a
   function of the way in which we generate serial numbers. Any set of
   changes that happened atomically (an update as we defined it above)
   will take the database from one consistent to another consistent
   state. And since we assign the same serial number to such changes,
   processing one serial number completely will take the database to a
   consistent state, given a consistent starting point.

2.4.2 Reducing Sequence Number Rollover and Associated Problems

   If a simple counter-based sequence number is used (rather than
   timestamp-based), the sequence number may not necessarily be updated
   each time a new RR is added. A scheme to conserve the use of serial
   number space could be based on whether other servers have received
   updates recently:

      If (zone transferred by anybody)
          set(NewNumNeeded);

   and when a new RR is added:

      If (NewNumNeeded) {
          currentserial++;
          reset(NewNumNeeded);
      }
      RR->serialNum = currentserial;

   Thus, all RRs added between any two transfers get the same serial
   number, thereby saving some amount of serial number space.

   However, we must be careful when using this scheme. Consider the
   scenario where a particular RR is created and deleted during the
   time when the same sequence number is being used (i.e. no secondary
   transferred zone between the two operations). We could keep this
   Zombie RR around and transfer it out to secondaries. However, we
   notice that since the addition and deletion happened without the
   changes being visible to any secondary, we can safely delete this RR
   without ever letting it go to Zombie state. If we do not do this
   (i.e. we let it become Zombie) and the RR is added again, we run
   into problems. Now we have a good RR and a Zombie RR, both bearing
   the same serial number and the same data.
   The order of processing at the secondaries will now determine what
   the resulting database looks like. Thus, we recommend, especially
   when using this sequence number rollover prevention scheme, that if
   an RR is created and deleted without any zone transfers in between,
   the RR not be marked Zombie but be deleted right away.

3. ISOA, the new RR type.

   System administrators (for whatever reason) might not be comfortable
   depending exclusively on incremental transfer to maintain zone
   consistency. If this is the case, it can be resolved by configuring
   periodic "checkpoints" where full zone transfers are done.

   Thus, after REFRESH seconds, secondaries will use IXFR to transfer
   incremental data. In addition, secondaries will do a complete zone
   transfer every XXFRTIME seconds (typically at least an order of
   magnitude greater than REFRESH) using the method described later in
   this document (section 5).

   To specify the XXFRTIME, an additional RR type is defined (so we are
   backward compatible with existing DNS implementations). It has
   symbolic name "ISOA" and numeric value xxx. The data section of the
   ISOA record will be a 32 bit integer value with associated
   sub-fields, as described below.

3.1 ISOA structure

   We model the structure of the ISOA data section on the "OPTIONS"
   field from the TCP and IP protocol descriptions. Following the
   RDLENGTH is the RDATA field in the DNS RR. The RDATA field will now
   be structured to allow for future growth. A field in the RDATA
   section will have three sub-fields: <ID>, <LEN> and <DATA>.

   The <ID> will be an 8 bit integer value specifying what the field
   is. The <LEN> is another 8 bit integer that gives the length of the
   data field, and the <DATA> field contains the actual data octets. We
   restrict the number of IDs to 256 and, since <LEN> is 8 bits wide,
   the number of data bytes in any field to 255, but we believe these
   are acceptable limits.

   IDs "0" and "1" will be reserved. "1" will be a filler octet, used
   to mark NOPs to align fields with word boundaries. "0" will imply
   end of data. Both these IDs will have no <LEN> or <DATA> fields.

   Thus, given that we need only one field right now, we will assign it
   ID = "XXTIME" which has numerical value xxx. The ISOA record will
   then look as follows:

                                        1  1  1  1  1  1
          0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |     ID = "XXTIME"     |     XXFR-DLEN = 4     |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                   XXFRTIME                    |
        |                                               |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |         (NOP)         |        (ZERO)         |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

   We expect this kind of a structure to be used extensively in RR
   definitions of the future. This allows for future extension of the
   RR by addition of new fields and so provides for easy upgradability.

4. Structures to support easy upgradability

   We provide for a flexible Query/response structure so that future
   additions of fields to the query section are easy and do not require
   a software upgrade. This is achieved through a new, flexible Opcode
   in the DNS header called "NEWQUERY" that allows for a new structure
   to the DNS query section and the response sections.

   Also we propose a new RR type called CARRIER, that exists only as a
   transitory entity. It carries a payload of RRs that are delivered to
   the querying agent. The carrier RR is destroyed once its payload has
   been delivered.

4.1 NEWQUERY

   A new Opcode "NEWQUERY", numerical value xxx, is assigned.
   A NEWQUERY query section will have the following structure:

                                        1  1  1  1  1  1
          0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                                               |
        /                     QNAME                     /
        /                                               /
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                 QTYPE = IXFR                  |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                    QCLASS                     |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |          ID           |          LEN          |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                                               |
        /                  FIELD DATA                   /
        /                                               /
        |                                               |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |         (NOP)         |        (ZERO)         |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

   Of course, we can have as many <ID, LEN, DATA> tuples as needed.
   This is what gives the NEWQUERY structure its flexibility. ID = 1 is
   a NOP and ID = 0 is an "end of data" marker. Both these IDs have no
   length and data fields associated with them. Servers that do not
   recognize an ID can skip that field completely and process a query
   based on the rest of the data provided.

   Beyond this necessary upgrade, we do not foresee any further changes
   to the DNS query section to support changes in future query
   specifications.

   Currently, we need only a serial number field to be added to a
   query, so the query section will be seen as:

                                        1  1  1  1  1  1
          0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                                               |
        /                     QNAME                     /
        /                                               /
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                     QTYPE                     |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                    QCLASS                     |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |      ID = "S_NO"      |        LEN = 4        |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                 SERIAL NUMBER                 |
        |                                               |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |         (NOP)         |        (ZERO)         |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

   In response to a "NEWQUERY" query, RRs that are sent will contain
   the serial number field, as described above, and any other field
   that might be added on to the RR at a later stage.
   Thus, any application that needs to see the serial number of an RR
   will have to implement the NEWQUERY opcode. Existing DNS software
   will return "NOT IMPLEMENTED" response codes to all these queries.

4.2 "CARRIER" resource record

   The CARRIER resource record is unique in that it exists only in a
   response packet. Once it has been processed by the receiving end and
   its payload delivered, it is destroyed. The CARRIER RR has the
   following structure:

                                        1  1  1  1  1  1
          0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                                               |
        /                                               /
        /                     NAME                      /
        |                                               |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                TYPE = CARRIER                 |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                     CLASS                     |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                    TTL = 0                    |
        |                                               |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |    ID = "NUM_RRs"     |     NUM-DLEN = 4      |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |             NUMBER OF PAYLOAD RRs             |
        |                                               |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |         (NOP)         |        (ZERO)         |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                   RDLENGTH                    |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        |                                               |
        /                     RDATA                     /
        /                                               /
        |                                               |
        +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

   Note that since a serial number field doesn't make any sense with a
   carrier record, we do not have to have it there. The RDATA section
   will contain RRs as <Operation, RR> tuples. The Operation field
   specifies what the receiving end should do with the RR. Thus, for an
   IXFR it will be either an "add" operation or a "delete" operation.
   Other operations might be needed for other future queries and we
   intend for Operation to be a 1-octet field, allowing for 256
   operations. We reserve Operation = 0 to specify that this field has
   no meaning for this query type. The numerical value for the RR type
   is xxx.

   Thus, using the new Opcode NEWQUERY and the new RR type CARRIER, we
   will define IXFR and NOTIFY mechanisms.
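
   To make the add/delete semantics of a CARRIER payload concrete, the
   Python sketch below applies a decoded payload to a secondary's copy
   of a zone. It is an illustration under stated assumptions: the
   operation codes 1 (add) and 2 (delete) are hypothetical (the memo
   reserves only 0), the RR tuple and the function names are invented
   for the example, and wire decoding is left out. Deletes match on
   every field except the serial number, because a zombie RR carries a
   newer serial number than the RR it removes (section 2.2).

```python
from collections import namedtuple

RR = namedtuple("RR", "name rtype rclass ttl serial rdata")

OP_NONE, OP_ADD, OP_DELETE = 0, 1, 2   # 1 and 2 are hypothetical values


def match_key(rr):
    """Identity of an RR for deletion: all fields except the serial."""
    return (rr.name.lower(), rr.rtype, rr.rclass, rr.ttl, rr.rdata)


def apply_payload(zone, payload, current_serial):
    """Apply (operation, RR) pairs in serial order; return new status.

    zone is a dict keyed by match_key(); it is modified in place.
    The returned value is the highest RR serial number seen so far,
    which is what the secondary would send in its next IXFR query.
    """
    for op, rr in sorted(payload, key=lambda item: item[1].serial):
        if op == OP_ADD:
            zone[match_key(rr)] = rr
        elif op == OP_DELETE:
            zone.pop(match_key(rr), None)
        else:
            raise ValueError("operation %d has no meaning for IXFR" % op)
        current_serial = max(current_serial, rr.serial)
    return current_serial
```

   The status returned after a delete is the zombie's serial number,
   even though the zombie itself is not stored; otherwise the next IXFR
   would ask for the same deletion again.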
5. IXFR, The Actual Mechanism

An IXFR query packet will, necessarily, contain the highest RR serial
number the secondary last saw. Thus, the DNS IXFR query packet will
look like this.

                                        1  1  1  1  1  1
      0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                                               |
    /                     QNAME                     /
    /                                               /
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                 QTYPE = IXFR                  |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                     QCLASS                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |      ID = "S_NO"      |        LEN = 4        |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                 SERIAL NUMBER                 |
    |                                               |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |         (NOP)         |        (ZERO)         |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Note that QNAME, QTYPE and QCLASS are exactly as for any standard query
type. SERIAL NUMBER is the serial number the secondary must convey to
the primary so the primary can send it all entries updated since that
serial number was seen. The ID "S_NO" is assigned numerical value xxx.

The response packet will, in turn, contain one or more CARRIER RRs that
contain payload RRs, as described above, in serial order. To make
atomicity checks simple, each CARRIER RR could carry all RRs bearing
the same serial number. Thus, with each CARRIER RR processed, the
system goes from one consistent state to the next. Note that this is an
implementation specific issue.

5.1 The Client Side

The client (secondary) sends an SOA query to the server (primary) and
compares the serial number returned by the server with its copy.

    If (current_client_soa# > latest_server_soa#) {
        signal a possible error;
        QUIT;
    }
    If (current_client_soa# == latest_server_soa#) {
        exit gracefully;
    }
    If (current_client_soa# < latest_server_soa#) {
        transmit_IXFR_query(current_serial_number);
        destroy_zone_file();
        receive_IXFR_response(&buf);
        update_zone_data(hashtab);
        recreate_zone_file(backup_zone_file);
    }

Note that we destroy the zone file before we actually attempt to
receive any data from the server. Otherwise, if we receive the data and
crash before we are able to update our zone file, the primary (if it
keeps soft state) will believe that we have the latest data when, after
recovery, we actually do not. By destroying the file first, we ensure
that we either have the latest data or have nothing, so that on
recovering from a crash we initiate a full zone transfer from the
server. The "recreate_zone_file()" operation needs to be atomic or, at
least, if the server crashes midway through the operation, it should be
able to detect this when restarting.

5.2 The Server

The server follows this simple algorithm when processing IXFR queries.

    1. current_serial_from_client -> csn;
       Record (csn, soft_state_structure);

    2. If (csn == highest RR serial) {
           send empty response;
           exit;
       }

    3. If (csn < highest RR serial - N) {
           send XXFR response;
           exit;
       }

    4. for (each entry in zone file)
           if (csn >= entry_serial_number)
               continue;
           else
               add to IXFR packet;
               /*
                * If (Zflag = 1)
                *     Operation = Delete;
                * else
                *     Operation = Add;
                */
       endfor;
       Transmit IXFR packet to client.
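
The server steps above can be sketched in Python as follows. This is
only an illustration under stated assumptions, not a definitive
implementation: the Entry class, the function build_ixfr_response and
the numeric Operation values are hypothetical (the memo assigns none),
the deleted flag stands in for Zflag, an XXFR response is modeled as
all live entries sent as adds, and serial number wrap-around is
ignored.

```python
from dataclasses import dataclass

OP_ADD = 1     # ASSUMPTION: the memo assigns no numeric Operation values
OP_DELETE = 2  # ASSUMPTION: likewise hypothetical


@dataclass
class Entry:
    name: str
    serial: int            # serial number at which this RR last changed
    deleted: bool = False  # stands in for the Zflag of the pseudocode


def build_ixfr_response(zone, csn, n=None):
    """Return (kind, changes) for a client whose serial number is csn.

    kind is "empty", "xxfr" or "ixfr"; changes is a list of
    (operation, entry) pairs in serial order. n=None models scheme (a),
    where step 3 is skipped.
    """
    highest = max((e.serial for e in zone), default=0)
    if csn == highest:                        # step 2
        return "empty", []
    if n is not None and csn < highest - n:   # step 3, scheme (b) only
        live = [e for e in zone if not e.deleted]
        return "xxfr", [(OP_ADD, e) for e in live]
    changes = []                              # step 4
    for e in sorted(zone, key=lambda e: e.serial):
        if csn >= e.serial:
            continue
        changes.append((OP_DELETE if e.deleted else OP_ADD, e))
    return "ixfr", changes


zone = [
    Entry("a.example.", 5),
    Entry("b.example.", 7, deleted=True),
    Entry("c.example.", 9),
]
kind, changes = build_ixfr_response(zone, csn=5)
summary = [(op, e.name) for op, e in changes]
```

With csn = 5 the client receives a delete for b.example. and an add
for c.example., in that order. Sorting by serial number keeps the
payload in serial order, which a real server might instead get for
free from how it stores the zone.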

Thus, a server sends incremental transfers only if the csn from the
client falls within N of the present value of the SOA serial number.
Otherwise, the server sends the XXFR packet and the secondary must
expect and be able to interpret that packet. Of course, this need be
done only if scheme (b) from section 2.2.1 is being followed. If scheme
(a) is used, step 3 is skipped.

6. XXFR

Once every XXFRTIME (section 3), the IXFR query packet will be:

                                        1  1  1  1  1  1
      0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                                               |
    /                     QNAME                     /
    /                                               /
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                 QTYPE = IXFR                  |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                     QCLASS                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |      ID = "S_NO"      |        LEN = 4        |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |               SERIAL NUMBER = 0               |
    |                                               |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |         (NOP)         |        (ZERO)         |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Note that the SERIAL NUMBER field equals the value "0". This prompts
the server to send an IXFR packet that contains all the zone data, i.e.
it is a full zone transfer, using the IXFR mechanism. The system could
use this to synchronize complete zone files at regular XXFRTIME
intervals. As a side effect, the serial number "0" is reserved, and
zones cannot have RRs with serial number "0".

7. NOTIFY

Currently, a secondary always waits "REFRESH" seconds before polling
the primary for any changes in the zone. If a primary makes changes
(possibly important ones) and wants all secondaries to reflect these
changes immediately, it has no means of telling the secondaries. A
mechanism must be available to notify the secondary that it might
benefit from a zone transfer, right away if possible. We propose a new
procedure, "NOTIFY", to fulfill exactly this need.
When the database is updated, the primary sends a NOTIFY packet to the
secondaries. This packet contains the SOA record for the zone and
informs the secondary that it might benefit from a transfer. The
secondary can choose not to transfer if it is heavily loaded at that
moment. The notification could be turned on, on a per-zone basis, and
might need a new bootfile parameter (NOTIFY/NONOTIFY) with the
primary/secondary entry.

This mechanism will be particularly useful in dynamic update situations
where servers might need to converge to a common state, fast.

The NOTIFY packet looks like this:

                                        1  1  1  1  1  1
      0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                      ID                       |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |QR| "NEWQUERY"|AA|TC|RD|RA|   Z    |   RCODE   |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                  QDCOUNT = 0                  |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                  ANCOUNT = 1                  |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                  NSCOUNT = 0                  |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                  ARCOUNT = 0                  |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    /                                               /
    /                CARRIER RECORD                 /
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

The CARRIER record in the response section contains the SOA record,
with the Operation = NOTIFY (numerical value = xxx).
Thus, it looks like:

                                        1  1  1  1  1  1
      0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                                               |
    /                                               /
    /                      NAME                     /
    |                                               |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                TYPE = CARRIER                 |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                     CLASS                     |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    TTL = 0                    |
    |                                               |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |    ID = "NUM_RRs"     |     NUM-DLEN = 4      |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |           NUMBER OF PAYLOAD RRs = 1           |
    |                                               |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |         (NOP)         |        (ZERO)         |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                   RDLENGTH                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    | Operation = "NOTIFY"  |                       |
    +--+--+--+--+--+--+--+--+                       /
    /                  SOA RECORD                   /
    /                                               /
    |                                               |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

A mix of polling (current mechanism) and low-priority interrupts from
the primary may be considered as the mechanism for zone transfers.

7.1 Unregistered Secondaries

The primary server must be aware of the presence of all secondaries,
including those that aren't registered ("registered" servers are those
servers whose names are returned by DNS in response to an NS query for
that zone), in order to send them NOTIFY messages. Note that this
information is also needed if the primary is maintaining "soft state"
(per the mechanism in section 2.2.1(a)). Once again, there are two
different means of doing this:

a. Automatic Registration

   Any entity that initiates any kind of transfer (IXFR/AXFR) is
   identified as a potential candidate for notification and for the
   purpose of keeping soft state (as described in section 2.1.2). The
   IP address of this entity may be recorded against the zone it
   transferred. Old entries (more than some number of REFRESH periods
   old, say 10*REFRESH) may be timed out.
   While this may be convenient, it also could mean that a server
   wastes time trying to NOTIFY any number of machines because of
   client-level zone transfer requests (a la 'dig', 'nslookup', etc.).

b. "my-secondaries"

   We keep a list of "my-secondaries" servers that are not advertised
   via the DNS but are known servers for our domain (possibly serving
   local resolvers). Since the system administrator typically knows
   about unregistered secondaries, this merely formalizes their
   existence (and should not require a burdensome amount of additional
   configuration effort). Such secondaries are usually local machines
   on a campus or organization premises (and within the domain) that
   serve as name servers for local queries, keeping the load on the
   registered secondaries small.

   We propose an entry in the bootfile of each server that expects to
   see IXFR queries from other servers that reads:

       my-secondaries zone server1, server2, ....., serverN

   This is a list of secondaries that would normally transfer the zone
   from this server. This entry may immediately follow the entry that
   says,

       primary zone datafile

   or

       secondary zone primary

   Thus, "my-secondaries" servers are associated with a given zone and
   could transfer from either a secondary or the primary server. A
   secondary that serves as source for zone data for other secondaries
   therefore needs to maintain such a list, like the primary.

In either case, please note that we do not address the issue of cached
data at other servers. TTL values could be used to ensure that data is
not cached for longer than it is likely to stay valid.

7.2 Timing and Security issues.

The notification procedure requires that we ensure the following:

a. There must be a minimum time between notifications. This prevents a
   malicious primary from bogging the secondary down with back-to-back
   notifications. The secondary must maintain a timer (optionally) to
   enforce such restrictions.

b. A secondary must transfer the zone within a maximum interval (the
   existing REFRESH mechanism should suffice). This ensures the state
   is not inconsistent for more than a fixed maximum interval.

c. A modification notification should be accepted only if it comes
   from the primary or the server you normally transfer from. A
   malicious network entity could pose as a primary and transfer
   incorrect data to you. This does not really solve the problem of
   impersonation, since a masquerading entity could just as well act
   as the primary when sending the NOTIFY message. It just provides an
   additional hurdle: somebody sending you a NOTIFY needs to
   impersonate the primary.

d. When a notification is accepted, the zone should be transferred
   only from the primary or the server you normally transfer from.

8. Performance Issues

DNS, like any other replicated, distributed system, has various
parameters that can be tuned to get the desired nature of performance.
For example, a shorter REFRESH cycle ensures a faster convergence among
authoritative servers, at the cost of extra network bandwidth used in
transferring zone data. Thus, system administrators tune this figure to
the desired mix of consistency and bandwidth usage.

Similarly, the TTL associated with each RR presents a trade-off between
how often you query an authoritative server and how current your data
is. A low TTL keeps the data current at the cost of extra network
traffic, while a high TTL conserves bandwidth but allows the data to
become inconsistent. System administrators balance the two requirements
and choose a reasonable TTL.

Similar figures in the above scheme, such as the frequency of NOTIFY
acceptance, the TTL of dynamic data and possibly the number of
secondaries allowed, present trade-offs that need to be made to tune
performance. A higher rate of NOTIFY acceptance will imply greater
network traffic but very speedy convergence.
A lower figure will conserve network bandwidth but will allow for data
to be inconsistent for longer. A single server serving a zone will
imply instantaneous convergence but will provide very low availability
and reliability. This might be all right for a low traffic volume zone.

We do not, by way of the IXFR and NOTIFY mechanisms, hope to provide
servers that converge instantaneously with minimal traffic. Studies in
the future will show how effective these mechanisms will be and how
best the various parameters can be used to tune the performance.

9. Acknowledgements

We express our sincere thanks to Don Lewis, Paul Mockapetris, Clifford
Neuman, Masataka Ohta, Sue Thomson, Paul Vixie and Philip Wood for
their comments on earlier versions of this draft.

10. References

[1] Thomson, S., "Timestamped Queries", Internet Draft.

11. Authors' Addresses:

   Anant Kumar
   USC Information Sciences Institute
   4676 Admiralty Way
   Marina Del Rey CA 90292-6695

   Phone: (310) 822-1511
   FAX:   (310) 823-6714
   Email: anant@isi.edu

   Steve Hotz
   USC Information Sciences Institute
   4676 Admiralty Way
   Marina Del Rey CA 90292-6695

   Phone: (310) 822-1511
   FAX:   (310) 823-6714
   Email: hotz@isi.edu

   Jon Postel
   USC Information Sciences Institute
   4676 Admiralty Way
   Marina Del Rey CA 90292-6695

   Phone: (310) 822-1511
   FAX:   (310) 823-6714
   Email: postel@isi.edu

This Internet Draft expires May 31, 1994.