IAB                                                         B. Carpenter
Internet Draft                                                R. Austein
February 2002                                                  (editors)

     Recent Changes in the Architectural Principles of the Internet

                     draft-iab-arch-changes-00.txt

Copyright Notice

   If published as an RFC this document will become Copyright (C) The
   Internet Society (2002). All Rights Reserved.

Abstract

   In 1996, the IAB published RFC 1958, entitled "Architectural
   Principles of the Internet." The Internet has grown and evolved
   since then, and this document records the impact of this evolution
   on the principles laid down previously. This first draft is issued
   for discussion by the IETF community.

Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

IAB                      Expires August 2002                    [Page 1]

Internet Draft   Changes in the Architectural Principles   February 2002

Table of Contents:

   Status of this Memo.............................................1
   1. Introduction.................................................3
   2. The End-to-End Argument and Middleboxes......................3
   2.1. NATs as a Special Case of Middleboxes......................4
   3. Internationalisation.........................................5
   4. Overextension................................................6
   5. A New External Issue.........................................7
   6. 
Security Considerations......................................7
   Non-normative References........................................7
   Acknowledgements................................................8
   Editors' Addresses..............................................9
   Full Copyright Statement........................................9

1. Introduction

   In 1996, the IAB published RFC 1958, entitled "Architectural
   Principles of the Internet" [RFC 1958]. The Internet has grown
   dramatically and evolved substantially since then, and this document
   attempts to record the impact of this evolution on the principles
   laid down previously. Indeed, the very first section of [RFC 1958]
   is entitled "Constant Change" and it was expected that the document
   would require revisiting.

   Although the principles are still fundamentally sound, some of them
   need additional explanation and refinement to match current
   realities. Three main areas are of concern. Also, one new principle
   has arisen.

   From here on, it is assumed that the reader is familiar with
   [RFC 1958]. Principles that are not mentioned in the present
   document are still regarded as fully valid.

2. The End-to-End Argument and Middleboxes

   The description of the end-to-end argument in [RFC 1958] does not
   take into account the concerns about transparency [RFC 2775,
   RFC 2956] and the related issues surrounding NATs [RFC 2993,
   RFC 3022, RFC 3027] and intermediaries or middleboxes [RFC 3234,
   RFC 3040]. These topics have stimulated much activity in the IETF in
   recent years, and this work has attracted some attention in other
   forums [Clark].

   We can state that the traditional end-to-end argument still applies,
   both as an abstract design principle, and concretely to all cases
   where a single transport connection (unicast or multicast) is
   considered.
   However, in cases where an application data flow passes through
   multiple transport connections and may even be modified en route by
   middleboxes, the argument applies in a more subtle way.

   In the presence of middleboxes, it is no longer possible to assume
   that because TCP is reliable, the end-to-end path is reliable. Also,
   state in the middleboxes may be lost during failures. To obtain
   reliability and robustness, more sophisticated techniques than those
   employed by TCP are needed, possibly approaching the complexity of
   transactional two-phase commit. Similarly, IP or Transport Level
   security no longer guarantees end-to-end security. Security
   mechanisms above the network itself are needed in addition.

   One way to look at the problem is that users do still care very much
   about the end-to-end principle, but only at layers where they
   understand how preservation of semantics matters to them. For
   example: loss of TCP's end-to-end data integrity would matter a
   great deal to most users, but loss of an ability they've never used
   (such as a new, genuine peer-to-peer application that requires full
   transparency) would be invisible. So the end-to-end principle is
   still valid, but its realisation cannot be limited to the transport
   layer.

   More concretely, in the presence of middleboxes, applications are
   often forced to use application-layer handshakes to ensure
   reliability and robustness. See SMTP delivery confirmation and
   stateful SIP proxies [RFC 2543] as examples. This is the end-to-end
   principle, recursed further to the "real" ends to cover cases where
   it no longer applies to an end-to-end IP path. In some cases even
   application state may be temporarily stored in the middle of the
   network.
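
   As a purely illustrative sketch of the application-layer handshake
   described above (not a description of SMTP or SIP), the following
   Python fragment models two middleboxes that each terminate the
   transport connection. Each hop accepts data locally, but may lose it
   before forwarding. The sender therefore retries until it sees an
   acknowledgement generated by the final receiver. All class and
   function names here are hypothetical.

```python
# Hypothetical sketch: Receiver, Relay and send_with_end_to_end_ack are
# illustrative names, not taken from any protocol specification.


class Receiver:
    """The "real" end: stores each message once and returns an ack."""

    def __init__(self):
        self.delivered = {}

    def accept(self, msg_id, payload):
        # Duplicates caused by sender retries are harmless (idempotent).
        self.delivered.setdefault(msg_id, payload)
        return msg_id  # application-layer acknowledgement


class Relay:
    """A middlebox that terminates the transport connection.

    It always accepts a message on its own hop, but loses the first
    `failures` messages before forwarding them, modelling a failure
    that wipes its state.
    """

    def __init__(self, next_hop, failures=0):
        self.next_hop = next_hop
        self.failures = failures

    def accept(self, msg_id, payload):
        if self.failures > 0:
            self.failures -= 1
            return None  # accepted locally, never reached the far end
        return self.next_hop.accept(msg_id, payload)


def send_with_end_to_end_ack(first_hop, msg_id, payload, max_attempts=5):
    """Retry until the final receiver's acknowledgement comes back.

    Returns the number of attempts used, or raises if none succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        if first_hop.accept(msg_id, payload) == msg_id:
            return attempt
    raise RuntimeError("no end-to-end acknowledgement received")


receiver = Receiver()
path = Relay(Relay(receiver, failures=1), failures=1)
attempts = send_with_end_to_end_ack(path, "m1", "hello")
```

   In this sketch, hop-by-hop success says nothing about delivery; only
   the acknowledgement from the receiver does, which is the recursion
   of the end-to-end argument to the "real" ends.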
   The duration of this state varies widely: for example, RSVP soft
   state lasts for the duration of the reservation, while SIP state
   only lasts for one transaction, not for the duration of the session.
   If an RSVP soft state box dies in mid-flow, the flow is perturbed
   until the soft state is reconstructed, which is not true for SIP and
   SMTP.

   The idea of fate-sharing survives this recursion, but requires that
   all application state created in middleboxes must be capable of
   re-creation after failure. Additionally, to support this
   requirement, where a function cannot be fulfilled completely,
   reliably and securely by two endpoints of a conversation, the
   necessary middleboxes to fulfil the function should be explicit
   partners in explicit communication with at least one endpoint (or,
   by recursion of the same principle, with an intermediate middlebox).
   In other words, middleboxes should not be invisible, because their
   failures need to be detected and dealt with by their communication
   partners.

      A--------M1-------------M2-------------Z
                \                           /
                 \                         /
                  M3----------------------/

   In this example, suppose a conversation between A and Z requires
   intervention by middleboxes M1, M2 and M3, with the creation of soft
   state in those boxes. To create this state, explicit communication
   about the A-Z conversation may be required between each of the
   following pairs: (A, M1), (M1, M2), (M2, Z), (M1, M3), and (M3, Z).
   This will allow for recreation of soft state in M1, M2 and M3 (or
   their backups) in all failure cases except total failure of A or Z;
   thus the fate-sharing principle is preserved.

   Subtle effects can occur when the middlebox actually modifies the
   application data stream in some way, i.e. becomes an active agent in
   destroying data path integrity. Specific concerns of this kind are
   discussed in [RFC 3238].

2.1. 
NATs as a Special Case of Middleboxes

   The specific case of NAT middleboxes has led to much anguish, due to
   the number of instances (notoriously TCP and IPSEC) in which
   end-to-end integrity depends on the value of IP addresses. In other
   cases (such as H.323) the protocol itself assumes address
   transparency.

   NATs specifically cause breaches of one of the principles in
   [RFC 1958]:

      "Addresses must be unambiguous (unique within any scope where
      they may appear)."

   [RFC 2993] goes into detail, and makes a strong case for address
   translation being a dead end in the architecture. Unfortunately,
   while we remain convinced that it is a dead end, it is crowded with
   users, and workarounds are being sought, especially by the MIDCOM
   WG. These workarounds may produce their own architectural issues
   [unsaf]. It remains a fact that today, NAT inhibits progress beyond
   the simple Web client/server model, and that the only well
   understood solution that definitively fixes this problem is general
   adoption of a larger address space.

   [RFC 1958] also states that:

      "Upper layer protocols must be able to identify end-points
      unambiguously. In practice today, this means that addresses must
      be the same at start and finish of transmission."

   Again, NATs can cause violations of this principle. However, the new
   SCTP transport protocol [RFC 2960] may provide a partial solution by
   handling multiple addresses for a single connection. There have also
   been proposals to allow TCP connections to update their endpoint
   addresses [Snoeren].

   One well-established aspect of IP addresses that has become more
   troublesome recently [RFC 2956] is that they combine two functions:
   that of an "endpoint identifier" and a "locator". Such functional
   overloading could be considered a violation of an important design
   principle (see Section 4 below).
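
   The overloading of identifier and locator described above can be
   illustrated with a small Python sketch. It is a toy model under
   stated assumptions, not a proposal: the identifiers ("host-a",
   "host-z"), the table names and the lookup functions are all
   hypothetical, and the addresses are taken from ranges reserved for
   documentation.

```python
# Hypothetical sketch: identifiers, addresses and table layouts are
# illustrative only.

# Overloaded model: session state is keyed by addresses, so the locator
# also serves as the endpoint identifier.
by_address = {("192.0.2.10", 40000, "198.51.100.7", 80): "session-state"}

# Split model: session state is keyed by stable endpoint identifiers,
# and a separate table maps each identifier to its current locator.
by_identifier = {("host-a", 40000, "host-z", 80): "session-state"}
locator_of = {"host-a": "192.0.2.10", "host-z": "198.51.100.7"}


def lookup_by_address(src, sport, dst, dport):
    """Find session state using addresses as the key."""
    return by_address.get((src, sport, dst, dport))


def lookup_by_identifier(src_id, sport, dst_id, dport):
    """Find session state using endpoint identifiers as the key."""
    return by_identifier.get((src_id, sport, dst_id, dport))


# host-a is renumbered (or its address is rewritten by a NAT).
new_address = "203.0.113.5"
locator_of["host-a"] = new_address

after_renumber_address = lookup_by_address(
    new_address, 40000, "198.51.100.7", 80
)
after_renumber_identifier = lookup_by_identifier(
    "host-a", 40000, "host-z", 80
)
```

   In the overloaded model the session can no longer be found after the
   address changes, while in the split model it can. The sketch says
   nothing about how the identifier-to-locator table would be kept
   correct and secure at scale.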
   Of course there were (and are) good pragmatic engineering reasons
   for this choice. Nevertheless, if it were possible to split endpoint
   identifiers from locators, many of the problems currently associated
   with NATs, route aggregation, multihoming, and renumbering would be
   fundamentally reformulated. Despite recent research activity in this
   area, it remains uncertain whether this would in fact simplify the
   problems just cited, or merely move them to a different part of the
   Internet architecture.

   The issue identified here has a parallel closer to the application
   layer. We have several efforts, in one form or another, to assign
   URI types to all protocols and then permit naming URIs (e.g., a
   NAPTR record ultimately maps a name into a URI). But, historically,
   our names and addresses bind to a network object, such as a host,
   and we use port numbers to identify protocols and protocol
   listeners. But URIs aren't like that -- they typically bind to a
   (host, protocol) pair or something equivalent to it. And, through
   that evolution, we are in danger of losing the host:port model. That
   has its own set of implications, some of which may even be good
   ones, but it is another evolving architectural change whose
   implications are not completely understood.

3. Internationalisation

   [RFC 1958] states that:

      "Public (i.e. widely visible) names should be in case-independent
      ASCII. Specifically, this refers to DNS names, and to protocol
      elements that are transmitted in text format."

   This was not intended to diminish the importance of multiple
   character sets for users of the Internet. It was a simple statement
   of the need for a lingua franca (indeed, ASCII is very close to
   being the character set of the original lingua franca).

   Since 1996, the IETF has studied character set issues in general
   [RFC 2130] and made specific recommendations for the use of a
   standardised approach [RFC 2277].
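
   As a minimal sketch of what "case-independent ASCII" means for name
   comparison, the following Python fragment folds only the letters A-Z
   and rejects anything outside ASCII. The function names are
   hypothetical, and the example names use domains reserved for
   documentation.

```python
def ascii_lower(name: str) -> str:
    """Lower-case only A-Z; raise if the name is not pure ASCII."""
    name.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    return "".join(
        chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in name
    )


def same_name(a: str, b: str) -> bool:
    """Compare two names under case-independent ASCII rules."""
    return ascii_lower(a) == ascii_lower(b)


match = same_name("WWW.Example.ORG", "www.example.org")
differ = same_name("example.org", "example.net")

try:
    same_name("b\u00fccher.example", "BUCHER.example")
    rejected = False
except UnicodeEncodeError:
    rejected = True
```

   The comparison is unambiguous precisely because the repertoire is so
   small. A name containing a non-ASCII letter is simply outside what
   this rule can express.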
   In fact, the situation is complicated by the fact that some uses of
   text are hidden entirely in protocol elements and need only be read
   by machines, while other uses are intended entirely for human
   consumption. Many uses lie between these two extremes, which leads
   to conflicting implementation requirements.

   For the specific case of DNS, a working group on
   internationalisation is in progress. A fundamental requirement in
   this work is to not disturb the current use and operation of the
   domain name system, and for the DNS to continue to allow any system
   anywhere to resolve any domain name. This leads to some very strong
   requirements for backwards compatibility with the existing
   ASCII-only DNS. Yet since the DNS has come to be used as if it were
   a directory service, domain names are also expected to be presented
   to users in local character sets. Furthermore, users want them in
   forms that can be interpreted locally as words. Satisfying this
   requirement simultaneously with those for backwards compatibility
   and universal name resolution is not easy.

   This document is not intended to resolve these complex and difficult
   issues. But the principle that names encoded in a text format within
   protocol elements must be universally decodable (i.e. encoded in a
   globally standard format with no intrinsic ambiguity) will not
   change. At some point, it is possible that this format will no
   longer be case-independent ASCII.

4. Overextension

   There has been a strong tendency in the last few years to overload
   some designs with functionality that was not originally anticipated,
   with resulting operational complexities. One can argue that
   [RFC 1958] covers this point by stating "Keep it simple." However,
   it should be stated explicitly that protocols and systems should not
   be stretched beyond their reasonable design parameters until
   scaling, reliability and security issues have been resolved beyond
   doubt.
   A nuance is that two different interpretations are possible:

      1) Functional overloading considered harmful.

      2) Extensible protocols have limits.

   Both of these interpretations are true in various cases. Examples
   where one or the other might apply include DNS [dnsrole], MPLS, and
   BGP. It is hard to give precise criteria for the safe limits to
   overloading or extension. In some cases, overloading or extending a
   protocol may reduce total complexity by avoiding the creation of a
   new protocol; in other cases a new ad hoc protocol might be the
   simpler solution.

   There is a subtle line between overloading and re-use. We have a
   number of re-usable technologies, including component technologies
   specifically designed for re-use. Examples include SASL, BEEP and
   APEX. On the other hand, re-use should not go so far as to turn a
   protocol into a Trojan Horse, as has happened with HTTP [RFC 3205].

5. A New External Issue

   Section 5 of [RFC 1958], "External Issues", deals with a number of
   principles where the IETF's work touches on non-technical issues.
   The IETF Policy on Wiretapping [RFC 2804] adds such a principle that
   is not included in [RFC 1958]: the IETF has decided not to consider
   legal requirements for wiretapping in certain jurisdictions as part
   of the process for creating and maintaining IETF standards. That RFC
   sets out the reasons for the new principle, e.g. that adding a
   requirement for wiretapping will make affected protocol designs
   considerably more complex.

   As with all principles of the architecture, this one will no doubt
   require review in the future.

6. Security Considerations

   This document does not discuss issues directly affecting the
   security of the Internet. However, it is worth noting that since
   [RFC 1958] was published, the requirement for all Internet protocols
   to have an adequate security solution has become more important than
   ever.

Non-normative References

   [RFC 1958] Architectural Principles of the Internet. B. Carpenter,
              Ed. June 1996.

   [RFC 2130] The Report of the IAB Character Set Workshop held 29
              February - 1 March, 1996. C. Weider, C. Preston, K.
              Simonsen, H. Alvestrand, R. Atkinson, M. Crispin, P.
              Svanberg. April 1997.

   [RFC 2277] IETF Policy on Character Sets and Languages. H.
              Alvestrand. January 1998.

   [RFC 2543] SIP: Session Initiation Protocol. M. Handley, H.
              Schulzrinne, E. Schooler, J. Rosenberg. March 1999.

   [RFC 2775] Internet Transparency. B. Carpenter. February 2000.

   [RFC 2804] IETF Policy on Wiretapping. IAB, IESG. May 2000.

   [RFC 2956] Overview of 1999 IAB Network Layer Workshop. M. Kaat.
              October 2000.

   [RFC 2960] Stream Control Transmission Protocol. R. Stewart, Q. Xie,
              K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I.
              Rytina, M. Kalla, L. Zhang, V. Paxson. October 2000.

   [RFC 2993] Architectural Implications of NAT. T. Hain. November
              2000.

   [RFC 3022] Traditional IP Network Address Translator (Traditional
              NAT). P. Srisuresh, K. Egevang. January 2001.

   [RFC 3027] Protocol Complications with the IP Network Address
              Translator. M. Holdrege, P. Srisuresh. January 2001.

   [RFC 3040] Internet Web Replication and Caching Taxonomy. I. Cooper,
              I. Melve, G. Tomlinson. January 2001.

   [RFC 3205] On the use of HTTP as a Substrate. K. Moore. February
              2002.

   [RFC 3234] Middleboxes: taxonomy and issues. B. Carpenter, S. Brim.
              February 2002.

   [RFC 3238] IAB Architectural and Policy Considerations for Open
              Pluggable Edge Services. S. Floyd, L. Daigle. January
              2002.

   [dnsrole]  Role of the Domain Name System. J. Klensin. Work in
              progress, 2001 (draft-klensin-dns-role-01.txt)

   [unsaf]    IAB Considerations for UNilateral Self-Address Fixing
              (UNSAF). IAB. Work in progress, 2002
              (draft-iab-unsaf-considerations-01.txt)

   [Clark]    D. Clark, M.
              Blumenthal, "Rethinking the Design of the Internet: The
              End-to-end Arguments vs. the Brave New World",
              Telecommunications Policy Research Conference, September
              2000, available via http://www.tprc.org/

   [Snoeren]  Alex C. Snoeren, David G. Andersen and Hari Balakrishnan,
              "Fine-Grained Failover Using Connection Migration,"
              USENIX (San Francisco), March 2001.

Acknowledgements

   This document is a collective work of the Internet community,
   published by the Internet Architecture Board.

   Special thanks to .

Editors' Addresses

   Brian E. Carpenter
   IBM Zurich Research Laboratory
   Saeumerstrasse 4
   8803 Rueschlikon
   Switzerland
   Email: brian@hursley.ibm.com

   Rob Austein
   InterNetShare, Incorporated
   325M Sharon Park Drive, Suite 308
   Menlo Park, CA 94025
   USA
   Email: sra@hactrn.net

Full Copyright Statement

   PLACEHOLDER for full ISOC Copyright Statement