Network Working Group                                          C. Newman
Internet Draft: Application Protocol Design Principles          Innosoft
Document: draft-newman-protocol-design-00.txt                  June 1997
                                                   Expires in six months


                 Application Protocol Design Principles


Status of this memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   To view the entire list of current Internet-Drafts, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), ftp.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

Abstract

   There are a number of design principles which come into play over
   and over again when designing application protocols. Many of these
   are entrenched in IETF lore and spread by word of mouth. Most have
   been learned the hard way many times. This is an attempt to codify
   some of these principles so they can be referenced rather than
   spread by word of mouth.

   The author has not invented any of these ideas and while the
   exercise of finding the originator of the ideas would be
   interesting, it is not deemed necessary for this project. Many of
   these principles have a much wider scope than application protocol
   design.
   However, the author's primary experience is with application
   protocols and examples provided usually involve application
   protocols or elements.

   [Disclaimer: this is a preliminary draft. Some of the case studies
   and exceptions need tuning. Suggestions welcome.]

Newman                                                          [Page i]

Internet Draft   Application Protocol Design Principles    June 21, 1997

                           Table of Contents

   Status of this memo
   Abstract
   1.  K.I.S.S.
   2.  Make the Common Case Simple & Uncommon Case Possible
   3.  0, 1, N Principle
   4.  Be Liberal/Conservative
   5.  Avoid Silly States
   6.  Text Not Numbers
   7.  Avoid Alternative Representations
   8.  Announce Features, Not Version
   9.  Avoid Unnecessary Layers
   10. Conclusions Based on Design Principles
   11. Security Considerations
   12. References
   13. Author's Address

1. K.I.S.S.

   The "Keep It Simple, Stupid" principle or "KISS" principle is well
   known. The basic idea is not to add complexity if there is any way
   to avoid it. Sometimes this also involves a decision of where the
   complexity should live (e.g. client implementation, server
   implementation, protocol itself, external layers). This is a very
   difficult principle to follow in practice.

   Consequences of Violation: design errors, implementation bugs, poor
   deployment, poor maintainability, interoperability problems, poor
   usability, less peer review, protocol has to be "profiled" to
   interoperate.

   Case Study: X.400 vs. SMTP/MIME. X.400 is very complex and is
   losing ground steadily in the marketplace. SMTP/MIME is much simpler
   and is gaining ground in the marketplace.

2. Make the Common Case Simple & Uncommon Case Possible

   This is largely a corollary of the KISS principle. Sometimes phrased
   as "design for the common case." The idea is to make the common case
   very simple without disallowing the useful uncommon cases.

   Consequences of Violation: Same as KISS. If a useful uncommon case
   is not possible, then a potentially complex protocol extension is
   necessary, which results in more complexity than if the uncommon
   case had been considered from the start.

   Case Study (common case too complex): ASN.1 makes the common case
   far too complex. While it does provide for unlimited extensibility,
   in practice implementations can't deal with many legal structures
   due to the complexity.

   Case Study (uncommon case not possible): Internet Mail originally
   didn't allow non-text data. MIME is more complex than it would have
   been if designed in from the beginning.

3. 0, 1, N Principle

   The "0, 1, N principle" is not an obvious principle but is true
   surprisingly often. In general, any protocol element or object
   should come in quantities of 0, 1, or N where N is an arbitrary
   number. If a limit is picked, it is likely to be too small. This is
   especially true of hierarchy, and often true of names.

   Consequences of Violation: System has to be extended to allow larger
   values. This causes a transition with severe interoperability
   problems or a semantic overload of existing structure which adds
   complexity and confusion.

   Case Study: 640K (the memory limit of the original IBM PC).

   Case Study: MIME media types. Two levels of hierarchy were defined
   in the initial MIME specification. This proved inadequate, so a new
   hierarchy delimiter had to be introduced to allow more naming
   hierarchy.

   Case Study: SMTP error codes have 3 levels of hierarchy with 10
   settings each.
   This has proved to be insufficient and inflexible, requiring the
   addition of ESMTP ENHANCEDSTATUSCODES.

   Exception: A quantity of two is permitted for clearly binary
   situations.

   Exception: Has to be balanced with the KISS principle. For example,
   current practice limits most numbers in protocols to 32-bit values.

4. Be Liberal/Conservative

   The "Be Liberal in What You Accept, and Conservative in What You
   Generate" principle is well known in the IETF, but controversial in
   some cases. The intention is to improve interoperability. The basic
   idea is that on generation one should follow the standard strictly,
   as that will work with all other compliant software. On acceptance,
   if one tolerates minor protocol or format violations, it helps work
   around known bugs in other software.

   This principle would work well if everyone followed it. However,
   when there are mixtures of systems which follow this rule and others
   which don't, the exceptions below need to be considered.

   Consequences of Violation: decreased interoperability, customers
   blaming the violator of this principle for bugs in other vendors'
   software.

   Case Study: TBD

   Exception: Don't accept ambiguous interactive input with potentially
   vastly different meanings. If the user ends up seeing the data in an
   indecipherable context, severe consequences result. It's often
   better to reject the data so the problem can be fixed at the source.

   Exception: Don't accept clearly illegal interactive input when there
   are no known sources of it. If client vendors notice their illegal
   behavior before deploying, it gets fixed before it's deployed, and
   overall interoperability is increased.

   Exception Case Study: When netnews was initially deployed, a number
   of clients generated date headers in a variety of illegal formats.
   Fairly early in the deployment, a major implementation was modified
   to discard news messages which had missing or improperly formatted
   date headers.
   Very soon after this was deployed, all date headers in news were
   interoperable.

5. Avoid Silly States

   Whenever possible, design the system so no silly states are
   possible. A silly state is a combination of options or values which
   contradict each other or are nonsensical.

   Consequences of Violation: Increased complexity to deal with the
   possibility of the silly states occurring.

   Case Study: TBD

6. Text Not Numbers

   Whenever possible, text should be used instead of numbers. Numbers
   almost always have to be looked up in order for humans to interpret
   them. Text can be read and debugged by a mere mortal.

   One common counter argument is that numbers are more compact, but if
   size is a serious concern, a general purpose compression layer is
   usually a better solution. Another counter argument is that the
   mapping tables and parsers to convert to internal numbers add
   complexity. In practice the complexity of debugging a non-text
   protocol is usually greater than the complexity of the parser and
   tables.

   Consequences of Violation: Protocol is difficult to debug, protocol
   is difficult to understand, examples are hard to provide. The
   previous three consequences make this equivalent to a KISS
   violation. Results in poor user interface. Endian problems.

   Case Study: X.400 problems are very hard to diagnose. The protocol
   trace has to be recorded and run through an ASN.1 interpreter to
   debug. SMTP can be debugged by observing the original protocol
   trace.

   Case Study: Whenever numeric error codes are used unqualified by
   text, humans are invariably presented with these error codes,
   resulting in a poor user interface and debugging difficulties.

   Case Study: The telnet ENVIRON option had to be replaced with
   NEW-ENVIRON due to endian problems.

   Exceptions: Compression or encryption layers (which make things
   unreadable anyway). Low-level protocols with high performance
   requirements. Encapsulated non-text objects.

7. Avoid Alternative Representations

   Having several ways to represent the same thing results in
   interoperability problems. In general, implementors will only test
   the representation format they use. The less often used
   representations will fail to work. In a worst case scenario, two or
   more representations are widely used, but systems which use one
   often can't talk to systems which use another.

   Consequences of Violation: Serious interoperability problems, more
   bugs, conversion support necessary to interoperate.

   Case Study: The TIFF image format permits both a "big endian"
   version and a "little endian" version. Some implementations can only
   read one or the other. Many TIFF applications now have a "Macintosh
   format TIFF" vs "IBM format TIFF" option when saving TIFF files.

   Case Study: ASN.1 provides many ways of representing the same thing.
   This has caused numerous interoperability problems as not all
   systems support all representations of a given field. Profiles of
   ASN.1 are usually necessary to interoperate at all.

   Case Study: RFC 822 allows several different ways to quote the same
   address. The useless ones like: "foo"."bar"@do.main rarely work.

   Exceptions: An alternative representation may be necessary for a
   more expressive case. For example, quoted strings and literals in
   IMAP. Rare alternative representations should be avoided.

8. Announce Features, Not Version

   While version numbers are fine to inform the user of what
   implementation or conformance level they are at, they are usually a
   bad idea in protocols. A system where the server announces available
   features and the client activates the features it wants results in a
   far better protocol.

   If a protocol needs to be redesigned from scratch, use of a
   different port number for the new version will allow a parallel
   transition period -- otherwise when a major version number is
   increased on the server, the old clients cease to interoperate with
   it.
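
   The feature announcement pattern can be sketched in a few lines of
   Python. This is an illustrative sketch only: the "CAPABILITY"
   keyword, the feature names in the example, and the function names
   parse_capabilities and choose_features are assumptions made for this
   example and are not taken from any particular protocol
   specification. The server announces what it supports, and the client
   activates only the features it both wants and was offered.

```python
from typing import Iterable, List, Set


def parse_capabilities(line: str) -> Set[str]:
    """Parse a hypothetical 'CAPABILITY A B C' announcement line.

    Returns the announced feature names, upper-cased so that later
    comparisons are case-insensitive.
    """
    tokens = line.split()
    if not tokens or tokens[0].upper() != "CAPABILITY":
        raise ValueError("not a capability announcement: %r" % line)
    return {token.upper() for token in tokens[1:]}


def choose_features(announced: Set[str], wanted: Iterable[str]) -> List[str]:
    """Return the wanted features the server announced.

    The client's preference order is preserved, and features the
    server did not announce are silently skipped.
    """
    return [name for name in wanted if name.upper() in announced]


# Example exchange (feature names are illustrative assumptions).
announced = parse_capabilities("CAPABILITY STARTTLS PIPELINING 8BITMIME")
active = choose_features(announced, ["STARTTLS", "CHUNKING", "PIPELINING"])
print(active)
```

   Because the client keys on feature names rather than on a version
   number, a server can add a feature without affecting clients that
   never ask for it, and no version comparison logic is needed on
   either side.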

   Consequences of Violation: Useless version number fields, painful
   version transition, complexity due to need to support older
   versions, meaning of version number sometimes ambiguous.

   Case Study: MIME has the MIME-Version header. Since MIME also has
   feature announcement via headers, the version number is useless and
   will never change.

   Case Study: X.400:1988 fails to interoperate with X.400:1993 due to
   certain body part types. [XXX: need to confirm]

9. Avoid Unnecessary Layers

   Whenever two layered services can be combined into a single service
   without a significant increase in complexity, it should be done.
   Unnecessary layers result in implementor confusion and more
   complexity.

   Consequences of Violation: same as KISS violations

   Case Study: RFC 822 has a multi-layer parsing model which includes
   unfolding lines, lexing, removal of linear-white-space, and parsing.
   This has resulted in endless confusion and serious interoperability
   problems. The DRUMS WG is folding these into a single formal syntax
   and the result looks promising.

10. Conclusions Based on Design Principles

   KISS: Every protocol should go through a "feature cut review" before
   going on the standards track.

   KISS/Text not Binary/Alternate Representations: Use of ASN.1 in new
   protocols should be strongly discouraged.

   0,1,N Principle/Text not Numbers: Use of 3 digit SMTP-style error
   codes in new protocols should be forbidden.

   Announce Features, Not Version: Server feature announcement should
   be required in most standards track protocols.

   Alternate Representations: CRLF line separators should be required.
   Big endian should be required in new binary protocols and formats.
   Use of UTF-8 should be preferred over labelled character sets in new
   protocols.

11. Security Considerations

   Many of these principles can have profound security implications.
   Violation of KISS makes a security bug more likely. Alternate
   Representations makes a security bug more likely in a less
   frequently used representation. A silly state could introduce a
   security bug if special handling isn't included. Failure to follow
   the 0,1,N principle when implementing makes buffer overrun problems
   more likely. While it's harder to fix a security bug in a binary
   protocol due to the debugging complexity, text protocols tend to be
   more susceptible to buffer overrun security problems. These two
   factors probably offset each other.

12. References

   [IMAP4] Crispin, M., "Internet Message Access Protocol - Version
           4rev1", RFC 2060, University of Washington, December 1996.

   [TBD]

13. Author's Address

   Chris Newman
   Innosoft International, Inc.
   1050 Lakes Drive
   West Covina, CA 91790 USA

   Email: chris.newman@innosoft.com