Internet Draft                                            Greg Vaudreuil
Expires in six months                                Lucent Technologies
                                                             May 1, 2000

SMTP Service Extensions for Transmission of Large and Binary Message Headers

draft-vaudreuil-binaryheaders-00.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC 2026.

Internet Drafts are working documents of the Internet Engineering Task Force (IETF), its Areas, and its Working Groups. Note that other groups may also distribute working documents as Internet Drafts.

Internet Drafts are valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet Drafts as reference material or to cite them other than as a "work in progress".

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

To learn the current status of any Internet-Draft, please check the "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

Copyright Notice

Copyright (C) The Internet Society (2000). All Rights Reserved.

Abstract

This memo defines two new service extensions to the SMTP service. The first enables an SMTP server to indicate the maximum size of the RFC822 header it is willing to accept in a message. The second enables an SMTP client and server to negotiate the use of 8bit or 8bit-encoded binary contents within the message header. This memo also specifies the algorithms necessary for interworking between extended clients holding a large or binary header and non-extended SMTP servers.
It is expected that this document may be split into three: 1) the ESMTP message header SIZE extension, 2) the ESMTP 8bit message header encoding extension, and 3) the registration of the "8" transport encoding for the encoded word.

Internet Draft            Binary Mail Headers            April 1, 2000

Working Group Summary

This protocol is not the product of an IETF working group. This is an initial draft to stimulate conversations.

Table of Contents

1. OVERVIEW
2. FRAMEWORK FOR THE HEADER SIZE DECLARATION
   2.1 Header size limits
   2.2 Header size downgrade
3. FRAMEWORK FOR THE 8BIT HEADERS SERVICE EXTENSION
   3.1 Extended Encoded Word
   3.2 Extended Encoded Word Definitions
   3.3 Interworking with unextended SMTP servers
   3.4 Interworking with unextended RFC822/MIME clients
4. EXAMPLES
5. SECURITY CONSIDERATIONS
6. ACKNOWLEDGMENTS
7. REFERENCES
8. COPYRIGHT NOTICE
9. AUTHOR'S ADDRESS

1. Overview

   Native / better support of the UTF-8 character set without the overhead of quoted-printable in the encoded word.

   Spoken names and other useful non-textual representations of the message sender or recipient.

   Signature blocks for signed message header lines.

In lieu of directly transporting binary data, a new transport encoding is defined to efficiently encode binary data within the 8bit extension.
This new transport encoding is not intended for use with MIME message bodies. Transmission of binary bodies should be performed using the [BINARY] ESMTP extensions or existing MIME transport encodings.

The email community views handling of pure binary data in the message headers as excessively disruptive and prone to implementation error. An efficient encoding of binary data into an 8bit representation provides minimal processing overhead and minimal data size expansion, reasonable trade-offs against the complexity of supporting fully binary data.

There has been consideration of "just send UTF-8" within the header fields for messages. This simple concept presents a number of operational difficulties. Even with the ESMTP service extensions for the indication of such a message, there is a need to clearly identify the data elements that may need to be converted in any downgrade performed while interoperating with the large legacy of deployed email clients. Use of a lighter-weight encoded word provides this indication and additional flexibility for the support of potentially more than one character set.

Vaudreuil                     Expires 6/1/00                    [Page 2]

2. Framework for the Header Size Declaration

The first extension provides the ability for the server to indicate to the client the header size it supports. This improves interoperability where headers may be larger than the currently accepted best current practice.

The following service extension is hereby defined:

(1) the name of the SMTP service extension is "Message Header Size Declaration";

(2) the EHLO keyword value associated with this extension is "HEADERSIZE";

(3) one optional parameter is allowed with this EHLO keyword value, a decimal number indicating the fixed maximum message header size in bytes that the server will accept without truncation.
The syntax of the parameter is as follows, using the augmented BNF notation of [RFC822]:

   size-param ::= [1*DIGIT]

A parameter value of 0 (zero) indicates that no fixed maximum message header size is in force. If the parameter is omitted, no information is conveyed about the server's fixed maximum supported message header size;

(4) the maximum length of a MAIL FROM command line is increased by 15 characters by the possible addition of the HEADERSIZE keyword and value;

(5) no additional SMTP verbs are defined by this extension.

A conforming ESMTP client SHOULD NOT send messages with a header block larger than the size indicated by the SMTP server. However, as is currently the case, a client without knowledge of the HEADERSIZE extension MAY send a header block larger than can be accepted by the server. Under such conditions, the handling of this pre-existing error condition is left as a local matter. (Ed note: Should we try to clean this up?)

Note: There is no need for the client to indicate the size of the header block to the server. The purpose of this function in the ESMTP SIZE extension is to enable the receiver-SMTP to pre-allocate storage for large messages. If the client desires this function, the SIZE extension can be used.

2.1 Header size limits

2.2 Header size downgrade

With the declaration of enlarged header size support, it is now possible to identify and compensate for the important error case where the total size of the header block is larger than that which can be accepted by the SMTP server.

The following downgrade rules provide a deterministic set of discard rules to trim information not essential for message delivery from the header block as necessary to deliver the message. The bias of these discard rules is to facilitate message transport, even at the cost of losing potentially important information.

1) Discard any "comments" within message sender and recipient fields.
2) Discard the "Received-by" lines.

3) Discard the recipient fields "to" and "cc".

4) Discard the "subject" line.

The downsizing of headers by the elimination of header lines should be indicated by a new received line. The syntax of such is:

   Justbitstotwiddlelater

The largest header block that should be sent to a non-extended SMTP receiver is 10,000 bytes. While not specified in RFC822, this limit is widely understood to be the largest header block size likely to be accepted by the preponderance of clients and servers on the Internet today.

To reduce the number of header-size downgrades, it is useful to define a set of size "thresholds". Messages sent with a header size over a given threshold SHOULD be reduced to fit within the next smallest threshold that can be sent to the next hop. These thresholds represent conventional wisdom about break-points in existing and anticipated software/service architectures.

100 bytes is the low limit useful for hand-held devices such as text pagers and SMS-enabled mobile telephones. These devices typically support the minimum RFC822 fields necessary to identify and reply to the sender, determine the sending time, and preserve the indicated subject of the message.

1000 bytes is specified just because two orders of magnitude between 100 and 10,000 seems like too large a jump.

10,000 bytes is the current best-current-practice maximum limit for a message header targeted at conventional PC-based email clients. (Verify that this is correct) This header size supports conventional use of message headers with a large number of recipient fields.

50,000 bytes is the recommended minimum-maximum for extended clients supporting richer content and media types within message headers. This supports small multi-media elements and security elements associated with the sender.

50,000 bytes is not a maximum header size.
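As an illustration only, and not part of this specification, the threshold rule above might be applied as in the following Python sketch. The names THRESHOLDS, LEGACY_LIMIT, advertised_limit, and target_size are hypothetical. The sketch assumes the EHLO keyword lines are supplied with the reply code already stripped. It also assumes that a HEADERSIZE keyword with no parameter is handled like a non-extended server, and that a limit below the smallest threshold is used directly; this memo mandates neither choice.

```python
from typing import Iterable, Optional

# Size thresholds named in this memo, in bytes.
THRESHOLDS = (100, 1_000, 10_000, 50_000)
# Largest header block to send to a non-extended SMTP receiver.
LEGACY_LIMIT = 10_000


def advertised_limit(ehlo_lines: Iterable[str]) -> Optional[int]:
    """Return the header size limit for the next hop, or None if unlimited."""
    for line in ehlo_lines:
        words = line.split()
        if not words or words[0].upper() != "HEADERSIZE":
            continue
        if len(words) == 1 or not (words[1].isascii() and words[1].isdigit()):
            # Keyword without a usable parameter conveys no information.
            # Assumption: fall back to the non-extended limit.
            return LEGACY_LIMIT
        value = int(words[1])
        # A value of 0 means no fixed maximum is in force.
        return None if value == 0 else value
    # Keyword absent: treat the server as non-extended.
    return LEGACY_LIMIT


def target_size(header_size: int, limit: Optional[int]) -> int:
    """Return the size the header block should be trimmed to."""
    if limit is None or header_size <= limit:
        return header_size  # already fits, no downgrade needed
    fitting = [t for t in THRESHOLDS if t <= limit]
    # Assumption: below the smallest threshold, trim to the limit itself.
    return max(fitting) if fitting else limit
```

For example, a 12,000 byte header block bound for a server advertising "HEADERSIZE 7000" would be trimmed to the 1000 byte threshold rather than to 7000 bytes, so that later hops are less likely to force a second downgrade.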
It is anticipated that special-use devices may send substantially larger header blocks as needed for conveying multi-media information about each recipient.

3. Framework for the 8bit Headers Service Extension

The following service extension is hereby defined:

1) The name of the binary service extension is "8BITHEADERS".

2) The EHLO keyword value associated with this extension is "8BITHEADERS".

3) A new parameter, 8BITHEADERS, is defined for the MAIL command to indicate that the message to be sent contains headers formatted according to the binary header encoding rules defined below.

4) No new verbs are defined for the 8BITHEADERS extension.

A receiver-SMTP may indicate support for 8bit headers by including the 8BITHEADERS keyword in the EHLO response. A sender-SMTP may indicate that a message has 8bit conforming message headers by sending an 8BITHEADERS parameter with the MAIL command.

When the receiver-SMTP accepts a MAIL FROM command with 8BITHEADERS requested, it agrees to accept the contents and, if necessary, to downgrade them by applying an appropriate transport encoding for delivery to the next hop.

The 8BITHEADERS extension can only be used with the 8BIT [8BIT] or BINARY [BINARY] body ESMTP extension. If an 8BITHEADERS parameter is present and neither the 8BIT nor the BINARY body parameter is specified, the MAIL FROM command MUST be rejected. The ESMTP reply code 5.?.? should be used to indicate an invalid parameter. When ENHANCEDSTATUSCODES are in use, a 5.?.? status code must be used to indicate a protocol error [status].

The syntax of the extended MAIL command is identical to the MAIL command in [RFC821], except that the BODY and 8BITHEADERS parameters must appear after the address. The complete syntax of this extended command is defined in [ESMTP]. The ESMTP-keyword is 8BITHEADERS and the syntax for ESMTP-value is given by the syntax for body-value in [ESMTP].
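The parameter dependency described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation. The function names parse_mail_params and mail_command_acceptable are hypothetical. The sketch treats 8BITHEADERS as a bare keyword and ignores any value attached to it, takes BODY=8BITMIME and BODY=BINARYMIME as the body values defined by [8BIT] and [BINARY], and returns a plain reason string because this memo leaves the reply code unassigned.

```python
from typing import Dict, Optional, Tuple

# BODY values that permit 8bit headers (from [8BIT] and [BINARY]).
EIGHT_BIT_BODIES = {"8BITMIME", "BINARYMIME"}


def parse_mail_params(line: str) -> Dict[str, Optional[str]]:
    """Collect the ESMTP parameters that follow the address in MAIL FROM."""
    # Simplification: everything after the first ">" is parameter text.
    _, _, rest = line.partition(">")
    params: Dict[str, Optional[str]] = {}
    for token in rest.split():
        key, sep, value = token.partition("=")
        params[key.upper()] = value.upper() if sep else None
    return params


def mail_command_acceptable(line: str) -> Tuple[bool, str]:
    """Apply the 8BITHEADERS/BODY dependency to one MAIL FROM line."""
    params = parse_mail_params(line)
    if "8BITHEADERS" not in params:
        return True, "8bit headers not requested"
    if params.get("BODY") not in EIGHT_BIT_BODIES:
        # The memo requires rejection here; the reply code is still open.
        return False, "8BITHEADERS requires an 8BIT or BINARY body parameter"
    return True, "8bit headers accepted"
```

A receiver-SMTP built along these lines would accept "MAIL FROM:<a@example.com> BODY=8BITMIME 8BITHEADERS" and reject the same command when the BODY parameter is missing or set to 7BIT.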
If a receiver-SMTP does not indicate support for the 8BITHEADERS message format, then the client SMTP must not, under any circumstances, send 8bit encoded message header data.

Headers sent with the 8BITHEADERS parameter are still subject to the 1000 character line limit restriction. However, using the extended encoded word defined below, a large object may be encoded on multiple lines of the same RFC822 attribute value pair.

3.1 Extended Encoded Word

An extended encoded word is defined for use with the 8BITHEADERS ESMTP extension. Even with a "clean" 8bit transport, it is still necessary to declare to the receiver the nature of the data and the character set in use. The following extension of the encoded word of [MIME3]:

1) Defines one new transport encoding, an encoding for the sending of minimally encoded 8bit textual or binary data. This encoding is loosely based on the "Q" encoding with substantially fewer encoding rules.

2) Redefines the "charset" parameter to indicate multi-media attributes.

3) Eliminates the length restriction on the encoded word for efficient transmission of large encoded objects. Encoded words are limited in length only as necessary to satisfy the existing 1000 character RFC822 line limit.

These extensions are made within the spirit of the existing encoded word, with a priority placed on avoiding additional implementation complexity on both senders and receivers. Strict backward compatibility with the encoded word is not believed possible, especially within the highly restricted environment of existing RFC822 header parsers.

Because support for encoded words is explicitly negotiated between the client and server, the strict backward compatibility requirements with deployed RFC822 parsers underlying the design of the "Q" encoding are substantially lessened.
In particular:

1) Support for characters containing values greater than decimal 127 is central to this protocol. There is no way to make these characters "backward compatible" with deployed RFC822 parsers.

2) Within an extended encoded word, the set of special characters is reduced to the special characters marking the encoded word and other characters considered problematic such as NULLs. (Ed note: we may want to use another character besides "?" for the extended encoded word.)

3) Gateways and other devices are expected to be encoded-word aware and provide necessary downgrade or discard services.

3.2 Extended Encoded Word Definitions

Permit the registration of "charset" tokens for unique media combinations other than text/plain. Preserve a small flat space for identifiers.

3.2.1 "8" encoding

To be engineered; however, given the reduced interoperability requirements, this is a conceptually simple transformation.

Essentially, escape the troublesome bad characters and NULLs and the encoded word delimiters "?" and "=", and ensure that lines are limited to 1000 characters.

3.3 Interworking with unextended SMTP servers

The bias in this protocol is to permit the delivery of mail, even if content must be dropped to make it so. Senders with important information which cannot be represented with an encoded word are encouraged to understand the capabilities of the recipient. Directory-enabled recipient capabilities declaration would be a very good thing.

Where a downconversion from an extended encoded word to an encoded word is necessary, the following rules should be used.

1) If the data in an extended encoded word can be represented as an encoded word, the SMTP client MUST convert the data into an encoded word.

2) Where the data cannot be represented as an encoded word, it MUST be discarded to permit the delivery of the message to the intended recipient.
3) A Received line must be added indicating the downward conversion of an encoded word and the loss of content. The received line should be of the form:

   Blat blather frob

3.4 Interworking with unextended RFC822/MIME clients

It is clear that sending an extended encoded word to an unextended client would be a very bad thing. It appears to be very hard to make it otherwise and still meet the goals of this extension.

When an IMAP or POP server has a message containing an extended encoded word to send to an unextended client, it SHOULD convert the data into an encoded word. When data contained in an extended encoded word cannot be represented using an encoded word, it should be discarded.

The extended encoded word may be sent only when client access protocols are extended to permit the client to declare support for extended encoded words, or when explicit configuration is made available to the servers.

4. Examples

Extended encoded words

5. Security Considerations

This extension is not known to present any additional security issues not already endemic to electronic mail and present in fully conforming implementations of [RFC821], or otherwise made possible by [MIME].

6. Acknowledgments

7. References

[BINARY] Vaudreuil, G., "SMTP Service Extensions for Transmission of Large and Binary MIME Messages", RFC 1830, August 1995.

[RFC821] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821, USC/Information Sciences Institute, August 1982.

[RFC822] Crocker, D., "Standard for the Format of ARPA Internet Text Messages", STD 11, RFC 822, UDEL, August 1982.

[MIME1] Borenstein, N., and N. Freed, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, Bellcore, Innosoft, November 1996.

[MIME2]

[MIME3]

[ESMTP] Klensin, J., WG Chair, Freed, N., Editor, Rose, M., Stefferud, E., and D.
Crocker, "SMTP Service Extensions", RFC 1869, United Nations University, Innosoft International, Inc., Dover Beach Consulting, Inc., Network Management Associates, Inc., The Branch Office, November 1995.

[SIZE]

[8BIT] Klensin, J., WG Chair, Freed, N., Editor, Rose, M., Stefferud, E., and D. Crocker, "SMTP Service Extension for 8bit-MIMEtransport", RFC 1652, United Nations University, Innosoft International, Inc., Dover Beach Consulting, Inc., Network Management Associates, Inc., The Branch Office, July 1994.

8. Copyright Notice

"Copyright (C) The Internet Society (2000). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."

9. Author's Address

   Gregory M. Vaudreuil
   Lucent Technologies
   Communications Application Group
   17080 Dallas Parkway
   Dallas, TX 75248-1905
   Voice/Fax: +1-972-733-2722
   GregV@IEEE.org