Network Working Group                                           N. Freed
Internet-Draft                                          Sun Microsystems
Expires: August 25, 2003                               February 24, 2003


                    Deflate-8bit and Deflate-base64:
            Compression Content-Transfer-Encodings for MIME
                     draft-freed-mime-newenc-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 25, 2003.

Copyright Notice

   Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

   This document defines two additional MIME content-transfer-encodings
   (CTEs), deflate-8bit and deflate-base64. These CTEs provide MIME
   with facilities for lossless, adaptive, general-purpose compression.
   The first of these, deflate-8bit, produces 8bit output, while the
   second, deflate-base64, produces the same sort of output as the
   base64 content-transfer-encoding defined in RFC 2045.


Freed                   Expires August 25, 2003                 [Page 1]

Internet-Draft              Compression CTEs               February 2003


1.  Introduction

   The MIME specification RFC 2045 [2] defines several
   Content-Transfer-Encodings:

   1.  7bit, used to label textual 7bit data,

   2.  8bit, used to label textual 8bit data,

   3.  binary, used to label binary data,

   4.  quoted-printable, normally used to transform 8bit textual data
       to 7bit form, and

   5.  base64, normally used to transform binary data to 7bit form.

   All of these encodings produce output that is greater than or equal
   to the input data in length. In particular, quoted-printable can
   expand its input to roughly three times its original size, and
   base64 incurs a fixed overhead of about 33%. This amount of overhead
   can be significant in some applications.

   This document defines two new CTEs that incorporate the popular
   deflate compression algorithm described in RFC 1951 [1]. The first
   of these, deflate-8bit, also incorporates a lightweight encoding
   based on the popular yEnc [6] encoding scheme. The second,
   deflate-base64, applies the base64 encoding to the compressed data.
   The resulting material is often smaller than the input even when the
   output range is restricted to the base64 alphabet.

2.  Conventions Used In This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [3].

3.  Deflate Compression

   The deflate compression format described in RFC 1951 [1], as used by
   the PKZIP and gzip compressors and as embodied in the freely and
   widely distributed zlib [Gailly95] library source code, has the
   following features:

   o  an apparently unencumbered encoding and compression algorithm,
      with an open and publicly available specification,

   o  a low-overhead escape mechanism for incompressible data,

   o  heavy use for many years in networks, on modem and other
      point-to-point links, to transfer files for personal computers
      and workstations, and

   o  the ability to easily achieve 2:1 compression on the Calgary
      corpus [5] using less than 64 KBytes of memory on both sender and
      receiver.

4.  The Deflate-8bit Content-Transfer-Encoding

   The deflate-8bit encoding process consists of applying the deflate
   algorithm defined in RFC 1951 [1] to a MIME object in canonical
   form.

   Since all MIME objects are potentially independent of each other,
   the compressor's history MUST be cleared prior to performing the
   compression operation.

   The output of the deflate algorithm is binary data. This binary data
   is then encoded as follows:

   1.  42 is added to each octet modulo 256.

   2.  If the resulting octet has the decimal value 61 (equals sign),
       13 (CR), 10 (LF), or 0 (NUL), it must be escaped. This is done
       by prefixing the octet with an octet of value 64 and adding 64
       to the resulting octet modulo 256.

   3.  A CRLF sequence MUST be inserted into the output after every 256
       octets of output. A trailing CRLF MUST also appear at the end of
       the data. Implementations MUST be tolerant of CRLFs being
       inserted between the escape prefix and the octet it modifies.

   4.  Additional octets MAY be escaped using the previously described
       procedure. In particular, implementations SHOULD escape any
       octets with the values 32 (space) or 9 (tab) that would
       otherwise appear at the end of a line.

   8bit data is produced by this encoding process. As such, this CTE
   can only be used in conjunction with transports capable of handling
   8bit data.

   Decoding consists simply of reversing the encoding process, that is,
   first reversing the encoding described above and then applying the
   inflate algorithm.

5.  The Deflate-Base64 Content-Transfer-Encoding

   The deflate-base64 encoding process consists of applying the deflate
   algorithm defined in RFC 1951 [1] to a MIME object in canonical
   form. The result of the deflate compression operation is then
   further encoded using the base64 scheme defined in RFC 2045 [2].

   Since all MIME objects are potentially independent of each other,
   the compressor's history MUST be cleared prior to performing the
   compression operation.

   The output of this process has the same range as the base64 CTE, and
   can be used with any transport.
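
   As a non-normative illustration, the encoding and decoding steps of
   Sections 4 and 5 might be sketched in Python as follows. This is a
   minimal sketch under several assumptions not stated in the text: the
   function and constant names are hypothetical; raw RFC 1951 output is
   obtained from Python's zlib module with wbits=-15; space and tab are
   escaped everywhere rather than only at line ends; and the escape
   prefix value 64 is taken literally from step 2, with octet 64 itself
   also escaped under step 4 so that a decoder can tell a prefix from
   data.

```python
import base64
import zlib

ESC = 64        # escape prefix value, taken literally from step 2 (assumption)
OFFSET = 42     # value added to every octet in step 1
LINE_LEN = 256  # octets of output between inserted CRLFs (step 3)

# Octets that step 2 requires escaping, plus extras allowed by step 4:
# ESC itself (so a prefix is never confused with data), space and tab.
ESCAPED = frozenset({0, 10, 13, 61, ESC, 32, 9})


def raw_deflate(data: bytes) -> bytes:
    # wbits=-15 asks zlib for a raw deflate stream with no zlib/gzip wrapper.
    # A new compressor per call carries no history from earlier objects.
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()


def raw_inflate(data: bytes) -> bytes:
    return zlib.decompress(data, -15)


def encode_deflate_8bit(body: bytes) -> bytes:
    out = bytearray()
    for octet in raw_deflate(body):
        value = (octet + OFFSET) % 256          # step 1
        if value in ESCAPED:                    # steps 2 and 4
            out.append(ESC)
            value = (value + 64) % 256
        out.append(value)
    # Step 3: CRLF after every 256 octets; the CRLF after the last
    # (possibly short) chunk doubles as the trailing CRLF.
    chunks = [bytes(out[i:i + LINE_LEN])
              for i in range(0, len(out), LINE_LEN)] or [b""]
    return b"".join(chunk + b"\r\n" for chunk in chunks)


def decode_deflate_8bit(encoded: bytes) -> bytes:
    raw = bytearray()
    escaped = False
    for value in encoded:
        if value in (13, 10):
            # CR and LF never occur as data, so every one is a line break,
            # including one that falls between a prefix and its octet.
            continue
        if not escaped and value == ESC:
            escaped = True
            continue
        if escaped:
            value = (value - 64) % 256
            escaped = False
        raw.append((value - OFFSET) % 256)
    return raw_inflate(bytes(raw))


def encode_deflate_base64(body: bytes) -> bytes:
    # encodebytes() wraps at 76 characters with "\n"; convert to CRLF.
    return base64.encodebytes(raw_deflate(body)).replace(b"\n", b"\r\n")


def decode_deflate_base64(encoded: bytes) -> bytes:
    # b64decode() discards characters outside the alphabet, such as CRLF.
    return raw_inflate(base64.b64decode(encoded))


if __name__ == "__main__":
    body = b"A MIME object in canonical form\r\n" * 50
    assert decode_deflate_8bit(encode_deflate_8bit(body)) == body
    assert decode_deflate_base64(encode_deflate_base64(body)) == body
```

   Creating a fresh compressor object for every call is one way of
   meeting the requirement that the compressor's history be cleared for
   each MIME object.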

   Decoding consists simply of reversing the encoding process, that is,
   first reversing the base64 encoding and then applying the inflate
   algorithm.

6.  Appropriate Use

   As deflate-8bit produces 8bit material as output, it MUST NOT be
   used with transports that do not support 8bit, such as traditional
   SMTP. Happily, most SMTP transports currently support the 8BITMIME
   SMTP extension and hence can accommodate the use of deflate-8bit.

   Both deflate-8bit and deflate-base64 SHOULD only be used when the
   originator has some indication that the recipient can decode them.
   Note that this document does not specify a means by which such
   support can be indicated.

7.  Security Considerations

   The deflate algorithm is complex and hence prone to implementation
   errors. In particular, certain inflate implementations are known to
   not perform sufficient checking of their input stream and hence may
   be vulnerable to certain forms of attack.

   Aside from this, the new content-transfer-encodings specified in
   this document are believed not to raise any security considerations
   not already present in MIME itself.

Normative References

   [1]  Deutsch, P., "DEFLATE Compressed Data Format Specification
        version 1.3", RFC 1951, May 1996.

   [2]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
        Extensions (MIME) Part One: Format of Internet Message Bodies",
        RFC 2045, November 1996.

   [3]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

Informative References

   [4]  Freed, N., Klensin, J. and J. Postel, "Multipurpose Internet
        Mail Extensions (MIME) Part Four: Registration Procedures",
        BCP 13, RFC 2048, November 1996.

   [5]  Bell, T.
        and I. Witten, "Text Compression", Prentice-Hall, Englewood
        Cliffs, NJ, 1990.

   [6]  Helbing, J., "yEncode - A quick and dirty encoding for
        binaries", version 1.3, 2002,
        <http://www.yenc.org/yenc-draft.1.3.txt>.

Author's Address

   Ned Freed
   Sun Microsystems
   1050 Lakes Drive
   West Covina, CA 91790
   USA

   Phone: +1 626 850 4350
   EMail: ned.freed@mrochek.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   intellectual property or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; neither does it represent that it
   has made any effort to identify any such rights. Information on the
   IETF's procedures with respect to rights in standards-track and
   standards-related documentation can be found in BCP-11. Copies of
   claims of rights made available for publication and any assurances
   of licenses to be made available, or the result of an attempt made
   to obtain a general license or permission for the use of such
   proprietary rights by implementors or users of this specification
   can be obtained from the IETF Secretariat.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights which may cover technology that may be required to practice
   this standard. Please address the information to the IETF Executive
   Director.

Full Copyright Statement

   Copyright (C) The Internet Society (2003). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assignees.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.