Network Working Group                              Ned Freed, Innosoft
Internet Draft

                     Deflate and Deflate-base64:
               Compression Content-Transfer-Encodings

                             April 2000
                        Expires October 2000

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Copyright Notice

Copyright (C) The Internet Society (2000). All Rights Reserved.

1. Abstract

This document defines two additional content-transfer-encodings,
deflate and deflate-base64. These CTEs add to MIME facilities for
lossless, adaptive, general-purpose compression. The first of these,
deflate, produces binary output, while the second, deflate-base64,
produces the same sort of output as the base64
content-transfer-encoding defined in [RFC-2045].

2. Introduction

The MIME specification [RFC-2045] defines several
Content-Transfer-Encodings:

   (1) 7bit, used to label textual 7bit data,

   (2) 8bit, used to label textual 8bit data,

   (3) binary, used to label binary data,

   (4) quoted-printable, normally used to transform 8bit textual
       data to 7bit form, and

   (5) base64, normally used to transform binary data to 7bit form.

All of these encodings produce output that is greater than or equal
to the input data in length. In particular, quoted-printable can
expand its input to roughly three times its original size in the
worst case, and base64 incurs a fixed overhead of about 33%. This
amount of overhead is significant in many applications.

This document defines two new CTEs that incorporate the popular
deflate compression algorithm [RFC-1951]. The resulting material is
often smaller than the input even when the output range is
restricted to the base64 alphabet.

3. Requirements Notation

This document occasionally uses terms that appear in capital
letters. When the terms "MUST", "SHOULD", "MUST NOT", "SHOULD NOT",
and "MAY" appear capitalized, they are being used to indicate
particular requirements of this specification. A discussion of the
meanings of these terms appears in [RFC-2119].

4. Deflate Compression

The deflate compression format [RFC-1951], as used by the PKZIP and
gzip compressors and as embodied in the freely and widely
distributed zlib [Gailly95] library source code, has the following
features:

   (1) an apparently unencumbered encoding and compression
       algorithm with an open and publicly available specification,

   (2) a low-overhead escape mechanism for incompressible data,

   (3) heavy use over many years in networks and on modem and other
       point-to-point links to transfer files for personal computers
       and workstations, and

   (4) the ability to easily achieve 2:1 compression on the Calgary
       corpus [Corpus90] using less than 64 KBytes of memory on both
       sender and receiver.

5. The Deflate Content-Transfer-Encoding

The deflate encoding process consists of applying the deflate
algorithm defined in [RFC-1951] to a MIME object in canonical form.
Since all MIME objects are potentially independent of each other,
the compression history MUST be cleared prior to performing the
compression operation.

The output of this process is binary data. As such, this CTE can
only be used in conjunction with transports capable of handling
binary data.

Decoding consists simply of reversing the encoding process.

6. The Deflate-Base64 Content-Transfer-Encoding

The deflate-base64 encoding process consists of applying the deflate
algorithm defined in [RFC-1951] to a MIME object in canonical form.
The result of the deflate compression operation is then further
encoded using the base64 scheme defined in [RFC-2045]. Since all
MIME objects are potentially independent of each other, the
compression history MUST be cleared prior to performing the
compression operation.

The output of this process has the same range as the base64 CTE and
can be used with any transport.

Decoding consists simply of reversing the encoding process, that is,
first reversing the base64 encoding and then the compression.

7. Security Considerations

The new content-transfer-encodings specified in this document are
believed not to raise any security considerations not already
present in MIME itself.

8. References

[Corpus90]  Bell, T. C., Cleary, J. G., and Witten, I. H., "Text
            Compression", Prentice-Hall, Englewood Cliffs NJ, 1990.
            The compression corpus itself can be found in
            ftp://ftp.uu.net/pub/archiving/zip/

[RFC-1951]  Deutsch, P., "DEFLATE Compressed Data Format
            Specification version 1.3", RFC 1951, May 1996.

[RFC-2045]  Freed, N. and Borenstein, N., "Multipurpose Internet
            Mail Extensions (MIME) Part One: Format of Internet
            Message Bodies", RFC 2045, Innosoft, First Virtual
            Holdings, December 1996.

[RFC-2046]  Freed, N. and Borenstein, N., "Multipurpose Internet
            Mail Extensions (MIME) Part Two: Media Types", RFC 2046,
            Innosoft, First Virtual Holdings, December 1996.

[RFC-2048]  Freed, N., Klensin, J., and Postel, J., "Multipurpose
            Internet Mail Extensions (MIME) Part Four: MIME
            Registration Procedures", RFC 2048, Innosoft, MCI, ISI,
            December 1996.

[RFC-2119]  Bradner, S., "Key words for use in RFCs to Indicate
            Requirement Levels", RFC 2119, March 1997.

9. Author's Address

Ned Freed
Innosoft International, Inc.
1050 Lakes Drive
West Covina, CA 91790
USA

tel: +1 626 919 3600
fax: +1 626 919 3614
email: ned.freed@innosoft.com

10. Full Copyright Statement

Copyright (C) The Internet Society (2000). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph
are included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
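
The encoding and decoding processes described in sections 5 and 6
can be illustrated with a short Python sketch. It is a minimal,
non-normative example under stated assumptions, not a definitive
implementation. It assumes that "the deflate algorithm defined in
[RFC-1951]" means a raw deflate stream with no zlib or gzip wrapper
(selected here with wbits=-15), and that creating a fresh compressor
for each MIME object satisfies the requirement that the history be
cleared. All function names are hypothetical, and the input is
assumed to be in canonical form already.

```python
import base64
import zlib

# Assumption: a negative wbits value asks zlib for a raw RFC 1951
# stream (no zlib header or checksum, no gzip wrapper).
RAW_DEFLATE = -zlib.MAX_WBITS


def deflate_encode(canonical: bytes) -> bytes:
    """Sketch of the 'deflate' CTE: raw deflate over one MIME object."""
    # A new compressor per object starts with an empty history window.
    compressor = zlib.compressobj(
        level=9, method=zlib.DEFLATED, wbits=RAW_DEFLATE
    )
    return compressor.compress(canonical) + compressor.flush()


def deflate_decode(encoded: bytes) -> bytes:
    """Reverse deflate_encode: inflate one raw deflate stream."""
    decompressor = zlib.decompressobj(wbits=RAW_DEFLATE)
    return decompressor.decompress(encoded) + decompressor.flush()


def deflate_base64_encode(canonical: bytes) -> bytes:
    """Sketch of the 'deflate-base64' CTE: deflate, then base64."""
    # encodebytes wraps output at 76 characters per line; it ends
    # lines with LF, so a mail agent would still convert them to CRLF.
    return base64.encodebytes(deflate_encode(canonical))


def deflate_base64_decode(encoded: bytes) -> bytes:
    """Reverse the base64 step first, then the compression."""
    return deflate_decode(base64.decodebytes(encoded))


if __name__ == "__main__":
    body = b"Hello, MIME world!\r\n" * 50
    binary_form = deflate_encode(body)
    text_form = deflate_base64_encode(body)
    assert deflate_decode(binary_form) == body
    assert deflate_base64_decode(text_form) == body
    print(len(body), len(binary_form), len(text_form))
```

The decoder order mirrors section 6: the base64 layer is removed
before the data is inflated. On highly repetitive input such as the
sample body, the base64 form is still shorter than the original,
which is the effect the introduction describes.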