Network Working Group M. Kucherawy Internet-Draft January 28, 2014 Intended status: Informational Expires: August 1, 2014 Adding Capabilities to Email draft-kucherawy-email-caps-01 Abstract Many enhancements to the email service have appeared since its basic format and protocol were defined. However, the lines between these are sometimes blurred as a result of the evolution of the software that implements them. This document provides observations about these points and some clear guidelines for those interested in developing further enhancements. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on August 1, 2014. Copyright Notice Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as Kucherawy Expires August 1, 2014 [Page 1] Internet-Draft Adding Email Capabilities January 2014 described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 3 3. The Protocol . . . . . . . . . . . . . . . . . . . . . . . . . 3 4. The Message . . . . . . . . . . . . . . . . . . . . . . . . . . 4 5. Header vs. Envelope . . . . . . . . . . . . . . . . . . . . . . 5 6. Deployment Observations and Results . . . . . . . . . . . . . . 5 7. Guidance for Future Work . . . . . . . . . . . . . . . . . . . 7 8. Security Considerations . . . . . . . . . . . . . . . . . . . . 7 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 8 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 8 10.1. Normative References . . . . . . . . . . . . . . . . . . . 8 10.2. Informative References . . . . . . . . . . . . . . . . . . 8 Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . . 8 Kucherawy Expires August 1, 2014 [Page 2] Internet-Draft Adding Email Capabilities January 2014 1. Introduction The email service has two core components: The format of a message, and the transport for that message. The format was originally fully specified in [RFC0733], though it has some antecedents in the RFC archive. The current specificaton is [RFC5322]. This format describes two sections: a "header" and a "body". The body contains the actual content fo the message itself, while the header conveys some metadata such as who sent it, to whom it was sent, where replies are to be directed, etc. The Simple Mail Transfer Protocol (SMTP) was originally fully specified in [RFC0821], though it too was based on some other previous work. The current specification is [RFC5321]. The protocol is essentially a simple ASCII dialog between the client (sending) system and the server (receiving) system that exchanges a couple of identifiers -- the sender and the recipient -- and then the message itself, with status codes as the responses at each step. A number of enhancements to both of these have appeared over the years, which are too numerous to list here. They range in popularity and deployment. Some of these are enhancements to format (such as the addition of multimedia support), others to the protocol (such as enhanced error handling), and a few have augmented both. The original and increased complexity of the service has led to a body of deployed code that has in turn had some impacts on the development of enhancements over time. The IETF has observed that this leads to enhancements being developed in what it believes to be a misguided way, given the intended architecture. This document describes the relationship between the two major components and provides advice on developing future enhancements, while also being sensitive to the nature of currently deployed software, and the community's use of the service in general. 2. Definitions For a description of the email architecture, consult Internet Mail Architecture [RFC5598]. 3. The Protocol The Simple Mail Transfer Protocol (SMTP) is the language spoken by email clients and servers to exchange messages. The protocol is all in printable ASCII, which makes it easy for users to "speak" the protocol directly for the purposes of testing, debugging, or Kucherawy Expires August 1, 2014 [Page 3] Internet-Draft Adding Email Capabilities January 2014 illustration. Essentially, the client introduces itself to the server, which replies with a similar greeting. The client declares that it has a message from a given party for delivery to one or more parties, followed by a declaration that it is ready to send the content. When the server is ready, the client relays its payload (the message) to the server. Finally, the server accepts the message, usually returning a code to the client that uniquely identifies this transaction so that later analysis of the specific transaction is possible. This sequence can repeat if the client has multiple messages to relay. When no more relaying is to be done, the two politely disconnect, and the dialog is complete. One could make the analogy of a person (perhaps a postal worker) speaking to another person (perhaps at home) and the former handing the latter an envelope bearing a sender address and a recipient address. The specific contents are not known to either of these parties at this stage; merely the exchange takes place. An important point here is that once the exchange is complete, the first party no longer has the message. This is one of the intentional properties of the email service; the message always exists in exactly one place. An envelope, in this illustration, can name more than one recipient. An agent holding a message with such an envelope may find it must next relay the message to multiple independent servers to complete delivery to each recipient. In this case, that agent clones ("splits") the envelope, resuting in multiple envelopes each with a subset of the previous recipient set, but with identical content. 4. The Message The email format is meant to convey the "core" content of a message between two or more parties. It consists of a header and a body. The body is the actual message, and in modern terms it can contain unstructured plain text, structured multimedia, or nothing at all. The header consists of a set of header fields that include meta-data about the content, such as identifying the party (or parties) that generated it, which agents handled it in transit, the date and time at which it was generated, the (apparent) set of intended recipients, etc. In the case of structured content, the header also contains the initial set of details needed to extract the structure. If one imagines a printed memo, with fields like "From", "To", "Subject", "Date", and perhaps "Cc", it is easy to envision a simple email message; these fields are at the top, separated by some kind of Kucherawy Expires August 1, 2014 [Page 4] Internet-Draft Adding Email Capabilities January 2014 divider (which might be just an extra blank line or two) followed by the body of the memo. It is in this image that the email format was also created. 5. Header vs. Envelope It is useful to consider this separation of function of the message versus the envelope when considering the design of any enhancement to the email service. Importantly, what's on that memo is completely independent of what was on the envelope that contained it. The memo might say "From: Alice" and "To: Bob", while the envelope said "From: Charlie" and "To: Donald". More generally, there is no guarantee that the content and the transport have any relationship at all. An example of non-core material that is rightly a property of the message and not the envelope includes digital signatures. One might think of the mark or seal of a notary, which is meant to certify the content and not the envelope containing it. SMTP also has the notion of "Trace Information" which is a record of the agents that handled the message prior to delivery and when they each processed the message. One might think of a premium package handling service that includes tracking as part of its product, showing through which stations the package was carried and a date/ time at each. Email trace information fulfills the same goal, and is normally recorded as header fields. However, any message can be forwarded by a user or a piece of software (such as a mailing list service). In this case, it is appropriate to think of the message content as taking on a new life beyond its original delivery, with a new envelope and possibly a new or revised header or even augmented content. Caution must be taken when constructing a new header so that information relevant only to the original delivery does not get forwarded; this leakage of information can lead to mishandling of the content or even leakage of private information to the new recipient(s). 6. Deployment Observations and Results As the email service has grown in popularity, it also became a popular target for abuse. In particular, it became a vector for delivery of unwanted commercial email ("spam") or even malicious active content ("malware", such as viruses or worms). Both of these attempt to exploit user trust (and naivete) in order to deliver undesirable content. Among other things, false or misleading From and Subject fields on messages are commonplace. Kucherawy Expires August 1, 2014 [Page 5] Internet-Draft Adding Email Capabilities January 2014 Mail User Agents (MUAs) retrieve messages from message stores, and not from the Message Transfer Agents (MTAs) or Message Delivery Agents (MDAs) that affect transport and delivery of messages. As such, they do not have access to the parameters exchanged during the protocol sessions that resulted in the delivery. This led to many MUA extensions that were activated by way of custom header fields rather than extensions to SMTP, or to MUA protocols such as the Internet Message Access Protocol (IMAP) or Post Office Protocol (POP). These two factors have led to increased flexibility in email handling agents (primarily Message Transfer Agents, or MTAs) for making decisions about, or altering, header fields on messages as they arrived. By contrast, very little in the way of messaging abuse takes place via misuse of SMTP extensions. This has resulted in the current environment, in which it is often very easy to add, alter, remove, and analyze header fields on a message, and typically very difficult if not impossible to add or process an SMTP extension for which built-in support does not already exist. By its nature, an MTA or MDA that implements an SMTP extension advertises this fact in its reply to the EHLO command verb when issued by a client. A client, therefore, that has support for a particular extension can easily determine if the next agent to which it will relay a message also supports that extension. If that agent does not include such support, the current agent must decide to do one of two things: a. consider the delivery a failure, and begin processing it as an error; or b. relay the message anyway, losing the capability afforded by the extension. The former is required, of course, if the capability offered by the extension needs to be active from the message's origin to its final destination. By contrast, an MTA or MDA that does not know of a particular special interpretation for a specific header field will almost always simply ignore that header field and continue to relay it unmodified. This can be enormously useful when one wishes to deploy a new capability that will not be affected by non- participating agents. Another distinguishing property between the two approaches is that an SMTP client can immediately determine which protocol extensions are supported, while extensions that involve header fields are not visible to the client. Header field extensions, therefore, need to be designed such that they can be ignored by agents that don't Kucherawy Expires August 1, 2014 [Page 6] Internet-Draft Adding Email Capabilities January 2014 support them. As a result, it is common to assume that adding a new capability to the email service is best accomplished by creating (and hopefully, registering) a new header field specific to that purpose, even if that capability would more properly be implemented as an SMTP extension. In a few very rare cases, new capabilities have even been developed that include both header field and SMTP extension forms. 7. Guidance for Future Work Enhancements to the email system should be designed such that they appear in the protocol when they have to do with handling and delivery, and in the message when they directly affect the content in ways that do not involve the transport of the content. An additional rule of thumb is to ensure that extensions specific to an envelope address need to appear as SMTP extensions rather than header fields. Overloading the header for enhancements that do not fit the envelope vs. content model may be convenient given the current deployed environment, but they result in such issues as: o pollution of the header field namespace; o often needless enlargement of messages; and o inadvertent leakage of data not relevant to later message recipients. MTA and MDA implementers need to ensure that SMTP extensions can be added and handled via the runtime environment as easily as they can be for header fields. This will ensure the more sound architectural decisions can be made by designers of future enhancements. 8. Security Considerations An important observation is that the envelope and the header overlap in only a small number of key ways. The most obvious of these is the Return-Path header field, added at time of delivery, which includes the sender address as extracted from the message envelope. Typically, all other envelope details are discarded upon delivery. Because of this, data that should be ephemeral but are stored in header fields can fall into the wrong hands when the messages is forwarded. Following the recommendations above can help to reduce this concern. Kucherawy Expires August 1, 2014 [Page 7] Internet-Draft Adding Email Capabilities January 2014 9. IANA Considerations This document contains no actions for IANA. [RFC Editor: Please remove this section prior to publication.] 10. References 10.1. Normative References [RFC5598] Crocker, D., "Internet Mail Architecture", RFC 5598, July 2009. 10.2. Informative References [RFC0733] Crocker, D., Vittal, J., Pogran, K., and D. Henderson, "Standard for the format of ARPA network text messages", RFC 733, November 1977. [RFC0821] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821, August 1982. [RFC5321] Klensin, J., "Simple Mail Transfer Protocol", RFC 5321, October 2008. [RFC5322] Resnick, P., Ed., "Internet Message Format", RFC 5322, October 2008. Appendix A. Acknowledgments John Levine provided useful review comments during the early development of this work. Author's Address Murray S. Kucherawy 270 Upland Drive San Francisco, CA 94127 USA EMail: superuser@gmail.com Kucherawy Expires August 1, 2014 [Page 8]