Network Working Group                                         W. Leibzon
Internet-Draft                                             Elan Networks
Expires: December 18, 2005                                 June 16, 2005


       SMTP Extension for Advertisement of External-Body Content
                          Retrieval Capability
                 draft-leibzon-smtp-retrievecontent-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 18, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This document describes an ESMTP extension by means of which mail
   agents can report their capability to retrieve message content from
   a remote location specified with the MIME/External-Body type.  This
   saves senders from having to send data for content parts that may
   not be of interest to the recipient, and is especially useful when
   content is available in several alternative formats or languages and
   it is not known which one the recipient prefers.

Leibzon                 Expires December 18, 2005               [Page 1]
Internet-Draft       RETRIEVECONTENT SMTP Extension            June 2005

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Table of Contents

   1.  Introduction
   2.  External Body Content Retrieval Service Extension
   3.  The RETRIEVECONTENT Keyword of the EHLO Command
   4.  'Retrieval' MIME/External-Body Parameter
   5.  Content-Alternative-Access MIME Field
   6.  RETRIEVECONTENT RCPT Parameter
     6.1   Simple form of RETRIEVECONTENT RCPT Parameter
     6.2   Extended form of RETRIEVECONTENT RCPT Parameter
     6.3   Special Error Codes for use after RETRIEVECONTENT RCPT
   7.  Retrieved Trace Header Field
     7.1   From parameter of Retrieved Trace Field
     7.2   Content-id parameter of Retrieved Trace Field
     7.3   With parameter of Retrieved Trace Field
     7.4   Example of Retrieved Trace Field
   8.  Data Retrieval
     8.1   Data Retrieval by MDA and MUA
     8.2   Data Retrieval at Intermediate MTA
     8.3   Retrieved Content Data Integrity
     8.4   Example of Message Data Retrieval
   9.  IANA Considerations
   10. Security Considerations
   11. References
     11.1  Normative References
     11.2  Informative References
       Author's Address
       Intellectual Property and Copyright Statements

1.  Introduction

   [RFC2046] describes how an email message can include a reference to
   external content to be retrieved by the recipient.  This is often
   done to avoid including a large file as an attachment when it is not
   certain that the recipient needs it (such as a reference to an FTP
   site containing an IETF Internet-Draft, as opposed to inclusion of
   the entire text of the draft).  In other cases the same content is
   available in several formats (for example, XML and text formats of
   an Internet-Draft) and the recipient can then download only the data
   in the preferred format.  This avoids unnecessary data traffic and
   reduces the load on, and work performed by, sender and recipient
   mail systems.

   The current system described in section 5.2.3 of [RFC2046] and in
   [RFC2017] (and originally specified in [RFC1521]) relies on the
   special MIME/External-Body MIME type being supported by both the
   sender and the recipient MUA.  The sender also has to arrange for
   the content to be placed on an external distribution server, and the
   recipient MUA needs to support the type of access used to retrieve
   data from that server (this can be specified as some type of URL,
   such as ftp or http, or as a special custom MIME/External-Body
   access type).  The recipient also needs Internet connectivity at the
   time the email is read so that the data can be retrieved.  All of
   this makes the mechanism unsuitable for communication between two
   parties who have not made prior arrangements and have not verified
   that both support the MIME/External-Body MIME type and the proper
   access type.  Because of these limitations, support for
   MIME/External-Body has been added to MUAs slowly and its use is
   still very limited.

   This document attempts to bring external message content into wider
   use in a more automated manner.  It relies on the fact that
   destination mail servers quite often already know the recipient's
   preferences as to what kind of data will be accepted (today this is
   often implemented as a filter that discards unwanted email or
   removes unwanted attachments).  This document therefore describes a
   way for a mail server to advertise to the mail client its capability
   to retrieve the data content automatically, together with the types
   of retrieval access it supports, so that the client can safely use
   the MIME/External-Body content type.

2.  External Body Content Retrieval Service Extension

   Using the SMTP service extension mechanism described in [RFC1869],
   the following service extension is hereby defined:

   1.  The name of this SMTP service extension is "External Body
       Content Retrieval"

   2.  The EHLO keyword value associated with this extension is
       "RETRIEVECONTENT"

   3.  The "RETRIEVECONTENT" EHLO keyword may contain as a parameter a
       space separated list of server supported External-Body content
       access types

   4.  No additional SMTP verbs are defined by this extension

   5.  An optional parameter using the keyword "RETRIEVECONTENT" is
       added to the RCPT command and extends the maximum line length of
       the RCPT command by 500 characters

   6.  This extension may be used with the submission protocol
       [RFC2476]

3.  The RETRIEVECONTENT Keyword of the EHLO Command

   The RETRIEVECONTENT keyword in EHLO is used by a mail server to
   indicate its support for processing External-Body messages as
   discussed in this document.  Additional optional parameters indicate
   when the mail site is able to retrieve MIME/External-Body content
   automatically and which types of access are supported.
   Note that a "mail site" may be an SMTP mail server program, but may
   also be an external program (possibly located on an external server)
   that the mail server can call, or direct email to, for further
   processing when retrieval of content referenced by
   MIME/External-Body is requested.  For a mail client, knowing which
   access types are supported makes it safe to transmit a reference,
   since the content will be delivered to the recipient (if the
   recipient accepts such content) no matter what kind of MUA the
   recipient has.

   As described in section 5.2.3.1.(1) of [RFC2046], the MIME header
   line "Content-Type: message/external-body" must include the
   parameter "Access-Type" with one of the values listed in the IANA
   registry.  These values, exactly as listed, are acceptable as
   parameters following the RETRIEVECONTENT keyword.

   A special case is the URL access type described in [RFC2017], which
   allows a large number of different access methods to be referenced
   by means of the standard URL reference system.  To advertise the
   capability to use a particular URL access method, the parameter
   listed after the RETRIEVECONTENT keyword consists of "URL:" followed
   by the URL type.  For example, if a mail site supports HTTP URL
   access, it can list the parameter "URL:HTTP" after the
   RETRIEVECONTENT EHLO keyword.

   An example of EHLO with RETRIEVECONTENT, listing both URL and normal
   access types, is:

      250-mail.example.com ESMTP server ready
      250-AUTH
      250-RETRIEVECONTENT FTP ANON-FTP URL:HTTP URL:FTP URL:IMAP
      250 SIZE

4.  'Retrieval' MIME/External-Body Parameter

   An additional parameter is introduced for the MIME/External-Body
   Content-Type header that makes it possible to distinguish data meant
   for automated retrieval from content meant for manual user-directed
   retrieval.  The parameter name is RETRIEVAL and it can have the
   following values:

   1.  NOAUTO - This is the default and specifies that the content is
       not meant for automated retrieval.  Mail systems that can do
       automated content retrieval (as advertised by the EHLO
       RETRIEVECONTENT keyword) should ignore all MIME/External-Body
       content that has this RETRIEVAL option set or that has no
       RETRIEVAL parameter at all.

   2.  AUTOPREF - This is used for content that should be retrieved by
       automated means if the mail site supports the listed access
       type; if it does not, this is not an error and the mail should
       be passed along to the MUA just as with NOAUTO.  This means that
       if a mail client has a message with MIME/External-Body content
       carrying this RETRIEVAL parameter and, during EHLO, the server
       did not advertise RETRIEVECONTENT with the correct Access-Type,
       the mail client should still pass the message along as is.  For
       an MDA, it means that if it does not have specific information
       that the recipient's MUA can retrieve the content, it should go
       ahead and retrieve it when placing the message into the user's
       mailbox.  The AUTOPREF option is meant for use by MUAs that want
       more assurance that message content will reach the user than
       can be achieved with NOAUTO.

   3.  AUTO - This is the main RETRIEVAL option for when the content
       reference is added by the source mail client and is meant for
       automated retrieval by the destination site.  The retrieval
       would typically be expected to happen at the MDA, but if the MDA
       knows for certain that the recipient's MUA can retrieve the type
       of content specified, it may leave retrieval to the MUA.  When
       that happens, the MUA SHOULD attempt to retrieve the content,
       but prior to doing so it MAY report to the user that the message
       has such external content and MAY ask for permission to
       retrieve it.

   4.  ONLYAUTO - This is almost equivalent to AUTO, except that in all
       cases the message's external content is expected to be retrieved
       by the MDA (based on user preferences if appropriate) and the
       complete message is expected to be placed into the user's
       mailbox.  This SHOULD NOT be the default option for mail systems
       adding external content but MAY be used for specific recipients
       who are known to have problems with their MUA.

   Based on the above explanation, the following table provides a quick
   view of which RETRIEVAL options can be used by MUAs and which by
   MTAs:

      +-------------------+---------------+---------------+
      |                   |   Added by    | Retrieval by  |
      | Retrieval Option  |  MTA  |  MUA  |  MTA  |  MUA  |
      +-------------------+-------+-------+-------+-------+
      | NOAUTO            |  NO   |  YES  |  NO   |  YES  |
      | AUTOPREF          |  NO   |  YES  |  YES  |  YES  |
      | AUTO              |  YES  |  NO*  |  YES  |  YES  |
      | ONLYAUTO          |  YES  |  NO   |  YES  |  NO   |
      +-------------------+-------+-------+-------+-------+

      * - If the SUBMIT server is known to support RETRIEVECONTENT then
          the MUA may decide to use AUTO; however, it is generally
          recommended that the MUA set the RETRIEVAL parameter to
          AUTOPREF for MIME/External-Body content

   An example of MIME/External-Body Content-Type header lines with the
   Retrieval parameter is:

      Content-Type: message/external-body; retrieval=AUTO;
                    access-type=ANON-FTP;
                    name="218F64C460";
                    site="mail.example.com";
                    directory="/mail/outspool/";
                    expiration="Tue, 03 May 2005 19:01:03 -0400 (EDT)"

      Content-Type: text/plain; charset=US-ASCII; format=flowed
      Content-ID: <218F64C460.u314@example.com>

   Note: The above example should not be taken as a recommendation to
   use the ANON-FTP access type.  In fact the opposite is true: it is
   recommended that URL access types be used whenever possible.

5.  Content-Alternative-Access MIME Field

   In some cases the data is available in more than one form or by more
   than one method, and currently this is handled by including all
   forms of the data in a common Multipart/Alternative content
   structure.
   This system is better suited to data that is available in multiple
   formats (such as text and html) and is too complex when all that is
   needed is to specify that the data can be retrieved by more than one
   method.

   For cases where only the retrieval method differs and the actual
   data is the same, a new Content-Alternative-Access MIME header field
   is introduced.  Its syntax is the same as that of the Content-Type
   header field, except that it does not contain an actual content type
   and consists only of the Message/External-Body parameters associated
   with data retrieval, starting with access-type.  Since the RETRIEVAL
   parameter (as specified in section 4 of this document) applies to
   message/external-body in general and does not depend on the type of
   access provided for the data, it should normally not be included in
   the Content-Alternative-Access field.

   An example of using the Content-Alternative-Access MIME field is as
   follows:

      Content-Type: message/external-body; retrieval=AUTO;
                    access-type=ANON-FTP;
                    name="218F64C460";
                    site="mail.example.com";
                    directory="/mail/outspool/";
                    expiration="Tue, 03 May 2005 19:01:03 -0400 (EDT)"
      Content-Alternative-Access: access-type=URL;
                    URL="http://mail.example.com/mail/outspool/218F64C460";
                    expiration="Tue, 03 May 2005 19:01:03 -0400 (EDT)"

   Note that there can be multiple Content-Alternative-Access MIME
   fields in the same MIME header.

6.  RETRIEVECONTENT RCPT Parameter

   Mail clients that support the RETRIEVECONTENT SMTP extension and
   that send an email message including any MIME/External-Body data
   parts with the RETRIEVAL Content-Type parameter set to anything
   other than NOAUTO MUST use the RETRIEVECONTENT parameter with the
   RCPT command.  The RETRIEVECONTENT parameter can have either a
   simple form, as a single keyword, or an extended form with
   additional data following "=".
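
   The client-side choice between the two forms can be illustrated
   with a minimal Python sketch.  It is not part of the extension
   definition: the function names are hypothetical, the
   case-insensitive matching of access types and the rule of quoting
   values that contain ":" are assumptions (the latter taken from the
   look of the RCPT example in section 6.2), and the conditions for
   using the extended form are those listed in section 6.2.

```python
def advertised_access_types(ehlo_lines):
    """Collect the access types listed after the RETRIEVECONTENT EHLO keyword."""
    for line in ehlo_lines:
        # Drop the "250-" or "250 " reply prefix, then split keyword and parameters.
        words = line[4:].split()
        if words and words[0].upper() == "RETRIEVECONTENT":
            return {w.upper() for w in words[1:]}
    return None  # the extension was not advertised


def needs_extended_form(access_type, advertised, seconds_until_expiry):
    """Section 6.2: type not advertised, or Expiration under 24 hours away."""
    if access_type.upper() not in advertised:
        return True
    return seconds_until_expiry is not None and seconds_until_expiry < 24 * 3600


def rcpt_parameter(retrieval=None, access_types=(), expires_in=None):
    """Format the simple form (no arguments) or the extended form."""
    elems = []
    if retrieval is not None:
        elems.append("RETR:" + retrieval)
    for at in access_types:
        # Assumption: a value containing ":" is written as a quoted string.
        elems.append("AT:" + ('"%s"' % at if ":" in at else at))
    if expires_in is not None:
        elems.append("EXP:%d" % expires_in)
    if not elems:
        return "RETRIEVECONTENT"
    return "RETRIEVECONTENT=(" + ",".join(elems) + ")"


ehlo = [
    "250-mail.example.com ESMTP server ready",
    "250-AUTH",
    "250-RETRIEVECONTENT FTP ANON-FTP URL:HTTP URL:FTP URL:IMAP",
    "250 SIZE",
]
advertised = advertised_access_types(ehlo)
if needs_extended_form("URL:RSS", advertised, 172800):
    print(rcpt_parameter("AUTO", ["URL:IMAP", "URL:RSS"], 172800))
else:
    print(rcpt_parameter())
```

   In this sketch a part whose access type is URL:RSS triggers the
   extended form because that type is absent from the EHLO list, while
   a part using URL:HTTP with an Expiration two days away would be
   covered by the simple form.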

   The ABNF [RFC2234] syntax of the RETRIEVECONTENT parameter of RCPT
   is as follows:

      RCPT-RETRIEVECONTENT = "RETRIEVECONTENT" /
                             *(retrcontent-extform FWS)
      retrcontent-extform  = "RETRIEVECONTENT=(" elem *("," elem) ")"
      elem                 = elem-name ":" elem-data
      elem-name            = "RETR" / "AT" / "EXP" / word
      elem-data            = word / phrase
      phrase               = 1*word

6.1  Simple form of RETRIEVECONTENT RCPT Parameter

   In the simple form (just RETRIEVECONTENT without additional data)
   the use of the keyword means that:

   1.  The message contains MIME/External-Body data parts that will
       need to be retrieved for the user to see the complete mail
       message.

   2.  All MIME/External-Body data parts that have the Retrieval
       parameter of Content-Type set to AUTO or ONLYAUTO have an
       Access-Type that was on the list of supported access types the
       mail server advertised after the RETRIEVECONTENT EHLO keyword.

   3.  All MIME/External-Body data parts that have the Retrieval
       parameter of Content-Type set to anything other than NOAUTO
       either have no Expiration parameter or have Expiration set at
       least 24 hours in the future from the time of transmission.  A
       mail client MAY also be stricter and make certain that
       Expiration is set several days in the future, based on local
       policies regarding how long it is considered acceptable for an
       email message to be in transit.

6.2  Extended form of RETRIEVECONTENT RCPT Parameter

   In the extended form, when the RETRIEVECONTENT parameter of RCPT has
   data after "=", this data specifies which Retrieval, Access-Type and
   Expiration parameters the MIME/External-Body message parts have.
   These are specified as data elements inside "( )", i.e.
   "RETRIEVECONTENT=(RETR: #1,AT: #2,EXP: #3)" where:

      #1 (following "RETR:") is the Retrieval parameter of Content-Type

      #2 (following "AT:") is the Access-Type parameter of Content-Type

      #3 (following "EXP:") is the amount of time (expressed in
         seconds) for which access to the content should be possible,
         based on the Expiration parameter

   RETRIEVECONTENT may also have more than 3 data elements inside
   "( )".  In particular, if Content-Alternative-Access is used and
   specifies a different alternative access-type, then multiple "AT"
   elements may be present inside "( )", and similarly a separate EXP
   may follow each.  Unknown data element names inside "( )" should be
   ignored by the mail server unless it is aware of what they specify.

   There can be multiple RETRIEVECONTENT parameters in extended form
   following RCPT.  Mail clients should add one after RCPT for each
   MIME/External-Body part that has a Retrieval parameter other than
   NOAUTO if

   1.  the Access-Type parameter names a type that was not advertised
       by the mail server after the EHLO RETRIEVECONTENT keyword, OR

   2.  the Expiration parameter is less than 24 hours in the future.

   When all data elements (everything after "=") would be the same for
   multiple extended-form RETRIEVECONTENT RCPT parameters, the mail
   client SHOULD add only one.

   An example of an extended RCPT command with the RETRIEVECONTENT
   keyword is as follows:

      RCPT TO: RETRIEVECONTENT=(RETR:AUTO,AT:"URL:IMAP",AT:"URL:RSS",EXP:172800)

6.3  Special Error Codes for use after RETRIEVECONTENT RCPT

   Mail servers supporting the RETRIEVECONTENT SMTP extension may use
   new special error codes after receiving an RCPT command that
   contained the extended form of the RETRIEVECONTENT parameter.  The
   error codes are as follows:

   233 Retrieval Access-Type not supported

      Issued in the case of AUTOPREF if the ACCESS-TYPE is not
      supported by the mail server or known client (i.e. there is no
      guarantee that the mail content will be received by the user).
      A client MTA may decide to inform the user of this and may,
      based on other preferences, decide to retrieve the data itself
      and send it through as full message content.

   433 Retrieval Access-Type not supported

      Issued in the case of AUTO or ONLYAUTO when the ACCESS-TYPE is
      not supported by the mail server.  A client MTA MUST NOT continue
      to DATA and MUST either attempt to retrieve the data itself or
      return (bounce) the message to its origin.

   Note that an RCPT extension (as opposed to a MAIL extension) was
   chosen specifically because one recipient's system may support
   retrieval of external content while another's does not.  In that
   case the error codes specified above let the mail client know
   exactly which recipients the message can be delivered to with
   external-body content and for which recipients that is not possible.

7.  Retrieved Trace Header Field

   When a mail server has retrieved data, it should record this event
   as trace data in the message header.  A new trace header field
   "Retrieved" should be used for this purpose.

   The ABNF [RFC2234] syntax of the "Retrieved:" trace header field is
   as follows:

      Retrieved      = "Retrieved:" FWS "by" system-info FWS
                       params-list [FWS] ";" date-time CRLF
      system-info    = system-name [optional-info]
      system-name    = domain / address-literal
      params-list    = param *(FWS param)
      param          = param-name FWS data-val-list [optional-info]
      param-name     = ALPHA *(["-"] (ALPHA / DIGIT))
      optional-info  = FWS "(" [FWS] data-val-list [FWS] ")"
      data-val-list  = data-val-pair *([FWS] "," FWS data-val-pair)
      data-val-pair  = item-name [ "=" item-value ]
      item-name      = ALPHA *(["-"] (ALPHA / DIGIT))
      item-value     = quoted-string

   Of the structures not defined above, date-time, quoted-string, FWS
   and CRLF are defined in [RFC2822], and domain and address-literal
   are defined in [RFC2821].
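
   As an illustration only, the following Python sketch composes a
   Retrieved field.  It follows the layout of the example in section
   7.4 rather than a strict reading of the grammar above, and uses the
   "from", "content-id" and "with" parameters described in sections
   7.1 to 7.3.  The function name, its argument list, and the choice
   of folding each parameter onto its own line are assumptions made
   for the sketch.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def retrieved_field(by_host, by_ips, from_host, from_ips,
                    content_id, access_type, url, when):
    """Compose a folded "Retrieved:" trace field as one CRLF-terminated string."""

    def system_info(host, ips):
        # One "ip=[...]" item per address, separated by ", " inside "( )".
        info = ", ".join("ip=[%s]" % ip for ip in ips)
        return "%s (%s)" % (host, info) if info else host

    parts = [
        "Retrieved: by " + system_info(by_host, by_ips),
        "from " + system_info(from_host, from_ips),
        "content-id <%s>" % content_id,
        'with %s (URL="%s")' % (access_type, url),
        "; " + format_datetime(when),
    ]
    # CRLF followed by spaces is folding white space between parameters.
    return "\r\n    ".join(parts) + "\r\n"


field = retrieved_field(
    "end.example.net", ["192.168.0.1"],
    "mail.example.com", ["192.168.10.10"],
    "218F64C460.u314@example.com",
    "URL:HTTP", "http://mail.example.com/mail/outspool/1234",
    datetime(2005, 6, 16, 4, 17, 34, tzinfo=timezone.utc),
)
print(field)
```

   A system with both IPv4 and IPv6 addresses would pass both in the
   address list, producing two "ip=" items inside the parentheses.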

   The Retrieved header field should start with information about the
   system adding it.  The fully qualified domain name of the system
   MUST follow the "by" clause, and the IP address of the system SHOULD
   be recorded inside the "( )" data structure that follows the system
   name.  The IP address should be recorded as "ip=[a.b.c.d]", where
   a.b.c.d is the actual IP address.  If the system has IP addresses of
   multiple types (i.e. both IPv4 and IPv6 addresses), then all of
   these addresses SHOULD be added, using more than one instance of
   "ip=" separated by ",".

   Here is how this would look for the system end.example.net, which
   has IPv4 address 192.168.0.1:

      Retrieved: by end.example.net (ip=[192.168.0.1])

   As indicated by the above syntax, in addition to the "by" clause the
   Retrieved field has several additional parameters:

7.1  From parameter of Retrieved Trace Field

   The "from" parameter of Retrieved is used to indicate where the data
   content has been retrieved from.  As with "by", this is the fully
   qualified domain name of the system, optionally followed by its IP
   address inside a "( )" data structure.

   An example of the from parameter is as follows:

      from mail.example.com (ip=[192.168.10.10])

7.2  Content-id parameter of Retrieved Trace Field

   The "content-id" parameter of Retrieved is used to indicate which
   content was retrieved.  This parameter should be added if the
   retrieved data had a "Content-ID" MIME header field (see [RFC2392]),
   and it contains the same id as was present in that header field.

   So, for example, if the content had this MIME header field:

      Content-ID: <218F64C460.u314@example.com>

   then the content-id parameter of Retrieved would be:

      content-id <218F64C460.u314@example.com>

7.3  With parameter of Retrieved Trace Field

   The "with" parameter of Retrieved is used to indicate what protocol
   was used for retrieving the data, and specifies what was in the
   access-type parameter of the External-Body MIME content type.
   Following this, an optional "( )" may specify the URL from which the
   data was retrieved, or the name of the file and the site name.

   An example of the with parameter:

      with URL:HTTP (URL="http://mail.example.com/mail/outspool/1234")

7.4  Example of Retrieved Trace Field

   Putting all of the parameters described above together, a full
   Retrieved trace field could look like:

      Retrieved: by end.example.net (ip=[192.168.0.1])
          from mail.example.com (ip=[192.168.10.10])
          content-id <218F64C460.u314@example.com>
          with URL:HTTP (URL="http://mail.example.com/mail/outspool/1234")
          ; Thu, 16 Jun 2005 04:17:34 +0000

8.  Data Retrieval

8.1  Data Retrieval by MDA and MUA

   When retrieving the data, the mail system on the recipient end (MDA)
   should follow policies set by the recipient, which means that not
   all content is going to be retrieved, only content of interest to
   the recipient.  The decision on what is of interest to the recipient
   may be made automatically by the mail system, which can often base
   this decision on mail filtering software that looks at message
   header data or at the location and URL of the content to be
   retrieved.  In other cases, if the MDA knows that the recipient is
   using a mail client that can retrieve the data directly, it may want
   to off-load retrieval to the end-user system (which allows the end
   user to decide whether the email content is of interest by looking
   at where the message came from and at its subject).

   The ability to retrieve only data that is of interest to the
   recipient can save a considerable amount of bandwidth on the mail
   system, as it is estimated that for over 75% of mail messages
   recipients do not read the content and can determine whether it is
   of interest just by looking at who the message is from and at its
   subject.  Additionally, in many cases messages are sent with more
   than one content version (such as text and html) because the sender
   does not know which version the recipient prefers.  Using data
   retrieval saves bandwidth and processing time in this case as well,
   since the recipient retrieves only the preferred data format (and
   this becomes known to the sender automatically, based on which
   content type was retrieved, so that producing several content
   versions may be avoided in the future).  Similar are cases where the
   sender has content available in multiple languages and wants to give
   the recipient a choice of language for content retrieval.

   When content data is retrieved, it should entirely replace the
   corresponding message/external-body content part in the message.  If
   the data has been retrieved by an intermediate mail system (this
   includes an MTA or MDA, but not the MUA, which is the end-user mail
   system), the message appears to all subsequent systems as if it had
   originally been sent as a full message with that data present.  No
   changes in the MUA are therefore required when RETRIEVAL=AUTO is
   used, although, as mentioned above, if the MUA supports retrieval,
   the MDA may take that into account and let the MUA do it.
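
   The selection of a preferred format described above might be
   sketched in Python as follows.  This is only an illustration under
   stated assumptions: the function name and the representation of the
   recipient's policy as an ordered list of content types are
   hypothetical, only the URL access type is handled, and the sample
   message is a shortened form of the kind of message shown in section
   8.4.

```python
from email import message_from_string

# RETRIEVAL values that permit automated retrieval (section 4).
AUTOMATED = {"AUTO", "ONLYAUTO", "AUTOPREF"}


def pick_part_to_retrieve(msg, preferred_types):
    """Return (content_type, url) of the preferred retrievable part, or None."""
    candidates = {}
    for part in msg.walk():
        if part.get_content_type() != "message/external-body":
            continue
        retrieval = (part.get_param("retrieval") or "NOAUTO").upper()
        if retrieval not in AUTOMATED:
            continue  # NOAUTO or no RETRIEVAL parameter: left to the user
        # The encapsulated header block describes the external content.
        described = part.get_payload(0)
        candidates[described.get_content_type()] = part.get_param("url")
    for wanted in preferred_types:
        if wanted in candidates:
            return wanted, candidates[wanted]
    return None


def external_part(name, content_type):
    """Build one external-body alternative for the sample message."""
    return [
        "--b1",
        "Content-Type: message/external-body; retrieval=AUTO; access-type=URL; "
        'URL="http://mail.example.com/mail/outspool/%s.dat"' % name,
        "",
        "Content-Type: %s; charset=US-ASCII" % content_type,
        "Content-ID: <%s.u314@example.com>" % name,
        "",
    ]


SAMPLE = "\n".join(
    ["Mime-Version: 1.0",
     'Content-Type: multipart/alternative; boundary="b1"',
     ""]
    + external_part("218F64C460", "text/plain")
    + external_part("218F64C461", "text/html")
    + ["--b1--", ""]
)

# A recipient policy that prefers plain text over HTML.
choice = pick_part_to_retrieve(message_from_string(SAMPLE),
                               ["text/plain", "text/html"])
print(choice)
```

   Only the chosen part would then be fetched and substituted into the
   message; the other alternative stays as an unretrieved reference.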

8.2  Data Retrieval at Intermediate MTA

   Normally an intermediate mail transfer or mail redirection agent,
   such as a forwarder, should not change a message with
   Message/External-Body content-type and should attempt to send it on
   to the appropriate destination.  In some cases, however, the data
   has to be retrieved by such an intermediate system, because an
   intermediate MTA cannot pass the message as-is (without retrieving
   the data) if the subsequent mail system does not advertise the
   RETRIEVECONTENT capability in EHLO.  In these cases it may still be
   of interest to the subsequent system that particular content was
   retrieved from a certain place.  So when data is retrieved by an
   intermediate system, it is RECOMMENDED that, when replacing
   message/external-body content with the full MIME content data, the
   information about the original retrieval location be retained by
   means of the Content-Alternative-Access MIME field.  When retrieval
   is done by the MDA, the MDA MAY also use the
   Content-Alternative-Access MIME field to retain the original
   location of the retrieved data if it believes this is of interest to
   the user's MUA.

   For example, if the content reference in the original message is:

      Content-Type: message/external-body; retrieval=AUTO;
          access-type=URL;
          URL="http://mail.example.com/mail/outspool/218F64C460.dat";
          expiration="Tue, 03 May 2005 19:01:03 -0400 (EDT)"

      Content-Type: text/plain; charset=US-ASCII; format=flowed
      Content-ID: <218F64C460.u314@example.com>

   while the actual content in the file 218F64C460.dat is:

      Content-Type: text/plain; charset=US-ASCII; format=flowed
      Content-Transfer-Encoding: 7bit
      Content-ID: <218F64C460.u314@example.com>

      This is an invitation to a tea party.

   Then the content after retrieval by the intermediate MTA, as found
   in the message, might become:

      Content-Type: text/plain; charset=US-ASCII; format=flowed
      Content-Transfer-Encoding: 7bit
      Content-ID: <218F64C460.u314@example.com>
      Content-Alternative-Access: retrieval=AUTO; access-type=URL;
          URL="http://mail.example.com/mail/outspool/218F64C460.dat";
          expiration="Tue, 03 May 2005 19:01:03 -0400 (EDT)"

      This is an invitation to a tea party.

   Note that, as shown in the above example, Content-Alternative-Access
   should in these cases include the RETRIEVAL parameter.

8.3  Retrieved Content Data Integrity

   The mail system doing the retrieval may want to make certain that it
   is the right content that is being retrieved.  There are several
   ways to do this; the simplest is to match the Content-ID from the
   message/external-body content against the Content-ID in the actual
   retrieved content (see the example in the previous section).

   A more comprehensive solution involves a content digest hash.  For
   example, when using the Content-MD5 field (as specified in
   [RFC1864]), the message/external-body content part reference could
   look like:

      Content-Type: message/external-body; retrieval=AUTO;
          access-type=URL;
          URL="http://mail.example.com/mail/outspool/218F64C460.dat";
          expiration="Tue, 03 May 2005 19:01:03 -0400 (EDT)"

      Content-Type: text/plain; charset=US-ASCII; format=flowed
      Content-ID: <218F64C460.u314@example.com>
      Content-MD5: zGcglfIDw+/Ay+I+2WeQuA==

   When the content data is retrieved, its hash would be calculated and
   compared to the hash included in the original content reference
   above.
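
   A minimal Python sketch of this comparison is given below.  The
   function names are hypothetical, and the sketch assumes that the
   caller passes the retrieved body as bytes already in the canonical
   form over which [RFC1864] computes the digest.  It combines the
   Content-ID match and the Content-MD5 match described in this
   section.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Base64 of the MD5 digest, the form carried in a Content-MD5 field."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def integrity_ok(body, expected_md5, expected_cid=None, retrieved_cid=None):
    """Check retrieved content against the Content-ID and Content-MD5
    values taken from the original message/external-body reference."""
    if expected_cid is not None and expected_cid != retrieved_cid:
        return False  # the retrieved content is not the referenced part
    return content_md5(body) == expected_md5.strip()


body = b"This is an invitation to a tea party.\r\n"
reference_md5 = content_md5(body)  # what the sender would have published
print(integrity_ok(body, reference_md5))
print(integrity_ok(b"tampered content\r\n", reference_md5))
```

   The first call succeeds because the digest was computed over the
   same bytes; the second fails, which is the case in which the content
   would be withheld from the end user.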
   If the content integrity cannot be verified, it is possible that the
   server where the content is located has had a data failure or,
   worse, that the server has been compromised (for the purpose of
   replacing its content with unknown and potentially dangerous
   content).  Such content should not be delivered to the end user, and
   the message may need to be bounced with an appropriate note in the
   mail delivery notification report.

8.4  Example of Message Data Retrieval

   This section provides an example that demonstrates how retrieval
   works and how it allows only data of the wanted content type to be
   retrieved.  Suppose the email message, when it leaves the origin
   mail system, is:

      From: "Alice"
      To: "Uncle Bob"
      Subject: Invitation to tea party
      Date: Sun, 01 May 2005 00:00:00 +0000
      Message-ID:
      Mime-Version: 1.0
      Content-Type: multipart/alternative;
          boundary="==_MIME-Boundary-1_=="

      --==_MIME-Boundary-1_==
      Content-Type: message/external-body; retrieval=AUTO;
          access-type=URL;
          URL="http://mail.example.com/mail/outspool/218F64C460.dat";
          expiration="Tue, 03 May 2005 19:01:03 -0400 (EDT)"

      Content-Type: text/plain; charset=US-ASCII; format=flowed
      Content-ID: <218F64C460.u314@example.com>
      Content-MD5: zGcglfIDw+/Ay+I+2WeQuA==

      --==_MIME-Boundary-1_==
      Content-Type: message/external-body; retrieval=AUTO;
          access-type=URL;
          URL="http://mail.example.com/mail/outspool/218F64C461.dat";
          expiration="Tue, 03 May 2005 19:01:03 -0400 (EDT)"

      Content-Type: text/html; charset=US-ASCII;
      Content-ID: <218F64C461.u314@example.com>
      Content-MD5: kVb/O70lbAho8W+REvq0GA==

      --==_MIME-Boundary-1_==--

   where the content located in the 218F64C460.dat file on
   mail.example.com may be:

      Content-Type: text/plain; charset=US-ASCII; format=flowed
      Content-Transfer-Encoding: 7bit
      Content-ID: <218F64C460.u314@example.com>
      Content-MD5: zGcglfIDw+/Ay+I+2WeQuA==

      This is an invitation to a tea party.
Leibzon Expires December 18, 2005 [Page 18] Internet-Draft RETRIEVECONTENT SMTP Extension June 2005 And content located in the file 218F64C461.dat on mail.example.com maybe: Content-Type: text/html; charset=US-ASCII; Content-Transfer-Encoding: 7bit Content-ID: <218F64C461.u314@example.com> Content-MD5: kVb/O70lbAho8W+REvq0GA== INVITATION This is an invitation to a tea party Lets assume that message is not forwarded and is immediatly delivered to end.example.net mail system. First during the delivery end.example.net would add Received trace header field as specified in [RFC2821]. Then it would lookup policies and preferences for user Bob and lets assume that end.example.net finds that Bob's MUA can not retrieve messages or prefers that end.example.net MDA do it. Bob's policies may also specify that he does not like HTML and prefers text-only messages. Leibzon Expires December 18, 2005 [Page 19] Internet-Draft RETRIEVECONTENT SMTP Extension June 2005 As a result end.example.net would retrieve the message and final result placed in Bob's mailbox maybe: Retrieved: by end.example.net (ip=[192.168.0.1]) from mail.example.com (ip=[192.168.10.10]) content-id <218F64C460.u314@example.com> with URL:HTTP (URL="http://mail.example.com/mail/outspool/218F64C460.dat"); Sun, 01 May 2005 04:11:30 -0400 (EDT) Received: from mail.example.com (mail.example.com [192.168.10.10]) by mail.example.net (8.13.1/8.13.1) with ESMTP id j5G6pYWu013408 for ; Sun, 01 May 2005 04:10:45 -0400 (EDT) From: "Alice" To: "Uncle Bob" Subject: Invitation to tea party Date: Sun, 01 May 2005 00:00:00 +0000 Message-ID: Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="==_MIME-Boundary-1_==" --==_MIME-Boundary-1_== Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Content-ID: <218F64C460.u314@example.com> Content-MD5: zGcglfIDw+/Ay+I+2WeQuA== Content-Alternative-Access: retrieval=AUTO; access-type=URL; 
       URL="http://mail.example.com/mail/outspool/218F64C460.dat";
       expiration="Tue, 03 May 2005 19:01:03 -0400 (EDT)"

   This is an invitation to a tea party.

   --==_MIME-Boundary-1_==
   Content-Type: message/external-body; retrieval=AUTO;
       access-type=URL;
       URL="http://mail.example.com/mail/outspool/218F64C461.dat";
       expiration="Tue, 03 May 2005 19:01:03 -0400 (EDT)"

   Content-Type: text/html; charset=US-ASCII;
   Content-ID: <218F64C461.u314@example.com>
   Content-MD5: kVb/O70lbAho8W+REvq0GA==

   --==_MIME-Boundary-1_==--

9.  IANA Considerations

   IANA is hereby requested to register the RETRIEVECONTENT SMTP service
   extension as described in section 2 of this document.

   IANA is hereby requested to register the Content-Alternative-Access
   MIME header field:

   --------------------------------------------------------------------
   Header field name: Content-Alternative-Access
   Applicable protocol: MIME
   Status: provisional
   Author/Change controller: William Leibzon, william@elan.net
   Specification document(s): This document
   Related information: none
   --------------------------------------------------------------------

   IANA is hereby requested to register the Retrieved message trace
   field:

   --------------------------------------------------------------------
   Header field name: Retrieved
   Applicable protocol: mail
   Status: provisional
   Author/Change controller: William Leibzon, william@elan.net
   Specification document(s): This document
   Related information: none
   --------------------------------------------------------------------

10.  Security Considerations

   As mentioned in [RFC2046] section 5.2.3.6.(1), a reference to
   external content carries the possibility that use of and access to
   the specified content have not been authorized by the content owner.
   A large number of messages with such references, sent anonymously to
   multiple parties, could then result in a denial-of-service attack
   (this is similar to what happens if somebody includes a reference to
   an image on an external website in an HTML email without the
   authorization of the website owner).  To protect against this
   possibility, it is advisable to develop content access mechanisms
   specific to email, with security and authorization features that
   would allow the content owner to detect and deny unauthorized
   requests and to automatically delete referenced content once it has
   been accessed.

11.  References

11.1  Normative References

   [RFC1864]  Myers, J. and M. Rose, "The Content-MD5 Header Field",
              RFC 1864, October 1995.

   [RFC1869]  Klensin, J., Freed, N., Rose, M., Stefferud, E., and D.
              Crocker, "SMTP Service Extensions", STD 10, RFC 1869,
              November 1995.

   [RFC2017]  Freed, N. and K. Moore, "Definition of the URL MIME
              External-Body Access-Type", RFC 2017, October 1996.

   [RFC2046]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part Two: Media Types", RFC 2046,
              November 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", RFC 2234, November 1997.

   [RFC2392]  Levinson, E., "Content-ID and Message-ID Uniform Resource
              Locators", RFC 2392, August 1998.

   [RFC2476]  Gellens, R. and J. Klensin, "Message Submission",
              RFC 2476, December 1998.

   [RFC2821]  Klensin, J., "Simple Mail Transfer Protocol", RFC 2821,
              April 2001.

   [RFC2822]  Resnick, P., "Internet Message Format", RFC 2822,
              April 2001.

11.2  Informative References

   [RFC1521]  Borenstein, N. and N. Freed, "MIME (Multipurpose Internet
              Mail Extensions) Part One: Mechanisms for Specifying and
              Describing the Format of Internet Message Bodies",
              RFC 1521, September 1993.

Author's Address

   William Leibzon
   Elan Networks
   500 Laurelwood Rd, Suite 12
   Santa Clara, California 95054
   USA

   Email: william@elan.net
   URI:   http://www.elan.net/~william/emailsecurity/

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2005).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.