Edmon Chung & Jim Lam                                     Internet Draft
Neteka Inc.                                                February 2003


            Internationalized Mail Address eXtensions (IMAX)


STATUS OF THIS MEMO

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The reader is cautioned not to depend on the values that appear in
   examples to be current or complete, since their purpose is primarily
   educational. Distribution of this memo is unlimited.

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This document describes a set of extension mechanisms that let mail
   servers and clients negotiate and transport multilingual email
   addresses using the standard SMTP and POP extension mechanisms. The
   mechanisms can be deployed immediately by interested parties without
   affecting or breaking existing systems.

Table of Contents

   1.  Introduction
   1.1 Terminology
   2.  SMTP Extensions for Multilingual Addresses
   2.1 Framework for Multilingual Extension
   2.2 IMAX Compliant Sessions
   2.3 CHARSET Parameter
   2.4 Client Side Fallback Strategy
   3.  POP Capabilities Extension for Multilingual User Names
   3.1 IMAX Capability
   3.2 The IMAX process
   3.3 Fallback Strategy
   4.  Stringprep, Nameprep and Charprep Considerations
   5.  IANA & Security Considerations
   5.1 SMTP Service Extension Considerations
   5.2 POP Capability Considerations
   5.3 Security Considerations

1. Introduction

   This document outlines a set of extension mechanisms that enable the
   use of multilingual email addresses without disrupting the services
   of existing servers. In brief, two main areas have been enhanced to
   deal with multilingual addresses: SMTP and POP. All enhancements
   follow the guidelines of the standard extension mechanisms and
   should therefore be deployable without affecting the
   interoperability of mail services.

1.1 Terminology

   The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED",
   and "MAY" in this document are to be interpreted as described in
   RFC 2119 [RFC2119].

   Throughout this document, multilingual characters intended to be
   8-bit are presented in the format (charset=hexvalue=hexvalue), while
   16-bit UCS-2 characters are presented in the form (U+hex).

2. SMTP Extensions for Multilingual Addresses

   Following the SMTP Service Extension specification defined in
   RFC 1869, a new EHLO keyword is introduced to support multilingual
   email addresses in the transport of messages. Optional parameters
   define the charset(s) supported by the server and the charset of the
   multilingual address. To keep the mechanism simple, however, a
   compliant server MUST by default support UTF-8 as well as the IETF
   ASCII Compatible Encoding (ACE) scheme used for multilingual domain
   names (Section 4).

2.1 Framework for Multilingual Extension

   The IMAX extension is defined as follows:

   (1) The textual name of the service extension is "Internationalized
       Mail Addresses";

   (2) The EHLO keyword value associated with the extension is "IMAX";

   (3) The IMAX EHLO keyword takes as an optional parameter a
       space-separated list of charsets registered with IANA for use
       with MIME, plus any ACE formats supported by the DNS;

   (4) No new SMTP verb is introduced;

   (5) A parameter using the keyword "CHARSET" is added to the MAIL
       FROM and RCPT TO commands;

   (6) The client-server session is not lengthened, yet multilingual
       characters can be used as mailbox identifiers.

2.2 IMAX Compliant Sessions

   Besides the new IMAX extension, the handling of multilingual
   addresses also depends on the 8bit-MIMEtransport service extension
   [RFC1652]. If the server supports both, the client SHOULD send a
   multilingual address in UTF-8.
   If, however, the server supports only IMAX or only 8BITMIME, then an
   ACE formatted name MUST be used instead.

   For example, a session between a fully compliant client and server
   would look like the following (C: = Client, S: = Server; characters
   intended to be 8-bit are presented in the format
   charset=hexvalue=hexvalue):

      S: 220 mail.neteka.com -- Server SMTP (NeBOX v3.0)
      C: EHLO mail.toronto.edu
      S: 250-mail.neteka.com
      S: 250-8BITMIME
      S: 250 IMAX
      C: MAIL FROM:<(UTF-8=E4=B8=AD=E6=96=87)@toronto.edu> CHARSET=UTF-8
      S: 250 Address Ok.
      C: RCPT TO:<edmon@neteka.com>
      S: 250 edmon@neteka.com OK
      C: DATA
      ...

   If the server does not support 8BITMIME or if it does not support
   IMAX, then a 7-bit format must be used:

      C: EHLO mail.toronto.edu
      S: 250-mail.neteka.com
      S: 250 IMAX
      C: MAIL FROM: CHARSET=ACE
      S: 250 Address Ok.
      ...

   When the supported charsets are negotiated, any unsupported CHARSET
   results in a 504 error response, indicating that the command
   parameter is not implemented.

      C: EHLO mail.toronto.edu
      S: 250-mail.neteka.com
      S: 250-8BITMIME
      S: 250 IMAX UTF-8 GB2312
      C: MAIL FROM:<(Big5=A4=A4=A4=E5)@neteka.com> CHARSET=Big5
      S: 504 command parameter not implemented
      C: MAIL FROM:<(UTF-8=E4=B8=AD=E6=96=87)@neteka.com> CHARSET=UTF-8
      S: 250 Address Ok.
      ...

   The default encoding schemes SHOULD be UTF-8 and an ACE format (the
   one used by the DNS), and both MUST be implemented by any server
   returning the IMAX EHLO keyword.

2.3 CHARSET Parameter

   While the CHARSET parameter for the IMAX EHLO keyword is optional,
   any IMAX compliant client MUST specify the charset used by including
   the CHARSET parameter after the MAIL FROM or RCPT TO commands. This
   avoids the confusion caused by multiple conflicting character
   encoding schemes. It also signals that the client is attempting to
   transmit an internationalized mail address (IMA) using the IMAX
   extensions.
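
   The client-side choice between UTF-8 and ACE described in
   Section 2.2 can be sketched in a few lines of Python. This is only
   an illustration under stated assumptions, not part of the
   specification: the helper names parse_ehlo, choose_charset and
   mail_from are hypothetical, and the conversion of an address to its
   ACE form is deliberately left out.

```python
# Hypothetical helpers illustrating Section 2.2; not part of any library.

# Charsets every IMAX server MUST implement, per Section 2.2.
DEFAULT_CHARSETS = ("UTF-8", "ACE")


def parse_ehlo(reply_lines):
    """Map each EHLO keyword to its list of parameters.

    The first line of a successful EHLO reply carries the server
    greeting rather than a keyword, so it is skipped.
    """
    extensions = {}
    for line in reply_lines[1:]:
        # Drop the "250-" / "250 " prefix, then split keyword and params.
        parts = line[4:].split()
        if parts:
            extensions[parts[0].upper()] = parts[1:]
    return extensions


def choose_charset(extensions, preferred="UTF-8"):
    """Return the CHARSET value a client would use for an address."""
    if "IMAX" not in extensions or "8BITMIME" not in extensions:
        # Without both extensions only a 7-bit ACE form is allowed.
        return "ACE"
    supported = {c.upper() for c in DEFAULT_CHARSETS}
    supported.update(c.upper() for c in extensions["IMAX"])
    # Use UTF-8 when the preferred charset would draw a 504 reply.
    return preferred if preferred.upper() in supported else "UTF-8"


def mail_from(address, charset):
    """Format a MAIL FROM command carrying the CHARSET parameter."""
    return f"MAIL FROM:<{address}> CHARSET={charset}"
```

   Given the reply "250 IMAX UTF-8 GB2312" from the session above, a
   client that prefers Big5 would select UTF-8 up front instead of
   first provoking the 504 response.
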
   If no charset is specified, the server SHOULD assume that the client
   is not IMAX compliant.

2.4 Client Side Fallback Strategy

   Servers differ in their level of compliance. The most basic case is
   a server that does not support EHLO or any SMTP service extension.
   In this case, the client should attempt to send any multilingual
   names in an ACE format.

      S: 220 mail.example.com -- Server SMTP
      C: EHLO mail.neteka.com
      S: 500 Command not recognized: EHLO
      C: HELO mail.neteka.com
      S: 250 mail.example.com hello
      C: MAIL FROM:
      S: 250 Address Ok.
      ...

   If a server supports EHLO but does not indicate that it also
   supports IMAX, the client should handle multilingual user names the
   same way as it would with a server that does not support EHLO.

      S: 220 mail.example.com -- Server SMTP
      C: EHLO mail.neteka.com
      S: 250-mail.example.com
      S: 250 HELP
      C: MAIL FROM:
      S: 250 Address Ok.
      ...

   ACE names are intended for transitional purposes, to ensure a smooth
   and transparent migration towards multilingual enabled mail address
   handling. Eventually, mail clients and servers should attempt to use
   the IMAX scheme with UTF-8 for increased efficiency and reduced
   confusion.

3. POP Capabilities Extension for Multilingual User Names

   While SMTP handles the transport of messages, POP handles the
   retrieval of mail objects from the server by a client. To allow
   multilingual user names in the retrieval of messages from a mail
   server using the POP protocol, a new capability is introduced
   following the POP3 extension mechanism [RFC2449].

3.1 IMAX Capability

   (1) CAPA tag: IMAX

   (2) Arguments: CHARSET - a space-separated list of charsets
       registered with IANA for use with MIME

   (3) Standard commands affected: USER and APOP

   (4) Announced states / possible differences: AUTHORIZATION / none

   (5) Commands valid in states: AUTHORIZATION

   (6) Specification reference: This document

   (7) Discussion: The IMAX capability indicates that the POP server is
       multilingual aware and is able to handle multilingual user
       addresses. See Section 3.2 for further discussion.

3.2 The IMAX process

   The IMAX capability enables the POP server to handle multilingual
   user names, which form a crucial part of a multilingual email
   address. The capability takes optional arguments specifying the
   charsets supported by the POP server. By default, as specified for
   SMTP in Section 2.2, any POP server advertising IMAX support MUST at
   least support UTF-8 and the ACE format used by the DNS. Other
   charsets may be advertised as a space-separated list of arguments to
   the IMAX CAPA keyword.

   For example, a POP session between an IMAX compliant server and
   client would begin as follows:

      S: +OK POP3 server ready
      C: CAPA
      S: +OK Capability list follows
      S: TOP
      S: USER
      S: IMAX
      S: .

   Again, POP servers will differ in their level of compliance. If the
   POP3 extension mechanism is not supported at all, the CAPA command
   yields a -ERR response, which indicates that the server is not IMAX
   aware. Because a POP server needs to be IMAX aware in order to host
   multilingual addresses, a client can assume that the POP server is
   non-compliant if the CAPA command results in a -ERR response.

   An IMAX compliant client MUST include the additional CHARSET
   parameter with the USER command if the name in question contains
   extended characters.
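
   As a rough illustration of the client side of this process, the
   following Python sketch reads a CAPA reply and builds the USER
   command. It is a minimal sketch under stated assumptions: the helper
   names parse_capa and user_command are hypothetical, and "extended
   characters" is interpreted here as "any non-ASCII character", which
   this document does not define precisely.

```python
# Hypothetical helpers illustrating Sections 3.1 and 3.2.

# Charsets an IMAX-aware POP server MUST support even if it lists none.
DEFAULT_CHARSETS = ("UTF-8", "ACE")


def parse_capa(reply_lines):
    """Return the charsets usable with IMAX, or None if not IMAX aware.

    reply_lines is the full CAPA reply: a status line, one capability
    per line, and a terminating "." line.
    """
    if not reply_lines or not reply_lines[0].startswith("+OK"):
        # A -ERR reply means the CAPA command itself is unsupported.
        return None
    for line in reply_lines[1:]:
        if line == ".":
            break
        parts = line.split()
        if not parts:
            continue
        if parts[0].upper() == "IMAX":
            charsets = list(DEFAULT_CHARSETS)
            for arg in parts[1:]:
                if arg.upper() not in charsets:
                    charsets.append(arg.upper())
            return charsets
    return None


def user_command(name, charset):
    """Build a USER command, adding CHARSET only for non-ASCII names."""
    if name.isascii():
        return f"USER {name}"
    return f"USER {name} CHARSET={charset}"
```

   A result of None covers both a -ERR reply and a capability list
   without the IMAX tag; in either case the client would treat the
   server as non-compliant.
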
   When confronted with a charset that is not implemented, the server
   generates a -ERR response:

      S: +OK POP3 server ready
      C: CAPA
      S: +OK Capability list follows
      S: TOP
      S: USER
      S: IMAX UTF-8 GB2312
      S: .
      C: USER (Big5=A4=A4=A4=E5)@neteka.com CHARSET=Big5
      S: -ERR CHARSET=big5 not implemented
      C: USER (UTF-8=E4=B8=AD=E6=96=87)@neteka.com CHARSET=UTF-8
      S: +OK welcome
      ...

   The client must then fall back to UTF-8 or to one of the charsets
   advertised by the POP server.

3.3 Fallback Strategy

   It is safe to assume that a mail server willing to handle
   multilingual addresses will also be prepared to update its POP
   servers to be IMAX aware, so there should be little need for a
   fallback strategy. Nevertheless, it is possible to make provisions
   for this situation. Should an IMAX aware client encounter a
   non-compliant POP server, it could use an ACE formatted user name
   for login. In essence, a purely client-based solution is possible
   without changing the mail servers at all. Further discussion can be
   found in the IMAA (Internationalized Mail Addresses in Applications)
   paper.

4. Stringprep, Nameprep and Charprep Considerations

   Stringprep, Nameprep and other string preparation considerations for
   matching multilingual names are not discussed in this document.
   Character equivalence preparations (Charprep) are also not
   discussed. Please refer to the relevant documents for further
   information.

   While this document does not explicitly discuss the different
   matching preparation considerations, it recommends that the client
   SHOULD perform the relevant preparations before transportation, and
   the server MUST perform the relevant preparations.

5. IANA & Security Considerations

   This document introduces a number of new keywords and parameters
   that will have to be included in their respective IANA registries.

5.1 SMTP Service Extension Considerations

   For the SMTP service extensions, a new EHLO keyword "IMAX" is
   introduced, with the supported charsets as optional parameters. An
   additional parameter "CHARSET" is also added to the SMTP commands
   "MAIL FROM" and "RCPT TO" to indicate the charset used during
   transport.

5.2 POP Capability Considerations

   A new capability is added as a POP extension. The CAPA tag "IMAX" is
   introduced to indicate that the server is multilingual compliant,
   again with the supported charsets as optional parameters. Similar to
   the SMTP arrangements, an additional parameter "CHARSET" is also
   appended to the POP commands "USER" and "APOP".

5.3 Security Considerations

   This document does not discuss any security issues and is not
   believed to raise any security problems beyond those already present
   in existing email systems. In fact, promoting the adoption of the
   service extension mechanisms for both SMTP and POP could in turn
   enhance the overall security of email messaging.

References

   [RFC821]  J. Postel, "Simple Mail Transfer Protocol", RFC 821,
             USC/Information Sciences Institute, August 1982.

   [RFC1035] P. Mockapetris, "Domain Names - Implementation and
             Specification", STD 13, RFC 1035, USC/ISI, November 1987.

   [RFC1652] J. Klensin et al., "SMTP Service Extension for
             8bit-MIMEtransport", RFC 1652, July 1994.

   [RFC1869] J. Klensin et al., "SMTP Service Extensions", RFC 1869,
             November 1995.

   [RFC2026] S. Bradner, "The Internet Standards Process -- Revision
             3", Harvard University, RFC 2026, October 1996.

   [RFC2047] K. Moore, "MIME Part Three: Message Header Extensions for
             Non-ASCII Text", University of Tennessee, RFC 2047,
             November 1996.

   [RFC2119] S. Bradner, "Key words for use in RFCs to Indicate
             Requirement Levels", RFC 2119, March 1997.

   [RFC2449] R. Gellens et al., "POP3 Extension Mechanism", RFC 2449,
             November 1998.

Authors:

   Edmon Chung
   Neteka Inc.
   Suite 100, 243 College St.
   Toronto, Ontario, Canada M5T 1R5
   edmon@neteka.com

   Jim Lam
   Neteka Inc.
   Suite 100, 243 College St.
   Toronto, Ontario, Canada M5T 1R5
   jimmy@neteka.com