INTERNET-DRAFT                                                 N. Elkins
Intended Status: Informational                           Inside Products
                                                            H. Chowdhary
                                                                    NIXI
Expires: April 7, 2017                                   October 4, 2016


             Deployment Issues for Internationalized Email
                      draft-elkchow-iea-deploy-00

Abstract

   International Email Addresses (IEA) are far from a global reality.
   The current de facto language of the Internet is English. Even
   today, many users of the Internet do not speak English as their
   primary language. The next billion users of the Internet are likely
   to be even less familiar with English. IEA is probably the first
   application needed in a truly internationalized Internet.

   The Email Address Internationalization (EAI) Working Group defined
   the RFCs to support internationalized email. The time may now
   finally have come to develop best practices and to discuss the
   deployment challenges for IEA.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html


N. H. Elkchow            Expires April 7, 2017                  [Page 1]

INTERNET DRAFT     draft-elkchow-idniea-deploy-00        October 4, 2016


Copyright and License Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1  Introduction
     1.1  Punycode
     1.2  Single Language / Multiple Languages
   2  Email Servers
     2.1  Backend Databases
   3  Email Clients
     3.1  Display of Email ID
     3.2  Display of Email Body
     3.3  Messages Routed to SPAM
   4  Multiple Identities / Aliases
   5  Email Address Books
   6  Security Considerations
     6.1  Homographic Attacks
     6.2  Use of Mixed Scripts
     6.3  Right-to-left Issues
   7  IANA Considerations
   8  References
     8.1  Normative References
     8.2  Informative References
   Authors' Addresses

1  Introduction

   The Email Address Internationalization (EAI) Working Group, which
   has concluded, created a structure and framework for
   internationalized email addresses. From the charter:

      "The email address has two parts, local part and domain part.
      Email address internationalization must deal with both. This
      working group's previous experimental efforts investigated the
      use of UTF-8 as a general approach to email internationalization.
      That approach is based on the use of an SMTP extension to enable
      the use of UTF-8 in envelope address local-parts, optionally in
      address domain-parts, and in mail headers. The mail header
      contexts can include both addresses and wherever existing
      protocols (e.g., RFC 2231) permit the use of encoded-words."
      [EAICharter]

   Much work was done in this group, including:

      RFC 6530: Overview and Framework for Internationalized Email
         [RFC6530]

      RFC 6531: SMTP Extension for Internationalized Email [RFC6531]

      RFC 6532: Internationalized Email Headers [RFC6532]

      RFC 6533: Internationalized Delivery Status and Disposition
         Notifications [RFC6533]

      RFC 6783: Mailing Lists and Non-ASCII Addresses [RFC6783]

      RFC 6855: IMAP Support for UTF-8 [RFC6855]

      RFC 6856: Post Office Protocol Version 3 (POP3) Support for UTF-8
         [RFC6856]

      RFC 6857: Post-Delivery Message Downgrading for Internationalized
         Email Messages [RFC6857]

      RFC 6858: Simplified POP and IMAP Downgrading for
         Internationalized Email [RFC6858]

   Yet deployment lags, and global EAI is far from a reality. The
   Internet grows day by day as it integrates top-level domains that
   use non-ASCII scripts such as Devanagari, Cyrillic, Arabic, and
   Chinese. Users of these new top-level domains need to be able to
   send email as well as to access web sites via browsers. If a user
   has an internationalized email address, then it should be possible
   to send mail to, and receive mail from, any email address using any
   email client.
   This interoperability demands concerted efforts by all major email
   service providers.

   Even now, there is very limited or no support in email servers
   (SMTP, IMAP, POP), email providers (Gmail, Yahoo, Hotmail), and
   email clients. Often, it is not even possible to create an email ID
   for end users in a non-ASCII based language, even though many
   Internationalized Domain Names (IDN) exist. This is of major concern
   to many in the parts of the world where the primary language is not
   English.

1.1  Punycode

   Languages not based on the Latin script (A, B, C, etc.) use Unicode
   rather than ASCII to represent the letters in their alphabets.
   Punycode represents Unicode characters in ASCII form. It is used in
   the transport of email, where internationalized domain names may
   have to pass through systems that accept only ASCII.

   For example:

      English:  Nehru
      Hindi:    ????? [Cannot be displayed]
      Punycode: xn--l2bq0a0bw

   A Punycode-encoded label starts with the prefix "xn--".

   An application handling IDN domains has to reference an IDN
   repository to know how to display them. Email adds to the
   complication, since many email systems pre-date the introduction of
   IDNs. These systems often simply reject email that does not fit the
   old domain name model.

1.2  Single Language / Multiple Languages

   Some people ask, "What if I send an email in Russian and want to
   respond in Chinese? What problems will arise?" This is certainly an
   important issue, but it is hard enough to send email back and forth
   in one non-ASCII based language. This draft leaves for the future
   the issues of multiple languages, with the accompanying translation
   and user interface issues.

2  Email Servers

   For an email server to be ready for EAI, it must implement:

      RFC 6530: Overview and Framework for Internationalized Email
         [RFC6530]

      RFC 6531: SMTP Extension for Internationalized Email [RFC6531]

      RFC 6532: Internationalized Email Headers [RFC6532]

   Here is a partial list of servers and test beds for EAI:

      Postfix 3.0 and above
      Coremail
      Throughway (Thailand)
      OpenMail (Taiwan)
      EAI test environment (Saudi Arabia)
      Xgenplus (India)

2.1  Backend Databases

   Email servers store data in relational databases such as MySQL and
   MariaDB. These databases must support UTF-8 and be configured to use
   UTF-8. On occasion, both Punycode and UTF-8 fields may need to be
   defined.

3  Email Clients

   Since the email client is the interface to the user, this is where a
   number of issues arise.

   A number of email clients and service providers support EAI to
   various extents. A partial list follows:

      Coremail
      Horde Project
      Microsoft Outlook 2016 for PC
      Gmail - to some extent
      Apple Mail - to some extent
      Throughway (Thailand)
      OpenMail (Taiwan)
      EAI test environment (Saudi Arabia)
      Roundcube

3.1  Display of Email ID

   The email ID is often shown in Punycode. For example, the email ID
   harish@nalini.bharat in Hindi is:

      ???????@??????.?????? [Cannot be displayed]

   Many email clients will display this email ID in Punycode. That is,
   the email ID will be shown as:

      xn--t2bmh3a@xn--l2ba3a4cg.xn--h2brj9c

   This is not particularly user-friendly.

3.2  Display of Email Body

   The issue with the email body is how easily the user can type in the
   language of choice. A number of browsers have extensions to allow
   this.

   But when links containing IDN names are displayed, the link often
   does not work.

3.3  Messages Routed to SPAM

   Messages may be routed by email clients to SPAM if they are not in
   English.

4  Multiple Identities / Aliases

   A user may have multiple identities. That is, he or she may have an
   English language email ID, a Hindi email ID, and so on.

5  Email Address Books

   Email address books today have little or no support for addresses in
   non-Latin based languages.

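   As a non-normative illustration of the Punycode display issue
   described in Sections 1.1 and 3.1, the following Python sketch
   converts between a Unicode label and its "xn--" form, and decodes
   such labels before an address is shown to the user. It is only a
   sketch under stated assumptions: the function names are
   hypothetical, it relies on the "punycode" codec in the Python
   standard library, and it omits the normalization and validity checks
   that full IDNA processing requires. Decoding the local part mirrors
   the example in Section 3.1 and is likewise an assumption. Non-ASCII
   characters are written as escapes to keep this document in ASCII.

```python
ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded label


def label_to_ascii(label: str) -> str:
    """Encode one label as an ASCII-only "xn--" label (hypothetical helper)."""
    if label.isascii():
        return label  # plain ASCII labels need no encoding
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def label_to_unicode(label: str) -> str:
    """Decode one "xn--" label for display; pass other labels through."""
    if not label.lower().startswith(ACE_PREFIX):
        return label
    try:
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    except UnicodeError:
        return label  # undecodable: show the stored form rather than fail


def address_for_display(address: str) -> str:
    """Decode every "xn--" label in an email address for display."""
    local, at, domain = address.rpartition("@")
    shown = ".".join(label_to_unicode(part) for part in domain.split("."))
    # Assumption: the local part is decoded too, because the example in
    # Section 3.1 shows it in Punycode as well.
    return label_to_unicode(local) + at + shown
```

   With these helpers, address_for_display() applied to
   "info@xn--bcher-kva.example" yields an address whose domain is
   "b\u00fccher.example", and a label made of Devanagari letters
   survives a round trip through label_to_ascii() and
   label_to_unicode(). An address book or client could store either
   form and convert only at the display boundary.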
6  Security Considerations

6.1  Homographic Attacks

   A user on the Internet can easily be duped by the Russian letters
   'a', 'e', 'p', and 'y', as they are indistinguishable in writing
   from their English equivalents. Some of these letters (such as "a")
   look alike because they share an origin, whereas others look similar
   by sheer coincidence. For example, the Russian letter that looks
   like 'p' is pronounced like 'r', yet its glyph is identical to that
   of the English 'p'. Russian is not the only such language; other
   languages written in Cyrillic could cause similar collisions.

   For example, "paypal.com" and "paypal.com" look alike; however, the
   first domain name contains the Russian letter "a" while the second
   contains the English letter "a". The same substitution can produce
   similar-looking email IDs such as Nalini@paypal.com and
   Nalini@paypal.com, which look the same but are in reality different
   email IDs. In this case, the characters used for the fraud are
   perfectly legitimate. Numerous English domain names and email IDs
   may therefore be homographed, that is, maliciously misspelled by
   substitution of non-Latin letters.

   A number of approaches may be used to protect against this sort of
   attack. The bluntest fix would be to forbid every domain name that
   combines letters from different alphabets, but this would also block
   genuinely useful names such as "CNNenEspanol.com". [Note: the 'n' in
   Espanol has a ~ on the top]

   Alternatively, the browser may highlight international letters in
   domain names with a separate color, although users might find this
   excessively intrusive.

   Browsers may instead highlight only the really suspicious names,
   such as those that blend letters from different scripts inside a
   single word. For additional security, the browser may use a map of
   identical letters to look for collisions between the requested
   domain and equally written registered ones.
   If the collision is critical, the browser would then warn the user
   of suspected fraud.

6.2  Use of Mixed Scripts

   Mixed scripts also have legitimate uses, for example in both
   Japanese and Chinese, and many people use Latin usernames with IDN
   domains.

6.3  Right-to-left Issues

   Some languages, for example Arabic, are written right-to-left.
   Systems created to work with Arabic script typically switch
   direction when the first Arabic character is entered. But with a
   mixed-script email address such as
   'customer.care@[IDN domain].IDN', the system needs to be able to
   handle both left-to-right and right-to-left text.

   This could be an additional security issue. If someone registers the
   domain name "customer.helpline", they could type the username in
   Arabic script first, triggering the email system to switch to
   right-to-left, and then enter the domain "customer.helpline". It
   would then appear in the input box as though "customer.helpline"
   were the username on the left-hand side of the email address. The
   only difference would be that the whole address would be aligned
   right.

7  IANA Considerations

   There are no IANA considerations.

8  References

8.1  Normative References

   [RFC6530]  Klensin, J. and Y. Ko, "Overview and Framework for
              Internationalized Email", RFC 6530, February 2012.

   [RFC6531]  Yao, J. and W. Mao, "SMTP Extension for Internationalized
              Email", RFC 6531, February 2012.

   [RFC6532]  Yang, A., Steele, S., and N. Freed, "Internationalized
              Email Headers", RFC 6532, February 2012.

   [RFC6533]  Hansen, T., Newman, C., and A. Melnikov,
              "Internationalized Delivery Status and Disposition
              Notifications", RFC 6533, February 2012.

   [RFC6783]  Levine, J. and R. Gellens, "Mailing Lists and Non-ASCII
              Addresses", RFC 6783, November 2012.

   [RFC6855]  Resnick, P., Ed., Newman, C., Ed., and S. Shen, Ed.,
              "IMAP Support for UTF-8", RFC 6855, March 2013.

   [RFC6856]  Gellens, R., Newman, C., Yao, J., and K. Fujiwara, "Post
              Office Protocol Version 3 (POP3) Support for UTF-8",
              RFC 6856, March 2013.

   [RFC6857]  Fujiwara, K., "Post-Delivery Message Downgrading for
              Internationalized Email Messages", RFC 6857, March 2013.

   [RFC6858]  Gulbrandsen, A., "Simplified POP and IMAP Downgrading for
              Internationalized Email", RFC 6858, March 2013.

8.2  Informative References

   [EAICharter]  https://datatracker.ietf.org/wg/eai/charter/,
                 May 2010.

Authors' Addresses

   Nalini Elkins
   Inside Products, Inc.
   Carmel Valley, CA 93924
   USA
   Phone: +1 831 659 8360
   Email: nalini.elkins@insidethestack.com

   Harish Chowdhary
   NIXI
   India
   Email: harish@nixi.in
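
   As a non-normative illustration of the defenses discussed in
   Section 6.1, the following Python sketch flags labels that blend
   letters from more than one script and uses a small map of identical
   letters to look for collisions with registered names. Everything in
   it is an assumption made for illustration: the function names and
   the four-entry CONFUSABLES table are hypothetical, and the script of
   a letter is guessed from the first word of its Unicode character
   name, which is a heuristic and not the Unicode Script property.

```python
import unicodedata

# Hypothetical, deliberately tiny map from Cyrillic letters to the Latin
# letters they resemble (the four letters named in Section 6.1). A real
# deployment would need a far larger table.
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0443": "y",  # CYRILLIC SMALL LETTER U
}


def scripts_in_label(label):
    """Approximate the set of scripts used by the letters in a label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # Heuristic: the first word of the character name, such as
            # "LATIN" or "CYRILLIC", stands in for the script.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(label):
    """Return True if the label blends letters from several scripts."""
    return len(scripts_in_label(label)) > 1


def skeleton(label):
    """Replace each mapped look-alike letter by its Latin counterpart."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in label.lower())


def collides_with(label, registered):
    """List registered names that look like label but are spelled differently."""
    target = skeleton(label)
    return [name for name in registered
            if name != label and skeleton(name) == target]
```

   For a label spelled "p\u0430ypal" (with a Cyrillic "a"),
   is_mixed_script() returns True and collides_with() against a
   registry containing "paypal" returns that name, so a client could
   warn the user. As Section 6.2 notes, mixed scripts also have
   legitimate uses, so a check of this kind is better suited to
   raising a warning than to rejecting an address outright.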