INTERNET-DRAFT                                            M. Yevstifeyev
Intended Status: Informational                             July 15, 2011
Updates: RFC-tbd
Expires: January 16, 2012


                   Internationalization of 'ftp' URIs
                      draft-yevstifeyev-ftp-iri-00

Abstract

   This document discusses the internationalization of 'ftp' Uniform
   Resource Identifiers (URIs), which are used to reference resources
   accessible via the File Transfer Protocol (FTP).  It updates RFC-tbd.

   [NOTE TO RFC EDITOR: RFC-tbd refers to [FTP-URI].  Please replace
   "tbd" with the RFC number once [FTP-URI] is published as an RFC.]

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Terminology
   2. Internationalized 'ftp' Resource Identifiers (RIs)
      2.1. 'ftp' IRIs
      2.2. UCS Characters in 'ftp' URIs
   3. Handling Internationalized 'ftp' RIs
      3.1. Internationalized Host Part in 'ftp' URIs and <ihost> Part
           in 'ftp' IRIs
      3.2. The <user-pass> Part in 'ftp' URIs and IRIs
      3.3. The Path Part in 'ftp' URIs and <iftp-path> in 'ftp' IRIs
   4. Internationalization of Actual Data Interchange
   5. Mapping 'ftp' IRIs to 'ftp' URIs
   6. Security Considerations
   7. IANA Considerations
   8. References
      8.1. Normative References
      8.2. Informative References
   Appendix A. Acknowledgments
   Authors' Addresses

1. Introduction

   RFC-tbd [FTP-URI] defines the Uniform Resource Identifier (URI)
   scheme used to refer to resources accessible via the File Transfer
   Protocol (FTP) [RFC0959].  RFC 3986 [RFC3986] allows only a limited
   range of characters in URIs (in particular, only characters in the
   ASCII range [ASCII]).

   RFC 3987 [RFC3987] introduced a new protocol element,
   Internationalized Resource Identifiers (IRIs), which complements URIs
   by extending the allowed range of characters to those of the
   Universal Character Set (UCS).  The UCS is defined jointly by
   [ISO.10646] and [UNICODE] and includes characters able to represent
   almost any modern script and language.  See RFC 3536bis [RFC3536bis]
   for more information.

   This document discusses the internationalization of 'ftp' URIs and,
   correspondingly, defines 'ftp' IRIs.  It updates RFC-tbd [FTP-URI].

1.1. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   The reader is assumed to be familiar with the terminology of RFC 3987
   [RFC3987], RFC 5890 [RFC5890], RFC 3536bis [RFC3536bis] and RFC-tbd
   [FTP-URI].

   In this document, Resource Identifier (RI) refers to either a URI or
   an IRI.

2. Internationalized 'ftp' Resource Identifiers (RIs)

   An internationalized 'ftp' Resource Identifier is either an 'ftp' IRI
   or an 'ftp' URI that contains UCS characters.  The following sections
   discuss how to use UCS characters in 'ftp' RIs.

2.1. 'ftp' IRIs

   As mentioned before, IRIs provide a way to include UCS characters in
   a resource identifier while preserving the functions of a URI.  URIs
   may also carry UCS characters, but only in percent-encoded form (see
   Section 2.2).  IRIs, in contrast, represent UCS characters directly,
   by first encoding them using UTF-8 [RFC3629].  This makes IRIs much
   easier for humans to use.

   This section defines the valid syntax of 'ftp' IRIs.
   An 'ftp' IRI takes the form of the rules below, defined using ABNF
   [RFC5234]:

      ftp-iri         = "ftp:" ftp-ihier-part
      ftp-ihier-part  = "//" [ user-pass "@" ] ihost-port [ iftp-path ]
      user-pass       = <user-pass, defined in RFC-tbd [FTP-URI]>
                        ; not internationalized - see below
      ihost-port      = ihost [ ":" port ]
      iftp-path       = [ icwd-part ] "/" ilast-segment
                        [ typecode-part ]
      icwd-part       = *( "/" icwd )
      icwd            = isegment-nsc
      ilast-segment   = isegment-nsc
      isegment-nsc    = *ipchar-nsc
      ipchar-nsc      = iunreserved / pct-encoded / sub-delims-nsc
                        / ":" / "@"
      sub-delims-nsc  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+"
                        / "," / "="
      typecode-part   = <typecode-part, defined in RFC-tbd [FTP-URI]>

   where the rules <ihost> and <iunreserved> are defined in RFC 3987
   [RFC3987], <port> and <pct-encoded> in RFC 3986 [RFC3986], and
   <user-pass> and <typecode-part> in RFC-tbd [FTP-URI].

2.2. UCS Characters in 'ftp' URIs

   Valid 'ftp' URIs MAY contain UCS characters in their host and path
   parts, as mentioned in Section 3.3 of RFC-tbd [FTP-URI] and Section
   2.5 of RFC 3986 [RFC3986].  To use such a character in one of these
   parts, it SHALL first be encoded with UTF-8 [RFC3629].  Each octet of
   the resulting sequence SHALL then be examined.  If the octet
   corresponds to an ASCII character that is allowed in that particular
   part of the 'ftp' URI, the character SHALL appear in the URI
   directly; otherwise, the octet SHALL be percent-encoded.

3. Handling Internationalized 'ftp' RIs

   This section defines the rules for handling internationalized 'ftp'
   RIs.  The valid syntax of 'ftp' IRIs is defined in Section 2.1; the
   way to include UCS characters in 'ftp' URIs is described in Section
   2.2.

3.1. Internationalized Host Part in 'ftp' URIs and <ihost> Part in
     'ftp' IRIs

   The host part of 'ftp' URIs and the <ihost> part of 'ftp' IRIs may
   contain internationalized strings, with UCS characters being
   percent-encoded in URIs and appearing directly in IRIs.
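
   The percent-encoded form used in 'ftp' URIs follows the procedure of
   Section 2.2.  A minimal, non-normative Python sketch of that
   procedure for a single path segment is given here.  The helper name
   encode_uri_segment and the sample segment are hypothetical, and the
   set of ASCII characters left unencoded is an assumption derived from
   the <ipchar-nsc> rule of Section 2.1; it should be checked against
   RFC-tbd [FTP-URI] before being relied upon.

```python
# Sketch only: percent-encode one 'ftp' URI path segment as in Section 2.2.
# encode_uri_segment is a hypothetical helper, not defined by any RFC.

# Assumption: ASCII characters allowed unencoded in a segment are the
# RFC 3986 unreserved set, sub-delims without ";", plus ":" and "@".
UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)
SUB_DELIMS_NSC = "!$&'()*+,="
ALLOWED = frozenset((UNRESERVED + SUB_DELIMS_NSC + ":@").encode("ascii"))


def encode_uri_segment(segment: str) -> str:
    """UTF-8 encode the segment, keep allowed ASCII octets, escape the rest."""
    out = []
    for octet in segment.encode("utf-8"):
        if octet in ALLOWED:
            out.append(chr(octet))          # allowed ASCII: shown directly
        else:
            out.append("%{:02X}".format(octet))  # everything else: %XX
    return "".join(out)


if __name__ == "__main__":
    # Cyrillic sample segment, written with escapes to keep the source ASCII.
    sample = "\u0444\u0430\u0439\u043b.txt"
    print(encode_uri_segment(sample))  # %D1%84%D0%B0%D0%B9%D0%BB.txt
```

   In an 'ftp' IRI the same segment would appear with its UCS characters
   unescaped.  Note that this sketch also escapes a literal "%", so it
   is meant for raw, not yet encoded, input only.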

   As the Domain Name System (DNS) does not allow UTF-8 encoded data in
   its interchange, limiting the allowed character range to ASCII
   [ASCII], the usual UTF-8 transformation is insufficient here.  Hence,
   in order to obtain a valid domain name for lookup and further
   processing, the following steps SHALL be applied:

   (1)  represent the domain name that contains UCS characters as an
        octet stream in UTF-8 [RFC3629];

   (2)  apply the algorithm for IDN lookup defined in Section 5 of RFC
        5891 [RFC5891].

   The resulting A-label SHALL also be used with the FTP HOST command
   [I-D ietf-ftpext2-hosts], sent when establishing the FTP connection
   per Section 3.2.1 of RFC-tbd [FTP-URI].

3.2. The <user-pass> Part in 'ftp' URIs and IRIs

   The <user-pass> part of 'ftp' IRIs is not internationalized.  FTP
   does not allow UCS characters in user names or passwords and provides
   no mechanism to enable them.  Correspondingly, when processing 'ftp'
   RIs, whether URIs or IRIs, the <user-pass> part may contain only the
   characters allowed by RFC-tbd [FTP-URI].

3.3. The Path Part in 'ftp' URIs and <iftp-path> in 'ftp' IRIs

   The path part of 'ftp' URIs and the <iftp-path> part of 'ftp' IRIs
   allow UCS characters to be included in FTP pathnames.

   In order to process an internationalized FTP pathname successfully, a
   client SHALL first examine the server's response to the FEAT command
   [RFC2389], issued upon authentication per Section 3.2.2 of RFC-tbd
   [FTP-URI].  If one of the lines of the response is "UTF8", the server
   supports UTF-8 encoded pathnames [RFC2640].  If there is no such
   line, or the server does not support the FEAT mechanism, the client
   SHALL assume that UTF-8 encoded pathnames are not supported.

   If the server is determined to support UTF-8 encoded pathnames, the
   internationalized pathname parts SHALL be encoded with UTF-8
   [RFC3629] and transmitted as arguments to the corresponding FTP
   commands as a stream of UTF-8 octets.

4. Internationalization of Actual Data Interchange

   'ftp' RIs may refer to a file that contains internationalized data.
   When such a file is transmitted over the data connection, it should
   be in Net-Unicode format [RFC5198].  To indicate this, a typecode
   equal to "u" [I-D ietf-ftpext2-typeu] SHALL be set in the 'ftp' RI.

5. Mapping 'ftp' IRIs to 'ftp' URIs

   'ftp' IRIs are subject to the rules of Section 3.1 of RFC 3987
   [RFC3987] with respect to their mapping to 'ftp' URIs.

6. Security Considerations

   Security considerations for the use of IRIs are discussed in Section
   8 of RFC 3987 [RFC3987]; those for the use of Internationalized
   Domain Names are discussed in RFC 5890 [RFC5890].  This document adds
   nothing to those security considerations.

7. IANA Considerations

   IANA is asked to list this document as an additional reference in the
   'ftp' URI scheme registration.

8. References

8.1. Normative References

   [ASCII]      American National Standards Institute (ANSI), "Coded
                Character Set -- 7-bit American Standard Code for
                Information Interchange", ANSI X3.4-1986, December 1986.

   [FTP-URI]    Yevstifeyev, M., "The 'ftp' URI Scheme", Work in
                Progress (draft-yevstifeyev-ftp-uri-scheme), July 2011.

   [I-D ietf-ftpext2-typeu]
                Klensin, J., "FTP TYPE Extension for Internationalized
                Text", Work in Progress (draft-ietf-ftpext2-typeu), June
                2011.

   [ISO.10646]  International Organization for Standardization (ISO),
                "Information technology -- Universal Coded Character Set
                (UCS)", ISO/IEC 10646:2011, March 2011.

   [RFC2119]    Bradner, S., "Key words for use in RFCs to Indicate
                Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2640]    Curtin, B., "Internationalization of the File Transfer
                Protocol", RFC 2640, July 1999.

   [RFC3629]    Yergeau, F., "UTF-8, a transformation format of ISO
                10646", STD 63, RFC 3629, November 2003.

   [RFC3986]    Berners-Lee, T., Fielding, R., and L.
                Masinter, "Uniform Resource Identifier (URI): Generic
                Syntax", STD 66, RFC 3986, January 2005.

   [RFC3987]    Duerst, M. and M. Suignard, "Internationalized Resource
                Identifiers (IRIs)", RFC 3987, January 2005.

   [RFC5234]    Crocker, D., Ed., and P. Overell, "Augmented BNF for
                Syntax Specifications: ABNF", STD 68, RFC 5234, January
                2008.

   [RFC5891]    Klensin, J., "Internationalized Domain Names in
                Applications (IDNA): Protocol", RFC 5891, August 2010.

   [UNICODE]    The Unicode Consortium, "The Unicode Standard, Version
                6.0.0", (Mountain View, CA: The Unicode Consortium,
                2011. ISBN 978-1-936213-01-6), 2011.

8.2. Informative References

   [I-D ietf-ftpext2-hosts]
                Hethmon, P. and R. McMurray, "File Transfer Protocol
                HOST Command for Virtual Hosts", Work in Progress
                (draft-ietf-ftpext2-hosts), February 2011.

   [RFC0959]    Postel, J. and J. Reynolds, "File Transfer Protocol",
                STD 9, RFC 959, October 1985.

   [RFC2389]    Hethmon, P. and R. Elz, "Feature negotiation mechanism
                for the File Transfer Protocol", RFC 2389, August 1998.

   [RFC5198]    Klensin, J. and M. Padlipsky, "Unicode Format for
                Network Interchange", RFC 5198, March 2008.

   [RFC5890]    Klensin, J., "Internationalized Domain Names for
                Applications (IDNA): Definitions and Document
                Framework", RFC 5890, August 2010.

   [RFC3536bis] Hoffman, P. and J. Klensin, "Terminology Used in
                Internationalization in the IETF", Work in Progress
                (draft-ietf-appsawg-rfc3536bis), June 2011.

Appendix A. Acknowledgments

   I'd like to thank for their input to this document.

Authors' Addresses

   Mykyta Yevstifeyev
   8 Kuzovkov St., Apt 25
   Kotovsk
   Ukraine

   EMail: evnikita2@gmail.com