FTPEXT Working Group                                        P. Hethmon
INTERNET-DRAFT                                        Hethmon Brothers
                                                          5 March 1997
Expires 5 September 1997


                  MLST Command and Extensions to FTP

Status of this Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as ``work in
   progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts
   Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
   (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
   Coast), or ftp.isi.edu (US West Coast).

   Distribution of this document is unlimited.

Abstract

   In order to overcome the problems inherent in the current FTP LIST
   output, a new command is needed to transfer standardized listing
   information from Server-FTP to Client-FTP.  In addition, a way for
   the Server-FTP to let the Client-FTP know of this capability
   without imposing on the Client-FTP to randomly try new commands is
   needed.  This proposal meets both of these requirements.

   This proposal also extends the FTP protocol to allow character sets
   other than US-ASCII[1] by allowing the transmission of 8-bit
   characters and the recommended use of UTF-8[2] encoding.

1.
Knowledge of Extra Capabilities

1.1 The FEAT Command

   In order for the Client-FTP to know whether the Server-FTP
   understands the MLST command and future extensions to the FTP
   protocol, a new command, FEAT, can be used by the Client-FTP to
   query the Server-FTP about any extensions to RFC 959 that the
   Server-FTP supports.  For Server-FTPs which do not support any
   extensions, the FEAT command will result in a 500 reply.

   The request syntax in augmented BNF syntax [3]:

      feat     = feat-cmd CRLF
      feat-cmd = ("f" / "F") ("e" / "E") ("a" / "A") ("t" / "T")

   For Server-FTPs which do support extensions, the correct reply code
   is 211.  The reply to the FEAT command will typically be a
   multiline reply of the form:

      C> FEAT
      S> 211- Extensions supported:
      S>  MLST size*,create,modify*,perm,media-type
      S>  SIZE
      S>  MDTM
      S> 211 End

   Each extension supported must be listed on a separate line to
   facilitate the possible inclusion of parameters supported by each
   extension command.  Any parameters included are to be specified in
   the RFC defining that extension.

      feat-response = "211-" SP *CHAR CRLF
                      1*( SP feat CRLF )
                      "211" SP "End" CRLF
      feat          = 1*CHAR *1( SP parms )
      parms         = 1*CHAR

   FTP implementations which support MLST or other extension commands
   MUST support FEAT.

1.2 Rationale for FEAT

   While not absolutely necessary, a standard mechanism for the
   Server-FTP to inform the Client-FTP of any features and extensions
   supported will help reduce unnecessary traffic between the
   Client-FTP and Server-FTP as more extensions are introduced in the
   future.  If no such mechanism exists, a Client-FTP will have to try
   each extension in turn against the Server-FTP, resulting in a
   series of exchanges.

   It is also suggested that Client-FTPs which retain information
   about a particular Server-FTP between uses cache the knowledge of
   which extensions that Server-FTP supports.
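
   As an illustration only, a Client-FTP might reduce a FEAT reply to
   a table suitable for caching along the following lines.  This is a
   minimal sketch in Python, not part of the protocol: the function
   name parse_feat_reply is hypothetical, the reply is assumed to have
   already been split into lines with CRLF removed, and folding
   feature names to upper case is an assumption made here for
   convenient lookup.

```python
def parse_feat_reply(lines):
    """Map each feature listed in a 211 FEAT reply to its parameters.

    `lines` is the reply split into lines, CRLF already removed.
    Features without parameters map to None.
    """
    features = {}
    for line in lines:
        # Feature lines begin with a space.  The "211-" opening line
        # and the "211 End" closing line begin with the reply code
        # and are skipped.
        if not line.startswith(" "):
            continue
        name, _, params = line.strip().partition(" ")
        if name:
            # Upper-casing the name is an assumption of this sketch.
            features[name.upper()] = params or None
    return features


reply = [
    "211- Extensions supported:",
    " MLST size*,create,modify*,perm,media-type",
    " SIZE",
    " MDTM",
    "211 End",
]
features = parse_feat_reply(reply)
supports_mlst = "MLST" in features
```

   A client that caches the resulting table per server can consult it
   before issuing an extension command instead of probing each
   command in turn.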
   A Client-FTP would be expected to re-query the Server-FTP if any
   cached extension resulted in a 500 response code, or if the
   Client-FTP needs to determine the support for a newly introduced
   extension.

2. MLST Specification

2.1 Generalities

   The MLST command is intended to standardize the file and directory
   information returned by the Server-FTP process.  The MLST command
   differs from the LIST and NLST commands in that responses can be
   sent over either the control connection or data connection.  The
   default is to send responses over the data connection.  It also
   differs in that the format of the replies is strictly defined
   although extensible.

   The MLST command also extends the FTP protocol as presented in RFC
   959 and RFC 1123 to allow the transmission of 8-bit data.  Note
   this is not specifying character sets which are 8-bit, but
   specifying that FTP implementations are to specifically allow 8-bit
   bytes.  The MLST command allows both UTF-8/Unicode and "raw" forms
   as arguments.

   The MLST response is allowed over either the data connection or
   over the control connection.  The default is to send the response
   over the data connection.  Client-FTPs which wish to receive the
   response over the control connection must use the OPTS command as
   described in Section 3 to set the default response to use the
   control connection.

2.2 Format of MLST Request

   The MLST command allows a single optional argument.  This argument
   may be either a directory name or a filename.  If a directory name
   is given then MLST must return a listing of the contents of the
   named directory.  If the argument is a filename, then MLST must
   return only a single fact line containing the information about the
   named file.  If no argument is given then MLST must return a
   listing of the contents of the current working directory.
   If the Client-FTP sends an invalid argument, the Server-FTP MUST
   reply with an error code of 501.

   The syntax for the MLST command is:

      mlst       = mlst-cmd [ SP ( utf-8-name | raw ) ] CRLF
      mlst-cmd   = ("m" / "M") ("l" / "L") ("s" / "S") ("t" / "T")
      utf-8-name =
      raw        =

2.3 Format of MLST Response

   The format of a response to the MLST command is as follows:

      mlst-response         = mlst-control-response |
                              mlst-data-response
      mlst-control-response = "212-" CRLF
                              *(SP entry CRLF)
                              "212" SP "End" CRLF
      entry                 = *facts SP ( utf-8-name | raw )
      facts                 = fact *( ";" facts )
      fact                  = name "=" value
      name                  = 1*ltext
      value                 = 1*ltext
      ltext                 = ALPHA | DIGIT | "," | "." | ":" | "!" |
                              "@" | "#" | "$" | "%" | "^" | "&" |
                              "(" | ")" | "-" | "_" | "+" | "?" |
                              "/" | "\" | "'" | <">
      mlst-data-response    = initial-response CRLF final-response
      initial-response      = "150" SP response-message CRLF
      response-message      = *ltext
      final-response        = "226" SP response-message CRLF

   When responses are sent over the data connection, the format of the
   control connection response is that of mlst-data-response.  When
   the response is sent over the control connection, the
   mlst-control-response format is used.

   Given the path lengths available on various operating systems, this
   specification requires implementations to accept lines (the entire
   line of an MLST reply) of at least 2048 bytes.  It is recommended
   that lengths of up to 4096 bytes be accepted if limits are
   necessary.

   The facts part of the specification contains a series of "file
   facts" about the file/directory.  Typical information to be
   presented would include file size, last modification time, creation
   time, unique identifier, and a file/directory flag.

   The complete format for a successful reply to the MLST command
   would be (over the control connection):

      C> MLST
      S> 212-
      S>  facts utf-8-name
      S>  facts utf-8-name
      S>  facts utf-8-name
      S> 212 End

2.4 Filename encoding.

   An FTP implementation using the MLST command must be 8-bit clean.
   This is necessary in order to transmit UTF-8 encoded filenames.
   This specification recommends the use of UTF-8 encoded filenames.
   FTP implementations SHOULD use UTF-8 whenever possible to encourage
   the maximum interoperability.

   Filenames are not restricted to UTF-8; however, the treatment of
   arbitrary character encodings is not specified by this standard.
   Applications are encouraged to treat non-UTF-8 encodings of
   filenames as octet sequences.

   Note that this encoding DOES NOT apply to the contents of the file.

   Further information about filename encoding for FTP may be found in
   "Internationalization of the File Transfer Protocol" [4].

2.4.1 Notes about the Filename.

   The filename returned in the MLST command should be an unqualified
   filename.  No path information should be given.  Path information
   is to be returned separately as specified in Section 2.6.

2.5 Format of Facts

   The "facts" for a file in a reply to an MLST command consist of
   information about that file.  The facts are a series of
   keyword=value pairs separated by a semicolon (";") character.  The
   complete series of facts must not contain the space character.  A
   sample of a typical series of facts would be (spread over two lines
   for presentation only):

      size=4161;lang=en-us;modify=19970214165800;create=19961001124534;
      type=file;x.myfact=foo,bar

2.6 Standard Facts

   This document defines a standard set of facts as follows:

      size       -- Size in bytes
      modify     -- Last modification time
      create     -- Creation time
      type       -- Entry type
      unique     -- Unique id of file/directory
      perm       -- File permissions, whether read, write, execute is
                    allowed for the login id.
      lang       -- Language of the filename per IANA[5] registry.
      media-type -- MIME media-type of file contents per IANA registry.
      charset    -- Character set per IANA registry (if not UTF-8)

   Fact names are case-sensitive.
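
   The fact syntax of Section 2.5 can be illustrated with a short
   sketch.  The Python below is an example under stated assumptions,
   not a reference implementation: the function name parse_entry and
   the filename example.txt are hypothetical, the entry is assumed to
   be already decoded to text with the leading SP of a control
   connection line and the CRLF removed, and a trailing ";" is
   tolerated as a leniency of this sketch.  Because the series of
   facts contains no space, the first space separates the facts from
   the filename, and fact names are stored exactly as received since
   they are case-sensitive.

```python
def parse_entry(line):
    """Split one MLST entry into (facts, filename).

    `facts` maps each fact name to its value as a string.  Names are
    kept exactly as sent because fact names are case-sensitive.
    """
    # The facts never contain a space, so the first space ends them;
    # any later spaces belong to the filename.
    facts_part, sep, name = line.partition(" ")
    if not sep:
        raise ValueError("entry has no filename: %r" % line)
    facts = {}
    for fact in facts_part.split(";"):
        if not fact:
            continue  # leniency of this sketch: ignore empty items
        key, eq, value = fact.partition("=")
        if not eq or not key:
            raise ValueError("malformed fact: %r" % fact)
        facts[key] = value
    return facts, name


sample = (
    "size=4161;lang=en-us;modify=19970214165800;create=19961001124534;"
    "type=file;x.myfact=foo,bar example.txt"
)
facts, name = parse_entry(sample)
```

   With this sample, facts["size"] holds the string "4161", while a
   lookup of "Size" fails because the two names differ in case.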
   Size, size, and SIze are not the same.

   For keywords specifying time, the time is to be specified in the
   format:

      yyyymmddhhmmss.sss

   where

      yyyy -- 4 digit year
      mm   -- 2 digit month
      dd   -- 2 digit day
      hh   -- 2 digit hour
      mm   -- 2 digit minute
      ss   -- 2 digit second
      .sss -- optional digits for hundredths of a second

   Further operating system specific keywords could be specified by
   using the IANA operating system name as a prefix (examples only):

      OS/2.ea  -- OS/2 extended attributes
      MACOS.rf -- Macintosh resource forks

   Implementation specific keywords would be allowed by starting the
   keyword with the sequence "x.":

      x.ver  -- Version information
      x.desc -- File description
      x.type -- File type

2.7 The type Fact

   The type fact needs a special description.  Part of the problem
   with current practices is deciding when a file is a directory.  If
   it is a directory, is it the current directory, a regular
   directory, or a parent directory?  The MLST specification makes
   this unambiguous using the type fact.  All values for the type fact
   are relative to the directory listed in the response.

   Five values are possible for the type fact:

      file -- a file entry
      cdir -- the current directory
      pdir -- the parent directory
      dir  -- a directory or sub-directory
      link -- the entry is a link to a file or directory

   The syntax is defined to be:

      type-fact = "type" "=" 1("file" | "cdir" | "pdir" | "dir" |
                  "link")

2.7.1 type=file

   The presence of the type=file fact indicates the listed entry is
   available as a file in the listed directory.

2.7.2 type=cdir

   The type=cdir fact indicates the listed entry is the full,
   qualified pathname of the directory whose contents are listed.  The
   value of this entry (the filename part) plus the value of a
   type=file entry together should represent a complete pathname
   suitable for a RETR command.  The value for the type=cdir entry
   should include any necessary system delimiters used between path
   components.
   An example would be the forward slash "/" on a Unix system, or a
   backslash "\" on an OS/2 or Windows system.

   The type=cdir entry is required for all MLST replies which return
   directory listings.  It is not required for MLST replies which
   return information about a single file.

2.7.3 type=pdir

   If present, the type=pdir entry represents the fully qualified
   pathname of the parent directory of the type=cdir directory.  A CWD
   command with this value should move the user to the parent
   directory of the listed directory.  Client-FTPs should note that
   not all responses will include this information.

2.7.4 type=dir

   If present, the type=dir entry is the name of a directory.  When
   it is concatenated with the type=cdir entry, a CWD with the
   resulting argument should succeed (given the user has the
   appropriate system rights).

2.8 The unique Fact

   The unique fact is used to present a unique identifier for a file
   or directory on a Server-FTP.  This would be expected to be used by
   Server-FTPs whose host system allows things such as symbolic links,
   so that the same file may be represented in more than one directory
   on the server.  The value of the unique fact should be considered
   an opaque string for comparison purposes.

      unique-fact = "unique" "=" token

2.9 The perm Fact

   The perm fact is used to present the file permissions the user has
   in regard to the listed file.  The value of the fact is a 3
   character sequence representing read, write, and execute privileges
   for the file or directory as pertaining to the login user.

      perm-fact  = "perm" "=" pvals
      pvals      = 1readval 1writeval 1executeval
      readval    = "r" | "-"
      writeval   = "w" | "-"
      executeval = "x" | "-"

   The first character specifies the read permission.  The character
   "r" means read is available.  The character "-" means it is not.
   The second character specifies the write permission.  The character
   "w" means write is available while "-" means it is not.
   Likewise the third character specifies the execute permission.  The
   character "x" means execute is available while "-" means it is not.

   A file with read rights allows the User-FTP to retrieve (RETR) the
   file.  If it has executable rights, then the file is considered an
   executable (or runnable) program on the Server-FTP system.  Some
   Server-FTPs may allow the SITE EXEC extension to be used on the
   specified file.  If it has write rights, then the file may be
   appended to using APPE, or written to by STOR.

   A directory with read rights may be the target of a LIST, NLST or
   MLST command.  With execute rights, it can be the target of a CWD
   command.  With write rights, new files/directories may be created,
   and existing files/directories deleted or renamed; under usual
   implementations, existing directories may only be deleted if they
   are empty.

2.10 The lang Fact

   The lang fact describes the natural language of the filename for
   use in display purposes.  Values used here should comply with the
   language registry of IANA.

      lang-fact = "lang" "=" token

   Server-FTP implementations MUST NOT guess language values.
   Language values must be determined in an unambiguous way such as
   filesystem tagging of language or by user configuration.

2.11 The size Fact

   The size should always reflect the transmitted size of the file
   across the FTP data connection.  Specifically, this size should
   include counting any change in bytes required in ASCII mode when
   the local filesystem does not use CRLF for an end of line marker.
   Given limitations in some systems, Client-FTP implementations must
   understand this size may not be precise and may change between the
   time of an MLST and a RETR operation.

      size-fact = "size" "=" 1*DIGIT

2.12 The media-type Fact

   The media-type fact represents the IANA media type of the file.
   The list of values used must follow the guidelines set by the IANA
   registry.

      media-type = "media-type" "="

   Server-FTP implementations MUST NOT guess media type values.  Media
   type values must be determined in an unambiguous way such as
   filesystem tagging of media-type or by user configuration.

2.13 The charset Fact

   The charset fact represents the IANA character set name for the
   encoded names in an MLST response.  This is an optional fact.  The
   default character set is UTF-8 unless specified otherwise.  FTP
   implementations SHOULD use UTF-8 if possible to encourage maximum
   interoperability.

2.14 Mandatory minimum reply for MLST

   The mandatory minimum response for MLST when returning a directory
   listing must include an entry for the listed directory (type=cdir).
   This requirement is lifted when returning information about a
   single file.

3. The OPTS Command

   The OPTS (options) command allows a Client-FTP to specify the exact
   set of facts for a Server-FTP to return within an MLST response.
   The Client-FTP may use the FEAT command to determine the set of
   facts supported by the Server-FTP and then issue the OPTS command
   to specify the set of those facts it wishes to see.  Server-FTPs
   should implement the OPTS command.

   Request Syntax:

      opts            = opts-cmd SP command-name *(command-options) CRLF
      opts-cmd        = ("o" / "O") ("p" / "P") ("t" / "T") ("s" / "S")
      command-name    =
      command-options =

   Response Syntax:

      opts-response = opts-good | opts-bad
      opts-good     = "200" SP response-message CRLF
      opts-bad      = "451" SP response-message CRLF |
                      "501" SP response-message CRLF

3.1 OPTS parameters for MLST

   For the MLST command, the Client-FTP may specify a list of facts it
   wishes to be returned.  The format is specified by:

      opts = "OPTS" SP *facts

4. Impact On Other FTP Commands

   Along with the introduction of MLST, traditional FTP commands must
   be extended to allow for the use of more than US-ASCII or EBCDIC
   character sets.
   In general, the support of MLST requires support for arbitrary
   character sets wherever filenames and directory names are allowed.
   This applies equally to both arguments given to the following
   commands and to the replies from them.

      CWD
      RETR
      STOR
      APPE
      RNFR
      RNTO
      DELE
      RMD
      MKD
      PWD
      STAT

4.1 Impact on Pathnames and Filenames

   The design of MLST requires the Server-FTP to allow concatenation
   of certain elements of an MLST response.  Specifically, a typical
   response would include an element which indicates the current
   directory and one or more elements which are files in the indicated
   directory.  A Server-FTP must be able to accept a simple
   concatenation of these two names even if the underlying operating
   system does not accept a simple concatenation.  The Server-FTP must
   perform any translation of the concatenated name to local
   equivalents.

5. Security

   This memo does not discuss security.  No new security concerns are
   raised in this memo beyond those that already exist within the FTP
   protocol.

6. References

   [1] Coded Character Set--7-bit American Standard Code for
       Information Interchange, ANSI X3.4-1986.

   [2] F. Yergeau, "UTF-8, a transformation format of Unicode and ISO
       10646", RFC 2044, Alis Technologies, October 1996.

   [3] D. Crocker, "Augmented BNF for Syntax Specifications: ABNF",
       Work in Progress, Internet Mail Consortium, November 1996.

   [4] W. Curtin, "Internationalization of the File Transfer
       Protocol", Work in Progress, Defense Information Systems
       Agency, November 1996.

   [5] Internet Assigned Numbers Authority.
       http://www.isi.edu/div7/iana/  Email: iana@iana.org.

7. Acknowledgements

   The following people have contributed to this document:

      Alex Belits
      D. J. Berstein
      Martin J. Duerst
      Mark Harris
      Alun Jones
      James Matthews
      Keith Moore
      (and others from the FTPEXT working group)

8.
Editor's Address

   Paul Hethmon
   Hethmon Brothers
   2305 Chukar Road
   Knoxville, TN 37923 USA

   Phone: 423-690-8990
   Email: phethmon@hethmon.com