FTPEXT Working Group                                          P. Hethmon
INTERNET-DRAFT                                          Hethmon Brothers
                                                        20 November 1996
Expires 20 May 1997

                  MLST Command and Extensions to FTP

Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

   Distribution of this document is unlimited.

Abstract

   In order to overcome the problems inherent in the current FTP LIST
   output, a new command is needed to transfer standardized listing
   information from Server-FTP to Client-FTP. In addition, the
   Server-FTP needs a way to advertise this capability so that the
   Client-FTP does not have to probe for new commands by trial and
   error. This proposal meets both of these requirements.

   This proposal also extends the FTP protocol to allow character sets
   other than US-ASCII[1] by allowing the transmission of 8-bit
   characters and recommending the use of UTF-8[2] encoding.

1. Knowledge of Extra Capabilities

1.1 The FEAT Command

   In order for the Client-FTP to know whether the Server-FTP understands
   the MLST command and future extensions to the FTP protocol, a new
   command, FEAT, can be used by the Client-FTP to query the Server-FTP
   about any extensions to RFC 959 that the Server-FTP supports.
   For Server-FTPs which do not support any extensions, the FEAT command
   will result in a 500 reply. For Server-FTPs which do support
   extensions, the correct reply code will be 211.

   The reply to the FEAT command will typically be a multiline reply of
   the form:

      C> FEAT
      S> 211- Extensions supported:
      S>  MLST
      S>  SIZE
      S>  MDTM
      S> 211 End

   Each extension supported must be listed on a separate line to
   facilitate the possible inclusion of parameters supported by each
   extension command. Any parameters included are to be specified in the
   RFC defining that extension.

   The response format in augmented BNF syntax [3]:

      feat-response = "211-" SP *CHAR CRLF
                      1*( SP feat CRLF )
                      "211" SP "End" CRLF
      feat          = 1*CHAR *1( SP parms )
      parms         = 1*CHAR

1.2 Rationale for FEAT

   While not absolutely necessary, a standard mechanism for the
   Server-FTP to inform the Client-FTP of any features and extensions
   supported will help reduce unnecessary traffic between the Client-FTP
   and Server-FTP as more extensions are introduced in the future. If no
   such mechanism exists, a Client-FTP will have to try each extension in
   turn against the Server-FTP, resulting in a series of exchanges.

   It is also suggested that a Client-FTP which retains information
   about a particular Server-FTP between uses cache its knowledge of the
   extensions that Server-FTP supports. A Client-FTP would be expected
   to re-query the Server-FTP if any cached extension resulted in a 500
   response code, or if the Client-FTP needs to determine the support
   for a newly introduced extension.

2. MLST Specification

2.1 Generalities

   The MLST command is intended to standardize the file and directory
   information returned by the Server-FTP process. The MLST command
   differs from the LIST and NLST commands in that responses are always
   returned over the control connection. No data connection is made.
   It also differs in that the format of the replies is strictly defined,
   although extensible.

   The MLST command also extends the FTP protocol as presented in RFC 959
   and RFC 1123 to allow the transmission of 8-bit data. Note that this
   does not specify any particular 8-bit character set; it specifies
   that FTP implementations must allow 8-bit bytes. The MLST command
   allows both UTF-8 and "raw" forms as arguments.

2.2 Format of MLST Request

   The MLST command allows a single optional argument. This argument may
   be either a directory name or a filename. If a directory name is
   given, then MLST must return a listing of the contents of the named
   directory. If the argument is a filename, then MLST must return only
   a single fact line containing the information about the named file.
   If no argument is given, then MLST must return a listing of the
   contents of the current working directory.

   If the Client-FTP sends an invalid argument, the Server-FTP MUST reply
   with an error code of 501.

   The syntax for the MLST command is:

      mlst-command = "MLST" [ SP ( utf-8-name | raw ) ] CRLF
      utf-8-name   =
      raw          =

2.3 Format of MLST Response

   The format of a response to the MLST command will be as follows:

      mlst-response = "212-" CRLF
                      *( SP entry CRLF )
                      "212" SP "End" CRLF
      entry         = 1*facts SP ( utf-8-name | raw )
      facts         = fact *( ";" facts )
      fact          = name "=" value
      name          = 1*ltext
      value         = 1*ltext
      ltext         = ALPHA | DIGIT | "," | "." | ":" | "!" | "@" | "#" |
                      "$" | "%" | "^" | "&" | "*" | "(" | ")" | "-" |
                      "_" | "+" | "?" | "/" | "\" | "'" | <">

   Given the path lengths available on various operating systems, this
   specification requires clients to accept lines (the entire line of an
   MLST reply) of at least 1024 bytes. It is recommended that lines of
   up to 4096 bytes be accepted if limits are necessary.

   The facts part of the specification would contain a series of "file
   facts" about the file/directory.
   Typical information to be presented would include file size, last
   modification time, creation time, unique identifier, and a
   file/directory flag.

   The complete format for a successful reply to the MLST command would
   be:

      C> MLST
      S> 212-
      S>  facts utf-8-name
      S>  facts utf-8-name
      S>  facts utf-8-name
      S> 212 End

2.4 Filename encoding.

   An FTP implementation using the MLST command must be 8-bit clean.
   This is necessary in order to transmit UTF-8 encoded filenames. This
   specification recommends that compliant implementations transmit all
   filenames in UTF-8 format. Filenames are not restricted to UTF-8;
   however, treatment of arbitrary character encodings is not specified
   by this standard. Applications are encouraged to treat non-UTF-8
   encodings of filenames as octet sequences.

   Note that this encoding DOES NOT apply to the contents of the file.

2.4.1 Notes about the Filename.

   The filename returned in the MLST command should be an unqualified
   filename. No path information should be given. Path information is to
   be returned separately as specified in Section 2.6.

2.5 Format of Facts

   The "facts" for a file in a reply to an MLST command consist of
   information about that file. The facts are a series of keyword=value
   pairs separated by a semi-colon (";") character. The complete series
   of facts must not contain the space character. A sample of a typical
   series of facts would be (spread over two lines for presentation
   only):

      size=4161;scharset=us-ascii;mtime=833136672;ctime=833133780;
      type=file;x.myfact=foo,bar

2.6 Standard Facts

   This document defines a standard set of facts as follows:

      size     -- Size in bytes
      mtime    -- Last modification time
      ctime    -- Creation time
      type     -- Entry type
      uid      -- Unique id of file/directory
      perm     -- File permissions, whether read, write, execute is
                  allowed for the login id.
      scharset -- Character set of the filename.

   Fact names are case-sensitive. Size, size, and SIze are not the same.
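
   The fact syntax of Section 2.5 can be sketched in Python as shown
   below. This is only an illustration, not part of the specification.
   The function name parse_entry and the sample filename "read me.txt"
   are hypothetical, and the sketch assumes the entry line has already
   been read from the control connection with its leading space and
   trailing CRLF removed.

```python
def parse_entry(line):
    """Split one MLST entry into (facts, filename).

    Assumption: the leading SP and trailing CRLF are already stripped.
    The facts part never contains a space, so the first space separates
    it from the filename, which may itself contain spaces.
    """
    facts_part, _, filename = line.partition(" ")
    facts = {}
    for fact in facts_part.split(";"):
        if not fact:
            continue  # tolerate an empty piece, e.g. from a trailing ";"
        name, sep, value = fact.partition("=")
        if not sep or not name:
            raise ValueError("malformed fact: %r" % fact)
        # Fact names are case-sensitive, so the key is stored unchanged.
        facts[name] = value
    return facts, filename


facts, name = parse_entry(
    "size=4161;scharset=us-ascii;mtime=833136672;ctime=833133780;"
    "type=file;x.myfact=foo,bar read me.txt"
)
print(facts["size"], facts["type"], name)  # prints: 4161 file read me.txt
```

   Values are kept as strings here; a client would convert size or the
   time facts to integers only when it needs them.
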
   For keywords specifying time, the time is to be specified as the
   number of seconds since the epoch value 1 January 1970 00:00:00 UTC.
   For files which may have a date before this epoch value, a negative
   value is allowed, and Client-FTPs should be prepared to handle it.

   Further operating system specific keywords could be specified by
   using the IANA operating system name as a prefix (examples only):

      OS/2.ea  -- OS/2 extended attributes
      MACOS.rf -- Macintosh resource forks

   Implementation specific keywords would be allowed by starting the
   keyword with the sequence "x.":

      x.ver  -- Version information
      x.desc -- File description
      x.type -- File type

2.7 The type Fact

   The type fact needs a special description. Part of the problem with
   current practices is deciding when a file is a directory. If it is a
   directory, is it the current directory, a regular directory, or a
   parent directory? The MLST specification makes this unambiguous using
   the type fact.

   Five values are possible for the type fact:

      file -- a file entry
      cdir -- the current directory
      pdir -- the parent directory
      dir  -- a directory or sub-directory
      link -- the entry is a link

   The syntax is defined to be:

      type-fact = "type" "=" 1("file" | "cdir" | "pdir" | "dir" | "link")

2.7.1 type=file

   The presence of the type=file fact indicates the listed entry is
   available as a file in the current directory.

2.7.2 type=cdir

   The type=cdir fact indicates the listed entry is the full, qualified
   pathname of the directory whose contents are listed. The value of
   this entry (the filename part) plus the value of a type=file entry
   together should represent a complete pathname suitable for a RETR
   command. The value for the type=cdir entry should include any
   necessary system delimiters used between path components. An example
   would be the forward slash "/" on a Unix system, or a back slash "\"
   on an OS/2 or Windows system.
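
   As a rough illustration of how a client might combine the type=cdir
   entry with type=file entries to build RETR arguments, consider the
   Python sketch below. The helper name retr_paths, the (facts, name)
   pair representation, and the sample paths are all assumptions made
   for this example. It relies on the type=cdir value already carrying
   the system delimiter, so plain concatenation is used.

```python
def retr_paths(entries):
    """Return a RETR-ready pathname for every type=file entry.

    `entries` is a list of (facts, name) pairs, where facts is a dict
    of fact names to string values. The type=cdir name is assumed to
    end with the system's path delimiter already.
    """
    cdir = None
    for facts, name in entries:
        if facts.get("type") == "cdir":
            cdir = name
            break
    if cdir is None:
        raise ValueError("no type=cdir entry in listing")
    return [cdir + name
            for facts, name in entries
            if facts.get("type") == "file"]


# Hypothetical listing of a Unix directory.
listing = [
    ({"type": "cdir"}, "/pub/docs/"),
    ({"type": "pdir"}, "/pub/"),
    ({"type": "file", "size": "4161"}, "readme.txt"),
    ({"type": "dir"}, "drafts"),
]
print(retr_paths(listing))  # prints: ['/pub/docs/readme.txt']
```

   Because the client never has to guess the delimiter, the same code
   works unchanged for a server that reports a back slash in its
   type=cdir value.
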
   The type=cdir entry is required for all MLST replies which return
   directory listings. It is not required for MLST replies which return
   information about a single file.

2.7.3 type=pdir

   If present, the type=pdir entry represents the fully qualified
   pathname of the parent directory of the type=cdir directory. A CWD
   command with this value as its argument should move the user to the
   parent directory.

2.7.4 type=dir

   If present, the type=dir entry is the name of a directory. When it is
   concatenated with the type=cdir entry, a CWD with the result as its
   argument should succeed (given the user has the appropriate system
   rights).

2.8 The uid Fact

   The uid fact is used to present a unique identifier for a file or
   directory on a Server-FTP. This would be expected to be used by
   Server-FTPs whose host system allows things such as symbolic links,
   so that the same file may be represented in more than one directory
   on the server. The value of the uid fact should be considered an
   opaque string for comparison purposes.

      uid-fact = "uid" "=" token

2.9 The perm Fact

   The perm fact is used to present the file permissions the user has in
   regard to the listed file. The value of the fact is a 3 character
   sequence representing read, write, and execute privileges for the
   file or directory as pertaining to the login user.

      perm-fact  = "perm" "=" pvals
      pvals      = 1readval 1writeval 1executeval
      readval    = "r" | "-"
      writeval   = "w" | "-"
      executeval = "x" | "-"

   The first character specifies the read permission. The character "r"
   means read is available. The character "-" means it is not. The
   second character specifies the write permission. The character "w"
   means write is available while "-" means it is not. Likewise the
   third character specifies the execute permission. The character "x"
   means execute is available while "-" means it is not.
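
   The perm grammar of Section 2.9 is small enough to check with a single
   regular expression. The Python sketch below is one possible way a
   client might decode the value. The function name parse_perm and the
   keys of the returned dictionary are hypothetical choices for this
   example, not part of the specification.

```python
import re

# pvals = 1readval 1writeval 1executeval, each position being either
# its letter or "-". A trailing "-" inside [...] is a literal hyphen.
_PERM_RE = re.compile(r"([r-])([w-])([x-])")


def parse_perm(value):
    """Decode a perm fact value such as "rw-" into three booleans."""
    match = _PERM_RE.fullmatch(value)
    if match is None:
        raise ValueError("malformed perm value: %r" % value)
    readval, writeval, executeval = match.groups()
    return {
        "read": readval == "r",
        "write": writeval == "w",
        "execute": executeval == "x",
    }


print(parse_perm("rw-"))
# prints: {'read': True, 'write': True, 'execute': False}
```

   Rejecting anything other than exactly three characters keeps a
   client from silently misreading a value in some other format.
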

2.10 The scharset Fact

   The scharset fact describes to the Client-FTP the original source
   charset of the filename before transmission in UTF-8 format. This
   value may be used by the Client-FTP in order to display the filename
   to the user in a more friendly format.

      scharset-fact = "scharset" "=" token

2.11 The size fact

   The size should always reflect the transmitted size of the file across
   the FTP data connection. Specifically, this size should count any
   extra bytes required in ASCII mode when the local filesystem does not
   use CR LF as its end-of-line marker. Given limitations in some
   systems, Client-FTP implementations must understand that this size
   may not be precise.

      size-fact = "size" "=" 1*DIGIT

2.12 Mandatory minimum reply for MLST

   The mandatory minimum response for MLST when returning a directory
   listing must include an entry for the listed directory (type=cdir).
   This requirement is lifted when returning information about a single
   file.

3. Impact On Other FTP Commands

   Along with the introduction of MLST, traditional FTP commands must be
   extended to allow for the use of more than US-ASCII or EBCDIC
   character sets. In general, the support of MLST requires support for
   arbitrary character sets wherever filenames and directory names are
   allowed. This applies equally to both arguments given to the
   following commands and to the replies from them.

      CWD     RETR    STOR    APPE
      RNFR    RNTO    DELE    RMD
      MKD     PWD     STAT

4. Security

   This memo does not discuss security. No new security concerns are
   raised in this memo above what now exists within the FTP protocol.

5. References

   [1] Coded Character Set--7-bit American Standard Code for Information
       Interchange, ANSI X3.4-1986.

   [2] F. Yergeau, "UTF-8, a transformation format of Unicode and ISO
       10646", Work In Progress, Alis Technologies, July 1996.

   [3] D. Crocker, "Standard for ARPA Internet Text Messages", RFC 822,
       University of Delaware, August 1982.

6. Acknowledgements

   The following people have contributed to this document:

      Alex Belits
      D. J. Berstein
      Martin J. Duerst
      Mark Harris
      Alun Jones
      James Matthews
      Keith Moore
      (and others from the FTPEXT working group)

7. Author's Address

   Paul Hethmon
   Hethmon Brothers
   2305 Chukar Road
   Knoxville, TN 37923 USA

   Phone: 423-690-8990
   Email: phethmon@hethmon.com