FTPEXT Working Group                                        P. Hethmon
INTERNET-DRAFT                                        Hethmon Brothers
<draft-hethmon-mlst-command-ftp-01.txt>                   5 March 1997
Expires 5 September 1997

                    MLST Command and Extensions to FTP

Status of this Memo

This document is an Internet-Draft.  Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its
areas, and its working groups.  Note that other groups may also
distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time.  It is inappropriate to use Internet-
Drafts as reference material or to cite them other than as
``work in progress.''

To learn the current status of any Internet-Draft, please check
the ``1id-abstracts.txt'' listing contained in the Internet-
Drafts Shadow Directories on ftp.is.co.za (Africa),
nic.nordu.net (Europe), munnari.oz.au (Pacific Rim),
ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

Distribution of this document is unlimited.

Abstract

In order to overcome the problems inherent in the current FTP LIST
output, a new command is needed to transfer standardized listing
information from Server-FTP to Client-FTP. In addition, a way for the
Server-FTP to let the Client-FTP know of this capability without 
imposing on the Client-FTP to randomly try new commands is needed. 
This proposal meets both of these requirements.

This proposal also extends the FTP protocol to allow character sets
other than US-ASCII[1] by allowing the transmission of 8-bit
characters and the recommended use of UTF-8[2] encoding.

1. Knowledge of Extra Capabilities

1.1 The FEAT Command

In order for the Client-FTP to know whether the Server-FTP understands
the MLST command and future extensions to the FTP protocol, a new
command, FEAT, allows the Client-FTP to query the Server-FTP for any
extensions to RFC 959 that the Server-FTP supports. A Server-FTP that
does not support any extensions will answer the FEAT command with a
500 reply.

The request syntax in augmented BNF syntax [3]: 

  feat     = feat-cmd CRLF
  feat-cmd = ( "f" / "F" ) ( "e" / "E" ) ( "a" / "A" ) ( "t" / "T" )

For Server-FTPs which do support extensions, the correct reply code
is 211. The reply to the FEAT command will typically be a
multiline reply of the form:

C> FEAT
S> 211- Extensions supported:
S>  MLST size*,create,modify*,perm,media-type
S>  SIZE
S>  MDTM
S> 211 End

Each extension supported must be listed on a separate line to 
facilitate the possible inclusion of parameters supported by each 
extension command. Any parameters included are to be specified in the 
RFC defining that extension.

  feat-response = "211-" SP *CHAR CRLF 1*( SP feature CRLF )
                  "211" SP "End" CRLF
  feature       = 1*CHAR *1( SP parms )
  parms         = 1*CHAR

FTP implementations which support MLST or other extension commands
MUST support FEAT.
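
As an illustration of how a Client-FTP might consume such a reply, the
following minimal Python sketch parses the lines of a 211 reply into a
table of extensions. The function name parse_feat_reply, and the choice
to pass it reply lines with the CRLF already removed, are assumptions
of this sketch and not part of this specification. Parameter strings
are kept as-is and not interpreted.

```python
def parse_feat_reply(lines):
    """Parse a multiline 211 FEAT reply.

    `lines` is a list of text lines with the trailing CRLF removed.
    Returns a dict mapping each extension name to its parameter string,
    or to None when the extension line carries no parameters.
    """
    if len(lines) < 2 or not lines[0].startswith("211-"):
        raise ValueError("not a multiline 211 reply")
    if not lines[-1].startswith("211 "):
        raise ValueError("missing final 211 line")
    features = {}
    for line in lines[1:-1]:
        # Each extension sits on its own line and begins with a space.
        if not line.startswith(" "):
            raise ValueError("extension line must begin with a space")
        name, _, parms = line[1:].partition(" ")
        features[name.upper()] = parms or None
    return features


reply = [
    "211- Extensions supported:",
    " MLST size*,create,modify*,perm,media-type",
    " SIZE",
    " MDTM",
    "211 End",
]
features = parse_feat_reply(reply)
print(sorted(features))
print(features["MLST"])
```

A Client-FTP receiving a 500 reply instead would simply conclude that
no extensions are available.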

1.2 Rationale for FEAT

While not absolutely necessary, a standard mechanism for the Server-FTP
to inform the Client-FTP of any features and extensions supported will
help reduce unnecessary traffic between the Client-FTP and Server-FTP
as more extensions are introduced in the future. Without such a
mechanism, a Client-FTP would have to try each extension in turn,
resulting in a series of exchanges with the Server-FTP.

It is also suggested that a Client-FTP which retains information about
a particular Server-FTP between uses cache the list of extensions that
Server-FTP supports. A Client-FTP would be expected to re-query the
Server-FTP if any cached extension results in a 500 response code, or
if the Client-FTP needs to determine support for a newly introduced
extension.

2. MLST Specification

2.1 Generalities

The MLST command is intended to standardize the file and directory
information returned by the Server-FTP process. The MLST command 
differs from the LIST and NLST commands in that responses can be sent
over either the control connection or data connection. The default is
to send responses over the data connection. It also differs in that 
the format of the replies is strictly defined although extensible.

The MLST command also extends the FTP protocol as presented in RFC 959
and RFC 1123 to allow the transmission of 8-bit data. Note that this
does not specify character sets which are 8-bit, but specifies that FTP
implementations are to specifically allow 8-bit bytes. The MLST
command allows both UTF-8/Unicode and "raw" forms as arguments.

The MLST response is allowed over either the data connection or over
the control connection. The default is to send the response over the
data connection. Client-FTPs which wish to receive the response over
the control connection must use the OPTS command as described in
Section 3 to set the default response to use the control connection.

2.2 Format of MLST Request

The MLST command allows a single optional argument. This argument may
be either a directory name or a filename. If a directory name is given
then MLST must return a listing of the contents of the named directory.
If the argument is a filename, then MLST must return only a single fact
line containing the information about the named file.

If no argument is given then MLST must return a listing of the contents
of the current working directory. If the Client-FTP sends an invalid 
argument, the Server-FTP MUST reply with an error code of 501.

The syntax for the MLST command is:

  mlst       = mlst-cmd [ SP ( utf-8-name | raw ) ] CRLF
  mlst-cmd   = ( "m" / "M" ) ( "l" / "L" ) ( "s" / "S" ) ( "t" / "T" )
  utf-8-name = <a UTF-8 encoded Unicode string>
  raw        = <any other 8-bit octet string>
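
The request syntax can be sketched in Python as follows. The helper
name build_mlst_command is hypothetical. The sketch assumes that text
arguments are sent as UTF-8 and that byte strings are passed through
unchanged as the "raw" form. Rejecting CR and LF in the argument is a
precaution of this sketch and is not required by the text above.

```python
def build_mlst_command(name=None):
    """Return the bytes of an MLST request for the control connection.

    `name` may be None (list the current working directory), a str
    (sent as UTF-8), or bytes (sent unchanged as the "raw" form).
    """
    if name is None:
        return b"MLST\r\n"
    raw = name.encode("utf-8") if isinstance(name, str) else bytes(name)
    if b"\r" in raw or b"\n" in raw:
        raise ValueError("argument must not contain CR or LF")
    return b"MLST " + raw + b"\r\n"


print(build_mlst_command())
print(build_mlst_command("pub/donn\u00e9es"))
```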

2.3 Format of MLST Response

The format of a response to the MLST command is as follows:

  mlst-response         = mlst-control-response | mlst-data-response
  mlst-control-response = "212-" CRLF *(SP entry CRLF) 
                          "212" SP "End" CRLF
  entry                 = *facts SP ( utf-8-name | raw )
  facts                 = fact *( ";" facts )
  fact                  = name "=" value  
  name                  = 1*ltext
  value                 = 1*ltext
  ltext                 = ALPHA | DIGIT | "," | "." | ":" | "!" | "@" |
                          "#" | "$" | "%" | "^" | "&" | "(" | ")" | 
                          "-" | "_" | "+" | "?" | "/" | "\" | "'" | 
                          <">
  mlst-data-response    = initial-response CRLF final-response
  initial-response      = "150" SP response-message CRLF
  response-message      = *ltext
  final-response        = "226" SP response-message CRLF

When responses are sent over the data connection, the format of the
control connection response is that of mlst-data-response. When the
response is sent over the control connection, then the
mlst-control-response format is used.

Given the path lengths available on various operating systems, this
specification requires implementations to accept lines (the entire
line of an MLST reply) of at least 2048 bytes. If a limit is
necessary, it is recommended that lines of up to 4096 bytes be
accepted.

The facts part of an entry contains a series of "file facts" about
the file/directory. Typical information to be presented includes
file size, last modification time, creation time, unique identifier,
and a file/directory flag.

The complete format for a successful reply to the MLST command would
be (over the control connection):

C> MLST
S> 212-
S>  facts utf-8-name
S>  facts utf-8-name
S>  facts utf-8-name
S> 212 End
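
A Client-FTP reading this reply from the control connection could
split it into entries along the following lines. This is a sketch
under stated assumptions: the function name parse_mlst_control_reply is
hypothetical, lines are passed as bytes with the CRLF removed, and the
facts are taken to end at the first space, since a series of facts
never contains a space. The name is left as bytes because it may be
UTF-8 or "raw".

```python
def parse_mlst_control_reply(lines):
    """Split a 212 MLST reply into (facts, name) pairs.

    `lines` is a list of bytes objects, one per reply line, with the
    trailing CRLF removed. `facts` is returned as text; `name` stays
    as bytes because its encoding is not known at this point.
    """
    if len(lines) < 2 or not lines[0].startswith(b"212-"):
        raise ValueError("not a multiline 212 reply")
    if not lines[-1].startswith(b"212 "):
        raise ValueError("missing final 212 line")
    entries = []
    for line in lines[1:-1]:
        if not line.startswith(b" "):
            raise ValueError("entry line must begin with a space")
        # The first space after the facts separates them from the name;
        # any later spaces belong to the name itself.
        facts, sep, name = line[1:].partition(b" ")
        if not sep or not name:
            raise ValueError("entry has no name")
        entries.append((facts.decode("ascii"), name))
    return entries


reply = [
    b"212-",
    b" type=cdir /pub/",
    b" size=4161;type=file read me.txt",
    b"212 End",
]
for facts, name in parse_mlst_control_reply(reply):
    print(facts, name)
```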

2.4 Filename Encoding

An FTP implementation using the MLST command must be 8-bit clean. This
is necessary in order to transmit UTF-8 encoded filenames. This
specification recommends the use of UTF-8 encoded filenames. FTP
implementations SHOULD use UTF-8 whenever possible to encourage
maximum interoperability.

Filenames are not restricted to UTF-8; however, treatment of arbitrary
character encodings is not specified by this standard. Applications are
encouraged to treat non-UTF-8 encodings of filenames as octet sequences.

Note that this encoding DOES NOT apply to the contents of the file.

Further information about filename encoding for FTP may be found in
"Internationalization of the File Transfer Protocol" [4].
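
One way a Client-FTP might apply this recommendation is to attempt a
strict UTF-8 decode and otherwise keep the name as an octet sequence.
The following Python sketch shows the idea. The function name
decode_name is hypothetical, and returning the undecoded bytes on
failure is a design choice of this sketch, not a requirement.

```python
def decode_name(raw):
    """Return `raw` decoded as UTF-8 text, or unchanged bytes if it is
    not valid UTF-8 (it is then treated as an opaque octet sequence)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw


print(decode_name("r\u00e9sum\u00e9.txt".encode("utf-8")))
print(decode_name(b"r\xe9sum\xe9.txt"))
```

Keeping the original bytes means the same name can be sent back to the
Server-FTP unchanged in a later command.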

2.4.1 Notes about the Filename

The filename returned in the MLST response should be an unqualified
filename. No path information should be given. Path information is
to be returned separately as specified in Section 2.7.

2.5 Format of Facts

The "facts" for a file in a reply to an MLST command consist of
information about that file. The facts are a series of keyword=value
pairs separated by a semicolon (";") character. The complete series
of facts must not contain the space character.

A sample of a typical series of facts would be (spread over two lines
for presentation only):

size=4161;lang=en-us;modify=19970214165800;create=19961001124534;
type=file;x.myfact=foo,bar
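
The sample series can be broken into its keyword=value pairs with a
few lines of Python. This is a minimal sketch: the function name
parse_facts is hypothetical, and tolerating a trailing semicolon is an
assumption of the sketch. Keywords are stored exactly as received.

```python
def parse_facts(facts):
    """Split a series of facts into a dict of keyword -> value."""
    result = {}
    for item in facts.split(";"):
        if not item:
            continue  # tolerate an empty piece, e.g. a trailing ";"
        name, sep, value = item.partition("=")
        if not sep or not name or not value:
            raise ValueError("malformed fact: %r" % item)
        result[name] = value
    return result


sample = ("size=4161;lang=en-us;modify=19970214165800;"
          "create=19961001124534;type=file;x.myfact=foo,bar")
facts = parse_facts(sample)
print(facts["size"], facts["type"], facts["x.myfact"])
```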

2.6 Standard Facts

This document defines a standard set of facts as follows:

  size       -- Size in bytes
  modify     -- Last modification time
  create     -- Creation time
  type       -- Entry type
  unique     -- Unique id of file/directory
  perm       -- File permissions, whether read, write, execute is 
                allowed for the login id.
  lang       -- Language of the filename per IANA[5] registry.
  media-type -- MIME media-type of file contents per IANA registry.
  charset    -- Character set per IANA registry (if not UTF-8)

Fact names are case-sensitive. Size, size, and SIze are not the same.

For keywords specifying time, the time is to be specified in the
format:

  yyyymmddhhmmss.sss

where

  yyyy  -- 4 digit year
  mm    -- 2 digit month
  dd    -- 2 digit day
  hh    -- 2 digit hour
  mm    -- 2 digit minute
  ss    -- 2 digit second
  .sss  -- optional digits for fractions of a second
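
A time value in this format can be converted with the Python standard
library as sketched below. The function name parse_time_fact is
hypothetical. This document does not state a time zone for the value,
so the sketch returns a naive datetime, and it accepts any number of
fractional digits while keeping at most microsecond precision.

```python
import re
from datetime import datetime

# 14 digits, optionally followed by "." and fractional-second digits.
_TIME_RE = re.compile(r"(\d{14})(?:\.(\d+))?", re.ASCII)


def parse_time_fact(value):
    """Convert a yyyymmddhhmmss[.sss] value to a naive datetime."""
    match = _TIME_RE.fullmatch(value)
    if match is None:
        raise ValueError("malformed time value: %r" % value)
    whole, fraction = match.groups()
    stamp = datetime.strptime(whole, "%Y%m%d%H%M%S")
    if fraction:
        # Pad or truncate the fraction to exactly six digits.
        micro = int(fraction[:6].ljust(6, "0"))
        stamp = stamp.replace(microsecond=micro)
    return stamp


print(parse_time_fact("19970214165800"))
print(parse_time_fact("19961001124534.250"))
```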

Further operating system specific keywords could be specified by using
the IANA operating system name as a prefix (examples only):

  OS/2.ea  -- OS/2 extended attributes
  MACOS.rf -- Macintosh resource forks

Implementation specific keywords would be allowed by starting the
keyword with the sequence "x.":

  x.ver  -- Version information
  x.desc -- File description
  x.type -- File type

2.7 The type Fact

The type fact needs a special description. Part of the problem with
current practices is deciding when a file is a directory. If it is
a directory, is it the current directory, a regular directory, or a
parent directory? The MLST specification makes this unambiguous using
the type fact. All values for the type fact are relative to the
directory listed in the response.

Five values are possible for the type fact:

  file -- a file entry
  cdir -- the current directory
  pdir -- the parent directory
  dir  -- a directory or sub-directory
  link -- the entry is a link to a file or directory

The syntax is defined to be:

  type-fact = "type" "=" 1("file" | "cdir" | "pdir" | "dir" | "link")

2.7.1 type=file

The presence of the type=file fact indicates the listed entry is 
available as a file in the listed directory.

2.7.2 type=cdir

The type=cdir fact indicates the listed entry is the
full, qualified pathname of the directory whose contents are listed.
The value of this entry (the filename part) plus the value of a 
type=file entry together should represent a complete pathname suitable
for a RETR command. The value for the type=cdir entry should include 
any necessary system delimiters used between path components. An example
would be the forward slash "/" on a Unix system, or a backslash "\"
on an OS/2 or Windows system.

The type=cdir entry is required for all MLST replies which return
directory listings. It is not required for MLST replies which return
information about a single file.

2.7.3 type=pdir

If present, the type=pdir entry represents the fully qualified
pathname of the parent directory of the type=cdir directory. A
CWD command with this value as its argument should change to the
parent directory of the listed directory. Client-FTPs should note
that not all responses will include this information.

2.7.4 type=dir

If present, the type=dir entry is the name of a directory. When
concatenated with the type=cdir entry, a CWD with this argument should
succeed (given the user has the appropriate system rights).
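
The concatenation rules for type=cdir, type=file, and type=dir can be
sketched as follows. The function name full_paths and the shape of its
input (a list of (facts, name) pairs with the facts already parsed into
a dict) are assumptions of this sketch. It also assumes the type=cdir
name already ends with the system delimiter, so the two names are
joined with nothing in between.

```python
def full_paths(entries):
    """Map each file or directory name to a complete pathname.

    `entries` is a list of (facts, name) pairs from one directory
    listing, where `facts` is a dict such as {"type": "file"}.
    """
    cdir = None
    for facts, name in entries:
        if facts.get("type") == "cdir":
            cdir = name
            break
    if cdir is None:
        raise ValueError("directory listing has no type=cdir entry")
    # Simple concatenation: the cdir name carries the delimiter.
    return {
        name: cdir + name
        for facts, name in entries
        if facts.get("type") in ("file", "dir")
    }


listing = [
    ({"type": "cdir"}, "/pub/docs/"),
    ({"type": "pdir"}, "/pub/"),
    ({"type": "file", "size": "4161"}, "readme.txt"),
    ({"type": "dir"}, "drafts"),
]
paths = full_paths(listing)
print(paths["readme.txt"])
print(paths["drafts"])
```

The first result would be used as the argument of RETR and the second
as the argument of CWD.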

2.8 The unique Fact

The unique fact is used to present a unique identifier for a file or
directory on a Server-FTP. This would be expected to be used by
Server-FTPs whose host system allows things such as symbolic links
so that the same file may be represented in more than one directory
on the server. The value of the unique fact should be considered an 
opaque string for comparison purposes.

  unique-fact = "unique" "=" token

2.9 The perm Fact

The perm fact is used to present the file permissions the user has
in regard to the listed file. The value of the fact is a 3-character
sequence representing read, write, and execute privileges for the
file or directory as pertaining to the login user.

  perm-fact  = "perm" "=" pvals
  pvals      = 1readval 1writeval 1executeval
  readval    = "r" | "-"
  writeval   = "w" | "-"
  executeval = "x" | "-"

The first character specifies the read permission. The character "r"
means read is available. The character "-" means it is not.

The second character specifies the write permission. The character
"w" means write is available while "-" means it is not.

Likewise the third character specifies the execute permission. The
character "x" means execute is available while "-" means it is not.

A file with read rights allows the Client-FTP to retrieve (RETR) the
file. If it has executable rights, then the file is considered an
executable (or runnable) program on the Server-FTP system. Some
Server-FTPs may allow the SITE EXEC extension to be used on the
specified file. If it has write rights, then the file may be
appended to using APPE, or written to by STOR.

A directory with read rights may be the target of a LIST, NLST or
MLST command. With execute rights, it can be the target of a CWD
command. With write rights, new files/directories may be created,
and existing files/directories deleted or renamed; under usual 
implementations, existing directories may only be deleted if they
are empty. 
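
The 3-character value maps naturally onto three flags. The Python
sketch below follows the pvals syntax given above. The names Perm and
parse_perm are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple


class Perm(NamedTuple):
    read: bool
    write: bool
    execute: bool


def parse_perm(value):
    """Convert a perm value such as "rw-" into a Perm tuple."""
    if len(value) != 3:
        raise ValueError("perm value must be exactly 3 characters")
    flags = []
    for char, letter in zip(value, "rwx"):
        if char == letter:
            flags.append(True)    # permission is available
        elif char == "-":
            flags.append(False)   # permission is not available
        else:
            raise ValueError("unexpected character %r in perm" % char)
    return Perm(*flags)


perm = parse_perm("r-x")
print(perm.read, perm.write, perm.execute)
```

For a file, perm.read would indicate that RETR may be attempted; for a
directory, perm.execute would indicate that CWD may be attempted.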

2.10 The lang Fact

The lang fact describes the natural language of the filename for use
in display purposes. Values used here should comply with the language
registry of IANA.

  lang-fact = "lang" "=" token

Server-FTP implementations MUST NOT guess language values. Language
values must be determined in an unambiguous way such as filesystem
tagging of language or by user configuration.

2.11 The size Fact

The size should always reflect the transmitted size of the file across
the FTP data connection. Specifically, this size should include 
counting any change in bytes required in ASCII mode when the local
filesystem does not use CRLF for an end of line marker.

Given limitations in some systems, Client-FTP implementations must
understand this size may not be precise and may change between the
time of an MLST and a RETR operation.

  size-fact = "size" "=" 1*DIGIT
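
As a worked illustration of the ASCII mode adjustment, the following
Python sketch computes the transmitted size of file contents whose
local end of line marker is a single byte. The function name
ascii_mode_size is hypothetical, and the sketch assumes every local
marker is sent as CRLF and that the data holds no other CR or LF
sequences needing special handling.

```python
def ascii_mode_size(data, local_eol=b"\n"):
    """Return the number of bytes sent for `data` in ASCII mode.

    Each occurrence of `local_eol` is assumed to be transmitted as
    CRLF, so a one-byte marker adds one byte per line.
    """
    if local_eol == b"\r\n":
        return len(data)  # already CRLF, nothing changes
    if len(local_eol) != 1:
        raise ValueError("sketch only handles one-byte markers or CRLF")
    return len(data) + data.count(local_eol)


text = b"one\ntwo\nthree\n"
print(len(text), ascii_mode_size(text))
```

Here a 14 byte local file with three lines would be reported with a
size of 17.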

2.12 The media-type Fact

The media-type fact represents the IANA media type of the file. The
list of values used must follow the guidelines set by the IANA
registry.

  media-type = "media-type" "=" <per IANA guidelines>

Server-FTP implementations MUST NOT guess media type values. Media
type values must be determined in an unambiguous way such as filesystem
tagging of media-type or by user configuration.

2.13 The charset Fact

The charset fact represents the IANA character set name for the
encoded names in an MLST response. This is an optional fact. The
default character set is UTF-8 unless specified otherwise. FTP
implementations SHOULD use UTF-8 if possible to encourage maximum
interoperability.

2.14 Mandatory minimum reply for MLST

The mandatory minimum response for MLST when returning a directory
listing must include an entry for the listed directory (type=cdir).
This requirement is lifted when returning information about a single
file.

3. The OPTS Command

The OPTS (options) command allows a Client-FTP to specify the exact
set of facts for a Server-FTP to return in an MLST response. The
Client-FTP may use the FEAT command to determine the set of facts
supported by the Server-FTP and then issue the OPTS command to
specify the set of those facts it wishes to see. Server-FTPs should
implement the OPTS command.

  Request Syntax:
  opts     = opts-cmd SP command-name *(command-options) CRLF
  opts-cmd = ( "o" / "O" ) ( "p" / "P" ) ( "t" / "T" ) ( "s" / "S" )
  command-name    = <any FTP command which allows option setting>
  command-options = <format specified by individual FTP command>

  Response Syntax:
  opts-response = opts-good | opts-bad
  opts-good     = "200" SP response-message CRLF
  opts-bad      = "451" SP response-message CRLF |
                  "501" SP response-message CRLF

3.1 OPTS parameters for MLST

For the MLST command, the Client-FTP may specify a list of facts it
wishes to be returned. The format is specified by:

  opts = "OPTS" SP *facts

4. Impact On Other FTP Commands

Along with the introduction of MLST, traditional FTP commands must be
extended to allow for the use of more than US-ASCII or EBCDIC
character sets. In general, the support of MLST requires support for
arbitrary character sets wherever filenames and directory names are
allowed. This applies equally to both arguments given to the following
commands and to the replies from them.

  CWD
  RETR
  STOR
  APPE
  RNFR
  RNTO
  DELE
  RMD
  MKD
  PWD
  STAT  

4.1 Impact on Pathnames and Filenames

The design of MLST requires the Server-FTP to allow concatenation
of certain elements of an MLST response. Specifically, a typical
response would include an element which indicates the current
directory and one or more elements which are files in the indicated
directory. A Server-FTP must be able to accept a simple concatenation
of these two names even if the underlying operating system does not
accept a simple concatenation. The Server-FTP must perform any
translation of the concatenated name to local equivalents.

5. Security

This memo does not discuss security. No new security concerns are
raised in this memo above what now exists within the FTP protocol.

6. References


[1] Coded Character Set--7-bit American Standard Code for Information
    Interchange, ANSI X3.4-1986.

[2] F. Yergeau, "UTF-8, a transformation format of Unicode and ISO
    10646", RFC 2044, Alis Technologies, October 1996.

[3] D. Crocker, "Augmented BNF for Syntax Specifications: ABNF",
    Work In Progress <draft-ietf-drums-abnf-01.txt>, Internet
    Mail Consortium, November 1996.

[4] W. Curtin, "Internationalization of the File Transfer Protocol",
    Work In Progress <draft-ietf-ftpext-itln-00.txt>, Defense 
    Information Systems Agency, November 1996.

[5] Internet Assigned Numbers Authority. http://www.isi.edu/div7/iana/
    Email: iana@iana.org.

7. Acknowledgements

The following people have contributed to this document:

Alex Belits
D. J. Bernstein
Martin J. Duerst
Mark Harris
Alun Jones
James Matthews
Keith Moore
(and others from the FTPEXT working group)

8. Editor's Address

Paul Hethmon
Hethmon Brothers
2305 Chukar Road
Knoxville, TN 37923 USA

Phone: 423-690-8990
Email: phethmon@hethmon.com