Network Working Group                                    Martin Hamilton
INTERNET-DRAFT                                   Loughborough University
                                                           February 1996

                         WHOIS++ URL Specification

                     <draft-hamilton-whois-url-00.txt>


Status of This Memo

This document is an Internet-Draft.  Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups.  Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
Directories on ds.internic.net (US East Coast), nic.nordu.net
(Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim).

Distribution of this memo is unlimited.

This Internet Draft expires August 22, 1996.


Abstract

This document defines a new Uniform Resource Locator (URL) scheme
"whois", which provides a convention within the URL framework for
referring to WHOIS++ servers and the data held within them.  It does
not specify a standard.  Comments should be sent to the author.


1. Overview of the WHOIS++ protocol

RFC 1835 [1] defines a simple Internet directory protocol known as
WHOIS++.  In order that WHOIS++ may be used within the Uniform
Resource Locator (URL) framework defined by RFC 1738 [2], a URL scheme
definition for WHOIS++ is necessary.  This document specifies a URL
scheme "whois", for use with the WHOIS++ protocol.

WHOIS++ is a text-based protocol after the fashion of many popular
Internet application protocols, such as SMTP [3] and FTP [4].
Although the protocol is TCP-based, WHOIS++ is effectively stateless:
no state information is preserved across requests, there is no concept
of a session per se since each request/response pair is
self-contained, and there is no "login" phase.

WHOIS++ transactions normally consist of a single request from the
client and response from the server, followed by the TCP connection
between the two being torn down.  Use of the "hold" constraint in the
WHOIS++ request makes it possible for the client to indicate that it
would like to keep the TCP connection open for more than one request/
response pair, but whether this is actually done is at the discretion
of the server.


2. WHOIS++ URL specification

The following information is necessary for a WHOIS++ client to
formulate and deliver a request:

  o the domain name or IP address of the server to contact
  o the port number of the server (63 by default)
  o the request itself - normally a single line of text

This is a good match with the generic Uniform Resource Locator (URL)
scheme specified in RFC 1738.  So, a URL of the following form would
seem to be appropriate:

  whois://host[:port][/<request-specification>]

Using the BNF grammar defined in RFC 1738, this could be written as:

  whoisurl   = "whois://" hostport [ "/" whoisrch ]

where

  whoisrch   = *uchar

The definitions for hostport and uchar are imported from RFC 1738:

  hostport       = host [ ":" port ]
  uchar          = unreserved | escape

These in turn depend upon the following:

  unreserved     = alpha | digit | safe | extra
  alpha          = lowalpha | hialpha
  digit          = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                   "8" | "9"
  safe           = "$" | "-" | "_" | "." | "+"
  extra          = "!" | "*" | "'" | "(" | ")" | ","
  lowalpha       = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
                   "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
                   "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
                   "y" | "z"
  hialpha        = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
                   "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
                   "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"

  escape         = "%" hex hex
  hex            = digit | "A" | "B" | "C" | "D" | "E" | "F" |
                   "a" | "b" | "c" | "d" | "e" | "f"

BNF for the WHOIS++ request format is defined in Appendix F of RFC
1835.  A request can contain characters which may confuse software
that deals with WHOIS++ URLs, notably spaces and characters drawn from
non-ASCII character sets such as the UTF-8 variant of Unicode [5,6].
Hence, the usual rules about hex-escaping illegal and reserved
characters should apply, which is why the WHOIS++ request is defined
above as a sequence of "uchar".  Note that the default WHOIS++ port of
63 should be used if the port number component of the "hostport"
construction is left out.
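
As an illustration only, the following Python sketch splits a "whois"
URL into host, port and request along the lines of the grammar above,
falling back to port 63 when no port is given.  The names
parse_whois_url and WhoisURL are hypothetical and not part of this
specification.  The sketch does not check the request against the
"uchar" production, and assumes that hex escapes decode to UTF-8.

```python
from typing import NamedTuple
from urllib.parse import unquote

DEFAULT_WHOISPP_PORT = 63  # default WHOIS++ port, per the text above


class WhoisURL(NamedTuple):
    host: str
    port: int
    request: str  # hex escapes already decoded; "" if no request was given


def parse_whois_url(url: str) -> WhoisURL:
    """Split whois://host[:port][/request] into its parts (hypothetical)."""
    prefix = "whois://"
    if url[:len(prefix)].lower() != prefix:
        raise ValueError("not a whois URL: %r" % url)
    rest = url[len(prefix):]
    # Everything up to the first "/" is hostport; the remainder is the request.
    hostport, _, raw_request = rest.partition("/")
    host, colon, port_text = hostport.partition(":")
    if not host:
        raise ValueError("missing host in %r" % url)
    if colon:
        if not (port_text.isascii() and port_text.isdigit()):
            raise ValueError("bad port in %r" % url)
        port = int(port_text)
    else:
        port = DEFAULT_WHOISPP_PORT
    return WhoisURL(host, port, unquote(raw_request))
```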

Global constraints such as authentication information, language and
character set preferences may be expressed as part of the WHOIS++
request.  Consequently it is not thought necessary to specify them
separately in a mechanism such as the "user@host" construction defined
for the FTP URL.

Most WHOIS++ requests can be expected to consist of a single line of
text, followed by carriage return and line feed characters.  It
should, however, be noted that it may be necessary to encode
multi-line requests within WHOIS++ URLs.  Software which implements
WHOIS++ URLs should either be capable of handling this, or fail
gracefully.
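
One way of carrying a multi-line request is to hex-escape the
carriage return and line feed along with every other character
outside the "unreserved" set.  A minimal Python sketch follows; the
function name is hypothetical, UTF-8 is assumed for non-ASCII
characters, and the escaping is deliberately strict, so characters
such as "=", ";" and ":" are escaped as well.

```python
import string

# "unreserved" from RFC 1738: alpha | digit | safe | extra
UNRESERVED = frozenset(
    string.ascii_letters + string.digits + "$-_.+" + "!*'(),"
)


def escape_request(request: str) -> str:
    """Hex-escape every byte of the request that is not unreserved."""
    escaped = []
    for byte in request.encode("utf-8"):
        char = chr(byte)
        # Bytes above 127 never match, so UTF-8 sequences are always escaped.
        escaped.append(char if char in UNRESERVED else "%%%02X" % byte)
    return "".join(escaped)
```

With this sketch, the two-line placeholder text "line one" CR LF
"line two" would be written as line%20one%0D%0Aline%20two.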


3. Examples

The WHOIS++ URL scheme defined above should make it possible to write
URLs for any of the following:

  (a) a reference to a particular WHOIS++ server, without implying
        that a search should be done
  (b) a "canned" search of a particular server
  (c) individual objects within a server

Case (a) simply requires that the host and optionally the port number
be specified, e.g.

  whois://acm.org/

or

  whois://acm.org:63/

When given a WHOIS++ URL of this format, implementations may choose to
present the user with a search form or dialogue, contact the server
for information about which WHOIS++ options it supports, and so on.
The WHOIS++ default port 63 should be used if the port number is not
specified.

Case (b) requires a search specification to be present, e.g.

  whois://acm.org/name=phil%20and%20name=zimmerman

This may be sent verbatim to the server, once hex-escaped characters
in the URL have been converted back to normal, e.g.

  name=phil and name=zimmerman
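
The decoding and delivery steps might be sketched in Python as
follows.  The function names are hypothetical; the sketch assumes
UTF-8 on the wire, a single request per connection (no "hold"
constraint), and a server which closes the connection once its
response is complete.

```python
import socket
from urllib.parse import unquote


def request_line(escaped_request: str) -> bytes:
    """Undo the hex escaping and terminate the request with CR LF."""
    return unquote(escaped_request).encode("utf-8") + b"\r\n"


def whoispp_transaction(host: str, port: int, escaped_request: str,
                        timeout: float = 10.0) -> bytes:
    """Send one request and return the raw response bytes."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(request_line(escaped_request))
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # the server has torn down the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)
```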

Case (c) is effectively an instance of (b).  This may be implemented
as a search where the request consists of the WHOIS++ "handle" of the
requested object, e.g.

  whois://acm.org/handle=number6


4. Global constraints

Although there are no global constraints specified in these last two
URLs, the WHOIS++ client may choose to add global constraints of its
own, e.g.  use of the "hold" constraint to request that the connection
be held open for a further request.

If, in addition, global constraints are part of the URL, this can
easily be recognised by the presence of a colon ":" immediately after
the slash "/" which separates the host and port information from the
search specifier, e.g.

  whois://acm.org/:authenticate=password;name=foo;password=bar

At the implementor's discretion, the client may choose to pass these
global constraints on in any queries which are passed to this server,
e.g. if this URL were used in a search for "zimmerman", the request
passed to the server might be any of

  zimmerman

or

  zimmerman:authenticate=password;name=foo;password=bar

or "zimmerman", followed by some combination of the global constraints
specified in the URL and other global constraints introduced by the
WHOIS++ client.
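
The behaviour described above might be sketched in Python as follows.
The function names are hypothetical, and the sketch assumes that
global constraints are separated by ";" as in the examples, ignoring
any character quoting within constraint values.

```python
from typing import Iterable, List


def url_global_constraints(url_request: str) -> List[str]:
    """Return the global constraints of a decoded URL request, if any.

    Only a request which starts with ":" carries global constraints.
    """
    if not url_request.startswith(":"):
        return []
    return [item for item in url_request[1:].split(";") if item]


def build_query(search: str, url_request: str,
                client_constraints: Iterable[str] = ()) -> str:
    """Combine a search with constraints from the URL and the client."""
    constraints = url_global_constraints(url_request)
    constraints.extend(client_constraints)
    if not constraints:
        return search
    return search + ":" + ";".join(constraints)
```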


5. Backwards compatibility with WHOIS

For compatibility with the earlier NICNAME/WHOIS protocol [7], it may
be assumed that "whois" URLs which reference the WHOIS default port
(43) will result in the returning of a free form textual response.  It
will probably not be appropriate to post-process this as though it
were a WHOIS++ response.


6. World-Wide Web integration

These "whois" URLs may be used as hyperlinks in HTML [8] documents,
though it should be noted that the relative URL syntax defined in RFC
1808 [9] is not appropriate for use in these links.  This is because
WHOIS++ requests do not map conveniently onto the generic resource
locator syntax used for relative URLs - the syntactic conventions used
in writing a WHOIS++ request are very different from those of the
generic resource locator.

The WHOIS++ protocol and the "whois" URL lend themselves to
implementation via a proxy HTTP [10] gateway, since the information
necessary to contact the server and deliver the request is embedded
within the URL itself.  A simple proxy gateway has been implemented
which takes HTTP "GET" requests containing a "whois" URL, carries out
a WHOIS++ transaction and returns the results formatted as HTML.  This
will probably be the preferred approach to providing WHOIS++ support
by proxy for some time, since there is no Internet Media Type (a.k.a.
MIME content-type) registered for WHOIS++ as yet.
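
The final step performed by such a gateway, formatting the results
as HTML, might look like the following Python sketch.  This is not the
gateway implementation mentioned above; the function name and page
layout are hypothetical.  Both the URL and the response are escaped
before being placed in the page, since either may contain characters
which are special in HTML.

```python
from html import escape


def render_result_page(whois_url: str, response_text: str) -> str:
    """Wrap a WHOIS++ response in a minimal HTML page (hypothetical layout)."""
    return (
        "<html><head><title>WHOIS++ results</title></head>\n"
        "<body>\n"
        "<h1>Results for %s</h1>\n"
        "<pre>%s</pre>\n"
        "</body></html>\n"
    ) % (escape(whois_url), escape(response_text))
```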

It does not appear to be appropriate to use any HTTP methods other
than "GET" with "whois" URLs, and there does not appear to be any
value in using "whois" URLs in HTML forms.


7. Security Considerations

Client software should check both the contents of the WHOIS++ URL and
the results returned from WHOIS++ search requests for any unsafe
characters and character strings.

It is possible to embed requests for other protocols within this URL
format.  This is an approach which may be used to defeat security
schemes, spoof protocols, and so on.  Implementors should consider
requiring user confirmation when requests are directed to reserved
ports (i.e.  those less than 1024) other than 63 and 43, or well-known
ports in the unreserved range.
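
A confirmation check along these lines might be sketched in Python as
follows.  The function name is hypothetical, and the set of ports in
the unreserved range is an illustrative assumption (NFS and X11), not
a list taken from this document.

```python
WHOISPP_PORT = 63  # WHOIS++ default port
WHOIS_PORT = 43    # NICNAME/WHOIS default port
RESERVED_LIMIT = 1024

# Assumption: examples of well-known services on unreserved ports.
SENSITIVE_UNRESERVED_PORTS = frozenset({2049, 6000})


def needs_user_confirmation(port: int) -> bool:
    """Return True if a request to this port should be confirmed first."""
    if port in (WHOISPP_PORT, WHOIS_PORT):
        return False
    return port < RESERVED_LIMIT or port in SENSITIVE_UNRESERVED_PORTS
```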

Finally, implementations should take care not to cache authentication
information.


8. Acknowledgements

Thanks to Jeff Allen, Lorcan Dempsey, Patrik Faltstrom, Jon Knight,
William F. Maton and <<your name here!!>> for their comments on
earlier drafts of this document.

This work was supported under the UK Electronic Libraries Programme
(eLib) grant 12/39/01, Resource Organisation and Discovery in Subject-
based services.


9. References

[1] P. Deutsch, R. Schoultz, P. Faltstrom and C. Weider.
"Architecture of the WHOIS++ service", RFC 1835. August 1995.
<URL:ftp://ds.internic.net/rfc/rfc1835.txt>

[2] T. Berners-Lee, L. Masinter and M. McCahill (eds).  "Uniform
Resource Locators (URL)", RFC 1738.  December 1994.
<URL:ftp://ds.internic.net/rfc/rfc1738.txt>

[3] J. Postel.  "Simple Mail Transfer Protocol", RFC 821.  August
1982. <URL:ftp://ds.internic.net/rfc/rfc821.txt>

[4] J. Postel, J. K. Reynolds.  "File Transfer Protocol", RFC 959.
October 1985. <URL:ftp://ds.internic.net/rfc/rfc959.txt>

[5] The Unicode Standard, Worldwide Character Encoding, Version 1.0,
Volume 1, Addison-Wesley, 1990. ISBN 0-201-56788-1. 

[6] The Unicode Standard, Worldwide Character Encoding, Version 1.0,
Volume 2, Addison-Wesley, 1992. ISBN 0-201-60845-6.

[7] K. Harrenstien, M.K. Stahl, E.J. Feinler.  "NICNAME/WHOIS", RFC
954. October 1985. <URL:ftp://ds.internic.net/rfc/rfc954.txt>

[8] T. Berners-Lee, D. Connolly.  "Hypertext Markup Language - 2.0",
RFC 1866.  November 1995. <URL:ftp://ds.internic.net/rfc/rfc1866.txt>

[9] R. Fielding. "Relative Uniform Resource Locators", RFC 1808.
June 1995. <URL:ftp://ds.internic.net/rfc/rfc1808.txt>

[10] T. Berners-Lee, R. Fielding, H. Frystyk.  "Hypertext Transfer
Protocol -- HTTP/1.0", Internet Draft.  October 1995.
<URL:ftp://ds.internic.net/internet-drafts/draft-ietf-http-v10-spec-04.txt>


10. Author's address

Martin Hamilton
Department of Computer Studies
Loughborough University of Technology
Leics. LE11 3TU, UK

Email: m.t.hamilton@lut.ac.uk

