FIND Working Group                           J. Allen
Internet Draft                               WebTV
<draft-ietf-find-cip-trans-00.txt>           Paul J. Leach
Expires in 6 months                          Microsoft
                                             June 5, 1997
                                    
                         CIP Transport Protocols

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups.  Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress".

WARNING: The specification in this document is subject to change, and
will certainly change.  It is inappropriate AND STUPID to implement to
the proposed specification in this document.  In particular, anyone who
implements to this specification and then complains when it changes will
be properly viewed as an idiot, and any such complaints shall be
ignored. YOU HAVE BEEN WARNED.

To learn the current status of any Internet-Draft, please check the
1id-abstracts.txt listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).

Distribution of this document is unlimited.  Please send comments to the
FIND working group at <find@bunyip.com>.  Discussions of the working
group are archived at
<URL: ftp://ftp.bunyip.com/pub/mailing-lists/find>.

Abstract

This document specifies three protocols for transporting CIP requests,
responses and index objects, utilizing TCP, mail, and HTTP. The objects
themselves are defined in [CIP-MIME] and the overall CIP architecture is
defined in [CIP-ARCH].

1.   Protocol

This section presents the actual protocol for transmitting CIP index
objects and maintaining the mesh. While the companion documents
([CIP-ARCH] and [CIP-MIME]) describe the concepts involved and the
formats of the CIP MIME objects, this document is the authoritative
definition of the message formats and transfer mechanisms of CIP used
over TCP, HTTP and mail.

1.1  Philosophy

The CIP protocol design follows a building-block philosophy. Instead of
relying on bulky protocol definition tools or ad-hoc text encodings, CIP
draws on existing, well understood Internet technologies like MIME,
RFC-822, Whois++, FTP, and SMTP. This should ease implementation and
consensus building. It should also stand as an example of a simple way
to leverage existing Internet technologies to implement new
application-level services.

1.2  MIME message exchange mechanisms

CIP relies on interchange of standard MIME messages for all requests and
replies. These messages are passed over a bidirectional, reliable
transport system. This document defines transport over reliable network
streams (via TCP), via HTTP, and via the Internet mail infrastructure.

The CIP server which initiates the connection (conventionally referred
to as a client) will be referred to below as the sender-CIP. The CIP
server which accepts a sender-CIP's incoming connection and responds to
the sender-CIP's requests is called a receiver-CIP.

1.3  The Stream Transport

CIP messages are transmitted over bi-directional TCP connections via a
simple text protocol. The transaction can take place over any TCP port,
as specified by the mesh configuration. There is no "well known port"
for CIP transactions. All configuration information in the system must
include both a hostname and a port.

All sender-CIP actions (including requests, connection initiation, and
connection finalization) are acknowledged by the receiver-CIP with a
response code. See section 1.3.1 for the format of these codes, a list
of the responses a CIP server may generate, and the expected sender-CIP
action for each.

In order to maintain backwards compatibility with existing Whois++
servers, CIPv3 sender-CIPs must first verify that the newer protocol is
supported. They do this by sending the following illegal Whois++ system
command: "# CIP-Version: 3<cr><lf>". On existing Whois++ servers
implementing versions 1 and 2 of CIP, this results in a 500-series
response code, and the server terminates the connection. If the server
implements CIPv3, it must instead respond with response code 300. Future
versions of CIP can be correctly negotiated using this technique with a
different string (e.g. "# CIP-Version: 4"). An example of this short
interchange is given below.

Note: If a sender-CIP can safely assume that the server implements
CIPv3, it may choose to send the "# CIP-Version: 3" string and
immediately follow it with the CIPv3 request. This optimization, useful
only in known homogeneous CIPv3 meshes, avoids waiting for the round-
trip inherent in the negotiation.
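
As an illustration only, the negotiation step could be sketched in
Python as follows. The function name negotiate_cipv3 and the use of
binary file-like reader and writer objects are assumptions of this
sketch, not part of the protocol. The optional "% " prefix on response
lines follows the examples given in this document.

```python
import io
import re

PROBE = b"# CIP-Version: 3\r\n"


def negotiate_cipv3(reader, writer) -> bool:
    """Return True if the peer answers the CIPv3 probe with code 300.

    reader and writer are binary file-like objects (a hypothetical
    interface), for example sock.makefile("rb") and sock.makefile("wb").
    """
    reader.readline()            # initial banner, e.g. "% 220 ... ready"
    writer.write(PROBE)          # illegal for Whois++, valid CIPv3 probe
    writer.flush()
    reply = reader.readline()    # empty if the server closed the connection
    match = re.match(rb"(?:% )?(\d{3})", reply)
    return match is not None and match.group(1) == b"300"


# Simulated exchange with a CIPv3 server, using in-memory streams.
server_side = io.BytesIO(b"% 220 Example CIP server ready\r\n% 300 CIPv3 OK!\r\n")
client_side = io.BytesIO()
accepted = negotiate_cipv3(server_side, client_side)
```

A False result corresponds to the unsuccessful case, after which a
sender-CIP may fall back to the version 1 or 2 protocol.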

Once a sender-CIP has successfully verified that the server supports
CIPv3 requests, it can send the request, formatted as a MIME message
with Mime-Version and Content-Type headers (only), using the network
standard line ending: "<cr><lf>".  A more precise specification is as
follows:

Cip-Req        = Req-Hdrs CRLF Req-Body
Req-Hdrs       = *( Version-Hdr | Req-Cntnt-Hdr )
Req-Body       = Body { format of request body as in [CIP-MIME] }
Body           = Data CRLF "." CRLF
Data           = { data with CRLF "." CRLF replaced by CRLF ".." CRLF }
Version-Hdr    = "Mime-Version:" "1.0" CRLF
Req-Cntnt-Hdr  = "Content-Type:" Req-Content CRLF
Req-Content    = { format is specified in [CIP-MIME] }

Cip-Rsp        = Rsp-Code CRLF [ Rsp-Hdrs CRLF Rsp-Body ]
                    [ Indx-Cntnt-Hdr CRLF Index-Body ]
Rsp-Code       = DIGIT DIGIT DIGIT Comment
Comment        = { any chars except CR and LF }
Rsp-Hdrs       = *( Version-Hdr | Rsp-Cntnt-Hdr )
Rsp-Cntnt-Hdr  = "Content-Type:" Rsp-Content CRLF
Rsp-Content    = { format is specified in [CIP-MIME] }
Rsp-Body       = Body { format of response body as in [CIP-MIME] }

Indx-Cntnt-Hdr = "Content-Type:" Indx-Obj-Type CRLF
Indx-Obj-Type  = { any registered index object's MIME-type }
Index-Body     = Body
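
To make the Cip-Req production concrete, the following Python sketch
serializes one request. The function name build_request is
hypothetical, and the sketch assumes the caller has already escaped any
lone-dot lines in the data. The "request" parameter of the content type
is taken from the "noop" example in this document; the full set of
parameters is defined in [CIP-MIME].

```python
CRLF = b"\r\n"


def build_request(request: str, escaped_data: bytes) -> bytes:
    """Serialize Cip-Req = Req-Hdrs CRLF Req-Body.

    escaped_data is assumed to be already escaped (no line consisting
    of a single "."); this sketch does not perform the escaping.
    """
    headers = (
        b"Mime-Version: 1.0" + CRLF
        + b'Content-Type: application/cip-request; request="'
        + request.encode("ascii") + b'"' + CRLF
    )
    body = escaped_data + CRLF + b"." + CRLF   # Body = Data CRLF "." CRLF
    return headers + CRLF + body


wire = build_request("noop", b"hello")
```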

The message is terminated using SMTP-style message termination. The data
is sent octet-for-octet, except when the pattern "<cr><lf>.<cr><lf>" is
seen, in which case the period is repeated, resulting in the following
pattern: "<cr><lf>..<cr><lf>". When the data is finished, the octet
pattern "<cr><lf>.<cr><lf>" is transmitted to the receiver-CIP. On the
receiver-CIP's side, the reverse transformation is applied, and the
message read consists of all bytes up to, but not including, the
terminating pattern.
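
The escaping and its reverse can be sketched in Python as below. This
is a line-oriented reading of the rule, not a normative implementation:
the names frame_body and unframe_body are hypothetical, and a lone dot
on the very first data line is also escaped, on the assumption that the
blank line ending the headers supplies the preceding <cr><lf>. Note
that under this reading a data line that is literally ".." does not
survive a round trip.

```python
CRLF = b"\r\n"
TERMINATOR = CRLF + b"." + CRLF


def frame_body(data: bytes) -> bytes:
    """Escape lone-dot lines, then append the terminating pattern."""
    lines = data.split(CRLF)
    escaped = [b".." if line == b"." else line for line in lines]
    return CRLF.join(escaped) + TERMINATOR


def unframe_body(wire: bytes) -> bytes:
    """Return the original data from bytes that contain a framed body."""
    end = wire.find(TERMINATOR)
    if end < 0:
        raise ValueError("message terminator not found")
    lines = wire[:end].split(CRLF)
    return CRLF.join(b"." if line == b".." else line for line in lines)


data = b"The next line is only a dot:\r\n.\r\n"
framed = frame_body(data)
```

Splitting on line boundaries, rather than replacing the five-octet
pattern directly, also handles two consecutive lone-dot lines, where
the patterns overlap.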

In response to the request, the receiver-CIP sends a response code from
the 200, 400, or 500 series. The receiver-CIP then processes the
request and replies, if necessary, with a MIME message. This reply is
also delimited by an SMTP-style message terminator.

After responding with a response code, the receiver-CIP must prepare to
read another request message, resetting state to the point when the
sender-CIP has just verified the CIP version. If the sender-CIP is
finished making requests, it may close the connection. In response, the
receiver-CIP must abort reading the message and prepare for a new
sender-CIP connection (resetting its state completely).

An example is given below. In this (and all further examples) octets
sent by the sender-CIP are preceded by ">>> " and those sent by the
receiver-CIP by "<<< ". Line endings are explicitly shown in angle-
brackets; newlines in this text are added only for readability. Comments
occur in curly-brackets.

     { sender-CIP connects to receiver-CIP }
<<< % 220 Example CIP server ready<cr><lf>
>>> # CIP-Version: 3<cr><lf>
<<< % 300 CIPv3 OK!<cr><lf>
>>> Mime-Version: 1.0<cr><lf>
>>> Content-type: application/cip-request; request="noop"<cr><lf>
>>> <cr><lf>
     {
     This example uses the "noop" request. Receiver-CIPs must simply
     ignore this request. The actual text in the following request is:
     "The next line is only a dot:<local-newline>.<local-newline>".
     }
>>> The next line is only a dot:<cr><lf>
>>> ..<cr><lf>
>>> <cr><lf>
>>> .<cr><lf>
<<< % 200 Good MIME message received<cr><lf>
     { sender-CIP shuts down socket for writing }
<<< % 222 Connection closing in response to sender-CIP shutdown<cr><lf>
     { receiver-CIP closes its side, resets, and awaits a new sender-CIP
     }

An example of an unsuccessful version negotiation looks like this:

     { sender-CIP connects to receiver-CIP }
<<< % 220 Whois++ server ready<cr><lf>
>>> # CIP-Version: 3<cr><lf>
<<< % 500 Syntax error<cr><lf>
     { server closes connection }

The sender-CIP may then retry using the version 1 or 2 protocol. The
sender-CIP may cache the result of this unsuccessful negotiation to
avoid repeating the attempt later.

1.3.1     Transport specific response codes

The following response codes are used with the stream transport:

Code  Suggested description     Sender-CIP action
      text
220   Initial server banner     Continue with Whois++ interaction, or
      message                   attempt CIP version negotiation.
300   Requested CIP version     Continue with CIP transaction, in the
      accepted                  specified version.
222   Connection closing (in    Done with transaction.
      response to sender-CIP
      close)
200   MIME request received     Expect no output, continue session (or
      and processed             close)
201   MIME request received     Read a response, delimited by SMTP-
      and processed, output     style message delimiter.
      follows
400   Temporarily unable to     Retry at a later time. May be used to
      process request           indicate that the server does not
                                currently have the resources available
                                to accept an index.
500   Bad MIME message format   Retry with correctly formatted MIME
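
A sender-CIP might decode a response line and look up its action along
the following lines. This Python sketch is illustrative: the names
ACTIONS, parse_response and sender_action are hypothetical, the action
strings paraphrase the table above, and the optional "% " prefix is
accepted because it appears in the examples in this document.

```python
import re
from typing import Tuple

# Sender-CIP actions keyed by response code, paraphrasing the table.
ACTIONS = {
    220: "negotiate CIP version or continue with Whois++",
    300: "continue with CIP transaction",
    222: "done with transaction",
    200: "expect no output",
    201: "read response up to the message terminator",
    400: "retry later",
    500: "retry with correctly formatted MIME",
}

_LINE = re.compile(r"(?:% )?(\d{3})(.*)")


def parse_response(line: str) -> Tuple[int, str]:
    """Split a response line into its numeric code and comment text."""
    match = _LINE.fullmatch(line.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"not a CIP response line: {line!r}")
    return int(match.group(1)), match.group(2).strip()


def sender_action(code: int) -> str:
    # Codes outside the table are reported, not guessed at
    # (a choice made for this sketch only).
    return ACTIONS.get(code, "unknown response code")


code, comment = parse_response("% 201 Output follows\r\n")
```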

1.4  Internet mail infrastructure as transport

As an alternative to TCP streams, CIP transactions can take place over
the existing Internet mail infrastructure. There are two motivations for
this feature of CIP. First, it lowers the barriers to entry for leaf
servers. When the need for a full TCP implementation is relaxed, leaf
nodes (which, by definition, only send index objects) can consist of as
little as a database and an indexing program (possibly written in a very
high level language) to participate in the mesh.

Second, it keeps with the philosophy of making use of existing Internet
technology. The MIME messages used for requests and responses are, by
definition of the MIME specification, suitable for transport via the
Internet mail infrastructure. With a few simple rules, we open up an
entirely different way to interact with CIP servers which choose to
implement this transport. See Protocol Conformance, below, for details
on what options server implementers have about supporting the various
transports.

The basic rhythm of request/response is maintained when using the mail
transport. The following sections clarify some special cases which need
to be considered for mail transport of CIP objects. In general, all mail
protocols and mail format specifications (especially MIME Security
Multiparts) can be used with the CIP mail transport.

[ Note to reviewers: What about version negotiation for mail transport?
Should we add a CIP-Version header? ]

1.4.1     Return path

When CIP transactions take place over a bidirectional stream, the return
path for errors and results is implicit. Using mail as a transport
introduces difficulties for the recipient, because it is not always
clear from the headers exactly where the reply should go, though in
practice MUAs use some heuristics.

CIP solves this problem by fiat. CIP requests sent using the mail
transport must include a Reply-To header as specified by RFC-822. Any
mail received for processing by a CIP server implementing the mail
transport without a Reply-To header must be ignored, and a message
should be logged for the local administrator. The receiver must not
attempt to reply with an error to any address derived from the incoming
mail.

If no response is to be sent to a CIP request under any circumstances,
the sender should include a Reply-To header with the address "<>" in it.
Receivers must never attempt to send replies to that address, as it is
defined to be invalid (both here, and by the BNF grammar in RFC-822).
It should be noted that, in general, it is a bad idea to turn off error
reporting in this way. However, in the simplest case of an index pushing
program, this may be a desirable simplification.
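
The Reply-To rules of this section can be sketched with the Python
standard library as follows. The function name triage, the logger name,
and the (process, reply address) return convention are assumptions of
this sketch. Treating an unparseable Reply-To value like a missing one
is also an assumption; this document does not address that case.

```python
import logging
from email import message_from_string
from email.utils import parseaddr
from typing import Optional, Tuple

log = logging.getLogger("cip.mail")  # hypothetical logger name


def triage(raw_message: str) -> Tuple[bool, Optional[str]]:
    """Return (process_request, reply_address) for an incoming mail."""
    msg = message_from_string(raw_message)
    reply_to = msg.get("Reply-To")
    if reply_to is None:
        # No Reply-To: ignore the mail, log it, and never reply.
        log.warning("ignoring CIP mail without a Reply-To header")
        return False, None
    if reply_to.strip() == "<>":
        # Sender asked for no response at all: process, but stay silent.
        return True, None
    _name, address = parseaddr(reply_to)
    if not address:
        # Assumption: an unusable Reply-To is handled like a missing one.
        log.warning("ignoring CIP mail with unusable Reply-To header")
        return False, None
    return True, address


result = triage("Reply-To: cip-admin@example.org\n\nrequest body\n")
```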

1.5  HTTP transport

HTTP may also be used to transport CIP objects, since they are just MIME
objects. A transaction is performed by using the POST method to send an
application/cip-request and returning an application/cip-response or an
application/cip-index-object in the HTTP reply. The URL that is the
target of the POST is a configuration parameter of the sender-CIP to
receiver-CIP relationship. Security is handled by using HTTP Basic
Authentication [RFC 2068] or HTTP Message Digest Authentication [RFC
2069], or SSL/TLS.

Example:

     { the client opens the connection and sends a POST }
>>> POST / HTTP/1.1<cr><lf>
>>> Host: cip.some.corp<cr><lf>
>>> Content-type: application/cip-request; request="noop"<cr><lf>
>>> Date: Fri, 06 Jun 1997 18:16:03 GMT<cr><lf>
>>> Content-Length: 41<cr><lf>
>>> Connection: close<cr><lf>
>>> <cr><lf>
>>> This is some text that will be ignored.<cr><lf>
     { the server processes the request }
<<< HTTP/1.1 204 No Content<cr><lf>
     { the server closes the connection }
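
A request like the one in this example could be prepared with the
Python standard library as sketched below. The function name
make_cip_post is hypothetical, and the URL stands in for whatever
target the mesh configuration supplies. Nothing is sent until the
request object is passed to urllib.request.urlopen, which also computes
the Content-Length header from the body.

```python
import urllib.request


def make_cip_post(url: str, request: str, body: bytes) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying a CIP request."""
    content_type = f'application/cip-request; request="{request}"'
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


req = make_cip_post(
    "http://cip.some.corp/",   # configured target URL (illustrative)
    "noop",
    b"This is some text that will be ignored.\r\n",
)
```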

In addition to leveraging the security capabilities that come with HTTP,
there are other HTTP features that may be useful in a CIP context. A CIP
client may use the Accept-Charset and Accept-Language HTTP headers to
express a desire to retrieve an index in a particular character set or
natural language. It may use the Accept-Encoding header to (e.g.)
indicate that it can handle compressed responses, which the CIP server
may send in conjunction with the Content-Encoding header. It may use
the If-Modified-Since header to prevent wasted transmission of an index
that has not changed since the last poll. A CIP server can use the
Retry-After header to request that the client retry later when the
server is less busy.
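
As a small illustration, a polling client might assemble these optional
headers as in the Python sketch below. The function name poll_headers
and the particular header values (gzip, en, utf-8) are illustrative
assumptions; only the header names come from the text above.

```python
from email.utils import formatdate


def poll_headers(last_poll_epoch: float) -> dict:
    """Optional HTTP request headers for a conditional index poll."""
    return {
        # HTTP-date of the previous successful poll, always in GMT.
        "If-Modified-Since": formatdate(last_poll_epoch, usegmt=True),
        "Accept-Encoding": "gzip",     # client can handle compressed replies
        "Accept-Language": "en",       # preferred natural language
        "Accept-Charset": "utf-8",     # preferred character set
    }


headers = poll_headers(0)
```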

2.   References

[RFC 2068] Fielding, et al., "Hypertext Transfer Protocol -- HTTP/1.1",
January 1997.

[RFC 2069] Franks, et al., "An Extension to HTTP: Digest Access
Authentication", January 1997.

[CIP-ARCH] Allen, J., Mealling, M., "The Common Indexing Protocol", work
in progress.

[CIP-MIME] Allen, J., Mealling, M., "MIME Object Definitions for the
Common Indexing Protocol", work in progress.


3.   Authors' Addresses

   Jeff R. Allen                         Paul J. Leach
   246 Hawthorne St.                     Microsoft
   Palo Alto, CA  94301                  1 Microsoft Way
   USA                                   Redmond, WA 98052
   EMail: jeff.allen@acm.org             Email: paulle@microsoft.com