Network Working Group                                         P. Rzewski
Internet Draft                                                   Inktomi
Expires May 17, 2001                                             B. Cain
                                                            Mirror Image
                                                            N. Robertson
                                                                  Exodus
                                                           November 2000

         Cross-Network Distribution of Content Signals for HTTP
                     draft-rzewski-cndistcs-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 17, 2001.

Copyright Notice

   Copyright (C) The Internet Society (2000). All Rights Reserved.

Abstract

   One of the goals of "content peering" is to allow content providers
   increased control of content presentation in remote networks. In
   order to provide this, simple, commonly accepted implementations are
   required for content providers to originate control messages
   ("content signals"), for intermediate networks to relay them, and for
   edge networks to receive and act on them. This draft proposes such an


Rzewski, et al            Expires May 17, 2001                  [Page 1]


Internet-Draft               CNDCS for HTTP                November 2000


   implementation specifically for the control of HTTP content.

Table of Contents

   1.     Introduction . . . . . . . . . . . . . . . . . . . .  2
   1.1    Purpose  . . . . . . . . . . . . . . . . . . . . . .  2
   1.2    Requirements . . . . . . . . . . . . . . . . . . . .  3
   1.3    Terminology  . . . . . . . . . . . . . . . . . . . .  3
   2.     Injection  . . . . . . . . . . . . . . . . . . . . .  4
   2.1    Role of a Content Injector . . . . . . . . . . . . .  4
   2.2    Specific functions of a Content Injector . . . . . .  4
   2.2.1  Basic Invalidation . . . . . . . . . . . . . . . . .  4
   2.2.2  Explicit Invalidation  . . . . . . . . . . . . . . .  5
   2.2.3  Pre-Load . . . . . . . . . . . . . . . . . . . . . .  5
   3.     Distribution . . . . . . . . . . . . . . . . . . . .  5
   3.1    Role of a Content Relay  . . . . . . . . . . . . . .  5
   3.2    Specific functions of a Content Relay  . . . . . . .  6
   3.2.1  Basic Distribution of Content Signals  . . . . . . .  6
   3.3    Role of a Surrogate  . . . . . . . . . . . . . . . .  6
   3.4    Specific functions of a Surrogate  . . . . . . . . .  6
   3.4.1  Surrogate Acting on an Invalidation  . . . . . . . .  6
   3.4.2  Surrogate acting on a Pre-Load . . . . . . . . . . .  6
   3.5    Role of a Content Relay Surrogate  . . . . . . . . .  7
   4.     Security . . . . . . . . . . . . . . . . . . . . . .  7
   5.     Acknowledgements . . . . . . . . . . . . . . . . . .  7
          References . . . . . . . . . . . . . . . . . . . . .  7
          Authors' Addresses . . . . . . . . . . . . . . . . .  8
          Full Copyright Statement . . . . . . . . . . . . . .  8

1. Introduction

1.1 Purpose

   Standard web proxy caches use a set of heuristics to invalidate stale
   content and to attempt to maintain consistency. However, without
   direct validation support from origin servers, these techniques can
   be overly conservative, resulting in unnecessary trips to the origin
   servers and degradation in cache performance. A less conservative
   heuristic, on the other hand, allows the possibility of stale content
   and inconsistency among caches.

   Ideally, origin servers would have increased control over content
   validation and coherence in remote caches through a mechanism to
   explicitly invalidate and update stale content. However, because of
   the large number of origin servers and remote caches, scaling this
   control is non-trivial.

   This draft proposes a simple, HTTP-based implementation for the
   control of HTTP content through use of "content signals". Specific
   requirements are given for systems that originate content signals
   (Content Injectors), systems that pass content signals between
   networks (Content Relays), and systems that receive and act upon
   content signals (Surrogates).

   See [4] for a description of a full implementation of "content
   peering" using the systems described in this document.

1.2  Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

   An implementation is not compliant if it fails to satisfy one or more
   of the MUST or REQUIRED level requirements for the protocols it
   implements. An implementation that satisfies all the MUST or REQUIRED
   level and all the SHOULD level requirements for its protocols is said
   to be "unconditionally compliant"; one that satisfies all the MUST
   level requirements but not all the SHOULD level requirements for its
   protocols is said to be "conditionally compliant."

1.3  Terminology

   The following terms are used as defined in [3]: DISTRIBUTION,
   SURROGATE, CONTENT SIGNAL.

   ACCEPT
      If a rule in a DISTRIBUTION POLICY allows a DISTRIBUTION system to
      perform an action on a TARGET URL, it is said that the
      DISTRIBUTION system ACCEPTS that TARGET URL.

   ACCOUNTING RELAY
      Described in detail in [5].

   DISTRIBUTION POLICY
      A set of rules defined on a CONTENT INJECTOR, CONTENT RELAY, or
      SURROGATE that defines a set of TARGET URLS and a set of actions
      that may be performed on them.

   TARGET URL
      A URL attached to a set of rules defined in a DISTRIBUTION POLICY.
      DISTRIBUTION systems described in this document MUST be able to
      indicate a set of TARGET URLS by Domain Name granularity, and MAY
      provide additional granularity (e.g. regular expressions).
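
   The following non-normative sketch illustrates how a DISTRIBUTION
   POLICY with Domain Name granularity might decide whether a system
   ACCEPTS a TARGET URL. The class name, method names, and action
   strings are hypothetical, and exact hostname matching is an
   assumption; this document does not define a policy format.

```python
from urllib.parse import urlsplit


class DistributionPolicy:
    """Hypothetical policy: maps a domain name to its permitted actions."""

    def __init__(self):
        self._rules = {}  # domain name -> set of permitted action names

    def allow(self, domain, *actions):
        """Permit the given actions for every URL on this domain."""
        self._rules.setdefault(domain.lower(), set()).update(actions)

    def accepts(self, url, action):
        """Return True if a rule allows this action on the URL's domain."""
        host = (urlsplit(url).hostname or "").lower()
        return action in self._rules.get(host, set())


policy = DistributionPolicy()
policy.allow("www.example.com", "INVALIDATION", "PRE-LOAD")
```

   With this policy, a URL on www.example.com is ACCEPTED for both
   actions, while a URL on any other domain is not.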

2. Injection

2.1  Role of a Content Injector

   A CONTENT INJECTOR sends CONTENT SIGNALS using HTTP to the Hostname
   and Port of a CONTENT RELAY.

   A CONTENT INJECTOR MUST be able to send a CONTENT SIGNAL whenever an
   object referenced by a TARGET URL is added to, changed on, or deleted
   from the ORIGIN. For example, a CONTENT INJECTOR could logically be
   run whenever content is published from a staging server to an origin
   server.

   A CONTENT INJECTOR MUST re-send a CONTENT SIGNAL in the event that a
   CONTENT RELAY returns any status code other than 200. A CONTENT
   INJECTOR MAY send alarms (e-mail, SNMP traps, etc.) in the event that
   a re-send of a CONTENT SIGNAL is necessary.

   A CONTENT INJECTOR SHOULD provide a mechanism to exclude URLs
   containing certain file extensions (e.g. "cgi") from generating
   CONTENT SIGNALS. A CONTENT INJECTOR MAY provide a more flexible
   mechanism for URL exclusion, such as regular expressions.

   A CONTENT INJECTOR SHOULD provide a mechanism to cause only URLs
   containing certain file extensions (e.g. "gif") to generate CONTENT
   SIGNALS. A CONTENT INJECTOR MAY provide a more flexible mechanism for
   URL inclusion, such as regular expressions.
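
   As a non-normative illustration, the exclusion and inclusion
   mechanisms above could be combined in one filter. The function name
   and parameters are hypothetical, and treating the "file extension"
   as the suffix of the URL path is an assumption.

```python
import posixpath
from urllib.parse import urlsplit


def should_signal(url, include=None, exclude=()):
    """Decide whether a URL may generate a CONTENT SIGNAL.

    include: extensions that alone may generate signals (None = any).
    exclude: extensions that must never generate signals.
    """
    path = urlsplit(url).path
    # splitext returns e.g. ".gif"; strip the dot and normalise the case.
    ext = posixpath.splitext(path)[1].lstrip(".").lower()
    if ext in exclude:
        return False
    if include is not None and ext not in include:
        return False
    return True
```

   For example, should_signal(url, exclude={"cgi"}) suppresses signals
   for CGI scripts, and should_signal(url, include={"gif"}) limits
   signals to GIF images.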

2.2 Specific functions of a Content Injector

2.2.1  Basic Invalidation

   Probably the simplest form of CONTENT SIGNAL that a CONTENT PROVIDER
   may wish to originate is an INVALIDATION.

   A CONTENT INJECTOR MUST originate an INVALIDATION in the form of an
   HTTP DELETE operation, as defined in [2].

      DELETE [URL]
      Max-Forwards: 0


   The HTTP DELETE requests that a CONTENT INJECTOR sends MUST include
   the Max-Forwards: 0 directive so they are not forwarded to origin
   servers.
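
   A minimal, non-normative sketch of a CONTENT INJECTOR sending a
   basic INVALIDATION and re-sending it until the CONTENT RELAY returns
   200 (Section 2.1) is shown below. The function names, the attempt
   limit, and the delay between attempts are assumptions; this document
   only requires that the signal be re-sent.

```python
import http.client
import time


def http_send(relay_host, relay_port, url, headers):
    """Send one DELETE to the relay and return the HTTP status code."""
    conn = http.client.HTTPConnection(relay_host, relay_port, timeout=10)
    try:
        conn.request("DELETE", url, headers=headers)
        return conn.getresponse().status
    finally:
        conn.close()


def send_invalidation(relay_host, relay_port, url, send=http_send,
                      max_attempts=5, delay=1.0):
    """Send an INVALIDATION, re-sending until the relay answers 200.

    Returns the number of attempts used. The attempt limit and the
    delay are assumptions, not requirements of the draft.
    """
    headers = {"Max-Forwards": "0"}  # so the DELETE is not forwarded on
    for attempt in range(1, max_attempts + 1):
        try:
            status = send(relay_host, relay_port, url, headers)
        except (OSError, http.client.HTTPException):
            status = None  # treat a transport failure like a non-200 reply
        if status == 200:
            return attempt
        if attempt < max_attempts:
            time.sleep(delay)
    raise RuntimeError("relay did not return 200 for %s" % url)
```

   The send parameter exists so that the transport can be replaced, for
   instance to raise an alarm on each re-send or to test without a
   network.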

2.2.2  Explicit Invalidation

   When cross-network INVALIDATION is the sole purpose of the CONTENT
   SIGNAL, a CONTENT INJECTOR SHOULD also include the following
   additional header along with the basic INVALIDATION described in
   Section 2.2.1:

      CND: DELETE

2.2.3  Pre-Load

   Another simple form of CONTENT SIGNAL that a CONTENT PROVIDER may
   wish to originate is a PRE-LOAD.

   A CONTENT INJECTOR MUST originate a PRE-LOAD using a basic
   INVALIDATION as described in Section 2.2.1, but MUST also include the
   following additional header:

      CND: GET
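
   The three forms of CONTENT SIGNAL in Section 2.2 differ only in the
   CND header carried by the DELETE. The non-normative sketch below
   builds the headers for each form and shows how a receiver might map
   them back to a signal type. The function names and kind strings are
   hypothetical, and the case-insensitive comparison of the CND value
   is an assumption.

```python
def signal_headers(kind):
    """Return the headers for the DELETE that carries a CONTENT SIGNAL."""
    headers = {"Max-Forwards": "0"}
    if kind == "INVALIDATION":
        pass  # basic form (Section 2.2.1): no CND header
    elif kind == "EXPLICIT-INVALIDATION":
        headers["CND"] = "DELETE"  # Section 2.2.2
    elif kind == "PRE-LOAD":
        headers["CND"] = "GET"  # Section 2.2.3
    else:
        raise ValueError("unknown signal kind: %r" % kind)
    return headers


def classify(headers):
    """Receiver side: map a DELETE's headers back to a signal type."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    cnd = {k.lower(): v for k, v in headers.items()}.get("cnd", "")
    return "PRE-LOAD" if cnd.strip().upper() == "GET" else "INVALIDATION"
```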

3. Distribution

3.1 Role of a Content Relay

   A CONTENT RELAY receives CONTENT SIGNALS from an UPSTREAM source
   (such as a CONTENT INJECTOR or another CONTENT RELAY). For each
   CONTENT SIGNAL received, a CONTENT RELAY MAY perform some local
   action, and then MAY pass the CONTENT SIGNAL to DOWNSTREAM systems.

   A CONTENT RELAY MUST re-send CONTENT SIGNALS in the event that a
   DOWNSTREAM system returns any status code other than 200. A CONTENT
   RELAY MAY send alarms (e-mail, SNMP traps, etc.) in the event that a
   re-send of a CONTENT SIGNAL is necessary.

   A CONTENT RELAY MUST be able to log all invalidation messages to an
   ACCOUNTING RELAY [5].

   A CONTENT RELAY MAY have the option to maintain its own local object
   store and therefore take on a combined role as a SURROGATE. Such a
   system is called a CONTENT RELAY SURROGATE. A CONTENT RELAY SURROGATE
   has additional responsibility to act on CONTENT SIGNALS before
   passing them to DOWNSTREAM systems.

3.2  Specific functions of a Content Relay

3.2.1 Basic Distribution of Content Signals

   A CONTENT RELAY receives CONTENT SIGNALS using HTTP on a specific
   Port.

   A CONTENT RELAY that receives and ACCEPTS a CONTENT SIGNAL MUST be
   able to pass that CONTENT SIGNAL to a set of DOWNSTREAM CONTENT
   RELAYS or SURROGATES. Passing a CONTENT SIGNAL is performed by
   opening an HTTP connection to a specific Hostname and Port for each
   DOWNSTREAM system and sending the unmodified CONTENT SIGNAL as
   received from the UPSTREAM source.
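
   The distribution step can be sketched, non-normatively, as a loop
   over the configured DOWNSTREAM systems. The type and function names
   are hypothetical, and the send callable stands in for opening the
   HTTP connection; it is assumed to return the HTTP status code.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

Endpoint = Tuple[str, int]  # (hostname, port) of a DOWNSTREAM system


@dataclass(frozen=True)
class ContentSignal:
    url: str
    headers: Tuple[Tuple[str, str], ...]  # kept exactly as received


def pass_downstream(signal: ContentSignal,
                    downstream: Iterable[Endpoint],
                    send: Callable[[Endpoint, ContentSignal], int]
                    ) -> List[Endpoint]:
    """Send the unmodified signal to every DOWNSTREAM endpoint.

    Returns the endpoints that did not answer 200 and so need a re-send.
    """
    needs_resend = []
    for endpoint in downstream:
        if send(endpoint, signal) != 200:
            needs_resend.append(endpoint)
    return needs_resend
```

   The signal is frozen so that the relay cannot alter it by accident
   between receipt and forwarding.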

3.3 Role of a Surrogate

   Because a SURROGATE is responsible for actual delivery of CONTENT, it
   must act on each CONTENT SIGNAL received from an UPSTREAM source.

   A SURROGATE MUST have the ability to ignore cache heuristics
   (expiration headers, etc.) for TARGET URLS. Note that [2] states: "A
   stale cache entry may not normally be returned by a cache [...]
   unless it is first validated with the origin server." For example, a
   DISTRIBUTION POLICY (such as might be defined by a "content peering"
   agreement) that includes a commitment for the ORIGIN to issue
   explicit CONTENT SIGNALS for any TARGET URLS acts as a blanket
   validation by the ORIGIN. Therefore, cache heuristics attached to
   TARGET URL content can safely be ignored by SURROGATES and are
   assumed to be present only for the benefit of non-peered systems.

3.4 Specific functions of a Surrogate

3.4.1 Surrogate Acting on an Invalidation

   A SURROGATE that receives and ACCEPTS an INVALIDATION MUST perform
   the local deletion of the specified URL from its object store.

3.4.2 Surrogate acting on a Pre-Load

   A SURROGATE that receives and ACCEPTS a PRE-LOAD MUST perform the
   local deletion of the specified URL from its object store, and then
   MUST originate an HTTP GET request of the URL and store the HTTP
   response in its local object store.


   A SURROGATE MUST follow any HTTP 302 Redirect it receives in response
   to an HTTP GET. (Contrast with many HTTP caching proxy
   implementations that simply forward an HTTP Redirect on to a client.)
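
   A non-normative sketch of the SURROGATE behavior in Sections 3.4.1
   and 3.4.2 follows. The function name, the dict used as an object
   store, and the fetch callable are hypothetical. The redirect limit,
   storing only 200 responses, and storing the result under the
   signalled URL are assumptions that this document does not specify.

```python
def act_on_signal(store, url, cnd, fetch, max_redirects=5):
    """Apply an INVALIDATION or PRE-LOAD to a local object store.

    store: dict mapping URL -> response body (stand-in object store).
    cnd:   value of the CND header, or None if the header is absent.
    fetch: callable(url) -> (status, location, body); stand-in HTTP GET.
    """
    store.pop(url, None)  # both signal types begin with a local deletion
    if cnd != "GET":
        return  # INVALIDATION: nothing further to do
    target = url
    for _ in range(max_redirects + 1):
        status, location, body = fetch(target)
        if status == 302 and location:
            target = location  # follow the redirect rather than store it
            continue
        if status == 200:
            store[url] = body  # stored under the signalled URL (assumption)
        return
```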

3.5 Role of a Content Relay Surrogate

   A CONTENT RELAY SURROGATE MUST perform all distribution tasks of a
   SURROGATE (e.g. INVALIDATION, PRE-LOAD) followed by all distribution
   tasks of a CONTENT RELAY (passing a CONTENT SIGNAL to DOWNSTREAM
   systems).

4. Security

   Because CONTENT SIGNALS are transferred over HTTP sessions, a CONTENT
   RELAY or SURROGATE MUST have the ability to restrict incoming HTTP
   connections to only a specified set of UPSTREAM CONTENT INJECTORS or
   CONTENT RELAYS.
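
   One possible way to meet this requirement is an address-based allow
   list, sketched non-normatively below. Identifying UPSTREAM systems
   by IP address or network is an assumption, as are the class and
   method names; this document does not say how UPSTREAM systems are to
   be identified.

```python
import ipaddress


class UpstreamAllowList:
    """Hypothetical allow list of UPSTREAM addresses and networks."""

    def __init__(self, networks):
        self._networks = [ipaddress.ip_network(n) for n in networks]

    def permits(self, peer_ip):
        """Return True if the connecting peer is a configured UPSTREAM."""
        addr = ipaddress.ip_address(peer_ip)
        return any(addr in net for net in self._networks)


upstreams = UpstreamAllowList(["192.0.2.0/24", "198.51.100.7/32"])
```

   A receiver would call upstreams.permits() with the peer address of
   each incoming connection and close the connection when the result is
   False.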

   A CONTENT RELAY or SURROGATE MAY support the ability to perform SSL
   encryption over these HTTP sessions for added security. [Ed note:
   Other signed tamper prevention/detection mechanisms may be preferable
   due to the cost of the SSL negotiation on each connection as well as
   the burden of key management.]

5. Acknowledgements

   The authors acknowledge the contributions and comments of Joe Bai
   (Adero), Arthur Huston (Inktomi), and Paul Francis (Inktomi).

References

   [1] S. Bradner, "Key words for use in RFCs to Indicate Requirement
   Levels", RFC2119, March 1997
   <URL: http://www.ietf.org/rfc/rfc2119.txt>

   [2] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P.
   Leach, T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1",
   RFC2616, June 1999
   <URL: http://www.ietf.org/rfc/rfc2616.txt>

   [3] M. Day, B. Cain, G. Tomlinson, "A Model for CDN Peering",
   draft-day-cdnp-model-03.txt (work in progress), November 2000
   <URL:http://www.ietf.org/internet-drafts/draft-day-cdnp-model-03.txt>

   [4] P. Rzewski, N. Robertson, J. Bai, "Origin/Access Content
   Peering", draft-rzewski-oacp-00.txt (work in progress), November 2000

   [5] P. Rzewski, B. Cain, N. Robertson, "Cross-Network Accounting for
   HTTP", draft-rzewski-cnacct-00.txt (work in progress), November 2000

Authors' Addresses

   Phillip A. Rzewski
   Inktomi Corporation
   4100 East Third Avenue
   MS FC1-4
   Foster City, CA 94404
   USA

   Phone: +1 650 653 2487
   Email: philr@inktomi.com


   Brad Cain
   Mirror Image Internet
   49 Dragon Court
   Woburn, MA  01801
   US

   Phone: +1 781 276 1904
   Email: brad.cain@mirror-image.com


   Niel Robertson
   Exodus
   2900 Nautilus Court
   Boulder, CO 80301
   US

   Phone: +1 303 381 2302
   Email: nielr@servicemetrics.com


Full Copyright Statement

   Copyright (C) The Internet Society (2000). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC editor function is currently provided by the
   Internet Society.

