Network Working Group                                         P. Rzewski
Internet Draft                                                   Inktomi
Expires May 17, 2001                                             B. Cain
                                                            Mirror Image
                                                            N. Robertson
                                                                  Exodus
                                                           November 2000


         Cross-Network Distribution of Content Signals for HTTP
                     draft-rzewski-cndistcs-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 14, 2001.

Copyright Notice

   Copyright (C) The Internet Society (2000). All Rights Reserved.

Abstract

   One of the goals of "content peering" is to allow content providers
   increased control of content presentation in remote networks. In
   order to provide this, simple, commonly accepted implementations are
   required for content providers to originate control messages
   ("content signals"), for intermediate networks to relay them, and
   for edge networks to receive and act on them. This draft proposes
   such an implementation specifically for the control of HTTP content.

Rzewski, et al               Expires May 17, 2001               [Page 1]

Internet-Draft               CNDCS for HTTP                November 2000

Table of Contents

   1.     Introduction
   1.1    Purpose
   1.2    Requirements
   1.3    Terminology
   2.     Injection
   2.1    Role of a Content Injector
   2.2    Specific functions of a Content Injector
   2.2.1  Basic Invalidation
   2.2.2  Explicit Invalidation
   2.2.3  Pre-Load
   3.     Distribution
   3.1    Role of a Content Relay
   3.2    Specific functions of a Content Relay
   3.2.1  Basic Distribution of Content Signals
   3.3    Role of a Surrogate
   3.4    Specific functions of a Surrogate
   3.4.1  Surrogate Acting on an Invalidation
   3.4.2  Surrogate Acting on a Pre-Load
   3.5    Role of a Content Relay Surrogate
   4.     Security
   5.     Acknowledgements
          References
          Authors' Addresses
          Full Copyright Statement

1. Introduction

1.1 Purpose

   Standard web proxy caches use a set of heuristics to invalidate
   stale content and to attempt to maintain consistency. However,
   without direct validation support from origin servers, these
   techniques can be overly conservative, resulting in unnecessary
   trips to the origin servers and degradation in cache performance. A
   less conservative heuristic, on the other hand, allows the
   possibility of stale content and inconsistency among caches.

   Ideally, origin servers would have increased control over content
   validation and coherence in remote caches through a mechanism to
   explicitly invalidate and update stale content. However, due to the
   large number of origin servers and the large number of remote
   caches, scaling this control can be non-trivial.

   This draft proposes a simple, HTTP-based implementation for the
   control of HTTP content through use of "content signals". Specific
   requirements are given for systems that originate content signals
   (Content Injectors), systems that pass content signals between
   networks (Content Relays), and systems that receive and act upon
   content signals (Surrogates).

   See [4] for a description of a full implementation of "content
   peering" using the systems described in this document.

1.2 Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

   An implementation is not compliant if it fails to satisfy one or
   more of the MUST or REQUIRED level requirements for the protocols it
   implements. An implementation that satisfies all the MUST or
   REQUIRED level and all the SHOULD level requirements for its
   protocols is said to be "unconditionally compliant"; one that
   satisfies all the MUST level requirements but not all the SHOULD
   level requirements for its protocols is said to be "conditionally
   compliant."

1.3 Terminology

   The following terms are used as defined in [3]: DISTRIBUTION,
   SURROGATE, CONTENT SIGNAL.

   ACCEPT
      If a rule in a DISTRIBUTION POLICY allows a DISTRIBUTION system
      to perform an action on a TARGET URL, it is said that the
      DISTRIBUTION system ACCEPTS that TARGET URL.

   ACCOUNTING RELAY
      Described in detail in [5].

   DISTRIBUTION POLICY
      A set of rules defined on a CONTENT INJECTOR, CONTENT RELAY, or
      SURROGATE that defines a set of TARGET URLS and a set of actions
      that may be performed on them.

   TARGET URL
      A URL attached to a set of rules defined in a DISTRIBUTION
      POLICY. DISTRIBUTION systems described in this document MUST be
      able to indicate a set of TARGET URLS by Domain Name granularity,
      and MAY provide additional granularity (e.g.
      regular expressions).

2. Injection

2.1 Role of a Content Injector

   A CONTENT INJECTOR sends CONTENT SIGNALS using HTTP to the Hostname
   and Port of a CONTENT RELAY.

   A CONTENT INJECTOR MUST be able to send a CONTENT SIGNAL whenever an
   object referenced by a TARGET URL is added to, changed on, or
   deleted from the ORIGIN. For example, a CONTENT INJECTOR could
   logically be run whenever content is published from a staging server
   to an origin server.

   A CONTENT INJECTOR MUST re-send a CONTENT SIGNAL in the event that a
   CONTENT RELAY returns any status code other than 200. A CONTENT
   INJECTOR MAY send alarms (e-mail, SNMP traps, etc.) in the event
   that a re-send of a CONTENT SIGNAL is necessary.

   A CONTENT INJECTOR SHOULD provide a mechanism to exclude URLs
   containing certain file extensions (e.g. "cgi") from generating
   CONTENT SIGNALS. A CONTENT INJECTOR MAY provide a more flexible
   mechanism for URL exclusion, such as regular expressions.

   A CONTENT INJECTOR SHOULD provide a mechanism to cause only URLs
   containing certain file extensions (e.g. "gif") to generate CONTENT
   SIGNALS. A CONTENT INJECTOR MAY provide a more flexible mechanism
   for URL inclusion, such as regular expressions.

2.2 Specific functions of a Content Injector

2.2.1 Basic Invalidation

   Probably the simplest form of CONTENT SIGNAL that a CONTENT PROVIDER
   may wish to originate is an INVALIDATION. A CONTENT INJECTOR MUST
   originate an INVALIDATION in the form of an HTTP DELETE operation,
   as defined in [2].

      DELETE [URL]
      Max-Forwards: 0

   The HTTP DELETE requests that a CONTENT INJECTOR sends MUST include
   the header Max-Forwards: 0 so that they are not forwarded to origin
   servers.
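
   As an illustration of Sections 2.1 and 2.2.1, the following Python
   sketch shows one way a CONTENT INJECTOR might build a basic
   INVALIDATION and re-send it until a CONTENT RELAY answers 200. It is
   not part of this specification. The function names, the retry limit,
   the fixed delay, and the choice to treat a connection failure like a
   non-200 status are all assumptions; the draft requires re-sending
   but does not say how often or for how long.

```python
import http.client
import time


def build_invalidation(target_url):
    """Return (method, request_target, headers) for a basic INVALIDATION."""
    # Section 2.2.1: an HTTP DELETE that carries "Max-Forwards: 0".
    return "DELETE", target_url, {"Max-Forwards": "0"}


def post_signal(relay_host, relay_port, method, target, headers, timeout=10):
    """Send one signal to a CONTENT RELAY and return the HTTP status code."""
    conn = http.client.HTTPConnection(relay_host, relay_port, timeout=timeout)
    try:
        conn.request(method, target, headers=headers)
        response = conn.getresponse()
        response.read()  # drain the body before closing the connection
        return response.status
    finally:
        conn.close()


def inject_invalidation(relay_host, relay_port, target_url,
                        max_attempts=5, delay_seconds=1.0, send=post_signal):
    """Send an INVALIDATION, re-sending until the relay returns 200.

    Returns the number of attempts used. Raises RuntimeError once
    max_attempts is exhausted (max_attempts and delay_seconds are
    hypothetical knobs, not values taken from the draft).
    """
    method, target, headers = build_invalidation(target_url)
    for attempt in range(1, max_attempts + 1):
        try:
            status = send(relay_host, relay_port, method, target, headers)
        except (OSError, http.client.HTTPException):
            status = None  # assumption: a failed connection counts as non-200
        if status == 200:
            return attempt
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    raise RuntimeError("relay did not return 200 for %s" % target_url)
```

   A hypothetical call such as inject_invalidation("relay.example.net",
   8080, "http://www.example.com/logo.gif") would send the DELETE
   request described in Section 2.2.1 to that relay. Passing the sender
   in as a parameter keeps the re-send logic testable without a
   network; an implementation that sends alarms could hook in at the
   point where a re-send becomes necessary.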

2.2.2 Explicit Invalidation

   When cross-network INVALIDATION is the sole purpose of the CONTENT
   SIGNAL, a CONTENT INJECTOR SHOULD also include the following
   additional header along with the basic INVALIDATION described in
   Section 2.2.1:

      CND: DELETE

2.2.3 Pre-Load

   Another simple form of CONTENT SIGNAL that a CONTENT PROVIDER may
   wish to originate is a PRE-LOAD. A CONTENT INJECTOR MUST originate a
   PRE-LOAD using a basic INVALIDATION as described in Section 2.2.1,
   but MUST also include the following additional header:

      CND: GET

3. Distribution

3.1 Role of a Content Relay

   A CONTENT RELAY receives CONTENT SIGNALS from an UPSTREAM source
   (such as a CONTENT INJECTOR or another CONTENT RELAY). For each
   CONTENT SIGNAL received, a CONTENT RELAY MAY perform some local
   action, and then MAY pass the CONTENT SIGNAL to DOWNSTREAM systems.

   A CONTENT RELAY MUST re-send CONTENT SIGNALS in the event that a
   DOWNSTREAM system returns any status code other than 200. A CONTENT
   RELAY MAY send alarms (e-mail, SNMP traps, etc.) in the event that a
   re-send of a CONTENT SIGNAL is necessary.

   A CONTENT RELAY MUST be able to log all invalidation messages to an
   ACCOUNTING RELAY [5].

   A CONTENT RELAY MAY maintain its own local object store and
   therefore take on a combined role as a SURROGATE. Such a system is
   called a CONTENT RELAY SURROGATE. A CONTENT RELAY SURROGATE has the
   additional responsibility to act on CONTENT SIGNALS before passing
   them to DOWNSTREAM systems.

3.2 Specific functions of a Content Relay

3.2.1 Basic Distribution of Content Signals

   A CONTENT RELAY receives CONTENT SIGNALS using HTTP on a specific
   Port. A CONTENT RELAY that receives and ACCEPTS a CONTENT SIGNAL
   MUST be able to pass it to a set of DOWNSTREAM CONTENT RELAYS or
   SURROGATES.
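
   The receiving side of this step can be sketched in Python as
   follows. The sketch classifies an incoming request according to
   Sections 2.2.1 through 2.2.3 and checks whether a DISTRIBUTION
   POLICY ACCEPTS its TARGET URL. It is illustrative only: the function
   names and the returned labels are hypothetical, the policy is
   reduced to a set of exact host names (one reading of the Domain Name
   granularity required in Section 1.3), and treating an unrecognized
   CND value as a basic INVALIDATION is an assumption that the draft
   does not address.

```python
from urllib.parse import urlsplit


def classify_signal(method, headers):
    """Label a received request using the forms in Sections 2.2.1-2.2.3.

    Returns "INVALIDATION", "EXPLICIT-INVALIDATION", "PRE-LOAD", or
    None when the request is not an HTTP DELETE. The labels themselves
    are hypothetical names chosen for this sketch.
    """
    if method.upper() != "DELETE":
        return None
    # HTTP header field names are case-insensitive, so normalize them.
    lowered = {name.lower(): value for name, value in headers.items()}
    cnd = lowered.get("cnd", "").strip().upper()
    if cnd == "GET":
        return "PRE-LOAD"
    if cnd == "DELETE":
        return "EXPLICIT-INVALIDATION"
    # Assumption: a missing or unknown CND value is a basic INVALIDATION.
    return "INVALIDATION"


def accepts(policy_domains, target_url):
    """Return True if the TARGET URL's host name is listed in the policy.

    policy_domains is a set of lowercase host names; exact matching is
    an assumption made to keep the sketch small.
    """
    host = urlsplit(target_url).hostname
    return host is not None and host.lower() in policy_domains
```

   A relay built along these lines would pass a signal DOWNSTREAM only
   when accepts() returns True for its TARGET URL. A richer policy
   (for example, regular expressions) could replace the set lookup
   without changing the callers.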

   Passing a CONTENT SIGNAL is performed by opening an HTTP connection
   to a specific Hostname and Port for each DOWNSTREAM system and
   sending the unmodified CONTENT SIGNAL as received from the UPSTREAM
   source.

3.3 Role of a Surrogate

   Because a SURROGATE is responsible for actual delivery of CONTENT,
   it must act on each CONTENT SIGNAL received from an UPSTREAM source.

   A SURROGATE MUST have the ability to ignore cache heuristics
   (expiration headers, etc.) for TARGET URLS.

   Note that [2] states: "A stale cache entry may not normally be
   returned by a cache [...] unless it is first validated with the
   origin server." For example, a DISTRIBUTION POLICY (such as might be
   defined by a "content peering" agreement) that includes a commitment
   for the ORIGIN to issue explicit CONTENT SIGNALS for any TARGET URLS
   acts as a blanket validation by the ORIGIN. Therefore, cache
   heuristics attached to TARGET URL content can safely be ignored by
   SURROGATES and are assumed to be present only for the benefit of
   non-peered systems.

3.4 Specific functions of a Surrogate

3.4.1 Surrogate Acting on an Invalidation

   A SURROGATE that receives and ACCEPTS an INVALIDATION MUST delete
   the specified URL from its local object store.

3.4.2 Surrogate Acting on a Pre-Load

   A SURROGATE that receives and ACCEPTS a PRE-LOAD MUST delete the
   specified URL from its local object store, and then MUST originate
   an HTTP GET request for the URL and store the HTTP response in its
   local object store.

   A SURROGATE MUST follow any HTTP 302 Redirect it receives in
   response to an HTTP GET. (Contrast with many HTTP caching proxy
   implementations that simply forward an HTTP Redirect on to a
   client.)

3.5 Role of a Content Relay Surrogate

   A CONTENT RELAY SURROGATE MUST perform all distribution tasks of a
   SURROGATE (e.g.
   INVALIDATION, PRE-LOAD) followed by all distribution tasks of a
   CONTENT RELAY (passing a CONTENT SIGNAL to DOWNSTREAM systems).

4. Security

   Because CONTENT SIGNALS are transferred over HTTP sessions, a
   CONTENT RELAY or SURROGATE MUST have the ability to restrict
   incoming HTTP connections to only a specified set of UPSTREAM
   CONTENT INJECTORS or CONTENT RELAYS.

   A CONTENT RELAY or SURROGATE MAY support the ability to perform SSL
   encryption over these HTTP sessions for added security.

   [Ed note: Other signed tamper prevention/detection mechanisms may be
   preferable due to the weight of the SSL negotiation on each
   connection as well as the hassles of key management.]

5. Acknowledgements

   The authors acknowledge the contributions and comments of Joe Bai
   (Adero), Arthur Huston (Inktomi), and Paul Francis (Inktomi).

References

   [1] S. Bradner, "Key words for use in RFCs to Indicate Requirement
       Levels", RFC 2119, March 1997

   [2] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter,
       P. Leach, T. Berners-Lee, "Hypertext Transfer Protocol --
       HTTP/1.1", RFC 2616, June 1999

   [3] M. Day, B. Cain, G. Tomlinson, "A Model for CDN Peering",
       draft-day-cdnp-model-03.txt (work in progress), November 2000

   [4] P. Rzewski, N. Robertson, J. Bai, "Origin/Access Content
       Peering", draft-rzewski-oacp-00.txt (work in progress),
       November 2000

   [5] P. Rzewski, B. Cain, N. Robertson, "Cross-Network Accounting
       for HTTP", draft-rzewski-cnacct-00.txt (work in progress),
       November 2000

Authors' Addresses

   Phillip A. Rzewski
   Inktomi Corporation
   4100 East Third Avenue
   MS FC1-4
   Foster City, CA 94404
   USA

   Phone: +1 650 653 2487
   Email: philr@inktomi.com

   Brad Cain
   Mirror Image Internet
   49 Dragon Court
   Woburn, MA 01801
   US

   Phone: +1 781 276 1904
   Email: brad.cain@mirror-image.com

   Niel Robertson
   Exodus
   2900 Nautilus Court
   Boulder, CO 80301
   US

   Phone: +1 303 381 2302
   Email: nielr@servicemetrics.com

Full Copyright Statement

   Copyright (C) The Internet Society (2000). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC editor function is currently provided by the
   Internet Society.