Network Working Group                                         P. Rzewski
Internet Draft                                                   Inktomi
Expires May 17, 2001                                              J. Bai
                                                                   Adero
                                                            N. Robertson
                                                                  Exodus
                                                           November 2000


                Origin/Access Content Peering for HTTP
                      draft-rzewski-oacp-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 14, 2001.

Copyright Notice

   Copyright (C) The Internet Society (2000).  All Rights Reserved.

Abstract

   The goal of "content peering" is to logically connect content-based
   networks together to allow content providers increased control of
   content presentation in remote networks and to receive detailed
   information on the utilization of their content.  The Origin/Access
   Content Peering (OACP) implementation provides a scalable mechanism
   for content providers to originate this control and also to receive
   utilization data.

Table of Contents

   1.   Introduction
   1.1  Purpose
   1.2  OACP Components
   1.3  Terminology
   1.4  Comparison with "CDN Peering"
   2.   OACP Distribution
   2.1  Hoster/CDN Content Relay Implementation
   2.2  Operator Content Relay Implementation
   2.3  Access Content Relay and Access Cache Implementation
   3.   OACP Accounting
   3.1  Hoster/CDN Accounting Relay Implementation
   3.2  Operator Accounting Relay Implementation
   3.3  Access Provider Accounting Relay Implementation
   3.4  Access Provider Access Cache and Content Relay Surrogate
        Implementation
   4.   Operator Roles
   5.   Security
   6.   Acknowledgements
        References
        Authors' Addresses
        Full Copyright Statement

1. Introduction

1.1 Purpose

   HTTP caching proxies have been in increasing use in the Internet
   over the past several years.  Access networks (which provide
   connectivity to large masses of end users through mechanisms such as
   dial) often use caches to improve performance for users and to
   conserve WAN bandwidth.  Hoster networks (which provide data center
   connectivity to content providers) often use server accelerator
   caches to reduce load on customer origin servers.  Recently emerged
   Content Delivery Networks (CDNs) use "surrogate" caches (often
   deployed across wide geographic regions) to deliver content on
   behalf of content providers in a more efficient manner.

   Standard web proxy caches use a set of heuristics to invalidate
   stale content and to attempt to maintain consistency.  However,
   without direct validation support from origin servers, these
   techniques can be overly conservative, resulting in unnecessary
   trips to the origin servers and degradation in cache performance.  A
   less conservative heuristic, on the other hand, allows the
   possibility of stale content and inconsistency among caches.
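
   As a rough illustration of this trade-off, the sketch below computes
   a heuristic freshness lifetime as a fraction of the time since a
   response was last modified, an approach that HTTP/1.1 [2] describes
   with 10% as a typical fraction.  The function names, the timestamps,
   and the 0.5 "less conservative" fraction are assumptions chosen for
   illustration and are not taken from any particular cache
   implementation.

```python
from datetime import datetime, timedelta, timezone


def heuristic_freshness_lifetime(fetched, last_modified, fraction):
    """Estimate how long a response with no explicit expiry may be served.

    The lifetime is a fraction of the interval between the object's
    Last-Modified time and the time it was fetched.
    """
    return (fetched - last_modified) * fraction


def is_fresh(now, fetched, last_modified, fraction):
    """Return True if the cached copy may still be served without revalidation."""
    age = now - fetched
    return age < heuristic_freshness_lifetime(fetched, last_modified, fraction)


# Hypothetical object: modified 10 days before the cache fetched it,
# and requested again 30 hours after the fetch.
last_modified = datetime(2000, 11, 1, tzinfo=timezone.utc)
fetched = datetime(2000, 11, 11, tzinfo=timezone.utc)
now = fetched + timedelta(hours=30)

# Conservative (10%): lifetime is 24 hours, so the cache goes back to the origin.
conservative = is_fresh(now, fetched, last_modified, fraction=0.1)

# Less conservative (50%, an assumed value): lifetime is 120 hours, so the
# cache keeps serving its copy even if the origin has changed it meanwhile.
aggressive = is_fresh(now, fetched, last_modified, fraction=0.5)

print(conservative, aggressive)
```

   Neither setting knows whether the object actually changed at the
   origin, which is the gap that explicit signals from the origin are
   meant to close.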

   Standard web proxy caches also tend to maintain logs only with the
   intent that they will be viewed locally.  If a content provider were
   interested in obtaining access to logs on caches in remote networks,
   there is typically no commonly accepted mechanism to gain access to
   them.  Also, because each cache implementation may potentially keep
   logs in a different format, there's no guarantee that such logs
   would be immediately useful to a content provider without
   significant post-processing.  Without the ability to see this
   utilization data, content providers often must mark some objects as
   uncacheable in order to generate adequate logs at their origin
   servers.

   Because caches in access networks are typically the first caching
   proxies hit by web surfers, they are a location where content
   providers may wish to exercise additional control and see detailed
   utilization information.  Because the interaction between the
   content provider and access network crosses one or more
   administrative boundaries, the relationship between the origin and
   access endpoints is said to be a "content peering" relationship.

   The Origin/Access Content Peering (OACP) implementation describes a
   solution to propagate control and accounting information between
   content provider origins and caches in access networks.  Roles are
   defined for intermediate networks that assist in the passing of
   control and accounting information.

1.2 OACP Components

   An implementation of OACP contains CONTENT INJECTORS, CONTENT
   RELAYS, ACCOUNTING RELAYS, and SURROGATES, as described in [4] and
   [5].  This draft assumes that the functionality of the systems
   described in [4] and [5] is understood by the reader.  The different
   networks participating in an OACP implementation use these systems
   in a particular way in order to provide the end-to-end control and
   accounting desired by the content provider.

   The components contributing to a full OACP implementation are shown
   in the following logical diagrams:

     Origin     CDN/Hoster   Operator        Access Provider
   ----------   ---------   ---------   ---------   --------
   |content |   |content|   |content|   |content|   |access|
   |injector|-->| relay |-->| relay |-->| relay |-->| cache|
   ----------   ---------   ---------   ---------   --------
     (many)     (several)     (few)     (several)    (many)

    CDN/Hoster       Operator          Access Provider
   ------------   ------------   ------------   --------
   |accounting|   |accounting|   |accounting|   |access|
   |  relay   |<--|   relay  |<--|   relay  |<--| cache|
   ------------   ------------   ------------   --------
    (several)        (few)        (several)      (many)

   A CONTENT INJECTOR is used to originate CONTENT SIGNALS on a content
   provider's origin/staging server.  CONTENT RELAY systems are used
   within HOSTER, CDN, OPERATOR, and ACCESS PROVIDER networks to
   forward CONTENT SIGNALS and act upon them, where applicable.
   ACCOUNTING RELAY systems are used within these same networks to
   forward logging information and perform tasks related to settlement.
   In order to simplify some peering tasks, roles are also defined for
   a central OPERATOR.

1.3 Terminology

   The following words are used as defined in [3]: ACCOUNTING, CDN,
   CONTENT SIGNAL, ORIGIN, PUBLISHER, SURROGATE

   ACCESS
      To obtain connectivity to the Internet, as through dial, cable
      modem, or leased line.

   ACCESS CACHE
      In general, an HTTP caching proxy [2] operated by an ACCESS
      PROVIDER on behalf of its user base.  ACCESS CLIENTS make use of
      an ACCESS CACHE through methods such as browser pointing or
      interception [1].  Note that in the OACP model, because an ACCESS
      CACHE performs some tasks directly on behalf of ORIGINS, it takes
      on a partial role as a SURROGATE.  In particular, an ACCESS CACHE
      participating in OACP MUST have all the functionality of a
      SURROGATE defined in [4] and [5].

   ACCESS CLIENT
      An entity that gains ACCESS to the Internet through an ACCESS
      PROVIDER.  An ACCESS CLIENT is typically represented by an HTTP
      user agent [2], such as a browser.

   ACCESS PROVIDER
      An entity that sells or otherwise gives ACCESS to the Internet.

   ACCOUNTING RELAY
      A system that aggregates ACCOUNTING data (such as logs) and
      passes it further UPSTREAM.

   CONTENT
      As defined in [3], though for the purposes of this document, only
      HTTP (not continuous media) is considered.

   CONTENT INJECTOR
      As described in [4], a system operating on behalf of an ORIGIN
      that sends CONTENT SIGNALS to a CONTENT RELAY.

   CONTENT PROVIDER
      A type of PUBLISHER.  Specifically, a CONTENT PROVIDER stores
      their content on an ORIGIN.

   CONTENT RELAY
      As described in [4], a system that forwards CONTENT SIGNALS.

   CONTENT RELAY SURROGATE
      A CONTENT RELAY that also acts as a SURROGATE.

   DISTRIBUTION
      In OACP, DISTRIBUTION refers specifically to the propagation of
      CONTENT SIGNALS in a DOWNSTREAM direction.

   DOWNSTREAM
      In OACP, the direction of flow from ORIGIN toward ACCESS
      PROVIDERS.

   HOSTER
      An entity that provides network connectivity for ORIGIN systems.
      For the purposes of this document, a HOSTER does not operate a
      CDN.

   INVALIDATION
      A message indicating that content associated with a particular
      URL has expired, and hence should no longer be served to ACCESS
      CLIENTS.

   OACP AGREEMENT
      See ORIGIN/ACCESS CONTENT PEERING AGREEMENT.

   OPERATOR
      A centralized entity that offloads some of the more menial tasks
      associated with OACP, such as billing for settlement, log file
      collection, and PROVISIONING.

   ORIGIN/ACCESS CONTENT PEERING AGREEMENT
      Agreement of an ACCESS PROVIDER to offer ACCESS CACHE resources
      (disk space for storing CONTENT, ACCOUNTING information, etc.) to
      a CONTENT PROVIDER, possibly in exchange for a fee.  For the
      purposes of this document, this term may be abbreviated simply to
      OACP AGREEMENT.

   PARENTING
      Arrangement of two caches in a hierarchy such that one cache (the
      "child") contacts another explicitly specified cache (the
      "parent") for the servicing of some/all HTTP requests.  Parenting
      is typically not performed across administrative boundaries.

   PEERED URL
      A URL that is subject to CONTENT PROVIDER control in ACCESS
      CACHES under the terms of an OACP AGREEMENT.  For example, if an
      OACP AGREEMENT indicates that an ACCESS PROVIDER will accept
      invalidation messages for all objects associated with a CONTENT
      PROVIDER'S Domain Name, all URLs that reference the CONTENT
      PROVIDER'S Domain Name are PEERED URLS under that OACP AGREEMENT.

   PEERING CONFIGURATION
      The configuration of CONTENT RELAYS, ACCOUNTING RELAYS, and
      (possibly) ACCESS CACHES necessary to fulfill the terms of an
      OACP AGREEMENT.

   PIN
      To store content in a cache in a manner such that it will not be
      deleted (such as by a garbage collection algorithm).

   PRE-LOAD
      The placing of content into a CONTENT RELAY SURROGATE before an
      ACCESS CLIENT has requested it.

   PROVISIONING
      The act of adding PEERING CONFIGURATION to CONTENT RELAYS,
      ACCOUNTING RELAYS, and (possibly) ACCESS CACHES.

   SURROGATE IDENTIFIER
      A unique identifier that may be assigned to each cache in an OACP
      implementation to identify where events have occurred.

   UPSTREAM
      In OACP, the direction of flow from ACCESS PROVIDERS toward an
      ORIGIN.

1.4 Comparison with "CDN Peering"

   Works such as [3] and related works submitted to the IETF have
   thoroughly documented architecture, taxonomy, framework, and areas
   for future work in "CDN Peering".  There is some overlap between
   those works and this document.
   There are, however, two major differences in the approach of this
   document compared to the body of work on CDN Peering:

   1) In OACP, attempts are made to document the specific roles of
      different types of networks, such as "access", "hoster", "CDN",
      and "operator".  Depending on how they are read, the works on CDN
      Peering may allow for these different types of networks if they
      are considered as specific "types" of CDNs ("verticals").
      However, in the light of the use of the word "CDN" in the
      industry to refer to a specific class of service providers, there
      is motivation to describe a content peering model that embraces
      the differences in different types of networks.

   2) While OACP does imply an architecture, it is largely focused on
      implementation.  Many of the methods described in this document
      and [4] or [5] for cross-network communication are in use today.

2. OACP Distribution

   Each participant in the OACP DISTRIBUTION chain is likely to have
   different requirements from a CONTENT RELAY system, as detailed in
   the following sections.

2.1 Hoster/CDN Content Relay Implementation

   A HOSTER/CDN typically requires a CONTENT RELAY to act as a single
   point of aggregation for CONTENT SIGNALS.  By combining CONTENT
   SIGNALS from several ORIGINS on a single CONTENT RELAY, the
   HOSTER/CDN simplifies PROVISIONING for DOWNSTREAM networks.

   A CDN provider may want the CONTENT RELAY to also act as a SURROGATE
   in their CDN.  In this situation, the CONTENT RELAY would be a
   CONTENT RELAY SURROGATE, responding to CONTENT SIGNALS by processing
   INVALIDATION and PRE-LOAD operations before passing them DOWNSTREAM.
   Such a CDN would then have the option of directing CONTENT requests
   (such as those coming from DOWNSTREAM systems) to the CONTENT RELAY
   SURROGATE rather than allowing the request to be serviced by an
   ORIGIN.
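
   The difference between an ordinary CONTENT RELAY and a CONTENT RELAY
   SURROGATE can be sketched as follows.  This is a minimal in-memory
   model, not the protocol defined in [4]: the class names, the
   "handle" and "forward" methods, the signal representation, and the
   placeholder origin fetch are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ContentSignal:
    kind: str  # "INVALIDATION" or "PRE-LOAD"
    url: str


@dataclass
class ContentRelay:
    """Forwards every CONTENT SIGNAL to its DOWNSTREAM relays."""
    name: str
    downstream: list = field(default_factory=list)
    received: list = field(default_factory=list)

    def handle(self, signal):
        self.received.append(signal)
        self.forward(signal)

    def forward(self, signal):
        for relay in self.downstream:
            relay.handle(signal)


@dataclass
class ContentRelaySurrogate(ContentRelay):
    """Acts on a signal locally, then passes it DOWNSTREAM."""
    store: dict = field(default_factory=dict)

    def fetch(self, url):
        # Placeholder: a real surrogate would issue an HTTP request here.
        return f"<body of {url}>"

    def handle(self, signal):
        if signal.kind == "INVALIDATION":
            self.store.pop(signal.url, None)
        elif signal.kind == "PRE-LOAD":
            self.store[signal.url] = self.fetch(signal.url)
        super().handle(signal)


# Hypothetical chain: CDN surrogate -> operator relay -> access surrogate.
access = ContentRelaySurrogate("access-relay")
operator = ContentRelay("operator-relay", downstream=[access])
cdn = ContentRelaySurrogate("cdn-relay", downstream=[operator])

cdn.handle(ContentSignal("PRE-LOAD", "http://www.example.com/index.html"))
print(sorted(cdn.store), sorted(access.store), len(operator.received))
```

   In this sketch the operator relay only records and forwards the
   signal, while both surrogates end up holding the pre-loaded object.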

2.2 Operator Content Relay Implementation

   By definition, an OPERATOR only assists in the peering between other
   entities and therefore would not typically operate a CONTENT RELAY
   SURROGATE, only an ordinary CONTENT RELAY.  See Section 4 for a
   description of some additional actions an OPERATOR may perform as
   CONTENT SIGNALS pass through a CONTENT RELAY.

2.3 Access Content Relay and Access Cache Implementation

   An ACCESS PROVIDER will typically use a small number of CONTENT
   RELAY SURROGATES to locally store CONTENT associated with PEERED
   URLS.  In such a configuration, an ACCESS PROVIDER would also be
   likely to:

   * Set their (many) ACCESS CACHES to use these CONTENT RELAY
     SURROGATES as PARENT caches for PEERED URLS.

   * PIN CONTENT associated with PEERED URLS in these CONTENT RELAY
     SURROGATES.

   * Limit the amount of PINNED storage in these CONTENT RELAY
     SURROGATES based on given parameters (e.g. per set of PEERED
     URLS).

   * Require a mechanism to deal with contention for PINNED storage in
     these CONTENT RELAY SURROGATES.

   * Require a mechanism in ACCESS CACHES to ignore cache heuristics
     (e.g. expiration headers) for PEERED URLS, since explicit CONTENT
     SIGNALS for all PEERED URLS would be guaranteed to come from the
     ORIGIN.

3. OACP Accounting

   Each participant in the OACP ACCOUNTING chain may have different
   requirements from an ACCOUNTING RELAY system, as detailed in the
   following sections.

3.1 Hoster/CDN Accounting Relay Implementation

   A HOSTER/CDN ACCOUNTING RELAY acts as a "sink" for logging
   information, and is therefore typically not required to pass logging
   information over the "last hop" to a CONTENT PROVIDER.  This is
   because CONTENT PROVIDERS may have firewalls or other security
   mechanisms preventing an incoming HTTP POST.  Therefore, it is
   generally left up to the CDN/HOSTER to provide an interface for
   CONTENT PROVIDERS to view/retrieve log information.
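
   One way to picture this "sink" behavior is a relay that accepts logs
   pushed from DOWNSTREAM but hands them to a CONTENT PROVIDER only
   when the provider asks for them.  The sketch below is an assumption-
   laden illustration: the class name, its "post" and "retrieve"
   methods, the provider key, and the log record strings are
   hypothetical and are not the format or interface defined in [5].

```python
from collections import defaultdict


class AccountingSink:
    """Holds logs for CONTENT PROVIDERS until they retrieve them."""

    def __init__(self):
        # Maps a content provider name to its pending log records.
        self._pending = defaultdict(list)

    def post(self, provider, records):
        """Accept records pushed UPSTREAM by a DOWNSTREAM ACCOUNTING RELAY."""
        self._pending[provider].extend(records)

    def retrieve(self, provider):
        """Return and clear the pending records; called by the provider (pull)."""
        return self._pending.pop(provider, [])


sink = AccountingSink()
sink.post("www.example.com", ["access-cache-17 GET /index.html 200"])
sink.post("www.example.com", ["access-cache-42 GET /logo.gif 304"])

# The content provider initiates the connection, so no inbound POST
# has to cross its firewall.
records = sink.retrieve("www.example.com")
print(records)
```

   Because the provider starts every exchange, the sink never needs to
   open a connection toward the provider's network.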

3.2 Operator Accounting Relay Implementation

   An OPERATOR typically passes logs from ACCESS PROVIDERS to
   CDN/HOSTER networks, as defined by OACP AGREEMENTS.  See Section 4
   for a description of some additional actions an OPERATOR may perform
   as logs pass through an ACCOUNTING RELAY.

3.3 Access Provider Accounting Relay Implementation

   An ACCESS PROVIDER would typically use an ACCOUNTING RELAY to
   combine the logs of all their ACCESS CACHES and CONTENT RELAY
   SURROGATES in order to post a single summary UPSTREAM (to an
   OPERATOR ACCOUNTING RELAY, for example).

3.4 Access Provider Access Cache and Content Relay Surrogate
    Implementation

   ACCESS CACHES and CONTENT RELAY SURROGATES act as a "source" for
   logging information, and therefore must originate logs in a common
   format and be able to pass them UPSTREAM, as described in [5].

4. Operator Roles

   Much like NAP operators eased peering configuration and data
   exchange in the early days of bandwidth peering, there is an
   analogous role for an OPERATOR to assist in OACP.  Some of the roles
   handled by an OPERATOR include:

   o Centralized Content Relay systems

     By providing CONTENT RELAY systems in locations accessible to both
     CDN/HOSTER and ACCESS PROVIDER participants, the OPERATOR can help
     ease PEERING CONFIGURATION.  For example, a CDN/HOSTER can
     establish an agreement with an OPERATOR such that all CONTENT
     SIGNALS for the CDN/HOSTER will be sent to an OPERATOR CONTENT
     RELAY, and the OPERATOR CONTENT RELAY will pass these CONTENT
     SIGNALS to several ACCESS PROVIDERS.  This prevents the CDN/HOSTER
     from needing to perform PROVISIONING for each ACCESS PROVIDER.

   o Content Signal validation

     For the protection of DOWNSTREAM participants, an OPERATOR may
     ensure the accuracy of CONTENT SIGNALS from UPSTREAM sources.  For
     example, an OPERATOR may ensure that an ACCESS PROVIDER only
     receives CONTENT SIGNALS for their PEERED URLS.

   o Authorization for rate of refresh

     In order to prevent excessive flooding, an OPERATOR may enforce
     limits on how frequently CONTENT SIGNALS are sent between OACP
     peers.

   o Centralized Accounting Relay systems

     Centralized ACCOUNTING RELAY systems can ease PEERING
     CONFIGURATION and aggregation of utilization data.  For example, a
     CDN/HOSTER can establish an agreement with an OPERATOR such that
     log data from all ACCESS PROVIDER peers will be interleaved into a
     single log file before being sent to the CDN/HOSTER ACCOUNTING
     RELAY.

   o Settlement

     Because ACCESS PROVIDERS are providing control of their cache
     resources and utilization data, they may wish to receive
     reimbursement from CONTENT PROVIDERS for this privilege.  An
     OPERATOR providing centralized ACCOUNTING RELAY systems may choose
     to perform usage-based billing for this privilege by analyzing
     utilization logs.

   o Fraud detection

     Because ACCESS PROVIDERS may bill CONTENT PROVIDERS as just
     described, CDN/HOSTER participants may fear that ACCESS PROVIDERS
     will create bogus utilization data to generate income.  An
     OPERATOR providing centralized ACCOUNTING RELAY systems may choose
     to provide log analysis services to detect this type of fraud.

5. Security

   Implementation of OACP relies entirely on the functionality of
   CONTENT RELAY and ACCOUNTING RELAY systems.  See [4] and [5] for the
   security requirements of these systems.

6. Acknowledgements

   The authors acknowledge the contributions and comments of Arthur
   Huston (Inktomi), Paul Francis (Inktomi), and Brad Cain (Mirror
   Image).

References

   [1]  I. Cooper, I. Melve, G. Tomlinson, "Internet Web Replication
        and Caching Taxonomy", draft-ietf-wrec-taxonomy-05.txt (work in
        progress), July 2000

   [2]  R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P.
        Leach, T. Berners-Lee, "Hypertext Transfer Protocol --
        HTTP/1.1", RFC2616, June 1999

   [3]  M. Day, B. Cain, G. Tomlinson, "A Model for CDN Peering",
        draft-day-cdnp-model-03.txt (work in progress), November 2000

   [4]  P. Rzewski, B. Cain, N. Robertson, "Cross-Network Distribution
        of Content Signals for HTTP", draft-rzewski-cndistcs-00.txt
        (work in progress), November 2000

   [5]  P. Rzewski, B. Cain, N. Robertson, "Cross-Network Accounting
        for HTTP", draft-rzewski-cnacct-00.txt (work in progress),
        November 2000

Authors' Addresses

   Phillip A. Rzewski
   Inktomi Corporation
   4100 East Third Avenue
   MS FC1-4
   Foster City, CA 94404
   USA
   Phone: +1 650 653 2487
   Email: philr@inktomi.com

   Joe Bai
   Adero
   529 Main Street
   Boston, MA 02129
   US
   Phone: +1 617 241 3600
   Email: joe.bai@adero.com

   Niel Robertson
   Exodus
   2900 Nautilus Court
   Boulder, CO 80301
   US
   Phone: +1 303 381 2302
   Email: nielr@servicemetrics.com

Full Copyright Statement

   Copyright (C) The Internet Society (2000).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC editor function is currently provided by the
   Internet Society.