Network Working Group                                              D. Li
Internet-Draft                                       Cisco Systems, Inc.
Expires: May 21, 2002                                          I. Cooper
                                                       Personal capacity
                                                               M. Dahlin
                                                     University of Texas
                                                             M. Hamilton
                                                 JANET Web Cache Service
                                                       November 20, 2001

Requirements for a Resource Update Protocol
draft-ietf-webi-rup-reqs-02.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on May 21, 2002.

Copyright Notice

Copyright (C) The Internet Society (2001). All Rights Reserved.

Abstract

This document establishes the requirements for a Resource Update Protocol which may be used in conjunction with World-Wide Web intermediary systems such as caching proxies and surrogate servers to facilitate cache coherence. It is envisaged that RUP will include invalidation of previously cached objects as a key feature, while providing hooks for future extensions to richer functionalities, such as directing a surrogate to retrieve content using delta encoding or IP multicast.

Li, et al.                Expires May 21, 2002                  [Page 1]

Internet-Draft              RUP Requirements               November 2001

The main goal is to enable proxy caching and content distribution of large amounts of frequently changing web objects, where periodically revalidating objects one by one is unacceptable in terms of performance and/or cache consistency.

Table of Contents

1. Introduction
2. Terminology
2.1 Definitions
3. Design Guidelines
4. Scoping Requirements
5. Use Cases
5.1 Intra-CDN
5.2 Inter-CDN
5.3 Content provider to CDN
5.4 Content provider to arbitrary Web intermediary
5.5 Operations
5.5.1 RUP SERVER-driven invalidation
5.5.2 RUP CLIENT-driven invalidation
5.5.3 Content location update
5.5.4 Content prefetch hint
5.5.5 Content updates
5.5.6 Metadata updates
6. Functional Requirements
6.1 Coherence Model
6.1.1 Confirmation of actions
6.1.2 Loose consistency
6.1.3 HTTP Warnings
6.1.4 Transitioning into/out of RUP control
6.1.5 Express resynchronization
6.2 Naming and Framing - Synchronization groups
6.3 Naming and Framing - Notification groups
6.4 Naming and Framing - Notification and extensibility
6.5 Client-Server Interaction
6.6 Network and Host Environment
6.7 Host-to-host Communication
7. Security Considerations
8. Acknowledgements
References
Authors' Addresses
A. Change Log
Full Copyright Statement

1. Introduction

Web cache invalidation and cache coherence protocols enable cooperation among content servers and web intermediaries by eliminating the round trips in a per-request cache validation model using HTTP conditional request directives (e.g. If-Modified-Since).

Current practices provided by existing HTTP cache control mechanisms appear unsatisfactory for many content providers. Existing mechanisms are based on the assumption that the expiration time for a given entity can be known at the moment it is delivered; there is a desire by content providers to be able to change a given entity and for users to see this change immediately. This desire leads us to suggest that there is a need for server-driven updates.

Cache coherence and invalidation are discussed in detail in the caching literature; see, e.g., [8] and [9] for background information.

A number of cache coherence or cache invalidation protocols have been proposed by the research community and the caching and content distribution industry. Approaches vary, with some proponents seeking to enhance existing protocols and others developing new protocols either specifically for this purpose or which include this functionality.
Examples include WCIP [1], PSI [2] and DOCP [3].

A carefully developed mechanism for the communication of information about changes to Internet resources offers the potential for other functions above and beyond invalidation of cached objects. Resource updates may also be an appropriate way of informing systems which generate content dynamically that the underlying data which they manipulate (e.g. to produce HTML pages) has changed. Such uses are currently outside the scope of consideration in this requirements document.

The goal of this document is to outline requirements for a Resource Update Protocol (RUP), whose purpose is to improve cache coherence over HTTP's conditional request polling technique, in order to (a) provide stronger guarantees at a similar cost; or (b) provide similar guarantees at a lower cost.

This document will be used to determine the suitability of protocols proposed by developers for the purpose of supporting resource updates. It is also, naturally, intended as a reference for such developers.

[Ed note: This paragraph needs serious work]

For the reasons described above, we will refer to an abstract Resource Update Protocol (RUP, or simply 'the protocol') whose functionality will initially be limited to simple invalidation of cached objects, as a basis for future extensions to richer functionality. However, general "meta data" is allowed, and accommodated as concrete payload types and arbitrary payload options (see the section on "notification and extensibility"). Note that RUP is at least conceptually a new protocol, but may in practice be based wholly or partly on existing protocols.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [4].
This document uses terms defined in the Internet Web Replication and Caching Taxonomy [5], and the HTTP/1.1 specification [6]. The reader is expected to be familiar with these documents.

The term SURROGATE is used to refer to a demand driven surrogate origin web server unless explicitly stated otherwise. ORIGIN SERVER refers to the master origin server.

2.1 Definitions

RUP SERVER
   The entity that knows the state of a content provider's resources and generates resource updates. Notionally co-located with the origin server.

RUP CLIENT
   The entity that needs to know the state of resources and receives updates from the RUP SERVER. Notionally co-located with a SURROGATE.

INVALIDATION
   The signal from a RUP SERVER to a RUP CLIENT to indicate that the master copy of an entity has changed.

CONSISTENCY GUARANTEE
   To be defined...

RESOURCE GROUP
   To be defined...

[Ed note: additional wording followed, but this seems like a requirement of some kind, not something that belongs in the terminology. It also might suggest that INVALIDATION is not the right term to be using.

"RUP must clearly define the actions this signal implies or mandates and the semantics the actions accomplish, in relationship to the RUP coherence models. E.g., RUP may specify (among many things) that a RUP client must not serve an invalidated cached object without an if-modified-since HTTP query."]

3. Design Guidelines

The following is a list of guidelines that protocol designers should keep in mind.

[Ed note: These are Ian's interpretations of the design guidelines in the -01 version and as such are up for discussion.]

1. The protocol SHOULD be extensible to accommodate richer functionalities in subsequent versions of the protocol.

2. The protocol SHOULD leverage existing technologies where possible (e.g. XML, HTTP, URIs).

3. Protocols conforming to this requirements document SHOULD interoperate or co-exist with each other.

4.
INVALIDATIONS are expected to be sent to tens of thousands of SURROGATES. The protocols SHOULD provide reasonable functionalities to enable scaling to such numbers (e.g. by the use of relays).

5. The protocol MUST define levels of compliance and document any side effects that might arise should lesser levels of compliance be used.

6. The protocol SHOULD make expensive operations OPTIONAL.

4. Scoping Requirements

The Resource Update Protocol is expected to be integrated into a variety of environments: origin servers and their delegates; Web intermediaries including surrogates and caching proxies.

The Resource Update Protocol SHOULD NOT be implemented in user agents (e.g. browsers, search engine crawlers etc.). See Denial of Service issues in the Security Considerations for issues of building RUP into user agents.

[Ed note: the following paragraph is broken. This is likely because the term SURROGATE is broken for us here. Help.]

The motivation for RUP is to provide an open-standard method of creating a tighter binding between content provider and content delivery SURROGATES. When conflicts between content delivery SURROGATES and caching proxies arise the protocol SHOULD accommodate the SURROGATES by making OPTIONAL ...

5. Use Cases

This section provides examples of how it is anticipated RUP will be deployed. While it is noted that these scenarios are only slightly different, it is felt beneficial to identify them for the benefit of potential protocol developers.

Particular attention is brought to the fact that RUP may be used to interface between systems running alternative protocols beyond the boundaries defined here (e.g. individual CDNs may choose to operate proprietary protocols internally but speak RUP externally) or that RUP may be used on both sides of the system (e.g. between content provider and CDN and also within that CDN).

5.1 Intra-CDN

A Content Distribution/Delivery Network may choose to use RUP to pass INVALIDATIONS from a notional single control node to intermediaries under its control.

CDNs must typically have control protocols in order to ensure they serve the content requested by their customers (the content providers). However, such protocols are currently proprietary in nature. Standardization within the IETF will enable CDN operators to share knowledge in a beneficial way, and with a wider technical community.

5.2 Inter-CDN

Content Networks may internetwork, sharing the responsibility of serving content requests to end users. As such, INVALIDATIONS will also have to be shared among the cooperating networks. As an open standard, RUP would enable multiple Content Networks to share INVALIDATIONS among each other, while allowing each Network to operate an invalidation protocol of its own choosing internally.

5.3 Content provider to CDN

When using the services of a CDN a content provider must use protocols or tools provided by that CDN in order to pass invalidation or update messages so that the CDN can then propagate that information to its own surrogates.

In the absence of a standardized interface, a content provider may become somewhat tied to the use of services from one CDN because of the "pain" in changing to another CDN.

5.4 Content provider to arbitrary Web intermediary

Caching proxies provide "best effort" content delivery capabilities on behalf of the network operators running them. In an effort to ensure end users see the content they wish to be seen, at all times, content providers typically make material appear uncacheable. If a tighter binding were provided between the content providers' systems and caching proxies, such devices could offload some of the requests (using additional information to serve otherwise "uncacheable" content) to the benefit of the content provider.

This use case is similar to the "Content provider to CDN" case above.

5.5 Operations

5.5.1 RUP SERVER-driven invalidation

A RUP SERVER sends entity or resource group INVALIDATIONS to RUP CLIENTS. The RUP SERVER (and its administrator) controls the activity according to the RUP SERVER's load, scheduling, and configured preference. The connection between RUP CLIENT and RUP SERVER MAY be established by either party; connections SHOULD be persistent. It SHOULD be possible to monitor the update guarantee.

RUP SERVER-driven invalidations MUST be supported by candidate protocols.

5.5.2 RUP CLIENT-driven invalidation

A RUP CLIENT queries the RUP SERVER for the freshness status of an entity or group of objects referenced by URI. The RUP CLIENT controls RUP activity according to its load, failure recovery needs and configured preferences. The RUP SERVER replies to requests with the latest changes since the last time the RUP CLIENT asked.

[Ed note: initial text read "...based on information such as Etag, timestamp, and/or version number." Is it appropriate to have that text in this document?]

Whether and when the RUP CLIENT communicates with the RUP SERVER is determined by the consistency guarantee the RUP CLIENT is committed to provide, and MUST [??] follow the semantic rules defined by the RUP protocol.

RUP CLIENT-driven invalidations MUST be supported by candidate protocols.

5.5.3 Content location update

The RUP SERVER is able to designate a new location as the source for an entity. The new location may be, for example, a new URI, a parent caching proxy, a CDN peer, or a multicast object distribution channel.

[Ed note: Should we draw an analogy with HTTP 30x responses?]

Support of content location update is OPTIONAL.

5.5.4 Content prefetch hint

The RUP SERVER tags entities or RESOURCE GROUPS as suitable for prefetch. RUP CLIENTS MAY prefetch the content and pin it in their corresponding caches.
Prefetch hints enable content to be prepositioned in caching systems that wish to cooperate.

Support of content prefetch hint is OPTIONAL.

5.5.5 Content updates

RUP proposals SHOULD NOT provide mechanisms for providing content updates. Any support of content updates is OPTIONAL.

Content updates refer to the scenario where a RUP SERVER sends either the full content of updates to entities, or deltas of those changes, to RUP CLIENTS.

At the time of writing there is insufficient understanding of the kinds of content updates that RUP might need to support. In particular we are aware that mixing signalling with data leads to problems of scaling, object consistency, and security, among others. Existing mechanisms addressing content retrieval, e.g., HTTP [6] and Delta Encoding [7], demonstrate the high complexity of such functionality. Content prefetch hints enable participating RUP CLIENTS to invalidate and fetch content with somewhat similar results to content updates.

5.5.6 Metadata updates

[Ed note: place holder; we lost a comment about RUP not being used to "update content or HTTP meta information of cached entities." This is still up for discussion, so this section is just to make sure it doesn't get lost.]

6. Functional Requirements

6.1 Coherence Model

6.1.1 Confirmation of actions

A RUP SERVER MUST be able to specify whether actions it requests of a RUP CLIENT are acknowledged or not. In a unicast environment a RUP SERVER SHOULD request acknowledgements.

Acknowledgements from RUP CLIENTS SHOULD be sent as a result of carrying out an INVALIDATION.

Where relays are used in a single control domain the relay MUST preserve the semantics of acknowledgement. E.g. a relay MUST wait for every child's acknowledgement before acknowledging to the RUP SERVER. Semantics MUST be preserved irrespective of the number of levels of relays.
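
The acknowledgement rule for relays can be illustrated with a minimal sketch. It is not part of any candidate protocol: the Node class, its fields, and the invalidate() method are hypothetical names chosen for illustration, and network failure is simulated with a boolean flag. The sketch only shows the intended semantics, namely that a relay acknowledges upstream only when every child below it, at any depth, has acknowledged.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical model: a RUP CLIENT (no children) or a relay (has children)."""
    name: str
    children: List["Node"] = field(default_factory=list)
    reachable: bool = True  # assumption: stands in for network or host failure

    def invalidate(self, uri: str, log: Dict[str, bool]) -> bool:
        """Apply an INVALIDATION and return True if it can be acknowledged.

        A leaf acknowledges only after carrying out the invalidation.
        A relay acknowledges only if every child acknowledged, so the
        meaning of an acknowledgement is the same at every relay level.
        """
        if not self.reachable:
            log[self.name] = False
            return False
        if not self.children:
            # A real client would mark the cached entity for `uri` stale here.
            log[self.name] = True
            return True
        # Forward to every child before deciding; do not stop at the first failure.
        results = [child.invalidate(uri, log) for child in self.children]
        acked = all(results)
        log[self.name] = acked
        return acked


# Two levels of relays; one surrogate is unreachable.
tree = Node("relay-1", [
    Node("relay-2", [Node("surrogate-a"),
                     Node("surrogate-b", reachable=False)]),
    Node("surrogate-c"),
])
log: Dict[str, bool] = {}
acked = tree.invalidate("http://example.com/index.html", log)
print(acked, log)
```

In this run the RUP SERVER would receive no positive acknowledgement from relay-1, because surrogate-b did not carry out the INVALIDATION, while the per-node log still shows which nodes did.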

6.1.2 Loose consistency

To support applications that do not require tight coupling (e.g. batch mirroring), RUP SHOULD make loose consistency available. In particular RUP SHOULD provide delta consistency guarantees such that a RUP SERVER MAY specify a maximum acceptable staleness, defined in seconds.

If a cached entity is updated then within the period covered by the delta consistency guarantee the RUP CLIENT will either (a) be notified of the update, or (b) detect that its cache is no longer synchronized with the server.

[Ed note: what does that latter part actually mean?]

6.1.3 HTTP Warnings

A cache SHOULD [Ed note: MUST?] never return stale RUP-controlled entities without an appropriate HTTP/1.1 Warning [6].

[Ed note: Some description of the warning code chosen, especially if an existing code is used in this context]

6.1.4 Transitioning into/out of RUP control

To transition from HTTP cache control to RUP controlled coherence a RUP CLIENT MUST first consider the resource stale and revalidate via RUP.

To transition from RUP controlled coherence to HTTP cache control a RUP CLIENT MUST first consider the resource stale and revalidate via HTTP.

6.1.5 Express resynchronization

[Ed note: unchanged. Question: Is this common sense (belongs in a more "discussion" oriented section), or does it need to be specified as a requirement? What are "foreground" and "background"?]

It is essential that the protocol support "express resynchronization". I.e., if a RUP CLIENT becomes de-synchronized from a RUP SERVER, the client should be able to reconnect (resynchronize) quickly. There are a number of ways to support this, e.g., batch revalidation based on version numbers, incremental background revalidation, incremental foreground revalidation, delayed invalidation, log playback, etc.
It is up to the RUP protocol design to decide on the specific mechanisms for revision control and express resynchronization of resources.

6.2 Naming and Framing - Synchronization groups

[Ed note: Need a description of what synchronization groups are and what they accomplish here. In the following points there's descriptive text mixed with requirements - we need to break them apart.]

[Ed note: limited edits from here on; only nouns changed below - lots and lots of editing is still required.]

1. The protocol MUST enable definition of a "synchronization group" of objects, which is a group of objects about which a RUP CLIENT can subscribe to receive notifications. Synchronization groups represent the granularity of synchronization. RUP SERVERS must only send notifications about resources in a synchronization group to RUP CLIENTS that have requested notifications for that group.

2. The policy to group resources into synchronization groups is outside the scope of RUP. Grouping may be determined by the content provider, CDN operator, traffic analysis tools, or other means. RUP is not required to provide dynamic negotiation between the RUP SERVER and RUP CLIENT over the composition of a resource group. In other words, "targeted invalidation", in which a RUP SERVER only sends an invalidation about entity X to RUP CLIENTS that have registered callbacks on entity X, is out of scope for the initial version of RUP. This restriction is motivated by complexity and scalability concerns about RUP SERVERS (and RUP CLIENTS) having to negotiate and maintain individual views of resource groups for all the RUP CLIENTS (and RUP SERVERS) they speak to. It is anticipated that predefined resource groups will fit well with the majority of the RUP deployment cases (surrogates, mirror sites, and CDNs).

3.
RUP must support "in band" and "out of band" means to describe the composition of a synchronization group. Out-of-band assignment of an entity to a synchronization group means that assignment occurs outside of the RUP information exchange procedure (e.g., in the entity's HTTP header). In-band assignment would include a synchronization group definition message in RUP. An out-of-band, HTTP-header-based specification is simple and makes it efficient to determine to which group an object belongs, while in-band specification should be supported so that RUP is self-contained. In particular, CDN operators would prefer not having to change the origin web server before anything can be put into a resource group, but rather self-describe the composition within the group.

4. The protocols for describing synchronization group composition should be efficient with respect to both network transmission and client-matching logic for both in-band and out-of-band protocols. Network transmission should support descriptions that grow less than linearly with the number of objects in a group (e.g., URI prefix or regular expression matching) but they may also support listing of objects. "Out of band" protocols may allow a header to indicate that the current object is part of a particular synchronization group (rather than fully specifying the membership of the synchronization group). In order to service reads efficiently, it must be possible to implement matching logic so that the work to determine which synchronization group(s) a cached object is a member of grows much more slowly than linearly in the number of synchronization groups. (For example, consider a hypothetical RUP protocol where the membership of a synchronization group is specified by a short list of URI prefixes. Such URI prefixes can be organized into a tree so that given an object URI, the enclosing URI prefix can be found in work logarithmic in the number of URIs.
Conversely, it may be more difficult to determine which (if any) arbitrary regular expression matches a given URI, so allowing synchronization groups to be defined by arbitrary regular expressions may limit scalability of RUP CLIENTS.)

5. The protocol should allow RUP SERVERS to be common with or disjoint from data servers. Therefore, in addition to specifying the collection of URIs that belong to a synchronization group, an in-band or out-of-band definition of a synchronization group must also specify protocol and server information that indicate with whom to communicate (e.g., "protocol://example.com/channel7").

6.3 Naming and Framing - Notification groups

1. The protocol must enable atomic notification regarding an arbitrary "notification group" of resources. For example, the protocol must be able to invalidate a list of multiple URIs with one message.

2. The protocol must enable efficient notification regarding a pre-specified group of resources (i.e., it must be possible to define a group where a notification that applies to all members of a group can be transmitted with network bandwidth that grows less than linearly with the size of the group). For example, the protocol might implement this requirement by allowing a notification group to be specified in different ways such as (1) by a list of URIs included in the message, (2) by a regular expression or path prefix that refers to a set of URIs, and/or (3) by a URI that itself refers to a (possibly hierarchical) list of URIs.

6.4 Naming and Framing - Notification and extensibility

1. As an extension mechanism, a RUP message MAY carry a set of options for a notification group of resources. The set of options may be empty. RUP must specify a generic option format and define the content prefetch hint and content location update as options. No option is mandatory.
A RUP CLIENT MAY ignore any options it doesn't recognize or doesn't want to support. If positive acknowledgement is requested, the acknowledgement MUST indicate the options the RUP CLIENT has carried out.

2. The protocol MUST define an extensible format for RUP messages that is capable of carrying a variety of payloads. Possible payloads include (1) cache invalidation, (2) content location update, (3) content prefetch hints, (4) removal and addition of resources to a resource group, (5) adjustments to cache consistency parameters, etc. While the above payloads may share the same RUP mechanism, it is not a requirement for the initial protocol to address all of them simultaneously.

6.5 Client-Server Interaction

1. The protocol must define "RUP CLIENT" and "RUP SERVER" roles. It SHOULD be possible for either the RUP SERVER or the RUP CLIENT to initiate information exchange. (E.g., the name "RUP SERVER" does not require that entity to be a server as defined in [6].)

2. We anticipate that the primary RUP CLIENTS and RUP SERVERS will be Web intermediaries and origin servers, although the protocol SHOULD NOT be designed so as to preclude use by other entities. For example, the origin server(s) MAY delegate the role of RUP SERVER to a CDN which operates dedicated content signaling channels and servers.

3. The protocol SHOULD be designed to scale to systems where there are a large number (more than 10,000) of SURROGATES of a given origin server. This may require multiple levels of intermediary relay points and/or IP multicast.

4. The protocol SHOULD be capable of operating efficiently on a wide variety of underlying media; high-latency satellite links in particular will need to be considered. E.g., caching vendors have TCP optimizations that an administrator can turn on if the link is satellite. A reliable multicast protocol would use more FEC if the link is asymmetric.
The analogy in RUP is that RUP should be able to turn off any client->server messages (such as ACKs and client-driven updates) if the link is satellite or if the transport is IP multicast.

5. To support sequential consistency and monitoring of the RUP CLIENTS, it must be possible to determine whether resource update messages have been missed, e.g. due to a RUP CLIENT or RUP SERVER being down or unreachable. There must be a feedback mechanism which enables the RUP SERVER to determine the extent to which resource updates have propagated to SURROGATES and been carried out. E.g., if the feedback from a SURROGATE never comes back or comes back as failed, the RUP SERVER may either delay publishing the content, syslog the failure, disable the SURROGATE that sent the failure code, or stop content routing to that CDN peer, etc. Specific deployments must be able to choose whether or not to operate with feedback.

6. It SHOULD be possible to reach a consistent state on all SURROGATES of a given origin server and collection of resources. I.e., RUP MUST guarantee that a RUP CLIENT either sees an update as intended or is able to detect that it might have missed an update.

7. The protocol SHOULD define the failure mode, i.e., the interaction and assumptions of the RUP SERVER and RUP CLIENT in the presence of failure, in order to clearly define and preserve the semantic guarantees that can be offered in failure modes. In particular, if so desired, a RUP SERVER MUST be able to detect if some RUP CLIENTS didn't receive an update or didn't carry out an action, e.g., via positive acknowledgement, even if there's a network failure or client failure. Similarly, a RUP CLIENT MUST be able to detect if the RUP SERVER or the network connection to the RUP SERVER has failed and, if so, automatically perform appropriate actions to expire contents that are potentially stale.

6.6 Network and Host Environment

1.
The protocol SHOULD be useable both in a SURROGATE/origin server relationship and a traditional caching proxy/origin server relationship. The protocol SHOULD also be general enough to be useable in content delivery network (CDN) environments to allow freshness control of CDN delivery nodes.

2. It MUST be possible for the protocol to be used in an environment where some or all communications are mediated through a firewall or other intermediary device. The protocol design MUST identify issues involved in firewall traversal and provide ways by which these may be avoided or circumvented. These may not be explicitly security related concerns, e.g. working around any problems caused by use of Network Address Translation.

[Ed note: I'm not sure the above can be true - all firewalls are different and I don't believe any protocol designer can anticipate everything that might happen. It might simply be impossible for RUP to punch a pinhole in the firewall without reconfiguration of the firewall itself.]

3. It MUST be possible for the protocol information to be relayed (single source, single destination) and/or be broadcast (single source, multiple destinations) by RUP proxies. It MUST be possible for the protocol information to be cached (e.g., for broadcasting purposes) by RUP proxies. RUP MUST guarantee that the invalidation effect of a relayed message on a compliant destination is the same as if the message reached the destination directly from the source.

[Ed note: That's the first use of "RUP proxies". Are they "relays"? Looks like we need some more terminology.]

6.7 Host-to-host Communication

1. The protocol SHOULD layer cleanly and independently on top of the underlying communication layers, e.g., TCP, HTTP, BEEP, or SOAP.
The protocol semantics and message formats SHOULD be self-contained in that they stay the same regardless of the underlying transport, and thus portable to different transports.

2. The protocol MUST allow the information transferred between the RUP SERVER and RUP CLIENT to be authenticated and if necessary encrypted. In particular, RUP should be layered on top of the TLS and SASL mechanisms, so as to accommodate security needs in various current and future deployment scenarios, without having to enumerate them in the RUP specification itself. If either the RUP SERVER or RUP CLIENT is not willing or able to support the security profile of a particular RUP session, the session MUST NOT be initiated. Instead, an alternate session (and its security profile) MAY be provided for them, e.g., via alternate configuration or an out-of-band discovery mechanism.

3. RUP is required to identify and be able to work with at least one mechanism providing for discovery of RUP sessions. For example, RUP service could be manually configured, sent as header information in the HTTP responses from the origin server, or distributed via a separate out-of-band mechanism. Such mechanisms are not required to be specified within RUP. This should not preclude or be a pre-requisite for development of the protocol per se.

7. Security Considerations

Intermediaries open up a large number of new security problems which do not exist in the classical end-to-end model of the Internet, by introducing a 'Man In The Middle' by design. As such, it is essential that this protocol level work on intermediaries takes care to devise means by which the integrity of the resources being updated can be preserved - or at least tested.

Following are the security threat models:

1.
Authentication and authorization: the protocol SHOULD be able to authenticate the parties of a session before commencing the session, so that no unauthorized parties may participate in the session.

2. Session integrity: the protocol SHOULD be able to ensure the integrity of the resource update messages, so that any tampering can be detected and/or recovered from.

3. Session secrecy: the protocol SHOULD be able to ensure the secrecy of the resource update messages, so that no unauthorized parties can eavesdrop on the session.

4. Denial of service: the protocol SHOULD provide countermeasures against denial-of-service attacks, such as a distributed SYN flood or a resource update storm.

[Ed note: Surely SYN flood operates at a level too far below anything that RUP would touch directly.]

5. In-band security: the protocol SHOULD be able to prevent trusted parties from flooding and/or disabling the RUP service, accidentally or intentionally.

The above major risks associated with the protocol MUST be quantified and specifically addressed by the protocol design.

8. Acknowledgements

Thanks to Mark Nottingham, Oskar Batuner, Mark Day, Phil Rzewski, Paul H. Gleichauf, Fred Douglis, Lisa Dusseault, Ted Hardie, Joe Touch, Brad Cain, Hilarie Orman, Lisa Amini, Joseph Hui, Alex Rousskov, Stephane Perret, Darren New, Christian Maciocco, Michael Condry, Renu Tewari, Tao Wu, and the rest of the WEBI mailing list for their contributions.

The JANET Web Cache Service is funded by the Joint Information Systems Committee of the UK Higher and Further Education Funding Councils (JISC).

References

[1] Li, D., Cao, P. and M. Dahlin, "WCIP: Web Cache Invalidation Protocol", draft-danli-wrec-wcip-00.txt (work in progress), November 2000.

[2] Krishnamurthy, B. and C. Wills, "Piggyback server invalidation for proxy cache coherency", In Computer Networks and ISDN Systems, Volume 30, 1998.

[3] Dilley, J., Arlitt, M., Perret, S. and T. Jin, "The Distributed Object Consistency Protocol", Technical Report HPL-1999-109, September 1999.

[4] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[5] Cooper, I., Melve, I. and G. Tomlinson, "Internet Web Replication and Caching Taxonomy", RFC 3040, January 2001.

[6] Fielding, R., Gettys, J., Mogul, J., Nielsen, H., Masinter, L., Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

[7] Mogul, J., Clemm, G., van Hoff, A., Douglis, F., Feldmann, A., Krishnamurthy, B. and D. Hellerstein, "Delta encoding in HTTP", draft-mogul-http-delta-10 (work in progress), October 2001.

[8] Belloum, A. and L. Hertzberger, "Maintaining Web cache coherency", In Information Research, Volume 6 No. 1, October 2000.

[9] Gwertzman, J. and M. Seltzer, "World-Wide Web Cache Consistency", In Proceedings 1996 USENIX Technical Conference, January 1996.

[10] Liu, C. and P. Cao, "Maintaining Strong Cache Consistency in the World-Wide Web", In Proceedings ICDCS97, May 1997.

[11] Yin, J., Alvisi, L., Dahlin, M. and C. Lin, "Using Leases to Support Server-Driven Consistency in Large-Scale Systems", In Proceedings ICDCS98, May 1998.

[12] Duvvuri, V., Shenoy, P. and R. Tewari, "Adaptive Leases: A Strong Cache Consistency Mechanism for the World Wide Web", In Proceedings INFOCOM, 1999.

[13] Li, D. and D. Cheriton, "Scalable Web Caching of Frequently Updated Objects using Reliable Multicast", In Proceedings USITS, October 1999.

[14] Yin, J., Alvisi, L., Dahlin, M. and C. Lin, "Hierarchical Cache Consistency in a WAN", In Proceedings USITS, October 1999.

[15] Yu, H., Breslau, L. and S. Shenker, "A Scalable Web Cache Consistency Architecture", In Proceedings SIGCOMM, 1999.

[16] Yin, J., Alvisi, L., Dahlin, M. and A. Iyengar, "Engineering server-driven consistency for large scale dynamic web services", In Proceedings WWW10, 2001.

[17] Cohen, J. and S. Aggarwal, "General Event Notification Architecture Base", internet-draft http://www.alternic.org/drafts/drafts-c-d/draft-cohen-gena-p-base-01.pdf, July 1998.

Authors' Addresses

Dan Li
Cisco Systems, Inc.
170 W. Tasman Drive.
San Jose, CA 94043
USA

Phone: +1 650 823 2362
EMail: lidan@cisco.com

Ian Cooper
Personal capacity
No address available

Phone: +44 7966 285145
EMail: ian@the-coopers.org

Mike Dahlin
University of Texas
Taylor Hall 2.124
Department of Computer Sciences
University of Texas
Austin, TX 78712-1188
USA

Phone: +1 512 327 7251
EMail: dahlin@cs.utexas.edu

Martin Hamilton
JANET Web Cache Service
Computing Services
Loughborough University
Loughborough, Leics LE11 3TU
UK

Phone: +44 1509 263171
EMail: martin@wwwcache.ja.net

Appendix A. Change Log

Please direct your comments to the WEBI mailing list, webi@lists.equinix.com. To subscribe, email webi-request@lists.equinix.com with body (un)subscribe. The mailing list archive is at http://www.wrec.org/webi-archive/.

Revision Log:

20-Nov-2001/Edits by Ian Cooper

1. Moved revision log to appendix.

2. Changed draft to "compact" mode to save some space

3. Significant changes in second paragraph of introduction in an attempt to clarify motivation for this work

4. Cleared up terminology section

Question: Does the term invalidation fit? Removed initial wording which said "RUP must clearly define the actions this signal implies or mandates and the semantics the actions accomplish, in relationship to the RUP coherence models. E.g., RUP may specify (among many things) that a RUP CLIENT must not serve an invalidated cached object without a conditional HTTP request."

5.
Moved references to cache coherence literature to the introduction (sub-optimal, but at least it's not in the terminology section any more)

6. Design guideline 1, changed to just read it should be extensible

7. Ian's interpretation on the design guidelines.

8. Attempted some wording to try to stop RUP being put into user agents (or acting as a pointer to protocol developers to get them to try to design things that make doing this difficult). Need a security considerations section to go with this...

9. Moved performance requirement from "scoping" to "guidelines" (it can't be any more than that)

10. Attempted to clarify the differences of administrative domain use cases by splitting into sub-sections (more detail is probably needed).

11. Split operations into subsections. Attempted to clarify the language.

12. Content updates added as an operation, but wording added to state that it SHOULD NOT be provided.

13. Changed format of 6.1 (coherence model) to give each point a sub-sub-section

14. Coherence model, requirement 6.1/6 (resource update guarantees must propagate through scaling mechanisms) clarified and restated in 6.1.1 since this is already talking about relays. (Note: Eeek, state!!!)

15. Changed client/server to RUP CLIENT and RUP SERVER throughout

16. Changed object etc. to entity throughout (I hope)

Previous:/Edits by Dan Li

1. remove references to NAT.

2. describe the need to work with RUP session discovery mechanisms.

3. limit the doc to "requirements", and not "protocol design".

4. remove the terms "transaction" and "mirrors" since they convey strong meanings in other senses.

5. ensure that "confirmation of action" is a requirement.

6. clarify the "guarantees" RUP needs to provide, and not claim to "support" or "require" strong consistency.

7. clarify the motivations for server-driven and client-driven modes.

8.
clarify that general "meta data" is allowed, and accommodated as concrete payload types and arbitrary payload options, while invalidation is the primary payload at present.

9. clarify the terms "RUP client" and "RUP server", and remove ambiguous references to "client" or "server".

10. remove the distinction between "managed" and "unmanaged", remove any mention of them.

11. add use cases in terms of administrative roles, i.e., a list of entities that are allowed to participate within the RUP protocol realm.

12. include the reference to Microsoft's notification proposal.

13. add requirements on flexible content "selectors".

14. restructure the section on "naming and framing" to clarify three items

Full Copyright Statement

Copyright (C) The Internet Society (2001). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.