Network Working Group                                        M. Hamilton
Internet-Draft                                   JANET Web Cache Service
Expires: August 23, 2001                                       I. Cooper
                                                            Equinix Inc.
                                                       February 22, 2001


               Requirements for a Resource Update Protocol
                     draft-ietf-webi-rup-reqs-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 23, 2001.

Copyright Notice

   Copyright (C) The Internet Society (2001).  All Rights Reserved.

Abstract

   This document seeks to establish the requirements for a Resource
   Update Protocol which may be used in conjunction with World-Wide Web
   intermediary systems such as caching proxies and surrogate servers
   (proxy accelerators) to facilitate cache coherence and
   interoperability.

   It is envisaged that RUP will include invalidation of previously
   cached objects as a key feature, but not be limited to this
   functionality.  The main goal is to enable proxy caching and content
   distribution of large amounts of frequently changing web objects,
   where periodically revalidating objects one by one is unacceptable
   in terms of performance and/or cache consistency.

Hamilton & Cooper       Expires August 23, 2001                 [Page 1]

Internet-Draft              RUP Requirements               February 2001

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Design Goals
   4.  Scoping Requirements
   5.  Functional Requirements
   6.  Use Cases
   7.  Security Considerations
   8.  Acknowledgements
       References
       Authors' Addresses
       Full Copyright Statement

1. Introduction

   A number of cache coherence or cache invalidation protocols have
   been proposed by the research community and the caching and content
   distribution industry.  Approaches vary, with some proponents
   seeking to enhance existing protocols, and others developing new
   protocols either specifically for this purpose - or which include
   this functionality.  Examples include WCIP[1], PSI[2] and DOCP[3].

   A carefully developed mechanism for the communication of information
   about changes to Internet resources offers the potential for other
   functions above and beyond invalidation of cached objects.  More
   general applications for this mechanism might include automated
   tracking of changes to related groups of resources through 'channel'
   subscriptions for real-time 'mirroring' of collections of resources,
   and the sharing of information about cached objects between
   intermediaries from different vendors.  Resource updates may also be
   an appropriate way of informing systems which generate content
   dynamically that the underlying data which they manipulate (e.g. to
   produce HTML pages) has changed.

   The IETF's Web Intermediaries working group (WEBI) has been
   chartered to develop a Protocol based on requirements to be gathered
   here.
   For the reasons described above, we will refer to an abstract
   Resource Update Protocol (RUP, or simply 'the protocol') whose
   functionality need not be limited to simple invalidation of cached
   objects, but which will be capable of at least this.  Note that RUP
   is at least conceptually a new protocol, but may in practice be
   based wholly or partly on existing protocols.

   This document is a first draft and will change significantly; at
   this stage in its life it is intended primarily to stimulate
   discussion.

2. Terminology

   This document uses terms defined and explained in the WREC
   Taxonomy[4], and the HTTP/1.1 specification[5].  The reader should
   be familiar with both of these documents.

   In this document, the term "surrogate" is shorthand for a
   demand-driven surrogate origin server, unless explicitly stated
   otherwise.  Similarly, "origin server" refers to a surrogate's
   master origin server.

   Cache coherence and invalidation are discussed in detail in the
   caching literature; see, for example, [6] and [7] for background
   information.

3. Design Goals

   1.  The protocol must be simple and extensible, and it should be
       possible to use it to transport unforeseen payloads without
       breaking existing implementations.

       The need for an extension mechanism (even in a purposely simple
       protocol) and the messy consequences of not providing this have
       been seen in a number of widely implemented and deployed
       protocols, e.g. syslog, with its hard-coded priority and
       facility code bitfields.

   2.  The protocol must itself be widely deployable on the Internet,
       and should leverage existing technologies (e.g. XML, HTTP, URIs)
       as much as possible.

       This means that installed base and developer experience can be
       exploited, thus reducing the cost of entry to new implementors
       and would-be deployers of the protocol.
       Where work is being proposed in an area where there are existing
       mature technologies, this work must be justified in comparison
       with the work involved in simply re-using the existing
       technology.

   3.  The protocol should be easy to integrate into applications such
       as content management engines and Web server-side software
       components.

       At the time of writing, proprietary resource update protocols
       were in use in some commercial systems.  The IETF's Resource
       Update Protocol should be capable of being used in this role, so
       as to facilitate open content exchange.

4. Scoping Requirements

   1.  It must be possible for the protocol to be used in an
       environment where some or all communications are mediated
       through a firewall and/or Network Address Translation (NAT)
       device.  The protocol design must identify issues involved in
       firewall/NAT traversal and provide ways by which these may be
       avoided or circumvented.  These issues may not be explicitly
       security-related, e.g. working around problems caused by the use
       of Network Address Translation.

   2.  A mechanism providing for discovery of channels may be
       desirable, if a channel based model is adopted.  This should not
       preclude or be a prerequisite for development of the protocol
       per se; entities supporting RUP must also be capable of being
       configured by hand.

   3.  Even within a single administrative domain it is desirable that
       there be means by which the information transferred between (for
       example) origin server and surrogate can be authenticated and if
       necessary encrypted.  This is an area where off-the-shelf
       solutions such as TLS and SASL exist; the developers of the
       protocol will need to determine how best to make use of these.

   4.  It is essential that the protocol support revision control of
       updates, e.g. so that a surrogate can identify whether any
       updates the origin has notified it about are outstanding.
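
   As an illustration only, the revision control described in item 4
   could be approximated by numbering updates.  The Python sketch below
   assumes, hypothetically, that the origin assigns consecutive integer
   revisions starting at 1.  The class and method names
   (RevisionTracker, announce, apply, outstanding) are invented for
   this example and are not part of any RUP specification.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class RevisionTracker:
    """Surrogate-side record of update revisions (hypothetical scheme).

    Assumes the origin numbers its updates 1, 2, 3, ... without gaps.
    """
    applied: Set[int] = field(default_factory=set)
    highest_announced: int = 0

    def announce(self, revision: int) -> None:
        """Note that the origin has issued updates up to `revision`."""
        self.highest_announced = max(self.highest_announced, revision)

    def apply(self, revision: int) -> None:
        """Record that the update carrying `revision` has been applied."""
        self.applied.add(revision)
        self.announce(revision)

    def outstanding(self) -> List[int]:
        """Return announced revisions that have not yet been applied."""
        return [r for r in range(1, self.highest_announced + 1)
                if r not in self.applied]


tracker = RevisionTracker()
for rev in (1, 2, 5):      # updates 3 and 4 were lost in transit
    tracker.apply(rev)
tracker.announce(6)        # origin reports that revision 6 now exists
print(tracker.outstanding())   # [3, 4, 6]
```

   A surrogate following this scheme could ask the origin to replay
   exactly the revisions reported as outstanding.  Other schemes, such
   as timestamps or entity tags, may serve the same purpose.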
       Related efforts such as WEBDAV/DELTAV should be investigated,
       since they potentially provide an efficient bulk transfer system
       for the actual resource contents.

5. Functional Requirements

   1.  The protocol should be useable both in a surrogate/origin server
       relationship and a traditional caching proxy/origin server
       relationship.  The protocol should also be general enough to be
       useable in content delivery network (CDN) environments to allow
       freshness control of CDN delivery nodes.  This will provide
       proxies with a low-latency mechanism for cache coherence,
       obviating the need for cumbersome proxy revalidation.

   2.  The protocol must enable the communication of updates regarding
       an arbitrary group of resources, identified by unique URIs.

   3.  The protocol must define a client/server relationship.

   4.  We anticipate that the primary RUP clients and servers will be
       intermediaries (speaking HTTP) and origin servers, although the
       protocol should not be designed so as to preclude use by other
       entities.  For example, the origin server or servers may
       delegate the role of RUP server to a CDN which operates
       dedicated content signalling channels and servers.

   5.  It must be possible to group resources together under some
       unique identifier such as a URI, which can be widely shared by a
       content provider with its surrogates.  RUP resource group URIs
       can be designed to be unique, whereas URIs in the more general
       sense may not be.

   6.  There must be a feedback mechanism which enables the origin
       server to determine the extent to which resource updates have
       propagated to surrogates.

   7.  The protocol must define an extensible format for update
       messages which is capable of carrying a variety of payloads.
       Possible payloads include (1) cache invalidation, (2) prefetch
       instruction/hints, (3) small object full updates, and (4) small
       object delta updates.
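
   To illustrate how an extensible message format might let a receiver
   that implements only invalidation coexist with other payload types,
   the Python sketch below parses a hypothetical XML update message and
   skips payloads it does not recognise.  The element and attribute
   names (rup-update, invalidate, prefetch, uri, group, revision) and
   the example.com URIs are assumptions made for this example; the
   actual message format is yet to be defined.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: all element and attribute names are assumptions.
MESSAGE = """\
<rup-update group="http://example.com/groups/news" revision="42">
  <invalidate uri="http://example.com/news/index.html"/>
  <prefetch uri="http://example.com/news/today.html"/>
</rup-update>
"""


def invalidate(cache, payload):
    """Drop the cached copy of the resource named by the payload."""
    cache.pop(payload.get("uri"), None)


# This receiver implements only the invalidation payload.
HANDLERS = {"invalidate": invalidate}


def apply_update(xml_text, cache, handlers=HANDLERS):
    """Apply recognised payloads to `cache`; return tags that were skipped."""
    skipped = []
    for payload in ET.fromstring(xml_text):
        handler = handlers.get(payload.tag)
        if handler is None:
            skipped.append(payload.tag)   # unknown payload: ignore, don't fail
        else:
            handler(cache, payload)
    return skipped


cache = {
    "http://example.com/news/index.html": b"stale page",
    "http://example.com/about.html": b"unchanged page",
}
skipped = apply_update(MESSAGE, cache)
print(sorted(cache))   # ['http://example.com/about.html']
print(skipped)         # ['prefetch']
```

   Dispatching on the payload name and ignoring unknown names is one
   possible way to let new payload types be introduced without breaking
   existing implementations.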
       While the above payloads may share the same RUP mechanism, it is
       not a requirement for the initial protocol to address all of
       them simultaneously.

   8.  As a minimum the protocol must define a 'cache invalidation'
       payload to be used as a coherence mechanism.  Additional
       requirements for the invalidation element of the protocol will
       need to be developed.

   9.  It must be possible to determine whether resource updates have
       been missed, e.g. due to a client or server being down or
       unreachable.  It should be possible to replay or batch updates
       so that a consistent state is reached on all surrogates of a
       given origin server and collection of resources, i.e. updates
       can effectively be guaranteed in a group of cooperating RUP
       clients and servers if they are prepared to work to achieve it.

   10. The protocol should make perfect consistency (the update
       guarantee) possible, but not require it, so that loose
       consistency is available for those applications which do not
       require guaranteed updates, e.g. traditional batch mode
       mirroring applications.

   11. The protocol must allow for the integration of commonly accepted
       standards for authentication, authorization and encryption.

   12. The protocol should be designed to scale to systems where there
       are a large number (more than 10,000) of surrogates of a given
       origin server.  This can be done either directly or through
       multiple levels of intermediary.  The protocol should be capable
       of operating efficiently on a wide variety of underlying media;
       high-latency satellite links in particular will need to be
       considered.

   13. Resource update guarantees must propagate correctly through the
       scaling mechanisms even if multiple levels of intermediary are
       used.

6. Use Cases

   Please note that the protocol level details discussed here are only
   hypothetical at this stage, but necessary to support the examples.

   1.  Server-driven invalidation: in this scenario the RUP server
       would send object or resource group invalidations to the RUP
       clients, sending invalidation signals according to its own
       scheduling configuration.  The connection between client and
       server could be established by either party, and could be
       persistent, so as to facilitate monitoring of the update
       guarantee through heartbeat packets.  If an automated discovery
       mechanism were used to let clients detect servers (or vice
       versa), this would raise security concerns which would need to
       be addressed.

   2.  Client-driven validation: in this scenario the RUP client would
       take the lead, querying the RUP server for the freshness status
       of an object or group of objects (denoted by a URI).  The RUP
       server would reply with the latest changes since the last time
       the client asked, based on information such as ETag, timestamp,
       and/or version number.  Whether and when the client asks the
       server is determined by the consistency guarantee the client is
       committed to provide, and should follow the semantic rules
       defined by RUP.  The URI of a particular group of resources
       could be manually configured, sent as header information in the
       HTTP responses from the origin server, or distributed via a
       separate out-of-band mechanism.

   3.  Small updates and update redirects: in this scenario the RUP
       server would send the RUP client either the full contents of a
       modified object or a delta update showing changes against the
       previous revision.  Updates would also likely need to be small
       to avoid interfering with real-time cache invalidation and other
       meta-data signalling.  The RUP server might also use "update
       redirects", notifying the RUP clients that a large object is to
       be updated and that the full update is to be fetched from
       another source, e.g.
       a multicast object distribution channel.  In this particular
       example there are related efforts which could be leveraged, such
       as SDP and SIP.

7. Security Considerations

   Intermediaries open up a large number of new security problems which
   do not exist in the classical end-to-end model of the Internet, by
   introducing a 'Man In The Middle' by design.  As such, it is
   essential that this protocol-level work on intermediaries take care
   to devise means by which the integrity of the resources being
   updated can be preserved, or at least tested.

   The major risks associated with the protocol should be quantified
   and specifically addressed by the protocol design.

8. Acknowledgements

   Thanks to Mark Nottingham, Dan Li and the rest of the WEBI mailing
   list for their contributions.

   The JANET Web Cache Service is funded by the Joint Information
   Systems Committee of the UK Higher and Further Education Funding
   Councils (JISC).

References

   [1]  Li, D., Cao, P. and M. Dahlin, "WCIP: Web Cache Invalidation
        Protocol", draft-danli-wrec-wcip-00.txt (work in progress),
        November 2000.

   [2]  Krishnamurthy, B. and C.E. Wills, "Piggyback server
        invalidation for proxy cache coherency", In Computer Networks
        and ISDN Systems, Volume 30, 1998.

   [3]  Dilley, J., Arlitt, M., Perret, S. and T. Jin, "The Distributed
        Object Consistency Protocol", Technical Report HPL-1999-109,
        September 1999.

   [4]  Cooper, I., Melve, I. and G. Tomlinson, "Replication and
        Caching Taxonomy", RFC 3040, January 2001.

   [5]  Fielding, R.T., Gettys, J., Mogul, J., Nielsen, H.F., Masinter,
        L., Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol
        -- HTTP/1.1", RFC 2616, June 1999.

   [6]  Belloum, A. and L.O.
        Hertzberger, "Maintaining Web cache coherency", In Information
        Research, Volume 6 No. 1, October 2000.

   [7]  Gwertzman, J. and M. Seltzer, "World-Wide Web Cache
        Consistency", In Proceedings 1996 USENIX Technical Conference,
        January 1996.

Authors' Addresses

   Martin Hamilton
   JANET Web Cache Service
   Computing Services
   Loughborough University
   Loughborough, Leics  LE11 3TU
   UK

   Phone: +44 1509 263171
   EMail: martin@wwwcache.ja.net


   Ian Cooper
   Equinix Inc.
   2450 Bayshore Parkway
   Mountain View, CA  94043
   USA

   Phone: +1 650 316-6065
   EMail: icooper@equinix.com

Full Copyright Statement

   Copyright (C) The Internet Society (2001).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.