Network Working Group                                      Ingrid Melve
INTERNET-DRAFT                                          Simon Wilkinson
draft-melve-cachecontrol-00.txt
Expires September 1998

          Access-restricted, HTTP/1.1 Cache Control Extension

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

Abstract

User agents that act on behalf of more than one user, such as caches and web indexers, are often given access to documents which are restricted by IP address or domain. These agents then republish this information to users outside the allowed block, as there is currently no means of marking these objects with their access restrictions.

This document details an extension to the Cache-Control header in HTTP/1.1 [HTTP/1.1] to add information about IP or domain based access restrictions. It also stresses that Cache-Control should apply to all user agents which work on behalf of a number of users, and not just to caches.

1. The rationale for header information about access restrictions

Melve, Wilkinson                                               [Page 1]

Access-restricted extension                                  March 1998

Web caches and indexing robots are examples of user agents which do not act on behalf of one end user.
The problem of access control when sharing indexes or caches is not trivial for documents which have access control based on IP address or domain name, since there is no indication that access control is being used for the particular document. Web servers do not send any information about access control done by IP address. If a user within the allowed IP address range requests the document, the document is stored in the cache and subsequent requests will be served from the cache. This may cause documents to be served to users without access rights.

Several popular web servers permit users to create their own access control, as Apache does with local .htaccess files, so the local web master may not know about access restrictions. The local cache master is even less likely to know about such restrictions.

The same problem arises with site licenses for information and software, if access control is implemented on the basis of IP addresses or domain names. When a sibling cache requests a document that is available at the cache server (which has an IP address within the allowed range), the document may be handed out to the sibling. The solution so far has been for cache administrators to collect information about such site licenses and configure their caches manually.

2. Access restrictions

A number of methods have been proposed for communicating some access control information to visiting user agents.

HTTP/1.1 provides the Cache-Control header, which can indicate the "private" or "public" nature of a document, but provides no information as to the community that the information is private to. Site licenses for web information are an example of why this causes problems. A server may be located in the United States and the users in Norway, yet using the "private" directive prevents any shared cache from caching the data. Making such documents uncacheable is counterproductive, as the latency is often too high to ensure a good service for the end users.

Another method proposed for robots is the use of the robots.txt file. This method works on a centrally controlled server, where the maintainer of the robots.txt file is aware of all access restrictions in place, but breaks down on a server where any user may add access restrictions to their pages.

Proposed Cache Control Extension

   Access-restricted="IP:"
   Access-restricted="Domain:"

This header does not ensure the security of a document, but gives multi-user agents an opportunity to restrict access. If an unknown realm is encountered, the indexing robot or cache should treat the document as restricted and not share information.

3. The Access-restricted extension

HTTP/1.1 allows extensions to the Cache-Control directives, and these extensions may act as modifiers to the base directives. We propose the addition of an "Access-restricted" extension which would be used with the "private" directive to give additional control of cache information.

3.1 Using Access-restricted with IP address

Access restriction by IP address is popular and may be configured locally by users for their own web pages, which puts it outside the control of the web master. In open shared communities, like universities, this may cause problems as restricted documents are indexed or cached. Information about which IP address ranges are allowed to access the document would prevent unauthorized users from gaining access.

The Access-restricted header is followed by a comma separated list of IP ranges for which access to the document is permitted.
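
As an illustration of how a cache or indexing robot might apply such a list, the following Python sketch checks a requesting client's address against the IP ranges in an Access-restricted value. The function name client_allowed is hypothetical. The sketch also assumes that every comma separated entry carries its own "IP:" prefix, and that entries in any other realm grant nothing, which is one reading of the rule that an unknown realm means the document is not shared. None of these details are specified by this draft.

```python
import ipaddress


def client_allowed(directive_value: str, client_ip: str) -> bool:
    """Return True only if client_ip lies inside a listed IP range.

    directive_value is the unquoted Access-restricted value, for
    example "IP:158.38.60.0/24". Assumption of this sketch: each
    comma separated entry carries its own realm prefix.
    """
    address = ipaddress.ip_address(client_ip)
    for entry in directive_value.split(","):
        # Split on the first colon only, so IPv6 ranges stay intact.
        realm, _, spec = entry.strip().partition(":")
        if realm != "IP":
            # Unknown or unhandled realm: it can never grant access.
            continue
        try:
            network = ipaddress.ip_network(spec.strip(), strict=False)
        except ValueError:
            # A malformed range grants nothing.
            continue
        if address in network:
            return True
    # No positive match: treat the document as restricted.
    return False
```

Under these assumptions, a cache holding a document marked with the value "IP:158.38.60.0/24" would serve it to a client at 158.38.60.17 but would not serve it to a client at 158.38.61.1.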

Example

   Cache-Control: private, Access-restricted="IP:158.38.60.0/24"

3.2 Using Access-restricted with domains

Access restrictions by domain should be interpreted as meaning that all FQDNs in the domain, and in all subdomains of the domain name, may get access. Domain names are restricted from left to right.

Example

   Cache-Control: private, Access-restricted="Domain:*.tjener.uninett.no"

This restricts access to all hosts in the tjener.uninett.no subdomain.

Example

   Cache-Control: private, Access-restricted="Domain:nurket.uninett.no"

This restricts access to the host nurket.uninett.no, and no other hosts.

3.3 Comma separated lists

Access restrictions may be combined by using a comma separated list.

Example

   Cache-Control: private, Access-restricted="IP:158.38.60.0/24,Domain:*.dcs.ed.ac.uk"

This restricts access to all hosts in the IP address range 158.38.60.* as well as hosts in the subdomain dcs.ed.ac.uk.

4. Security considerations

This proposal enhances the security of access restricted web objects, as it stops today's practice of accidental sharing. Information about access restrictions should only be handed out with the web objects, to prevent users without access from getting information about these restricted web objects.

Some servers may have restrictions which are time or load dependent, and expressing those can be a problem (for example, a server intended as an EU mirror of U.S. data may refuse or redirect U.S. requests unless its load is below some set point).

Releasing the information on restrictions may provide an opportunity for someone to follow up with an IP or domain-spoofed request for the data. The proposal is to give access restriction information only to hosts which are not restricted, which reduces the problem.

Content providers must bear in mind that there is no guarantee of a particular user agent honouring either the Cache-Control or the Access-restricted headers.
Alternative measures should be taken if document confidentiality is important.

5. References

[HTTP/1.1]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Berners-Lee, T., "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2068, January 1997.

6. Authors' Addresses

Ingrid Melve
UNINETT
Tempeveien 22, Trondheim, NORWAY
Phone: +47 73 55 79 07
Email: Ingrid.Melve@uninett.no

Simon Wilkinson
Department of Computer Science, University of Edinburgh
Kings Buildings
Mayfield Road, Edinburgh
Scotland, UK
Email: sxw@dcs.ed.ac.uk