HTTP Working Group                                  Koen Holtman, TUE
Internet-Draft                           Andrew Mutz, Hewlett-Packard
Expires: March 15, 1998                            September 15, 1997


                    The Alternates Header Field

                 draft-ietf-http-alternates-00.txt


STATUS OF THIS MEMO

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its
   areas, and its working groups. Note that other groups may also
   distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use
   Internet-Drafts as reference material or to cite them other
   than as "work in progress".

   To learn the current status of any Internet-Draft, please check
   the "1id-abstracts.txt" listing contained in the Internet-Drafts
   Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
   (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
   Coast), or ftp.isi.edu (US West Coast).

   Distribution of this document is unlimited. Please send comments
   to the HTTP working group at . Discussions of the working group
   are archived at . General discussions about HTTP and the
   applications which use HTTP should take place on the mailing
   list.

   HTML and change bar versions of this document are available
   at .


ABSTRACT

   HTTP allows web site authors to put multiple versions of the
   same information under a single URL. The Alternates header field
   can be used to transmit a machine-readable description of these
   versions. This allows the recipient to automatically select the
   most appropriate version.

TABLE OF CONTENTS

   1  Introduction
      1.1  Background
      1.2  Applicability

   2  Terminology
      2.1  Terms from HTTP/1.1
      2.2  New terms

   3  Notation

   4  The Alternates header field
      4.1  Definition
      4.2  Length of variant lists

   5  Variant descriptions
      5.1  Syntax
      5.2  URI
      5.3  Source-quality
      5.4  Type, charset, language, and length
      5.5  Extension-attribute

   6  Use of the Alternates header field
      6.1  Use in a response which contains a variant
      6.2  Use in a response which does not contain a variant
      6.3  Use in a response which redirects to a variant
      6.4  User agent guidelines
      6.5  Negotiation on content encoding
      6.6  Role of proxies

   7  Security and privacy considerations
      7.1  User agent choices revealing information of a private
           nature
      7.2  Security holes revealed by negotiation

   8  Acknowledgments

   9  References

   10 Authors' addresses

   11 Appendix: Example of a variant selection algorithm
      11.1  Computing overall quality values
      11.2  Determining the result
      11.3  Ranking dimensions


1 Introduction

   HTTP allows web site authors to put multiple versions of the
   same information under a single URI. Each of these versions is
   called a `variant'. The Alternates header field can be used to
   transmit a machine-readable description of these variants. This
   allows the recipient to automatically select the most
   appropriate variant.

   This specification defines the Alternates header field as part
   of the HTTP/1.x protocol suite [1].

      Note: Though this specification is limited to discussing HTTP
      transactions, elements of this specification could also be
      used in other contexts. For example, variant descriptions
      could be used in multipart mail messages.

1.1 Background

   HTTP/1.1 allows web site authors to put multiple versions of the
   same information under a single resource URI. Each of these
   versions is called a `variant'. For example, a resource
   http://x.org/paper could offer three different variants of a
   paper:

      1. HTML, English
      2. HTML, French
      3. Postscript, English

   Content negotiation is the process by which the best variant is
   selected when the resource is accessed. The selection is done by
   matching the properties of the available variants to the
   capabilities of the user agent and the preferences of the user.

   HTTP/1.1 [1] defines three forms of content negotiation:

      1. Server-driven content negotiation, in which the origin
         server selects the best variant

      2. Agent-driven content negotiation, in which the user agent
         selects the best variant

      3. Transparent content negotiation, in which a distributed
         process is used to choose the best variant, with either
         the user agent, the origin server, or a proxy in between
         making the final choice.

   See [1] for a detailed definition of these three forms, and a
   discussion of their individual advantages and disadvantages.

   HTTP/1.1 only defines the protocol elements necessary to support
   server-driven negotiation. This document defines the Alternates
   header field as a way of supporting agent-driven negotiation.
   The protocol elements needed for transparent content negotiation
   are defined in [4].

1.2 Applicability

   This specification allows for agent-driven negotiation, using a
   subset of the protocol elements in [4]. Implementations based on
   this specification will be able to co-exist with implementations
   based on plain HTTP/1.0 [3], plain HTTP/1.1 [1], and with
   implementations using all protocol elements in [4].


2 Terminology

   The words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
   in this document are to be interpreted as described in RFC 2119
   [5].

2.1 Terms from HTTP/1.1

   This specification mostly uses the terminology of the HTTP/1.1
   specification [1]. The definitions below were reproduced from
   [1].

   request
      An HTTP request message.

   response
      An HTTP response message.

   resource
      A network data object or service that can be identified by a
      URI. Resources may be available in multiple representations
      (e.g.
      multiple languages, data formats, size, resolutions) or vary
      in other ways.

   content negotiation
      The mechanism for selecting the appropriate representation
      when servicing a request.

   client
      A program that establishes connections for the purpose of
      sending requests.

   user agent
      The client which initiates a request. These are often
      browsers, editors, spiders (web-traversing robots), or other
      end user tools.

   server
      An application program that accepts connections in order to
      service requests by sending back responses. Any given program
      may be capable of being both a client and a server; our use
      of these terms refers only to the role being performed by the
      program for a particular connection, rather than to the
      program's capabilities in general. Likewise, any server may
      act as an origin server, proxy, gateway, or tunnel, switching
      behavior based on the nature of each request.

   origin server
      The server on which a given resource resides or is to be
      created.

   proxy
      An intermediary program which acts as both a server and a
      client for the purpose of making requests on behalf of other
      clients. Requests are serviced internally or by passing them
      on, with possible translation, to other servers. A proxy must
      implement both the client and server requirements of this
      specification.

2.2 New terms

   negotiable resource
      A resource, identified by a single URI, which has multiple
      representations (variants) associated with it. When servicing
      a request on its URI, it allows selection of the best variant
      using some form of content negotiation.

   variant list
      A list containing variant descriptions, which can be bound to
      a negotiable resource.

   variant description
      A machine-readable description of a variant resource, usually
      found in a variant list. A variant description contains the
      variant resource URI and various attributes which describe
      properties of the variant. Variant descriptions are defined
      in section 5.

   variant resource
      A resource from which a variant of a negotiable resource can
      be retrieved with a normal HTTP/1.x GET request, i.e. a GET
      request which does not use content negotiation.

   variant selection algorithm
      An algorithm which can choose the best variant from a variant
      list.


3 Notation

   The version of BNF used in this document is taken from [1], and
   many of the nonterminals used are defined in [1].


4 The Alternates header field

   When returning a particular piece of content, a server may wish
   to notify the client that this content is available in multiple
   variants. This can be done by adding an Alternates header field,
   which lists the available variants, to the response. The
   Alternates header field can also be used in a response which
   does not include any particular variant, but which simply
   informs the client that multiple variants are available.

   An example of an Alternates response header field, which lists
   three variants, is

     Alternates: {"paper.1" 0.9 {type text/html} {language en}},
                 {"paper.2" 0.7 {type text/html} {language fr}},
                 {"paper.3" 1.0 {type application/postscript}
                     {language en}}

   On receipt of an Alternates header field, a user agent can use a
   variant selection algorithm to choose the best variant from the
   list. This specification does not define a standard variant
   selection algorithm; user agent implementers may use whichever
   algorithm they find most suitable. Appendix 11 contains an
   example of a variant selection algorithm.

4.1 Definition

   The Alternates response header field describes all available
   variants for the resource on which the request was made. The
   description for each variant includes a URI from which this
   variant can be retrieved. The Alternates header field can also
   contain directives for any negotiation process which is
   initiated by the receipt of the response.

       Alternates = "Alternates" ":" variant-list

       variant-list = 1#( variant-description     ; see section 5
                        | fallback-variant
                        | negotiation-directive )

       fallback-variant = "{" <"> URI <"> "}"

       negotiation-directive = token [ "=" ( token | quoted-string ) ]

   An example is

     Alternates: {"paper.1" 0.9 {type text/html} {language en}},
                 {"paper.2" 0.7 {type text/html} {language fr}},
                 {"paper.3" 1.0 {type application/postscript}
                     {language en}},
                 {"paper.html.en"}, x=y

   Any relative URI specified in a variant-description or
   fallback-variant field is relative to the request-URI.

   A variant list may contain multiple differing descriptions of
   the same variant. This can be convenient if the variant uses
   conditional rendering constructs, or if the variant resource
   returns multiple representations using a multipart media type.

   Only one fallback-variant field may be present. If the variant
   selection algorithm of the user agent finds that all variants
   described by variant-description fields are unacceptable, then
   it SHOULD choose the fallback variant, if present, as the best
   variant.

   If the user agent computes the overall quality values of the
   described variants, and finds that several variants share the
   highest value, then the first variant with this value in the
   list SHOULD be chosen as the best variant.

   This specification does not define any specific negotiation
   directives for the Alternates header field. User agents SHOULD
   ignore all negotiation directives they do not understand. If a
   proxy receives an Alternates header field with an unknown
   negotiation directive, it SHOULD, whenever possible, forward the
   response towards the user agent instead of trying to take part
   in a negotiation process itself.

4.2 Length of variant lists

   As a general rule, variant lists in Alternates header fields
   should be short: it is expected that a typical negotiable
   resource will have 2 to 10 variants, depending on its purpose.
   Servers which have many more variants SHOULD use a method of
   describing them which is more sophisticated than the Alternates
   header field defined in this document.


5 Variant descriptions

5.1 Syntax

   A variant can be described in a machine-readable way with a
   variant description.

       variant-description =
                  "{" <"> URI <"> source-quality *variant-attribute "}"

       source-quality = qvalue

       variant-attribute = "{" "type" media-type "}"
                         | "{" "charset" charset "}"
                         | "{" "language" 1#language-tag "}"
                         | "{" "length" 1*DIGIT "}"
                         | extension-attribute

       extension-attribute = "{" extension-name extension-value "}"
       extension-name      = token
       extension-value     =
                  *( token | quoted-string | LWS | extension-specials )

       extension-specials  =
                  <any element of tspecials except <"> and "}">

   Examples are

      {"paper.2" 0.7 {type text/html} {language fr}}
      {"paper.5" 0.9 {type text/html} {length 1002}}
      {"paper.1" 0.001}

   The various attributes which can be present in a variant
   description are covered in the subsections below. Each attribute
   may appear only once in a variant description.

5.2 URI

   The URI attribute gives the URI of the resource from which the
   variant can be retrieved with a GET request. It can be absolute
   or relative to the Request-URI. The variant resource may vary
   the content it sends (on the Cookie request header field, for
   example), but SHOULD NOT engage in content negotiation itself.

5.3 Source-quality

   The source-quality attribute gives the quality of the variant,
   as a representation of the negotiable resource, when this
   variant is rendered with a perfect rendering engine on the best
   possible output medium.

   If the source-quality is less than 1, it often expresses a
   quality degradation caused by a lossy conversion to a particular
   data format. For example, a picture originally in JPEG form
   would have a lower source quality when translated to the XBM
   format, and a much lower source quality when translated to an
   ASCII-art variant.
   Note however, that degradation is a function of the source; an
   original piece of ASCII-art may degrade in quality if it is
   captured in JPEG form. The source-quality could also represent a
   level of quality caused by skill of language translation, or
   ability of the used media type to capture the intended artistic
   expression.

   Servers should use the following table as a guide when assigning
   source quality values:

      1.000  perfect representation
      0.900  threshold of noticeable loss of quality
      0.800  noticeable, but acceptable quality reduction
      0.500  barely acceptable quality
      0.300  severely degraded quality
      0.000  completely degraded quality

   The same table can be used by variant selection algorithms in
   user agents (see appendix 11) when assigning degradation factors
   for different content rendering mechanisms. Note that most
   meaningful values in this table are close to 1. This is due to
   the fact that quality degradation factors are generally combined
   by multiplying them, not by adding them.

   In the source-quality values, servers should not account for the
   size of the variant and its impact on transmission and rendering
   delays; the size of the variant should be stated in the length
   attribute and any size-dependent calculations should be done by
   a variant selection algorithm in the user agent.

5.4 Type, charset, language, and length

   The type attribute of a variant description carries the same
   information as its Content-Type response header field
   counterpart defined in [1], except for any charset information,
   which MUST be carried in the charset attribute. For example, the
   header field

      Content-Type: text/html; charset=ISO-8859-4

   has the counterpart attributes

      {type text/html} {charset ISO-8859-4}

   The language and length attributes carry the same information as
   their Content-* response header field counterparts in [1].
   The length attribute, if present, MUST thus reflect the length
   of the variant alone, and not the total size of the variant and
   any objects inlined or embedded by the variant.

   Though all of these attributes are optional, it is often
   desirable to include as many attributes as possible, as this
   will increase the quality of the negotiation process.

      Note: A server is not required to maintain a one-to-one
      correspondence between the attributes in the variant
      description and the Content-* header fields in the variant
      response. For example, if the variant description contains a
      language attribute, the response does not necessarily have to
      contain a Content-Language header field. If a
      Content-Language header field is present, it does not have to
      contain an exact copy of the information in the language
      attribute.

5.5 Extension-attribute

   The extension-attribute allows future specifications to
   incrementally define new dimensions of negotiation, and eases
   content negotiation experiments. User agents conforming to this
   specification SHOULD treat all variants with extension
   attributes they do not recognize as unusable. Proxies SHOULD NOT
   do any negotiation processing for a response if an extension
   attribute unknown to them is present in the variant list. They
   SHOULD forward the response unchanged towards the user agent
   instead.

   The extension names "features" and "description" are reserved by
   this specification for use in transparent content negotiation
   [4].


6 Use of the Alternates header field

   This section defines conventions and guidelines for the use of
   the Alternates header field.

6.1 Use in a response which contains a variant

   If a request is done on a negotiable resource, the server may
   return a particular variant in the response, together with an
   Alternates header field which notifies the client that multiple
   variants are available.
   An example of such a response is:

     HTTP/1.1 200 OK
     Date: Tue, 11 Jun 1996 20:05:31 GMT
     Content-Type: text/html
     Content-Language: en
     Last-Modified: Mon, 10 Jun 1996 10:01:14 GMT
     Content-Length: 5327
     Alternates: {"paper.1" 0.9 {type text/html} {language en}},
                 {"paper.2" 0.7 {type text/html} {language fr}},
                 {"paper.3" 1.0 {type application/postscript}
                     {language en}}
     Content-Location: paper.1
     Vary: *
     Expires: Thu, 01 Jan 1980 00:00:00 GMT
     Cache-Control: max-age=604800

     A paper about ....

   In this response, the Content-Location header field tells the
   user agent which variant was included. The Vary, Expires, and
   Cache-Control header fields ensure proper handling of the
   response by HTTP/1.0 and HTTP/1.1 caches.

   When detecting that an Alternates header field is present, a
   user agent MAY choose to use a variant selection algorithm to
   select the best variant of the negotiable resource. If the best
   variant is not the same one as is included in the response (as
   identified by the Content-Location header field), the user agent
   MAY do a new request on the variant resource of the best variant
   in order to retrieve it.

6.2 Use in a response which does not contain a variant

   If the response to a request on a negotiable resource does not
   contain a particular variant, the origin server should signal
   this by not including any Content-Location header field.
   An example of such a response is:

     HTTP/1.1 200 OK
     Date: Tue, 11 Jun 1996 20:02:21 GMT
     Content-Type: text/html
     Content-Length: 227
     Alternates: {"paper.1" 0.9 {type text/html} {language en}},
                 {"paper.2" 0.7 {type text/html} {language fr}},
                 {"paper.3" 1.0 {type application/postscript}
                     {language en}}
     Vary: *
     Expires: Thu, 01 Jan 1980 00:00:00 GMT
     Cache-Control: max-age=604800

     <h2>Multiple Choices:</h2>
     <ul>
     <li><a href=paper.1>HTML, English version</a>
     <li><a href=paper.2>HTML, French version</a>
     <li><a href=paper.3>Postscript, English version</a>
     </ul>

   On receipt of such a response, the user agent SHOULD use a
   variant selection algorithm to select the best variant of the
   negotiable resource, and retrieve this variant. For
   compatibility with user agents which are not capable of handling
   the Alternates header field, a response body which allows the
   user to select the best variant manually can be included.

6.3 Use in a response which redirects to a variant

   By putting an Alternates header field in a redirection response,
   an origin server can avoid the sending of a variant, which may
   be the wrong variant, to a user agent capable of using the
   Alternates header field, while still providing automatic
   selection for user agents which are not capable of using the
   Alternates header field.

   An example of such a response is:

     HTTP/1.1 302 Moved Temporarily
     Date: Tue, 11 Jun 1996 20:05:31 GMT
     Content-Type: text/html
     Alternates: {"paper.1" 0.9 {type text/html} {language en}},
                 {"paper.2" 0.7 {type text/html} {language fr}},
                 {"paper.3" 1.0 {type application/postscript}
                     {language en}}
     Location: paper.1
     Content-Length: 53

     This document is available <a href=paper.1>here</a>.

   Note the use of a Location header field instead of a
   Content-Location header field. On receipt of such a response,
   the user agent SHOULD use a variant selection algorithm to
   select the best variant of the negotiable resource, and retrieve
   this variant.
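
   The example responses in sections 6.1 to 6.3 all carry an
   Alternates value that a user agent has to turn into a variant
   list before it can run a variant selection algorithm. The
   following is a minimal sketch of that step in Python, not part
   of this specification. The names Variant, split_top_level, and
   parse_alternates are hypothetical, and the sketch assumes that
   quoted strings contain no backslash escapes and that attribute
   values contain no braces.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Variant:
    """One variant-description (hypothetical container type)."""
    uri: str
    source_quality: float
    attributes: dict = field(default_factory=dict)


def split_top_level(value):
    """Split on commas that are outside braces and quoted strings."""
    items, current, depth, quoted = [], [], 0, False
    for ch in value:
        if ch == '"':
            quoted = not quoted
        elif not quoted:
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
            elif ch == "," and depth == 0:
                items.append("".join(current).strip())
                current = []
                continue
        current.append(ch)
    items.append("".join(current).strip())
    return [item for item in items if item]


# {"URI" rest}; rest is empty for a fallback-variant.
ELEMENT = re.compile(r'^\{\s*"([^"]*)"\s*(.*)\}$', re.S)
# {name value}; values containing braces are not handled (assumption).
ATTRIBUTE = re.compile(r"\{\s*([^\s{}]+)\s*([^{}]*)\}")


def parse_alternates(value):
    """Return (variants, fallback_uri, directives) for a field value."""
    variants, fallback, directives = [], None, {}
    for item in split_top_level(value):
        match = ELEMENT.match(item)
        if match is None:
            # negotiation-directive = token [ "=" ( token | quoted-string ) ]
            name, _, arg = item.partition("=")
            directives[name.strip()] = arg.strip().strip('"') or None
            continue
        uri, rest = match.group(1), match.group(2).strip()
        if not rest:
            fallback = uri  # fallback-variant = "{" <"> URI <"> "}"
            continue
        quality = re.match(r"[0-9.]+", rest)
        if quality is None:
            raise ValueError("missing source-quality in " + item)
        attributes = {n: v.strip() for n, v in ATTRIBUTE.findall(rest)}
        variants.append(Variant(uri, float(quality.group()), attributes))
    return variants, fallback, directives


# The example field value from section 4.1.
value = ('{"paper.1" 0.9 {type text/html} {language en}}, '
         '{"paper.2" 0.7 {type text/html} {language fr}}, '
         '{"paper.3" 1.0 {type application/postscript} {language en}}, '
         '{"paper.html.en"}, x=y')
variants, fallback, directives = parse_alternates(value)
print([v.uri for v in variants])  # ['paper.1', 'paper.2', 'paper.3']
```

   A production parser would follow the BNF of sections 4.1 and 5.1
   more strictly; this sketch only shows how the nesting of braces
   separates list elements from attributes.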

6.4 User agent guidelines

   Summarizing the three sections above, if an Alternates header
   field is present in the response, then

     * a user agent SHOULD use its variant selection algorithm to
       choose and retrieve the best variant if a Content-Location
       header field is absent,

     * and MAY use its variant selection algorithm to choose and
       retrieve the best variant if a Content-Location header field
       is present.

   If the user agent is displaying a variant as the result of a
   content negotiation process, and the variant is not an embedded
   or inlined object, the following requirements apply.

     1. The user agent SHOULD make available through its user
        interface some indication that the resource being displayed
        is a negotiated resource instead of a plain resource. It
        SHOULD also allow the user to examine the variant list
        included in the Alternates header field. Such a
        notification and review mechanism is needed because of
        privacy considerations, see section 7.1.

     2. If the user agent shows the URI of the displayed
        information to the user, it SHOULD be the negotiable
        resource URI, not the variant URI that is shown. This
        encourages third parties, who want to refer to the
        displayed information in their own documents, to make a
        hyperlink to the negotiable resource as a whole, rather
        than to the variant resource which happens to be shown.
        Such correct linking is vital for the interoperability of
        content across sites. The user agent SHOULD however also
        provide a means for reviewing the URI of the particular
        variant which is currently being displayed.

     3. Similarly, if the user agent stores a reference to the
        displayed information for future use, for example in a
        hotlist, it SHOULD store the negotiable resource URI, not
        the variant URI.

6.5 Negotiation on content encoding

   Negotiation on the content encoding of a response is orthogonal
   to content negotiation based on the Alternates header field.
   The presence of an Alternates header field in a response does
   not change the rules, as stated by the HTTP/1.1 specification
   [1], which determine when a content-encoding may be added or
   removed by an origin server or proxy.

6.6 Role of proxies

   This specification does not define mechanisms by which proxies
   can use the Alternates header field, but does allow other
   specifications to define such mechanisms. To ensure
   extensibility of the Alternates header field, this specification
   does however define, in section 4.1 and section 5.5, that a
   proxy should not engage in a negotiation process when
   encountering an Alternates header field which has a component
   unknown to it.


7 Security and privacy considerations

7.1 User agent choices revealing information of a private nature

   The automatic selection and retrieval of a variant by a user
   agent will reveal a preference for this variant to the server. A
   malicious service author could provide a page with `fake'
   negotiability on (ethnicity-correlated) languages, with all
   variants actually being the same English document, as a means of
   obtaining privacy-sensitive information. Such a plot would
   however be visible to an alert victim if the list of available
   variants and their properties is reviewed through a mechanism as
   described in section 6.4.

7.2 Security holes revealed by negotiation

   Malicious servers could use content negotiation as a means of
   obtaining information about security holes which may be present
   in user agents.


8 Acknowledgments

   Work on HTTP content negotiation has been done since at least
   1993. This specification builds on an earlier incomplete
   specification of the Alternates header field recorded in [2].

   The authors wish to thank the individuals who have contributed
   to the work on content negotiation in the HTTP working group,
   including Brian Behlendorf, Daniel DuBois, Martin J. Duerst,
   Roy T.
   Fielding, Jim Gettys, Yaron Goland, Dirk van Gulik, Ted Hardie,
   Graham Klyne, Scott Lawrence, Larry Masinter, Jeffrey Mogul,
   Henrik Frystyk Nielsen, Frederick G.M. Roeber, Paul Sutton, and
   Klaus Weide.


9 References

   [1] R. Fielding, J. Gettys, J. C. Mogul, H. Frystyk, and
       T. Berners-Lee. Hypertext Transfer Protocol -- HTTP/1.1.
       RFC 2068, HTTP Working Group, January 1997.

   [2] Roy T. Fielding, Henrik Frystyk Nielsen, and Tim
       Berners-Lee. Hypertext Transfer Protocol -- HTTP/1.1.
       Internet-Draft draft-ietf-http-v11-spec-01.txt, HTTP Working
       Group, January 1996.

   [3] T. Berners-Lee, R. Fielding, and H. Frystyk. Hypertext
       Transfer Protocol -- HTTP/1.0. RFC 1945. MIT/LCS, UC Irvine,
       May 1996.

   [4] K. Holtman, A. Mutz. Transparent Content Negotiation in
       HTTP. Internet-Draft draft-ietf-http-negotiation-04.txt,
       HTTP Working Group, September 1997.

   [5] S. Bradner. Key words for use in RFCs to Indicate
       Requirement Levels. RFC 2119. Harvard University, March
       1997.


10 Authors' addresses

   Koen Holtman
   Technische Universiteit Eindhoven
   Postbus 513
   Kamer HG 6.57
   5600 MB Eindhoven (The Netherlands)
   Email: koen@win.tue.nl

   Andrew H. Mutz
   Hewlett-Packard Company
   1501 Page Mill Road 3U-3
   Palo Alto CA 94304, USA
   Fax +1 415 857 4691
   Email: mutz@hpl.hp.com


11 Appendix: Example of a variant selection algorithm

   A negotiating user agent will choose the best variant from a
   variant list with a variant selection algorithm. This appendix
   contains an example of such an algorithm.

   The inputs of the algorithm are a variant list from an
   Alternates header field, and an agent-side configuration
   database, which contains

     - a collection of quality values assigned to media types,
       languages, and charsets for the current request, following
       the model of the corresponding HTTP/1.1 [1] Accept- header
       fields,

     - a table which lists `forbidden' combinations of media types
       and charsets, i.e. combinations which cannot be displayed
       because of some internal user agent limitation.

   The output of the algorithm is either the best variant, or the
   conclusion that none of the variants are acceptable.

11.1 Computing overall quality values

   As a first step in the variant selection algorithm, the overall
   qualities associated with all variant descriptions in the list
   are computed.

   The overall quality Q of a variant description is the value

      Q = round5( qs * qt * qc * ql * qa )

   where round5 is a function which rounds a floating point value
   to 5 decimal places after the point. It is assumed that the user
   agent can run on multiple platforms: the rounding function makes
   the algorithm independent of the exact characteristics of the
   underlying floating point hardware.

   The factors qs, qt, qc, ql, and qa are determined as follows.

      qs Is the source quality factor in the variant description.

      qt The media type quality factor is 1 if there is no type
         attribute in the variant description. Otherwise, it is the
         quality value assigned to this type by the configuration
         database. If the database does not assign a value, then
         the factor is 0.

      qc The charset quality factor is 1 if there is no charset
         attribute in the variant description. Otherwise, it is the
         quality value assigned to this charset by the
         configuration database. If the database does not assign a
         value, then the factor is 0.

      ql The language quality factor is 1 if there is no language
         attribute in the variant description. Otherwise, it is the
         highest quality value the configuration database assigns
         to any of the languages listed in the language attribute.
         If the database does not assign a value to any of the
         languages listed, then the factor is 0.

      qa The quality adjustment factor is 0 if the variant
         description lists a media type - charset combination which
         is `forbidden' by the table, and 1 otherwise.
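
   The factor definitions above can be sketched in Python as
   follows. This is an illustration under stated assumptions, not a
   normative implementation: the ConfigDatabase class and the
   overall_quality function are hypothetical names, the
   configuration database is modelled as plain dictionaries, and
   only exact matches are looked up (wildcards or ranges, as
   allowed in the Accept- header fields, are not handled).

```python
from dataclasses import dataclass, field


@dataclass
class ConfigDatabase:
    """Hypothetical agent-side configuration database."""
    types: dict       # media type -> quality value
    charsets: dict    # charset -> quality value
    languages: dict   # language tag -> quality value
    forbidden: set = field(default_factory=set)  # {(media type, charset)}


def round5(value):
    """Round to 5 decimal places after the point."""
    return round(value, 5)


def overall_quality(source_quality, attributes, config):
    """Compute Q = round5(qs * qt * qc * ql * qa) for one description.

    `attributes` maps attribute names to values; the language value
    is a list of language tags.
    """
    qs = source_quality
    # A missing attribute gives factor 1; an unknown value gives factor 0.
    qt = (config.types.get(attributes["type"], 0.0)
          if "type" in attributes else 1.0)
    qc = (config.charsets.get(attributes["charset"], 0.0)
          if "charset" in attributes else 1.0)
    if "language" in attributes:
        # Highest value assigned to any of the listed languages.
        ql = max((config.languages.get(tag, 0.0)
                  for tag in attributes["language"]), default=0.0)
    else:
        ql = 1.0
    pair = (attributes.get("type"), attributes.get("charset"))
    qa = 0.0 if pair in config.forbidden else 1.0
    return round5(qs * qt * qc * ql * qa)


config = ConfigDatabase(
    types={"text/html": 1.0, "application/postscript": 0.8},
    charsets={},
    languages={"en": 1.0, "fr": 0.5},
)
q = overall_quality(0.7, {"type": "text/html", "language": ["fr"]}, config)
print("%.5f" % q)  # 0.35000
```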

   As an example, if a variant list contains the variant
   description

     {"paper.2" 0.7 {type text/html} {language fr}}

   and if the configuration database contains the quality value
   assignments

     types:     text/html;q=1.0, application/postscript;q=0.8
     languages: en;q=1.0, fr;q=0.5

   then the variant selection algorithm will compute the overall
   quality for the variant description as follows:

     {"paper.2" 0.7 {type text/html} {language fr}}
                 |          |               |
                 |          |               |
                 V          V               V
       round5 ( 0.7  *     1.0      *      0.5 ) = 0.35000

   With the same configuration database, the variant list

     {"paper.1" 0.9 {type text/html} {language en}},
     {"paper.2" 0.7 {type text/html} {language fr}},
     {"paper.3" 1.0 {type application/postscript} {language en}}

   would yield the following computations:

        round5 ( qs  * qt  * qc  * ql  * qa  ) = Q
                 ---   ---   ---   ---   ---
       paper.1:  0.9 * 1.0 * 1.0 * 1.0 * 1.0   = 0.90000
       paper.2:  0.7 * 1.0 * 1.0 * 0.5 * 1.0   = 0.35000
       paper.3:  1.0 * 0.8 * 1.0 * 1.0 * 1.0   = 0.80000

11.2 Determining the result

   Using all computed overall quality values, the end result of the
   variant selection algorithm is determined as follows.

   If all overall quality values are 0, then the best variant is
   the fallback variant, if there is one in the Alternates header
   field, else the result is the conclusion that none of the
   variants are acceptable.

   If at least one overall quality value is greater than 0, then
   the best variant is the variant which has the description with
   the highest overall quality value, or, if there are multiple
   variant descriptions which share the highest overall quality
   value, the variant of the first variant description in the list
   which has this highest overall quality value.

11.3 Ranking dimensions

   Consider the following variant list:

     {"paper.greek" 1.0 {language el} {charset ISO-8859-7}},
     {"paper.english" 1.0 {language en} {charset ISO-8859-1}}

   It could be the case that the user prefers the language "el"
   over "en", while the user agent can render "ISO-8859-1" better
   than "ISO-8859-7".
   The result is that in the language dimension, the first variant
   is best, while the second variant is best in the charset
   dimension. In this situation, it would be preferable to choose
   the first variant as the best variant: the user settings in the
   language dimension should take precedence over the hard-coded
   values in the charset dimension.

   To express this ranking between dimensions, the user agent
   configuration database should have a higher spread in the
   quality values for the language dimension than for the charset
   dimension. For example, with

     languages: el;q=1.0, en-gb;q=0.7, en;q=0.6, da;q=0, ...

     charsets:  ISO-8859-1;q=1.0, ISO-8859-7;q=0.95,
                ISO-8859-5;q=0.97, unicode-1-1;q=0, ...

   the first variant will have an overall quality of 0.95000
   (1.0 * 0.95), while the second variant will have an overall
   quality of 0.60000 (0.6 * 1.0, since its language attribute is
   "en"). This makes the first variant the best variant.
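
   The result rule of section 11.2 can be sketched in Python as
   follows, applied here to the ranking example of section 11.3.
   The function name best_variant and the representation of the
   input as (URI, overall quality) pairs are assumptions made for
   this illustration; the dictionaries contain only the quality
   values that the example lists explicitly.

```python
def best_variant(scored, fallback=None):
    """Pick the best variant from (uri, overall quality) pairs.

    The pairs must be in variant list order. Returns the fallback
    URI (or None, meaning that no variant is acceptable) when every
    overall quality value is 0.
    """
    best_uri, best_q = None, 0.0
    for uri, q in scored:
        if q > best_q:  # strict comparison keeps the first of equal values
            best_uri, best_q = uri, q
    return best_uri if best_uri is not None else fallback


# Quality values listed in the ranking example of section 11.3.
languages = {"el": 1.0, "en-gb": 0.7, "en": 0.6, "da": 0.0}
charsets = {"ISO-8859-1": 1.0, "ISO-8859-7": 0.95,
            "ISO-8859-5": 0.97, "unicode-1-1": 0.0}

# Both variants have source quality 1.0 and no type attribute.
scored = [
    ("paper.greek",
     round(1.0 * languages["el"] * charsets["ISO-8859-7"], 5)),
    ("paper.english",
     round(1.0 * languages["en"] * charsets["ISO-8859-1"], 5)),
]
print(best_variant(scored))  # paper.greek
```

   Because the spread between the language values is larger than
   the spread between the charset values, the language preference
   decides the outcome, which is the effect section 11.3 asks for.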