Network Working Group                                             E. Lee
Internet-Draft                                                G. Markham
Intended status: Standards Track                  The Mozilla Foundation
Expires: January 3, 2008                                    July 2, 2007


                           Link Fingerprints
                 draft-lee-uri-linkfingerprints-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 3, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Lee & Markham            Expires January 3, 2008                [Page 1]

Internet-Draft              Link Fingerprints                  July 2007

Abstract

   Link Fingerprints provides a backward-compatible technique for
   resource providers to ensure that the resource originally referenced
   is the same as the resource retrieved by an end user.  Changes are
   localized to the user agent retrieving the resource, which can
   automatically prevent the end user from accidentally using modified
   data while remaining transparent to the end user when the data is
   correct.

Table of Contents

   1.  Introduction
     1.1.  Goals
     1.2.  URI Fragment Identifier
     1.3.  Requirements Notation
   2.  Design
     2.1.  Syntax
     2.2.  Hash Types
     2.3.  URI To Check
     2.4.  Failure Cases
     2.5.  Implications Of Design Choices
   3.  Examples
   4.  Security Considerations
   5.  Normative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   Uniform Resource Identifiers (URI) [RFC2396] provide a simple way to
   identify resources, such as a binary file download linked from a web
   page.  However, there is no easy way for the link provider to ensure
   that the consumer of the link retrieves the resource that the
   provider intended.  This discrepancy is especially harmful when a
   malicious entity has replaced the intended resource.

   Link Fingerprints help alleviate this problem by allowing the link
   provider to additionally provide a compact representation, a
   fingerprint, of the resource to be retrieved.  After the resource is
   retrieved, the user agent can recompute the fingerprint from the
   received data and compare it with the one provided to detect a
   mismatch.  Hash functions, such as the Secure Hash Algorithm (SHA),
   are good candidates for this task because, given an original message
   and its corresponding digest, it is difficult to create a different
   message that results in the same digest.
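   As a non-normative illustration, the recompute-and-compare step
   described above could be sketched in Python as follows.  The function
   names "fingerprint" and "matches" and the sample data are assumptions
   made for this sketch; they are not defined by this document.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the lowercase hexadecimal SHA-256 digest of the data."""
    return hashlib.sha256(data).hexdigest()


def matches(data: bytes, expected_hex: str) -> bool:
    """Report whether the received data reproduces the expected digest."""
    return fingerprint(data) == expected_hex


# Hypothetical resource contents, used only for illustration.
original = b"contents of the intended resource"

# The link provider computes this value once and publishes it in the link.
expected = fingerprint(original)

# The user agent repeats the computation on whatever it actually received.
print(matches(original, expected))                 # unmodified data
print(matches(original + b" tampered", expected))  # modified data
```

   Any change to the received bytes yields a different digest, which is
   what lets the user agent detect a mismatch without contacting the
   link provider again.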
   A link provider uses Link Fingerprints for a particular resource by
   placing the fingerprint in the fragment identifier portion of the
   URI.  This way, the user agent retrieving the resource can
   automatically decide whether the transferred data has the same
   fingerprint.  Because the fragment identifier portion of a URI is not
   sent with a resource request, servers and networks do not even need
   to know about Link Fingerprints; only the end clients need to be
   modified.

1.1.  Goals

   Link Fingerprints is more convenient than existing manual file
   transfer verification tools because it was designed with the
   following goals in mind:

   o  Extended utilization.  Because the Link Fingerprint is specified
      in the fragment identifier of a URI, the user agent can
      automatically check that any resource, including a binary file
      download, was retrieved correctly.

   o  Backwards compatibility.  Existing user agents disregard the
      fragment identifier when downloading files, so using the fragment
      identifier is harmless.  If the user agent needs to display the
      content for a particular media type [RFC2046], it is able to
      degrade gracefully when it finds an unusable fragment identifier,
      which is optional to begin with.

   o  Localized client changes.  Link Fingerprints do not affect the
      servers and networks providing the resource, so the additional
      requirements of supporting Link Fingerprints are confined to the
      user agents that choose to implement them.

   o  Minimal user interaction.  When a resource is successfully
      retrieved without a fingerprint mismatch, the end user sees
      nothing different.  Only in the unlikely case of a Link
      Fingerprint failure would the user be notified of a problem.

   o  Familiar syntax.  Current uses of fragment identifiers, such as
      XPointer for XML, have a standardized, well-defined syntax with
      existing parsers.  Link Fingerprints reuse the basic syntax to
      facilitate implementations.

1.2.  URI Fragment Identifier

   The fragment identifier of a URI, as defined by URI: Generic Syntax
   [RFC2396], is an optional part used by the user agent to perform
   additional tasks after retrieving a resource.  It is the last part of
   a URI and is separated from the rest by a number sign ('#').  A
   common use is in text/html [RFC2854], where the user agent jumps to
   the element whose 'id' matches the fragment identifier.

   The fragment identifier is not included in the request for a
   resource because it is used only by the user agent.  This means the
   data received for the resource is the same as if the URI did not
   contain a fragment identifier.

1.3.  Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Design

   Link Fingerprints define a fragment identifier for all media types,
   so that every URI requested by a user agent can be checked to make
   sure the fingerprints match.  Any fragment identifier of the form
   "#hash()" is interpreted according to the definition of Link
   Fingerprints in the following sections.

2.1.  Syntax

   The syntax used for Link Fingerprints is similar to XPointer for XML,
   with "hash" as the name that identifies a Link Fingerprint.  The
   syntax allows multiple hash algorithms for future extensibility, but
   currently only a single algorithm is recommended, to ease
   implementation across user agents and to encourage consistent
   adoption and usage by content providers.

      LinkFingerprint ::= 'hash' '(' HashExpr ')'
      HashExpr        ::= 'sha256' ':' [0-9a-f]{64}
                        | HashType ':' HashData
      HashType        ::= [a-z0-9]+
      HashData        ::= [^)]*

   Each HashExpr begins with a HashType that identifies how the
   corresponding HashData is handled.  This specification defines a
   single HashType, sha256.

2.2.  Hash Types

   For the hash type "sha256", the user agent computes the SHA-256 hash
   of the data transferred for the resource and compares it to the
   HashData.  If they do not match, the Link Fingerprint is considered
   to be a failure case.

   Additional HashTypes can be defined at a later time, each using its
   HashType and HashData pair to determine when the Link Fingerprint is
   a failure case.

2.3.  URI To Check

   Various protocols, such as the Hypertext Transfer Protocol (HTTP),
   allow URIs to redirect to other URIs.  Link Fingerprints requires the
   user agent to check the fragment identifier of the original URI used
   to request the resource, so that a redirect cannot change the
   expected Link Fingerprint.  Because the fragment identifier is
   handled at the user agent, all the user agent needs to do is keep
   track of the original fragment identifier when requesting a URI that
   contains a Link Fingerprint.

2.4.  Failure Cases

   Any fragment identifier that begins with "hash(" and contains ")" is
   considered a Link Fingerprint.  A Link Fingerprint that does not
   match the defined syntax is a syntax error.  In this case, the user
   agent SHOULD fail the request early because the link provider
   intended the request to be checked.  Failing early also saves network
   resources, because any data received for the URI would be considered
   corrupt.

   A URI with a syntactically correct Link Fingerprint is requested, and
   the retrieved data is checked according to the HashType.  If the
   check results in a failure, the user agent SHOULD NOT make the data
   available to the end user.  This prevents the user from accidentally
   using data that has been tampered with, for example a file carrying
   a virus.  The user agent SHOULD inform the user that the content of
   the resource has been corrupted and that the provider of the URI
   should be notified.

2.5.  Implications Of Design Choices

   Because Link Fingerprints uses the fragment identifier of URIs, it is
   backwards compatible with user agents that do not know about the
   "hash()" syntax; they will not do anything with the fragment
   identifier.  The client-side change only requires handling of SHA-256
   hashes, which currently have no known weaknesses, and this should
   facilitate the adoption of Link Fingerprints.

   Strict failure handling should help link providers ensure that the
   correct syntax is used.  It also limits forward compatibility with
   new HashTypes, because older clients implementing a previous
   definition of Link Fingerprints would treat such a fragment
   identifier as a syntax error.  However, the provider of the link
   chose to use the optional fragment identifier and made the active
   decision that the URI was important enough to use Link Fingerprints,
   so user agents should honor that decision by failing early.

3.  Examples

   The following example shows how Link Fingerprints are used in URIs.
   Assume that "abc123" stands for the 64-character hash required by
   SHA-256.

      http://example.com/file.exe#hash(sha256:abc123)

   If computing the SHA-256 hash of the data transferred for "file.exe"
   results in "abc123", the data is made available to the user as if
   there were no Link Fingerprint.  If the computed hash does not match,
   the user agent informs the user of the problem.

   The following are some example uses within an HTML document.

      Download

4.  Security Considerations

   Processing of the text of the Link Fingerprints fragment identifier
   needs to be done carefully to avoid buffer overflow security
   problems.  Specialized string processing might be used to optimize
   parsing of Link Fingerprints, so implementations need to make sure
   the strings are handled correctly.
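   As a non-normative illustration of bounded, defensive handling, the
   following Python sketch classifies a fragment identifier using the
   grammar in Section 2.1 and the rules in Section 2.4.  The names
   "parse_fragment" and "LinkFingerprintSyntaxError" and the length
   limit are assumptions made for this sketch.  Treating an unknown
   HashType, or a "sha256" value that is not 64 lowercase hexadecimal
   digits, as a syntax error is one reading of Sections 2.1 and 2.5,
   not a requirement stated by this document.

```python
import re

# Assumed upper bound on fragment length; this document defines no limit.
MAX_FRAGMENT_LENGTH = 256

# Transcription of the Section 2.1 grammar:
#   'hash' '(' HashType ':' HashData ')'
LINK_FINGERPRINT = re.compile(r"hash\(([a-z0-9]+):([^)]*)\)")
SHA256_DATA = re.compile(r"[0-9a-f]{64}")


class LinkFingerprintSyntaxError(ValueError):
    """The fragment is a Link Fingerprint but does not match the syntax."""


def parse_fragment(fragment: str):
    """Return the expected SHA-256 hex digest, or None when the fragment
    is not a Link Fingerprint at all."""
    # Section 2.4: only fragments that begin with "hash(" and contain ")"
    # are considered Link Fingerprints.
    if not fragment.startswith("hash(") or ")" not in fragment:
        return None
    # Reject oversized input before any further processing.
    if len(fragment) > MAX_FRAGMENT_LENGTH:
        raise LinkFingerprintSyntaxError("fragment too long")
    match = LINK_FINGERPRINT.fullmatch(fragment)
    if match is None:
        raise LinkFingerprintSyntaxError("malformed Link Fingerprint")
    hash_type, hash_data = match.groups()
    if hash_type != "sha256":
        raise LinkFingerprintSyntaxError("unknown HashType: " + hash_type)
    if SHA256_DATA.fullmatch(hash_data) is None:
        raise LinkFingerprintSyntaxError("sha256 needs 64 hex digits")
    return hash_data


# Usage with a placeholder digest of the right length.
print(parse_fragment("hash(sha256:" + "ab" * 32 + ")"))
print(parse_fragment("section-2"))
```

   Returning None for ordinary fragment identifiers keeps their existing
   behavior unchanged, while raising an error for malformed Link
   Fingerprints lets the caller fail the request early.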
   Even existing string processing tools, such as regular expression
   libraries, can have buffer overflow security holes, so
   implementations need to make sure that their use with Link
   Fingerprints does not trigger those problem cases.

5.  Normative References

   [RFC2046]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part Two: Media Types", RFC 2046,
              November 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2396]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifiers (URI): Generic Syntax", RFC 2396,
              August 1998.

   [RFC2854]  Connolly, D. and L. Masinter, "The 'text/html' Media
              Type", RFC 2854, June 2000.

Authors' Addresses

   Edward S. Lee
   The Mozilla Foundation
   1981 Landings Drive
   Building K
   Mountain View, CA  94043
   USA

   Email: edilee@mozilla.com


   Gervase Markham
   The Mozilla Foundation

   Email: gerv@mozilla.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).