Internet DRAFT - draft-lee-uri-linkfingerprints


Network Working Group                                             E. Lee
Internet-Draft                                                G. Markham
Intended status: Standards Track                  The Mozilla Foundation
Expires: January 3, 2008                                    July 2, 2007

                           Link Fingerprints

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at

   The list of Internet-Draft Shadow Directories can be accessed at

   This Internet-Draft will expire on January 3, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Lee & Markham            Expires January 3, 2008                [Page 1]
Internet-Draft              Link Fingerprints                  July 2007


Abstract

   Link Fingerprints provides a backward-compatible technique for
   resource providers to ensure that the resource originally referenced
   is the same as the resource retrieved by an end user.  Changes are
   localized to the user agent retrieving the resource, and this can
   automatically prevent the end user from accidentally using unintended
   modified data while being transparent to the end user if the data is
   correct.

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1.  Goals  . . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.2.  URI Fragment Identifier  . . . . . . . . . . . . . . . . .  4
     1.3.  Requirements Notation  . . . . . . . . . . . . . . . . . .  4
   2.  Design . . . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     2.1.  Syntax . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     2.2.  Hash Types . . . . . . . . . . . . . . . . . . . . . . . .  5
     2.3.  URI To Check . . . . . . . . . . . . . . . . . . . . . . .  5
     2.4.  Failure Cases  . . . . . . . . . . . . . . . . . . . . . .  6
     2.5.  Implications Of Design Choices . . . . . . . . . . . . . .  6
   3.  Examples . . . . . . . . . . . . . . . . . . . . . . . . . . .  7
   4.  Security Considerations  . . . . . . . . . . . . . . . . . . .  8
   5.  Normative References . . . . . . . . . . . . . . . . . . . . .  9
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 10
   Intellectual Property and Copyright Statements . . . . . . . . . . 11

1.  Introduction

   Uniform Resource Identifiers (URI) [RFC2396] provide a simple way to
   identify resources - such as a web page linking to a binary file
   download.  However, there is no easy way for the link provider
   to ensure that the consumer of the link retrieves the resource that
   the provider intended the receiver to find when following the link.
   This discrepancy is especially bad in situations where the intended
   resource is replaced by a malicious entity.

   Link Fingerprints help alleviate this problem by allowing the link
   provider to additionally provide a compact representation - a
   fingerprint - of the resource to be retrieved.  After the resource is
   retrieved, the user agent can recompute the fingerprint from the
   received data and check whether it matches the one supplied.  Hash
   functions, such as Secure Hash Algorithm (SHA), are good candidates
   for this task because given an original message with its
   corresponding digest, it is difficult to create a new message that
   results in the same digest.
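
   As a non-normative illustration of the fingerprint comparison
   described above, the following Python sketch computes a SHA-256
   digest with the standard hashlib module.  The function name
   "fingerprint" and the sample byte strings are assumptions made for
   this example and are not part of this specification.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the lowercase hexadecimal SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


# Two inputs that differ in a single byte (hypothetical sample data).
original = fingerprint(b"example resource data")
modified = fingerprint(b"example resource datA")
```

   Because the two inputs differ, the two digests differ as well, which
   is how a user agent would notice that a resource has been modified.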

   A link provider uses Link Fingerprints for a particular resource by
   placing the fingerprint in the fragment identifier portion of the
   URI.  This way, the user agent retrieving the resource can
   automatically decide if the transferred data has the same
   fingerprint.  Because the fragment identifier portion of a URI is not
   sent on a resource request, servers and networks do not even need to
   know about Link Fingerprints; only the end clients need to be
   updated.

1.1.  Goals

   Link Fingerprints is more convenient than existing manual file
   transfer verification tools because it was designed with several
   goals in mind:

   o  Extended utilization.  By specifying the Link Fingerprints in the
      fragment identifier of a URI, all resources, including binary file
      downloads, can have the user agent automatically check if they are
      retrieved correctly.

   o  Backwards compatibility.  Existing user agents disregard the
      fragment identifier when downloading files, so using the fragment
      identifier is harmless.  If the user agent needs to display the
      content for a particular media type [RFC2046], it is able to
      degrade gracefully when it finds an unusable fragment identifier,
      which is optional to begin with.

   o  Localized client changes.  Link Fingerprints do not affect the
      servers and networks providing the resource, so the additional
      requirements of supporting Link Fingerprints are confined to the
      user agents that choose to implement it.

   o  Minimal user-interaction.  When a resource is successfully
      retrieved without a fingerprint mismatch, the end user sees
      nothing different.  Only in the unlikely case of a Link
      Fingerprint failure would the user be notified of a problem.

   o  Familiar syntax.  Current uses of fragment identifiers such as
      XPointer for XML have standardized syntax that is well defined
      with existing parsers.  Link Fingerprints reuse the basic syntax
      to facilitate implementations.

1.2.  URI Fragment Identifier

   The fragment identifier of a URI as defined by the URI: Generic
   Syntax [RFC2396] is an optional part used by the user agent to
   perform additional tasks after retrieving a resource.  It is the last
   part of a URI and is separated with a number sign ('#').  One
   common use is in text/html [RFC2854], where the user agent jumps to
   the element whose 'id' matches the fragment identifier.

   When retrieving a resource, the fragment identifier is not included
   because it is only used by the user agent.  This means the received
   data of the resource is the same as if the URI did not contain a
   fragment identifier.
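
   The separation between the requested URI and the fragment identifier
   can be sketched in Python with urllib.parse.urldefrag from the
   standard library.  The link below is hypothetical, and "abc123"
   stands in for a real 64 character digest.

```python
from urllib.parse import urldefrag

# Hypothetical link; "abc123" is a placeholder for a real digest.
link = "http://example.org/file.exe#hash(sha256:abc123)"

# urldefrag splits the URI at the number sign.
request_uri, fragment = urldefrag(link)

# request_uri is what gets requested from the server.
# fragment never leaves the user agent.
```

   Only request_uri is used for retrieval, so a server sees the same
   request whether or not the link carries a fragment identifier.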

1.3.  Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Design

   Link Fingerprints define a fragment identifier for all media types,
   so that all requests of URIs made by a user agent can be checked to
   make sure the fingerprints match up.  Any fragment identifier of the
   form #hash() is interpreted according to the definition of Link
   Fingerprints in the following sections.

2.1.  Syntax

   The syntax used for Link Fingerprints is similar to XPointer for XML
   with "hash" as the name to identify Link Fingerprints.  It provides a
   way to have multiple hash algorithms for future extensibility, but
   currently, only a single algorithm is recommended for easier
   implementation across user agents as well as consistent adoption and
   usage by content providers.

   LinkFingerprint ::=  'hash' '(' HashExpr ')'
   HashExpr        ::=  'sha256' ':' [0-9a-f]{64}
                        | HashType ':' HashData
   HashType        ::=  [a-z0-9]+
   HashData        ::=  [^)]*

   Each HashExpr begins with a HashType that identifies how the
   corresponding HashData is handled.  This specification defines a
   single HashType, sha256.
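
   One possible reading of the grammar above, sketched in Python with
   the standard re module.  This sketch assumes that the whole fragment
   identifier must match LinkFingerprint and that a "sha256" HashType
   is only valid with exactly 64 lowercase hexadecimal digits.  The
   function name parse_link_fingerprint is hypothetical.

```python
import re
from typing import Optional, Tuple

# LinkFingerprint ::= 'hash' '(' HashType ':' HashData ')'
_LINK_FINGERPRINT = re.compile(r"hash\(([a-z0-9]+):([^)]*)\)")
# HashData required by the sha256 production.
_SHA256_DATA = re.compile(r"[0-9a-f]{64}")


def parse_link_fingerprint(fragment: str) -> Optional[Tuple[str, str]]:
    """Return (HashType, HashData), or None if the syntax does not match."""
    match = _LINK_FINGERPRINT.fullmatch(fragment)
    if match is None:
        return None
    hash_type, hash_data = match.groups()
    # Assumption: sha256 with anything but 64 hex digits is rejected.
    if hash_type == "sha256" and _SHA256_DATA.fullmatch(hash_data) is None:
        return None
    return hash_type, hash_data
```

   A fragment such as "hash(sha256:abc123)" is rejected by this sketch
   because "abc123" is not 64 characters long, while a fragment with an
   unrecognized HashType is still parsed so that a caller can decide
   how to handle it.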

2.2.  Hash Types

   For the hash type "sha256", the user agent computes the SHA-256 hash
   of the data transferred for the resource and compares it to the
   HashData.  If they do not match up, the Link Fingerprint is
   considered to be a failure case.

   Additional HashTypes can be defined at a later time, each using its
   HashType and HashData pair to determine when a Link Fingerprint is
   a failure case.
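
   The per-HashType check can be sketched in Python as a lookup table
   from HashType to a checking function.  The names CHECKERS and
   fingerprint_matches are hypothetical, and treating an unknown
   HashType as a failure is an assumption of this sketch.

```python
import hashlib
from typing import Callable, Dict


def _check_sha256(data: bytes, hash_data: str) -> bool:
    """Compare the SHA-256 digest of data with the supplied HashData."""
    return hashlib.sha256(data).hexdigest() == hash_data


# Future HashTypes could be registered here alongside sha256.
CHECKERS: Dict[str, Callable[[bytes, str], bool]] = {
    "sha256": _check_sha256,
}


def fingerprint_matches(hash_type: str, hash_data: str, data: bytes) -> bool:
    """Return True only if data passes the check for hash_type."""
    checker = CHECKERS.get(hash_type)
    if checker is None:
        # Assumption: an unknown HashType cannot be verified, so fail.
        return False
    return checker(data, hash_data)
```

   Keeping the checks in a table means a later HashType only adds one
   entry and leaves the calling code unchanged.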

2.3.  URI To Check

   Various protocols, such as the Hypertext Transfer Protocol (HTTP),
   allow URIs to redirect to other URIs.  Link Fingerprints requires
   checking the fragment identifier of the original URI used to request
   the resource, so that a redirect cannot change the expected Link
   Fingerprint.  Because the fragment
   identifier is handled at the user agent, all it needs to do is keep
   track of the original fragment identifier when requesting a URI that
   contains a Link Fingerprint.
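
   A minimal Python sketch of this bookkeeping follows.  Redirects are
   simulated with a dictionary instead of real network requests; the
   function name, the "redirects" mapping, the hop limit, and the
   example URIs are all assumptions made for illustration.

```python
from typing import Dict, Tuple
from urllib.parse import urldefrag


def resolve_with_original_fragment(
    uri: str, redirects: Dict[str, str], max_hops: int = 10
) -> Tuple[str, str]:
    """Follow simulated redirects, keeping the fragment of the original uri.

    redirects maps a requested URI to its redirect target and stands in
    for server responses.  At most max_hops redirects are followed.
    """
    current, original_fragment = urldefrag(uri)
    for _ in range(max_hops):
        target = redirects.get(current)
        if target is None:
            break
        # Discard any fragment on the redirect target so it cannot
        # replace the fingerprint chosen by the link provider.
        current, _ignored = urldefrag(target)
    return current, original_fragment
```

   The returned fragment is always the one from the link the user
   followed, regardless of what the redirect targets contain.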

2.4.  Failure Cases

   Any fragment identifier that begins with "hash(" and contains ")" is
   considered a Link Fingerprint, and those that do not match the
   defined syntax result in a syntax error.  In these cases, the user
   agent SHOULD fail the request early, because the link provider
   intended the request to be checked.  Failing early also saves
   network resources, because any data received for the URI would be
   considered corrupt.

   URIs with correct Link Fingerprints syntax are requested and checked
   according to their HashType.  If the check results in a failure,
   the user agent SHOULD NOT make the data available to the end user.
   This action is to prevent the user from accidentally using data that
   has been tampered with, such as a virus.  The user agent SHOULD
   inform the user that the content of the resource has been corrupted
   and that the provider of the URI should be notified.
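
   The failure handling described in this section can be sketched in
   Python as follows.  The Outcome names, the evaluate function, and
   the "fetch" callable (a stand-in for the actual retrieval) are
   hypothetical.  The sketch supports only the sha256 HashType and
   treats any other HashType as a syntax error.

```python
import hashlib
import re
from enum import Enum
from typing import Callable, Optional, Tuple


class Outcome(Enum):
    NOT_A_FINGERPRINT = "not a fingerprint"
    SYNTAX_ERROR = "syntax error"
    MISMATCH = "mismatch"
    OK = "ok"


_SHA256_EXPR = re.compile(r"hash\(sha256:([0-9a-f]{64})\)")


def evaluate(
    fragment: str, fetch: Callable[[], bytes]
) -> Tuple[Outcome, Optional[bytes]]:
    """Return the outcome and the data to expose to the user, if any."""
    if not (fragment.startswith("hash(") and ")" in fragment):
        # Ordinary fragment identifier: retrieve and use as usual.
        return Outcome.NOT_A_FINGERPRINT, fetch()
    match = _SHA256_EXPR.fullmatch(fragment)
    if match is None:
        # Fail early: fetch is never called for a malformed fingerprint.
        return Outcome.SYNTAX_ERROR, None
    data = fetch()
    if hashlib.sha256(data).hexdigest() != match.group(1):
        # Withhold the data; the caller should inform the user.
        return Outcome.MISMATCH, None
    return Outcome.OK, data
```

   In the two failure outcomes no data is returned, which mirrors the
   recommendation that the user agent not make the data available.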

2.5.  Implications Of Design Choices

   Because Link Fingerprints uses the fragment identifier of URIs, it is
   backwards compatible with user agents that do not know about the
   "hash()" syntax because they will not do anything with the fragment
   identifier.  This client-side change only requires handling of
   SHA-256 hashes, which have no known practical weaknesses as of this
   writing, and this should facilitate the adoption of Link
   Fingerprints.

   The strict failing of Link Fingerprints should help link providers to
   ensure that the correct syntax is used.  This affects the forwards
   compatibility of new HashTypes because older clients implementing a
   previous definition of Link Fingerprints would consider the fragment
   identifier as a syntax error.  However, the provider of the link
   chose to use the optional fragment identifier and made the active
   decision that the URI was important enough to use Link Fingerprints,
   so user agents should honor that by failing early.

3.  Examples

   The following example shows how Link Fingerprints are used in URIs.
   Assume that "abc123" is actually the 64 character hash required by
   sha256.

   If the transferred data for "file.exe" results in "abc123" after
   computing the SHA-256 hash, the data is made available to the user as
   if there were no Link Fingerprints.  If the computed hash does not
   match up, the user agent informs the user of the problem.
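
   As a non-normative sketch of both sides of this exchange, the
   following Python code builds a fingerprinted link for a resource and
   then checks received data against it.  The URI, the sample bytes,
   and the function names add_link_fingerprint and matches are
   assumptions made for this example.

```python
import hashlib


def add_link_fingerprint(uri: str, data: bytes) -> str:
    """Append a sha256 Link Fingerprint for data to a fragment-less uri."""
    if "#" in uri:
        raise ValueError("URI already carries a fragment identifier")
    digest = hashlib.sha256(data).hexdigest()
    return f"{uri}#hash(sha256:{digest})"


def matches(link: str, received: bytes) -> bool:
    """Recompute the digest of received data and compare it to the link."""
    expected = link.rsplit("#hash(sha256:", 1)[-1].rstrip(")")
    return hashlib.sha256(received).hexdigest() == expected


file_data = b"contents of file.exe"  # hypothetical resource bytes
link = add_link_fingerprint("http://example.org/file.exe", file_data)
```

   Here matches(link, file_data) succeeds, while any altered byte
   string fails the comparison and would lead the user agent to report
   the problem.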

   The following are some example uses within an HTML document.

   <a href="">Download</a>

   <img src=""/>

4.  Security Considerations

   Processing of the text for the Link Fingerprints fragment identifier
   needs to be done carefully to avoid buffer overflow security
   problems.  Specialized string processing for Link Fingerprints might
   be used for optimized parsing, so implementations need to make sure
   the strings are handled correctly.  Even existing string processing
   tools, such as regular expression libraries, have had buffer
   overflow vulnerabilities, so implementations need to make sure that
   their use with Link Fingerprints does not trigger those problem
   cases.

5.  Normative References

   [RFC2046]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part Two: Media Types", RFC 2046,
              November 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2396]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifiers (URI): Generic Syntax", RFC 2396,
              August 1998.

   [RFC2854]  Connolly, D. and L. Masinter, "The 'text/html' Media
              Type", RFC 2854, June 2000.

Authors' Addresses

   Edward S. Lee
   The Mozilla Foundation
   1981 Landings Drive Building K
   Mountain View, CA  94043


   Gervase Markham
   The Mozilla Foundation


Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at


   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).
