Network Working Group                                        J. Peterson
Internet-Draft                                                   NeuStar
Expires: April 18, 2005                                 October 18, 2004


  Security Considerations for Impersonation and Identity in Messaging
                                Systems
                   draft-peterson-message-identity-00

Status of this Memo

   By submitting this Internet-Draft, I certify that any applicable
   patent or other IPR claims of which I am aware have been disclosed,
   and any of which I become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 18, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2004).  All Rights Reserved.

Abstract

   This document provides an overview of the concept of identity in
   Internet messaging systems as a means of preventing impersonation.
   It describes the architectural roles necessary to provide identity,
   and details some approaches to the generation of identity assertions
   and the transmission of such assertions within messages.  The
   trade-offs of various design decisions are explained.


Peterson                 Expires April 18, 2005                 [Page 1]

Internet-Draft              Message Identity                October 2004


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1   Terminology  . . . . . . . . . . . . . . . . . . . . . . .  4
   2.  What is Identity?  . . . . . . . . . . . . . . . . . . . . . .  5
   3.  Roles in an Identity System  . . . . . . . . . . . . . . . . .  6
     3.1   Identity provider  . . . . . . . . . . . . . . . . . . . .  6
     3.2   Verifier . . . . . . . . . . . . . . . . . . . . . . . . .  8
   4.  Threat Model of Impersonation in Messaging Systems . . . . . .  8
   5.  Identity Assertions  . . . . . . . . . . . . . . . . . . . . . 10
   6.  Keying for Assertions  . . . . . . . . . . . . . . . . . . . . 11
     6.1   Asymmetric Keys  . . . . . . . . . . . . . . . . . . . . . 11
       6.1.1   Certificates . . . . . . . . . . . . . . . . . . . . . 12
       6.1.2   Uncertified Public Keys  . . . . . . . . . . . . . . . 13
     6.2   Symmetric Keys . . . . . . . . . . . . . . . . . . . . . . 15
   7.  User-based and Domain-based Assertions . . . . . . . . . . . . 15
     7.1   Name Subordination . . . . . . . . . . . . . . . . . . . . 17
   8.  Reference Indicators and Replay Protection . . . . . . . . . . 18
     8.1   Canonicalization versus Replication  . . . . . . . . . . . 19
     8.2   Assertion Constraints and Scope  . . . . . . . . . . . . . 21
   9.  Placement of Assertions and Keys in Messages . . . . . . . . . 25
     9.1   Assertions in the Envelope . . . . . . . . . . . . . . . . 26
     9.2   Assertions in the Content  . . . . . . . . . . . . . . . . 27
     9.3   Distributing Keys by-Reference or by-Value . . . . . . . . 28
     9.4   Distributing Assertions by-Reference . . . . . . . . . . . 31
   10.   Privacy and Anonymity  . . . . . . . . . . . . . . . . . . . 31
   11.   Conclusion: Consensus Points and Questions . . . . . . . . . 32
   12.   Security Considerations  . . . . . . . . . . . . . . . . . . 34
   13.   IANA Considerations  . . . . . . . . . . . . . . . . . . . . 34
   14.   Informative References . . . . . . . . . . . . . . . . . . . 34
       Author's Address . . . . . . . . . . . . . . . . . . . . . . . 36
   A.  Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . 36
   B.  Verification Assertions  . . . . . . . . . . . . . . . . . . . 37
   C.  Messaging: Real-Time versus Store-and-Forward  . . . . . . . . 37
   D.  Third-Party Assertions . . . . . . . . . . . . . . . . . . . . 38
   E.  Alternatives to Identity Assertions  . . . . . . . . . . . . . 39
     E.1   Trusted Intermediary Networks  . . . . . . . . . . . . . . 39
     E.2   Dial-back Identity . . . . . . . . . . . . . . . . . . . . 40
       Intellectual Property and Copyright Statements . . . . . . . . 42

1.  Introduction

   Widespread forgery of the From header field of email [5] messages is
   the most immediate motivation for work on message identity systems.
   However, there are numerous other messaging systems used on the
   Internet that currently confront similar problems, or are likely to
   confront these problems in the future; notably instant messaging
   systems and other real-time communications systems that leverage a
   messaging architecture as a rendez-vous protocol for session
   establishment.  All of these systems suffer from a similar threat of
   impersonation (as described in Section 4).  Messaging identity
   mechanisms, as defined in this document, address specifically the
   threat of impersonation in messaging systems.

   It is unlikely that the diverse identity requirements of these
   various messaging systems will admit of any single solution that
   could be deployed for all such protocols.  However, there is much to
   be gained by considering the broad body of work on the messaging
   identity problem that has already been done across this wide
   selection of protocols.  The core commonalities of these systems
   permit a high-level analysis of the message identity problem that
   could assist all messaging protocols in selecting an appropriate way
   of incorporating identity.

   This document aspires to apply to messaging systems with the
   following architectural qualities:
   o  The messaging system has two kinds of agents: originators and
      recipients.  Both originators and recipients interact with the
      system through endpoints.  Messages sent from endpoints may pass
      through multiple intermediaries before arriving at the recipient.
      For the purposes of this document, reflectors and similar services
      are lumped in with intermediaries, even if from a protocol
      perspective they act more like endpoints.
   o  The messaging system employs names that are constituted of a
      'host' portion, which is a DNS [6] name (allocated through the
      delegative administration of the DNS) and a 'user' portion which
      is administered by the domain indicated in the 'host' portion.
   o  The messaging system carries messages that are divided into two
      major components: envelope and contents.  The distinction between
      the two is inexact, but primarily the content is intended to be
      rendered by the recipient's application, whereas much of the
      envelope contains addressing and routing data that is used by
      intermediaries.  [In deference to the email community, 'envelope'
      here should be understood to encompass both the envelope and
      header portions of a message.]
   o  The messaging system is used in an interdomain context.  Different
      administrative domains may deploy messaging intermediaries and
      issue names to valid local users.  Administrative domains need to
      be capable of exchanging messages with one another if they have no
      previous association.
   o  The messaging system is capable of 'retargeting' a message in
      transit, and delivering it to a recipient whose name in the system
      is not identical to that of the intended recipient specified by
      the originator.  Primarily, this arises when an intermediary
      forwards a message to multiple recipients (in which case the
      resource designated as the intended recipient is some sort of
      reflector).

   This document is based on the author's experience developing identity
   solutions for the Session Initiation Protocol (SIP, [11]) and on
   consideration of several proposals circulating to provide similar
   features in email.

   The scope of this document is limited to the generation, carriage,
   and consumption of identity assertions.  It does not consider any
   authorization decisions that might be made, on the basis of the
   identity of the originator, by the consumer of identity assertions.

   This document is organized as follows: Section 2 attempts to define
   identity, and to demonstrate broadly the current manner in which
   identity is communicated in messaging systems.  Section 3 describes
   the abstract roles that must be instantiated in a system in order to
   incorporate identity assertions into a messaging architecture: the
   identity provider and the verifier.  The threat model for
   impersonation in messaging systems is considered in Section 4.
   Section 5 defines an identity assertion, and explains the manner in
   which cryptography can be leveraged to generate assertions.  Section
   6 provides an overview of keying and key distribution architectures
   that provide a foundation for sharing cryptographic assertions.
   Section 7 compares the traditional concept of user-based assertions
   with the newer, and perhaps more promising, idea of domain-based
   assertions.  Section 8 considers the internal composition of an
   identity assertion, and the elements in a message which the assertion
   must guarantee in order to be correlated with a message.  Section 9
   considers various ways that an assertion might be added to a message.
   Section 10 considers the privacy and anonymity implications of adding
   identity assertions to messages.  Section 11 attempts to pose the key
   questions that should determine how a messaging protocol approaches
   the incorporation of an identity mechanism, and to note when this
   high-level analysis has revealed any general principles that point
   one way or another on these questions.  Various appendices discuss
   related material that is not directly in the scope of the primary
   analysis.

1.1  Terminology

   This document intentionally uses core terminology that is neutral to
   existing messaging protocols.  Terminology specific to email is taken
   from [21].

2.  What is Identity?

   Every communications system has a namespace.  For example, the
   telephone network uses telephone numbers as a namespace, the postal
   system uses postal addresses as a namespace, and the Internet
   Protocol uses Internet Protocol addresses as a namespace.  In order
   for a name to be usable, it must meet the syntactical constraints of
   the namespace, and it must be unique within the namespace.
   Accordingly, namespaces generally require significant centralization
   of administration, though in many cases, delegation can distribute
   this work across multiple distinct authorities.  In the context of a
   particular communications system, the semantics of these names
   enables the system to route communications to the appropriate
   resources.

   In the most common messaging system on the Internet today, email, the
   namespace is founded on the Internet Domain Name System (DNS [6]).
   Names (in RFC2822 [5] terms, the 'addr-spec') are constituted of a
   'host' portion, which is a DNS name (allocated through the delegative
   administration of the DNS) and a 'user' or 'local-part' portion which
   is administered by the domain indicated in the 'host' portion, and
   which designates a particular resource or user in the domain.  As the
   message transfer service delivers the message, the host portion of
   the destination email address is resolved in the DNS (though
   practically, a message may pass through many intermediary
   administrative domains before reaching its destination).  Aside from
   email, many other Internet messaging systems have constructed
   namespaces with the same components: a domain name host portion and a
   domain-specific user portion.
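
   As an illustration of this two-part name structure, the following is
   a minimal sketch in Python.  The names MessagingName and parse_name
   are hypothetical and do not come from any specification; the parser
   is deliberately simplified and does not implement the full RFC2822
   addr-spec grammar.

```python
from typing import NamedTuple


class MessagingName(NamedTuple):
    """A name of the form user@host (illustrative type, not from any spec)."""
    user: str  # administered by the domain named in `host`
    host: str  # a DNS name


def parse_name(name: str) -> MessagingName:
    # Split on the last '@' so that a local-part containing '@'
    # does not confuse this simplified parser.
    user, sep, host = name.rpartition("@")
    if not sep or not user or not host:
        raise ValueError(f"not a user@host name: {name!r}")
    # DNS names compare case-insensitively; the user portion is left
    # untouched because its interpretation belongs to the domain.
    return MessagingName(user=user, host=host.lower())
```

   For example, parse_name("joe@Example.COM") yields the user portion
   "joe" and the host portion "example.com".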

   When a message is delivered to its recipient, the recipient has a
   strong interest in knowing who the message is from.  While the
   contents of a message may be sufficient to identify the originator to
   the recipient, it may also happen that:
   o  the contents of the message do not identify the originator
   o  the contents of the message fabricate the identity of the
      originator
   o  the recipient does not wish to read the contents of the message
      without first identifying the originator

   Most protocols therefore provide a field which designates the
   originator of a communication.  Generally, the originator is
   identified by their name in the communication system.  For example,
   in the postal network, the originator is identified by their return
   address; by convention, the return address of the originator appears
   on the outside of an envelope.  In caller identification systems used
   in the telephone network, the telephone number of the caller is
   displayed to the callee.  In email systems, user agents render the
   contents of the RFC2822.From header field of an email message as the
   originator.

   Nothing forces the originator of a postal message to supply a genuine
   return address on an envelope; originators are incented to provide a
   genuine return address only if they want the envelope to be returned
   to them if it cannot be delivered.  Similarly, the RFC2822.From
   header field of an email message can be populated arbitrarily by the
   originator (though it is not necessarily the address to which bounces
   are sent).  Malicious originators may want to provide a misleading or
   false return address for their messages, or to withhold a return
   address altogether, in order to escape reports of abuse or to mislead
   the recipient about the origins of the message.  While there are
   valid cases where anonymous communication is necessary, impersonation
   can be very problematic.

   For the purposes of this document, 'identity' refers to mechanisms
   that provide an assurance of the originator of a message.  An
   identity assurance is provided by a party in the messaging
   architecture that can prove its authority over a segment of the
   namespace.  For the identity systems considered in this document,
   that may entail proof of authority over DNS names, or it may also be
   authority specific to a particular user within a domain.  This
   assurance is communicated along with the message, and can be verified
   by recipients of the message.

3.  Roles in an Identity System

   This document postulates two fundamental roles in a messaging
   identity architecture: an identity provider and a verifier.  These
   roles might usefully be instantiated by any elements in a messaging
   architecture.  Most commonly, an originator or a proxy for the
   originator of a message will act as an identity provider, and the
   recipient or a proxy for the recipient will act as a verifier.
   However, this is far from the only valid assignment of these roles.
   There are even useful architectures where it is meaningful for the
   originator to act both as the identity provider and the verifier
   (for example, where token-based assertions are used to authenticate
   network reports of undeliverable messages).


3.1  Identity provider

   The role of the identity provider in an identity architecture is to
   generate an identity assertion.  An identity assertion is a chunk of
   information added to a message which can later be verified to assure
   the identity of the originator.

   An identity provider must be capable of authenticating the originator
   of the message.  The messaging architecture of the system in
   question, and the entity that plays the role of the identity
   provider, will largely determine how this authentication takes place.
   If the identity provider is instantiated by the endpoint of the
   originator, for example, this authentication might be tacitly
   assumed, or occur in some application-specific way.  If the identity
   provider is built into an intermediary, some network authentication
   mechanism must be used by the identity provider to ascertain the
   identity of the originator.

   An identity provider must have some verifiable authority over a
   segment of the namespace of this messaging system; that is, it must
   be capable of proving to verifiers that it is the appropriate entity
   to identify the originator of a particular message.  This proof of
   authority can come in many forms, depending on the type of assertion
   that the identity provider generates.

   Once the originator has been authenticated, the identity provider
   must furthermore determine whether or not the originator is
   authorized to send the message in question; this practice is most
   relevant to cases in which the identity provider role is instantiated
   by an intermediary, since in those cases where the originator's
   endpoint instantiates the identity provider, the originator itself
   has authority over the relevant segment of the namespace.  When it is
   necessary, this authorization decision may be based on a number of
   factors; for our purposes, the most important is the identity claimed
   by the originator of the message.  An originator may be authorized to
   claim one identity, or any of a number of identities, in accordance
   with the policy of the controller of the namespace containing the
   identity.  Identity providers only provide identity assertions for
   messages in which the originator claims an authorized identity.
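
   The authorization step described above can be sketched as a simple
   policy lookup.  This is a minimal illustration in Python, assuming
   that the originator has already been authenticated; the table
   AUTHORIZED_IDENTITIES, the account labels, and the function
   may_assert are all hypothetical.

```python
# Hypothetical policy table: authenticated account -> identities that the
# controller of the namespace allows that account to claim.
AUTHORIZED_IDENTITIES = {
    "alice-account": {"alice@example.com", "sales@example.com"},
    "bob-account": {"bob@example.com"},
}


def may_assert(authenticated_account: str, claimed_identity: str) -> bool:
    """Return True if the provider should issue an assertion for this claim."""
    # Unknown accounts are authorized for no identities at all.
    allowed = AUTHORIZED_IDENTITIES.get(authenticated_account, set())
    return claimed_identity in allowed
```

   Under this sketch, an identity provider would generate an assertion
   only when may_assert returns True for the identity claimed in the
   message.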

   Ideally, an identity provider will be the last entity in the
   architecture that will modify the message in transit.  An assertion
   will create a signature over certain elements of the message, and if
   the message is subsequently modified, it may invalidate this
   signature.  The severity of this condition is entirely dependent on
   the nature of the assertion, and on the elements of the message which
   are guaranteed by the assertion.  In practice, most messaging systems
   modify messages in some fashion throughout their transit of the
   network, and subsequent modification after the generation of an
   identity assertion is most likely unavoidable in any practical
   deployment.

   An identity provider must be capable of modifying a message, or
   forcing another entity in the architecture to modify the message in a
   particular way, in order to incorporate the identity assertion.
   Commonly, creating an identity assertion involves the use of
   cryptography, and accordingly, generating identity assertions may
   slow message creation or processing in the identity provider.

3.2  Verifier

   A verifier consumes an identity assertion in order to verify the
   identity of the originator of a message.  After inspecting an
   identity assertion, a verifier may make an authorization decision to
   act on the message in any of a number of ways.  Authorization
   decisions made by verifiers are outside the scope of this document.

   In order to perform its function, a verifier must be capable of
   reading the identity assertion in a message.  Depending on the
   placement of
   the assertion in the message, and the underlying architecture of the
   messaging system, this may limit the entities that can instantiate
   the verifier role.

   It is possible that more than one verifier will inspect the same
   assertion in a message.  In some architectures, it may make sense for
   one or more intermediaries to act as verifiers before a message
   reaches its recipient, which may also act as a verifier.
   Alternatively, an intermediary could reflect a message to a
   potentially large list of recipients, in which case each recipient
   (and/or intermediaries acting on their behalf) might act as a
   verifier.  In other architectures, an intermediary acting as a
   verifier might strip the identity assertion before forwarding the
   message; in such cases, the intermediary might replace the identity
   assertion with a verification assertion (see Appendix B).
   Verification assertions can also be added without stripping identity
   assertions.

   Commonly, the verification of an identity assertion involves the use
   of cryptography, and accordingly, verifying identity assertions may
   slow message processing in the verifier.

4.  Threat Model of Impersonation in Messaging Systems

   Impersonation is the practice of falsifying the elements of a message
   that indicate its originator.  This is generally done in order to
   mislead a recipient about the origins of the message.

   The most common adversary in impersonation threats is a passive
   attacker.  A passive attacker can capture email messages in some way:
   they may see messages in transit, they may see archives of messages
   on the web, or they might even be a recipient of a message.  By
   capturing messages, the impersonator learns how a genuine originator
   structures their messages, including the manner in which elements of
   the message that indicate the originator are populated.  The
   impersonator then sends messages that mimic the structures used by
   the originator they intend to impersonate, altering the destinations,
   contents, and other meaningful headers as needed.  In the case of
   fictional originators, impersonators merely create plausible-looking
   messages based on their experience with typical originators.  In many
   current messaging systems, there is no need to do anything other than
   adopt the name of the desired originator and inject the message into
   the messaging system.

   The manner in which an impersonator injects messages into the
   messaging system admits of varying degrees of sophistication.  A
   passive attacker may, for example, only be capable of injecting
   messages as an originator, or they may control or be capable of
   imitating intermediaries in the system.  This can have a large impact
   on the way that other elements in the messaging system perceive their
   forgeries.

   Another type of impersonator is an active attacker.  An active
   attacker can intercept messages in transport, modify them
   arbitrarily, and then return them to the message transit system.
   This is a harder sort of attack to mount, and a much harder attack to
   defeat; consequently it may not be in the scope of identity assertion
   systems to prevent this sort of attack.  Since many intermediaries
   that are not actually attackers exhibit essentially indistinguishable
   behavior, designers of identity systems are further disincented from
   meeting this threat.

   The uses of impersonation are legion.  An impersonator may want to
   avoid reports of abuse, or accountability for the contents of
   messages.  Or, an impersonator may want to make a message appear to
   come from a particular originator to whom they believe a recipient
   will be sympathetic (which may lead the recipient to read a message
   and inspect content of the impersonator's choosing).

   Primarily, the purpose of an identity assertion is to prevent
   impersonation.  This means that it must provide the following
   qualities:
   o  In order for an assertion to be valuable, it must provide a
      stronger assurance than the return address conventionally attached
      to a message.  For example, an email identity system would be
      totally uninteresting if it allowed any originator to arbitrarily
      populate their identity, because this would constitute no
      improvement over the existing RFC2822.From header field.
      Typically, the strength of the assertion depends on some form of
      cryptography, and provable authority over the namespace of the
      originator.  In some constrained environments, assertions instead
      derive their authority from some form of transitive trust (see
      Appendix E.1); such assertions are outside the scope of this
      document.
   o  The assertion must have a precise scope and constraints (see
      Section 8.2), whether these are explicit in the message or static
      and understood implicitly in the messaging protocol.  It is
      assumed that the means by which a passive attacker collects
      messages will also allow them to collect identity assertions, and
      impersonators may accordingly attempt to replay them.  Constraints
      are intended to combat replay attacks.
   o  The assertion must denote the identity provider in some secure
      fashion, and provide any information necessary for the verifier to
      validate cryptographic properties of the assertion.  Assertions
      must provide verifiers with a means of determining whether or not
      the identity provider is authoritative for the namespace of the
      originator of a message.


5.  Identity Assertions

   An identity assertion is a piece of information (perhaps a header, a
   parameter, or an attached document) added to a message by an identity
   provider in order to provide verifiers with identity information
   about the originator of the message.

   Most existing and proposed identity mechanisms for Internet messaging
   systems leverage some form of cryptography.  Public key (or
   'asymmetric') cryptography is an especially attractive tool in this
   context, because it allows a verifier to validate an assertion even
   if it has never before been contacted by that originator.  Symmetric
   key cryptography, by way of contrast, requires that the identity
   provider and verifier share some pre-arranged secret.

   Cryptographic signatures generated by an asymmetric keying mechanism
   provide authentication of the signer and integrity over the signed
   information.  There are a number of ways that a signature can provide
   identity information, depending on the type of key used to generate
   the signature, and the identity of the signer.

   Providing a signature over an identity string like 'joe@example.com'
   alone, however, does not provide a strong assertion of the identity
   of the originator of the message.  The assertion must contain enough
   supplemental information that it is clear that it refers to this
   particular message, not just any message in which an attacker might
   try to replay the assertion.  The constraints and scope of assertions
   are discussed further in Section 8.
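
   The difference between an assertion over the identity string alone
   and an assertion bound to a particular message can be sketched as
   follows.  This Python fragment is illustrative only: HMAC from the
   standard library stands in for a signature so that the example needs
   no third-party code, which makes it a symmetric mechanism with a
   pre-arranged key.  The key, the function names, and the choice of
   covered elements (recipient, date, and a digest of the body) are
   assumptions of this sketch, not recommendations.

```python
import hashlib
import hmac

KEY = b"demo-key-known-to-provider-and-verifier"  # illustrative only


def assert_identity_only(identity: str) -> bytes:
    # Weak: the assertion covers nothing but the identity string, so the
    # same value is valid for any message claiming that identity.
    return hmac.new(KEY, identity.encode(), hashlib.sha256).digest()


def assert_bound(identity: str, recipient: str, date: str, body: str) -> bytes:
    # Stronger: the assertion also covers elements of this one message.
    body_digest = hashlib.sha256(body.encode()).hexdigest()
    covered = "\n".join([identity, recipient, date, body_digest])
    return hmac.new(KEY, covered.encode(), hashlib.sha256).digest()


def verify_bound(assertion: bytes, identity: str, recipient: str,
                 date: str, body: str) -> bool:
    # Recompute the assertion over the received message and compare.
    expected = assert_bound(identity, recipient, date, body)
    return hmac.compare_digest(assertion, expected)
```

   An attacker who captures the output of assert_identity_only can
   attach it to any forged message from 'joe@example.com'.  An
   assertion produced by assert_bound fails verification once the
   recipient, date, or body differs from the message it was generated
   for.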

   Assertions may also be encrypted.  In some cases, it may be desirable
   to hide the identity of the originator of a message from
   intermediaries, but to reveal this information only to a particular
   recipient, or vice versa.  Potentially, this could provide certain
   privacy properties to an identity assertion mechanism (see Section
   10).

   The use of cryptography requires some mechanism for key distribution
   and may require a public key infrastructure with widely-distributed
   root certificates.  Encrypting identity assertions requires more
   complex keying systems.  The use of certificates, uncertified
   asymmetric keys, and symmetric keys is discussed in Section 6.

6.  Keying for Assertions

   Cryptographic identity assertions require the use of keys.  In order
   for a cryptographic signature over an assertion created by an
   identity provider to be validated by a verifier, both parties must
   possess corresponding keying material.  Since Internet messaging
   systems assume that messages can be sent to arbitrary recipients that
   have no previous association with the originator, key distribution is
   the primary problem confronting the use of cryptographic identity
   assertions.

   Note that regardless of the keying mechanism used, an identity
   provider may have multiple keys that it employs for various reasons.
   Provided that there is a way to link an assertion to a particular key
   used by the identity provider, this requires no special support from
   the identity mechanism.

6.1  Asymmetric Keys

   Asymmetric keys are credentials that have been split into two
   components, a public and a private key.  The holder of the
   credentials keeps the private key secret, and widely distributes the
   public key.  If a document is signed with the private key, the
   signature over the document can be validated with the public key.
   This signature provides integrity over the document, and
   authenticates the signer.
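
   The relationship between the two halves of an asymmetric key can be
   illustrated with unpadded "textbook" RSA.  The following Python
   sketch is a toy under stated assumptions: the primes are two
   well-known Mersenne primes chosen for readability, the modulus is far
   too small to be secure, and no padding scheme is applied.  The names
   sign and verify are hypothetical; a real deployment would use a
   vetted cryptographic library.  The modular inverse via pow requires
   Python 3.8 or later.

```python
import hashlib

# Toy parameters: two well-known Mersenne primes. Real keys are generated
# randomly and are far larger; this modulus offers no security.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, kept secret


def _digest(document: bytes) -> int:
    # Reduce a SHA-256 digest into the range of the toy modulus.
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Performed by the key holder, using the private exponent D."""
    return pow(_digest(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Performed by anyone who holds the public key (N, E)."""
    return pow(signature, E, N) == _digest(document)
```

   A signature produced by sign validates under verify only for the
   document that was signed; changing either the document or the
   signature causes validation to fail.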

   An identity assertion is a type of document that can be signed with a
   private key by an identity provider.  In order to validate the
   signature, the verifier must hold the corresponding public key, and
   must have some reason to think that this public key is associated
   with the identity provider.  In order for that signature to provide
   any guarantee of the identity of the originator, the verifier must
   also have some assurance that the identity provider is authoritative
   for the namespace of the originator of a message.

   Asymmetric keys may be generated by an identity provider, or acquired
   by the identity provider from a third party such as a certificate
   authority.  Thus, there are two significant varieties of public keys
   - uncertified public keys, and public keys within certificates.  The
   certification status of a public key has a tremendous impact on how
   it can be distributed and the manner in which it assures authority
   over a namespace.

6.1.1  Certificates

   A certificate [12] is a document that binds public keying material to
   a particular name, the 'subject' of the certificate.  The certificate
   is signed by a certificate authority, and accordingly, parties that
   validate certificates must possess the public keys of certificate
   authorities (and unfortunately, the chain of certification between a
   particular certificate and the root certificate authority can include
   multiple middleman certificates).  For the purposes of this document,
   self-signed certificates are simply considered uncertified public
   keys.

   Certificates support a wide variety of subject formats.  Two are
   significant to the scope of this document.  First, a certificate's
   subject can be a valid name in an Internet messaging system, such as
   an email address.  Second, the certificate's subject can be a domain
   name.  Depending on the nature of the subject, the certificate can
   sign user-based or domain-based assertions; this is discussed further
   in Section 7.

   Whether user-based or domain-based certificates are used,
   certificates have a common set of advantages and drawbacks.  The
   primary advantage of certificates is that they provide a strong link
   between a public key and a subject.  Accordingly, by looking at the
   subject of a certificate, it is relatively easy to decide whether or
   not its holder is authoritative for the namespace of a particular
   originator of a message (bearing in mind the caveats in Section 7.1).
   Because a certificate is a signed document, certificates can also be
   distributed over the network without requiring integrity over the
   transport; e.g., a certificate store for an identity provider could
   use an insecure transport like vanilla HTTP to distribute
   certificates.

   The downside is that certificates do not represent a permanent
   binding.  Certificates have an expiration date, and consequently
   certificates must be periodically renewed, which is an operational
   hassle for identity providers.  However, parties that rely on
   certificates cannot assume that a certificate is still valid simply
   because it has not expired.  Certificates can also be revoked,
   usually as a consequence of the compromise of their corresponding
   private key.  Relying parties are therefore required to monitor
   certificate revocation lists (CRLs) issued by certificate
   authorities.  Because this entails cumbersome operational procedures,
   relying parties rarely adhere to this in practice.  With all that in
   mind, it must be remembered that uncertified public keys do not
   represent a permanent binding either, and that there are no
   comparable intrinsic mechanisms for determining the expiry or
   compromise of an uncertified public key, even if a relying party were
   sufficiently troubled by these concerns to employ them.

6.1.2  Uncertified Public Keys

   Public key cryptography can also be used for identity assertions
   without certificates; for example, an identity provider may generate
   a public/private key pair itself.  This requires a mechanism for
   distributing public keys in which the identity of the private key
   holder is implicitly or explicitly disclosed to potential verifiers,
   and verifiers understand unambiguously the namespace for which the
   identity provider is responsible.

   One way to associate an uncertified public key with a message
   originator is to transmit the public key in an initial unsigned
   message.  The recipient, upon receipt of the public key, could store
   it in a local, application-specific keychain, indexed by the
   originator's return address (for user-based assertions) or the
   originating domain (for domain-based assertions) - the message would
   need to make clear precisely who the identity provider is.  Future
   signed messages received from that originator (or domain) could be
   validated with the public key.  This mechanism of key distribution
   will be referred to in this document as the "leap-of-faith"
   mechanism.  It merits this particular name because the originator and
   recipient must have faith that no man-in-the-middle interfered with
   the initial message containing the public key.  If an active attacker
   were present in the key exchange, they could inject their own public
   key and impersonate the originator to that recipient.
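
   As a minimal sketch of such a keychain, assuming Python and
   hypothetical names (LeapOfFaithKeychain, observe) that do not come
   from any protocol specification, a recipient might pin a fingerprint
   of the first key seen for each index and compare later keys against
   it.  Signature validation itself is omitted.

```python
import hashlib


class LeapOfFaithKeychain:
    """Hypothetical application-specific keychain for leap-of-faith keys."""

    def __init__(self):
        # Maps an index (a return address for user-based assertions, or an
        # originating domain for domain-based ones) to the fingerprint of
        # the first public key seen for that index.
        self._pinned = {}

    @staticmethod
    def fingerprint(public_key: bytes) -> str:
        return hashlib.sha256(public_key).hexdigest()

    def observe(self, index: str, public_key: bytes) -> str:
        """Return 'pinned', 'match', or 'mismatch' for a presented key."""
        seen = self.fingerprint(public_key)
        if index not in self._pinned:
            # The leap of faith: nothing vouches for this first key.
            self._pinned[index] = seen
            return "pinned"
        return "match" if self._pinned[index] == seen else "mismatch"


keychain = LeapOfFaithKeychain()
first = keychain.observe("joe@example.com", b"key-A")   # initial message
later = keychain.observe("joe@example.com", b"key-A")   # same key again
other = keychain.observe("joe@example.com", b"key-B")   # different key
```

   Note that an attacker present for the initial message would simply be
   pinned in place of the originator; the sketch detects only a later
   change of key.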

   The leap-of-faith follows the example of SSH, which is widely
   regarded as a vast improvement over insecure telnet-style
   applications, and no doubt the leap-of-faith method of distributing
   public keys for identity providers would be an improvement over a
   lack of identity assertions altogether.  Unfortunately, messaging
   architectures almost inevitably involve application-layer
   intermediaries that could inspect or modify leap-of-faith keys, and
   in this respect messaging is significantly distinct from the
   traditional client-server architecture of SSH.

   The other challenges facing this approach rest largely in associating
   the key with a legitimate identity provider, and determining the
   namespace for which that identity provider is authoritative.
   Practically, there isn't really a way to do so; when a message
   arrives with an uncertified public key in it, that key is ultimately
   serviceable only as a validation of that particular (anonymous)
   identity provider.  When future messages are received, the verifier
   can prove that these assertions were created by that same identity
   provider, but that verification offers no proof of the namespace for
   which that identity provider is authoritative.  This problem is
   severe enough that leap-of-faith key distribution is probably only
   meaningful for anonymous user-based assertions.  But again, even
   anonymous user-based assertions are better than nothing.

   The DNS might also be leveraged to bind a public key to an
   identifying domain.  DNSSEC [23], for example, provides public keys
   in a DNS resource record.  Those keys are known to be associated with
   a particular domain (thanks to the delegative structure of DNSSEC).
   Those keys, or some other keying material in the DNS which is signed
   via DNSSEC, could be used to provide a domain-based signature in the
   request for an identity assertion (see Section 7).  Even a simple
   hash of the public key used by the identity provider, placed in the
   DNS, would enable the transmission of domain-based public keys in
   messages without any need for a leap-of-faith.
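
   The hash comparison might be sketched as follows, in Python.  The
   dictionary PUBLISHED_KEY_HASHES is only a stand-in for a DNS lookup;
   the record format, the choice of SHA-256, and the function name are
   assumptions made for illustration, not part of any specification.

```python
import hashlib
import hmac

# Stand-in for the DNS: maps a domain to the hex SHA-256 hash of the
# identity provider's public key. A real verifier would fetch this from a
# resource record (ideally DNSSEC-signed); that lookup is not shown here.
PUBLISHED_KEY_HASHES = {
    "example.com": hashlib.sha256(b"example.com public key bytes").hexdigest(),
}


def key_is_published(domain: str, public_key: bytes,
                     published=PUBLISHED_KEY_HASHES) -> bool:
    """Check a key carried in a message against the hash for its domain."""
    expected = published.get(domain.lower())
    if expected is None:
        return False  # the domain publishes no key hash
    actual = hashlib.sha256(public_key).hexdigest()
    return hmac.compare_digest(actual, expected)
```

   Only a key whose hash matches the published value would then be used
   to validate the signature in the assertion.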

   Note that strictly speaking, the keying material (or a hash of it)
   does not need to appear in the DNS in order for the DNS to be
   leveraged to bind a public key to an identifying domain.  If the
   identity provider were to run a key store service (like an HTTP
   server) that made its key available, then the identity provider could
   include a URI reference to that store with its assertion.  Since the
   DNS would be used to dereference that URI, the security of that store
   is predicated on the security of the DNS.  However, the operation of
   the store exposes the identity provider to further security risks
   (see Section 9.3), and since the DNS needs to be invoked in order to
   find the store, using keys or hashes in the DNS is ultimately more
   efficient from a messaging perspective.

   In the absence of operational DNSSEC, however, using the DNS to find
   uncertified keys is insecure.  While the technical specifications of
   DNSSEC are largely complete, it will likely be some time before
   DNSSEC is fully operationalized.  There are high-level changes that
   would need to sweep through the DNS in order to operationalize
   DNSSEC, whereas today individuals within the messaging community can
   opt to employ certificates, or not, on an incremental basis.  That
   much said, it can be argued that the difficulty of subverting the DNS
   is sufficiently high that this practice would deter a large number of
   potential impersonators; verifiers can make their own policy decisions
   about the strength of the assertion based on whether or not the zone
   containing the keying material uses DNSSEC.  Note, however, that this
   approach has no obvious way to support user-based assertions short of
   placing many records (for large domains, perhaps tens or hundreds of
   thousands) in the DNS corresponding to the keys of particular
   individuals; since the identity provider's assurance of the namespace
   derives from the DNS zone in which these key records are stored, the
   security of providing domain-based assertions is materially the same.

   Given either approach, it is desirable for validators to be capable
   of caching uncertified public keys.  For DNS-based schemes, the cache
   duration could presumably be dictated by the time-to-live of the DNS
   resource record containing the key or hash of the key.  For the
   leap-of-faith approach, additional metadata associated with the
   public key would presumably dictate the length of time for which it
   is safe to cache the key.  Some further considerations related to
   caching are discussed in Section 9.3.

   One example of the leap-of-faith system in an Internet messaging
   protocol is given in RFC3261 [11] Section 23.2 (for the case where
   self-signed certificates are used).

6.2  Symmetric Keys

   The use of symmetric keys for an identity assertion is severely
   limited because it requires that the identity provider and verifier
   pre-arrange a shared secret, which, for the typical assignment of
   these roles, runs contrary to the requirement that the domains of the
   originator and recipient of a message require no previous
   association.  However, depending on the intended applicability of the
   assertion, this may not be an unreasonable constraint.

   For a case like determining that a bounce resulted from a message
   that an originator actually sent, the identity provider and verifier
   of a message are the same endpoint (the originator).  Since an
   endpoint can reasonably be expected to share a secret with itself,
   the use of symmetric keys is attractive for this use case.
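
   A minimal sketch of this use case, assuming Python and an HMAC over
   the message identifier, is given below.  The secret value, the
   function names, and the choice to tag only the message identifier are
   all assumptions for illustration; how the tag is carried in the
   message and echoed in the bounce is not shown.

```python
import hashlib
import hmac

# Secret known only to the originating endpoint (hypothetical value).
LOCAL_SECRET = b"known-only-to-this-endpoint"


def tag_outbound(message_id: str, secret: bytes = LOCAL_SECRET) -> str:
    """Compute the tag the originator attaches to a message it sends."""
    return hmac.new(secret, message_id.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def bounce_is_ours(message_id: str, tag: str,
                   secret: bytes = LOCAL_SECRET) -> bool:
    """Check that a bounce refers to a message this endpoint really sent."""
    return hmac.compare_digest(tag_outbound(message_id, secret), tag)


sent_tag = tag_outbound("<1234@example.com>")
```

   Because the same endpoint both generates and checks the tag, no key
   distribution is required.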

   The interdomain use of symmetric keys is further limited by the
   difficulty of key distribution.  Asymmetric public keys can be
   distributed without fear that any passive attacker will be capable of
   leveraging the keys to impersonate the principal.  If a symmetric key
   used for identity assertions is captured by an attacker, however, the
   attacker can impersonate the principal for the lifetime of the key.
   Symmetric keys essentially need to be negotiated, in interdomain
   cases, through some out-of-band mechanism.

7.  User-based and Domain-based Assertions

   To understand the distinction between user-based and domain-based
   assertions, it is simplest to assume that they are generated by
   certificates.  Consequently, the discussion in the next few
   paragraphs describes only the use of certificates to provide these
   assertions; alternatives to certificates are described at the end of
   this section.

   In the simplest assertion, the identity provider is directly
   authoritative for the name of the originator only.  For example, the
   identity provider holds a certificate with a subject of
   'joe@example.com', and provides an identity assertion with the
   private key corresponding to that certificate only for messages
   sent by 'joe@example.com'.  We will refer to this sort of identity
   assertion as a user-based assertion.  Usually, the identity provider
   is in this instance the endpoint of the originator, though of course
   it would also be possible (though probably not very scalable) for an
   intermediary to manage a keyring of such certificates for every user
   in their domain.

   While this case is straightforward, there is no widely-supported
   public key infrastructure that issues user-based certificates to
   date.  The only successful PKI on the Internet today provides
   domain-based certificates, primarily for securing web transactions.
   These certificates have a hostname subject of the form 'example.com'
   (or, more commonly, 'www.example.com').  While there are many reasons
   why domain-based certificates are more successful than user-based
   certificates, for our purposes the most important is enrollment: it
   is very easy for a certificate authority to determine who controls
   'www.example.com' (since this is a matter of public record), but very
   difficult for a certificate authority to determine to whom
   'example.com' has allocated the username 'joe'.  The only deployable
   means of doing so today (email pings) are essentially leap-of-faith
   mechanisms.

   Because domain-based certificates are widely available, and the root
   certificates of the major certificate authorities that issue these
   certificates are installed on almost all Internet-enabled platforms,
   the prospect of leveraging domain-based certificates for identity in
   messaging systems is very attractive.  Compared to user-based
   certificates, domain-based certificates are also attractive because
   there need to be fewer of them in the overall messaging system, since
   there are generally many users to a given domain.  This is
   advantageous both for identity providers, especially from a cost
   perspective, and for verifiers, who will need to persist many
   certificates from remote domains.

   When domain-based assertions are employed, the certificate itself does
   not provide the identity of the originator, but it does prove that
   the identity provider is authoritative over a particular segment of
   the namespace.  Accordingly, the identity provider's signature must
   cover some field of the request that contains the identity that the
   signer is asserting.  In order for the assertion to have any
   strength, that identity must be within the segment of the namespace
   for which the signer is authoritative; i.e., if the certificate of
   the signer proves authority over 'example.com', then the signature
   would be valid if the identity of the originator were
   'joe@example.com', but not if the signature were over
   'alice@example.org'.  This gives rise to quite a few subtleties which
   are discussed in Section 7.1.

   In order to acquire a domain-based identity assertion for a request,
   originators would typically need to forward their message to an
   intermediary that instantiates the identity provider role (unless the
   originator holds a certificate authoritative for its own domain).
   This in and of itself can be viewed as a drawback, since in many
   messaging architectures originators are not required to send messages
   through any specific local intermediary.  Moreover, messaging
   protocols are used in some environments that constrain the first-hop
   local intermediary to which an originator sends a request (e.g.,
   blocking outbound SMTP with an enterprise firewall).  In those
   environments, originators would be unable to acquire an identity
   assertion from an intermediary that was unsanctioned by the operator
   of the environment.

   Note that the considerations applying to domain-based certificates
   also apply to most DNS-based mechanisms for public key distribution -
   the identity assertions generated by keys distributed on a per-domain
   basis through the DNS are domain-based assertions.  The distinction
   lies in the strength of the assurance - uncertified public keys
   distributed through the DNS without DNSSEC are inherently less secure
   than certificates, and thus can be said to provide a weaker
   domain-based assurance.

7.1  Name Subordination

   Identity assertions become harder to verify when the subject of the
   signer's certificate does not correspond exactly with the
   originator's name.  There needs to be a deterministic way of deciding
   if an identity provider is authoritative over the namespace
   containing an originator's name.

   For example, how should a verifier treat an identity assertion
   generated by an identity provider with a certificate for
   'joe@example.com' when the originator of the associated message is
   given as 'joe@mail.example.com'? The problem is more pronounced with
   domain-based assertions.  How should a verifier treat an identity
   assertion generated by 'alice.example.com' for a message whose
   originator is 'joe@example.com'? What if the domain were
   'joe.example.com', or 'mail.example.com', or 'sip.example.com'?

   We are forced to pose this authorization question because the
   verifier has no way to know how the identifying domain 'example.com'
   has allocated its namespace - which is why these problems are
   problems of name subordination.  While authorization policy is
   outside the scope of this document, there are potentially ways to
   design a messaging identity system such that these concerns never
   arise.  The most obvious way is to be very strict about generating
   assertions - to mandate, for example, that identity providers cannot
   provide domain-based assertions for messages unless their domain (the
   subject of their certificate, or the zone containing their key in the
   DNS) corresponds exactly to the host portion of the originator's
   return address.  But this may be too rigid to support some use cases.
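
   The strict rule could be sketched in Python as an exact,
   case-insensitive comparison between the signer's domain and the host
   portion of the originator's address.  The function name is
   hypothetical, and the sketch assumes the signer's domain has already
   been taken from a validated certificate subject or DNS zone.

```python
def strictly_authoritative(signer_domain: str, originator: str) -> bool:
    """True only if signer_domain equals the host portion of originator."""
    local, sep, host = originator.rpartition("@")
    if not sep or not local or not host:
        return False  # not of the form local-part@host
    # Domain names compare case-insensitively; ignore a trailing root dot.
    return host.lower().rstrip(".") == signer_domain.lower().rstrip(".")
```

   Under this rule a signer for 'example.com' is accepted for
   'joe@example.com', while signers for 'alice.example.com' or
   'mail.example.com' are rejected for that same originator.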

   Another possible solution is to leverage the DNS in some new way to
   designate the identity provider for a domain.  Just as one resource
   record type designates the mail exchanger to which mail should be
   sent, some other DNS resource record might designate the identity
   provider for email messages in a domain (e.g., for 'example.com', the
   identity provider for mail messages resides at
   'mail-ident.example.com').  Predictably, this solution is limited by
   the lack of an operational DNSSEC infrastructure in the DNS.  Without
   DNSSEC, it is possible that an attacker could spoof DNS responses to
   suggest that an inappropriate host is the signer for the domain;
   essentially, this grants the attacker the ability to impersonate any
   user in the domain.

8.  Reference Indicators and Replay Protection

   If any attacker can cut an identity assertion from a legitimate
   message, paste it into an arbitrary message of their own, and thereby
   fool a verifier into believing that the hacked message came from the
   originator of the legitimate message, then the value of the identity
   assertion is essentially nil, given that it exists primarily to
   prevent impersonation.  If an identity assertion provided only a
   signature over the name of the originator, assertions would be
   trivially exploitable in precisely this fashion.

   Accordingly, an assertion must cover more than just the originator's
   name.  It must cover enough additional information that the assertion
   cannot be replayed in a substantially different message.

   Ideally, the identity assertion would provide a signature over the
   entire message, envelope and contents alike.  If this were the case,
   then an attacker could only replay the identity assertion in an
   identical message - which would be a duplication rather than an
   impersonation.  But practically, this wouldn't work for any existing
   messaging system.  In fact, in most messaging systems, intermediaries
   need to modify the envelope in order to perform their duties.  While
   strictly speaking, intermediary modification of the content is not a
   baseline requirement for a messaging system, some intermediaries do
   so in order to enforce any of a number of domain-specific policies.
   Consequently, were a signature over an entire message included in
   identity assertions, such signatures would be likely to fail to
   validate at verifiers.

   Thus, only some subset of the message can be signed.  The selection
   of the exact subset is a very difficult problem.  For the purposes of
   this document, the elements of a message that need to be signed in
   order to bind an identity assertion to a particular message will be
   termed the 'reference indicators'.  The manner in which a subset is
   identified or carried in the message also admits of more than one
   plausible design choice.


8.1  Canonicalization versus Replication

   There are two basic approaches to generating a signature over a
   subset of a message - canonicalization and replication.

   Canonicalization entails the generation of a canonical string from
   the reference indicators.  The signature is generated over that
   string (or a hash of that string), even though the string as such
   does not appear in the message.  The canonicalization system must
   specify the reference indicators that are going to be signed.  The
   reference indicators can be specified statically, as a component of
   the specification of the mechanism, or dynamically, on a per-message
   basis.  In the former case, every identity assertion for every
   message in the system will generate a canonical string containing
   exactly the same reference indicators.  In the latter case, each
   assertion will denote in some manner which reference indicators have
   been incorporated within the canonical string.  When a verifier
   receives the message, it extracts those same reference indicators
   from the message, generates the same canonical string, hashes it
   where applicable, and then determines whether or not the signature in
   the identity assertion is valid for that canonical string.
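
   A sketch of static canonicalization in Python follows.  The choice of
   reference indicators (the addr-spec of From and To, the Message-ID,
   and the Date), the newline separator, and the use of SHA-256 are
   assumptions made for illustration.  The signature over the digest is
   omitted; both the identity provider and the verifier would compute
   the digest in the same way.

```python
import hashlib
from email.utils import parseaddr


def canonical_string(headers: dict) -> str:
    """Build the canonical string from a fixed set of reference indicators."""
    parts = [
        parseaddr(headers["From"])[1],   # addr-spec only, no display name
        parseaddr(headers["To"])[1],
        headers["Message-ID"].strip(),
        headers["Date"].strip(),
    ]
    return "\n".join(parts)


def canonical_digest(headers: dict) -> str:
    """Hash the canonical string; this digest is what would be signed."""
    return hashlib.sha256(canonical_string(headers).encode("utf-8")).hexdigest()


original = {
    "From": "Joe <joe@example.com>",
    "To": "alice@example.org",
    "Message-ID": "<1234@example.com>",
    "Date": "Mon, 18 Oct 2004 10:00:00 -0400",
}
# An intermediary alters quoting and whitespace in the From header field.
munged = dict(original, From='"Joe"   <joe@example.com>')
# An attacker pastes the assertion into a message for another recipient.
replayed = dict(original, To="bob@example.net")
```

   The digest of the munged message equals that of the original because
   only the addr-spec contributes to the canonical string, whereas the
   digest of the replayed message differs.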

   The most practicable canonicalization procedures incorporate only the
   most specific reference indicators from a message.  For example,
   inclusion of the entire RFC2822.From header field value (including
   the header field name, the colon, whitespace, etc) is much more
   problematic than the inclusion of only the addr-spec component of the
   From header field value.  The less specific the reference indicators
   are, the harder it is for them to be canonicalized, and the more
   likely it is that intermediaries (through munging whitespace,
   changing line-wrap, and so on) may inadvertently change the canonical
   string that will be generated by the verifier, or that the verifier
   will miss a blot of whitespace, and so on.

   Both dynamic and static reference indicators for canonicalization
   have their drawbacks.  Static reference indicators can be too
   limiting; it is difficult to anticipate the reference integrity needs
   of every imaginable message.  Dynamic reference indicators, however,
   are extremely complicated.  The syntactical system required to
   describe reference indicators is potentially an exercise in arbitrary
   string manipulation, especially when attempting to denote reference
   indicators with a high degree of specificity.  Dynamic reference
   indicators also leave much more room for error in the generation of
   the canonical string, and accordingly, more room for discrepancy in
   the manner that the verifier generates the canonical string.  It is
   difficult to strike a balance, and once you allow any reference
   indicators to be decided on a per-message basis, the slope becomes
   very slippery.

   Replication attempts to avoid the difficulties of canonicalization by
   providing a copy of the reference indicators that is carried within
   the message itself.  The simplest form of replication is the
   reproduction of the entire message, which is then tunneled within the
   message itself.  The identity assertion is a signature over the
   replication.  Of course, if the entire message is carried within
   itself, this doubles the size of the message (not even counting the
   signature), and so presenting a subset of the message is again
   desirable.  However, unlike canonicalization, replication does not
   require any pre-agreement or denotation of the reference indicators.
   The reference indicators that appear in the replicated message are
   visible to the verifier, and the verifier validates the signature
   over the replication, not over elements in the original message which
   need to be assembled into a canonical string.  If the signature over the
   replication is valid, the verifier then compares the values of the
   reference indicators in the replication to the corresponding elements
   of the message.  If these two correspond, then the identity assertion
   is valid for this message.

   Another significant distinction between canonicalization and
   replication is that a verifier inspecting a replication-based
   assertion can determine which reference indicators do not correspond
   to the received message; a verifier validating a
   canonicalization-based assertion can only tell whether or not the
   reference indicators as a whole exactly match the current message.  As
   a consequence, a verifier of a replication-based assertion can be
   lenient towards minor discrepancies between the message signed by the
   identity provider and the received message.  If the verifier were
   implemented in an endpoint, the endpoint might even render an account
   of the discrepancies to a user, who might be able to make an informed
   decision about the severity of the differences.  Another alternative
   is that verifiers might 'clobber' the contents of the outer envelope
   with the replicated envelope, treating only the replicated headers as
   authoritative and ignoring any discrepancies.  Clobbering, however,
   only works when the reference indicators are carefully chosen;
   otherwise, it may disguise the actions of an impersonator who has
   cut-and-pasted the replicated assertion into a message of their
   choosing.
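
   The comparison step of a replication-based verifier might be sketched
   in Python as below.  The sketch assumes that the signature over the
   replication has already been validated and that both the replicated
   and received reference indicators are available as simple mappings;
   the function name and header values are hypothetical.

```python
def discrepancies(replicated: dict, received: dict) -> dict:
    """Map each mismatching reference indicator to (signed, received)."""
    diffs = {}
    for name, signed_value in replicated.items():
        received_value = received.get(name)  # None if stripped in transit
        if received_value != signed_value:
            diffs[name] = (signed_value, received_value)
    return diffs


replicated = {
    "From": "joe@example.com",
    "To": "alice@example.org",
    "Subject": "Lunch",
}
# A mailing-list intermediary has prefixed the Subject header field.
received = dict(replicated, Subject="[list] Lunch")
report = discrepancies(replicated, received)
```

   A lenient verifier could accept a message whose only discrepancy is
   in the Subject, or show the report to the user, while treating any
   discrepancy in From as a failure.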

   Replication furthermore introduces the interesting possibility that
   envelope elements intended for end-to-end consumption (that do not
   need to be inspected by intermediaries, like the Subject header of
   email) might be included in the replicated body, but not in the
   headers.  The originator might intentionally provide only the minimum
   amount of information necessary in the envelope of the message, but
   arrange for the identity provider to place detailed end-to-end
   information in the assertion.  Were the assertion then to be
   encrypted, the identity provider could securely tunnel end-to-end
   elements to the recipient.  This is most meaningful when the
   originator acts as the identity provider.  Note, however, the
   limitations of encryption in Section 10.

   A very basic replication scheme deals poorly with the content of the
   message.  Many messaging protocols allow large content
   (multi-megabyte email attachments leap to mind).  Replicating this
   content is extremely undesirable.  In contrast,
   canonicalization-based assertions do not become larger as the signed
   content grows.  Since canonicalization can use a hash of the
   canonical string, even if the canonical string is built from the
   message body, the size of the hash is unchanged.  Traditional
   approaches to message content security (including PGP [18] and S/MIME
   [17]) sign a hash over the message content.  Accordingly, some hybrid
   approaches replicate reference indicators in the envelope but
   canonicalize content, which can yield the best of both worlds.

   One example of a canonicalization-based identity assertion for SIP is
   given in sip-identity [25].  The core SIP standard, RFC3261 [11]
   Section 23.4.2, provides a replication model that entails tunneling
   the entire message; RFC3893 [14] provides a replication model for SIP
   which is restricted to reference indicators alone.

8.2  Assertion Constraints and Scope

   The choice of reference indicators dictates the constraints and scope
   of the assertion.  For example, if the reference indicators include
   something like the email Date header field, then it is possible,
   after verification, to apply authorization policies related to the
   time the Date header was created.

   Whether canonicalization or replication is used, the selection of a
   set of reference indicators must be informed by the nature of the
   messaging protocols.  Which elements of the envelope and content are
   necessarily immutable from the identity provider to the verifier
   (however those roles are assigned)? Which are always mutable? Which
   elements are conceivably reference indicators?

   The following may not be a necessary or sufficient list for any given
   messaging protocol, but it does exemplify the sort of analysis that
   needs to be performed in determining whether or not an element should
   be used as a reference indicator.

   Beginning with the envelope, at a minimum, the name of the originator
   of the message has to be preserved (in email, the addr-spec component
   of the RFC2822.From header field).

   It is also highly desirable to include some indicator that denotes
   the intended recipient(s) of the message.  Without such an indicator,
   a message containing a valid identity assertion could be replayed to
   a different recipient by a passive attacker who captured the message,
   and the verifier would be unable to determine that the originator did
   not intend to send this message to the designated recipient.
   Barring the presence of other reference indicators, even the intended
   recipient of the request could act as such a passive attacker.

   Constraining the assertion with some sort of unique identifier for
   the message is also very desirable.  Most messaging protocols provide
   a unique message identifier in order to enable the recipient to
   detect duplicates, or to enable correspondents to refer to previous
   messages unambiguously.  While the presence of a unique identifier in
   the constraints does not prevent passive attackers from replaying
   assertions to new verifiers, it does change the situation when
   impersonators attempt to replay assertions to the same verifier
   (which complements selecting the intended destination as a reference
   indicator).  It enables verifiers to remember unique identifiers for
   some period of time.  By doing so, verifiers can discover that they
   previously verified a message with this unique identifier.  This
   does, however, have some important limitations.  The first is
   scalability.  Intermediaries that act as verifiers can potentially
   process staggering numbers of messages, and recording every passing
   unique identifier in such intermediaries is probably infeasible.
   However, this does not mean that the presence of a unique identifier
   would not be useful for recipients that act as verifiers (who might
   persist messages, including the unique identifier, for various
   reasons anyway).  The second limitation is a race condition.  If the
   attacker's message is delivered to the verifier before the legitimate
   message, a verifier might mistakenly believe that the attacker's
   message is the valid one; while active attackers are the most likely
   to successfully mount this sort of attack, in store-and-forward
   architectures it is possible that a passive attacker might do so
   (though not in a deterministic fashion, probably).  A third
   limitation is that in some architectures, a particularly intrusive
   intermediary might alter the unique identifier of a message in the
   process of forwarding the message; in these environments, the unique
   identifier has little value.

   Providing a time-based constraint can complement the use of unique
   identifiers and other local policies at the verifier.  Virtually all
   messaging protocols provide an indicator in the envelope that states
   when the message was created (such as the Date header in email).
   This can aid verifier policies that help to manage replay protection.
   For example, verifiers could be configured with an interval of time
   derived from some assessment of how long a message can plausibly be
   supposed to have remained in transit in the message system.  If they
   receive a message, and the time that has elapsed since the creation
   of the message exceeds that interval, they could consider the
   assertion invalid, or at least suspect.  Obviously, this interval
   would be very different depending on whether the messaging
   architecture is based on a store-and-forward methodology or a
   real-time delivery methodology.  In concert with unique identifiers,
   this interval of time could be used to determine how long a verifier
   needs to remember unique identifiers recorded from valid past
   messages.  Some ways in which a passive attacker might collect
   assertions for replay (from web pages of email archives, for example)
   could involve the retrieval of very dated assertions that would be
   flagged by this sort of policy.
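
   The interplay of unique identifiers and a time-based constraint
   described above can be sketched as follows.  This is only an
   illustration under stated assumptions: the class name, the return
   values and the max_transit_seconds policy knob are hypothetical, and
   the sketch presumes that the signature over the assertion has already
   been validated and that the creation time is one of the signed
   reference indicators.

```python
import time


class ReplayCache:
    """Remembers unique message identifiers for a bounded interval."""

    def __init__(self, max_transit_seconds):
        # Hypothetical policy knob: how long a message may plausibly
        # remain in transit in this messaging system.
        self.max_transit = max_transit_seconds
        self.seen = {}  # message identifier -> creation time

    def check(self, message_id, created_at, now=None):
        """Return 'ok', 'stale' or 'duplicate' for a verified assertion."""
        now = time.time() if now is None else now
        # Identifiers of messages older than the interval can be dropped:
        # a replay carrying the same signed creation time would be stale.
        self.seen = {m: t for m, t in self.seen.items()
                     if now - t <= self.max_transit}
        if now - created_at > self.max_transit:
            return "stale"
        if message_id in self.seen:
            return "duplicate"
        self.seen[message_id] = created_at
        return "ok"
```

   Because entries are discarded once the interval has elapsed, the
   memory required is bounded by the message rate multiplied by the
   interval, which is why the interval determines how long identifiers
   need to be remembered.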

   Some elements of a message are half-way between envelope and content,
   such as the typical Subject header field of email and SIP.  Since it
   is common practice for endpoints to render this element to the user,
   and the element can significantly change how recipients understand
   the message, it should serve as a reference indicator.

   While any single one of the envelope-based reference indicators
   described above would be insufficient to provide a strong assurance
   of identity, in concert, they can meet the majority of the plausible
   threats, and require such a high degree of sophistication from the
   attacker that most impersonation would be eliminated.  However, it
   may not be possible for a verifier instantiated by an intermediary to
   make full use of all of these indicators (the message's unique
   identifier, for example).  Moreover, it may not be possible for an
   originator to act as an identity provider for all of these reference
   indicators if certain elements (like the message's unique identifier)
   are generated by an intermediary.

   The content of the message is also apparently a critical reference
   indicator.  Without a signature over the content, a passive attacker
   who captures the message could preserve the envelope of a message but
   send a completely different content, which allows the attacker to
   impersonate the asserted originator and provide the content of their
   choosing.  Since in many messaging architectures, intermediaries can
   legitimately alter the contents of messages (most commonly, either by
   adding to the content or modifying the existing content in some
   fashion), defending against replay of a message with a modified
   content by a passive attacker is essentially the same level of
   difficulty as defending against message modifications made by an
   active attacker.

   In environments where intermediaries do modify message content for
   legitimate or at least quasi-legitimate reasons, the issue of
   protecting the content is academic.  A signature over the content
   will be violated if the content is changed.  There are relatively few
   approaches to preventing intermediaries from violating these
   signatures; a few examples include:
   o  If the messages use MIME [8], it is possible to apply MIME-layer
      security to particular bodies in the content.  If trivial
      additions are made by an intermediary (such as appending a few
      lines of text to the message), then they will fall out of the
      scope of the MIME body or bodies.  If one is especially lucky, the
      intermediary might even be MIME-aware, and capable of
      understanding how to interact with the complex multipart bodies
      that MIME-layer security frequently requires.
   o  If rather than merely adding to the content, the intermediary
      seeks to modify existing message content (filtering for content
      that appears inappropriate, perhaps), then the only recourse is to
      encrypt the content in its entirety.  If it is unable to
      understand the content, an intermediary will not be able to make
      these sorts of alterations.

   Neither of these solutions is applicable in all cases.  However,
   given the use of envelope-based reference indicators described above
   in an identity assertion, most impersonations that replayed an
   assertion but changed the content would be perceived as duplicates
   (based on the unique identifier), outdated, or potentially in
   violation of a Subject header constraint; moreover, it could only
   impersonate the originator to a specific recipient or specific set of
   recipients.  It could be argued, therefore, that it is not necessary
   to use the content as a reference indicator.  But in messaging
   systems and environments where it is safe to do so, the value of
   including the content as a reference indicator is clear.

   An account of the mutable and immutable elements in a SIP message is
   given in RFC3261.  The most complete analysis of reference indicators
   in SIP is given in the Security Considerations of sip-identity [25].
   Given the sheer number of possible headers used by email (see [22]),
   a complete analysis of mutable and immutable elements is probably a
   fool's errand.

   The surfeit of possible reference indicators may tempt designers to
   punt on deciding, at a protocol level, which ones are appropriate,
   and simply to allow identity providers to make this decision based on
   domain policy or even on a per-message basis.  There are two
   disadvantages that this flexibility incurs.  In the first place, if
   the assertion is based on canonicalization, the assertion must be
   accompanied by some sort of description of the reference indicators
   that have been used to generate the assertion.  Determining how to
   describe reference indicators precisely is a significant challenge.
   In the second place, it leaves a great deal of ambiguity in
   intermediary behavior.  How can an identity provider anticipate which
   elements an intermediary might want to modify? If the standard is
   firm about this matter, and all identity providers rely on the same
   reference indicators, then operators of intermediaries will have an
   incentive to phase out any practices that modify those elements.  If,
   on the other hand, each identity provider does things a little
   differently, there could be significant operational turmoil that
   could potentially lead to a rollback from the identity mechanism.

   In the final analysis, for any given messaging system, there is
   probably a finite set of identifiable elements that should be used as
   reference indicators.  At worst, there should be a set of fixed
   reference indicators that can be supplemented with optional, dynamic
   reference indicators as needed.  Note that other constraints and
   reference indicators that might be added by third-parties to the
   identity process are described in Appendix D, and should not be
   considered a part of the identity assertion created by the identity
   provider to identify the originator of a message.
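
   As an illustration of a fixed set of reference indicators, the
   following sketch builds a canonical string from a static list of
   envelope elements and hashes it.  The particular list of headers, the
   '|' separator and the function names are assumptions made for the
   example, not a recommendation; the signing step over the digest is
   omitted.

```python
import hashlib

# Hypothetical fixed set of reference indicators for an email-like system.
FIXED_INDICATORS = ("From", "To", "Message-ID", "Date", "Subject")


def canonical_string(headers, body=None):
    """Join the fixed indicators in a fixed order, separated by '|'."""
    parts = [headers[name].strip() for name in FIXED_INDICATORS]
    if body is not None:
        # Optionally bind the content as well, via a digest of the body.
        parts.append(hashlib.sha1(body).hexdigest())
    return "|".join(parts)


def digest_to_sign(headers, body=None):
    """SHA-1 digest of the canonical string; signing is not shown."""
    text = canonical_string(headers, body)
    return hashlib.sha1(text.encode("utf-8")).digest()
```

   Since the list is static, a verifier can rebuild the same string
   without any accompanying description of which indicators were used.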

9.  Placement of Assertions and Keys in Messages

   Most Internet messaging systems employ messages that are divided into
   two major parts: envelope and contents.  The envelope of a message is
   typically made up of headers, like the traditional email RFC2822.From
   header field, though some messaging systems use alternative schemes
   (like XML for XMPP [13]).  The contents consist of one or more
   message bodies, typically, but not always, MIME bodies.

   The division between envelope and contents is imprecise in most
   messaging systems.  As a general rule, the envelope is used by
   endpoints and intermediaries in the addressing and routing of a
   message, whereas the content is generated by the originator's
   endpoint, consumed by the recipient's endpoint, and rendered to the
   recipient's application in some fashion.  However, many elements in
   the envelope are also rendered to the recipient (a classic example
   being the Subject header field of email), and intermediaries have
   numerous reasons to inspect or modify the contents of messages.

   Given that an identity assertion needs to appear somewhere in a
   message, there are two plausible alternatives:
   o  it could appear in the message content
   o  it could appear in the message envelope, as a value or parameter
      of a new or existing element

   The attractiveness of one or another of these options is greatly
   dependent on the nature of the assertion, and particularly on the
   size and encoding of the assertion.  Canonicalization will result in
   a smaller assertion than replication.  To speak in particulars
   briefly, for a base64-encoded assertion based on an RSA signature
   (1024 bit key) of a SHA1 hash of the canonical string, the resulting
   assertion is 175 bytes long - varying the key length will make the
   message proportionally larger or smaller, obviously.  It is difficult
   to gauge the likely size of an assertion based on replication, since
   it is highly dependent on the number of reference indicators
   included, but it would be significantly larger.  An example in
   RFC3893 of a S/MIME-based replication assertion for SIP (containing
   six headers) is 913 bytes long, counting the multipart/MIME wrapper
   and the signature.  A base64 encoded version of that assertion is
   1240 bytes long.
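
   The sizes quoted above can be reproduced with a little arithmetic.
   The sketch below assumes that the figures count base64 text wrapped
   at 64 columns with one line-feed per line; that convention is an
   assumption on the part of this example (the text does not state it),
   although it yields both numbers.

```python
import base64


def wrapped_base64_size(raw_len, width=64):
    """Size of base64 text wrapped at `width` columns, one newline per line."""
    encoded = base64.b64encode(bytes(raw_len))
    lines = -(-len(encoded) // width)  # ceiling division
    return len(encoded) + lines


rsa_1024_signature = 1024 // 8  # a 1024-bit RSA signature is 128 bytes

print(wrapped_base64_size(rsa_1024_signature))  # 175
print(wrapped_base64_size(913))                 # 1240
```

   Under the same convention, a 2048-bit key yields a 350 byte
   assertion.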

9.1  Assertions in the Envelope

   An envelope is generally composed of a set of elements that describe
   the originator and intended recipient of a message, the subject of
   the message, the time the message was created, some unique
   identifiers for the message, and so on.  It is a common practice for
   intermediaries to inspect an envelope's elements in order to make
   forwarding decisions, and to add additional elements to the envelope
   to reflect various circumstances surrounding the delivery of the
   message.

   Provided that an assertion is short and syntactically manageable,
   there's no reason why it couldn't appear in some new or existing
   envelope element.  Some messaging systems have a practical (if not
   theoretical) limit on the size of envelope elements; in others this
   is no cause for concern.

   The syntax of the assertion is a more complicated issue.  If identity
   assertions based on replication are used, and are intended to be
   stored in the envelope, it may be syntactically confusing to store a
   set of envelope elements within a single envelope element.  Worst
   case, this confusion could be alleviated by encoding the entire
   assertion in some fashion (such as base64), but this would result in
   quite a large string.  Even in cases where element length is limited,
   it is possible that a very large string encoded in this fashion could
   be split across multiple envelope elements, and internally ordered in
   some way, to meet practical size limits.
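
   Splitting an encoded assertion across several envelope elements, with
   an internal ordering, might look like the sketch below.  The tuple
   layout (index, total, chunk) is a hypothetical choice for the
   example; a real protocol would need to define how these values are
   carried in each element.

```python
def split_assertion(encoded, max_len):
    """Split an encoded assertion into (index, total, chunk) tuples."""
    chunks = [encoded[i:i + max_len]
              for i in range(0, len(encoded), max_len)]
    return [(i + 1, len(chunks), chunk) for i, chunk in enumerate(chunks)]


def join_assertion(parts):
    """Reassemble the chunks, whatever order the elements arrived in."""
    if not parts:
        raise ValueError("no assertion parts present")
    ordered = sorted(parts)
    total = ordered[0][1]
    if [p[0] for p in ordered] != list(range(1, total + 1)):
        raise ValueError("missing or duplicated assertion part")
    return "".join(chunk for _, _, chunk in ordered)
```

   The explicit index allows a verifier to cope with intermediaries that
   reorder envelope elements, and the total allows it to detect a
   missing element.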

   Intermediaries generally have an easier time reading and writing
   parts of the envelope than the content, and accordingly, if one
   intends for intermediaries to instantiate the identity provider or
   verifier roles, then placing assertions in the envelope has the
   distinct advantage of requiring fewer changes to intermediary
   behavior.

   Also, some messaging architectures might not guarantee the survival
   of particular portions of the message as they traverse
   intermediaries.  For example, if intermediaries customarily rewrite
   or delete particular envelope elements, it would be a poor design
   decision to store an identity assertion as a value in those elements.

9.2  Assertions in the Content

   Given an assertion of large size or cumbersome syntax, storing an
   assertion in the envelope might be undesirable.  Appending the
   assertion to the contents of the message (perhaps using a multipart
   MIME body) might therefore seem superior.

   However, for some messaging systems, placing identity assertions in
   the content may limit the set of entities in the messaging
   architecture that can instantiate the identity provider role.  It
   might be illegal for an intermediary, for example, to modify content.
   This is the case with SIP, where an intermediary cannot delete or
   modify the contents of a SIP message.

   Placing assertions in the content can also limit the set of entities
   that can instantiate the verifier role.  Email intermediaries are not
   required to be capable of understanding or parsing the contents of
   email messages (especially MIME bodies), and accordingly, they cannot
   be expected to act as verifiers of an identity assertion that appears
   as part of the message content without requiring significant changes
   to their functionality.

   Furthermore, an identity system should be compatible with end-to-end
   security of message contents.  If the identity system requires that
   an intermediary add a body to a message, and the endpoints are using
   some end-to-end integrity mechanism like S/MIME or PGP, appending the
   assertion to the content may violate that end-to-end integrity.  If
   MIME is supported by these intermediaries, however, this problem
   becomes less pressing, as intermediaries might add the assertion as a
   complete MIME body by transposing the existing content into a new
   multipart.

   Placing assertions in the content is further complicated by the
   manner in which the content of a message is rendered to the
   recipient.  Ultimately, an identity assertion is not a component of
   the content that should be blindly rendered to the user.  It is more
   appropriate for a recipient's endpoint to consume the assertion as an
   input to an authorization decision, which may in turn change the
   manner in which the message is rendered to the user.  The assertion
   itself, a collection of cryptographic bits, is not something that
   should be intermingled with the content rendered to the recipient.
   Endpoints that do not support the identity assertion scheme, however,
   are likely to do just that, and accordingly, placing the assertion in
   the content leads to serious backwards-compatibility concerns.

   A messaging system based entirely on the use of MIME content,
   however, can overcome these difficulties.  Various
   Content-Dispositions (see [10]) can inform the recipient's endpoint
   that it should not render the content of a body to a user.  Moreover,
   a Content-Disposition can flag the body as specifically containing an
   identity assertion.  One such
   Content-Disposition for identity assertions ("aib") is defined in
   RFC3893.  In messaging systems where multipart MIME support is not
   guaranteed in endpoints, however, this would lead to backwards
   compatibility issues.
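
   A minimal sketch of how an endpoint might use a Content-Disposition
   to keep an assertion out of the rendered content follows.  It uses
   the "aib" disposition name mentioned above, but the addresses, the
   payload and its application/octet-stream media type are placeholders
   chosen for the example; RFC3893 defines the actual body format.

```python
from email.message import EmailMessage


def build_message(text, assertion_bytes):
    """Build a multipart message carrying text plus an assertion body."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.net"
    msg["Subject"] = "Lunch"
    msg.set_content(text)
    # Placeholder payload and media type, flagged with the "aib"
    # disposition so that aware endpoints do not render it.
    msg.add_attachment(assertion_bytes, maintype="application",
                       subtype="octet-stream", disposition="aib")
    return msg


def split_for_rendering(msg):
    """Separate bodies to render from bodies holding assertions."""
    render, assertions = [], []
    for part in msg.iter_parts():
        if part.get_content_disposition() == "aib":
            assertions.append(part.get_content())
        else:
            render.append(part.get_content())
    return render, assertions
```

   An endpoint that does not recognize the disposition would fall into
   the second branch for every body, which is the backwards
   compatibility concern in miniature.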

9.3  Distributing Keys by-Reference or by-Value

   The various keying schemes described in Section 6 entail a few
   high-level models by which keys can be incorporated into requests:
   inclusion by-reference and inclusion by-value.  In the by-reference
   case, some resource in the network would hold the key, and the
   message would either explicitly (with something like a URI) or
   implicitly (through some understanding built into the identity
   architecture and messaging protocols) indicate where the key for a
   particular identity provider can be acquired.  In the by-value case,
   the key would accompany the identity assertion in the message.

   When an originator acts as an identity provider, they may not be
   capable of operating or contracting with a network service such as a
   key store.  In those cases, they have no alternative but to include
   keys by-value.  Intermediary-based identity providers would generally
   have no trouble offering keys by-reference.

   Including keys by-value is attractive if the keys are self-validating
   (as is the case with public keys bound to certificates).  If keys are
   not self-validating, then clearly an impersonator could trivially
   include a key of their own choosing with the request - this is an
   instance of the leap-of-faith model described in Section 6.1.2.
   Including certificates by-value, however, can be troublesome given
   the comparatively greater size of certificates (though in many
   messaging architectures, certificates can be incorporated into the
   content of the message without dire ramifications).  For the purposes
   of comparison, an example self-signed certificate constitutes about
   1100 bytes of data when base64 encoded; the public key it contains is
   about 270 bytes of base64 data (for a 1024 bit RSA key).

   The high-level problem with including keys by-reference is that the
   verifier must have network access (if they do not already possess the
   key) in order to validate the signature within an assertion; this is
   not a requirement when keys are carried by-value, and is important
   for recipient-based verifiers in store-and-forward messaging
   architectures.  There are also potentially non-obvious consequences
   of including keys by-reference.  Consider, for example, that if the
   message is not rendered to a recipient instantiating the verifier
   role for a protracted period of time (weeks or months), it is
   possible that the key used by the identity provider will expire or
   change during that time; one interesting property of carrying
   certificates by-value is that a verifier can determine, on the basis
   of an expired certificate shipped with a message, if the assertion
   was valid at the time it was created, provided that the assertion is
   constrained by its creation time (though there would be good reasons
   to be cautious about this practice).
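
   The check on an expired certificate described above amounts to
   comparing the signed creation time of the assertion against the
   validity period of the certificate.  The function and its return
   values below are hypothetical; the sketch assumes the validity period
   has already been extracted from the certificate and that the
   certificate chain itself has been validated separately.

```python
from datetime import datetime, timezone


def assess_key(not_before, not_after, created, now):
    """Classify a by-value certificate against an assertion's creation time."""
    if not (not_before <= created <= not_after):
        return "invalid"  # key was not valid when the assertion was made
    if now <= not_after:
        return "valid"
    # Valid at creation but expired since: to be treated with caution.
    return "expired-since-creation"


utc = timezone.utc
start = datetime(2004, 1, 1, tzinfo=utc)
end = datetime(2004, 12, 31, tzinfo=utc)
created = datetime(2004, 10, 18, tzinfo=utc)
print(assess_key(start, end, created, datetime(2005, 3, 1, tzinfo=utc)))
```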

   Also, as another non-obvious consequence of by-reference key
   distribution, note that the key store used by the identity provider
   will be notified each time that a verifier acquires a key.  This can
   actually have important privacy implications, because in some cases,
   this could reveal most or all of the recipients of the request to the
   identity provider.  The implications of this are further discussed in
   Section 10.

   It is important to recall that messages may be reflected to multiple
   recipients - potentially, many thousands in some environments.  While
   it may seem to save message bandwidth to include keys by-reference,
   thousands of requests to the key store may result in profoundly
   greater network traffic.  Note, however, that the impact for
   domain-based keys is probably less than the impact for user-based
   keys, since domain-based keys need to be acquired on a per-domain
   basis, and a domain generally encompasses many users.  There's no
   question that the impact might be significant in either case.

   Comparing the total bandwidth consumed by the two approaches, it is
   also important to note that verifiers can cache credentials.  So, if
   the verifier already has the key, and the key is still valid, a
   reference to the key in a message will not necessitate the key's
   retrieval.  When keys are sent by-value, however, the originator has
   no way to know whether or not a potential recipient already possesses
   the key; this problem is compounded by the general difficulty of
   anticipating who might conceivably receive a message.  The only safe
   policy is to send the key every time, when keys are distributed
   by-value.  When compared with the potential for thousands of
   recipients to retrieve the key from a key store, however, this is
   still comparatively a minor inconvenience.
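
   The caching behavior available to verifiers in the by-reference case
   can be sketched as follows.  The fetch callable and the shape of its
   return value are assumptions made for the example; how a key is
   actually retrieved, and how its validity period is learned, depends
   on the key store.

```python
class KeyCache:
    """Caches keys by URI so a reference need not trigger a retrieval."""

    def __init__(self, fetch):
        # Hypothetical callable: uri -> (key_bytes, expires_at)
        self.fetch = fetch
        self.entries = {}
        self.fetches = 0  # number of requests sent to the key store

    def get(self, uri, now):
        entry = self.entries.get(uri)
        if entry is None or entry[1] <= now:
            # Unknown or expired key: contact the key store.
            entry = self.fetch(uri)
            self.entries[uri] = entry
            self.fetches += 1
        return entry[0]
```

   With domain-based keys, a single cached entry serves every message
   from that domain until the key expires; with user-based keys, the
   cache needs one entry per originator.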

   It is furthermore the case that any network service which distributes
   keys to verifiers will add new threats to the overall identity
   architecture.  The security properties of the protocols used to
   implement the service become critical to the strength of the
   assertion.  Moreover, those services could be subject to
   denial-of-service attacks intended to prevent verification of
   messages with identity assertions.

   Any network service that can provide keys by-reference to verifiers
   might also provide keys to originators; originators, in contrast to
   verifiers, would only need to access this local service very
   infrequently, and at worst, only one originator would need to access
   this service per message, which compares very favorably to the
   unbounded set of verifiers to which a message might be distributed.
   In fact, the manner in which originators authenticate themselves to
   identity providers (which is outside the scope of this document) may
   innately entail a key exchange - the originator may learn the keys of
   their local domain as a matter of course.  Provided these keys are
   bound in certificates, this could potentially serve as an attractive
   manner for originators to learn their identity provider's keys in
   order to include them in messages by-value.  This may be important in
   architectures where it is desirable to add the key to the content of
   a message, but intermediaries lack the capability or permissions to
   make useful additions to the content.

   Too hard to choose? Ultimately, there is an easy way to be flexible
   about the incorporation of keys into a message.  If there is a field
   in the message that provides a URI where the key can be acquired,
   this can be purposed to include a key by-reference or by-value.  It
   can be used by-reference to indicate, for example, an HTTP or HTTPS
   URI where the key can be acquired; alternatively, it could use some
   form of DNS URL (such as [24]) to denote a particular DNS resource
   record where the key is located.  If the message uses MIME bodies as
   content, it could use the CID URI scheme [9] to designate a
   particular MIME body that contains the key.  The only option
   considered in this document for which a URI does not provide a
   solution is carrying the key by-value in the envelope, but of
   course, it wouldn't make much sense to have, for example, one header
   in a SIP request contain a URI reference to another header in the
   same message - a special-purpose header should be used to carry keys
   by-value in the envelope.
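
   The flexibility of a URI-valued field can be illustrated by a
   verifier that dispatches on the URI scheme.  The function name and
   the example URIs are hypothetical, and the set of schemes is simply
   the one discussed above.

```python
from urllib.parse import urlsplit

BY_REFERENCE = {"http", "https", "dns"}  # key held by a network resource
BY_VALUE = {"cid"}                       # key carried in a MIME body


def key_location(uri):
    """Classify a key URI as by-reference or by-value from its scheme."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme in BY_REFERENCE:
        return "by-reference"
    if scheme in BY_VALUE:
        return "by-value"
    raise ValueError("unsupported key URI scheme: %r" % scheme)


print(key_location("https://example.com/keys/domain.cer"))
print(key_location("cid:key1@example.com"))
```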

9.4  Distributing Assertions by-Reference

   It is also possible to distribute assertions by-reference; to force
   the verifier to contact a service operated by the identity provider
   in order to acquire an assertion that would be used to verify the
   message.  This is identical, from a security perspective, to a
   dial-back identity scheme; see Appendix E.2.

10.  Privacy and Anonymity

   Anonymity plays an important part in communication systems.  The
   existence of an identity system should not preclude anonymous message
   originators.  However, it is possible to strike a balance in which
   anonymized messages still contain identity assertions, and those
   identity assertions are potentially still valuable.

   Considering the classic case of an originator wishing to be anonymous
   to recipients, there are numerous ways in which this could be
   realized in the context of an identity system.  If the domain of the
   originator permits anonymous messaging, the originator could populate
   their return address in the message with, say,
   'anonymous@example.com', and send the message through the identity
   provider of 'example.com'.  This sort of anonymity is meaningful for
   domains with a great many users, and less useful as the number of
   users in the domain grows smaller.  Alternatively, a message
   anonymization service unrelated to the originator's usual domain
   could act as an identity provider for a message.  Receiving a message
   signed by 'anonymous@anonymizer.example.org' is still, in all
   likelihood, preferable from an authorization perspective to receiving
   a message without any identity assertion whatsoever.  An assertion
   provides a pointer of accountability to the originating domain in
   cases of abuse.

   Another important form of privacy relates to preventing
   intermediaries responsible for message transfer from reading the
   identity assertion.  Encryption of assertions entails a very
   different key distribution problem than identity.  In order to send
   an encrypted message to a recipient, the recipient must possess a
   corresponding decryption key.  This key needs to be shared, in some
   fashion, with the identity provider before the identity provider can
   encrypt the assertion for that recipient.  The problem is complicated
   by the potential existence of multiple recipients.  If the identity
   assertion is encrypted for one particular recipient, and ends up
   being distributed to multiple recipients by a reflector, the
   additional recipients will not be able to read or verify the
   assertion.

   There are at least two strategies for overcoming this problem:
   o  Encrypt the assertion on a per-recipient basis (i.e., include
      multiple versions of the assertion, each one encrypted with a key
      corresponding to the decryption key of each recipient).
   o  Force all recipients to share a common decryption key, and encrypt
      the assertion only once with that key.

   Both of these approaches are limited by the fact that the identity
   provider cannot anticipate who will receive the message.  Moreover,
   as the list of recipients grows larger, these strategies become
   increasingly unmanageable.  Even if a message is retargeted to only
   one destination, the identity provider has no way to anticipate what
   that destination might be.  In the final analysis, encryption of
   assertions is a very difficult practice to manage in messaging
   identity architectures.

   When a message is reflected to multiple recipients, this can give
   rise to another privacy problem.  If the identity provider's keying
   material is included in the message by-reference, then the identity
   provider will know who the verifiers are when they contact the key
   store to acquire the key (given that the identity provider operates
   or has some oversight of the key store).  While not all reflectors
   need to protect the privacy of their distribution list, it is very
   probable that some do.  This problem can even arise when a message is
   forwarded by one recipient to another recipient, who subsequently
   verifies the message, if the original recipient did not want to
   reveal to the originator that their message was forwarded.  In an
   identity architecture in which keys are always distributed by-value,
   this problem never arises; if the originator or identity provider can
   choose to include keys by-reference, however, this could be a
   material concern.  The concern lessens as the number of messages
   assured by the identity provider grows larger (i.e., large domains
   using domain-based assertions); any individual request becomes a
   needle in a haystack.  Nothing about a request for a key alone
   identifies the message that the verifier is validating - although if
   user-based assertions are used, it will reveal the originator of the
   message.  This is a major distinction between distributing keys
   by-reference and distributing assertions by-reference; dial-back
   identity schemes (see Appendix E.2) notify the identity provider of
   the exact message that the verifier is inspecting.

11.  Conclusion: Consensus Points and Questions

   If the analysis in this document illustrates anything, it's the sheer
   number of moving parts that must be fixed in order to arrive at an
   identity solution for a messaging system.  It does, however, identify
   the core consensus points in arriving at an identity solution.  The
   following are the major points that require analysis:
   o  keying: asymmetric keys vs. symmetric keys
   o  asymmetric keys: certificates vs. uncertified
   o  assertion structure: canonicalization vs. replication
   o  reference indicators: static vs. dynamic
   o  identity providers: originators vs. intermediaries
   o  verifiers: recipients vs. intermediaries
   o  content: a reference indicator vs. not a reference indicator
   o  assertions: domain-based vs. user-based
   o  assertion placement: envelope vs. content
   o  key distribution: by-reference vs. by-value

   In order to arrive at a consensus on those points, questions like the
   following need to be asked.

   Do your use cases include identity assertions being validated by
   verifiers who have no previous association with the identity
   provider? If so, this argues for using asymmetric keys rather than
   symmetric keys, since symmetric keys assume some pre-arranged key
   exchange between the identity provider and the verifier.

   Is the privacy of the recipients of a message with respect to the
   identity provider, when a message is forwarded to unanticipated
   destinations, important? At a high level, if you believe so, this
   argues for supplying keys in messages by-value, rather than
   by-reference.  Alternatively, if the by-reference key store is the
   DNS, one could argue that requests for keys are likely to be lost in
   the general mass of queries targeting the DNS server (though this may
   not be the case in practice, depending on how the query strings are
   formulated).

   Do you want recipients of a message to be able to verify messages
   off-line? If so, this also argues for supplying keys by-value.  If
   keys are supplied by-value, it is far better to use certificates than
   uncertified public keys, especially if you want domain-based
   assertions.

   Is it critical that an identity provider be securely associated with
   a particular domain? If so, this argues for domain-based assertions.
   Furthermore, depending on exactly how critical it is, this argues
   for using certificates rather than any system relying on the DNS
   (given the current state of DNSSEC deployment) or a leap-of-faith
   system.

   Is it possible to arrive at a fixed set of reference indicators for
   messages in your messaging system? If so, then this argues for using
   canonicalization rather than replication in assertions.  If not, then
   replication is probably a better bet than dynamic canonicalization.
   If you can use canonicalization, then placing assertions in the
   envelope is preferable for most messaging systems.

   Do you want the use of identity assertions to be opportunistic for
   endpoints? If so, then you want intermediaries to instantiate the
   identity provider role.

   Are you willing to try to prevent active attackers as well as passive
   attackers? If so, then you may be willing to try to use message
   content as a reference indicator.

12.  Security Considerations

   This document is entirely concerned with the security of Internet
   messaging systems.  It provides a survey of existing mechanisms to
   provide identity in Internet messaging systems in order to counter
   the seminal threat of impersonation.  Since it treats messaging in
   the abstract, rather than discussing any particular protocol, it
   makes no specific recommendation for advancing any particular
   approach to the problem.  It does, however, show how some
   architectural decisions, at a high level, are likely to be more
   successful than others.  It also suggests a way to divide-and-conquer
   decision-making about identity enhancements for applicable messaging
   systems.

13.  IANA Considerations

   This document contains no considerations for the IANA.

14.  Informative References

   [1]   Postel, J., "Simple Mail Transfer Protocol", RFC 821, STD 10,
         August 1982.

   [2]   Crocker, D., "Standard for the format of ARPA Internet text
         messages", RFC 822, August 1982.

   [3]   Oikarinen, J. and D. Reed, "Internet Relay Chat Protocol", RFC
         1459, May 1993.

   [4]   Klensin, J., "Simple Mail Transfer Protocol", RFC 2821, April
         2001.

   [5]   Resnick, P., "Internet Message Format", RFC 2822, April 2001.

   [6]   Mockapetris, P., "Domain names - concepts and facilities", RFC
         1034, STD 13, November 1987.

   [7]   Linn, J., "Privacy Enhancement for Internet Electronic Mail:
         Part I: Message Encryption and Authentication Procedures", RFC
         1421, February 1993.

   [8]   Freed, N. and N. Borenstein, "Multipurpose Internet Mail
         Extensions (MIME) Part One: Format of Internet Message Bodies",
         RFC 2045, November 1996.

   [9]   Levinson, E., "Content-ID and Message-ID Uniform Resource
         Locators", RFC 2111, March 1997.

   [10]  Troost, R., Dorner, S. and K. Moore, "Communicating
         Presentation Information in Internet Messages: The
         Content-Disposition Header Field", RFC 2183, August 1997.

   [11]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
         Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP:
         Session Initiation Protocol", RFC 3261, June 2002.

   [12]  Housley, R., Polk, W., Ford, W. and D. Solo, "Internet X.509
         Public Key Infrastructure Certificate and Certificate
         Revocation List (CRL) Profile", RFC 3280, April 2002.

   [13]  St. Andre, P., "Extensible Messaging and Presence Protocol:
         Instant Messaging and Presence", RFC 3921, October 2004.

   [14]  Peterson, J., "Session Initiation Protocol (SIP) Authenticated
         Identity Body (AIB) Format", RFC 3893, September 2004.

   [15]  Watson, M., "Short Term Requirements for Network Asserted
         Identity", RFC 3324, November 2002.

   [16]  Jennings, C., Peterson, J. and M. Watson, "Private Extensions
         to the Session Initiation Protocol (SIP) for Asserted Identity
         within Trusted Networks", RFC 3325, November 2002.

   [17]  Ramsdell, B., "Secure/Multipurpose Internet Mail Extensions (S/
         MIME) Version 3.1: Message Specification", RFC 3851, July 2004.

   [18]  Elkins, M., Del Toro, D., Levien, R. and T. Roesler, "MIME
         Security with OpenPGP", RFC 3156, August 2001.

   [19]  Sparks, R., "The Session Initiation Protocol (SIP) REFER
         Method", RFC 3515, April 2003.

   [20]  Sparks, R., "The Session Initiation Protocol (SIP) Referred-by
         Mechanism", RFC 3892, September 2004.

   [21]  Crocker, D., "Internet Mail Architecture",
         draft-crocker-mail-arch-01 (work in progress), July 2004.

   [22]  Klyne, G. and J. Palme, "Registration of mail and MIME header
         fields", draft-klyne-hdrreg-mail-05 (work in progress), May
         2004.

   [23]  Arends, R., Austein, R., Larson, M., Massey, D. and S. Rose,
         "Protocol Modifications for the DNS Security Extensions",
         draft-ietf-dnsext-dnssec-protocol-09 (work in progress),
         October 2004.

   [24]  Josefsson, S., "Domain Name System Uniform Resource Locators",
         draft-josefsson-dns-url-10 (work in progress), September 2004.

   [25]  Peterson, J. and C. Jennings, "Enhancements for Authenticated
         Identity Management in the Session Initiation Protocol (SIP)",
         draft-ietf-sip-identity-03 (work in progress), September 2004.

   [26]  Bradner, S., "Key words for use in RFCs to indicate requirement
         levels", RFC 2119, March 1997.

   [27]  Handley, M., Schulzrinne, H., Schooler, E. and J. Rosenberg,
         "SIP: Session Initiation Protocol", RFC 2543, March 1999.


Author's Address

   Jon Peterson
   NeuStar, Inc.
   1800 Sutter St
   Suite 570
   Concord, CA  94520
   US

   Phone: +1 925/363-8720
   EMail: jon.peterson@neustar.biz
   URI:   http://www.neustar.biz/

Appendix A.  Acknowledgments

   The author drew considerable inspiration for this document from the
   longstanding discussion of identity on the SIP mailing list.  The IAB
   Workshop on Messaging in October of 2004 was also a valuable
   influence.




Appendix B.  Verification Assertions

   A verification assertion is a piece of information added to a message
   by an intermediary-based verifier which asserts that an identity
   assertion in the message was verified.  These assertions are most
   useful in architectures in which a recipient cannot be expected to
   instantiate the verifier role itself.  However, it is also possible
   that verification assertions could be inspected by intermediaries
   between the verifier and the recipient.

   Verification assertions may be cryptographic, but typically they are
   not.  Usually, the recipient has some specific trust relationship
   with the verifier, which may include the use of some other form of
   security (for example, network or transport layer security) which
   guarantees that the verification assertion was created by the trusted
   verifier.

   A verifier may strip any identity assertion from a message before
   adding a verification assertion, or it may leave the assertion in the
   message.  The latter option is preferable, in so far as it is
   forwards-compatible with recipients instantiating the verifier role.
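
   A minimal sketch of an intermediary-based verifier for an email-like
   system follows.  The header names X-Identity-Assertion and
   X-Identity-Verified, and the check callback standing in for the
   actual verification of the identity assertion, are assumptions made
   for illustration.  The sketch also discards any verification
   assertion already present, on the assumption that only the trusted
   verifier should be able to add one.  By default it leaves the
   identity assertion in place, the option preferred above.

```python
from email.message import EmailMessage
from typing import Callable

ASSERTION_HDR = "X-Identity-Assertion"    # hypothetical header name
VERIFICATION_HDR = "X-Identity-Verified"  # hypothetical header name


def add_verification_assertion(
    msg: EmailMessage,
    verifier_name: str,
    check: Callable[[EmailMessage], bool],
    strip_assertion: bool = False,
) -> EmailMessage:
    """Record the verifier's verdict as a non-cryptographic header."""
    # Assumption: a verification assertion received from upstream is untrusted.
    del msg[VERIFICATION_HDR]
    verdict = "pass" if check(msg) else "fail"
    msg[VERIFICATION_HDR] = f"{verdict}; verifier={verifier_name}"
    if strip_assertion:
        # Stripping prevents the recipient from acting as a verifier itself.
        del msg[ASSERTION_HDR]
    return msg
```

   Because the added header is not cryptographic, the recipient can rely
   on it only if some other mechanism, such as transport layer security
   to the verifier, guarantees where it came from.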

   While verification assertions are probably important for some
   architectures, they are not strictly necessary to implement an
   identity service.  In fact, by rendering the identity architecture
   less end-to-end, verification assertions may weaken the overall
   security of the architecture.

Appendix C.  Messaging: Real-Time versus Store-and-Forward

   In most respects, the high-level messaging architectures discussed in
   this document share common security properties regardless of whether
   they are real-time or store-and-forward.  However, there are a few
   important respects in which the two differ:
   Delay from Computation:  The instantiation of the verifier and
      identity provider roles by the system (more or less irrespective
      of where they are located) will incur some delay corresponding to
      the complexity of the cryptosystems they employ.  While this delay
      is not likely to be noticeable in store-and-forward messaging
      systems, it may be perceptible (and undesirable) in real-time
      communications systems.
   Offline Handling:  Store-and-forward systems allow users to read
      their messages offline.  Accordingly, if the recipient acts as a
      verifier, the verifier might not be online when it reads the
      message.  This has important implications for any sort of keys or
      assertions that are carried by-reference (and for dial-back
      identity schemes).
   Delivery Receipts:  Real-time messaging protocols tend to provide
      real-time acknowledgements of message delivery by default.  These
      acknowledgements in turn have important identity properties.
      While the same is true of various optional delivery
      acknowledgement mechanisms that can be used in store-and-forward
      systems, in real-time systems the responses returned to a message
      can invoke all sorts of behavior on the originator side, including
      resubmission of the request to alternate destinations and so on.
      Any sort of response identity is outside the scope of this
      document, and believed to be separable from the message identity
      work described in this document.
   By-Reference Subversion:  In order to subvert a request to acquire
      keys by-reference from a key store, it really helps if the
      attacker knows when the verifier will initiate the request.  In
      real-time messaging architectures, this is relatively clear - it
      will be soon after the message has been sent.  In
      store-and-forward architectures, since the verifier might not
      validate the message for hours or days or weeks, it can be very
      difficult for the attacker to make this determination.  Note that
      even in store-and-forward architectures, if an intermediary acts
      as a verifier, this distinction becomes less acute - there is a
      comparatively smaller time-window in which an intermediary is
      likely to verify an assertion, and accordingly, it may be easier
      to subvert a request for a key when an intermediary is the target.
   Creation Time as a Reference Indicator:  In real-time messaging
      systems, the creation time of a message is a very strong reference
      indicator, since delivery of messages is expected to be very
      quick.  Accordingly, passive attackers have only a small interval
      of time to mount a replay attack using an assertion with a
      creation time reference indicator.  In store-and-forward
      architectures, the delivery window is extremely large, so creation
      time is a less valuable reference indicator (though not entirely
      useless).
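
   The last point can be made concrete with a freshness check on the
   creation time.  This is a sketch under assumed values: the function
   name and both window sizes are illustrative, not recommendations.

```python
from datetime import datetime, timedelta, timezone

REAL_TIME_WINDOW = timedelta(seconds=30)      # illustrative value
STORE_AND_FORWARD_WINDOW = timedelta(days=7)  # illustrative value


def is_fresh(created: datetime, now: datetime, window: timedelta) -> bool:
    """Accept an assertion only if its creation time lies within the window."""
    age = now - created
    # Reject assertions dated in the future as well as stale ones.
    return timedelta(0) <= age <= window


now = datetime(2004, 10, 18, 12, 0, tzinfo=timezone.utc)
replayed = now - timedelta(minutes=10)  # assertion captured ten minutes ago

# The same replayed assertion is rejected by the narrow real-time window
# but still accepted by the wide store-and-forward window.
real_time_ok = is_fresh(replayed, now, REAL_TIME_WINDOW)
store_forward_ok = is_fresh(replayed, now, STORE_AND_FORWARD_WINDOW)
```

   The wider the window a system must tolerate, the longer a captured
   assertion remains usable for replay, which is why creation time
   carries less weight in store-and-forward architectures.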

Appendix D.  Third-Party Assertions

   Many messaging architectures assign important roles to third parties.
   To take a familiar example, email has the concept of a mailing list
   which sends messages on behalf of an originator.  For the purposes of
   this document, a third-party assertion is differentiated from an
   ordinary identity assertion as follows: a third-party assertion is
   provided by an identity provider that is not authoritative for the
   namespace containing the name of the originator of the message
   (following the general constraints of Section 7.1).

   Depending on the sorts of authorization decisions that a verifier
   might want to perform, the identity of the originator may be
   secondary, or even totally irrelevant, when a third-party is
   involved.  A particular recipient might wish to accept any email
   message from a particular mailing list, for example, without regard
   to the identity of a particular originator.  Other practical examples
   include chat-rooms of instant messaging systems, and systems in which
   one endpoint can instruct another endpoint to send a message (such as
   the SIP REFER [19] method).

   Clearly, the manner in which a third-party asserts something about a
   message is orthogonal to the broader question of how to identify the
   originator of a message.  However, it is certainly possible that
   third-parties may want to add additional cryptographic information to
   a message in order to allow particular authorization decisions to be
   made available to recipients.  The formulation of third-party
   assertions seems to be a problem that is entirely separable from the
   identification of the originator, and is thus out of scope of this
   document.  Future work could identify a means of providing
   third-party assertions that is entirely supplemental to the identity
   work in this document.

   An example of a third-party assertion is the Referred-by [20] token
   associated with the SIP REFER method.

Appendix E.  Alternatives to Identity Assertions

   Identity assertions are not the only means of increasing a
   recipient's surety of the identity of an originator of a request.

E.1  Trusted Intermediary Networks

   It is important to note that identity assertions are primarily
   motivated by the interdomain nature of messaging.  Within a single
   administrative domain, both the originator and the recipient of any
   message must trust the same domain in order for messaging to function
   at all.  Accordingly, they can assume (perhaps without good
   justification) that the domain would not connect them if it had not
   properly authenticated them both.

   Given this, some messaging architectures try to extend the boundaries
   of an administrative domain in order to treat interdomain messaging
   as an intradomain problem.  In contrast to cryptographic assertions,
   these identity systems rely on particular deployment architectures to
   guarantee the security properties of the assertion.  The only
   assertion that is actually carried in the message is a separate
   envelope element that provides an 'authoritative' return address.

   For example, consider the 'trust domain' concept defined in Section
   2.3 of RFC3324 [15].  In this messaging architecture, a trusted
   network is a set of intermediaries that exchange messages with one
   another over a closed network (a network that is either logically or
   physically inaccessible from the Internet).

   Assuming such a trusted network, one can design a very simple
   identity assertion.  For example, in an email network, one could
   introduce a new 'Trusted-From' header field whose contents could only
   be set by intermediaries in the trusted network.  The identity
   information conveyed by such a system is the contents of this trusted
   header.  Recipients treat this trusted header as the assured identity
   of the originator.  An example of this sort of trusted assertion is
   RFC3325 [16], which defines the P-Asserted-Identity header field for
   SIP.
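
   A sketch of the border behavior such a header requires is given
   below, assuming an email-like message.  The function name and its
   parameters are hypothetical; 'Trusted-From' is the illustrative
   header introduced above, not a registered field.  How the
   intermediary decides that a peer is inside the trusted network, and
   how it authenticates a sender, are out of scope of the sketch.

```python
from email.message import EmailMessage

TRUSTED_HDR = "Trusted-From"  # illustrative header from the example above


def relay_into_trusted_network(
    msg: EmailMessage,
    from_trusted_peer: bool,
    authenticated_sender: str = "",
) -> EmailMessage:
    """Apply the border rule: only trusted intermediaries set Trusted-From."""
    if from_trusted_peer:
        # Arrived over the closed network: keep what the peer asserted.
        return msg
    # Arrived from outside: discard any forged value, then assert our own
    # only if this intermediary actually authenticated the sender.
    del msg[TRUSTED_HDR]
    if authenticated_sender:
        msg[TRUSTED_HDR] = authenticated_sender
    return msg
```

   The security of the header rests entirely on every border
   intermediary applying this rule; a single intermediary that passes
   an outside value through unchanged defeats it for the whole network.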

   The traditional Internet Relay Chat (IRC [3]) service relied on a
   similar concept of trusted intermediaries.  Intermediaries formed a
   meshed trust network over which messages passed, and each server was
   responsible for authenticating its users.

   While this model has enjoyed considerable success in closed networks
   such as the telephone network, it has a number of limitations which
   render it incompatible with widespread Internet deployment of a
   messaging architecture.  Forming closed overlay networks of providers
   that agree on network or transport-layer security standards and
   practices does not agree with the general model of Internet
   messaging, in which domains may exchange messages without any
   previous association.

   Other, more sophisticated forms of transitive trust are ad-hoc.  For
   example, a message could contain an explicit indication that any
   intermediary that relays the message needs to use some form of
   transport or network-layer security when sending to the next hop.
   Assuming a proper keying architecture, intermediaries can mutually
   authenticate one another from the originating domain to the domain of
   the recipient.  The SIPS URI scheme in RFC3261 [11] has this property.
   The main drawback to such mechanisms is that it is impossible for any
   intermediary or recipient to verify that appropriate lower-layer
   security was used over any particular transit hop.  This is, in fact,
   the main problem with trusted networks in general - a given domain
   must trust that the remainder of the domains in the network behave
   properly.

E.2  Dial-back Identity

   A dial-back identity system for messaging works as follows: when a
   verifier receives a message, it inspects the name that identifies the
   originator (such as the RFC2822.From header for email), and then
   launches a dial-back request to that name.  This dial-back request
   must contain reference indicators for the request, either by-value or
   possibly as a hash of a canonicalization of the reference indicators.
   In another variant, the message itself contains such a hash which is
   verifiable by the recipient (essentially, an unsigned identity
   assertion), and the recipient then sends that hash in the backwards
   direction to the identity provider.

   Assuming the name of the originator is valid, an identity provider
   responsible for the namespace of the originator's name will receive
   the request.  If this identity provider is the originator, it can
   reply to the request with a positive response if it did indeed send
   the message in question.  If the identity provider is some
   intermediary, it would need some way to ascertain that the originator
   sent that message; possibly, the originator sent the message through
   the identity provider, and the identity provider keeps state for
   every message it handled.  However the intermediary-based identity
   provider learns of the validity of a request, it returns a positive
   response if the request was in fact sent from the originator in
   question.  If the identity provider does not recognize the described
   message, it sends a negative response.  No response (because the
   domain of the originator's name doesn't exist, or exists but has no
   identity provider) is assumed to be a negative response.
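
   The exchange described above can be sketched as follows for an
   intermediary-based identity provider that keeps state for each
   message it handles.  The canonical form, the use of SHA-256, and all
   names are assumptions made for illustration; a real dial-back
   request would travel over the network rather than being a direct
   function call.

```python
import hashlib


def indicator_hash(indicators: dict) -> str:
    """Hash a simple canonical form of the reference indicators."""
    # Assumed canonicalization: lowercase names, trimmed values, sorted,
    # one "name:value" pair per line.
    pairs = sorted((name.lower(), value.strip())
                   for name, value in indicators.items())
    canonical = "\n".join(f"{name}:{value}" for name, value in pairs)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdentityProvider:
    """Intermediary that remembers every message it handled."""

    def __init__(self) -> None:
        self._sent = set()

    def record_outbound(self, indicators: dict) -> None:
        self._sent.add(indicator_hash(indicators))

    def answer_dial_back(self, digest: str) -> bool:
        # Positive response only for a message this provider recognizes.
        return digest in self._sent


def verify_by_dial_back(indicators: dict, provider) -> bool:
    """Verifier side: a missing identity provider counts as a negative."""
    if provider is None:
        return False
    return provider.answer_dial_back(indicator_hash(indicators))
```

   Sending only the hash keeps the dial-back request small, at the cost
   of requiring the verifier and the identity provider to agree on
   exactly the same canonical form.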

   Depending on the semantics of the request, it may be somewhat
   intensive for the identity provider to make a determination of
   whether or not the request was actually sent by the originator.  If a
   message is forwarded to numerous recipients, obviously this
   per-message work becomes larger, and for cases like large email
   mailing lists, it may become unmanageable.  The use of unsigned
   hashes in the message moves this work to a phase before the message
   is sent, rather than after the dial-back request is received.

   In some respects, dial-back has similar properties to the DNS-based
   mechanisms of key distribution discussed in Section 6.1.2.  Since a
   dial-back system relies on a request being sent in the backwards
   direction using the name of the originator, it necessarily relies on
   the validity of the DNS to reach that name.  However, unlike the
   DNS-based uncertified keying mechanisms, dial-back requires no
   special modifications to the DNS.

   Dial-back identity systems have enjoyed some success in real-time
   messaging systems, but clearly their applicability to
   store-and-forward systems is limited, especially when the identity
   provider role is instantiated by originators.

   All in all, within their domain of applicability, dial-back identity
   systems improve security with little expenditure of design effort.
   They are not considered further in this document because they are not
   predicated on identity assertions as such.




Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Copyright Statement

   Copyright (C) The Internet Society (2004).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.

