Network Working Group                                        J. Peterson
Internet-Draft                                                   NeuStar
Expires: April 18, 2005                                 October 18, 2004

   Security Considerations for Impersonation and Identity in Messaging
                                Systems
                   draft-peterson-message-identity-00

Status of this Memo

   By submitting this Internet-Draft, I certify that any applicable patent or other IPR claims of which I am aware have been disclosed, and any of which I become aware will be disclosed, in accordance with RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 18, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2004). All Rights Reserved.

Abstract

   This document provides an overview of the concept of identity in Internet messaging systems as a means of preventing impersonation. It describes the architectural roles necessary to provide identity, and details some approaches to the generation of identity assertions and the transmission of such assertions within messages. The trade-offs of various design decisions are explained.

Peterson                 Expires April 18, 2005                 [Page 1]

Internet-Draft              Message Identity                October 2004

Table of Contents

   1.  Introduction
       1.1  Terminology
   2.  What is Identity?
   3.  Roles in an Identity System
       3.1  Identity provider
       3.2  Verifier
   4.  Threat Model of Impersonation in Messaging Systems
   5.  Identity Assertions
   6.  Keying for Assertions
       6.1  Asymmetric Keys
            6.1.1  Certificates
            6.1.2  Uncertified Public Keys
       6.2  Symmetric Keys
   7.  User-based and Domain-based Assertions
       7.1  Name Subordination
   8.  Reference Indicators and Replay Protection
       8.1  Canonicalization versus Replication
       8.2  Assertion Constraints and Scope
   9.  Placement of Assertions and Keys in Messages
       9.1  Assertions in the Envelope
       9.2  Assertions in the Content
       9.3  Distributing Keys by-Reference or by-Value
       9.4  Distributing Assertions by-Reference
   10. Privacy and Anonymity
   11. Conclusion: Consensus Points and Questions
   12. Security Considerations
   13. IANA Considerations
   14. Informative References
       Author's Address
   A.  Acknowledgments
   B.  Verification Assertions
   C.  Messaging: Real-Time versus Store-and-Forward
   D.  Third-Party Assertions
   E.  Alternatives to Identity Assertions
       E.1  Trusted Intermediary Networks
       E.2  Dial-back Identity
       Intellectual Property and Copyright Statements

1.  Introduction

   Widespread forgery of the From header field of email [5] messages is the most immediate motivation for work on message identity systems. However, there are numerous other messaging systems used on the Internet that currently confront similar problems, or are likely to confront these problems in the future; notably instant messaging systems and other real-time communications systems that leverage a messaging architecture as a rendezvous protocol for session establishment. All of these systems suffer from a similar threat of impersonation (as described in Section 4). Messaging identity mechanisms, as defined in this document, address specifically the threat of impersonation in messaging systems.

   It is unlikely that the diverse identity requirements of these various messaging systems will admit of any single solution that could be deployed for all such protocols. However, there is much to be gained by considering the broad body of work on the messaging identity problem that has already been done across this wide selection of protocols. The core commonalities of these systems permit a high-level analysis of the message identity problem that could assist all messaging protocols in selecting an appropriate way of incorporating identity.

   This document aspires to apply to messaging systems with the following architectural qualities:

   o  The messaging system has two kinds of agents: originators and recipients. Both originators and recipients interact with the system through endpoints. Messages sent from endpoints may pass through multiple intermediaries before arriving at the recipient.
      For the purposes of this document, reflectors and similar services are lumped in with intermediaries, even if from a protocol perspective they act more like endpoints.

   o  The messaging system employs names that are constituted of a 'host' portion, which is a DNS [6] name (allocated through the delegative administration of the DNS) and a 'user' portion which is administered by the domain indicated in the 'host' portion.

   o  The messaging system carries messages that are divided into two major components: envelope and contents. The distinction between the two is inexact, but primarily the content is intended to be rendered by the recipient's application, whereas much of the envelope contains addressing and routing data that is used by intermediaries. [In deference to the email community, 'envelope' here should be understood to encompass both the envelope and header portions of a message.]

   o  The messaging system is used in an interdomain context. Different administrative domains may deploy messaging intermediaries and issue names to valid local users. Administrative domains need to be capable of exchanging messages with one another even if they have no previous association.

   o  The messaging system is capable of 'retargeting' a message in transit, and delivering it to a recipient whose name in the system is not identical to that of the intended recipient specified by the originator. Primarily, this arises when an intermediary forwards a message to multiple recipients (in which case the resource designated as the intended recipient is some sort of reflector).

   This document was written based on the author's experience developing identity solutions for the Session Initiation Protocol (SIP, [11]) and on consideration of several proposals circulating to provide similar features in email.

   The scope of this document is limited to the generation, carriage, and consumption of identity assertions.
   It does not consider any authorization decisions that might be made, on the basis of the identity of the originator, by the consumer of identity assertions.

   This document is organized as follows: Section 2 attempts to define identity, and to demonstrate broadly the current manner in which identity is communicated in messaging systems. Section 3 describes the abstract roles that must be instantiated in a system in order to incorporate identity assertions into a messaging architecture: the identity provider and the verifier. The threat model for impersonation in messaging systems is considered in Section 4. Section 5 defines an identity assertion, and explains the manner in which cryptography can be leveraged to generate assertions. Section 6 provides an overview of keying and key distribution architectures that provide a foundation for sharing cryptographic assertions. Section 7 compares the traditional concept of user-based assertions with the newer, and perhaps more promising, idea of domain-based assertions. Section 8 considers the internal composition of an identity assertion, and the elements in a message which the assertion must guarantee in order to be correlated with a message. Section 9 considers various ways that an assertion might be added to a message. Section 10 considers the privacy and anonymity implications of adding identity assertions to messages. Section 11 attempts to pose the key questions that should determine how a messaging protocol approaches the incorporation of an identity mechanism, and to note when this high-level analysis has revealed any general principles that point one way or another on these questions. Various appendices discuss related material that is not directly in the scope of the primary analysis.

1.1  Terminology

   This document intentionally uses core terminology that is neutral to existing messaging protocols.
   Terminology specific to email is taken from [21].

2.  What is Identity?

   Every communications system has a namespace. For example, the telephone network uses telephone numbers as a namespace, the postal system uses postal addresses as a namespace, and the Internet Protocol uses Internet Protocol addresses as a namespace. In order for a name to be usable, it must meet the syntactical constraints of the namespace, and it must be unique within the namespace. Accordingly, namespaces generally require significant centralization of administration, though in many cases, delegation can distribute this work across multiple distinct authorities. In the context of a particular communications system, the semantics of these names enables the system to route communications to the appropriate resources.

   In the most common messaging system on the Internet today, email, the namespace is founded on the Internet Domain Name System (DNS [6]). Names (in RFC2822 [5] terms, the 'addr-spec') are constituted of a 'host' portion, which is a DNS name (allocated through the delegative administration of the DNS) and a 'user' or 'local-part' portion which is administered by the domain indicated in the 'host' portion, and which designates a particular resource or user in the domain. As the message transfer service delivers the message, the host portion of the destination email address is resolved in the DNS (though practically, a message may pass through many intermediary administrative domains before reaching its destination). Aside from email, many other Internet messaging systems have constructed namespaces with the same components: a domain name host portion and a domain-specific user portion.

   When a message is delivered to its recipient, the recipient has a strong interest in knowing who the message is from.
   While the contents of a message may be sufficient to identify the originator to the recipient, it may also happen that:

   o  the contents of the message do not identify the originator

   o  the contents of the message fabricate the identity of the originator

   o  the recipient does not wish to read the contents of the message without first identifying the originator

   Most protocols therefore provide a field which designates the originator of a communication. Generally, the originator is identified by their name in the communication system. For example, in the postal network, the originator is identified by their return address; by convention, the return address of the originator appears on the outside of an envelope. In caller identification systems used in the telephone network, the telephone number of the caller is displayed to the callee. In email systems, user agents render the contents of the RFC2822.From header field of an email message as the originator.

   Nothing forces the originator of a postal message to supply a genuine return address on an envelope; originators are incented to provide a genuine return address only if they want the envelope to be returned to them if it cannot be delivered. Similarly, the RFC2822.From header field of an email message can be populated arbitrarily by the originator (though it is not necessarily the address to which bounces are sent). Malicious originators may want to provide a misleading or false return address for their messages, or to withhold a return address altogether, in order to escape reports of abuse or to mislead the recipient about the origins of the message. While there are valid cases where anonymous communication is necessary, impersonation can be very problematic.

   For the purposes of this document, 'identity' refers to mechanisms that provide an assurance of the originator of a message.
   An identity assurance is provided by a party in the messaging architecture that can prove its authority over a segment of the namespace. For the identity systems considered in this document, that may entail proof of authority over DNS names, or it may also be authority specific to a particular user within a domain. This assurance is communicated along with the message, and can be verified by recipients of the message.

3.  Roles in an Identity System

   This document postulates two fundamental roles in a messaging identity architecture: an identity provider and a verifier. These roles might usefully be instantiated by any elements in a messaging architecture. Most commonly, an originator or a proxy for the originator of a message will act as an identity provider, and the recipient or a proxy for the recipient will act as a verifier. However, this is far from the only valid assignment of these roles. There are even useful architectures where it is meaningful for the originator to act both as the identity provider and the verifier (where token-based assertions are used to authenticate network reports of undeliverable messages).

3.1  Identity provider

   The role of the identity provider in an identity architecture is to generate an identity assertion. An identity assertion is a chunk of information added to a message which can later be verified to assure the identity of the originator.

   An identity provider must be capable of authenticating the originator of the message. The messaging architecture of the system in question, and the entity that plays the role of the identity provider, will largely determine how this authentication takes place. If the identity provider is instantiated by the endpoint of the originator, for example, this authentication might be tacitly assumed, or occur in some application-specific way.
   If the identity provider is built into an intermediary, some network authentication mechanism must be used by the identity provider to ascertain the identity of the originator.

   An identity provider must have some verifiable authority over a segment of the namespace of this messaging system; that is, it must be capable of proving to verifiers that it is the appropriate entity to identify the originator of a particular message. This proof of authority can come in many forms, depending on the type of assertion that the identity provider generates.

   Once the originator has been authenticated, the identity provider must furthermore determine whether or not the originator is authorized to send the message in question; this practice is most relevant to cases in which the identity provider role is instantiated by an intermediary, since in those cases where the originator's endpoint instantiates the identity provider, the originator itself has authority over the relevant segment of the namespace. When it is necessary, this authorization decision may be based on a number of factors; for our purposes, the most important is the identity claimed by the originator of the message. An originator may be authorized to claim one identity, or any of a number of identities, in accordance with the policy of the controller of the namespace containing the identity. Identity providers only provide identity assertions for messages in which the originator claims an authorized identity.

   Ideally, an identity provider will be the last entity in the architecture that will modify the message in transit. An assertion will create a signature over certain elements of the message, and if the message is subsequently modified, the modification may invalidate this signature. The severity of this condition is entirely dependent on the nature of the assertion, and on the elements of the message which are guaranteed by the assertion.
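
   The dependence on which elements an assertion guarantees can be illustrated with a minimal sketch. It is not drawn from any particular protocol: the field names, the key, and the choice of covered elements are hypothetical, and a keyed digest (HMAC) stands in for the signature only because it is available in the Python standard library.

```python
import hashlib
import hmac

# Hypothetical key and field names, for illustration only.
PROVIDER_KEY = b"example-provider-key"
COVERED_FIELDS = ("from", "date", "body")


def make_assertion(message: dict) -> str:
    """Compute a keyed digest over only the covered elements of a message."""
    covered = "\n".join(f"{name}:{message[name]}" for name in COVERED_FIELDS)
    return hmac.new(PROVIDER_KEY, covered.encode("utf-8"), hashlib.sha256).hexdigest()


def assertion_holds(message: dict, assertion: str) -> bool:
    """Recompute the digest over the covered elements and compare."""
    return hmac.compare_digest(make_assertion(message), assertion)


original = {
    "from": "joe@example.com",
    "date": "Mon, 18 Oct 2004 10:00:00 -0000",
    "subject": "hello",
    "body": "See you at noon.",
}
assertion = make_assertion(original)

# An intermediary rewrites an element the assertion does not cover.
retagged = dict(original, subject="[list] hello")
# An intermediary rewrites an element the assertion does cover.
rewritten = dict(original, body="See you at noon.\n-- list footer")
```

   In this sketch the assertion still holds for `retagged` but not for `rewritten`, which is the trade-off at issue: covering more elements makes the assertion more specific to the message, and also more fragile in the face of intermediaries that modify messages.
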
   In practice, most messaging systems modify messages in some fashion throughout their transit of the network, and subsequent modification after the generation of an identity assertion is most likely unavoidable in any practical deployment.

   An identity provider must be capable of modifying a message, or forcing another entity in the architecture to modify the message in a particular way, in order to incorporate the identity assertion.

   Commonly, creating an identity assertion involves the use of cryptography, and accordingly, generating identity assertions may slow message creation or processing in the identity provider.

3.2  Verifier

   A verifier consumes an identity assertion in order to verify the identity of the originator of a message. After inspecting an identity assertion, a verifier may make an authorization decision to act on the message in any of a number of ways. Authorization decisions made by verifiers are outside the scope of this document.

   In order to perform its function, a verifier must be capable of reading the identity assertion in a message. Depending on the placement of the assertion in the message, and the underlying architecture of the messaging system, this may limit the entities that can instantiate the verifier role.

   It is possible that more than one verifier will inspect the same assertion in a message. In some architectures, it may make sense for one or more intermediaries to act as verifiers before a message reaches its recipient, which may also act as a verifier. Alternatively, an intermediary could reflect a message to a potentially large list of recipients, in which case each recipient (and/or intermediaries acting on their behalf) might act as a verifier.
   In other architectures, an intermediary acting as a verifier might strip the identity assertion before forwarding the message; in such cases, the intermediary might replace the identity assertion with a verification assertion (see Appendix B). Verification assertions can also be added without stripping identity assertions.

   Commonly, the verification of an identity assertion involves the use of cryptography, and accordingly, verifying identity assertions may slow message processing in the verifier.

4.  Threat Model of Impersonation in Messaging Systems

   Impersonation is the practice of falsifying the elements of a message that indicate its originator. This is generally done in order to mislead a recipient about the origins of the message.

   The most common adversary in impersonation threats is a passive attacker. A passive attacker can capture email messages in some way: they may see messages in transit, they may see archives of messages on the web, or they might even be a recipient of a message. By capturing messages, the impersonator learns how a genuine originator structures their messages, including the manner in which elements of the message that indicate the originator are populated. The impersonator then sends messages that mimic the structures used by the originator they intend to impersonate, altering the destinations, contents, and other meaningful headers as needed. In the case of fictional originators, impersonators merely create plausible-looking messages based on their experience with typical originators. In many current messaging systems, there is no need to do anything other than adopt the name of the desired originator and inject the message into the messaging system.

   The manner in which an impersonator injects messages into the messaging system admits of varying degrees of sophistication.
   A passive attacker may, for example, only be capable of injecting messages as an originator, or they may control or be capable of imitating intermediaries in the system. This can have a large impact on the way that other elements in the messaging system perceive their forgeries.

   Another type of impersonator is an active attacker. An active attacker can intercept messages in transit, modify them arbitrarily, and then return them to the message transit system. This is a harder sort of attack to mount, and a much harder attack to defeat; consequently it may not be in the scope of identity assertion systems to prevent this sort of attack. Since many intermediaries that are not actually attackers exhibit essentially indistinguishable behavior, designers of identity systems are further disincented from meeting this threat.

   The uses of impersonation are legion. An impersonator may want to avoid reports of abuse, or accountability for the contents of messages. Or, an impersonator may want to make a message appear to come from a particular originator to whom they believe a recipient will be sympathetic (which may lead the recipient to read a message and inspect content of the impersonator's choosing).

   Primarily, the purpose of an identity assertion is to prevent impersonation. This means that it must provide the following qualities:

   o  In order for an assertion to be valuable, it must provide a stronger assurance than the return address conventionally attached to a message. For example, an email identity system would be totally uninteresting if it allowed any originator to arbitrarily populate their identity, because this would constitute no improvement over the existing RFC2822.From header field. Typically, the strength of the assertion depends on some form of cryptography, and provable authority over the namespace of the originator.
      In some constrained environments, assertions instead derive their authority from some form of transitive trust (see Appendix E.1); such assertions are outside the scope of this document.

   o  The assertion must have a precise scope and constraints (see Section 8.2), whether these are explicit in the message or static and understood implicitly in the messaging protocol. It is assumed that the means by which a passive attacker collects messages will also allow them to collect identity assertions, and impersonators may accordingly attempt to replay them. Constraints are intended to combat replay attacks.

   o  The assertion must denote the identity provider in some secure fashion, and provide any information necessary for the verifier to validate cryptographic properties of the assertion. Assertions must provide verifiers with a means of determining whether or not the identity provider is authoritative for the namespace of the originator of a message.

5.  Identity Assertions

   An identity assertion is a piece of information (perhaps a header, a parameter, or an attached document) added to a message by an identity provider in order to provide verifiers with identity information about the originator of the message.

   Most existing and proposed identity mechanisms for Internet messaging systems leverage some form of cryptography. Public key (or 'asymmetric') cryptography is an especially attractive tool in this context, because it allows a verifier to validate an assertion even if it has never before been contacted by that originator. Symmetric key cryptography, by way of contrast, requires that the identity provider and verifier share some pre-arranged secret. Cryptographic signatures generated by an asymmetric keying mechanism provide authentication of the signer and integrity over the signed information.
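
   The pre-arranged secret of the symmetric case, and the replay concern raised in Section 4, can be illustrated together with a rough sketch. The secret, the date format, and the choice of message elements below are assumptions made for the example; the keyed digest (HMAC) is used because the Python standard library offers no public-key signature primitive, not because it is a recommended mechanism.

```python
import hashlib
import hmac

# Hypothetical secret pre-arranged between identity provider and verifier.
SHARED_SECRET = b"pre-arranged-secret"


def keyed_digest(*parts: str) -> str:
    """Keyed digest over the given parts (newline-joined for brevity)."""
    data = "\n".join(parts).encode("utf-8")
    return hmac.new(SHARED_SECRET, data, hashlib.sha256).hexdigest()


def verify(assertion: str, *parts: str) -> bool:
    """Recompute the digest over the same parts and compare."""
    return hmac.compare_digest(keyed_digest(*parts), assertion)


identity = "joe@example.com"

# Assertion over the identity string alone: nothing ties it to one message.
weak = keyed_digest(identity)

# Assertion that also covers message-specific data (a hypothetical choice
# of a date and the body).
genuine_date, genuine_body = "2004-10-18T10:00:00Z", "See you at noon."
bound = keyed_digest(identity, genuine_date, genuine_body)

# A passive attacker who captured both assertions attaches them to a forgery.
forged_body = "Please wire the funds today."
weak_replay_accepted = verify(weak, identity)
bound_replay_accepted = verify(bound, identity, genuine_date, forged_body)
```

   Here the captured identity-only assertion verifies no matter which message carries it, while the assertion bound to message-specific data fails once the body differs.
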
   There are a number of ways that a signature can provide identity information, depending on the type of key used to generate the signature, and the identity of the signer.

   Providing a signature over an identity string like 'joe@example.com' alone, however, does not provide a strong assertion of the identity of the originator of the message. The assertion must contain enough supplemental information that it is clear that it refers to this particular message, not just any message in which an attacker might try to replay the assertion. The constraints and scope of assertions are discussed further in Section 8.

   Assertions may also be encrypted. In some cases, it may be desirable to hide the identity of the originator of a message from intermediaries, but to reveal this information only to a particular recipient, or vice versa. Potentially, this could provide certain privacy properties to an identity assertion mechanism (see Section 10).

   The use of cryptography requires some mechanism for key distribution and may require a public key infrastructure with widely-distributed root certificates. Encrypting identity assertions requires more complex keying systems. The use of certificates, uncertified asymmetric keys, and symmetric keys is discussed in Section 6.

6.  Keying for Assertions

   Cryptographic identity assertions require the use of keys. In order for a cryptographic signature over an assertion created by an identity provider to be validated by a verifier, both parties must possess corresponding keying material. Since Internet messaging systems assume that messages can be sent to arbitrary recipients that have no previous association with the originator, key distribution is the primary problem confronting the use of cryptographic identity assertions.

   Note that regardless of the keying mechanism used, an identity provider may have multiple keys that it employs for various reasons.
   Provided that there is a way to link an assertion to a particular key used by the identity provider, this requires no special support from the identity mechanism.

6.1  Asymmetric Keys

   Asymmetric keys are credentials that have been split into two components, a public and a private key. The holder of the credentials keeps the private key secret, and widely distributes the public key. If a document is signed with the private key, the signature over the document can be validated with the public key. This signature provides integrity over the document, and authenticates the signer.

   An identity assertion is a type of document that can be signed with a private key by an identity provider. In order to validate the signature, the verifier must hold the corresponding public key, and must have some reason to think that this public key is associated with the identity provider. In order for that signature to provide any guarantee of the identity of the originator, the verifier must also have some assurance that the identity provider is authoritative for the namespace of the originator of a message.

   Asymmetric keys may be generated by an identity provider, or acquired by the identity provider from a third party such as a certificate authority. Thus, there are two significant varieties of public keys - uncertified public keys, and public keys within certificates. The certification status of a public key has a tremendous impact on how it can be distributed and the manner in which it assures authority over a namespace.

6.1.1  Certificates

   A certificate [12] is a document that binds public keying material to a particular name, the 'subject' of the certificate.
   The certificate is signed by a certificate authority, and accordingly, parties that validate certificates must possess the public keys of certificate authorities (and unfortunately, the chain of certification between a particular certificate and the root certificate authority can include multiple middleman certificates). For the purposes of this document, self-signed certificates are simply considered uncertified public keys.

   Certificates support a wide variety of subject formats. Two are significant to the scope of this document. First, a certificate's subject can be a valid name in an Internet messaging system, such as an email address. Second, the certificate's subject can be a domain name. Depending on the nature of the subject, the certificate can sign user-based or domain-based assertions; this is discussed further in Section 7.

   Whether user-based or domain-based certificates are used, certificates have a common set of advantages and drawbacks. The primary advantage of certificates is that they provide a strong link between a public key and a subject. Accordingly, by looking at the subject of a certificate, it is relatively easy to decide whether or not that subject is authoritative for the namespace of a particular originator of a message (bearing in mind the caveats in Section 7.1). Because a certificate is a signed document, certificates can also be distributed over the network without requiring integrity over the transport; e.g., a certificate store for an identity provider could use an insecure transport like vanilla HTTP to distribute certificates.

   The downside is that certificates do not represent a permanent binding. Certificates have an expiration date, and consequently certificates must be periodically renewed, which is an operational hassle for identity providers. However, parties that rely on certificates cannot assume that a certificate is still valid simply because it has not expired.
   Certificates can also be revoked, usually as a consequence of the compromise of their corresponding private key. Relying parties are therefore required to monitor certificate revocation lists (CRLs) issued by certificate authorities. Because this entails cumbersome operational procedures, relying parties rarely adhere to this in practice.

   With all that in mind, it must be remembered that uncertified public keys do not represent a permanent binding either, and that there are no comparable intrinsic mechanisms for determining the expiry or compromise of an uncertified public key, even if a relying party were sufficiently troubled by these concerns to employ them.

6.1.2  Uncertified Public Keys

   Public key cryptography can also be used for identity assertions without certificates; for example, an identity provider may generate a public/private key pair itself. This requires a mechanism for distributing public keys in which the identity of the private key holder is implicitly or explicitly disclosed to potential verifiers, and verifiers understand unambiguously the namespace for which the identity provider is responsible.

   One way to associate an uncertified public key with a message originator is to transmit the public key in an initial unsigned message. The recipient, upon receipt of the public key, could store it in a local, application-specific keychain, indexed by the originator's return address (for user-based assertions) or the originating domain (for domain-based assertions) - the message would need to make clear precisely who the identity provider is. Future signed messages received from that originator (or domain) could be validated with the public key. This mechanism of key distribution will be referred to in this document as the "leap-of-faith" mechanism.
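
   The recipient-side keychain just described might look roughly like the following sketch. The class and method names are hypothetical, the key bytes are placeholders, and only the bookkeeping is shown; validating signatures with the stored key is omitted.

```python
import hashlib


class LeapOfFaithKeychain:
    """Remembers the first public key seen for each originator or domain."""

    def __init__(self) -> None:
        # Maps a return address or domain to the fingerprint of its key.
        self._fingerprints = {}

    def observe(self, index: str, public_key: bytes) -> str:
        """Record or check a key; returns 'stored', 'match' or 'mismatch'."""
        fingerprint = hashlib.sha256(public_key).hexdigest()
        known = self._fingerprints.get(index)
        if known is None:
            # First contact: the key is accepted without any proof.
            self._fingerprints[index] = fingerprint
            return "stored"
        return "match" if known == fingerprint else "mismatch"


keychain = LeapOfFaithKeychain()
first = keychain.observe("joe@example.com", b"placeholder-key-A")
later = keychain.observe("joe@example.com", b"placeholder-key-A")
changed = keychain.observe("joe@example.com", b"placeholder-key-B")
```

   The "stored" branch is where the trust is extended: whichever key arrives first for a given index is the one the keychain remembers.
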
It merits this particular name because the originator and recipient must have faith that no man-in-the-middle interfered with the initial message containing the public key. If an active attacker were present in the key exchange, they could inject their own public key and impersonate the originator to that recipient.

The leap-of-faith follows the example of SSH, which is widely regarded as a vast improvement over insecure telnet-style applications, and no doubt the leap-of-faith method of distributing public keys for identity providers would be an improvement over a lack of identity assertions altogether. Unfortunately, messaging architectures almost inevitably involve application-layer intermediaries that could inspect or modify leap-of-faith keys, and in this respect messaging is significantly distinct from the traditional client-server architecture of SSH.

The other challenges facing this approach rest largely in associating the key with a legitimate identity provider, and determining the namespace for which that identity provider is authoritative. Practically, there isn't really a way to do so; when a message arrives with an uncertified public key in it, that key is ultimately serviceable only as a validation of that particular (anonymous) identity provider. When future messages are received, the verifier can prove that these assertions were created by that same identity provider, but that verification offers no proof of the namespace for which that identity provider is authoritative. This problem is severe enough that leap-of-faith key distribution is probably only meaningful for anonymous user-based assertions. But again, even anonymous user-based assertions are better than nothing.

The DNS might also be leveraged to bind a public key to an identifying domain. DNSSEC [23], for example, provides public keys in a DNS resource record.
Those keys are known to be associated with a particular domain (thanks to the delegative structure of DNSSEC). Those keys, or some other keying material in the DNS which is signed via DNSSEC, could be used to provide a domain-based signature in the request for an identity assertion (see Section 7). Even a simple hash of the public key used by the identity provider, placed in the DNS, would enable the transmission of domain-based public keys in messages without any need for a leap-of-faith. Note that strictly speaking, the keying material (or a hash of it) does not need to appear in the DNS in order for the DNS to be leveraged to bind a public key to an identifying domain. If the identity provider were to run a key store service (like an HTTP server) that made its key available, then the identity provider could include a URI reference to that store with its assertion. Since the DNS would be used to dereference that URI, the security of that store is predicated on the security of the DNS. However, the operation of the store exposes the identity provider to further security risks (see Section 9.3), and since the DNS needs to be invoked in order to find the store, using keys or hashes in the DNS is ultimately more efficient from a messaging perspective. In the absence of operational DNSSEC, however, using the DNS to find uncertified keys is insecure. While the technical specifications of DNSSEC are largely complete, it will likely be some time before DNSSEC is fully operationalized. There are high-level changes that would need to sweep through the DNS in order to operationalize DNSSEC, whereas today individuals within the messaging community can opt to employ certificates, or not, on an incremental basis. 
That much said, it can be argued that the difficulty of subverting the DNS is sufficiently high that this practice would deter a large number of potential impersonators; verifiers can make their own policy decisions about the strength of the assertion based on whether or not the zone containing the keying material uses DNSSEC. Note, however, that this approach has no obvious way to support user-based assertions short of placing many records (for large domains, perhaps tens or hundreds of thousands) in the DNS corresponding to the keys of particular individuals; since the identity provider's assurance of the namespace derives from the DNS zone in which these key records are stored, the security of providing domain-based assertions is materially the same.

Given either approach, it is desirable for validators to be capable of caching uncertified public keys. For DNS-based schemes, the cache duration could presumably be dictated by the time-to-live of the DNS resource record containing the key or hash of the key. For the leap-of-faith approach, additional metadata associated with the public key would presumably dictate the length of time for which it is safe to cache the key. Some further considerations related to caching are discussed in Section 9.3.

One example of the leap-of-faith system in an Internet messaging protocol is given in RFC3261 [11] Section 23.2 (for the case where unsigned certificates are used).

6.2 Symmetric Keys

The use of symmetric keys for an identity assertion is severely limited because it requires that the identity provider and verifier pre-arrange a shared secret, which, for the typical assignment of these roles, runs contrary to the requirement that the domain of the originator and recipient of a message require no previous association. However, depending on the intended applicability of the assertion, this may not be an unreasonable constraint.
For a case like determining that a bounce resulted from a message that an originator actually sent, the identity provider and verifier of a message are the same endpoint (the originator). Since an endpoint can reasonably be expected to share a secret with itself, the use of symmetric keys is attractive for this use case.

The interdomain use of symmetric keys is further limited by the difficulty of key distribution. Asymmetric public keys can be distributed without fear that any passive attacker will be capable of leveraging the keys to impersonate the principal. If a symmetric key used for identity assertions is captured by an attacker, however, the attacker can impersonate the principal for the lifetime of the key. Symmetric keys essentially need to be negotiated, in interdomain cases, through some out-of-band mechanism.

7. User-based and Domain-based Assertions

To understand the distinction between user-based and domain-based assertions, it is simplest to assume that they are generated by certificates. Consequently, the discussion in the next few paragraphs describes only the use of certificates to provide these assertions; alternatives to certificates are described at the end of this section.

In the simplest assertion, the identity provider is directly authoritative for the name of the originator only. For example, the identity provider holds a certificate with a subject of 'joe@example.com', and provides an identity assertion with the private key corresponding to that certificate only for messages sent by 'joe@example.com'. We will refer to this sort of identity assertion as a user-based assertion. Usually, the identity provider is in this instance the endpoint of the originator, though of course it would also be possible (though probably not very scalable) for an intermediary to manage a keyring of such certificates for every user in their domain.
While this case is straightforward, there is no widely-supported public key infrastructure that issues user-based certificates to date. The only successful PKI on the Internet today provides domain-based certificates, primarily for securing web transactions. These certificates have a hostname subject of the form 'example.com' (or, more commonly, 'www.example.com'). While there are many reasons why domain-based certificates are more successful than user-based certificates, for our purposes the most important is enrollment: it is very easy for a certificate authority to determine who controls 'www.example.com' (since this is a matter of public record), but very difficult for a certificate authority to determine to whom 'example.com' has allocated the username 'joe'. The only deployable means of doing so today (email pings) are essentially leap-of-faith mechanisms.

Because domain-based certificates are widely available, and the root certificates of the major certificate authorities that issue these certificates are installed on almost all Internet-enabled platforms, the prospect of leveraging domain-based certificates for identity in messaging systems is very attractive. Compared to user-based certificates, domain-based certificates are also attractive because there need to be fewer of them in the overall messaging system, since there are generally many users to a given domain. This is advantageous both for identity providers, especially from a cost perspective, and for verifiers, who will need to persist many certificates from remote domains.

When domain-based assertions are employed, the certificate itself does not provide the identity of the originator, but it does prove that the identity provider is authoritative over a particular segment of the namespace.
Accordingly, the identity provider's signature must cover some field of the request that contains the identity that the signer is asserting. In order for the assertion to have any strength, that identity must be within the segment of the namespace for which the signer is authoritative; i.e., if the certificate of the signer proves authority over 'example.com', then the signature would be valid if the identity of the originator were 'joe@example.com', but not if the signature were over 'alice@example.org'. This gives rise to quite a few subtleties which are discussed in Section 7.1. In order to acquire a domain-based identity assertion for a request, originators would typically need to forward their message to an intermediary that instantiates the identity provider role (unless the originator holds a certificate authoritative for its own domain). This in and of itself can be viewed as a drawback, since in many messaging architectures originators are not required to send messages through any specific local intermediary. Moreover, messaging protocols are used in some environments that constrain the first-hop local intermediary to which an originator sends a request (e.g., blocking outbound SMTP with an enterprise firewall). In those environments, originators would be unable to acquire an identity assertion from an intermediary that was unsanctioned by the operator of the environment. Note that the considerations applying to domain-based certificates also apply to most DNS-based mechanisms for public key distribution - the identity assertions generated by keys distributed on a per-domain basis through the DNS are domain-based assertions. The distinction lies in the strength of the assurance - uncertified public keys distributed through the DNS without DNSSEC are inherently less secure than certificates, and thus can be said to provide a weaker domain-based assurance. 
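The namespace check described in this section, in which a signer authoritative for 'example.com' may assert 'joe@example.com' but not 'alice@example.org', can be sketched as follows. This is a minimal illustration only: it assumes the strictest conceivable policy (the signer's domain must equal the domain part of the originator's address exactly), and the function names `addr_domain` and `signer_covers` are hypothetical, not taken from any specification.

```python
def addr_domain(addr_spec: str) -> str:
    """Return the lowercased domain part of an addr-spec such as 'joe@example.com'."""
    local, sep, domain = addr_spec.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an addr-spec: {addr_spec!r}")
    # Domain names compare case-insensitively; a trailing dot is ignored here.
    return domain.lower().rstrip(".")


def signer_covers(signer_domain: str, originator: str) -> bool:
    """Assumed strict policy: signer domain must equal the originator's domain."""
    return signer_domain.lower().rstrip(".") == addr_domain(originator)


if __name__ == "__main__":
    print(signer_covers("example.com", "joe@example.com"))
    print(signer_covers("example.com", "alice@example.org"))
    print(signer_covers("example.com", "joe@mail.example.com"))
```

Under this assumed policy a subdomain such as 'mail.example.com' is rejected as well; whether that is too rigid is a policy question that the sketch does not attempt to settle.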
7.1 Name Subordination

Identity assertions become harder to verify when the subject of the signer's certificate does not correspond exactly with the originator's name. There needs to be a deterministic way of deciding if an identity provider is authoritative over the namespace containing an originator's name. For example, how should a verifier treat an identity assertion generated by an identity provider with a certificate for 'joe@example.com' when the originator of the associated message is given as 'joe@mail.example.com'?

The problem is more pronounced with domain-based assertions. How should a verifier treat an identity assertion generated by 'alice.example.com' for a message whose originator is 'joe@example.com'? What if the domain were 'joe.example.com', or 'mail.example.com', or 'sip.example.com'? We are forced to pose this authorization question because the verifier has no way to know how the identifying domain 'example.com' has allocated its namespace - which is why these problems are problems of name subordination.

While authorization policy is outside the scope of this document, there are potentially ways to design a messaging identity system such that these concerns never arise. The most obvious way is to be very strict about generating assertions - to mandate, for example, that identity providers cannot provide domain-based assertions for messages unless their domain (the subject of their certificate, or the zone containing their key in the DNS) corresponds exactly to the host portion of the originator's return address. But this may be too rigid to support some use cases.

Another possible solution is to leverage the DNS in some new way to designate the identity provider for a domain.
Just as one resource record type designates the mail exchanger to which mail should be sent, some other DNS resource record might designate the identity provider for email messages in a domain (e.g., for 'example.com', the identity provider for mail messages resides at 'mail-ident.example.com'). Predictably, this solution is limited by the lack of an operational DNSSEC infrastructure in the DNS. Without DNSSEC, it is possible that an attacker could spoof DNS responses to suggest that an inappropriate host is the signer for the domain; essentially, this grants the attacker the ability to impersonate any user in the domain.

8. Reference Indicators and Replay Protection

If any attacker can cut an identity assertion from a legitimate message, paste it into an arbitrary message of their own, and thereby fool a verifier into believing that the forged message came from the originator of the legitimate message, then the value of the identity assertion is essentially nil, given that it exists primarily to prevent impersonation. If an identity assertion provided only a signature over the name of the originator, assertions would be trivially exploitable in precisely this fashion.

Accordingly, an assertion must cover more than just the originator's name. It must cover enough additional information that the assertion cannot be replayed in a substantially different message.

Ideally, the identity assertion would provide a signature over the entire message, envelope and contents alike. If this were the case, then an attacker could only replay the identity assertion in an identical message - which would be a duplication rather than an impersonation. But practically, this wouldn't work for any existing messaging system. In fact, in most messaging systems, intermediaries need to modify the envelope in order to perform their duties.
While strictly speaking, intermediary modification of the content is not a baseline requirement for a messaging system, some intermediaries do so in order to enforce any of a number of domain-specific policies. Consequently, were a signature over an entire message included in identity assertions, such signatures would be likely to fail to validate at verifiers.

Thus, the signature must be restricted to some subset of the message. The selection of the exact subset is a very difficult problem. For the purposes of this document, the elements of a message that need to be signed in order to bind an identity assertion to a particular message will be termed the 'reference indicators'. The manner in which a subset is identified or carried in the message also admits of more than one plausible design choice.

8.1 Canonicalization versus Replication

There are two basic approaches to generating a signature over a subset of a message - canonicalization and replication.

Canonicalization entails the generation of a canonical string from the reference indicators. The signature is generated over that string (or a hash of that string), even though the string as such does not appear in the message. The canonicalization system must specify the reference indicators that are going to be signed. The reference indicators can be specified statically, as a component of the specification of the mechanism, or dynamically, on a per-message basis. In the former case, every identity assertion for every message in the system will generate a canonical string containing exactly the same reference indicators. In the latter case, each assertion will denote in some manner which reference indicators have been incorporated within the canonical string. When a verifier receives the message, it extracts those same reference indicators from the message, generates the same canonical string, hashes it where applicable, and then determines whether or not the signature in the identity assertion is valid for that canonical string.
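The canonicalization procedure just described can be sketched for the static case as follows. Everything specific here is an assumption made for illustration: the particular set of reference indicators, the whitespace-collapsing rule, the field separator, and the use of SHA-256 are not prescribed by this document, and header names are assumed to have been lowercased already, with From and To reduced to their addr-spec. The signing step itself is omitted; the sketch only produces the digest that an identity provider would sign and that a verifier would recompute.

```python
import hashlib
from typing import Mapping

# Hypothetical static set of reference indicators (an assumption of this sketch).
REFERENCE_INDICATORS = ("from", "to", "date", "message-id")


def canonical_string(headers: Mapping[str, str]) -> str:
    """Build a canonical string from the fixed reference indicators only."""
    parts = []
    for name in REFERENCE_INDICATORS:
        # Collapse every run of whitespace (including folded lines) to one
        # space, so that re-wrapping by an intermediary changes nothing.
        value = " ".join(headers[name].split())
        parts.append(name + ":" + value)
    # Collapsed values contain no newlines, so "\n" is an unambiguous separator.
    return "\n".join(parts)


def canonical_digest(headers: Mapping[str, str]) -> str:
    """Hash of the canonical string; this is what would be signed and verified."""
    return hashlib.sha256(canonical_string(headers).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    signed = {
        "from": "joe@example.com",
        "to": "alice@example.org",
        "date": "Mon, 18 Oct 2004 10:00:00 -0400",
        "message-id": "<1234@example.com>",
        "subject": "hello",
    }
    # An intermediary re-folded the Date header and rewrote the Subject.
    received = dict(signed, date="Mon, 18 Oct 2004\r\n 10:00:00  -0400",
                    subject="[list] hello")
    print(canonical_digest(signed) == canonical_digest(received))
```

Because Subject is not among the assumed indicators, its modification goes undetected in this sketch, which illustrates how much rests on the choice of reference indicators.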
The most practicable canonicalization procedures incorporate only the most specific reference indicators from a message. For example, inclusion of the entire RFC2822.From header field value (including the header field name, the colon, whitespace, etc) is much more problematic than the inclusion of only the addr-spec component of the From header field value. The less specific the reference indicators are, the harder it is for them to be canonicalized, and the more likely it is that intermediaries (through munging white-space, changing line-wrap, and so on) may inadvertently change the canonical string that will be generated by the verifier, or that the verifier will miss a stray bit of whitespace, and so on.

Both dynamic and static reference indicators for canonicalization have their drawbacks. Static reference indicators can be too limiting; it is difficult to anticipate the reference integrity needs of every imaginable message. Dynamic reference indicators, however, are extremely complicated. The syntactical system required to describe reference indicators is potentially an exercise in arbitrary string manipulation, especially when attempting to denote reference indicators with a high degree of specificity. Dynamic reference indicators also leave much more room for error in the generation of the canonical string, and accordingly, more room for discrepancy in the manner that the verifier generates the canonical string. It is difficult to strike a balance, and once you allow any reference indicators to be decided on a per-message basis, the slope becomes very slippery.

Replication attempts to avoid the difficulties of canonicalization by providing a copy of the reference indicators that is carried within the message itself. The simplest form of replication is the reproduction of the entire message, which is then tunneled within the message itself.
The identity assertion is a signature over the replication. Of course, if the entire message is carried within itself, this doubles the size of the message (not even counting the signature), and so presenting a subset of the message is again desirable. However, unlike canonicalization, replication does not require any pre-agreement or denotation of the reference indicators. The reference indicators that appear in the replicated message are visible to the verifier, and the verifier validates the signature over the replication, not over elements in the original message which need to be assembled into a canonical string. If the signature over the replication is valid, the verifier then compares the values of the reference indicators in the replication to the corresponding elements of the message. If these two correspond, then the identity assertion is valid for this message.

Another significant distinction between canonicalization and replication is that a verifier inspecting a replication-based assertion can determine which reference indicators do not correspond to the received message; a verifier validating a canonicalization-based assertion can only tell whether or not the reference indicators as a whole exactly match the current message. As a consequence, a verifier of a replication-based assertion can be lenient towards minor discrepancies between the message signed by the identity provider and the received message. If the verifier were implemented in an endpoint, the endpoint might even render an account of the discrepancies to a user, who might be able to make an informed decision about the severity of the differences. Another alternative is that verifiers might 'clobber' the contents of the outer envelope with the replicated envelope, treating only the replicated headers as authoritative and ignoring any discrepancies.
Clobbering, however, only works when the reference indicators are carefully chosen; otherwise, it may disguise the actions of an impersonator who has cut-and-pasted the replicated assertion into a message of their choosing. Replication furthermore introduces the interesting possibility that envelope elements intended for end-to-end consumption (that do not need to be inspected by intermediaries, like the Subject header of email) might be included in the replicated body, but not in the headers. The originator might intentionally provide only the minimum amount of information necessary in the envelope of the message, but arrange for the identity provider to place detailed end-to-end information in the assertion. Were the assertion then to be encrypted, the identity provider could securely tunnel end-to-end elements to the recipient. This is most meaningful when the originator acts as the identity provider. Note, however, the limitations of encryption in Section 10. A very basic replication scheme deals poorly with the content of the message. Many messaging protocols allow large content (multi-megabyte email attachments leap to mind). Replicating this content is extremely undesirable. In contrast, canonicalization-based assertions do not become larger as the signed content grows. Since canonicalization can use a hash of the canonical string, even if the canonical string is built from the message body, the size of the hash is unchanged. Traditional approaches to message content security (including PGP [18] and S/MIME [17]) sign a hash over the message content. Accordingly, some hybrid approaches replicate reference indicators in the envelope but canonicalize content, which can yield the best of both worlds. One example of a canonicalization-based identity assertion for SIP is given in sip-identity [25]. 
The core SIP standard, RFC3261 [11] Section 23.4.2, provides a replication model that entails tunneling the entire message; RFC3893 [14] provides a replication model for SIP which is restricted to reference indicators alone.

8.2 Assertion Constraints and Scope

The choice of reference indicators dictates the constraints and scope of the assertion. For example, if the reference indicators include something like the email Date header field, then it is possible, after verification, to apply authorization policies related to the time the Date header was created.

Whether canonicalization or replication is used, the selection of a set of reference indicators must be informed by the nature of the messaging protocols. Which elements of the envelope and content are necessarily immutable from the identity provider to the verifier (however those roles are assigned)? Which are always mutable? Which elements are conceivably reference indicators? The following may not be a necessary or sufficient list for any given messaging protocol, but it does exemplify the sort of analysis that needs to be performed in determining whether or not an element should be used as a reference indicator.

Beginning with the envelope, at a minimum, the name of the originator of the message has to be preserved (in email, the addr-spec component of the RFC2822.From header field). It is also highly desirable to include some indicator that denotes the intended recipient(s) of the message. Without such an indicator, a message containing a valid identity assertion could be replayed to a different recipient by a passive attacker who captured the message, and the verifier would be unable to determine that the originator did not intend to send this message to the designated recipient. Barring the presence of other reference indicators, even the intended recipient of the request could act as such a passive attacker.
Constraining the assertion with some sort of unique identifier for the message is also very desirable. Most messaging protocols provide a unique message identifier in order to enable the recipient to detect duplicates, or to enable correspondents to refer to previous messages unambiguously. While the presence of a unique identifier in the constraints does not prevent passive attackers from replaying assertions to new verifiers, it does change the situation when impersonators attempt to replay assertions to the same verifier (which complements selecting the intended destination as a reference indicator). It enables verifiers to remember unique identifiers for some period of time. By doing so, verifiers can discover that they previously verified a message with this unique identifier.

This does, however, have some important limitations. The first is scalability. Intermediaries that act as verifiers can potentially process staggering numbers of messages, and recording every passing unique identifier in such intermediaries is probably infeasible. However, this does not mean that the presence of a unique identifier would not be useful for recipients that act as verifiers (who might persist messages, including the unique identifier, for various reasons anyway). The second limitation is a race condition. If the attacker's message is delivered to the verifier before the legitimate message, a verifier might mistakenly believe that the attacker's message is the valid one; while active attackers are the most likely to successfully mount this sort of attack, in store-and-forward architectures it is possible that a passive attacker might do so (though not in a deterministic fashion, probably).
A third limitation is that in some architectures, a particularly intrusive intermediary might alter the unique identifier of a message in the process of forwarding the message; in these environments, the unique identifier has little value.

Providing a time-based constraint can complement the use of unique identifiers and other local policies at the verifier. Virtually all messaging protocols provide an indicator in the envelope that states when the message was created (such as the Date header in email). This can aid verifier policies that help to manage replay protection. For example, verifiers could be configured with an interval of time derived from some assessment of how long a message can plausibly be supposed to have remained in transit in the message system. If they receive a message, and the time that has elapsed since the creation of the message exceeds that interval, they could consider the assertion invalid, or at least suspect. Obviously, this interval would be very different depending on whether the messaging architecture is based on a store-and-forward methodology or a real-time delivery methodology. In concert with unique identifiers, this interval of time could be used to determine how long a verifier needs to remember unique identifiers recorded from valid past messages. Some ways in which a passive attacker might collect assertions for replay (from web pages of email archives, for example) could involve the retrieval of very dated assertions that would be flagged by this sort of policy.

Some elements of a message are half-way between envelope and content, such as the typical Subject header field of email and SIP. Since it is common practice for endpoints to render this element to the user, and the element can significantly change how recipients understand the message, it should serve as a reference indicator.
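The interplay between unique identifiers and a time-based constraint, as described above, can be sketched as a small verifier-side policy. This is a simplified illustration under stated assumptions: the class name `ReplayGuard`, the three result strings, and the choice to measure staleness from the signed creation time are all hypothetical, timestamps are plain seconds supplied by the caller, and the assertion's signature is assumed to have been validated already.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ReplayGuard:
    """Remembers message identifiers only for the assumed maximum transit time."""
    max_transit_seconds: float
    seen: Dict[str, float] = field(default_factory=dict)  # message id -> creation time

    def check(self, message_id: str, created: float, now: float) -> str:
        # Forget identifiers whose messages would now be rejected as stale
        # anyway; this bounds how long identifiers need to be remembered.
        self.seen = {m: t for m, t in self.seen.items()
                     if now - t <= self.max_transit_seconds}
        if now - created > self.max_transit_seconds:
            return "stale"        # in transit implausibly long: suspect
        if message_id in self.seen:
            return "duplicate"    # assertion already verified once
        self.seen[message_id] = created
        return "ok"


if __name__ == "__main__":
    guard = ReplayGuard(max_transit_seconds=3600)
    print(guard.check("<1234@example.com>", created=1000, now=1100))
    print(guard.check("<1234@example.com>", created=1000, now=1200))
    print(guard.check("<5678@example.com>", created=0, now=5000))
```

A real-time architecture would presumably use a far shorter interval than a store-and-forward one, and an intermediary handling very large message volumes might not be able to afford the identifier table at all.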
While any single one of the envelope-based reference indicators described above would be insufficient to provide a strong assurance of identity, in concert, they can meet the majority of the plausible threats, and require such a high degree of sophistication from the attacker that most impersonation would be eliminated. However, it may not be possible for a verifier instantiated by an intermediary to make full use of all of these indicators (the message's unique identifier, for example). Moreover, it may not be possible for an originator to act as an identity provider for all of these reference indicators if certain elements (like the message's unique identifier) are generated by an intermediary.

The content of the message is also apparently a critical reference indicator. Without a signature over the content, a passive attacker who captures the message could preserve the envelope of a message but send a completely different content, which allows the attacker to impersonate the asserted originator and provide the content of their choosing. Since in many messaging architectures, intermediaries can legitimately alter the contents of messages (most commonly, either by adding to the content or modifying the existing content in some fashion), defending against replay of a message with a modified content by a passive attacker is essentially just as difficult as defending against message modifications made by an active attacker. In environments where intermediaries do modify message content for legitimate or at least quasi-legitimate reasons, the issue of protecting the content is academic. A signature over the content will be violated if the content is changed.
There are relatively few approaches to preventing intermediaries from violating these signatures; a few examples include:

o  If the messages use MIME [8], it is possible to apply MIME-layer security to particular bodies in the content. If trivial additions are made by an intermediary (such as appending a few lines of text to the message), then they will fall out of the scope of the MIME body or bodies. If one is especially lucky, the intermediary might even be MIME-aware, and capable of understanding how to interact with the complex multipart bodies that MIME-layer security frequently requires.

o  If rather than merely adding to the content, the intermediary seeks to modify existing message content (filtering for content that appears inappropriate, perhaps), then the only recourse is to encrypt the content in its entirety. If it is unable to understand the content, an intermediary will not be able to make these sorts of alterations.

Neither of these solutions is applicable in all cases. However, given the use of envelope-based reference indicators described above in an identity assertion, most impersonations that replayed an assertion but changed the content would be perceived as duplicates (based on the unique identifier), outdated, or potentially in violation of a Subject header constraint; moreover, it could only impersonate the originator to a specific recipient or specific set of recipients. It could be argued, therefore, that it is not necessary to use the content as a reference indicator. But in messaging systems and environments where it is safe to do so, the value of including the content as a reference indicator is clear.

An account of the mutable and immutable elements in a SIP message is given in RFC3261. The most complete analysis of reference indicators in SIP is given in the Security Considerations of sip-identity [25].
Given the sheer number of possible headers used by email (see [22]), a complete analysis of mutable and immutable elements is probably a fool's errand.

The surfeit of possible reference indicators may tempt designers to punt on deciding, at a protocol level, which ones are appropriate, and simply to allow identity providers to make this decision based on domain policy or even on a per-message basis. This flexibility incurs two disadvantages. In the first place, if the assertion is based on canonicalization, the assertion must be accompanied by some sort of description of the reference indicators that have been used to generate the assertion. Determining how to describe reference indicators precisely is a significant challenge. In the second place, it leaves a great deal of ambiguity in intermediary behavior. How can an identity provider anticipate which elements an intermediary might want to modify? If the standard is firm about this matter, and all identity providers rely on the same reference indicators, then operators of intermediaries will have an incentive to phase out any practices that modify those elements. If, on the other hand, each identity provider does things a little differently, there could be significant operational turmoil that could potentially lead to a rollback from the identity mechanism.

In the final analysis, for any given messaging system, there is probably a finite set of identifiable elements that should be used as reference indicators. At worst, there should be a set of fixed reference indicators that can be supplemented with optional, dynamic reference indicators as needed.

Note that other constraints and reference indicators that might be added by third parties to the identity process are described in Appendix D, and should not be considered a part of the identity assertion created by the identity provider to identify the originator of a message.

9. Placement of Assertions and Keys in Messages

Most Internet messaging systems employ messages that are divided into two major parts: envelope and contents. The envelope of a message is typically made up of headers, like the traditional email RFC2822.From header field, though some messaging systems use alternative schemes (like XML for XMPP [13]). The contents consist of one or more message bodies, typically, but not always, MIME bodies.

The division between envelope and contents is imprecise in most messaging systems. As a general rule, the envelope is used by endpoints and intermediaries in the addressing and routing of a message, whereas the content is generated by the originator's endpoint, consumed by the recipient's endpoint, and rendered to the recipient's application in some fashion. However, many elements in the envelope are also rendered to the recipient (a classic example being the Subject header field of email), and intermediaries have numerous reasons to inspect or modify the contents of messages.

Given that an identity assertion needs to appear somewhere in a message, there are two plausible alternatives:

o it could appear in the message content

o it could appear in the message envelope, as a value or parameter of a new or existing element

The attractiveness of one or another of these options is greatly dependent on the nature of the assertion, and particularly on the size and encoding of the assertion. Canonicalization will result in a smaller assertion than replication. To speak in particulars briefly, for a base64-encoded assertion based on an RSA signature (1024 bit key) of a SHA1 hash of the canonical string, the resulting assertion is 175 bytes long - varying the key length will make the assertion proportionally larger or smaller, obviously.
It is difficult to gauge the likely size of an assertion based on replication, since it is highly dependent on the number of reference indicators included, but it would be significantly larger. An example in RFC3893 of an S/MIME-based replication assertion for SIP (containing six headers) is 913 bytes long, counting the multipart/MIME wrapper and the signature. A base64-encoded version of that assertion is 1240 bytes long.

9.1 Assertions in the Envelope

An envelope is generally composed of a set of elements that describe the originator and intended recipient of a message, the subject of the message, the time the message was created, some unique identifiers for the message, and so on. It is a common practice for intermediaries to inspect an envelope's elements in order to make forwarding decisions, and to add additional elements to the envelope to reflect various circumstances surrounding the delivery of the message.

Provided that an assertion is short and syntactically manageable, there's no reason why it couldn't appear in some new or existing envelope element. Some messaging systems have a practical (if not theoretical) limit on the size of envelope elements; in others this is no cause for concern.

The syntax of the assertion is a more complicated issue. If identity assertions based on replication are used, and are intended to be stored in the envelope, it may be syntactically confusing to store a set of envelope elements within a single envelope element. In the worst case, this confusion could be alleviated by encoding the entire assertion in some fashion (such as base64), but this would result in quite a large string. Even in cases where element length is limited, it is possible that a very large string encoded in this fashion could be split across multiple envelope elements, and internally ordered in some way, to meet practical size limits.
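One way such splitting and internal ordering might work is sketched below. The approach (an explicit sequence index attached to each chunk) and the 200-character limit are assumptions chosen for illustration; no messaging system discussed here is known to define envelope elements of this kind.

```python
import base64
from typing import List, Tuple

def split_assertion(assertion: bytes, max_len: int) -> List[Tuple[int, str]]:
    """Base64-encode an assertion and cut it into (index, chunk) pairs."""
    if max_len < 1:
        raise ValueError("max_len must be positive")
    encoded = base64.b64encode(assertion).decode("ascii")
    return [(n, encoded[i:i + max_len])
            for n, i in enumerate(range(0, len(encoded), max_len))]

def join_assertion(parts: List[Tuple[int, str]]) -> bytes:
    """Reassemble chunks by index, regardless of the order they arrived in."""
    ordered = sorted(parts, key=lambda part: part[0])
    return base64.b64decode("".join(chunk for _, chunk in ordered))

# Stand-in data; a real assertion would be a signed set of replicated elements.
assertion = bytes(range(256)) * 4
parts = split_assertion(assertion, 200)

# Intermediaries may reorder envelope elements, so reassembly must not
# depend on arrival order.
restored = join_assertion(list(reversed(parts)))
```

Each pair could then be carried in its own envelope element. The explicit index matters because nothing guarantees that intermediaries preserve the relative order of envelope elements.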
Intermediaries generally have an easier time reading and writing parts of the envelope than the content, and accordingly, if one intends for intermediaries to instantiate the identity provider or verifier roles, then placing assertions in the envelope has the distinct advantage of requiring fewer changes to intermediary behavior.

Also, some messaging architectures might not guarantee the survival of particular portions of the message as they traverse intermediaries. For example, if intermediaries customarily rewrite or delete particular envelope elements, it would be a poor design decision to store an identity assertion as a value in those elements.

9.2 Assertions in the Content

Given an assertion of large size or cumbersome syntax, storing an assertion in the envelope might be undesirable. Appending the assertion to the contents of the message (perhaps using a multipart MIME body) might therefore seem superior.

However, for some messaging systems, placing identity assertions in the content may limit the set of entities in the messaging architecture that can instantiate the identity provider role. It might be illegal for an intermediary, for example, to modify content. This is the case with SIP, where an intermediary cannot delete or modify the contents of a SIP message.

Placing assertions in the content can also limit the set of entities that can instantiate the verifier role. Email intermediaries are not required to be capable of understanding or parsing the contents of email messages (especially MIME bodies), and accordingly, they cannot be expected to act as verifiers of an identity assertion that appears as part of the message content without requiring significant changes to their functionality.

Furthermore, an identity system should be compatible with end-to-end security of message contents.
If the identity system requires that an intermediary add a body to a message, and the endpoints are using some end-to-end integrity mechanism like S/MIME or PGP, appending the assertion to the content may violate that end-to-end integrity. If MIME is supported by these intermediaries, however, this problem becomes less pressing, as intermediaries might add the assertion as a complete MIME body by transposing the existing content into a new multipart.

Placing assertions in the content is further complicated by the manner in which the content of a message is rendered to the recipient. Ultimately, an identity assertion is not a component of the content that should be blindly rendered to the user. It is more appropriate for a recipient's endpoint to consume the assertion as an input to an authorization decision, which may in turn change the manner in which the message is rendered to the user. The assertion itself, a collection of cryptographic bits, is not something that should be intermingled with the content rendered to the recipient. Endpoints that do not support the identity assertion scheme, however, are likely to do just that, and accordingly, placing the assertion in the content leads to serious backwards-compatibility concerns.

A messaging system based entirely on the use of MIME content, however, can overcome these difficulties. Various Content-Dispositions (see [10]) can inform the recipient's endpoint that it should not render the content of a body to a user. Moreover, a Content-Disposition can flag the body as specifically containing an identity assertion. One such Content-Disposition for identity assertions ("aib") is defined in RFC3893. In messaging systems where multipart MIME support is not guaranteed in endpoints, however, this would lead to backwards compatibility issues.
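A rough sketch of the transposition described above, using the MIME support in the Python standard library, is given here. It moves the existing content into a new multipart and adds the assertion as a separate body flagged with the "aib" disposition. The application/octet-stream media type and the "handling" parameter are placeholders assumed for illustration, and the assertion bytes are dummy data.

```python
from email import message_from_string
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def wrap_with_assertion(original_text, assertion):
    """Transpose existing text content into a multipart and attach an assertion."""
    outer = MIMEMultipart("mixed")
    outer.attach(MIMEText(original_text, "plain"))
    # Placeholder media type; an actual scheme would define its own.
    part = MIMEApplication(assertion, "octet-stream")
    # The disposition tells supporting endpoints not to render this body.
    part.add_header("Content-Disposition", "aib", handling="optional")
    outer.attach(part)
    return outer

def extract_assertion(message):
    """Return the bytes of the first body flagged as an assertion, if any."""
    for part in message.walk():
        if part.get_content_disposition() == "aib":
            return part.get_payload(decode=True)
    return None

wrapped = wrap_with_assertion("Lunch at noon?", b"\x01\x02signature-bytes")
parsed = message_from_string(wrapped.as_string())
recovered = extract_assertion(parsed)
plain = extract_assertion(message_from_string("Subject: hi\n\nno assertion here\n"))
```

An endpoint that understands the disposition can consume the assertion without displaying it. An endpoint without multipart support would see the raw encoded body, which is the backwards-compatibility issue noted above.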
9.3 Distributing Keys by-Reference or by-Value

The various keying schemes described in Section 6 entail two high-level models by which keys can be incorporated into requests: inclusion by-reference and inclusion by-value. In the by-reference case, some resource in the network would hold the key, and the message would either explicitly (with something like a URI) or implicitly (through some understanding built into the identity architecture and messaging protocols) indicate where the key for a particular identity provider can be acquired. In the by-value case, the key would accompany the identity assertion in the message.

When an originator acts as an identity provider, they may not be capable of operating or contracting with a network service such as a key store. In those cases, they have no alternative but to include keys by-value. Intermediary-based identity providers would generally have no trouble offering keys by-reference.

Including keys by-value is attractive if the keys are self-validating (as is the case with public keys bound to certificates). If keys are not self-validating, then clearly an impersonator could trivially include a key of their own choosing with the request - this is an instance of the leap-of-faith model described in Section 6.1.2. Including certificates by-value, however, can be troublesome given the comparatively greater size of certificates (though in many messaging architectures, certificates can be incorporated into the content of the message without dire ramifications). For the purposes of comparison, an example self-signed certificate constitutes about 1100 bytes of data when base64 encoded; the public key it contains is about 270 bytes of base64 data (for a 1024 bit RSA key).
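The difference between the two models, as seen by a verifier, can be sketched as follows. The KeyField structure, the fetch callback, and the example URI are hypothetical names introduced for illustration. The sketch deliberately omits validation of a by-value key, which, as noted above, is only safe when the key is self-validating.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass
class KeyField:
    """Hypothetical key information accompanying an identity assertion."""
    reference: Optional[str] = None  # where the key can be acquired (by-reference)
    value: Optional[bytes] = None    # the key or certificate itself (by-value)

def resolve_key(key_field: KeyField,
                fetch: Callable[[str], bytes],
                known_keys: Dict[str, bytes]) -> bytes:
    """Return key material, contacting the key store only when necessary."""
    if key_field.value is not None:
        # By-value: no network access is needed. The caller must still check
        # that the key is self-validating (for example, bound to a certificate).
        return key_field.value
    if key_field.reference is None:
        raise ValueError("message carries neither a key nor a reference to one")
    if key_field.reference not in known_keys:
        # By-reference: requires network access unless the key is already held.
        known_keys[key_field.reference] = fetch(key_field.reference)
    return known_keys[key_field.reference]

# A stand-in for a key store that records each request it receives.
requests_seen = []
def fake_fetch(uri):
    requests_seen.append(uri)
    return b"key-from-store"

known = {}
by_value = resolve_key(KeyField(value=b"key-in-message"), fake_fetch, known)
first = resolve_key(KeyField(reference="https://keys.example.com/idp"), fake_fetch, known)
second = resolve_key(KeyField(reference="https://keys.example.com/idp"), fake_fetch, known)
```

In this sketch only the first by-reference lookup reaches the key store; the by-value message and the repeated reference are resolved without any request being made.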
The high-level problem with including keys by-reference is that the verifier must have network access (if they do not already possess the key) in order to validate the signature within an assertion; this is not a requirement for verifying assertions whose keys are carried by-value, and it is important for recipient-based verifiers in store-and-forward messaging architectures.

There are also potentially non-obvious consequences of including keys by-reference. Consider, for example, that if the message is not rendered to a recipient instantiating the verifier role for a protracted period of time (weeks or months), it is possible that the key used by the identity provider will expire or change during that time; one interesting property of carrying certificates by-value is that a verifier can determine, on the basis of an expired certificate shipped with a message, if the assertion was valid at the time it was created, provided that the assertion is constrained by its creation time (though there would be good reasons to be cautious about this practice).

Also, as another non-obvious consequence of by-reference key distribution, note that the key store used by the identity provider will be notified each time that a verifier acquires a key. This can actually have important privacy implications, because in some cases, this could reveal most or all of the recipients of the request to the identity provider. The implications of this are further discussed in Section 10.

It is important to recall that messages may be reflected to multiple recipients - potentially, many thousands in some environments. While it may seem to save message bandwidth to include keys by-reference, thousands of requests to the key store may result in profoundly greater network traffic. Note, however, that the impact for domain-based keys is probably less than the impact for user-based keys, since domain-based keys need to be acquired on a per-domain basis, and a domain generally encompasses many users.
There's no question that the impact might be significant in either case.

Comparing the total bandwidth consumed by the two approaches, it is also important to note that verifiers can cache credentials. So, if the verifier already has the key, and the key is still valid, a reference to the key in a message will not necessitate the key's retrieval. When keys are sent by-value, however, the originator has no way to know whether or not a potential recipient already possesses the key; this problem is compounded by the general difficulty of anticipating who might conceivably receive a message. The only safe policy, when keys are distributed by-value, is to send the key every time. When compared with the potential for thousands of recipients to retrieve the key from a key store, however, this is still comparatively a minor inconvenience.

It is furthermore the case that any network service which distributes keys to verifiers will add new threats to the overall identity architecture. The security properties of the protocols used to implement the service become critical to the strength of the assertion. Moreover, those services could be subject to denial-of-service attacks intended to prevent verification of messages with identity assertions.

Any network service that can provide keys by-reference to verifiers might also provide keys to originators; originators, in contrast to verifiers, would only need to access this local service very infrequently, and at worst, only one originator would need to access this service per message, which compares very favorably to the unbounded set of verifiers to which a message might be distributed. In fact, the manner in which originators authenticate themselves to identity providers (which is outside the scope of this document) may innately entail a key exchange - the originator may learn the keys of their local domain as a matter of course.
Provided these keys are bound in certificates, this could potentially serve as an attractive manner for originators to learn their identity provider's keys in order to include them in messages by-value. This may be important in architectures where it is desirable to add the key to the content of a message, but intermediaries lack the capability or permissions to make useful additions to the content.

Too hard to choose? Ultimately, there is an easy way to be flexible about the incorporation of keys into a message. If there is a field in the message that provides a URI where the key can be acquired, this field can be used to include a key either by-reference or by-value. It can be used by-reference to indicate, for example, an HTTP or HTTPS URI where the key can be acquired; alternatively, it could use some form of DNS URL (such as [24]) to denote a particular DNS resource record where the key is located. If the message uses MIME bodies as content, it could use the CID URI scheme [9] to designate a particular MIME body that contains the key. The only option considered in this document for which a URI does not provide a solution is carrying the key by-value in the envelope, but of course, it wouldn't make much sense to have, for example, one header in a SIP request contain a URI reference to another header in the same message - a special-purpose header should be used to carry keys by-value in the envelope.

9.4 Distributing Assertions by-Reference

It is also possible to distribute assertions by-reference; that is, to force the verifier to contact a service operated by the identity provider in order to acquire an assertion that would be used to verify the message. This is identical, from a security perspective, to a dial-back identity scheme; see Appendix E.2.

10. Privacy and Anonymity

Anonymity plays an important part in communication systems.
The existence of an identity system should not preclude anonymous message originators. However, it is possible to strike a balance in which anonymized messages still contain identity assertions, and those identity assertions are potentially still valuable.

Considering the classic case of an originator wishing to be anonymous to recipients, there are numerous ways in which this could be realized in the context of an identity system. If the domain of the originator permits anonymous messaging, the originator could populate their return address in the message with, say, 'anonymous@example.com', and send the message through the identity provider of 'example.com'. This sort of anonymity is meaningful for domains with a great many users, and less useful as the number of users in the domain grows smaller.

Alternatively, a message anonymization service unrelated to the originator's usual domain could act as an identity provider for a message. Receiving a message signed by 'anonymous@anonymizer.example.org' is still, in all likelihood, preferable from an authorization perspective to receiving a message without any identity assertion whatsoever. An assertion provides a pointer of accountability to the originating domain in cases of abuse.

Another important form of privacy relates to preventing intermediaries responsible for message transfer from reading the identity assertion. Encryption of assertions entails a very different key distribution problem than identity. In order for a recipient to read an encrypted message, the recipient must possess a decryption key. The corresponding encryption key needs to be shared, in some fashion, with the identity provider before the identity provider can encrypt the assertion for that recipient. The problem is complicated by the potential existence of multiple recipients.
If the identity assertion is encrypted for one particular recipient, and ends up being distributed to multiple recipients by a reflector, the additional recipients will not be able to read or verify the assertion. There are at least two strategies for overcoming this problem:

o Encrypt the assertion on a per-recipient basis (i.e., include multiple versions of the assertion, each one encrypted with a key corresponding to the decryption key of each recipient).

o Force all recipients to share a common decryption key, and encrypt the assertion only once with that key.

Both of these approaches are limited by the fact that the identity provider cannot anticipate who will receive the message. Moreover, as the list of recipients grows larger, these strategies become increasingly unmanageable. Even if a message is retargeted to only one destination, the identity provider has no way to anticipate what that destination might be. In the final analysis, encryption of assertions is a very difficult practice to manage in messaging identity architectures.

When a message is reflected to multiple recipients, this can give rise to another privacy problem. If the identity provider's keying material is included in the message by-reference, then the identity provider will know who the verifiers are when they contact the key store to acquire the key (given that the identity provider operates or has some oversight of the key store). While not all reflectors need to protect the privacy of their distribution list, it is very probable that some do. This problem can even arise when a message is forwarded by one recipient to another recipient, who subsequently verifies the message, if the original recipient did not want to reveal to the originator that their message was forwarded.
In an identity architecture in which keys are always distributed by-value, this problem never arises; if the originator or identity provider can choose to include keys by-reference, however, this could be a material concern. The concern lessens as the number of messages assured by the identity provider grows larger (i.e., large domains using domain-based assertions); any individual request becomes a needle in a haystack.

Nothing about a request for a key alone identifies the message that the verifier is validating - although if user-based assertions are used, it will reveal the originator of the message. This is a major distinction between distributing keys by-reference and distributing assertions by-reference; dial-back identity schemes (see Appendix E.2) notify the identity provider of the exact message that the verifier is inspecting.

11. Conclusion: Consensus Points and Questions

If the analysis in this document illustrates anything, it's the sheer number of moving parts that must be fixed in order to arrive at an identity solution for a messaging system. It does, however, identify the core consensus points in arriving at an identity solution. The following are the major points that require analysis:

o keying: asymmetric keys vs. symmetric keys

o asymmetric keys: certificates vs. uncertified

o assertion structure: canonicalization vs. replication

o reference indicators: static vs. dynamic

o identity providers: originators vs. intermediaries

o verifiers: recipients vs. intermediaries

o content: a reference indicator vs. not a reference indicator

o assertions: domain-based vs. user-based

o assertion placement: envelope vs. content

o key distribution: by-reference vs. by-value

In order to arrive at a consensus on those points, questions like the following need to be asked.
Do your use cases include identity assertions being validated by verifiers who have no previous association with the identity provider? If so, this argues for using asymmetric keys rather than symmetric keys, since symmetric keys assume some pre-arranged key exchange between the identity provider and the verifier.

Is the privacy of the recipients of a message with respect to the identity provider, when a message is forwarded to unanticipated destinations, important? At a high level, if you believe so, this argues for supplying keys in messages by-value, rather than by-reference. Alternatively, if the by-reference key store is the DNS, one could argue that requests for keys are likely to be lost in the general mass of queries targeting the DNS server (though this may not be the case in practice, depending on how the query strings are formulated).

Do you want recipients of a message to be able to verify messages off-line? If so, this also argues for supplying keys by-value. If keys are supplied by-value, it is far better to use certificates than uncertified public keys, especially if you want domain-based assertions.

Is it critical that an identity provider be securely associated with a particular domain? If you say 'yes' to this, this argues for domain-based assertions. Furthermore, depending on exactly how critical it is, this argues for using certificates rather than any system relying on the DNS (given the current state of DNSSEC deployment) or a leap-of-faith system.

Is it possible to arrive at a fixed set of reference indicators for messages in your messaging system? If so, then this argues for using canonicalization rather than replication in assertions. If not, then replication is probably a better bet than dynamic canonicalization. If you can use canonicalization, then placing assertions in the envelope is preferable for most messaging systems.
Do you want the use of identity assertions to be opportunistic for endpoints? If so, then you want intermediaries to instantiate the identity provider role.

Are you willing to try to prevent active attackers as well as passive attackers? If so, then you may be willing to try to use message content as a reference indicator.

12. Security Considerations

This document is entirely concerned with the security of Internet messaging systems. It provides a survey of existing mechanisms to provide identity in Internet messaging systems in order to counter the seminal threat of impersonation. Since it treats messaging in the abstract, rather than discussing any particular protocol, it makes no specific recommendation for advancing any particular approach for the problem. It does, however, show how some architectural decisions, at a high level, are likely to be more successful than others. It also suggests a way to divide-and-conquer decision-making about identity enhancements for applicable messaging systems.

13. IANA Considerations

This document contains no considerations for the IANA.

14. Informative References

[1] Postel, J., "Simple Mail Transfer Protocol", RFC 821, STD 10, August 1982.

[2] Crocker, D., "Standard for the format of ARPA Internet text messages", RFC 822, August 1982.

[3] Oikarinen, J. and D. Reed, "Internet Relay Chat Protocol", RFC 1459, May 1993.

[4] Klensin, J., "Simple Mail Transfer Protocol", RFC 2821, April 2001.

[5] Resnick, P., "Internet Message Format", RFC 2822, April 2001.

[6] Mockapetris, P., "Domain names - concepts and facilities", RFC 1034, STD 13, November 1987.

[7] Linn, J., "Privacy Enhancement for Internet Electronic Mail: Part I: Message Encryption and Authentication Procedures", RFC 1421, February 1993.

[8] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996.
[9] Levinson, E., "Content-ID and Message-ID Uniform Resource Locators", RFC 2111, March 1997.

[10] Troost, R., Dorner, S. and K. Moore, "Communicating Presentation Information in Internet Messages: The Content-Disposition Header Field", RFC 2183, August 1997.

[11] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, May 2002.

[12] Housley, R., Polk, W., Ford, W. and D. Solo, "Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile", RFC 3280, April 2002.

[13] St. Andre, P., "Extensible Messaging and Presence Protocol: Instant Messaging and Presence", RFC 3921, October 2004.

[14] Peterson, J., "Session Initiation Protocol (SIP) Authenticated Identity Body (AIB) Format", RFC 3893, September 2004.

[15] Watson, M., "Short Term Requirements for Network Asserted Identity", RFC 3324, November 2002.

[16] Jennings, C., Peterson, J. and M. Watson, "Private Extensions to the Session Initiation Protocol (SIP) for Asserted Identity within Trusted Networks", RFC 3325, November 2002.

[17] Ramsdell, B., "Secure/Multipurpose Internet Mail Extensions (S/MIME) Version 3.1: Message Specification", RFC 3851, July 2004.

[18] Elkins, M., Del Toro, D., Levien, R. and T. Roesler, "MIME Security with OpenPGP", RFC 3156, August 2001.

[19] Sparks, R., "The Session Initiation Protocol (SIP) REFER Method", RFC 3515, April 2003.

[20] Sparks, R., "The Session Initiation Protocol (SIP) Referred-by Mechanism", RFC 3892, September 2004.

[21] Crocker, D., "Internet Mail Architecture", draft-crocker-mail-arch-01 (work in progress), July 2004.

[22] Klyne, G. and J. Palme, "Registration of mail and MIME header fields", draft-klyne-hdrreg-mail-05 (work in progress), May 2004.

[23] Arends, R., Austein, R., Larson, M., Massey, D. and S. Rose, "Protocol Modifications for the DNS Security Extensions", draft-ietf-dnsext-dnssec-protocol-09 (work in progress), October 2004.

[24] Josefsson, S., "Domain Name System Uniform Resource Locators", draft-josefsson-dns-url-10 (work in progress), September 2004.

[25] Peterson, J. and C. Jennings, "Enhancements for Authenticated Identity Management in the Session Initiation Protocol (SIP)", draft-ietf-sip-identity-03 (work in progress), September 2004.

[26] Bradner, S., "Key words for use in RFCs to indicate requirement levels", RFC 2119, March 1997.

[27] Handley, M., Schulzrinne, H., Schooler, E. and J. Rosenberg, "SIP: Session Initiation Protocol", RFC 2543, March 1999.

Author's Address

Jon Peterson
NeuStar, Inc.
1800 Sutter St
Suite 570
Concord, CA 94520
US

Phone: +1 925/363-8720
EMail: jon.peterson@neustar.biz
URI: http://www.neustar.biz/

Appendix A. Acknowledgments

The author drew considerable inspiration for this document from the longstanding discussion of identity on the SIP mailing list. The IAB Workshop on Messaging in October of 2004 was also a valuable influence.

Appendix B. Verification Assertions

A verification assertion is a piece of information added to a message by an intermediary-based verifier which asserts that an identity assertion in the message was verified. These assertions are most useful in architectures in which a recipient cannot be expected to instantiate the verifier role itself. However, it is also possible that verification assertions could be inspected by intermediaries between the verifier and the recipient.

Verification assertions may be cryptographic, but typically they are not.
Usually, the recipient has some specific trust relationship with the verifier, which may include the use of some other form of security (for example, network or transport layer security) which guarantees that the verification assertion was created by the trusted verifier.

A verifier may strip any identity assertion from a message before adding a verification assertion, or it may leave the assertion in the message. The latter option is preferable, insofar as it is forwards-compatible with recipients instantiating the verifier role.

While verification assertions are probably important for some architectures, they are not strictly necessary to implement an identity service. In fact, by rendering the identity architecture less end-to-end, verification assertions may weaken the overall security of the architecture.

Appendix C. Messaging: Real-Time versus Store-and-Forward

In most respects, the high-level messaging architectures discussed in this document share common security properties regardless of whether they are real-time or store-and-forward. However, there are a few important respects in which the two differ:

Delay from Computation: The instantiation of the verifier and identity provider roles by the system (more or less irrespective of where they are located) will incur some delay corresponding to the complexity of the cryptosystems they employ. While this delay is not likely to be noticeable in store-and-forward messaging systems, it may be perceptible (and undesirable) in real-time communications systems.

Offline Handling: Store-and-forward systems allow users to read their messages offline. Accordingly, if the recipient acts as a verifier, the verifier might not be online when it reads the message. This has important implications for any sort of keys or assertions that are carried by-reference (and for dial-back identity schemes).

Delivery Receipts
Real-time messaging protocols tend to provide real-time acknowledgements of message delivery by default. These acknowledgements in turn have important identity properties. While the same is true of various optional delivery acknowledgement mechanisms that can be used in store-and-forward systems, in real-time systems the responses returned to a message can invoke all sorts of behavior on the originator side, including resubmission of the request to alternate destinations and so on. Any sort of response identity is outside the scope of this document, and is believed to be separable from the message identity work described in this document.

By-Value Subversion
In order to subvert a request to acquire keys by-value from a key store, it helps considerably if the attacker knows when the verifier will initiate the request. In real-time messaging architectures, this is relatively clear - it will be soon after the message has been sent. In store-and-forward architectures, since the verifier might not validate the message for hours, days, or weeks, it can be very difficult for the attacker to make this determination. Note that even in store-and-forward architectures, if an intermediary acts as a verifier, this distinction becomes less acute - there is a comparatively smaller time window in which an intermediary is likely to verify an assertion, and accordingly, it may be easier to subvert a request for a key when an intermediary is the target.

Creation Time as a Reference Indicator
In real-time messaging systems, the creation time of a message is a very strong reference indicator, since delivery of messages is expected to be very quick. Accordingly, passive attackers have only a small interval of time to mount a replay attack using an assertion with a creation time reference indicator.
In store-and-forward architectures, the delivery window is extremely large, so creation time is a less valuable reference indicator (though not entirely useless).

Appendix D. Third-Party Assertions

Many messaging architectures assign important roles to third parties. To take a familiar example, email has the concept of a mailing list which sends messages on behalf of an originator. For the purposes of this document, a third-party assertion is differentiated from an ordinary identity assertion as follows: a third-party assertion is provided by an identity provider that is not authoritative for the namespace containing the name of the originator of the message (following the general constraints of Section 7.1).

Depending on the sorts of authorization decisions that a verifier might want to perform, the identity of the originator may be secondary, or even totally irrelevant, when a third party is involved. A particular recipient might wish to accept any email message from a particular mailing list, for example, without regard to the identity of a particular originator. Other practical examples include chat rooms in instant messaging systems, and systems in which one endpoint can instruct another endpoint to send a message (such as the SIP REFER [19] method).

Clearly, the manner in which a third party asserts something about a message is orthogonal to the broader question of how to identify the originator of a message. However, it is certainly possible that third parties may want to attach additional cryptographic information to a message in order to allow particular authorization decisions to be made available to recipients. The formulation of third-party assertions seems to be a problem that is entirely separable from the identification of the originator, and is thus out of scope of this document.
Future work could identify a means of providing third-party assertions that is entirely supplemental to the identity work in this document. An example of a third-party assertion is the Referred-by [20] token associated with the SIP REFER method.

Appendix E. Alternatives to Identity Assertions

Identity assertions are not the only means of increasing a recipient's surety of the identity of an originator of a request.

E.1 Trusted Intermediary Networks

It is important to note that identity assertions are primarily motivated by the interdomain nature of messaging. Within a single administrative domain, both the originator and the recipient of any message must trust the same domain in order for messaging to function at all. Accordingly, they can assume (perhaps without good justification) that the domain would not connect them if it had not properly authenticated them both.

Given this, some messaging architectures try to extend the boundaries of an administrative domain in order to treat interdomain messaging as an intradomain problem. In contrast to cryptographic assertions, these identity systems rely on particular deployment architectures to guarantee the security properties of the assertion. The only assertion that is actually carried in the message is a separate envelope element that provides an 'authoritative' return address.

For example, consider the 'trust domain' concept defined in Section 2.3 of RFC3324. In this messaging architecture, a trusted network is a set of intermediaries that exchange messages with one another over a closed network (a network either logically or physically inaccessible from the Internet, over which intermediaries pass messages to one another). Assuming such a trusted network, one can design a very simple identity assertion.
For example, in an email network, one could introduce a new 'Trusted-From' header field whose contents could only be set by intermediaries in the trusted network. The identity information conveyed by such a system is the contents of this trusted header. Recipients treat this trusted header as the assured identity of the originator. An example of this sort of trusted assertion is RFC3325 [16], which defines the P-Asserted-Identity header field for SIP.

The traditional Internet Relay Chat (IRC [3]) service relied on a similar concept of trusted intermediaries. Intermediaries formed a meshed trust network over which messages passed, and each server was responsible for authenticating its users.

While this model has enjoyed considerable success in closed networks such as the telephone network, it has a number of limitations which render it incompatible with widespread Internet deployment of a messaging architecture. Forming closed overlay networks of providers that agree on network or transport-layer security standards and practices does not agree with the general model of Internet messaging, in which domains may exchange messages without any previous association.

Other, more sophisticated forms of transitive trust are ad-hoc. For example, a message could contain an explicit indication that any intermediary that relays the message needs to use some form of transport or network-layer security when sending to the next hop. Assuming a proper keying architecture, intermediaries can mutually authenticate one another from the originating domain to the domain of the recipient. The SIPS URI scheme in RFC3261 has this property. The main drawback to such mechanisms is that it is impossible for any intermediary or recipient to verify that appropriate lower-layer security was used over any particular transit hop. This is, in fact, the main problem with trusted networks in general - a given domain must trust that the remainder of the domains in the network behave properly.
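
The 'Trusted-From' idea in this section can be made concrete with a minimal sketch. It is illustrative only: the peer names, the `ingress` and `assured_identity` functions, and the assumption that the previous hop has already been authenticated by lower-layer security are all hypothetical and are not defined by this document.

```python
from email.message import EmailMessage

# Hypothetical members of the closed, trusted network (assumption).
TRUSTED_PEERS = {"relay1.trusted.example", "relay2.trusted.example"}
TRUSTED_HEADER = "Trusted-From"


def ingress(msg, peer, authenticated_as=None):
    """Apply the edge policy of a trusted network to an inbound message.

    peer: host name of the previous hop. This sketch assumes the hop was
          already authenticated by network or transport-layer security.
    authenticated_as: the identity this intermediary authenticated for a
          directly attached originator, or None.
    """
    if peer not in TRUSTED_PEERS:
        # A header claimed from outside the trusted network is discarded.
        del msg[TRUSTED_HEADER]
        if authenticated_as is not None:
            # Only an intermediary in the trusted network sets the header.
            msg[TRUSTED_HEADER] = authenticated_as
    return msg


def assured_identity(msg, peer):
    """Recipient side: honor the header only if the last hop is trusted."""
    if peer in TRUSTED_PEERS:
        return msg.get(TRUSTED_HEADER)
    return None
```

Note that nothing in the message itself lets a recipient check the header; its value rests entirely on every hop having applied the same policy, which is the limitation this section describes.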

E.2 Dial-back Identity

A dial-back identity system for messaging works as follows: when a verifier receives a message, it inspects the name that identifies the originator (such as the RFC2822.From header for email), and then launches a dial-back request to that name. This dial-back request must contain reference indicators for the request, either by-value or possibly as a hash of a canonicalization of the reference indicators. In another variant, the message itself contains such a hash which is verifiable by the recipient (essentially, an unsigned identity assertion), and the recipient then sends that hash in the backwards direction to the identity provider.

Assuming the name of the originator is valid, an identity provider responsible for the namespace of the originator's name will receive the request. If this identity provider is the originator, it can reply to the request with a positive response if it did indeed send the message in question. If the identity provider is some intermediary, it would need some way to ascertain that the originator sent that message; possibly, the originator sent the message through the identity provider, and the identity provider keeps state for every message it handled. However the intermediary-based identity provider learns of the validity of a request, it returns a positive response if the request was in fact sent from the originator in question. If the identity provider does not recognize the described message, it sends a negative response. No response (because the domain of the originator's name doesn't exist, or exists but has no identity provider) is assumed to be a negative response.

Depending on the semantics of the request, it may be somewhat intensive for the identity provider to make a determination of whether or not the request was actually sent by the originator.
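
As a rough illustration of the exchange just described, the following sketch models an intermediary-based identity provider that keeps state for every message it handles, and a verifier that dials back with a hash of the reference indicators. The canonicalization, the choice of SHA-256, the class and function names, and the `providers` mapping (standing in for whatever lookup reaches the originator's domain) are assumptions made for this example and are not specified by this document.

```python
import hashlib


def reference_hash(indicators):
    """Hash a canonical form of the reference indicators.

    The canonicalization used here (lower-cased names, stripped values,
    sorted order) is an assumption for illustration only.
    """
    pairs = sorted((name.lower(), value.strip()) for name, value in indicators.items())
    canonical = "\n".join(f"{name}:{value}" for name, value in pairs)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdentityProvider:
    """Intermediary-based identity provider keeping per-message state."""

    def __init__(self, domain):
        self.domain = domain
        self._sent = set()  # (originator, reference hash) per handled message

    def record_outbound(self, originator, indicators):
        """Remember a message the originator sent through this provider."""
        self._sent.add((originator, reference_hash(indicators)))

    def dial_back(self, originator, digest):
        """Positive response only if this exact message was recorded."""
        return (originator, digest) in self._sent


def verify(originator, indicators, providers):
    """Verifier side: dial back to the provider for the originator's domain."""
    domain = originator.rpartition("@")[2]
    provider = providers.get(domain)
    if provider is None:
        # No response is assumed to be a negative response.
        return False
    return provider.dial_back(originator, reference_hash(indicators))
```

The `_sent` set grows by one entry per message handled, which gives a sense of the per-message cost borne by the identity provider.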
If a message is forwarded to numerous recipients, obviously this per-message work becomes larger, and for cases like large email mailing lists, it may become unmanageable. The use of unsigned hashes in the message moves this work to a phase before the message is sent, rather than after the dial-back request is received.

In some respects, dial-back has similar properties to the DNS-based mechanisms of key distribution discussed in Section 6.1.2. Since such a system relies on a request being sent in the backwards direction using the name of the originator, it necessarily relies on the validity of the DNS to reach that name. However, unlike the DNS-based uncertified keying mechanisms, dial-back requires no special modifications to the DNS.

Dial-back identity systems have enjoyed some success in real-time messaging systems, but clearly their applicability to store-and-forward systems is limited, especially when the identity provider role is instantiated by originators.

All in all, within their domain of applicability, dial-back identity systems improve security with little expenditure of design effort. They are not considered further in this document because they are not predicated on identity assertions as such.

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Disclaimer of Validity

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

Copyright (C) The Internet Society (2004). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

Acknowledgment

Funding for the RFC Editor function is currently provided by the Internet Society.