Internet Architecture Board                            P. Faltstrom, Ed.
Internet-Draft                                            G. Huston, Ed.
Expires: June 14, 2005                                               IAB
                                                       December 14, 2004

                    A Survey of Internet Identities
                      draft-iab-identities-02.txt

Status of this Memo

This document is an Internet-Draft and is subject to all provisions of section 3 of RFC 3667. By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she become aware will be disclosed, in accordance with RFC 3668.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on June 14, 2005.

Copyright Notice

Copyright (C) The Internet Society (2004).

Abstract

This memo provides an overview of the various realms of identification used within the Internet protocol suite, with a view to noting the interdependencies of the different identifiers and consequent implications for updating their specifications or changing their infrastructures' operations.

Faltstrom & Huston, Eds.     Expires June 14, 2005              [Page 1]
Internet-Draft             Internet Identities             December 2004

Table of Contents

   1.  Introduction
     1.1   Desirable properties of Internet Identities
   2.  A Hierarchy of Identities
     2.1   Media Access Addresses
       2.1.1   Summary
     2.2   IP Addresses
       2.2.1   Summary
     2.3   Service and Session Identities
       2.3.1   Summary
     2.4   Routing and Forwarding Identities
       2.4.1   Summary
     2.5   Mobile Identities
       2.5.1   Summary
     2.6   Opportunistic Identities
       2.6.1   Summary
     2.7   Domain Names
       2.7.1   Summary
     2.8   Uniform Resource Identifiers
       2.8.1   Summary for URLs
     2.9   Uniform Resource Names
       2.9.1   Summary
     2.10  Human Friendly Strings
       2.10.1  Summary
   3.  Issues with Identities
     3.1   Overloading the IP Address
     3.2   Dynamic DNS Updates and Nomadism
     3.3   URLs and Persistent Identifiers
   4.  The DNS in Identity Spaces
     4.1   The role of the DNS
     4.2   Changing the DNS
     4.3   The DNS is a strict lookup service
     4.4   Coherency of the DNS
     4.5   The DNS as an Identity Glue
   5.  Security Considerations
   6.  Acknowledgements
   7.  Informative References
       Authors' Addresses
   A.  IAB Members
       Intellectual Property and Copyright Statements

1. Introduction

In any communications domain where two parties wish to conduct a conversation across a network, each party must specify to the network sufficient information for the network to identify the other party. When the conversation refers to a resource or service that is accessible through the network, the only effective way to refer to such a resource or service is to use an identifier that can subsequently be passed to the network to perform the access.

Some networks use a single identifier domain to identify all parties and services. Other networks use a collection of discrete identifier domains, where each identifier domain has a specific realm of discourse or application.

The Internet is an example of a multiple-identifier domain network, where there are a number of identity domains, each referring to a particular function or area of application. In terms of routing and forwarding IP packets the identity domain used is that of IP addresses, while in terms of identifying particular services or resources the URI form of identity is commonly used. In terms of human use of identities, the most common form of identity in the Internet is based upon the domain name.

This memo examines the role of identities and identifiers, together with an overview of the various realms of identity that are used in the Internet. The document then looks in more detail at the Domain Name System (DNS) and examines its role in relation to these identity realms.

One of the characteristics of the Internet's multiple identifier systems is their heavy interdependence.
This memo notes those interrelationships and provides some observations on implications for technical or operational evolution of their specifications.

1.1 Desirable properties of Internet Identities

Before exploring the set of Internet-based identity realms, it is useful to enumerate a set of desirable characteristics of any useful identity system. The following list of characteristics, together with some related questions about the properties of the identifier, is proposed as a useful, although not comprehensive, collection of identity attributes:

Uniqueness: In what realm is the identifier unique? Can the same identifier be associated with two or more distinct objects within the domain of a single realm? Can multiple identifiers in a single realm be associated with the same object? An identifier can only be used reliably and deterministically when there is a unique association with an object. An identity realm is generally useful when the association between identities and objects is a relationship where each unique identifier references a single unique object. Note that there is no requirement in the reverse direction, in that it typically makes little difference to the utility of an identity realm if a unique object is associated with multiple identities. In other words, it can lead to ambiguities in identity resolution if an identity is associated with two or more distinct objects, but it generally is not as critical if an object is associated with multiple identities (i.e. multiple identity aliases for a unique object).

Consistency: Is the identity asserted within a consistent identifier space? This avoids an assertion of identity being interpreted by another party in an unintended manner.

Persistence: Does the identity remain constant, or are gratuitous changes in the mapping from the identifier to the referenced object avoided?
Constantly changing identities are, at the very least, difficult to track.

Trust: Can a particular identity withstand a challenge as to its validity? Other parties who would like to use this identity would like to be reassured that they are not being deceived. 'Use' in this context is a generic term that includes actions such as resolution to the object identified by the identity value, storage of the identity value for subsequent 'use', and referral, where the token is passed to another party for their 'use'.

Robustness: Is the identity realm capable of withstanding deliberate or unintentional attempts to corrupt it in various ways? A robust identity system should be able to withstand efforts to undertake identity theft or identity fraud.

Withholding: If the identity is composed of a number of components, are only those components of the identity that are essential to support the communication exposed to other parties? Compound identity systems should not reveal those components of the identity structure that are not relevant to the identity operation being performed.

Referential Consistency: If the identity is used in the context of a reference, then when the referenced object is altered or relocated, does the identifier remain a valid reference to the object? Referential consistency refers not only to the constancy of the reference in the face of changes to the referenced object, but also to consistency when one entity passes the identity value to another. The identity resolution should remain constant in such cases.

Structure: Is the token space from which identity values are drawn structured or unstructured? Structured token spaces allow various forms of retrieval operations based on the identity value to be undertaken efficiently, while unstructured token spaces allow for more flexible generation and use of identities within more restrictive realms of discourse.
This list is not attempting to be a complete enumeration of required identifier properties, but instead lists the most important desirable properties of identifier realms in the context of the Internet.

2. A Hierarchy of Identities

In networking models there is a conceptual layering of functionality, starting at the layer of bits on the wire at the media access level and moving up a stack of layers through internetworking, end-to-end transport and application levels. Each one of these layers creates at least one context in which identifiers are used for the communication.

It would appear that from this perspective an identity within the Internet is not just a single identity, but a collection of various identities, used in a variety of contexts.

2.1 Media Access Addresses

There are two generic types of base media in this realm. One is a point-to-point medium, a bilateral communications system where all Protocol Data Units (PDUs) generated by one party are passed to the other party. In such environments the use of media access addresses is not strictly required. The other form of environment is a multi-access environment, where a number of parties can communicate directly using a common medium. In this environment the sender must specify the intended recipient of the PDU, and to achieve this all connected entities must use a unique media access address.

The most common of these multi-access media are encompassed within the IEEE 802 collection of media standards. These IEEE 802 technologies share a common structure of Media Access Control layer address (MAC address) to uniquely identify devices connected within a LAN. There are two forms of this identity space, one using a 48 bit identity space (EUI-48 [21]), and the other a 64 bit space (EUI-64 [22]).
Both identity spaces can be considered as partially-structured identity spaces, where a number of bits within the MAC address determine whether the address has been globally or locally assigned. Globally assigned values are globally unique, but are structured in such a way that there is no imposed hierarchy within the address that could be used for efficient searching in contexts such as, for example, a routing or forwarding application.

A global MAC address identity certainly passes one of the more basic tests of an identity domain, that of uniqueness. Two parties cannot assume the same MAC address value and use this same value as a unique identity. So in a LAN context, a collection of devices can distinguish between each other by virtue of this unique MAC address. A manufacturer of Ethernet devices is assigned a manufacturer's block of Ethernet MAC addresses and uniquely places one address in each device. The end consumer has no need to reconfigure the device with a new address, nor is there any need to alter existing MAC addresses each time the LAN changes with new devices being added, as the address is intended to be globally unique.

Beyond these attributes there are some real weaknesses in using a MAC address as an identity outside the context of a LAN environment. The identity space, while globally unique, has few other distinguishing properties. The structure of the identity space does not reflect its current location within a particular network topology, so it is of no assistance as a location token. In the context of equating a device identity to this network interface identity, the identity has limited persistence, in that it follows the interface hardware, not the host computer or its use. For example, switching a device from a wireless to a wired connection changes its MAC identity.
The identity has no capability to express any linkage to any other identity domain. It has no internal structure of sub-fields that could be interpreted as pointers into other identity fields. Its precise semantics are to define an interface to a network rather than the device itself.

This issue of identifier scope comes up in link layer security discussions, where it may not be the best possible approach to bind master session keys to MAC addresses rather than to some other identity. As another example, in IEEE 802.11i it is possible for a host to have multiple interfaces, and therefore there is a significant difference between binding a Master Session Key to a MAC address and binding it to a host identity.

This lack of a direct association between an interface's MAC address and a host device has undesirable effects when it has been assumed that a MAC address equates to a host identity. In "Authentication for DHCP Messages" [13], where the MAC address takes on the role of the DHCP client-identifier, or in the administrative model of IEEE 802.11-1999 Wired Equivalent Privacy (WEP) [23], it can be an administrative burden to keep track of all the network interface cards, their MAC addresses and their associated secrets.

Despite these limitations, the MAC address is regarded as a useful identity mechanism in the context of an identity space. The original 48 bit identity specification has been augmented with 16 padding bits in order to be incorporated into the IEEE 64-bit EUI-64 global identifier structure, which in turn has been incorporated into the IPv6 address architecture as the interface identifier component of the unicast address [9]. It should be noted that this latter action of embedding one identity (a MAC address) in another (the IPv6 address) lifts the original identity outside its original context.
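As an illustration of the embedding described above, the following is a minimal sketch of deriving a 64-bit interface identifier from a 48-bit MAC address. It assumes the "modified EUI-64" procedure of the IPv6 addressing architecture (the 16 filler bits are 0xFFFE, and the universal/local bit is inverted); those details are not spelled out in this memo. The function names are hypothetical and chosen only for this example.

```python
def is_locally_administered(mac: str) -> bool:
    """Return True if the universal/local bit marks the address as locally assigned."""
    first_octet = int(mac.replace("-", ":").split(":")[0], 16)
    # Bit 0x02 of the first octet distinguishes local from global assignment.
    return bool(first_octet & 0x02)


def mac_to_interface_id(mac: str) -> str:
    """Derive a modified EUI-64 interface identifier from a 48-bit MAC address.

    Assumption: the filler value 0xFFFE and the inversion of the
    universal/local bit follow the IPv6 addressing architecture.
    """
    octets = bytes(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Insert the 16 filler bits between the manufacturer and device parts.
    eui64 = bytearray(octets[:3] + b"\xff\xfe" + octets[3:])
    # Invert the universal/local bit of the first octet.
    eui64[0] ^= 0x02
    # Render as the four 16-bit groups of an IPv6 interface identifier.
    groups = [f"{(eui64[i] << 8) | eui64[i + 1]:x}" for i in range(0, 8, 2)]
    return ":".join(groups)
```

For example, the MAC address 00:0c:29:0c:47:d5 maps to the interface identifier 20c:29ff:fe0c:47d5. The manufacturer's block (the first three octets) remains directly visible in the result, which is the disclosure concern that arises when one identity is embedded in another.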
There have been some concerns noted where public disclosure of the MAC address within every IPv6 address also discloses both the unique identifier and, potentially, the role of the device. For example, a device manufactured by a specialized storage manufacturer is more likely to be a very expensive storage subsystem housing mission-critical data. This may not be information that is intended to be made public, and a follow-up proposal advocated the ability for the interface identifier within an IPv6 address to be a temporary randomized value [12]. This is illustrative of one of the side-effects of identity interdependence where one identity realm is embedded in another.

2.1.1 Summary

Uniqueness: MAC addresses are globally unique.

Consistency: MAC addresses are intended to be a unique token for a network interface. Within this limited scope they are consistent.

Persistence: MAC addresses are associated with an interface, in the hardware, and, as such, are persistent. There are uses of MAC addresses that do not assume permanence, but that has more to do with the impermanence of binding that address to some other identifier (e.g., IP address) than anything else.

Trust, Robustness: Both Trust and Robustness seem to be tied to the use of the MAC address, as there is no Internet infrastructure for assigning or reporting them. This raises the question of whether the process that asserts the MAC address of a hardware element and the transmission of that assertion is trustable, and whether the manufacturing process that embeds MAC addresses into hardware is always reliable.

Withholding: MAC addresses are not decomposed, and are completely exposed.

Referential Consistency: The MAC address refers to an interface, rather than a device or an endpoint of an application.
If the hardware component is moved then the MAC address is still a valid identifier for the interface. However, from the perspective of dependent identifier systems, there may be some consistency issues, in that another system may no longer be able to see the interface identified by the MAC address.

Structure: MAC addresses are unstructured.

2.2 IP Addresses

Moving up one level in the protocol stack model provides an identity based on the internetworking layer, namely the IP address. The IPv4 address is a 32 bit field providing each Internet-connected interface with a unique value. IPv6 uses effectively the same construct, using a 128 bit identity domain rather than a 32 bit domain. In both cases the IP address is a structured identity space where there is a globally significant prefix that is used in the context of routing and forwarding outside of a particular local domain, and a local part that is used to deliver the packet to the correct interface of the associated device within the local network. The fact that the structure of the address is based on the requirements of routing, and is therefore topologically sensitive, implies that the underlying semantics of the IP identity can be most reasonably assumed to be temporal rather than persistent.

As an identity token, an IP address should be unique. It is structured to be useful to forward packets to the addressed device, and it is well known, in that it is not a secret value.

An IP address is not everything one could hope for in an identity. The IP address identifies an interface, not a device or its user.
A device with multiple active interfaces has multiple IP addresses, and while it is obvious to the device itself that it has multiple identities, no remote party can tell that the multiple identities are in fact pseudonyms, and that the multiple addresses simply reflect the potential for multiple paths to reach the same endpoint.

Furthermore, the IP address is an information-bearing identifier, which is structured in such a way that it can be used in routing and forwarding. This is helpful in the sense that there is no need to deploy a second identity system that refers only to locality within a network. However, it compromises the use of the address as an identity, since in some circumstances a change in the connectivity of a local network will require a renumbering of that network, such that the address of each individual device will change.

This is a specific example of the more generic observation about IP addresses, namely that the IP address carries both the identity of the endpoint in the IP realm and the location of the endpoint in the IP network. The merits of delineating these two roles of identity at the IP level have been a matter of longstanding study that continues today. Such a separation would create one identity realm as a means of uniquely identifying an instance of a protocol stack within an end device (variously called a "stack identifier" or "endpoint identifier" in previous studies) and a second identity realm that is used to identify the current location of the identity element within the network (typically called a "locator" identity) [1][25][5].

2.2.1 Summary

Uniqueness: With the exception of certain identified special cases, such as private addresses [4], IP unicast addresses are globally unique. In the context of anycast use of IP addresses, an IP anycast address represents a collection of individual entities that undertake an equivalent function.
In the context of multicast IP addresses, an IP multicast address represents a set of many independent hosts.

Consistency: Mostly consistent; private addresses are known by convention, not by any internal identifier structure.

Persistence: IP addresses are intended to be persistent. Because of the duality of the address as representing both identity and location, a change in location, such as in a mobile device, often triggers a change in IP address.

Trust: There is no systematic method of validating an assertion of an IP address.

Robustness: Attempting to hijack an IP address also requires some form of corruption of the routing system in order for other systems to be informed of the updated location of the corrupted address.

Withholding: IP addresses cannot be decomposed, and are completely exposed.

Referential Consistency: It is normally the case that IP addresses are referentially consistent, and one party can pass a reference to a correspondent party to any other party by means of passing the IP address. One caveat is that where the IP address is deliberately corrupted, by virtue of the use of a NAT device, or in the case of dynamically assigned addresses, IP addresses lose their referential consistency. As noted above, anycast addresses may not preserve precise referential integrity.

Structure: The structure of an IP address refers to a structure of topology in a locational context. In order to provide effective summarization tools in the context of routing, the IP address is structured such that adjacent devices use adjacent addresses, so that a common address prefix can be used to summarize the location of a local network of addressed devices.

2.3 Service and Session Identities

In the TCP/IP protocol suite the next level of identity is that of the transport session.
In order for a system to advertise a particular service that is a point of attachment for clients, it combines three fields: the IP server address, the transport protocol identity, and the address of the local service identity (port number), into a compound identity that describes a particular service port on a particular device.

The port address concept, used in TCP and UDP, represents generic identities for service rendezvous points. When combined with an IP address they become particular service points, or identified service points, and these compound identification objects (IP address, Transport Protocol, Port) are service identifiers.

The identity concept for transport is further extended by including the sender's IP address and port address. The corresponding 5-tuple of (Source IP address, Destination IP address, Transport Protocol, Source Port, Destination Port) is an identifier for a particular instance of a session. Not only is this 5-tuple used at the destination point to correctly de-multiplex an incoming packet stream and send each packet to the correct local instance of the application, the session identity can also be used within the network to recognize a 'flow' of packets that require identical forwarding treatment and may require identical service treatment, if so configured. In the latter case the session identity is being used to trigger a particular service response within the network, and the assumption being made within such contexts is that this 5-tuple is sufficiently unique to identify particular sessions to the relevant network elements. (SCTP also has a port address, but uses a set of IP addresses to identify the remote end. At the network level a 'flow' or 'stream' is identified as a collection of 5-tuples, rather than as a single 5-tuple.)

There are circumstances where the complete 5-tuple is not visible to the network, such as in the use of IPSEC [8].
When the Encapsulating Security Payload (ESP) protocol is used with confidentiality enabled, one objective is to hide session information. The purpose of deliberately occluding these details is to impede traffic analysis, or to greatly reduce the information obtainable through traffic analysis. When using IPSEC with ESP the user has choices about how ESP is deployed. One choice is to use a separate Security Association for each flow, while another choice is to use a single Security Association for multiple flows to hide that flow information. It is not uncommon to take multiple already encrypted flows and re-encrypt them together using a common Security Association. This technique is very effective in impeding or preventing traffic analysis. The triple (source IP, dest IP, Security Parameter Index (SPI)) will identify the full flow granularity that the user intended to reveal. Of course, the SPI value will change at key rollover time, but usually the packet patterns (size, frequency of transmission, etc.) will reveal which new SPI value corresponds with which previous SPI value. So if an entity is trying to identify flows, it is best to use that natural triple in the case of IPSEC with ESP.

Session identities are intended to be unique at any single point in time, in that two distinct sessions will not share a common session identity. However, this identity is temporal, in that once the session is finished the identifier is no longer of direct relevance, and at a subsequent time a different session may use the same 5-tuple.

As well as impermanence, session level identifiers exhibit a very fine level of granularity, and as such are often at a level of detail which is too fine to be a useful general identity token across the entire Internet realm.
One use is to allow a session to construct an identity that refers to itself or its correspondent that can then be handed to a quality of service policy controller to request a specialized service response for the session. Other uses of session identities can be found in filters, firewalls and network address translators, as well as various forms of middleware applications.

2.3.1 Summary

Uniqueness: A session identity is unique in a very limited context, such that the session identity is only unique between the communicating endpoints, and only unique for the lifetime of the session.

Consistency: A session identity is intended to be consistent within the scope of the IP-level multiplexing and demultiplexing function performed in the endpoints of the session. Some forms of active middleware attempt to use this session level identity as a means of session identification. This use out of intended context is not always reliable.

Persistence: Session identities are not persistent.

Trust: Session identities are not necessarily trustable. Additional mechanisms would be required to improve the trust attributes of session identities.

Robustness: Session identities are not robust, and some other form of session context is required to minimize the risk of session hijacking.

Withholding: Session identities are an instance of withholding, in that an end point session state includes a number of additional information items relating to packet sequence numbers and endpoint protocol state. These items are withheld from the explicit protocol exchange and are inferred at each end from the protocol exchange.

Referential Consistency: Session identities are not referentially consistent.

Structure: Session identities have a number of components, but each component is not internally structured.
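The relationship between the service identifier and the 5-tuple session identity described in Section 2.3 can be sketched as follows. This is an illustrative model only, not a description of any particular protocol stack: the class names, the string-valued handler and the Demultiplexer interface are all assumptions made for this example.

```python
from typing import Dict, NamedTuple, Optional


class ServiceId(NamedTuple):
    """Compound service identifier: (IP address, Transport Protocol, Port)."""
    address: str
    protocol: str
    port: int


class SessionId(NamedTuple):
    """The 5-tuple that identifies one instance of a session."""
    src_addr: str
    dst_addr: str
    protocol: str
    src_port: int
    dst_port: int

    def service(self) -> ServiceId:
        # The service identifier is the destination half of the 5-tuple.
        return ServiceId(self.dst_addr, self.protocol, self.dst_port)


class Demultiplexer:
    """Hypothetical endpoint table mapping live sessions to local handlers."""

    def __init__(self) -> None:
        self._sessions: Dict[SessionId, str] = {}

    def open(self, session: SessionId, handler: str) -> None:
        # A session identity must be unique at any single point in time.
        if session in self._sessions:
            raise ValueError("session identity already in use")
        self._sessions[session] = handler

    def close(self, session: SessionId) -> None:
        # Once the session ends, the same 5-tuple may be reused later.
        del self._sessions[session]

    def deliver(self, session: SessionId) -> Optional[str]:
        # Select the local application instance for an incoming packet.
        return self._sessions.get(session)
```

Two sessions from the same client to the same service differ only in source port, so they share a service identifier but have distinct session identities. After close(), the 5-tuple carries no memory of the earlier session, which reflects the lack of persistence noted in the summary.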
2.4 Routing and Forwarding Identities

As mentioned above, IP addresses provide information required by routing and forwarding systems. Forwarding is undertaken using the entire address as the lookup key into a forwarding table, using the best match of the address against a table entry as the basis of the forwarding decision (where 'best' refers to a precise match across the longest sequence of leading bits).

Routing within the Internet uses a hierarchy of environments, ranging from a non-routed multi-access local network, through a set of locally routed networks where routing is based on comprehensive knowledge of local network topology, through to the interdomain routing environment, where routing is based on a sequence of edge-to-edge transits across domains.

This hierarchy of routing is reflected in the structure of addresses. At any point in the routing hierarchy an address is divided into two parts, a routing network part and a subnet address part. Early definitions of this address structure used a fixed division, while later refinements of classless IPv4 addresses and IPv6 both use an explicit prefix length value that is combined with an IP address prefix to form the routing identifier.

Interdomain IP routing incorporates both routing identifiers and routing domains, or "autonomous systems". Within a given routing domain, IP routing is performed using only routing identifiers. However, for routing between domains, IP routing is performed using a new identity, the Autonomous System number. The most common implementation of inter-domain routing is a distance vector distributed computation of inter-domain topology using vectors of AS numbers as both a loop detection and a path preference mechanism. The AS identity space is an unstructured space of numeric values, allocated from a single 16-bit identifier space.
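The best-match forwarding lookup described at the start of this section can be sketched as below. This is a deliberately simple linear scan using Python's standard ipaddress module; real forwarding tables use specialized data structures, and the table contents and next-hop labels here are hypothetical.

```python
import ipaddress
from typing import Dict, Optional


def longest_prefix_match(table: Dict[str, str], destination: str) -> Optional[str]:
    """Return the next hop of the most specific routing identifier
    (address prefix plus prefix length) that encloses the destination."""
    addr = ipaddress.ip_address(destination)
    best_length = -1
    best_next_hop = None
    for prefix, next_hop in table.items():
        network = ipaddress.ip_network(prefix)
        # Only compare within the same address family (IPv4 or IPv6).
        if network.version != addr.version:
            continue
        # 'Best' is the enclosing prefix with the most leading bits matched.
        if addr in network and network.prefixlen > best_length:
            best_length = network.prefixlen
            best_next_hop = next_hop
    return best_next_hop


# Hypothetical forwarding table using documentation prefixes.
FORWARDING_TABLE = {
    "0.0.0.0/0": "default",
    "192.0.2.0/24": "if-a",
    "192.0.2.128/25": "if-b",
}
```

With this table, 192.0.2.200 matches all three entries and is forwarded via "if-b" (the /25), 192.0.2.5 is forwarded via "if-a", and any other IPv4 destination falls through to "default".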
An IP address is located within a routing system by identifying the most specific enclosing routing identifier. Forwarding a packet to a specific IP address involves an algorithm of locating the associated routing identifier and undertaking the forwarding action associated with that object.

Coherency of the routing system demands that routing identifiers are managed in a consistent fashion. The overloading of an IP address as both an IP identity and a component of a routing identifier implies that a device's location is implicitly described by its IP address. As noted earlier, relocating a device to a new network location, or relocating a network to a different point in the overall Internet topology, necessarily implies associating a new IP address with the device. In the absence of any other mechanisms, this new IP address replaces the previous IP address, changing the device's IP identity, the device's service identities and the device's session identities.

2.4.1 Summary

Uniqueness: Routing identities are intended to be unique, deriving their uniqueness from the underlying properties of the IP address space.

Consistency: Routing identities are intended to be a unique token within the context of a routing realm.

Persistence: Routing identities are not persistent, and have a lifetime associated with connectivity of the described entity within the routing realm.

Trust: There is no systematic method of validating an assertion of a routing identity.

Robustness: The identity structure does not inherently prevent various forms of corruption of routing identities.

Withholding: An inter-domain routing identity is a compound entity consisting of an address prefix and an autonomous system number. There is no withholding of elements of this identity.
Expires June 14, 2005 [Page 14] Internet-Draft Internet Identities December 2004 Referential Consistency: Routing identities are intended to be consistent within a routing realm, and the operation of routing protocols rely on this referential consistency of routing identities. Structure: Routing identities are not internally structured. 2.5 Mobile Identities Device and network mobility adds an additional dimension to identity, in that mobility implies some level of decoupling of the notion of location with that of identity. In one form of approach to this generic space, that of device mobility, a device has an additional IP address that acts as a 'current locator' that describes the device's current location within the network, while the device also retains a constant 'home address' that in effect acts as the device's constant identity and also acts as the discovery service point for its current location. With this approach a 'home agent' acts as a proxy agent for the device when it is roaming beyond the confines of its local network. The home agent tunnels traffic sent to the home address to an address at the host's current topological location, called the 'care of' address in Mobile IP. The host is responsible for updating the binding between the home address and the care of address in the home agent, by sending a binding update message when the care of address changes. The mechanism involved in mapping between the home address and care of address is very similar to the mechanism used on the local link for the ARP neighbour cache, except IP addresses are involved for both. This approach raises a critical issue for identities, namely that of robustness. Approaches to mobility need to be aware of a potentially hostile environment where third parties may attempt to subvert the implicit redirection of traffic by assuming the identity of the mobile element through the generation of false updates of the current location. 
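The binding between a home address and a care of address, and the need to reject false location updates, can be sketched as follows. This is an illustration under stated assumptions rather than the Mobile IP message format: the HomeAgent class, the make_tag helper, the pre-shared key and the sequence number check are all hypothetical simplifications.

```python
import hashlib
import hmac


def make_tag(key, home_address, care_of, seq):
    """Authentication tag over the update fields (assumed pre-shared key)."""
    message = f"{home_address}|{care_of}|{seq}".encode()
    return hmac.new(key, message, hashlib.sha256).digest()


class HomeAgent:
    """Toy home agent holding home address -> care of address bindings."""

    def __init__(self):
        self._keys = {}      # home address -> shared secret
        self._bindings = {}  # home address -> (sequence number, care of address)

    def register(self, home_address, key):
        self._keys[home_address] = key

    def binding_update(self, home_address, care_of, seq, tag):
        """Accept an update only if it is authentic and newer than the last one."""
        key = self._keys.get(home_address)
        if key is None:
            return False
        expected = make_tag(key, home_address, care_of, seq)
        if not hmac.compare_digest(expected, tag):
            return False  # forged or corrupted update
        current = self._bindings.get(home_address)
        if current is not None and seq <= current[0]:
            return False  # replayed or stale update
        self._bindings[home_address] = (seq, care_of)
        return True

    def tunnel_destination(self, home_address):
        """Address to which traffic for the home address is currently tunnelled."""
        binding = self._bindings.get(home_address)
        return binding[1] if binding else home_address
```

An update carrying a tag computed with the wrong key, or replaying an old sequence number, leaves the existing binding unchanged, which is the robustness property the text calls for.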
2.5.1 Summary

Uniqueness: A mobile identity is a compound entity using two IP addresses: a 'home' address that functions as an endpoint identifier and a 'care of' address, which functions as a current endpoint location identifier. A mobile identity is intended to be unique.
Consistency: A mobile identity is consistent within the realm of mobile IP protocols.
Persistence: A mobile identity as a whole is not persistent. The endpoint identity value is intended to be persistent, while the endpoint location identifier is only intended to be valid while the mobile entity is located at the specified 'care of' location.
Trust: Of itself a mobile identity is not trustable. Mobile IP protocols add additional communication exchanges between the mobile entity and 'agent' entities in order to create trust in a mobile identity.
Robustness: A mobile identity is not intrinsically robust. The protocol exchanges between the mobile entity and its agents can create a robust mobile identity.
Withholding: Manipulation of mobile identities can include deliberate withholding of the current location information, or the persistent identity information.
Referential Consistency: Because of the temporal nature of the location identifier, mobile identities are not consistent over time.
Structure: The structure of a mobile identity is derived from the structure of the underlying IP address space.

2.6 Opportunistic Identities

This concept of maintaining some form of identity association while communicating within a potentially hostile environment has led to a proposal for an identity token that has its roots in public / private key pairs. In this approach the identity token is associated with the public key value of a public / private key pair.
A message encrypted with a private key can be passed to the other party, and only the originating party's publicly asserted identity (its public key) can decrypt the message. Such identity realms can serve to support a reliable assertion that the received message originated from the same party that originated the communication and that the message has not been tampered with while in transit.

The identity systems are opportunistic in that they are self-generated identities, and have no external structure. The implication is that such identities have no particular structure and may not be completely unique. For this reason their utility in other identity applications where persistence or referential integrity is required, such as acting as a persistent reference to other attributes of a named object, is limited.

2.6.1 Summary

Uniqueness: Opportunistic identities are not unique.
Consistency: Opportunistic identities may not be consistent.
Persistence: Opportunistic identities are not necessarily persistent.
Trust: Opportunistic identities are not trustable in general. There may be limited contexts in which an opportunistic identity may be considered trustable.
Robustness: Opportunistic identities are normally robust in the sense that they are not generally divulged, are not generated in a manner that is systematically predictable by a third party, and are often drawn from a sufficiently large space that they are resilient to guessing techniques.
Withholding: Opportunistic identities can be simple or compound tokens. Withholding is possible in the case of compound identity realms.
Referential Consistency: Opportunistic identities are normally bounded by a particular context of use, and are not referentially consistent outside of this context.
Structure: Opportunistic identities are not necessarily structured.

2.7 Domain Names

The set of identities described so far have no particular human-visible aspects of their function. The identity tokens are structured to meet a particular purpose, and are not intended, as their primary purpose, to be manipulated by humans, nor are they intended to be used primarily within the realm of human discourse.

By contrast, the Domain Name System (DNS) was specifically intended to be a name realm that is suitable to be included in human discourse, yet at the same time to admit enough structure to be manipulated by computer applications in a deterministic fashion. The DNS is essentially a hierarchical name space, where the hierarchical name structure allows the space to be efficiently searched and managed in a distributed fashion, but also supports one of the most desirable attributes for an identity space, namely uniqueness. The explicit hierarchy also assists in ensuring uniqueness, as DNS names are intended to be unique across the entire name string rather than just at the first component, so that "a.b.c" is a different identifier to "a.d.e" even though the first token in the domain names, "a", is the same in both cases.

The most common use of the DNS is to map domain names to IP addresses, but other uses are possible via mapping a name to a number of other defined 'resources'. The core functionality of the DNS is that of a unique, structured name space and a mapping capability that allows a query to be performed to retrieve the mapping information for a DNS name for a particular class of resource mapping.

The Domain Name System is more than a set of syntactic rules for constructing a well-formed DNS name. The resultant name, if well constructed and properly implemented, can be used as a referral token to a service environment.
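The two properties described above, uniqueness across the whole name string and a query that is specific to a class of resource, can be shown with a small sketch. The RECORDS table and the function names are hypothetical, and no network query is performed; the sketch only models the data relationships.

```python
def labels(name):
    """Split a domain name into lower-cased labels, ignoring a trailing dot."""
    return tuple(name.rstrip(".").lower().split("."))


def same_name(a, b):
    """Names are equal only if every label matches, not just the first one."""
    return labels(a) == labels(b)


def delegation_chain(name):
    """Names from the top of the hierarchy down to the full name."""
    parts = labels(name)
    return [".".join(parts[i:]) for i in range(len(parts) - 1, -1, -1)]


# Hypothetical resource data keyed by (name, resource type).
RECORDS = {
    (labels("www.example.com"), "A"): ["192.0.2.80"],
    (labels("www.example.com"), "AAAA"): ["2001:db8::80"],
    (labels("example.com"), "MX"): ["10 mail.example.com."],
}


def lookup(name, rrtype):
    """Return the mapping for one name and one class of resource, if any."""
    return RECORDS.get((labels(name), rrtype), [])
```

Here same_name("a.b.c", "a.d.e") is False although both names begin with the label "a", and a lookup returns only the resource type that was asked for.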
In this fashion the DNS encompasses a translation service that maps from domain names to defined resources, including IP addresses. For example, given a well-formed DNS name, a DNS lookup can query for a corresponding IP address. The DNS describes a data model, a set of relationships between data objects, and a protocol used to send queries and receive answers.

As DNS names provide a mapping from a name to a resource, the name does not need to change when the resource changes location or some other identifying attribute. The mapping changes, but the name remains constant, and for this reason domain names can be considered to be stable unique identifiers, residing within a structured space that can be efficiently searched and managed in a highly distributed manner.

2.7.1 Summary

Uniqueness: DNS identifiers are unique.
Consistency: DNS identifiers are intended to be consistent. There are a number of issues relating to character equivalence within various languages that impinge upon consistency of interpretation in some contexts.
Persistence: DNS identities are intended to be persistent.
Trust: The trustability of a DNS identity is based on the integrity of delegation within the hierarchy of the DNS identity.
Robustness: The DNS is implemented as a distributed name database, and the robustness of the DNS is based on the robustness of this database.
Withholding: DNS identities are capable of withholding. A DNS identity can be regarded as a DNS name and an associated set of resource records. Resource record values are withheld unless explicitly requested as part of a resolution query.
Referential Consistency: DNS identities are intended to be referentially consistent.
Structure: The DNS name space uses a hierarchical structure.

2.8 Uniform Resource Identifiers

When communicating, applications often need more information than a domain name.
For electronic mail, for example, the sending application must use a combination of the domain name, the TCP protocol, the mail delivery or mail agent's service port and the mailbox name of the recipient. Other applications require different compound identification objects, in accordance with their characteristics. This compound identity may be specified in the format of a Uniform Resource Locator, or URL.

Uniform Resource Locators (URLs) are a subset of a more generic form of resource identification, Uniform Resource Identifiers (URIs). As an identity space, the URI space is very loosely defined, and the extent to which it has spread across the world as a form of object identifier, or identity token, is quite remarkable. URLs refer to the subset of URIs that identify resources via a representation of their primary access mechanism. Other forms of URIs provide resource identification through a name scheme or by other attributes of the resource.

There are few syntax rules to the Uniform Resource Identifier space, and only a small amount of common semantic structure. The original IETF documentation, RFC 1630 [2], refers quite simply to a syntax of a prefix word, a colon, and a following string. Where there is hierarchy in the following string, slashes are used to delineate the hierarchical levels, and the hierarchy runs from left to right. The current generic syntax of URIs is described in RFC 2396 [7], and the only change to this generic syntax is to refer to 'schemes', as in "<scheme>:<scheme-specific-part>".
The common usage of URIs has been more structured than this general specification. Most URI schemes do not provide a single string that is an alias for an identity, but instead form an identity from the instructions that specify how to access the resource, in the same way as a postal address is often constructed from the instructions as to how to deliver a postal letter to you. This form of a URI, which can be viewed as a location specification, is the basis of the URL scheme. In other words such protocol-scheme URLs consist of what could be interpreted as a device selector, an application selector and an application-specific string that acts as an object reference.

Within such protocol-scheme URLs the scheme prefix is an identifier that uniquely identifies the service being referenced, or in terms of access it references the protocol and port address to be used. The first, or top, level of the hierarchical following string is either the DNS name of the server, or the DNS name coupled with some specific qualification, such as a mail address. Any subsequent hierarchical components represent service-specific instructions that lead to the referenced object.

Thus we have "mailto:user@domain.example.com" for a mail specification, where the scheme prefix "mailto" identifies the use of the TCP transport protocol, a port address of 25 and a protocol of SMTP. The following string, "user@domain.example.com", references the mail agent (a DNS lookup of "domain.example.com" for an MX resource record) and a value to be used in the protocol exchange (delivery to the mailbox "user@domain.example.com").
Similarly, "http://www.example.com/directory/hierarchy/index.html" for a specific web page uses "http" as a scheme identifier for TCP, port 80, protocol HTTP, the initial part of the following string to reference the server (a DNS lookup for an A or AAAA resource record for "www.example.com") and an HTTP protocol request for "www.example.com/directory/hierarchy/index.html".

In this form of the URL identity system, uniqueness is keyed from the general use of a DNS name within the URL. The wrapping around the DNS string takes the DNS name as an alias for an IP address and additionally specifies a service point, followed by the arguments that need to be provided to this service point in order to retrieve the referenced resource. In that way a protocol-scheme URL is closer to a description of an algorithm than to an identifier whose structure is adapted to tasks such as sorting, searching or equivalence operations.

There are issues with consistency here, in that while the hierarchically structured string set makes sense to one application it may not make any sense in the context of a different application. The persistence of protocol-scheme URLs is also an issue, in that the resource may change location over time, and the corresponding algorithm to locate the resource, or URL, must necessarily change as well.

The other major difference between a structured identifier space and the protocol-scheme URL approach is that the structured identifier space requires some form of lookup to apply the identity into a retrieval system. By changing the outcomes from the lookup operation, the identity owner can track changes in the location of the resource. In the protocol-scheme URL approach there is no way to understand how widely the identity has circulated, and it is not possible to update the in-circulation copies of the URL.
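The reading of a protocol-scheme URL as a set of access instructions, as in the "mailto" and "http" examples above, can be sketched as a decomposition into a service selector, a DNS name to look up, and a request value. The SCHEME_SERVICES table and the describe function are hypothetical and cover only the two schemes discussed in the text.

```python
from urllib.parse import urlsplit

# Illustrative table only: scheme -> (transport, default port, protocol).
SCHEME_SERVICES = {
    "http": ("TCP", 80, "HTTP"),
    "mailto": ("TCP", 25, "SMTP"),
}


def describe(url):
    """Break a protocol-scheme URL into the access steps it implies."""
    parts = urlsplit(url)
    transport, port, protocol = SCHEME_SERVICES[parts.scheme]
    if parts.scheme == "mailto":
        mailbox = parts.path
        return {
            "transport": transport,
            "port": port,
            "protocol": protocol,
            "dns_name": mailbox.rpartition("@")[2],  # domain part of the mailbox
            "dns_rrtypes": ["MX"],
            "request": mailbox,
        }
    return {
        "transport": transport,
        "port": parts.port or port,  # an explicit port overrides the default
        "protocol": protocol,
        "dns_name": parts.hostname,
        "dns_rrtypes": ["A", "AAAA"],
        "request": parts.path,
    }
```

The result reads as an algorithm (look up this name, connect to this port, issue this request) rather than as an opaque token, which is the distinction the text draws between URLs and structured identifiers.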
The property of the DNS is that, in themselves, DNS identities are simple structured tokens, and they require a lookup operation to be performed in order to produce an algorithm that allows an application to refer to a particular object. While such protocol-scheme URLs are widely used as service and resource identities, they pale in significance, persistence and utility when compared with DNS names. In other words URLs specify "how" to access a service, while generic DNS names can be interpreted as identity tokens that can be used to identify a resource that may host a service (or "who").

It is also not surprising from this perspective to see the emergence of DNS resource records that refer to URLs, as in NAPTR records [10]. In this approach the first DNS lookup retrieves one or more URLs that have been associated with the DNS name, and a second lookup is used to resolve any DNS names that may be referenced in the URL strings. In this framework a service may change its location, or the access algorithm may be altered (and by necessity, the URL changed), but the DNS identity that maps to this URL remains constant. This is one of the clearer forms of delineating identity from access mechanisms.

This mapping can also be used for service discovery. Given the name of a domain it is possible to look up NAPTR records to discover what URLs can be used for communication with that domain. This is used, for example, in the ENUM specification [11]. In ENUM, a DNS lookup of NAPTR records for a domain name created from an E.164 number yields, via transformation, a list of URLs. This gives the ability to know what URLs one can use in order to contact the entity referred to by a given E.164 number. The more general form of this approach can use NAPTR resource records to associate a DNS name with one or more resources.
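The ENUM step of creating a domain name from an E.164 number can be sketched as follows: the digits are kept, reversed, separated by dots and placed under the "e164.arpa" suffix. The NAPTR_RECORDS table and the contact_urls function are hypothetical; real NAPTR records carry order, preference, service and rewrite fields, whereas this sketch stores the resulting URL directly.

```python
def enum_domain(e164_number, suffix="e164.arpa"):
    """Build the ENUM domain name for an E.164 number."""
    digits = [c for c in e164_number if c in "0123456789"]
    return ".".join(reversed(digits)) + "." + suffix


# Hypothetical lookup results: domain -> list of (order, service, URL).
NAPTR_RECORDS = {
    "4.3.2.1.6.7.9.8.6.4.e164.arpa": [
        (20, "E2U+mailto", "mailto:info@example.com"),
        (10, "E2U+sip", "sip:info@example.com"),
    ],
}


def contact_urls(e164_number):
    """URLs that may be used to contact the entity behind the number, in order."""
    records = NAPTR_RECORDS.get(enum_domain(e164_number), [])
    return [url for _order, _service, url in sorted(records)]
```

The domain name stays constant as the identity, while the set of URLs behind it can be changed by updating the records.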
The name that has the NAPTR records can be considered as an identity token, while the associated NAPTR records provide a mapping from this identity to the instantiation of the identified service. This approach has been used in the Archive Resource Key (ARK) proposal [26].

Of course not all URIs are protocol-scheme URLs of the form outlined above. URIs are a very general construct where the initial "scheme" part of the URI determines the structure and semantics of the remainder of the URI string. The next section examines that class of URIs where persistence of the identity is a specific feature of the identity realm, the Uniform Resource Name.

2.8.1 Summary for URLs

This summary section explicitly refers to URLs rather than URIs. The more general case of URIs is unclear on all of these desirable identity attributes.

Uniqueness: URLs are intended to be unique.
Consistency: URLs are not necessarily consistent. A URL typically includes the specification of an application, an endpoint and some additional arguments for the application to apply to the application instance on the nominated endpoint. Inconsistent interpretation of URL components by other applications is possible.
Persistence: URLs are not necessarily persistent, as they implicitly identify how to access a resource or service, rather than identifying the resource or service per se. If the service or resource changes location, a new URL is required to reference the new location. Where a URI uses a DNS identifier as part of the URI scheme, as URLs do, the persistence of the URI also depends on the persistence of the underlying DNS identity.
Trust: URLs are not necessarily trustable.
Robustness: URLs are not necessarily robust.
Withholding: URLs are not capable of withholding elements of the URL identity.
Referential Consistency: URLs are intended to be referentially consistent, but are limited in terms of their persistence.
Structure: URLs are a structured identity space.

2.9 Uniform Resource Names

To solve the problem of lack of long term stability for references, URNs can be used as an alternative to recursive references into the DNS. URNs are generally considered not to be entirely within a human realm, as they often include what would appear to be a long random combination of characters.

URNs are intended to be globally unique, and never reused. As long as a named object exists, it retains that name. An object can have many names. The object may cease to exist, in which case the URN can no longer be resolved, because the resolution service (from URN to URI) is no longer working, but, as the name exists (virtually), a new service can be created and the object re-established if there is need for it.

RFC 3305 [14] describes in more detail the different views that exist on the relationship between URIs, URLs and URNs.

2.9.1 Summary

Uniqueness: URNs are intended to be unique.
Consistency: URNs are intended to be consistent.
Persistence: URNs are intended to be persistent.
Trust: It is unclear how trust relationships are formed with URNs.
Robustness: As with trust relationships, the robustness properties of URNs are unclear.
Withholding: It is unlikely that URNs can withhold parts of the URN.
Referential Consistency: URNs are intended to be referentially consistent.
Structure: URNs include unstructured components.

2.10 Human Friendly Strings

URIs have a problem that URNs did not solve, namely the ability of humans to remember them. Humans act in a context, so global uniqueness is not important at this level of abstraction. Instead, when a human uses a name, they normally want a resolution service that "does what they want".
In this realm the context of the name is an important factor in resolving the name to an object, and global uniqueness is neither necessary nor assumed.

This area of human friendly strings is a topic of ongoing work. One possible goal for a working system is to be able to handle the so-called "side of the bus" problem. A human sees something in an advertisement on the side of a bus, remembers it (or remembers part of it), and when they come to a computer they try to get more information about what they have seen. This involves complex language and localization (and internationalization) problems.

There have been various ideas connected to "layers above DNS", for example those mentioned in RFC 3467 [19] (the subject of the SIREN Research Group in the IRTF). This topic encompasses an effort to decouple the naming realms that make sense to humans, with their various forms of implied context for resolution, from the naming realms that work for computers, with the implication of explicit specification of resolution, and to define a mapping between them.

The DNS can't handle the types of names that often make sense to people, because people always work in a context (such as a geographical context of 'locality'), and it's no longer sufficient for people to fit their needs into what DNS can handle. For some time it was considered possible to overload the semantics of the DNS label (machine-parseable, vaguely human-recognizable) but it is becoming evident that this is not a tenable approach, and some distinction needs to be drawn between DNS names and context-sensitive human-friendly strings.

No real human friendly naming system exists today on the Internet.

2.10.1 Summary

Uniqueness: HFS are intended to be unique within a context of discourse.
Consistency: Unclear.
Persistence: Unclear, although it would appear to be a desirable attribute in this context.
Trust: Unclear.
Robustness: Unclear.
Withholding: Unclear.
Referential Consistency: Desirable.
Structure: Unclear.

3. Issues with Identities

3.1 Overloading the IP Address

An IP address suffers from semantic overload in attempting to carry both location and some form of constant identity. If a network or individual device changes access providers then this is, in effect, a change in network location, and if provider-based address aggregation is being used, then the local IP address will change. The same issue applies with mobile devices. This implies that an IP address is not necessarily a permanent or truly persistent association with a device, and such impermanence is a weakness in any persistent identity system.

Another issue with IP addresses, at least in version 4 of the protocol, is that of their total span. While 32 bits is still a large size, encompassing some 4.3 billion unique addresses (2^32 = 4,294,967,296), there is an inevitable level of wastage in deployment, and a completely exhausted 32 bit address space may only encompass a connectivity realm of perhaps 1 or 2 billion IP devices. When this is coupled with a world of embedded IP devices in all kinds of industrial and consumer applications, 1 or 2 billion addresses is insufficient to provide unique addressing to every possible device.

In response to these address pressures there has been the introduction of a number of technologies that dilute the strong binding of IP address with identity. Such approaches tend to treat the IP address purely as a routing and forwarding token without any of the other attributes of identity, including persistence and public association.
For example, DHCP, or address-lending, is a commonly used method of extending a fixed pool of IP addresses over a domain where not every device is connected to the network at any given time, or when devices enter and leave a local network over time and need addresses only for the time they are within the local network's domain. In this form of identity, the association of the device with a particular IP address is temporary, and hence there is some weakening of the identity concept, as the dynamically-assigned IP address is being used primarily for routing and forwarding. This has been taken a further step with the use of dynamic Network Address Translation (NAT) approaches, where a single device has a pool of public addresses to use, and maps a privately used address to one of its public addresses when the private device initiates a session with a remote public device. The private-side device has no direct knowledge of the public address that the NAT edge will use for the session, nor does the corresponding public-side device necessarily know that it is using a temporary identity association to address the private device. These forms of changes to the original semantics of an IP address are significant architectural changes to the concept of identity at the level of IP, particularly in the presence of NATs. The widespread deployment of such approaches continues to underline the concept that as an identity token there is a lack of persistence in an IP address, and the various forms of aliasing weaken its utility as an identity system. The conclusion drawn from these observations is that, increasingly, an IP address, in the world of the IPv4 Internet, is being seen primarily as a locality token with a very weak association with some form of identity. Version 6 of IP represents an effort to restructure the address field, and the 128 bits of address space represents a very large space in which to attempt to place structure. 
One of the more innovative concepts that was discussed within the development of IPv6 was extending the concept of the IPv6 interface identifier field of the address to be a globally unique identifier. This had some obvious connotations in being able to identify when the connectivity for a device has changed, as in such cases the globally unique interface identifier could remain constant while the routing prefix may have changed. There were also some potential applications in the area of supporting multi-homed networks, where a local network could be seen via different routing prefixes.

At present these aspects of IPv6 address architecture are the topic of ongoing work in the IETF. One of the fundamental issues with this form of approach is management of an interface identifier space that is globally unique and persistent, as well as being adequately robust. Current directions of activity in this area indicate that the self-assertion of identity using this field within IPv6 is insufficiently robust to prevent various forms of redirection attacks. Approaches currently being investigated are looking deeper into mechanisms that are intended to provide corroboration of identity assertion in the face of locator change, and additional protocol mechanisms appear to be a common feature of the current proposals relating to multi-homing and aspects of mobility.

3.2 Dynamic DNS Updates and Nomadism

An alternative to revising the semantics of the IP address is to move the role of persistent identity into the DNS. Here the constant identity of the device is its DNS name. In a mobile context, as the device or network roams across the network, it uses a sequence of secure dynamic incremental updates to the DNS to update the association of the constant DNS name with the new local IP address.
This approach has possible applications in various multi-homing scenarios.

However, this approach is not without attendant considerations. Much of the leverage of the DNS as an efficient lookup mechanism is based on extensive use of local caching of DNS information. Increasing the responsiveness of the DNS to dynamic updates implies that the extent to which cached information can be retained is compromised, and any cache has to refer more frequently to the primary source to refresh the currency of the local cached copy. The tradeoff here is greater DNS traffic loads and increased DNS server query loads in order to get a more responsive name system. Such a mechanism also requires an "always available" primary DNS server to accept the incremental updates, so that the failure backup mechanism of the DNS with primary and secondary servers is compromised in this nomadic model by the requirement for primary server availability in order to undertake an authoritative update to the DNS.

An alternative approach is to equip the DNS with an additional resource record that contains an identity value in addition to the current A or AAAA address values. This approach can be used in conjunction with an additional element within the protocol stack that allows the transport layer to operate using this identity field. The new stack element provides a dynamic mapping between this identity and a 'current' locator value, where the equivalent current locator is passed into the IP protocol element.

An alternative to this approach of changing mappings is to place the responsibility for the redirection into the application protocol. For example, with SIP, the mobile node could use the REGISTER method to change its registration for session setup. This may not be fast, but may be faster than dynamic DNS updates and perhaps even fast enough for initiating new sessions.
A mobile HTTP server, on the other hand, would have to use HTTP Redirect from a fixed server whose address was in the DNS.

3.3 URLs and Persistent Identifiers

URLs are, as their name suggests, locators rather than location independent identifiers. When the resource is relocated, or when multiple copies of the same resource exist, the URL scheme cannot persist across the change. Despite the almost universal use of the URL within web browsers, URLs are not an ideal candidate for a persistent identity.

This weakness in the URL scheme has led to the consideration of many alternate naming schemes, although the underlying requirements for any candidate naming scheme are that it is cleanly mappable into a URI-styled format and that there is a robust resolution system associated with the name scheme. Resolution is a critical factor here, as without the ability to operate in a predictable, robust, scalable, trustable and reliable manner when translating an identifier into a resource, entity or service access description, the identifier scheme is of dubious value.

The requirement for persistent identifiers is not intended to dispense with URLs, or similar forms of locators and service descriptors, but to separate the notions of identification and location, to use a distinct label space for each concept, and to use a resolution mechanism to map from the identifier to the location descriptor.

Work on the development of a unique permanent identifier space has proceeded concurrently with the formalization of URL schemes, using the name of URN (Uniform Resource Name) schemes. A specification outlining the minimum requirements of the URN can be found at [3]. The syntax of the URN as expressed in [6] is as follows:

   urn:<NID>:<NSS>

where <NID> is the Namespace Identifier and <NSS> is the Namespace Specific String. The NID ensures the global uniqueness of the identifier.
The NSS (Namespace Specific String) can take any form specified by the naming authority provided that it is unique within that namespace. The simple structure of the identifier reflects recognition of the need to accommodate different requirements and different schemes. Because the local, or namespace specific, string can be in any form, the identifier structure allows maximum flexibility in the identifier while providing a mechanism to assure global uniqueness and facilitating interoperability between discrete systems.

There is a need to distinguish between naming schemes and resolution systems. A naming scheme, as a procedure for creating unique URNs that conform to a specific syntax, is independent of the resolution service that resolves the URN to locate the resource. Ideally, a naming scheme should not be tied to any specific resolution system, and a resolution service should be capable of resolving a URN from any given name scheme. This objective is consistent with the intentions behind the development of the URN. A persistent identifier, especially when used for archival data, must of necessity be capable of outlasting any systems and protocols that are currently in use. However, the lack of a commonly agreed upon resolution system is also a major obstacle to the wide deployment of URNs.

A variety of solutions have been proposed, including the NAPTR (Naming Authority PoinTeR) DNS resource record [10], which provides rules for mapping parts of URIs to domain names and then using these domain names as DNS lookup queries to find mapped URIs. This specification has been further refined as the Dynamic Delegation Discovery System (DDDS) [15][16][17][18]. As noted in RFC3404 [18]:

"For the short term, the Domain Name System (DNS) is the obvious candidate for the resolution framework, since it is widely deployed and understood. However, it is not appropriate to use DNS to maintain information on a per-resource basis. First of all, DNS was never intended to handle that many records.
Second, the limited record size is inappropriate for catalogue information. Third, domain names are not appropriate as URNs. Therefore our approach is to use the DDDS to locate "resolvers" that can provide information on individual resources, potentially including the resource itself."

There appear to be some residual issues over the status of URNs. For URNs to achieve widespread deployment, not only is consensus on functional requirements and syntax needed, but the ability to recognize and resolve URNs should also be incorporated into the application realm. For example, it would be a reasonable objective to incorporate URN support in standard Web browsers. However, a prerequisite for this step is the definition and construction of the necessary resolving infrastructure, developed either by leveraging the existing Domain Name System or by some other route. As long as application developers are uncertain of what is to be accepted as a standard resolution mechanism, and while naming scheme developers are uncertain of how to register their name and resolution schemes, these issues will not be fully resolved.

Until the resolution issues are clarified and there is clear consensus to adopt a particular specification, implementation of URN systems will require some form of application level assistance by way of proxy servers. The implication is that use of URNs will require encapsulation in a URL in order to specify the appropriate proxy server address.

This approach has already been undertaken in the specification of PURLS [24]. A PURL is a conventional URL with two parts: a reference to a resolver, which specifies a PURL resolution service, and a name part, which the resolution service translates to the resource URL. In a web-based context this is handed back to the client as an HTTP redirect.
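The PURL resolution step can be sketched as follows. This is an illustration under stated assumptions rather than a description of any actual PURL server: the host purl.example.org, the table contents and the function name resolve_purl are hypothetical, and the 302 status is assumed as one plausible redirect code.

```python
from urllib.parse import urlsplit

# Hypothetical resolver table: PURL name part -> current resource URL.
PURL_TABLE = {
    "/docs/report-42": "http://archive.example.net/2004/report-42.html",
}


def resolve_purl(purl, table):
    """Return (status, headers) as a resolver might answer a GET for a PURL.

    The resolver part of the PURL (scheme and host) selects the service;
    only the name part (the path) is looked up in the table.
    """
    name_part = urlsplit(purl).path
    target = table.get(name_part)
    if target is None:
        # Unknown name: nothing to redirect to.
        return 404, {}
    # Known name: hand the current location back as an HTTP redirect
    # (status code assumed for illustration).
    return 302, {"Location": target}
```

When the resource moves, only the table entry changes; the PURL handed out to users stays the same, but the client must still speak HTTP to follow the redirect.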
The dependency of the identifier scheme on the behavior of a particular application (namely HTTP in this case) is not the most desirable of attributes for an identity scheme. If the PURL were to be used in a different context by a different application, a comparable redirection mechanism would be required to support the desired outcome.

In comparison, the Handle system [20] uses a non-URL name scheme, and resolution in applications requires modification of the application. The 'handle' itself is a persistent identifier with a two-part syntax of "<naming authority>/<name>", where the naming authority is an administrative unit authorized to create and maintain handles and the name of the resource is a string which must be unique to that authority but which has no prescribed syntax. Handles can be used through standard web browsers using a plug-in, or through unmodified web clients using proxy servers and embedding the handle within a URL that specifies a handle resolver, in a manner similar to the PURL approach. The specification of a distinct handle syntax allows handles to be used in a broader set of contexts than web browsing, as the identifier is independent of any particular access protocol and server location.

The issue of resolution of the compound identifiers remains problematic, and embedding the URN into a proxy URL to undertake redirection can be argued to defeat the purpose of having location and protocol independent identifiers, since the resultant identifier includes the location of the proxy agent.

The full value of persistent identifiers to ensure persistence in citations can only be realized if they are actually useful when citing documents and objects. In order to use them, the user must know that there is a persistent identifier and must be able to discover what it is and how to resolve the identity.
At present this is difficult because of the nature of the redirects used in most existing systems.

4. The DNS in Identity Spaces

How good are any of these identities? Which one should be used in which context?

Each of these digital identities has a context of usage, or realm of discourse, and outside of that realm it tends to break down as a cohesive and useful identity tool. Offering a MAC address as an email point of contact makes little sense, even though it could conceivably be used to form a unique identity in the mail realm. Identities are often distinguished by whether they offer an identification at the appropriate level of abstraction, one that describes the mode of contact and the identity in a form that matches the actions at that level. At the level of human interaction we commonly identify email addresses using a domain and user name part. We do this because this is what a sender needs to enter into a mail application in order to send a message.

There are considerations when generating identity spaces based on generic descriptions of algorithms of how to access the specific resource, trigger the particular application or contact a particular individual or role's network point of presence. These considerations, commonly found in conjunction with URIs, include maintaining referential integrity, allowing efficient searching, and persistence of the identity.

The human world, and its digital counterpart, is far from static. Any identity system that aspires to be useful in a human space needs to be able to support a maintenance function that allows any implicit reference that is contained in an identity space to be updated and refreshed in a reliable, trustable and timely manner. Knowing who you were is a less important piece of information than knowing who you are right now.
That leads to consideration of structured identity spaces whose two major attributes are:

   o  sufficient structure to ensure that specific instances of the identity are unique, and

   o  appropriate structure to allow rapid lookup of the identity to be able to retrieve the current set of associated pointers within various specified realms.

There is a good match between these desired attributes and those of the DNS, and one perspective to be drawn from this is that the major underpinning of useful and lasting digital identities rests within the framework of the DNS. In other words, any useful identity space is highly likely to have managerial and operational characteristics that would largely parallel those of the DNS.

4.1 The role of the DNS

Different identities are used in the Internet for different purposes. IP addresses are essential at the level of forwarding protocol data units across the network, but are unwieldy to use in the context of naming resources and services at the level of human operation of applications. In the context of URIs, the use of a DNS identity within a URI ensures that the identity of a service doesn't have to be changed when the IP address changes. The domain name creates an abstraction layer above the IP addresses that allows a service to be identified without particular reference to its current location within the Internet, using a name realm that has better properties for human use.

We could use something else, like static tables, databases or more similar systems like X.500. But none of these alternatives has been able to prove that it scales as well as the DNS. Both the protocol itself and the data model with its distributed delegation have proven to be extremely efficient (even though some things could be "better").
The perspective being espoused here is that we don't have any current viable alternative to the DNS in terms of a structured identity space that supports mapping across identity realms. Even if we stop using domain names in URIs and instead use something else, deploying a translation service from this other name to IP addresses would inevitably involve recreating much of the functionality of the DNS.

4.2 Changing the DNS

Because the DNS is the service we use for mapping between many of the namespaces we use on the Internet, it is extremely important that it works. Because of this, changes to the DNS must be made with care. This applies to changes to the protocol as well as to the DNS data model.

Examples of changes to the protocol include the need for DNSSEC (signed record sets), which makes it possible for a recipient of a DNS response to verify whether it comes from an authoritative source. This has been discussed in the IETF for some years, and is illustrative of the required level of care in the design of changes to the DNS.

An example of a change to the database structure is a move from a hierarchical to a flatter namespace. The result might be a disruptive change of DNS traffic on the global Internet, which in turn might make further scaling difficult. Another similar change is allocation of names which are not registered properly. Especially in the root zone, this leads to problems such as the inability to later allocate and set a policy for the domain, and an increased number of queries for non-existing names in the root zone when names leak from networks presumed to be closed. Examples of the former are the very large TLDs like .com and .de.
An example of the latter is the use of the pseudo TLDs '.local' and '.gprs', which are used in private or enterprise contexts without any proper definition or registration, and the consequent leakage of queries into the 'public' DNS.

4.3 The DNS is a strict lookup service

When a query is sent to a server, the server is to send the same data back regardless of context. Further, the server should send either a "match", which consists of one or more resource records, or a "failure", which includes the special response "no such domain". This implies that two users sending the same query from two different locations at the same time should receive the same data in response. Likewise, the same user using two different computers with different operating systems should receive the same data.

Having the DNS server do a "search", undertake "fuzzy matching" or infer some additional context for a query that guides the server to choose a particular response is ill-advised. The DNS server cannot know the context of the query, nor should it guess what the DNS response is to be used for. It is always tempting to assume that the response is to be used by the most popular operating system for the most popular application of the day. It must be remembered, though, that other operating systems and other applications might break when fuzzy matching happens.

For example, instead of giving back a "no such domain" response it is conceivable to give back something which pushes a potential error to the application layer, by returning a synthesized answer that has resource records pointing to some form of application-level service. This implies that the DNS server must know what application layer protocol is in use, and that a "no" at the application layer has the same semantics as a "no" on the DNS (naming) layer.
Often the application protocol runs over TCP, which implies that a "no" might only be signalled to the other end by not accepting the connection. The querying client then cannot differentiate between "no such (dns) name" and "no response in application protocol".

4.4 Coherency of the DNS

DNS is a bootstrap mechanism that publishes your data in a manner that allows queries from others to be answered. If you make mistakes in your local DNS configuration then you don't destroy the utility of the DNS for yourself, but you destroy the ability for others to contact you. Someone trying to reach your webpage might not be able to do so as they cannot find the proper translation from your domain name to the IP address of the web server.

It is also the case that mis-configurations most often happen in the glue between parent zone and child zone, and not in the child zone itself. Because of this, if you know where your nameserver is, you might not see the errors, as they have to do with finding the nameserver, and not with its content.

As mentioned before, it is very important that the same response is sent back regardless of where the query is sent from. The assumption within the DNS is that you should be able to pass a URI with an embedded domain name in it to all of your friends, and they all should be able to resolve it in an identical fashion. It is extremely important that domain names are globally unique, and lead to the same result every time, and from every location.

Part of the coherence requirement is that the servers must be able to give back the same response to the same query. The implication is that all servers have to use the same matching algorithm when attempting to locate a match between a query and the local data used to form a response. The matching algorithm cannot differ between servers, because they would then give back different results for the same query.
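As a minimal sketch of what a shared matching algorithm means in practice, the fragment below compares domain names using the ASCII-only case folding that DNS name comparison relies on. The function names are illustrative and are not taken from any DNS implementation; the point is only that the rule is fixed and context-free, so any two servers applying it reach the same answer.

```python
def fold_ascii(label):
    """Lower-case only the ASCII letters A-Z; leave every other character alone."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in label
    )


def names_match(query, owner):
    """Compare two domain names label by label using the ASCII-only folding.

    A single trailing dot (fully qualified form) is ignored on either side.
    """
    q_labels = [fold_ascii(label) for label in query.rstrip(".").split(".")]
    o_labels = [fold_ascii(label) for label in owner.rstrip(".").split(".")]
    return q_labels == o_labels
```

A server that instead folded characters according to a local rule would disagree with its peers on whether two names are equal, which is precisely the incoherence described above.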
Complications arise when considering this in the context of use of various character sets within the DNS. Having each server use a local set of rules that defines equivalence of characters leads to the same query generating different responses. The implication is that the question of different matching/equality rules can be addressed by creating "bundles" of characters which are to be treated as equal, and solving the problem at the time of registration. This gives a greater choice for the registrant, and it can also give greater freedom regarding context, as the bundles may look different depending on such things as (parent) domain and language.

4.5 The DNS as an Identity Glue

When comparing the desired attributes of a useful identity system to the properties of the DNS it is evident that there is a reasonable level of fit between the DNS and a generic identity realm. The DNS provides a namespace that ensures uniqueness, is consistent, and can support persistence and referential consistency. The space is structured in a manner that supports relatively efficient lookup over a large name space that has both hierarchical structuring and, within that, some areas of large flat name spaces. The DNS can support trust models in terms of being able to validate the authenticity of responses. The DNS can support a variety of resource records that allow a DNS name token to be used as a search object that can map to related values drawn from other identifier realms, as well as supporting indirect self-reference through the use of NAPTR records and URIs.

There are obvious trade-offs in the design, protocol and deployment of the DNS in terms of resiliency, dynamic behaviours and scalability.
While it is not argued here that the DNS represents the only optimal trade-off between these properties, it is argued that any other identity space with similar properties will be faced with precisely the same set of trade-offs. It is also probable that any similar identity space faced with the same requirements of scalability, operational performance, accuracy and validity of responses, and flexibility of mapping the identity space to related objects in other identity realms would find a resolution between these requirements in a manner that would not differ markedly from the DNS.

The salient observation here is that an identity system acts generically as a reference to an initial point of rendezvous in a communication transaction. In this vein the role of the identity system is to identify how other parties in the network can refer to the identified element using an identity token that is persistent, with associated referential mappings into other identity realms that reflect the current status of the element. Once a communications state has been established using the rendezvous points, if there are characteristics of the application that require the subsequent exchange of information (such as location changes in a mobility environment, or a server hand-over at the application level), this is generally the task of components within the protocol stack, using a trust relationship between the communicating parties to alter the identity elements used within the stack to match the changing characteristics.

5. Security Considerations

Any identity system that provides a mapping from an identity value within one realm to an identity value (or set of values) within another realm will present a number of considerations with respect to security.
The trust model for an identity system is that the mapping supported by the identity system is authentic, and that when the identity value is used as a key in a query operation, the response should be an accurate response that correctly represents the mapping originally provided by the assigned holder of that identity value. Equally, it is necessary to correctly report responses where an invalid or unassigned identity value is used, providing the query agent with a clear indication that the identity value is not assigned.

In a hierarchically structured identity space there are a number of potential weak points, where vulnerabilities exist for third parties to intercept queries and substitute a non-authentic response. This could involve misrepresentation of the root servers for the hierarchy, or misrepresentation of delegation points, as well as misrepresentation of responses for particular mapping queries. Any design of an identity space resolution service should be resilient to these forms of attack, by using appropriate mechanisms to reduce the risks of interception and misrepresentation in identity resolution operations. However, recognizing the lack of absolute assurances that a resolution system is resilient to all forms of attack, a resolution service should also be capable of exposing the trust model that exists within the identity space, and give a user of the resolution service the ability to validate the response against the trust model. In other words, authenticity should be a verifiable quality of the identity realm, rather than simply being an assertion that is interpretable only as an article of faith.

6. Acknowledgements

The editors acknowledge the contributions made by Ran Atkinson, Brian Carpenter, Vint Cerf, Leslie Daigle, Joel Halpern and James Kempf in the preparation of this document.

7. Informative References

[1]  Saltzer, J., "On the Naming and Binding of Network Destinations", RFC 1498, August 1993.
[2]  Berners-Lee, T., "Universal Resource Identifiers in WWW: A Unifying Syntax for the Expression of Names and Addresses of Objects on the Network as used in the World-Wide Web", RFC 1630, June 1994.

[3]  Sollins, K. and L. Masinter, "Functional Requirements for Uniform Resource Names", RFC 1737, December 1994.

[4]  Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G. and E. Lear, "Address Allocation for Private Internets", BCP 5, RFC 1918, February 1996.

[5]  Carpenter, B., Crowcroft, J. and Y. Rekhter, "IPv4 Address Behaviour Today", RFC 2101, February 1997.

[6]  Moats, R., "URN Syntax", RFC 2141, May 1997.

[7]  Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource Identifiers (URI): Generic Syntax", RFC 2396, August 1998.

[8]  Kent, S. and R. Atkinson, "Security Architecture for the Internet Protocol", RFC 2401, November 1998.

[9]  Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

[10] Mealling, M. and R. Daniel, "The Naming Authority Pointer (NAPTR) DNS Resource Record", RFC 2915, September 2000.

[11] Faltstrom, P., "E.164 number and DNS", RFC 2916, September 2000.

[12] Narten, T. and R. Draves, "Privacy Extensions for Stateless Address Autoconfiguration in IPv6", RFC 3041, January 2001.

[13] Droms, R. and W. Arbaugh, "Authentication for DHCP Messages", RFC 3118, June 2001.

[14] Mealling, M. and R. Denenberg, "Report from the Joint W3C/IETF URI Planning Interest Group: Uniform Resource Identifiers (URIs), URLs, and Uniform Resource Names (URNs): Clarifications and Recommendations", RFC 3305, August 2002.

[15] Mealling, M., "Dynamic Delegation Discovery System (DDDS) Part One: The Comprehensive DDDS", RFC 3401, October 2002.

[16] Mealling, M., "Dynamic Delegation Discovery System (DDDS) Part Two: The Algorithm", RFC 3402, October 2002.

[17] Mealling, M., "Dynamic Delegation Discovery System (DDDS) Part Three: The Domain Name System (DNS) Database", RFC 3403, October 2002.

[18] Mealling, M., "Dynamic Delegation Discovery System (DDDS) Part Four: The Uniform Resource Identifiers (URI)", RFC 3404, October 2002.

[19] Klensin, J., "Role of the Domain Name System (DNS)", RFC 3467, February 2003.

[20] Sun, S., Lannom, L. and B. Boesch, "Handle System Overview", RFC 3650, November 2003.

[21] IEEE, "Guidelines for use of a 48-bit Global Identifier (EUI-48)", December 2003.

[22] IEEE, "Guidelines for 64-bit Global Identifier (EUI-64) Registration Authority", December 2003.

[23] IEEE, "802.11 Wireless", December 2003.

[24] OCLC, "PURLS: Persistent Uniform Resource Locators", December 1995.

[25] Shoch, J., "Internetwork Naming, Addressing, and Routing", Proceedings of the 17th IEEE Computer Society International Conference pp. 72-79, December 1978.

[26] Kunze, J. and R. Rodgers, "The ARK Persistent Identifier Scheme", draft-kunze-ark-08 (work in progress), July 2004.

Authors' Addresses

   Patrik Faltstrom (editor)
   Internet Architecture Board
   EMail: paf@cisco.com

   Geoff Huston (editor)
   Internet Architecture Board
   EMail: gih@telstra.net

Appendix A. IAB Members

Internet Architecture Board Members at the time this document was drafted were:

   Bernard Aboba
   Harald Alvestrand
   Rob Austein
   Leslie Daigle
   Patrik Faltstrom
   Sally Floyd
   Jun-ichiro Itojun Hagino
   Mark Handley
   Geoff Huston
   Pete Resnick
   Bob Hinden
   Eric Rescorla
   Jonathan Rosenberg

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Disclaimer of Validity

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

Copyright (C) The Internet Society (2004). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.
Acknowledgment

Funding for the RFC Editor function is currently provided by the Internet Society.