INTERNET-DRAFT R. Kahn IETF Stream CNRI Intended Status: Informational A. Maffei WHOI Expires: 26 December 2012 24 June 2012 Terminology and Use Cases for Interoperability of Identifier Resolution Systems draft-kahn-dsii-id-res-sys-00 Abstract Identifier Systems have been in existence for many years and are in widespread use. An opaque identifier conveys essentially nothing about the information identified and must be converted into useful intermediate information, known as state information, before the information being identified can be accessed or used. An identifier that is can be processed by a system in the Internet to provide useful and actionable intermediate state information is said to be resolvable. Although there is ongoing discussion about the value of opaque versus non-opaque identifiers, we assume the existence of a minimal syntax structure for the identifier and also that the resolution mechanisms only knows about the minimal syntax of the identifier. In this document we introduce the notion of interoperability of identifier resolution systems as a key component for enabling interoperability of heterogeneous information systems more broadly, and discuss the role of unique persistent resolvable identifiers. Terminology is proposed to facilitate this discussion and a set of use cases demonstrating the need for interoperability of resolution systems are included. Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." Kahn and Maffei Expires 26 December 2012 [Page 1] INTERNET-DRAFT 24 June 2012 The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html Copyright and License Notice Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. 1. Introduction Identifier Systems have been in existence for many years and are in widespread use in society. They range from the simplest systems of enumeration to more complex means of conveying information about the things, values, services, or other identified items. Opaque identifiers, such as strings from cryptographic hash functions, convey essentially nothing about the information identified and must be resolved before they can be used in a digital environment. An identifier that can be processed by a system in the Internet to provide useful and actionable state information is said to be resolvable. In this document, we focus on resolvable identifiers and the resolution systems that support them. Specifically, we introduce the notion of interoperability of resolution systems and discuss the notion of unique persistent resolvable identifiers along with a common terminology for discussing these systems. This capability has been implemented as part of the Digital Object Architecture [1]. Finally, a set of selected use cases that support this notion is also provided. Kahn and Maffei Expires 26 December 2012 [Page 2] INTERNET-DRAFT 24 June 2012 2. Definitions Identifier While some systems of identification have used names and semantics to identify a variety of things, many other systems have used combinations of syntax and semantics. In this document we assume no semantics are available in the identifier and only enough syntax for the resolution system to understand how to resolve it. Identifier Generation Rules These rules specify how to create a well-formed identifier for use within a resolution system. Normally, the rules will consist of syntax constraints, but may go beyond that, as in requiring that some or all of an identifier be generated in a specific manner. Identifierspace This refers to the set of identifiers that are generated using the rules. Digital Object A digital object consists of structured data in the form of a bit sequence or a set of bit sequences that can be interpreted by a computer or other computational facility. Each sequence consists of a type-value pair, at least one of which is the associated unique persistent identifier for the digital object. A digital object may incorporate separately identifiable digital objects or only the identifiers for those objects or a combination of the two. The digital objects need not all be stored in one place, or stored at all, but if they are stored, they are generally assumed to be accessible from the one or more known locations in the Internet. Indeed some digital objects may constitute mobile programs that given specific tasks to perform in the Internet, and would interact with other programs, registries, repositories or information resources structured as digital objects in order to carry out its assigned tasks. Resolution System A system in the Internet that accepts an identifier as input and produces useful and actionable "state information" such that, for example, the digital object associated with that identifier may be accessed and used. This state information is intended to inform the user or the user's system as to specific actions to take, restrictions on actions and associated information such as a means Kahn and Maffei Expires 26 December 2012 [Page 3] INTERNET-DRAFT 24 June 2012 of authentication, public keys, and whether certain actions will incur costs or require allocation of significant resources such as storage. It is assumed that many different resolution systems may exist in the Internet. Each such system may have its own identifier rules and produce information in different ways. Uniqueness An identifier is said to be unique at a given time with respect to a given resolution system if there is only one authorized set of useful and actionable information held by the resolution system for that identifier, and that information is produced by a resolution request with that identifier. Persistence An identifier is said to be persistent if its associated resolution system can be relied upon to return useful and actionable state information about the identified digital object over a period of time that is at least as long as the digital object exists and in the face of changes to the digital object, such as ownership and location, without the identifier itself changing. The period of time may vary somewhat depending on the domain of use. For example, the identifier may outlast the identified digital object, such that a reference to it using the identifier can be resolved to useful and actionable state information even when the digital object is no longer available. The quality of persistence is dependent on a combination of social and business infrastructure, e.g., an organization assuming responsibility for maintaining the resolution information and underlying resolution system and technology. Immutability The term immutable refers to a digital object and its associated identifier that is fixed and does not change. Other than the digital object identifier, however, other aspects of the digital object metadata may change over time as, for example, the object is moved from place to place, as long as the basic structured information that comprises the digital object does not change over time. Mutability This term refers to a digital object that is not immutable. Kahn and Maffei Expires 26 December 2012 [Page 4] INTERNET-DRAFT 24 June 2012 Structured Data and Metadata Normally, a resolution request leads to one of two basic results. One result is obtaining structured data in digital form that represents a service, resource or other information represented in digital form such as a document, bill of lading, or movie. The other result is obtaining metadata about the service, resource, or information. In this document, we focus on metadata that is used for rapid reaction by the user's system, as in clicking an identifier to find the location(s) of the item, or even the location of the systems to provide parts of the metadata. More general forms of metadata, such as might typically be used in a search system, would not be provided by the resolution system; rather, the user could be referred to a more substantive registry of such information with advanced browsing and searching capabilities. 3. Use cases for Actionable Identifier Resolver Interoperability Interoperable Proxying Let us assume a user of resolution system A needs to resolve an identifier known only to resolution system B. We assume the user does not have the software needed to use resolution system B, or else the user might try to resolve the identifier directly from resolution system B. Rather, the user asks resolution system A to do the resolution and system A takes on the responsibility to ask resolution system B. These two resolution systems are interoperable if a resolution request sent from resolution system A to resolution system B will succeed in producing the information in a form usable by the party who requested it. Typically, this will require a kind of service level agreement (SLA) between the two resolution system administrators to share information about how they store and otherwise manage resolution information in order to produce a useful response to the user, and also to agree on what various fields mean in order to do an effective translation into the terms of the asking resolution system. This document does not address how such SLAs should be structured or how any resolution system should operate internally. Most likely, only a small number of resolution systems will be desirable for handling the bulk of identifier resolution requirements. A list of such systems could be made available in the Internet, if desired. Each resolution system may have its own means of providing security, but this document does not address how security may be achieved for the overall interoperable proxy service. Kahn and Maffei Expires 26 December 2012 [Page 5] INTERNET-DRAFT 24 June 2012 Distributed replica management Retention and caching decisions depend on the status of replicas of digital objects managed by multiple, distributed services. Duplicates may, in principle, have different identifiers in different systems. In order to query or harvest the status of these digital objects, to make retention or migration decisions (e.g., deduplication, or migrating a digital object which a service is about to drop), it might be necessary to resolve multiple identifiers against multiple resolution services. Provenance tracking In a complex distributed workflow, source data and related data services may be managed separately, perhaps via heterogeneous systems that rely on different identifier systems. In order to reason over provenance it may be necessary to access these objects or their metadata. Data fusion In this case a digital object is assembled from component digital objects, which may not all be accessible via the same service or within the same identifierspace; as a result, the identifiers may not all be resolvable in the same identifierspace and the assembly process would require resolving each identifier separately. Harvesting and spidering In this case, which is similar to the data fusion case, an agent fetches a digital object which contains references to other digital objects and then needs to be able to resolve those references recursively. 4. Security Considerations 5. IANA Considerations No IANA actions are needed. 6. Informational References [1] Corporation for National Research Initiatives, "A brief overview of the Digital Object Architecture", 1 June 2010. Kahn and Maffei Expires 26 December 2012 [Page 6] INTERNET-DRAFT 24 June 2012 Authors' Addresses Robert Kahn Corporation for National Research Initiatives 1895 Preston White Drive, Suite 100 Reston, VA 20191 USA EMail: rkahn@cnri.reston.va.us Andrew Maffei Woods Hole Oceanographic Institution Mailstop 44 Woods Hole, MA 02543 USA EMail: amaffei@whoi.edu Kahn and Maffei Expires 26 December 2012 [Page 7]