Network Working Group                                      Juha Hakala
Internet-Draft                             Helsinki University Library
Category: Informational                              Hartmut Walravens
draft-hakala-isbn-00.txt                 The International ISBN Agency
Expires: 28 February 2001                               30 August 2000


  Using International Standard Book Numbers as Uniform Resource Names

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC 2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on 28 February, 2001.

Abstract

This document discusses how International Standard Book Numbers (ISBN) can be supported within the URN framework and the syntax for URNs defined in RFC 2141 [Moats]. Much of the discussion below is based on the ideas expressed in RFC 2288 [Lynch]. Section 5 contains a URN namespace registration request modelled according to the template in RFC 2611 [Daigle et al.].

1. Introduction

As part of the validation process for the development of URNs, the IETF URN working group agreed that it is important to demonstrate that the current URN syntax proposal can accommodate existing identifiers from well-established namespaces.

One such infrastructure for assigning and managing names comes from the bibliographic community. Bibliographic identifiers function as names for objects that exist both in print and, increasingly, in electronic formats. RFC 2288 [Lynch]
investigated the feasibility of using three identifiers (ISBN, ISSN and SICI) as URNs. This document analyses the usage of ISBNs as URNs in more detail than RFC 2288. A registration request for acquiring the Namespace Identifier (NID) "ISBN" for ISBNs is included in Section 5.

The document at hand is part of a global co-operation of the national libraries to foster identification of electronic documents in general and utilisation of URNs in particular. The document was written as a co-operative project between the Helsinki University Library and The International ISBN Agency.

We have used the URN Namespace Identifier "ISBN" for ISBNs in the examples below.

2. Identification vs. Resolution

As a rule the ISBNs identify finite, manageably-sized objects, but these objects may still be large enough that resolution to a hierarchical system is appropriate.

The materials identified by an ISBN may exist only in printed or other physical form, not electronically. The best that a resolver will be able to offer in this case is bibliographic data from a national bibliography database, including information about where the physical resource is stored in the national library's holdings.

3. International Standard Book Numbers

3.1 Overview

RFC 2288 [Lynch] describes the ISBN system in the following way:

   An International Standard Book Number (ISBN) identifies an edition of a monographic work. The ISBN is defined by the standard NISO/ANSI/ISO 2108:1992 [ISO1].

   Basically, an ISBN is a ten-digit number (actually, the last digit can be the letter "X" as well, as described below) which is divided into four variable length parts usually separated by hyphens when printed. The parts are as follows (in this order):

   * a group identifier which specifies a group of publishers, based on national, geographic or some other criteria,
   * the publisher identifier,
   * the title identifier,
   * and a modulus 11 check digit, using X instead of 10.
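
The modulus 11 check digit named in the list above can be illustrated with a short sketch. The text does not spell out the weighting; the sketch assumes the scheme of the ISBN standard, in which the nine leading digits are weighted 10 down to 2 and the check digit brings the weighted sum to a multiple of 11. The function names are hypothetical and chosen for illustration only; hyphens are ignored and a lower-case "x" is accepted as a convenience.

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Return the modulus 11 check character for nine ISBN digits.

    Assumption: weights 10 down to 2, as in the ISBN standard; the
    surrounding text only states "modulus 11 ... X instead of 10".
    """
    if len(first_nine) != 9 or any(c not in "0123456789" for c in first_nine):
        raise ValueError("expected exactly nine decimal digits")
    # Multiply the nine digits by the weights 10, 9, ..., 2 and add up.
    total = sum(w * int(d) for w, d in zip(range(10, 1, -1), first_nine))
    # The check value makes (total + check) divisible by 11.
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_isbn10(isbn: str) -> bool:
    """Check a ten-character ISBN, with or without hyphens."""
    compact = isbn.replace("-", "").upper()
    if len(compact) != 10:
        return False
    try:
        return isbn10_check_digit(compact[:9]) == compact[9]
    except ValueError:
        # Non-digit characters among the first nine positions.
        return False
```

Under these assumptions, is_valid_isbn10("0-395-36341-1") returns True for the example ISBN used in this document: the weighted sum of its first nine digits is 208, and 208 mod 11 = 10, which gives a check digit of 1.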
   The group and publisher number assignments are managed in such a way that the hyphens are not needed to parse the ISBN unambiguously into its constituent parts. However, the ISBN is normally transmitted and displayed with hyphens to make it easy for human beings to recognize these parts without having to make reference to or have knowledge of the number assignments for group and publisher identifiers.

There are plans to extend the ISBN to 13 digits in order to make the system more suitable for identification of electronic monographs. The so-called Bookland ISBN will consist of a traditional ISBN preceded by the 978 or 979 EAN flag.

3.2 Encoding Considerations and Lexical Equivalence

RFC 2288 [Lynch] says that:

   Embedding ISBNs within the URN framework presents no particular encoding problems, since all of the characters that can appear in an ISBN are valid in the identifier segment of the URN. %-encoding, as described in [Moats], is never needed.

   Example: URN:ISBN:0-395-36341-1

   For the ISBN namespace, some additional equivalence rules are appropriate. Prior to comparing two ISBN URNs for equivalence, it is appropriate to remove all hyphens, and to convert any occurrences of the letter X to upper case.

3.3 Resolution of ISBN-based URNs

The existing ISBN structure is very suitable for URN resolution purposes. The group identifier can assist in the resolver discovery process. For instance, group identifier "951" means Finland. In this case, the Finnish national bibliography database will be able to resolve the URN either into bibliographic data or, if the resource is available on the Internet, to the document itself.

In some cases the group identifier does not identify a single country but a language area. For instance, group identifier "3" is used by German, German Swiss and Austrian publishers. It may also be that more than one national bibliography database may contain the needed resource.
In these cases, it is necessary to define a cascade of URN resolution services (for instance, German national bibliography, Austrian national bibliography and Swiss national bibliography, in this order) in the DNS records describing the resolution service for ISBNs starting with "3".

In an ISBN, the group identifier or even the publisher identifier can be used as a "hint". Technically it is also possible to incorporate URN resolution services maintained by publishers into the common structure. For instance, "951-0" is the unique ISBN identifier of the largest publisher in Finland, Sanoma-WSOY. If they launch their own URN resolution services, resolution requests for ISBNs starting with "951-0" will be directed to the publisher's server, and all other requests to the national bibliography.

3.4 Additional considerations

The basic guidelines for assigning ISBNs to electronic resources are the following:

* Format/means of delivery are irrelevant as to the decision whether a product needs an ISBN. If the content meets the requirement, it gets an ISBN, no matter what the format of the delivery system.
* Each format of a digital publication should have a separate ISBN.

The definition of a new edition is normally based on one of two criteria:

* A change in the kind of packaging involved: the hard cover edition, the paper cover edition and the library-binding edition would each get a separate ISBN. The same applies to different formats of digital files.
* A change in the text, not including packaging art or minor changes such as correcting a spelling error. Again, this criterion applies whether the publication is print on paper or digital.

Although these rules seem very clear, their interpretation may vary. As [Lynch] points out,

   The choice of whether to assign a new ISBN or to reuse an existing one when publishing a revised printing of an existing edition of a work or even a revised edition of a work is somewhat subjective.
   Practice varies from publisher to publisher (indeed, the distinction between a revised printing and a new edition is itself somewhat subjective). The use of ISBNs within the URN framework simply reflects these existing practices. Note that it is likely that an ISBN URN will often resolve to many instances of the work (many URLs).

Further details on the process of assigning ISBNs can be found in Section 5 (Namespace registration) below.

4. Security Considerations

This document proposes means of encoding ISBNs within the URN framework. It describes ISBN-based URN resolution only at a generic level; thus questions of secure or authenticated resolution mechanisms are out of scope. It does not address means of validating the integrity or authenticating the source or provenance of URNs that contain ISBNs. Issues regarding intellectual property rights associated with objects identified by the ISBNs are also beyond the scope of this document, as are questions about rights to the databases that might be used to construct resolvers.

5. Namespace registration

URN Namespace ID Registration for the International Standard Book Number (ISBN)

Namespace ID:

   ISBN

   This Namespace ID is the same as the internationally used acronym for International Standard Book Number. Giving this NID to any other identifier system would cause considerable confusion.

Registration Information:

   Version: 1
   Date: 2000-08-30

Declared registrant of the namespace:

   Name: Hartmut Walravens
   E-mail: hartmut.walravens@sbb.spk-berlin.de
   Affiliation: Director, The International ISBN Agency
   Address: Staatsbibliothek zu Berlin - Preußischer Kulturbesitz - D-10772 Berlin, Germany

Declaration of syntactic structure:

   An ISBN is a ten-digit number (actually, the last digit can be the letter "X" as well, as described below) which is divided into four variable length parts usually separated by hyphens when printed.
   The parts are as follows (in this order):

   * a group identifier which specifies a group of publishers, based on national, geographic or some other criteria,
   * the publisher identifier,
   * the title identifier,
   * and a modulus 11 check digit, using X instead of 10.

   Example: URN:ISBN:0-395-36341-1

Relevant ancillary documentation:

   The ISBN (International Standard Book Number) is a unique machine-readable identification number, which marks any edition of a book unmistakably. This number is defined in ISO Standard 2108. The number has been in use now for 30 years and has revolutionised the international book-trade. 154 countries are officially ISBN members, and more countries are joining the system.

   The administration of the ISBN system is carried out on three levels:

   * International agency
   * Group agencies
   * Publisher levels

   The International ISBN Agency is located within the State Library Berlin. The main functions of the International ISBN Agency are:

   * To promote, co-ordinate and supervise the world-wide use of the ISBN system.
   * To approve the definition and structure of group agencies.
   * To allocate group identifiers to group agencies.
   * To advise on the establishment and functioning of group agencies.
   * To advise group agencies on the allocation of international publisher identifiers.
   * To publish the assigned group numbers and publisher prefixes in up-to-date form.

   More information about ISBN usage can be found in the ISBN Users' Manual. The 4th edition of this document is available at http://www.isbn.spk-berlin.de/html/userman.htm.

Identifier uniqueness considerations:

   An ISBN that has been assigned once will never be re-used.

Identifier persistence considerations:

   The ISBN accompanies a publication from its production onwards. It is persistent; an ISBN once given will never leave the publication.

Identifier assignment process:

   Assignment of ISBNs is always controlled by ISBN group agencies, which are often national and quite frequently located in the national libraries.
   Publishers are usually given blocks of ISBNs, from which they pick identifiers for newly published items. As pointed out earlier, there is some variation between different publishers in ISBN assignment. In practice these differences are so small that they do not pose a threat to the usability of the ISBN system as a whole.

Identifier resolution process:

   URNs based on ISBNs will be primarily resolved via the national bibliography databases. Since ISBN group agencies are as a rule located in national libraries, the national bibliography databases cover almost every publication that has an ISBN.

   If the group identifier defines not a country but a language area, several countries may use the same group identifier. For instance, "3" is used in Austria, Germany and Switzerland. In this case, a cascade of national bibliographies needs to be defined.

   The International ISBN Agency also maintains a list of publishers who have been assigned a publisher identifier within the ISBN system. The publisher identifier may be used to allow resolution services maintained by publishers to participate in the URN resolution system for ISBNs.

Rules for Lexical Equivalence:

   For the ISBN namespace, some additional equivalence rules are appropriate. Prior to comparing two ISBN URNs for equivalence, it is appropriate to remove all hyphens, and to convert any occurrences of the letter X to upper case.

Conformance with URN Syntax:

   Embedding ISBNs within the URN framework presents no particular encoding problems, since all of the characters that can appear in an ISBN are valid in the identifier segment of the URN. %-encoding, as described in [Moats], is never needed.

   Example: URN:ISBN:0-395-36341-1

Validation mechanism:

   The validity of an ISBN string can be checked using the modulus 11 check digit included in the ISBN. X is used instead of 10. The validity of ISBN assignments can be checked with the group agencies or directly with the publisher.

Scope:

   Global.

6. References

[Daigle et al.] Daigle, L., van Gulik, D., Iannella, R. & Faltstrom, P., "URN Namespace Definition Mechanisms", RFC 2611, June 1999.

[Lynch] Lynch, C., "Using Existing Bibliographic Identifiers as Uniform Resource Names", RFC 2288, February 1998.

[Moats] Moats, R., "URN Syntax", RFC 2141, May 1997.

7. Authors' Addresses

Juha Hakala
Helsinki University Library - The National Library of Finland
P.O. Box 26
FIN-00014 Helsinki University
FINLAND
E-mail: juha.hakala@helsinki.fi

Hartmut Walravens
The International ISBN Agency
Staatsbibliothek zu Berlin - Preußischer Kulturbesitz -
D-10772 Berlin, GERMANY
E-mail: hartmut.walravens@sbb.spk-berlin.de

8. Full Copyright Statement

Copyright (C) The Internet Society (2000). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.