INTERNET-DRAFT

                   The Path URN Specification
                   **************************

                 draft-ietf-uri-urn-path-00.txt
                     Expires Sept 25, 1995

Daniel LaLiberte
Michael Shapiro

This document is also available in HTML at:

Status of this memo
===================

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).

This Internet Draft expires Sept 25, 1995.

Last modified: Mon Mar 20 12:13:51 1995

Abstract
========

A new "path" URN scheme is proposed that defines a uniformly
hierarchical name space. The resolution of a path URN is a two-step
process: locating the resolution server and locating the resource
within the server. Existing DNS capabilities are used to locate the
resolution server and HTTP is used as the protocol for locating a
resource within the server.

Introduction
============

Conceptually, the path scheme defines a uniformly hierarchical name
space. A path is a sequence of components and an optional opaque
string. An example path is:

   path:/A/B/C/doc.html

Names are assigned by naming authorities that are responsible for a
subtree of the name space, and naming authorities may delegate
responsibility to sub-authorities. Each naming authority corresponds
to a name resolution service, which may be shared by several naming
authorities.
In this document, we first describe the name resolution process
conceptually. This is followed by a detailed description of our
(planned) implementation, the encoding rules, and the discussion of
URN requirements.

The Name Resolution Process
===========================

This section describes the resolution process conceptually but not
completely. See the implementation section for the details.

The name resolution process involves two steps: First we traverse
the path left to right until we find a most-specific server, then we
interact with that server to resolve the remainder of the path name.
The server has the option of returning a redirection to a URL.

The resolution process starts at the path name root located at some
fixed, globally known network address. The root corresponds to a
name resolution service which resolves the first component of a path
into the address of another node. Generally, each node in the
hierarchy resolves a path component into another node at the next
lower level. This process repeats until no more-specific resolver is
found.

The name resolver for each node must tell clients whether there is a
more-specific resolver for the given path. This information will be
used by clients to avoid requesting resolution for components of the
path that do not have a more-specific resolver. If there is a
more-specific resolver, then the client proceeds with the process of
requesting subsequent components of the path. If there is not a
more-specific resolver, then this first phase of the resolution
process is completed.

Clients are expected to make use of caches to retain information
about recently visited name resolvers so that resolution of a path
can start from the most-specific known resolver instead of at the
root.

Once the most-specific resolver is found for a particular path, it
returns the address of a separate terminal resolver to the client.
The client then sends the full path to this terminal resolver.
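
The client-side caching described above can be illustrated with a
short sketch. It is not part of the proposed scheme: the cache
layout (a dictionary keyed by path prefix), the function name
most_specific_known, and the resolver addresses are all hypothetical
and chosen only for illustration. The sketch returns the longest
cached prefix of a path so that resolution can resume there rather
than at the root.

```python
from typing import Dict, List, Tuple

# Hypothetical address of the path name root; the draft does not name one.
ROOT_RESOLVER = "root-resolver.example"


def most_specific_known(
    components: List[str], cache: Dict[Tuple[str, ...], str]
) -> Tuple[Tuple[str, ...], str]:
    """Return the longest cached prefix of the path and its resolver.

    Falls back to the empty prefix and the root resolver when nothing
    in the cache matches.
    """
    for length in range(len(components), 0, -1):
        prefix = tuple(components[:length])
        if prefix in cache:
            return prefix, cache[prefix]
    return (), ROOT_RESOLVER


# Hypothetical cache contents after earlier lookups under /A and /A/B.
cache = {("A",): "resolver-a.example", ("A", "B"): "resolver-ab.example"}
print(most_specific_known(["A", "B", "C", "doc.html"], cache))
```

With this cache, a lookup of path:/A/B/C/doc.html would resume at the
resolver recorded for /A/B, and a path with no cached prefix would
start at the root.
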
The path scheme defines the protocol for interacting with the
terminal resolver as HTTP. The result of the terminal resolution may
be any document, identified by Content-type, or it may be a
redirection to a URL. The URL may be, for example, an http URL or
another path URN.

Implementation of Resolution
============================

The implementation of the resolution process follows the abstract
two-step process. The first step resolves the name into an IP
address and a port number. The second step involves contacting a
server at the IP address and port number returned by the first step
and, using the HTTP protocol, issuing a GET of the entire URN.

Resolving the name into a server and port number
+++++++++++++++++++++++++++++++++++++++++++++++++

The resolution of a name into a server and port number is done using
existing DNS capabilities. As an aid for the discussion that
follows, the following partial document tree is used:

                    /
                    |
                    A
                    |
        --------------------------
        |                        |
       B1*                      B2*
        |                        |
   ----------                    |
   |        |                    |
   C1      C2*                   C
                                 |
                                 D*

The nodes marked with * are server nodes. They have one or more
(IP-address, port) pairs associated with them.

   /A/B1 serves all documents under /A/B1 except /A/B1/C2
   /A/B2 serves all documents under /A/B2 except /A/B2/C/D

The resolution process proceeds as follows.

   1. The entire URN, except the scheme and the final component, is
      converted to a DNS name appended with ".path.urn". For example,

         path:/A/B2/C1/doc.html

      is converted to

         c1.b2.a.path.urn

   2. Partial-names are built starting with the last three components
      of the DNS name and iteratively adding components. All DNS
      records associated with this partial-name are requested using
      DNS resolvers.

      o If the TXT record is missing, then the URN does not resolve
        into a server and the URN is assumed to be invalid.

      o If there is an A record, then this is a server node. The TXT
        record lists sub-nodes not handled by this server.

         o If none of the sub-nodes listed in the TXT record match,
           then this is the server.
         o Else this implies that there is a DNS entry for the
           sub-node. The matching component is added to the
           partial-name to form a new partial-name and this step is
           repeated.

      o If there is no A record

         o If no A record has been encountered up to this point, the
           next component of the URN is added to the partial-name to
           form a new partial-name and this step is repeated.

         o If at least one A record has been encountered up to this
           point

            o If none of the sub-nodes listed in the TXT record match
              the remaining components of the path, then the most
              recent partial-name that had an A record is the server
              for this name.

            o Else this implies that there is a DNS entry for the
              sub-node. The matching component is added to the
              partial-name to form a new partial-name and this step
              is repeated.

Once the server DNS entry is located, the IP-address(es) are
extracted from the A record and the associated port number(s) are
extracted from the TXT record.

To clarify the above algorithm, some examples are presented. The
examples use the partial document tree specified previously.
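
Before the worked examples, the two steps above can also be sketched
in code. This is one reading of the algorithm under stated
assumptions, not a reference implementation: DNS is modelled as an
in-memory dictionary instead of real resolver queries, a name absent
from the dictionary stands for a missing TXT record, and the Record
class, the function names, the port value 80, and the 192.0.2.x
addresses are hypothetical placeholders for "port=n" and
"ip-address".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

SUFFIX = "path.urn"


@dataclass
class Record:
    """Hypothetical stand-in for the TXT and A records of one DNS name."""
    subnodes: List[str] = field(default_factory=list)  # sub-nodes listed in TXT
    port: Optional[int] = None      # "port=n" from the TXT record, if any
    address: Optional[str] = None   # None models "no A record"


def urn_components(urn: str) -> List[str]:
    """Step 1: drop the scheme and the final component, lower-case the rest."""
    prefix = "path:/"
    if not urn.startswith(prefix):
        raise ValueError("not a path URN: " + urn)
    return [c.lower() for c in urn[len(prefix):].split("/")[:-1]]


def dns_name(components: List[str]) -> str:
    """Reverse the path components and append the .path.urn suffix."""
    return ".".join(list(reversed(components)) + [SUFFIX])


def matched_length(subnodes: List[str], remaining: List[str]) -> int:
    """Return how many remaining components a listed sub-node covers, else 0."""
    for sub in subnodes:
        labels = list(reversed(sub.split(".")))  # "d.c" covers components c, d
        if remaining[:len(labels)] == labels:
            return len(labels)
    return 0


def locate_server(urn: str, dns: Dict[str, Record]) -> Optional[str]:
    """Step 2: return the DNS name of the server node, or None if unresolved."""
    comps = urn_components(urn)
    if not comps:
        return None
    used = 1        # the first partial-name is <first component>.path.urn
    server = None   # most recent partial-name that had an A record
    while True:
        name = dns_name(comps[:used])
        rec = dns.get(name)
        if rec is None:              # TXT record missing: URN assumed invalid
            return None
        if rec.address is not None:
            server = name
        remaining = comps[used:]
        if server is None:           # no A record seen yet: add next component
            if not remaining:
                return None          # assumption: nothing left to try
            used += 1
            continue
        step = matched_length(rec.subnodes, remaining)
        if step == 0:                # no listed sub-node matches
            return server
        used += step                 # descend into the matching sub-node


# DNS entries for the partial document tree, with placeholder values.
EXAMPLE_DNS = {
    "a.path.urn": Record(),
    "b1.a.path.urn": Record(["c2"], 80, "192.0.2.1"),
    "c2.b1.a.path.urn": Record([], 80, "192.0.2.2"),
    "b2.a.path.urn": Record(["d.c"], 80, "192.0.2.3"),
    "d.c.b2.a.path.urn": Record([], 80, "192.0.2.4"),
}
print(locate_server("path:/A/B1/C1/doc.ps", EXAMPLE_DNS))
print(locate_server("path:/A/B2/C/D/doc.ps", EXAMPLE_DNS))
```

The sketch returns only the DNS name of the server node; a client
would then read the address from that entry's A record and the port
from its TXT record. Folding the "A record" and "no A record"
branches into a single remembered server name is a simplification of
this sketch, not something the text prescribes.
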
The DNS entries for this partial tree are:

                          TXT              A
   a.path.urn             -empty-          -none-
   b1.a.path.urn          c2, port=n       ip-address
   c2.b1.a.path.urn       port=n           ip-address
   b2.a.path.urn          d.c, port=n      ip-address
   d.c.b2.a.path.urn      port=n           ip-address

Example lookups

   /A/B1/C1/doc.ps
      a.path.urn          no A record
                          repeat with b1.a.path.urn
      b1.a.path.urn       has A record, TXT doesn't have c1
                          this is the server

   /A/B2/C/D/doc.ps
      a.path.urn          no A record
                          repeat with b2.a.path.urn
      b2.a.path.urn       has A record, TXT has d.c
                          repeat with d.c.b2.a.path.urn
      d.c.b2.a.path.urn   has A record
                          this is the server

Alternatively, there could be an entry for c.b2.a.path.urn instead of
it being subsumed in b2.a.path.urn:

                          TXT              A
   a.path.urn             -empty-          -none-
   b2.a.path.urn          c, port=n        ip-address
   c.b2.a.path.urn        d                -none-
   d.c.b2.a.path.urn      port=n           ip-address

The lookups proceed as

   /A/B2/C/D/doc.ps
      a.path.urn          no A record
                          repeat with b2.a.path.urn
      b2.a.path.urn       has A record, TXT has c
                          repeat with c.b2.a.path.urn
      c.b2.a.path.urn     no A record, TXT has d
                          repeat with d.c.b2.a.path.urn
      d.c.b2.a.path.urn   has A record
                          this is the server

   /A/B2/C/E/doc.ps
      a.path.urn          no A record
                          repeat with b2.a.path.urn
      b2.a.path.urn       has A record, TXT has c
                          repeat with c.b2.a.path.urn
      c.b2.a.path.urn     no A record, TXT does not have e
                          server at b2.a.path.urn

Locating the Resource
+++++++++++++++++++++

The full path URN is passed to the server using the HTTP protocol as
a GET request. The server must either return a full response (with
HTTP header and response), or a URI-header in HTTP message types 301
(moved permanently) or 302 (moved temporarily). For the redirect
messages, the client should process the URLs normally.

If the HTTP server returns a full response, the object returned could
be the named object itself, or it might be metadata for the object.
In either case, it would be identified by the Content-type header
line. If and when URC standards are defined, clients that are
capable of handling URCs can indicate that in the Accept header line.
For clients that cannot handle URCs, the server could automatically
process the URC to instead return a URL for the object, or it could
return the object itself.

Encoding Syntax
===============

   ::= "path:"
   ::= "/" [ ]
   ::= "" | "/"