Network Working Group                                    Robert Thurlow
Internet Draft                                                 May 2003
Document: draft-thurlow-nfsv4-namespace-00.txt


                     A Namespace For NFS Version 4

Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   Discussion and suggestions for improvement are requested.  This
   document will expire in November, 2003.  Distribution of this draft
   is unlimited.

Abstract

   Recent work on Replication and Migration for NFSv4 has reminded us
   of a more fundamental problem: NFS currently lacks a coherent
   enterprise or global namespace.  With changes in a minor revision of
   NFS Version 4, this can be addressed to make services like
   replication and migration of filesystems fully functional.

Expires: November 2003                                         [Page 1]
Title              A Namespace For NFS Version 4              May 2003

Table of Contents

   1. Introduction
   2. Problem Statement
   3. Requirements
   4. Implementation Options
   5. Minor Revision Client-Server Changes
      5.1. Finding The Global Root
      5.2. Junction Nodes
      5.3. New error number: NFS4ERR_REFERRAL
      5.4.
           Enhanced Lookup Procedure
      5.5. An Example
   6. Server-Namespace Interaction
      6.1. Use The Directory
      6.2. Use Replication
   7. Appendix A: XDR Protocol Definition File
   8. Normative References
   9. Informative References
   10. Author's Address

1. Introduction

   Unlike some other distributed filesystems, NFS has never had a
   unified, Internet- or enterprise-level namespace.  NFS namespaces
   are often, at best, confined to a set of machines within an
   administrative domain which is often smaller or much smaller than
   the company-wide Intranet.  This has become a larger issue with the
   EOL of such systems as AFS and DCE/DFS, which each provide a strong
   and unified namespace at the enterprise level that can be extended
   to a global namespace for the Internet.

   To create a large-scale namespace, we have to reconfigure the way
   the NFS client discovers resources.  In general, the client
   specially handles certain "junction" points in its view of the
   namespace; when one of these junction points is manipulated, the
   client consults some kind of distributed database to find
   distributed filesystems and attributes for them, and then mounts and
   uses them.  This makes the namespace a composite construction
   accessed by different protocols at different times.

   This is neither necessary nor desirable; with an extensible
   distributed filesystem protocol such as NFS Version 4, these
   junctions can be embedded into the NFS virtual filesystem and all
   necessary information can be discovered by NFSv4 protocol
   operations.
   The server could also support mechanisms to work with other
   distributed filesystems.

   Once clients understand junctions and how to get referrals to actual
   locations, we can support generic servers to provide clients with
   easy access to the namespace.  These servers need not store any data
   files, but could just store a replica of the top level of the global
   namespace.  They could advertise the global root via SLP [RFC2608]
   and let clients find the locations of useful filesystems via
   referrals.  This easily supports an enterprise-level consistent
   namespace, which can be made global with industry agreement
   regarding the management of a global root directory and the servers
   to present it.

   A namespace solves some problems for [REPLMIG] as well.  An unsolved
   problem in replication is how to inform the client of the locations
   of the replicas; so far, non-standard extensions to automounter maps
   and manual mount command syntax must be used.  With an extended
   lookup, attributes and several or all locations can be returned
   in-band.

   This document does not currently define a syntax for the top levels
   of the filesystem.  Any syntax should define the names used for the
   top two levels (e.g. "/nfs/sun.com"), and should also define a
   shorthand for accessing the enterprise-level root without the need
   to go through the top level unnecessarily.

2. Problem Statement

   Customers have seen a gap in NFS with respect to such distributed
   filesystems as AFS and DCE/DFS.  They want to be able to build a
   namespace which is seen consistently by all clients in their
   enterprise, and some hope for a truly global Internet-wide hierarchy
   of files, to which you would gain only the access level you
   deserved.
   They want to be able to delegate authority to match the data; I may
   not be able to dictate where my home directory is in the namespace,
   but I should be able to construct my home directory as multiple
   filesystems and be able to "publish" those parts.

   Because the best we have been able to do involves a
   highly-configurable Automounter daemon with variance across
   platforms, combined with non-standard databases with non-standard
   filesystem location information, the NFS industry can't currently
   offer this functionality.

3. Requirements

   Customer requirements include the following:

   o  Permit me to build (at least) enterprise-wide namespaces

   o  Permit me to delegate management of parts of the namespace to
      owners of data

   o  Make this manageable from almost anywhere

   o  Don't make me deploy a new naming service

   o  Permit reasonable backwards compatibility

4. Implementation Options

   The problem boils down to a couple of issues:

   o  How does the client find a server for a relevant root vnode?

   o  How does the client detect and navigate a "junction point" where
      it must transition from a higher-level to a lower-level
      filesystem?

   The first issue can be solved by configuring the root location into
   the client or by having the client do a network transaction to find
   a suitable global root server.  Hard-coding this information does
   not scale to many clients, so a network transaction is in order.
   The most suitable deployed service to find an instance of a
   highly-replicated object is the Service Location Protocol (SLP)
   [RFC2608].  By doing an SLP request, a client should be able to find
   a nearby server which knows how to find the global root and can thus
   see all the data in that one filesystem.  Typically, though, that
   filesystem would consist almost entirely of the locations of more
   interesting resources.
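
   The two issues above, locating a root server and crossing junction
   points, can be pictured with a small client-side simulation.  The
   following is a minimal sketch in Python, not part of the proposed
   protocol.  The SERVERS table, the Referral class, and the resolve()
   function are hypothetical names introduced only for illustration;
   the root server passed to resolve() stands in for whatever the SLP
   request returned, and the placement of junctions at "corp" and
   "data" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Referral:
    """Hypothetical stand-in for a referral reply naming one location."""
    server: str
    path: str


# Hypothetical namespace: server name -> directory tree.
# A dict is a directory, a Referral is a junction, a str is a file.
SERVERS = {
    "master1": {"nfs": {"sun.com": {"corp": Referral("corp", "/stuff")}}},
    "corp": {"stuff": {"data": Referral("cdata", "/finance")}},
    "cdata": {"finance": {"spreadsheet.pdf": "file contents"}},
}


def split(path):
    """Split a slash-separated path into its non-empty components."""
    return [c for c in path.split("/") if c]


def resolve(root_server, path, max_referrals=8):
    """Walk path from root_server, switching servers at each junction.

    Returns the object found and the list of servers contacted.
    """
    server = root_server
    pending = split(path)
    hops = [server]
    for _ in range(max_referrals + 1):
        node = SERVERS[server]          # start at that server's root
        referred = False
        while pending:
            node = node[pending.pop(0)]  # KeyError if the name is absent
            if isinstance(node, Referral):
                # Junction: continue on the referred server, looking up
                # the referred path followed by the unresolved remainder.
                server = node.server
                pending = split(node.path) + pending
                hops.append(server)
                referred = True
                break
        if not referred:
            return node, hops
    raise RuntimeError("too many referrals")


if __name__ == "__main__":
    obj, hops = resolve("master1", "/nfs/sun.com/corp/data/spreadsheet.pdf")
    print(hops, obj)
```

   The max_referrals bound is a design choice of this sketch rather
   than anything the draft specifies; it simply keeps a misconfigured
   loop of junctions from trapping the client.
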

   The "junction" point, where virtual filesystems meet, is inherently
   an abstraction, since the separate virtual filesystems must never
   completely look like a single one.  Since the junction is abstract,
   there are different ways to construct it.  The client can construct
   it based on information from a distributed service such as LDAP or
   the server can construct it and make it visible through the NFS
   protocol.  If the server constructs it, it could again base the
   construction on a service like LDAP, or it could hold copies in
   actual filesystem objects, with the filesystems managed as replicas.

   The best choice at this time seems to be to have the server make an
   abstraction visible to the client via NFS Version 4 minor revision
   protocol elements.  The server would be able to construct symbolic
   links for NFS Version 2 and Version 3 clients and to construct other
   types of referrals to other distributed filesystems (e.g. Microsoft
   DFS referrals for CIFS clients).

   How the server gets its data is not so clear at present and is not
   specified by this draft.  Further ideas are discussed in Section 6.

5. Minor Revision Client-Server Changes

5.1. Finding The Global Root

   The NFS client should begin navigation of the global namespace by
   issuing an SLP call to look for a service named "Global_NFS".  It
   should then ask that server for information about the global or
   enterprise root.

5.2.
Junction Nodes

   Junction nodes to other distributed filesystems could be represented
   by the following XDR definition:

   enum nodetype4 {
           NAME_NFS_URL = 1,
           NAME_NFS_IP  = 2,
           NAME_SMB     = 3
   };

   enum ipaddrtype {
           NAME_IPV4 = 1,
           NAME_IPV6 = 2
   };

   struct nameipnode4 {
           ipaddrtype type;
           opaque     addr<>;   /* raw address bytes */
           opaque     path<>;
   };

   union namenode4 switch (nodetype4 type) {
           case NAME_NFS_URL:
                   /* nfs://server.domain.com/export/dir1/dir2 */
                   opaque nfs_url<>;
           case NAME_NFS_IP:
                   /* 10.0.0.2 + /export/dir1/dir2 */
                   nameipnode4 nfs_ip;
           case NAME_SMB:
                   /* As defined by: http://www.ietf.org/internet-drafts/draft-crhertel-smb-url-04.txt */
                   /* smb://server.domain.com/export/dir1/dir2 */
                   opaque smb_url<>;
   };

   enum opttype4 {
           NAME_ACCESS = 1,
           NAME_MASTER = 2,
           NAME_VOLID  = 3,
           NAME_STRING = 4
   };

   enum access4 {
           NAME_RO = 0,
           NAME_RW = 1
   };

   struct keyvalue {
           opaque key<>;
           opaque value<>;
   };

   union options4 switch (opttype4 type) {
           case NAME_ACCESS:
                   /* ro/rw */
                   access4 acc;
           case NAME_MASTER:
                   /* master true/false */
                   bool master;
           case NAME_VOLID:
                   /* volume ID if multiple paths to master */
                   int64_t volid;
           case NAME_STRING:
                   /* generic string=value option */
                   keyvalue kv;
   };

   struct location4 {
           namenode4 loc;
           options4  opts<>;
   };

   This definition would permit servers to send referrals containing
   NFS URLs, which would require a name service lookup, or a
   combination of IPv4 or IPv6 address and a path name, suitable for
   immediate use, and even an SMB URL for Samba servers.

5.3. New error number: NFS4ERR_REFERRAL

   A new error number should be added to those defined in [RFC3530],
   defined this way:

   NFS4ERR_REFERRAL   The name being looked up is valid, but refers to
                      an object on another NFS server.  The RLOOKUP
                      operation will provide more information about
                      this node.

5.4. Enhanced Lookup Procedure

   An enhanced RLOOKUP operation is proposed for a future NFS Version 4
   minor revision.
   It will act like the current LOOKUP operation in [RFC3530] in most
   cases, but will return much richer data in the operation response
   when a node is a junction.  This extra information makes it possible
   to begin use of a referred filesystem without an extra round-trip.
   The definition is:

   struct RLOOKUP4args {
           /* CURRENT_FH: directory */
           component4 objname;
   };

   union referral4 switch (nfsstat4 status) {
           case NFS4ERR_REFERRAL:
                   location4 locarray<>;
           default:
                   void;
   };

   struct RLOOKUP4res {
           /* CURRENT_FH: object */
           referral4 refer;
   };

5.5. An Example

   The client calls:

      fd = open("/nfs/sun.com/corp/data/spreadsheet.pdf", ...);

   The following traffic would result:

   SLP SrvRqst "Global_NFS"                           --> Broadcast
   SLP SrvRply "master1:/, master2:/"                 <-- SLP server

   NFS COMPOUND {putrootfh rlookup nfs
                 rlookup sun.com rlookup corp
                 rlookup data open spreadsheet.pdf}   --> master1
   NFS { putrootfh OK rlookup OK rlookup OK
         rlookup OK rlookup EREFER corp:/stuff}       <-- master1

   NFS COMPOUND {putrootfh rlookup stuff
                 rlookup data open spreadsheet.pdf }  --> corp
   NFS { putrootfh OK rlookup OK rlookup OK
         rlookup EREFER cdata:/finance }              <-- corp

   NFS COMPOUND {putrootfh rlookup finance
                 rlookup data open spreadsheet.pdf }  --> cdata
   NFS { putrootfh OK rlookup OK rlookup OK
         open OK }                                    <-- cdata

6. Server-Namespace Interaction

   Though this document intends to specify the client-server
   interactions of the namespace, it is interesting to speculate on how
   servers will construct the namespace abstraction for the client.
   There are two main ways to do this, which differ in where the "real"
   data lives.

6.1. Use The Directory

   In this scenario, the real home of the data is in a set of
   interrelated nodes in an LDAP directory.
   The server enumerates a list of junction points from the directory
   and marks those nodes as requiring special handling, and accesses to
   these nodes result in an LDAP lookup to find the latest data to
   return to the client.  This group would standardize an LDAP schema
   and management would be via LDAP tools.

   This has the benefit that an LDAP schema would be a well-understood
   concept and that tools should be available to manage it.  A
   disadvantage is that NFS server implementations are usually embedded
   in the operating system kernel, requiring LDAP lookups to involve a
   user-level daemon.  Also, unavailability of the LDAP service will
   cause issues for the server.

6.2. Use Replication

   In this scenario, the new virtual junction becomes an actual
   filesystem object, and contains the data needed by the client.  The
   junction object could be created on the master filesystem and
   propagated by filesystem replication as defined in [REPLMIG].  For
   manageability, an SNMP MIB could be defined to enumerate all
   junction points in a particular filesystem and to manipulate their
   properties.  A management tool would construct an image of the
   namespace by consulting the root of the global filesystem and
   walking down as needed.

   This has the advantage that servers would always have data to give
   to clients, and that changes in the linkage of filesystems would be
   identical to other changes to the linkage of directories in the
   filesystem as far as the client could see.

7. Appendix A: XDR Protocol Definition File

   /*
    *  Copyright (C) The Internet Society (2003).
    *  All Rights Reserved.
    */

   /*
    *      node.x
    */

   %#pragma ident  "@(#)node.x     1.2     03/05/21"

   enum nodetype4 {
           NAME_NFS_URL = 1,
           NAME_NFS_IP  = 2,
           NAME_SMB     = 3
   };

   enum ipaddrtype {
           NAME_IPV4 = 1,
           NAME_IPV6 = 2
   };

   struct nameipnode4 {
           ipaddrtype type;
           opaque     addr<>;   /* raw address bytes */
           opaque     path<>;
   };

   union namenode4 switch (nodetype4 type) {
           case NAME_NFS_URL:
                   /* nfs://server.domain.com/export/dir1/dir2 */
                   opaque nfs_url<>;
           case NAME_NFS_IP:
                   /* 10.0.0.2 + /export/dir1/dir2 */
                   nameipnode4 nfs_ip;
           case NAME_SMB:
                   /* As defined by: http://www.ietf.org/internet-drafts/draft-crhertel-smb-url-04.txt */
                   /* smb://server.domain.com/export/dir1/dir2 */
                   opaque smb_url<>;
   };

   enum opttype4 {
           NAME_ACCESS = 1,
           NAME_MASTER = 2,
           NAME_VOLID  = 3,
           NAME_STRING = 4
   };

   enum access4 {
           NAME_RO = 0,
           NAME_RW = 1
   };

   struct keyvalue {
           opaque key<>;
           opaque value<>;
   };

   union options4 switch (opttype4 type) {
           case NAME_ACCESS:
                   /* ro/rw */
                   access4 acc;
           case NAME_MASTER:
                   /* master true/false */
                   bool master;
           case NAME_VOLID:
                   /* volume ID if multiple paths to master */
                   int64_t volid;
           case NAME_STRING:
                   /* generic string=value option */
                   keyvalue kv;
   };

   struct location4 {
           namenode4 loc;
           options4  opts<>;
   };

   struct RLOOKUP4args {
           /* CURRENT_FH: directory */
           component4 objname;
   };

   union referral4 switch (nfsstat4 status) {
           case NFS4ERR_REFERRAL:
                   location4 locarray<>;
           default:
                   void;
   };

   struct RLOOKUP4res {
           /* CURRENT_FH: object */
           referral4 refer;
   };

8. Normative References

   [RFC1831]
      R. Srinivasan, "RPC: Remote Procedure Call Protocol Specification
      Version 2", RFC1831, August 1995.

   [RFC1832]
      R. Srinivasan, "XDR: External Data Representation Standard",
      RFC1832, August 1995.

   [RFC2165]
      J. Veizades, E. Guttman, C. Perkins, S. Kaplan, "Service Location
      Protocol", RFC2165, June 1997.

   [RFC3530]
      S. Shepler, B. Callaghan, D. Robinson, R. Thurlow, C. Beame,
      M. Eisler, D.
      Noveck, "Network File System (NFS) Version 4 Protocol", RFC3530,
      April 2003.

   [RFC2608]
      E. Guttman, C. Perkins, J. Veizades, M. Day, "Service Location
      Protocol, Version 2", RFC2608, June 1999.

9. Informative References

   [RFC3010]
      S. Shepler, B. Callaghan, D. Robinson, R. Thurlow, C. Beame,
      M. Eisler, D. Noveck, "NFS version 4 Protocol", RFC3010,
      December 2000.

10. Author's Address

   Address comments related to this memorandum to:

      nfsv4-wg@sunroof.eng.sun.com

   Robert Thurlow
   Sun Microsystems, Inc.
   500 Eldorado Boulevard, UBRM05-171
   Broomfield, CO 80021

   Phone: 877-718-3419
   E-mail: robert.thurlow@sun.com