INTERNET DRAFT                                             Aditya Pandit
Expires: March 2005                                        Sujay Godbole
Document: draft-pandit-nfs-2-3-4-local-minor-interop-00.txt

     Interoperability between all NFS versions and local filesystem

IPR Statement:

By submitting this Internet-Draft, we certify that any applicable patent or other IPR claims of which we are aware have been disclosed or will be disclosed, and any of which we become aware will be disclosed in accordance with RFC 3668.

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. This document may not be modified, and derivative works of it may not be created, except to publish it as an RFC and to translate it into languages other than English.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

NFS (Network File System) version 4 is a major rewrite of NFS. NFS version 4 adds many new features to the protocol, and its behavior differs considerably from that of its predecessors. This draft considers the interoperability problems between all versions of NFS and the local filesystem. Access to any one file system through the NFS v2, NFS v3 or NFS v4 protocol still requires that a single server be accessed. This draft tries to suggest the behavior that could be expected when NFS version 4 is used in a multi-protocol environment with NFS version 2 and version 3.
Aditya Pandit              Expires March 2005                  [Page 1]
Internet Draft       Interoperability NFS v2,v3,v4      August 27, 2004

Table of contents

1. Introduction
2. New Features supported by NFS version 4
   2.1. Mount
   2.2. Stateful protocol
   2.3. Open and Close
   2.4. Migration
   2.5. Replication
   2.6. Delegation and callbacks
   2.7. Callback RPCs
   2.8. Locking
   2.9. Volatile File Handles
   2.10. Mount point Crossing
   2.11. Crash Recovery
   2.12. ACL support
   2.13. Minor Versioning
   2.14. Internationalization
3. Details of interoperability with previous version of NFS and local file system
   3.1. Mount changes and its effect to earlier versions of NFS
   3.2. Stateful protocol and its effect to earlier versions
      3.2.1. COMMIT and stable writes
      3.2.2. OPEN and CLOSE operation and state of the file
      3.2.3. Client and Server Crash
      3.2.4. Lock Manager Protocol
      3.2.5. Delegation
      3.2.6. SETATTR
      3.2.7. READ and WRITE
   3.3. File handle creation, mapping and file identity
   3.4. Duplicate Request Cache
   3.5. OPEN and CLOSE
   3.6. Filesystem Migration and expected behavior on earlier versions of NFS and local file system
   3.7. Replication and its expected behavior on earlier versions of NFS and local file system
   3.8. Issues of Delegation with respect to the earlier versions of NFS
      3.8.1. Delegation recall
      3.8.2. Operations that can recall delegation
      3.8.3. Delegation recovery
         3.8.3.1. Client Reboot
         3.8.3.2. Server Reboot
         3.8.3.3. Network Partition
      3.8.4. Delegation and Mandatory file locking
      3.8.5. Read Delegation
      3.8.6. Write Delegation
   3.9. NFS version 4 Locking and interoperability with NLM
      3.9.1. NLM, Locking and migration
      3.9.2. NLM, Locking and share reservations
      3.9.3. NLM, Locking and mandatory locking and READ and WRITE
      3.9.4. State affecting locks and effect to NFS version 2 and 3
      3.9.5. NLM and NFS version 4 Lease expiration
         3.9.5.1. NFS v4 client lease expires
         3.9.5.2. NFS v3 client stops responding
      3.9.6. Server Failure - NLM and NFS version 4 lock recovery
      3.9.7. Revocation of locks and NLM
      3.9.8. CLOSE and locks release and updation with NLM
   3.10. Attribute caching
   3.11. ACL Support
   3.12. Minor Versioning
      3.12.1 pNFS
      3.12.2 SECINFO changes
      3.12.3 Directory delegation
   3.13. Internationalization in NFS v4 and encoding on NFS v3
4. Conclusion
5. Informative References
6. Author’s Addresses
7. Copyright Notice

1. Introduction

This document assumes an understanding of NFSv2, NFSv3 and NFSv4.

A new version of NFS, version 4, is being proposed. Its predecessor, NFS version 3, was an extension of NFS version 2. The NFS protocol provides transparent remote access to shared files across networks. The NFS protocols have been designed to be portable across machines, operating systems, network architectures and transport protocols.

NFS version 4 has changed greatly from the previous versions of NFS, which makes interoperability between the protocols a considerable challenge. NFS version 4 has many improved features compared to NFS version 2 and 3.

The purpose of this document is to:

1. Propose solutions for the problem of interoperability between NFS version 2, version 3 and version 4.
2. Map the features of NFS version 2 and NFS version 3 onto NFS version 4 and vice versa.
3. Describe the behavior that one could expect from this feature mapping.
4. Describe the mapping of NLM (Network Lock Manager) to NFS version 4.
5. Describe the effect of NFS version 4 on MOUNT protocol version 2 and version 3.
6. Describe the effect of NFS version 4 on the local filesystem.

This document does not cover issues related to interoperability with other protocols such as CIFS.
This document is structured so that section 2 briefly lists the new features in NFS version 4, or the features that we found to differ from NFS version 2 and version 3. Section 3 expands on section 2 and gives the details of the interoperability behavior.

The document contains some definitions of procedures and structures from NFS version 2, 3 and 4. These definitions have been taken "as is" from the RFCs, for clarity and ease of explanation.

This document does not discuss implementation-specific issues; these are left to the implementor.

2. New Features supported by NFS version 4

This section briefly describes the new features. In some places the complexity of the problem is also mentioned.

2.1 Mount

In NFS version 2 and version 3 the mount protocol works as follows. The root file handle is obtained through the MNTPROC_MNT and MNTPROC3_MNT procedures of the mount RPC. The mount protocol works on an ephemeral port and relies on the rpcbind daemon for the correct transport address. This causes problems for mounting when the NFS server is behind a firewall. To overcome such problems, NFS version 4 has integrated the mount function into the protocol itself. The mount program and its procedures have been removed entirely. The effect of the changes in the mount protocol is discussed in section 3.1.

2.2 Stateful protocol

In NFS version 4 a stateid field has been introduced in many of the operations. Unlike NFS version 4, the previous versions of NFS were stateless, and most of the operations in NFS version 2 and version 3 were idempotent. The detailed issues are given in section 3.2.

2.3 Open and close

These two new operations are defined in the NFS version 4 protocol.
They are the main source of statefulness in NFS version 4. The OPEN operation also initiates delegation. These procedures are absent in the earlier versions of NFS. The details are covered in section 3.5.

2.4 Migration

NFS version 4 supports migration of a filesystem for load balancing. This feature is not present in the previous versions of the NFS protocol. The detailed issues and suggested behavior are given in section 3.6.

2.5 Replication

NFS version 4 supports replication of a filesystem. The replication is read-only. This feature is absent in the earlier versions of NFS. The issues are described in detail in section 3.7.

2.6 Delegation and callbacks

File, attribute, and directory caching in the NFS version 4 protocol is similar to previous versions. The major addition in NFS version 4 is the ability of the server to delegate caching. Delegation of read and write is not present in NFS version 2 and version 3, which offer only weak cache consistency. This draft also tries to cover the issues that may arise in conjunction with a physical file system. The details are covered in section 3.8.

2.7 Callback RPCs

A new set of RPCs has been introduced in NFS version 4: the callback RPCs. These RPCs are sent from the server to the client. They are generally used for recalling a delegation.

2.8 Locking

In NFS version 4, locking has been made a part of the protocol. In earlier versions locking was handled by NLM, which is why the earlier versions of NFS themselves were stateless. These issues are addressed in detail in section 3.9.

2.9 Volatile File Handles

With volatile file handles, the server may determine at many different points in time that a filehandle is no longer valid.
When the server determines that a volatile filehandle is no longer valid, it should return NFS4ERR_FHEXPIRED. This concept is not present in the earlier versions of NFS.

2.10 Mount point Crossing

In NFS version 4 the NFS client is able to determine that it has crossed a server mount point from a change in the value of the "fsid" attribute. In NFS version 2 and version 3 the physical directory is always accessed. The interoperability issues are addressed in section 3.1.

2.11 Crash Recovery

In the previous versions of NFS crash recovery was very simple: no state information had to be maintained on the server side. If the server crashed, the client simply had to resend the request to the server to get the data. Because NFS version 4 is stateful, recovery has become more complicated.

2.12 ACL support

NFS version 4 supports access control lists. These access control lists are a superset of POSIX ACLs. There are around 17 different access permissions (ACE mask bits) defined in NFS version 4. This feature is not present in the earlier versions of NFS. The effect of this feature is described in detail in section 3.11.

2.13 Minor Versioning

To address the requirement of an NFS protocol that can evolve as the need arises, the NFS version 4 protocol contains the rules and framework to allow for future minor changes or versioning. It is described in detail in section 3.12.

2.14 Internationalization

NFS version 4 supports character set encodings such as UTF-8, whereas NFS version 2 and version 3 support only strings of bytes. The issues related to internationalization are covered in section 3.13.

3. Details of interoperability with previous version of NFS and local file system

Interoperability can be achieved in two ways:

1. Stacking NFS version 2 and version 3 on top of NFS version 4.
2. Implementing a common layer with which all the NFS protocols integrate.

Stacking NFS version 2 and 3 on top of NFS version 4:

In this case, when the NFS version 2 or version 3 server needs to access some data, the request is forwarded to the NFS version 4 server. The NFS version 4 server processes the request and hands the result to the respective server, which returns it to the client. In effect, the NFS version 2 and 3 servers become clients of the NFS version 4 server. This layer talks NFS version 4 below and NFS version 2 or version 3 to the clients. In this way the data and metadata being accessed can be kept consistent. Indirectly, NFS version 2 and version 3 will behave like NFS version 4. This approach can be good in some cases where minor versions are in use.

This approach has the following drawbacks:

1. NFS version 2 and 3 become dependent on the implementation of NFS version 4.
2. If the NFS version 4 server becomes unstable, NFS version 2 and 3 also become unstable.
3. If a file is accessed directly at the lower level, it becomes difficult to propagate the information to the higher level. For example, an access made directly by an NFS version 4 client to the NFS version 4 server cannot be conveyed to NFS version 2 and version 3. This would require two-way communication between NFS version 4 and the previous versions of NFS.
4. Performance degrades for NFS version 2 and version 3.
5. Some of the semantics of NFS version 2 and version 3 are violated, such as mount point crossing, locking and access control.
6. In fact, this would extend the NFS version 2 and 3 protocols with migration and replication.

Implementing a common layer:

In this approach the implementation defines a layer shared by all the versions of NFS. This layer should know all the features up to the latest version of NFS. Here the latest version of NFS can be NFS version 4 or a minor version of NFS version 4.
Hence this layer should be aware of all the features of all the versions of NFS installed on that server.

This approach is preferable because:

1. The working of NFS version 2 and 3 is not affected, because the protocol is not directly mapped to NFS version 4.
2. There is no performance degradation.
3. Semantics can remain consistent.

This approach has the following drawbacks:

1. It may not work with the minor versions of NFS version 4.
2. The layer can be very complicated to implement, as it has to keep track of the source of each request without violating the semantics of the corresponding protocol.

3.1 Mount changes and its effect to earlier versions of NFS

This section compares the different versions of the NFS protocol.

The mount program in NFS version 2 is defined as follows:

      /* Protocol description for the mount program */
      program MOUNTPROG {
          /*
           * Version 1 of the mount protocol used with
           * version 2 of the NFS protocol.
           */
          version MOUNTVERS {
              void       MOUNTPROC_NULL(void)    = 0;
              fhstatus   MOUNTPROC_MNT(dirpath)  = 1;
              mountlist  MOUNTPROC_DUMP(void)    = 2;
              void       MOUNTPROC_UMNT(dirpath) = 3;
              void       MOUNTPROC_UMNTALL(void) = 4;
              exportlist MOUNTPROC_EXPORT(void)  = 5;
          } = 1;
      } = 100005;

The mount program in NFS version 3 is defined as follows:

      program MOUNT_PROGRAM {
          /* Version 3 of the mount protocol */
          version MOUNT_V3 {
              void      MOUNTPROC3_NULL(void)    = 0;
              mountres3 MOUNTPROC3_MNT(dirpath)  = 1;
              mountlist MOUNTPROC3_DUMP(void)    = 2;
              void      MOUNTPROC3_UMNT(dirpath) = 3;
              void      MOUNTPROC3_UMNTALL(void) = 4;
              exports   MOUNTPROC3_EXPORT(void)  = 5;
          } = 3;
      } = 100005;

The functionality of the RPCs is the same. Both versions have the same number of procedures and perform the same operations.
In NFS version 4 mount is not a separate program; it is part of NFS itself. The NFS version 4 program is defined as:

      /*
       * Remote file service routines
       */
      program NFS4_PROGRAM {
          version NFS_V4 {
              void         NFSPROC4_NULL(void)              = 0;
              COMPOUND4res NFSPROC4_COMPOUND(COMPOUND4args) = 1;
          } = 4;
      } = 100003;

NFS version 4 also includes a program for callbacks, NFS4_CALLBACK. Its importance is covered in the later sections that discuss delegation in detail.

The NULL RPC is used to test the response from the server. This RPC is present in all the versions of NFS.

In NFS version 2 and version 3 the MNT RPC is used when mounting the NFS file system. It adds a mount entry. On success the server returns the filehandle of the dirpath. The mount program uses the binding protocol, and because of this it cannot work across firewalls. A further problem is that if the server exports one more directory, there is no way for the client to start accessing that exported directory automatically. The client has to list the exports and then mount that directory on a mount point. In NFS version 4 the operation used for mounting (PUTROOTFH) is now a part of the protocol. The ROOT filehandle is the "conceptual" root of the filesystem name space at the NFS server.

The third procedure is the DUMP procedure. In NFS version 2 and version 3 this procedure returns the list of remotely mounted file systems. This procedure is not present in NFS version 4.

In the earlier versions of NFS the UMNT and UMNTALL RPCs are used to remove an entry previously added by MNT. These procedures are absent in the NFS version 4 protocol. The NFS namespace has been changed completely in the NFS version 4 protocol.

The EXPORT procedure is used to list the exported directories in NFS version 2 and version 3.
This procedure is absent in NFS version 4, where the same result is obtained by a READDIR on the root.

In NFS version 4 the namespace is exposed to the client using an export feature. Once the client gets the root file handle, it can traverse the exported directories and the directories within them. The exported directories are traversed using a multi-component LOOKUP operation. This solves both problems: crossing firewalls and dynamically updating the exported directories on the NFS server.

In some implementations the export configuration file is shared with the previous versions; in others it is kept separate. First, let us take the case where the configuration file is the same for all three versions of NFS.

Case 1: The configuration file has valid exported directories

      /a    disk1    (exported)
      /b    disk2    (exported)

If the configuration for NFS version 4 is kept the same as the configuration file of NFS version 2 or version 3, then for NFS version 2 and version 3 the server should return the exported directories. In the above example it should return:

      /a    (and its options)
      /b    (and its options)

In the case of NFS version 4, when the user browses the directories, the user should see disk1 and disk2 as two directories in the mounted directory. The user should not see the directories "a" and "b".

Case 2: The configuration file has '/' as placeholder

      /           (place holder/not exported)
      /a/b        (filesystem 1)
      /a/b/c/d    (filesystem 2)

The above is a case of mount point crossing. Mount point crossing is supported in NFS version 4, but not in NFS version 2 and version 3. In NFS version 2 and version 3 the server would respond with the filehandle of the directory "/a/b/c/d" within the filesystem "/a/b". The above case also has a placeholder.
The NFS version 2 and version 3 server will need to be made aware that '/' is not exported and is just a placeholder; otherwise there will be a security flaw. In this case, when the client asks for the exported directories, the NFS version 2 and version 3 server should return:

      /a/b        (filesystem 1)
      /a/b/c/d    (directory in /a/b)

It should not return '/' if '/' is intended to be just a placeholder. For this, the code that reads the configuration file will need to be modified.

Case 3: The configuration has a placeholder directory which is used to bridge the gap between two exported directories.

      /       disk1    (exported)
      /a      disk2    (not exported)
      /a/b    disk3    (exported)

NFS version 2 and version 3 have no concept of a placeholder. The NFS version 2 and version 3 server will need to be made aware that /a is not exported and is just a placeholder; otherwise there will be a security flaw. In this case, when the client asks for the exported directories, the NFS version 2 and version 3 server should return:

      /       (filesystem 1)
      /a/b    (exported)

It should not return /a if /a is intended to be just a placeholder. For this, the code that reads the configuration file will need to be modified.

Now let us consider the other case, where the configuration file for NFS version 4 is different from that of the earlier versions of NFS. In this case the directories exported for NFS version 2 and 3 can differ from the directories exported for NFS version 4. The drawbacks are that the exported directories can differ between NFS versions and that the effort of maintaining the exports increases. Overall, however, keeping a separate configuration file for NFS version 4 can reduce some of the problems mentioned above.

It should also be noted that the client should get the exported directories for the NFS version 4 protocol by doing a READDIR on the root filehandle.
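As an illustration only, the placeholder handling in Case 2 and Case 3 above can be sketched in Python. The `ExportEntry` layout, the `exported` flag and the function names `mount_export_list` and `may_mount` are hypothetical and are not defined by any of the protocols. The "nearest configured ancestor decides" policy in `may_mount` is likewise an assumption of this sketch, chosen so that a placeholder such as /a is never handed out by MNT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportEntry:
    path: str       # absolute path as written in the shared configuration file
    exported: bool  # False marks a placeholder that must not be offered to v2/v3


def mount_export_list(entries):
    """Paths that a MOUNT EXPORT reply (NFS version 2/3) would list."""
    return [e.path for e in entries if e.exported]


def _covers(prefix, path):
    """True if path is prefix itself or lies below it."""
    return prefix == "/" or path == prefix or path.startswith(prefix + "/")


def may_mount(entries, dirpath):
    """Assumed policy: the nearest configured ancestor decides a MNT request."""
    covering = [e for e in entries if _covers(e.path, dirpath)]
    if not covering:
        return False
    return max(covering, key=lambda e: len(e.path)).exported


# Case 2: '/' is only a placeholder.
CASE_2 = [ExportEntry("/", False), ExportEntry("/a/b", True),
          ExportEntry("/a/b/c/d", True)]
# Case 3: '/a' bridges the gap between '/' and '/a/b'.
CASE_3 = [ExportEntry("/", True), ExportEntry("/a", False),
          ExportEntry("/a/b", True)]

print(mount_export_list(CASE_2))   # ['/a/b', '/a/b/c/d']
print(mount_export_list(CASE_3))   # ['/', '/a/b']
print(may_mount(CASE_3, "/a"), may_mount(CASE_3, "/a/b"))   # False True
```

The sketch deliberately leaves out the NFS version 4 view of the namespace; it only shows how the code that reads a shared configuration file might keep placeholders out of the version 2 and version 3 replies.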
The client should not use the NFS version 3 mount protocol to browse the exported directories and assume that the directories are the same for the NFS version 4 protocol.

3.2 Stateful protocol and its effect to earlier versions

This section focuses on the state of the file that is being used. Other features, such as locking, affect the state of the file, but this section concentrates on the statefulness of the protocol rather than on the semantics of the features that affect the state.

NFS version 4 is a stateful protocol. NFS version 2 and version 3 were stateless protocols. In NFS version 4 the stateid is passed between client and server with the stateful operations. The stateid uniquely defines the open and locking state provided by the server. The stateid is generated when the client sends an OPEN. The server is responsible for generating the sequence number and the stateid.

The important question for this section is whether, given that NFS version 2 and version 3 are stateless and keep their state in the NLM protocol, the NLM state affects the state of NFS version 4. The stateful operations are file or record locking and remote execution. In NFS version 4 these operations are present in the protocol itself.

The state of a file is affected by the following:

3.2.1 COMMIT and stable writes

This section covers the effect of the NFS version 3 COMMIT on the state information of NFS version 4.

COMMIT is called when writes are asynchronous. It directly affects a file that is open through the NFS version 3 protocol while the NFS version 4 server has delegated the same file for writing. The commit will also have to sync the buffers held on behalf of NFS version 4. This should eventually result in a recall of the delegation, which changes the state of the file. This situation also applies to the local filesystem.
The same has to be done if a process on the server calls sync or fsync.

3.2.2 OPEN and CLOSE operations and state of the file

This section concentrates on the OPEN and CLOSE operations and their effect on the state of the file. The details of OPEN and CLOSE are covered in section 3.5.

When a file is opened, the state of the file is set. The previous versions of NFS had no need for open and close operations, because the protocol was stateless. Rather, the statefulness was split out into a separate protocol (NLM). The OPEN call is absent in the earlier versions of the NFS protocol.

NFS version 2 or 3 affects the state of an open file when one client opens the file using NFS version 4, the server grants a delegation, and another client then issues an NFS version 3 read or write. This results in a recall of the delegation.

If a file is already being read or written through NFS version 2 or version 3 and another client opens it using NFS version 4, the state of NFS version 4 is not affected, as the data is in a consistent state for both protocols.

The CLOSE operation in NFS version 4 should also be visible to a file locked through the NLM of NFS version 3. If, after the file is closed on NFS version 4, an NFS version 3 client wants to upgrade a lock, that should be allowed. The locked state should be updated to the unlocked state.

When a process on the server calls the open or close system call on the same file, the open is handled according to the physical filesystem. For close, the locks on that file have to be released, and NFS version 4 needs to be made aware that the local filesystem has released the locks.

3.2.3 Client and Server Crash

This section covers the effect of client and server crashes on the state of the file.
A server crash can be of two types: a machine reboot or a daemon crash.

3.2.3.1 Client crash

When an NFS version 3 client crashes, the state of a file is affected only in the NLM. Consider a case where a file is accessed through both NFS version 3 and NFS version 4. The NLM will have to be modified to communicate with NFS version 4 and report the state of a file that was locked by an NFS version 3 client. The same applies in the other direction: the NFS version 4 server should be aware of the NLM, and the two should share a single copy of the state of the file. If an NFS version 4 client crashes, the NFS version 4 server will take care of recovering the locks and so on, but the NLM module of NFS version 3 will also have to be told about it.

3.2.3.2 Server crash

For NFS version 3, a server crash affects only the NLM. NFS version 3 itself is stateless, so the working of the NFS server is not affected. NFS version 4 will need to know the state of the files locked through the NLM.

If the NFS version 4 server crashes, the only component of NFS version 3 that is affected is the NLM. The NLM should be informed that the NFS version 4 server has crashed. In this case the locks should be recovered on the NFS version 4 server. The locked state could also be recovered in this case.

3.2.4 Lock Manager Protocol

The lock manager protocol, in both version 3 and version 4, is a stateful protocol. NFS version 4 is also a stateful protocol. This section covers how the NLM and NFS version 4 affect the state of a file. The semantics of how the NLM could be made to work with NFS version 4 are covered in section 3.9.

The NLM affects the state of an open file. When a file is opened on NFS version 4 and a lock is acquired on it (shared or exclusive), the NLM should be made aware of the state of the file. Similarly, when an NFS version 2 or 3 client locks a file, NFS version 4 should be made aware of its state.
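A minimal sketch of the "single copy of the state" idea follows. It assumes a hypothetical in-memory table, `SharedLockTable`, that both the NLM service and the NFS version 4 server consult; the class, its method names and the owner tuples are illustrative only and are not part of either protocol. Byte ranges are half-open, and blocking requests, leases and crash recovery are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lock:
    owner: tuple     # (protocol, client), e.g. ("nlm", "hostB") or ("nfs4", "c1")
    start: int       # first byte covered
    end: int         # one past the last byte covered
    exclusive: bool  # True for a write lock, False for a read lock


class SharedLockTable:
    """One lock table used by both the NLM and the NFS version 4 code paths."""

    def __init__(self):
        self._locks = {}  # file id -> list of Lock

    def conflicts(self, file_id, want):
        """Locks of other owners that overlap `want` and are incompatible."""
        return [held for held in self._locks.get(file_id, [])
                if held.owner != want.owner
                and held.start < want.end and want.start < held.end
                and (held.exclusive or want.exclusive)]

    def acquire(self, file_id, want):
        """Record the lock and return True, or return False on conflict."""
        if self.conflicts(file_id, want):
            return False
        self._locks.setdefault(file_id, []).append(want)
        return True

    def release_owner(self, file_id, owner):
        """Drop every lock that `owner` holds on the file."""
        self._locks[file_id] = [l for l in self._locks.get(file_id, [])
                                if l.owner != owner]


table = SharedLockTable()
print(table.acquire("fileA", Lock(("nfs4", "c1"), 0, 100, True)))     # True
print(table.acquire("fileA", Lock(("nlm", "hostB"), 50, 60, False)))  # False
table.release_owner("fileA", ("nfs4", "c1"))
print(table.acquire("fileA", Lock(("nlm", "hostB"), 50, 60, False)))  # True
```

In a design along these lines, the handler for an NLM lock request and the handler for the NFS version 4 LOCK operation would both go through `acquire`, so that a lock taken through one protocol is visible to the other.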
The NLM procedures that affect this state are:

      NLMPROC4_TEST, NLMPROC4_LOCK, NLMPROC4_CANCEL, NLMPROC4_UNLOCK,
      NLMPROC4_GRANTED, NLMPROC4_TEST_MSG, NLMPROC4_LOCK_MSG,
      NLMPROC4_CANCEL_MSG, NLMPROC4_UNLOCK_MSG, NLMPROC4_GRANTED_MSG,
      NLMPROC4_TEST_RES, NLMPROC4_LOCK_RES, NLMPROC4_CANCEL_RES,
      NLMPROC4_UNLOCK_RES, NLMPROC4_GRANTED_RES, NLMPROC4_SHARE,
      NLMPROC4_UNSHARE, NLMPROC4_NM_LOCK, NLMPROC4_FREE_ALL.

All these procedures will have to update the state that is understood by the NFS version 4 operations LOCK, LOCKT, LOCKU, RELEASE_LOCKOWNER, OPEN, OPEN_DOWNGRADE and CLOSE, and vice versa.

3.2.5 Delegation

This section covers the effect of delegation on the state of an open file when it is accessed through different versions of NFS.

When an NFS version 4 client calls OPEN and no other client or process is using that file, the file can be delegated for read or write. This effectively means that the client can cache the file locally and flush the data either when the delegation is recalled or periodically. Under delegation the state of the file is generally changed to "delegated for reading" or "delegated for writing".

NFS version 3 can affect the delegation state. This generally happens when the file is delegated for writing and an NFS version 3 client wants to read the file. In this case the state of the file needs to be modified. Similarly, when the file is delegated for reading and an NFS version 3 client tries to write to the file, the state of the file needs to be modified.

An NLM request from an NFS version 2 or version 3 client also affects the delegation state.

3.2.6 SETATTR

This section covers the effect of SETATTR on the state of an open file. The NFS version 3 SETATTR affects only a file that is delegated for reading or writing. Calling SETATTR will have to change the state of the file. The same is true for the physical filesystem.
A chmod or chown system call on a UNIX platform will have to affect the state of an open file on NFS version 4.

3.2.7 READ and WRITE

READ and WRITE in NFS version 4 require that the file be opened, i.e. OPEN and OPEN_CONFIRM should be called before READ and WRITE. In NFS version 3 READ and WRITE are idempotent operations.

NFS version 4 will not affect NFS version 2 and 3, but NFS version 2 and 3 will affect a file opened with a read or write delegation using NFS version 4. The NFS version 3 READ and WRITE procedures will have to be modified to communicate with the NFS version 4 server and change the state of the file.

This might not be necessary for operations done on the physical filesystem, because a physical filesystem can use semantics similar to NFS version 4, and the state can be changed in the open system call of the operating system.

3.3 File handle creation, mapping and file identity

A file handle in NFS version 4 is a variable-length opaque of up to 128 bytes. It is defined as:

      const NFS4_FHSIZE = 128;
      typedef opaque nfs_fh4<NFS4_FHSIZE>;

A file handle in NFS version 3 is a variable-length opaque of up to 64 bytes and is defined as:

      const NFS3_FHSIZE = 64;
      struct nfs_fh3 {
          opaque data<NFS3_FHSIZE>;
      };

In NFS version 2 the file handle has a fixed size of 32 bytes and is defined as:

      const FHSIZE = 32;
      typedef opaque fhandle[FHSIZE];

File handle creation should happen at a single point and should generate a unique file handle. The file handles used by the different NFS versions can differ because of the difference in sizes. But if they point to the same file then:

o  A client of a given version should get the same file handle for the same file every time.

o  If two different clients using two different protocols perform an operation using file handles that differ because of the difference in size but point to the same file, then the operation should be done on the same file.
o An NFS server sharing a file using NFS version 2, 3 and 4 should generate persistent filehandles for NFS version 2 and 3; otherwise there will be inconsistencies between the file handles that refer to the same file.

o All the other rules specified by the respective protocols hold true.

3.4 Duplicate Request Cache

Most NFS version 3 server implementations use a cache of recent requests for the processing of duplicate non-idempotent requests. In NFS version 2 and version 3 the non-idempotent operations include: CREATE, MKDIR, MKNOD, SYMLINK, REMOVE, RMDIR, RENAME and LINK.

In NFS version 2 and version 3 the idempotent operations include: NULL, GETATTR, SETATTR, LOOKUP, ACCESS, READLINK, READ, WRITE, READDIR, READDIRPLUS, FSSTAT, FSINFO, PATHCONF and COMMIT.

For the non-idempotent operations the server has to keep a cache of the last operation, which effectively means that the server has to maintain the state of the last non-idempotent operation.

In NFS version 4 the non-idempotent operations are: CLOSE, CREATE, LINK, LOCK, OPEN, OPENATTR (the open operations change the state of the server), REMOVE, RENAME and RELEASE_LOCKOWNER.

The idempotent operations are: ACCESS, COMMIT, DELEGPURGE, DELEGRETURN, GETATTR, GETFH, LOCKT, LOCKU, LOOKUP, LOOKUPP, NVERIFY, PUTFH, PUTPUBFH, PUTROOTFH, READ, READDIR, READLINK, RENEW, RESTOREFH, SAVEFH, SECINFO, SETATTR, SETCLIENTID, SETCLIENTID_CONFIRM, OPEN_CONFIRM, VERIFY, WRITE and ILLEGAL.

The more important operations for us are the non-idempotent ones, most of which are common to NFS version 4 and the earlier versions of NFS. The cache should be updated whenever any of the non-idempotent operations affects the same file or directory. Because it is only a duplicate request cache, each NFS version implementation can maintain its own cache.

3.5 OPEN and CLOSE

OPEN and CLOSE are two operations introduced in the NFS version 4 protocol.
These operations were absent in NFS version 2 and version 3. They provide a single point where file lookup, creation, validation, sharing, delegation etc. can be combined. Most of the procedures in NFS version 2 and version 3 are idempotent, and state is maintained for locking only. Locking in NFS version 2 and version 3 is done using a separate protocol (NLM).

If an NFS version 2 or version 3 client is continuously accessing a file and an NFS version 4 client opens the same file, the version 4 client may be given a delegation. When the NFS version 2 or version 3 client issues its next request on the file, the server will have to recall the delegation immediately, before the NFS version 4 client has been able to take advantage of it. This is an overhead for NFS version 4. It can be reduced in the following ways:

1. Keep a cache of recently executed operations and their corresponding protocols. The cache is looked up before a delegation is granted to an NFS version 4 client.

2. Use a heuristic to dynamically judge the request pattern of NFS version 2 and version 3 clients.

The NFS version 4 server should also be made aware of the open and close system calls made on the local filesystem.

share_access and share_deny:

These are used to control share reservations on a file. share_access states the access that the opening client wants for itself, and is checked against the deny modes of existing opens.

OPEN4_SHARE_ACCESS_READ: the client opens the file for reading.

OPEN4_SHARE_ACCESS_WRITE: the client opens the file for writing.

OPEN4_SHARE_ACCESS_BOTH: the client opens the file for both reading and writing.

share_deny states the access that the opening client wants denied to other clients or processes while it has the file open.

OPEN4_SHARE_DENY_NONE: other processes can open this file for reading and writing.
OPEN4_SHARE_DENY_READ: other processes can open this file only for writing.

OPEN4_SHARE_DENY_WRITE: other processes can open this file only for reading.

OPEN4_SHARE_DENY_BOTH: other processes cannot open this file for reading or writing.

In the above cases the other process can also be an NFS version 2 or version 3 server. Share access and share deny modes affect NFS version 2 and version 3 because these protocols have no single point, such as an open call, where the modes can be checked. Share reservations control access to the complete file. The NFS version 2 and version 3 operations that are affected are READ and WRITE, which have to be made aware of the share modes set through NFS version 4. To keep the behavior consistent in case of a conflict, the READ and WRITE of NFS version 2 and version 3 should return NFSERR_ACCES and NFS3ERR_ACCES respectively. (MNT3ERR_ACCES belongs to the MOUNT protocol and cannot be returned by READ or WRITE.) Share reservations also affect the locking of NFS version 2 and version 3; the details are covered in section 3.9.

In case of the local file system, the open system call should check the existing share and deny modes of a file that is open through NFS version 4.

Create modes and open type in NFS version 4:

Create modes and open types in NFS version 4 only affect a file opened by a process on the server itself. They do not affect NFS version 2 and version 3, because those versions have no OPEN operation. They do have a CREATE operation for creating a file, but the existing semantics of CREATE already cover the case where the file is being created through NFS version 4.

Another important function performed by OPEN is delegation. A delegation is given to an NFS version 4 client for reads or writes to a file, and it is recalled when another client tries to access the same file. Delegations are granted in the OPEN operation. The detailed issues related to delegation are handled in section 3.8.

OPEN and CLOSE can be thought of as non-idempotent operations, because they change the state of the server.
A repeated OPEN or CLOSE will not always return success.

Last Close Problem:

Because NFS version 2 and version 3 are stateless, a file that is open through NFS version 4 can be deleted or renamed by an NFS version 2 or version 3 client. This causes the last close problem. In NFS version 3 this is typically solved by renaming the file to a name beginning with .nfs. Even if the file is deleted through NFS version 2 or version 3, it remains open for NFS version 4 and is deleted after the NFS version 4 client closes it. Rename requests can be allowed in the same way. Delete on close is generally supported by the underlying operating system. If the operating system does not support it, the delete request from NFS version 2 or version 3 can be reported as successful, and the NFS version 2 and version 3 server can ask the NFS version 4 server to delete the file on close.

3.6 Filesystem Migration and expected behavior on earlier versions of NFS and local file system.

In NFS version 4, filesystem migration is achieved using the recommended attribute "fs_locations". If a filesystem has been migrated and an NFS version 4 client tries to access a file on it, NFS4ERR_MOVED is returned. Using the fs_locations attribute the client can then access the file at the new location. In case of migration the complete state of the server for that filesystem is migrated to the new server. All persistent file handles remain valid on the new server, though the server can choose to use volatile filehandles.

The earlier versions of NFS have no concept of migration. There are two ways in which this problem can be solved for NFS version 2 and version 3.

The first way is for NFS version 2 and version 3 to assume that the filesystem has been migrated and no longer exists on this server. The NFS version 2 and version 3 server then no longer knows anything about the files.
Therefore, for all operations affecting the migrated files, the NFS version 2 and version 3 servers will return the errors NFSERR_NOENT and NFS3ERR_NOENT respectively. During migration, the migration utility should take the NFS version 2 and version 3 traffic into account.

The second way of solving this problem is to make the NFS version 2 and version 3 server aware of migration in NFS version 4. This brings NFS version 2 and version 3 closer to NFS version 4. The NFS version 2 and version 3 servers can communicate with the server to which the filesystem was migrated, fetch the data and return it to their clients. This adds a lot of complexity to the NFS version 2 and version 3 server.

3.7 Replication and its expected behavior on earlier versions of NFS and local file system.

Typically, the filesystem will be replicated on two or more servers. The fs_locations attribute provides the list of these locations to the client. Replication is totally absent from NFS version 2 and version 3. An NFS version 4 client can switch to a replica when the primary server becomes unresponsive. This cannot be done in NFS version 2 and version 3. Hence, when the server becomes unresponsive, NFS version 2 and version 3 clients will wait until they time out or until the server is back again.

3.8 Issues of Delegation with respect to the earlier versions of NFS

NFS version 4 does not guarantee strict, distributed cache coherency. However, NFS version 4 provides aggressive caching using delegation. Delegations are granted for reading and for writing.

3.8.1 Delegation recall.

A delegation is initiated in the OPEN call. The delegation is returned as an open_delegation4 value in OPEN4resok. It can be of type OPEN_DELEGATE_NONE, OPEN_DELEGATE_READ or OPEN_DELEGATE_WRITE.
OPEN_DELEGATE_READ: the client is assured that no other client has write access to the file for the duration of the delegation.

OPEN_DELEGATE_WRITE: the client is assured that no other client has read or write access to the file.

If another client tries to violate the delegation, for example by opening the file for writing, the delegation is recalled using the CB_RECALL operation. The problem arises when an NFS version 2 or version 3 client is also accessing the same file. NFS version 2 and version 3 have no call for opening a file, so in this case the delegation has to be recalled from the READ or WRITE operation of NFS version 2 or version 3.

The reads and writes of NFS version 2 and version 3 should be treated as the functional equivalents of a corresponding type of OPEN. That is, they should be handled like the NFS version 4 READs and WRITEs that use the special stateids consisting of all zero bits or all one bits. Such a READ or WRITE will therefore force the server to recall a write open delegation, and a WRITE of NFS version 2 or version 3 done by another client will force a recall of read open delegations.

3.8.2 Operations that can recall the delegation.

The operations that can recall a read delegation are:

1. OPEN by a process on the server in write mode.
2. WRITE by an NFS version 2 or version 3 client or by a process on the server.
3. SETATTR
4. LOCK - an NLM request or a lock request by a process on the server for writing.
5. REMOVE
6. COMMIT

The operations that can recall a write delegation are:

1. OPEN by a process on the server for reading or writing.
2. READ or WRITE by an NFS version 2 or version 3 client or by a process on the server.
3. SETATTR
4. LOCK - an NLM request or a lock request by a process on the server for reading or writing.
5. REMOVE
6. COMMIT

3.8.3 Delegation recovery.
There are three cases in delegation recovery, as described in the NFS version 4 specification, RFC 3530:

1. Client reboot or restart
2. Server reboot or restart
3. Network partition

Delegation recovery affects the NFS version 2 and version 3 server in the following ways:

3.8.3.1 Client Reboot

In case of a client reboot, the delegation state is still present on the server. After the reboot, the client has to set up its delegations with the server again. If in the meantime an NFS version 2 or version 3 client issues a read or write request that conflicts with the delegation, the NFS version 2 and version 3 server should be aware that the file is already delegated, and the delegation will have to be recalled from the NFS version 4 client or revoked, depending on how long that client has been unresponsive. A similar case has to be handled when a process on the server tries to open a file that was delegated to a client that has rebooted or restarted. In this case the NFS version 4 server will need to handle CLAIM_DELEGATE_PREV. Note that here it is not another NFS version 4 client that is accessing the file, but an NFS version 2 or version 3 client or a process on the server itself.

3.8.3.2 Server Reboot

In case of a server reboot or restart, the delegation information is lost, and the client is responsible for reclaiming the delegation. If the delegation had already been recalled, the question of recalling it again does not arise. If a request from an NFS version 2 or version 3 client is pending at the time the NFS version 4 client tries to reclaim the delegation, the server should deny the delegation to that client.
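The recall lists of section 3.8.2 and the reclaim rule of section 3.8.3.2 can be sketched as a small lookup. This is only an illustration, not part of any protocol: the request labels (OPEN_WRITE, LOCK_READ and so on), the names must_recall and may_reclaim, and the reading that only a conflicting pending request blocks a reclaim are all assumptions made for this sketch.

```python
from enum import Enum


class Delegation(Enum):
    NONE = "OPEN_DELEGATE_NONE"
    READ = "OPEN_DELEGATE_READ"
    WRITE = "OPEN_DELEGATE_WRITE"


# Labels for requests arriving from NFS version 2/3 clients, NLM, or local
# processes. They are invented for this sketch and are not protocol names.
_COMMON = {"SETATTR", "REMOVE", "COMMIT"}

# Requests that force a recall, per delegation type (section 3.8.2).
RECALL_TRIGGERS = {
    Delegation.NONE: frozenset(),
    Delegation.READ: frozenset(_COMMON | {"OPEN_WRITE", "WRITE", "LOCK_WRITE"}),
    Delegation.WRITE: frozenset(
        _COMMON
        | {"OPEN_READ", "OPEN_WRITE", "READ", "WRITE", "LOCK_READ", "LOCK_WRITE"}
    ),
}


def must_recall(delegation, request):
    """Return True if the request conflicts with the outstanding delegation."""
    return request in RECALL_TRIGGERS[delegation]


def may_reclaim(delegation, pending_requests):
    """After a server reboot, allow a reclaim only if no pending request
    conflicts with it (assumed reading of section 3.8.3.2)."""
    return not any(must_recall(delegation, r) for r in pending_requests)
```

For example, must_recall(Delegation.READ, "READ") is False, because other readers may coexist with a read delegation, while may_reclaim(Delegation.WRITE, ["READ"]) is False, because a pending version 3 READ conflicts with a write delegation.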
3.8.3.3 Network Partition

If an NFS version 4 client is cut off by a network partition and an NFS version 2 or version 3 client, or a process on the same server, tries to access the same file, the NFS version 4 server should be informed. The NFS version 4 server should treat this as a request from another NFS version 4 client. The behavior of the NFS version 4 server does not change in this case; it only needs to be aware that NFS version 2, NFS version 3 or a process on the server needs to access the file.

3.8.4 Data Caching and Mandatory File Locking.

This is the case where the file is marked for mandatory file locking, is locked, and is also cached. In this case the NFS version 2 and version 3 server should be made aware of the NFS4ERR_LOCKED error code, and this error should be translated into NFS3ERR_ACCES, i.e. access is denied.

3.8.5 Read Delegation

In NFS version 4, when a file is opened for reading, a read delegation can be given to the client. A read delegation guarantees that no other client can write to the file until the delegation is recalled or revoked, but other clients can still read from the file. This should also hold for clients that read the file using NFS version 2 or version 3 and for a process on the server that reads the file. Otherwise the delegation should be recalled.

3.8.6 Write Delegation

A write delegation guarantees that no other client can write to the file or read from it until the delegation is recalled or revoked. This should also hold for clients that write to or read from the file using NFS version 2 or version 3 and for a process on the server. Otherwise the delegation should be recalled.

3.9 NFS version 4 Locking and interoperability with NLM.

In NFS version 4, locking is implemented by the server and can be mandatory or advisory.
In fact it is up to the server to determine whether to use mandatory or advisory locking. NFS version 4 clients should handle the NFS4ERR_LOCKED error code.

In NFS version 2 and version 3, statefulness is moved into a separate protocol (NLM). NLM has the following procedures, as given in RFC 1813:

   version NLM4_VERS {
      void          NLMPROC4_NULL(void)                  = 0;
      nlm4_testres  NLMPROC4_TEST(nlm4_testargs)         = 1;
      nlm4_res      NLMPROC4_LOCK(nlm4_lockargs)         = 2;
      nlm4_res      NLMPROC4_CANCEL(nlm4_cancargs)       = 3;
      nlm4_res      NLMPROC4_UNLOCK(nlm4_unlockargs)     = 4;
      nlm4_res      NLMPROC4_GRANTED(nlm4_testargs)      = 5;
      void          NLMPROC4_TEST_MSG(nlm4_testargs)     = 6;
      void          NLMPROC4_LOCK_MSG(nlm4_lockargs)     = 7;
      void          NLMPROC4_CANCEL_MSG(nlm4_cancargs)   = 8;
      void          NLMPROC4_UNLOCK_MSG(nlm4_unlockargs) = 9;
      void          NLMPROC4_GRANTED_MSG(nlm4_testargs)  = 10;
      void          NLMPROC4_TEST_RES(nlm4_testres)      = 11;
      void          NLMPROC4_LOCK_RES(nlm4_res)          = 12;
      void          NLMPROC4_CANCEL_RES(nlm4_res)        = 13;
      void          NLMPROC4_UNLOCK_RES(nlm4_res)        = 14;
      void          NLMPROC4_GRANTED_RES(nlm4_res)       = 15;
      nlm4_shareres NLMPROC4_SHARE(nlm4_shareargs)       = 20;
      nlm4_shareres NLMPROC4_UNSHARE(nlm4_shareargs)     = 21;
      nlm4_res      NLMPROC4_NM_LOCK(nlm4_lockargs)      = 22;
      void          NLMPROC4_FREE_ALL(nlm4_notify)       = 23;
   } = 4;

We will consider the following synchronous NLM operations first:

- NLMPROC4_TEST
- NLMPROC4_LOCK
- NLMPROC4_CANCEL
- NLMPROC4_UNLOCK

The other procedures are used for asynchronous operation. NFS version 4 has no callback RPC for blocking locks to tell the client that a request was granted; the client must poll to check whether the lock has been released.

The corresponding operations in NFS version 4 are:

- LOCKT
- LOCK
- (there is no operation corresponding to NLMPROC4_CANCEL)
- LOCKU

NLM is generally implemented in a separate process for NFS version 2 and version 3. For example, FreeBSD uses rpc.lockd.
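The correspondence between the synchronous NLM procedures and the NFS version 4 operations listed above can be written as a small table. The following is a minimal sketch; the names NLM_TO_NFS4 and nfs4_equivalent are hypothetical and chosen only for this illustration.

```python
# Synchronous NLM procedures and their NFS version 4 counterparts.
# None marks a procedure with no NFS version 4 equivalent.
NLM_TO_NFS4 = {
    "NLMPROC4_TEST": "LOCKT",
    "NLMPROC4_LOCK": "LOCK",
    "NLMPROC4_CANCEL": None,
    "NLMPROC4_UNLOCK": "LOCKU",
}


def nfs4_equivalent(nlm_procedure):
    """Return the NFS version 4 operation for a synchronous NLM procedure,
    or None if there is no counterpart. Other procedures raise KeyError."""
    if nlm_procedure not in NLM_TO_NFS4:
        raise KeyError(
            f"{nlm_procedure} is not one of the synchronous procedures covered here"
        )
    return NLM_TO_NFS4[nlm_procedure]
```

A common lock layer could use such a table to translate an incoming NLM request into the same internal action that the matching NFS version 4 operation triggers. NLMPROC4_CANCEL needs separate handling because NFS version 4 clients poll instead of queueing blocked requests.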
Locks need to be handled so that all versions of NFS have a consistent view of the locks held by a particular client. These locks can be kept on the underlying filesystem, so that the proper status is returned even if a process on the server tries to lock the file. The locking semantics of NLM and of NFS version 4 should each work according to the respective protocol. The locks should be maintained at the lower layer in order to give all the protocols a consistent view.

Let us consider the special cases with locking.

3.9.1 NLM, Locking and migration

In case of migration, the migration utility should also take care of the NLM locking state of a file. There are two options.

In the first option, when the migration utility migrates files that are locked through NLM, it also migrates the lock information to the new server. In that case, however, the locks acquired by NFS version 2 and version 3 clients would never be released. This can be avoided by returning NFSERR_STALE and NFS3ERR_STALE to NFS version 2 and version 3 clients respectively, so that the clients send requests to release their locks. On an unlock request, the migration utility will have to unlock the file on the server to which it was migrated.

In the second option, the locks held by NFS version 2 and version 3 clients are not migrated. This behavior does not break NFS version 2 and version 3 semantics.

The first option requires handling cases such as a client crash, an NLM crash, or an NFS version 2 or version 3 client that is cut off by a network partition. The migration utility has to be smart enough to handle these cases, and when the client recovers or comes back the server should free the locks held by that client. Overall, the second option looks like the better choice.
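The second option amounts to filtering the lock table before migration. A minimal sketch follows, assuming a hypothetical in-memory LockRecord with a protocol tag ("nlm" or "nfs4"); neither the record layout nor the function name comes from any protocol or implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LockRecord:
    path: str
    owner: str
    protocol: str  # "nlm" or "nfs4"; labels invented for this sketch


def split_for_migration(locks):
    """Split lock records into those that travel with the migrated
    filesystem (NFS version 4 state) and those left behind (NLM locks)."""
    migrate = [lock for lock in locks if lock.protocol == "nfs4"]
    leave_behind = [lock for lock in locks if lock.protocol != "nfs4"]
    return migrate, leave_behind
```

Only the first list would be handed to the new server. The NLM locks stay with the old server and disappear with the filesystem, which matches the stateless expectations of NFS version 2 and version 3 clients.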
3.9.2 NLM, Locking and share reservations

When a client sends an NLM request for a file on which an NFS version 4 client holds a share reservation, the request can be treated like a request sent with the special stateid of all zero bits. The situation is similar to one NFS version 4 client holding a share reservation while another NFS version 4 client sends a LOCK request.

3.9.3 NLM, Locking and mandatory locking and READ and WRITE

Mandatory locking is generally enforced by the underlying filesystem. If a file is locked through NLM, NFS version 4 should know that the file is locked, and READ and WRITE should return success or failure depending on whether mandatory locking is enabled.

3.9.4 State affecting locks and effect to NFS version 2 and version 3

NFS version 4 has two types of locks: record locks and share reservations. NFS version 4 has to keep track of the state of the file. If the file is locked, the NFS version 4 server keeps the type of the lock and also the share reservation. The NFS version 4 server can also delegate the file for reading and writing, and can delegate the locks as well.

If an NFS version 2 or version 3 client tries to access such a file, three levels of checking can be done:

- Share reservations
- Mandatory locking
- Delegation checks

The operations that are affected are READ, WRITE and LOCK (NLM).

In case of share reservations, if the file is opened with OPEN4_SHARE_ACCESS_READ by NFS version 4, then NFS version 2 and version 3 should also be allowed to read the file. If the file is opened with OPEN4_SHARE_DENY_READ, then NFS version 2 and version 3 clients should not be allowed to READ or LOCK the file. If LOCK were permitted, inconsistent situations could arise, such as an NLM lock on a range of bytes that the same client is denied permission to read.
If the file is opened with OPEN4_SHARE_ACCESS_WRITE by NFS version 4, then NFS version 2 and version 3 should also be allowed to write to the file. If the file is opened with OPEN4_SHARE_DENY_WRITE, then NFS version 2 and version 3 clients should not be allowed to WRITE or LOCK the file. If the file is opened with OPEN4_SHARE_ACCESS_BOTH, then LOCK should also be permitted.

The next level of checking should be mandatory locking or locking by NFS version 4. This state should be shared with the NFS version 2 and version 3 server, and the state of the file should be the same for all versions of NFS. If there is a conflicting lock, the lock request should be denied.

The next level of checking should be delegation. If the file is delegated, the delegation should be recalled. The issues with delegation are covered in section 3.8.

3.9.5 NLM and NFS version 4 Lease expiration.

The purpose of a lease is to allow a server to remove stale locks held by a client that has crashed or is cut off by a network partition. Operations that use the special stateids cannot renew a lease. A lease generally expires on client failure or network partition. Taking this into account, there are two cases:

3.9.5.1 NFS v4 client lease expires

If a lock is held by an NFS version 4 client whose lease period has expired and an NFS version 2 or version 3 client tries to get the lock, the request should behave like an NFS version 4 request with a stateid of all zero bits. Further processing should proceed normally as specified by NFS version 4.

3.9.5.2 NFS v3 client stops responding

If a lock is held through NLM and an NFS version 4 client tries to get the lock, the request should be denied even if the NFS version 2 or version 3 client has stopped responding forever. Revoking such locks requires manual intervention.
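The three levels of checking described in section 3.9.4 can be sketched as one function that a common layer might call for each NFS version 3 READ or WRITE. This is a sketch under stated assumptions: FileState and check_v3_access are hypothetical names, the state is reduced to three fields, and returning NFS3ERR_ACCES for a share conflict is an assumption of this sketch. The deny values are the OPEN4_SHARE_DENY_* constants of RFC 3530.

```python
from dataclasses import dataclass
from typing import Optional

# OPEN4_SHARE_DENY_* values as defined for NFS version 4.
DENY_NONE, DENY_READ, DENY_WRITE, DENY_BOTH = 0, 1, 2, 3


@dataclass
class FileState:
    """Hypothetical per-file state shared by all protocol front ends."""
    share_deny: int = DENY_NONE        # union of deny bits of all NFSv4 opens
    mandatory_lock: bool = False       # a conflicting mandatory lock exists
    delegation: Optional[str] = None   # None, "read" or "write"


def check_v3_access(state, op):
    """Check an NFS version 3 READ or WRITE against NFS version 4 state.

    Returns (status, recall_needed). The status names are NFS version 3
    status codes; recall_needed tells the caller to recall the delegation.
    """
    if op not in ("READ", "WRITE"):
        raise ValueError("op must be READ or WRITE")
    deny_bit = DENY_READ if op == "READ" else DENY_WRITE

    # Level 1: share reservations.
    if state.share_deny & deny_bit:
        return "NFS3ERR_ACCES", False

    # Level 2: mandatory locking (NFS4ERR_LOCKED maps to NFS3ERR_ACCES).
    if state.mandatory_lock:
        return "NFS3ERR_ACCES", False

    # Level 3: delegation. Any access conflicts with a write delegation;
    # only a WRITE conflicts with a read delegation.
    recall = state.delegation == "write" or (
        state.delegation == "read" and op == "WRITE"
    )
    return "NFS3_OK", recall
```

For instance, check_v3_access(FileState(delegation="read"), "WRITE") returns ("NFS3_OK", True): the write is allowed in principle, but the caller must first recall the read delegation.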
3.9.6 Server Failure - NLM and NFS version 4 lock recovery

There can be two cases of server failure. If NLM and the NFS version 4 server are different processes, either one of them can fail on its own; if they are implemented in the same process, normal recovery of locks takes place. Here we are interested in the failure of NLM alone or of the NFS version 4 server alone.

For this case the lock information can be maintained in a single place, probably in the common layer. When the failed server recovers, the locks can be recovered from that layer. Lock recovery can then be seamless, and the client may not even notice the server failure.

If both servers fail or the machine reboots, normal recovery of locks should take place as specified in section 8.6.2 of the NFS version 4 specification, RFC 3530.

3.9.7 Revocation of locks and NLM

Revocation can happen in only three cases: a server restart or reboot, failure to renew the locks, and revocation for administrative purposes. These scenarios are discussed in the NFS version 4 specification, RFC 3530. The important point here is that if locks are to be revoked, this should also result in a CANCEL request to NLM clients.

3.9.8 CLOSE and locks release

The server should keep track of LOCK and UNLOCK requests from both NLM and NFS version 4 clients, and should reference count the locks on a file. There should be a single place that maintains the locks and that also keeps track of OPEN and CLOSE requests from NFS version 4 clients and of open and close system calls on the local file system of the server.

3.10 Attribute caching

Attribute caching in NFS version 2 and version 3 is time-bounded, whereas NFS version 4 supports attribute caching using callbacks. NFS version 4 has a callback RPC, CB_GETATTR, which is called if a write delegation is active for a client.
Hence NFS version 4 can be more accurate where attributes are concerned. A GETATTR from NFS version 2 or version 3 should cause the NFS version 4 server to send CB_GETATTR if a delegation is in effect.

3.11 ACL support

3.11.1 NFS version 4 support for ACLs:

NFS version 4 has good support for Access Control Lists (ACLs), which expands upon the traditional idea of ACLs. ACLs are used to specify fine-grained control of access to file system objects. An ACL is a list of ACEs (Access Control Entries), where each ACE specifies some level of access for an entity. An ACE consists of:

Access type: identifies the type of the ACE. An ACE can be of type ALLOW, DENY, AUDIT (LOG) or ALARM.

Access mask: a bitmask in which each bit describes a level of access, such as read, write or execute permission on the file system object.

Access flag: specifies how an ACL on a directory may be propagated to newly created files or directories inside that directory.

Who: the user identifier, which is used to identify the requester at the time of ACL processing.

An NFS version 4 server is not required to support all the ACL types. If the server does not support an ACL type, it may return the NFS4ERR_ATTRNOTSUPP error to NFS version 4 clients.

3.11.2 NFS version 2 and NFS version 3 support for Permissions:

NFS version 2 and version 3 support traditional UNIX-based access to file system objects. The user, group and others (UGO) identity of the requester is matched against the UGO permissions of the file system object on the local file system at the server end.

3.11.3 ACL Interoperability:

3.11.3.1 Mapping to local file system:

The access support of both NFS version 3 and version 4 is mapped to the local file system on the server. In this case interoperability depends on the accuracy of this mapping. This affects attribute caching in all versions of NFS.
A mechanism should be provided that allows these servers to inform each other of changes in the access permissions of file objects.

     NFSv3 client            NFSv4 client
          |                       |
          |                       |
     NFSv3 server            NFSv4 server
          |                       |
          |                       |
   ----------------------------------------------
                 Local File system
   ----------------------------------------------

The guiding principle in deciding the mapping is that the NFS version 4 server must not accept any ACL that appears to make the file system object more secure than it really is. For example, if the local file system supports only UGO permissions, the NFS version 4 server cannot support ACEs on a per-user basis. If the local file system supports POSIX ACLs, the mapping is specified in detail in the draft [posixacl].

3.11.3.2 Mapping NFSv3 to NFSv4:

In this approach, NFS version 2 and version 3 access control requests are mapped to the corresponding NFS version 4 requests, and all access control is done by the NFS version 4 server. Since all NFS versions authenticate the requester using RPC credentials, it is straightforward for the NFS version 4 server to process access control requests from NFS version 2 and version 3 clients.

     NFSv3 client                        NFSv4 client
          |                                   |
          |     Access control request        |
     NFSv3 server   ------------->       NFSv4 server
                Access control request
                    <--------------           |
                                              |
                                              |
   ----------------------------------------------
                 Local File system
   ----------------------------------------------

3.12 Minor Versioning

In NFS version 4, the minor version is identified by the minorversion field in the COMPOUND RPC.

   struct COMPOUND4args {
      utf8str_cs   tag;
      uint32_t     minorversion;
      nfs_argop4   argarray<>;
   };

This section refers to the IETF drafts published for NFS version 4. It tries to cover the currently known minor versions and to address the interoperability of those minor versions with NFS version 2 and version 3. On an NFS version 4 server, there can be at most one minor version.
Here we consider the following drafts:

3.12.1 pNFS draft:

pNFS tries to optimize NFS version 4 for clusters. It tries to achieve bandwidth scaling and to reduce the bottleneck of NFS, i.e. the network endpoint. It also has separate techniques for accessing data and metadata. pNFS enhances NFS version 4 in a minor version.

If pNFS data is to be made accessible to NFS version 2 and version 3, the NFS version 2 and version 3 server should be able to communicate with the pNFS server. pNFS also tries to achieve delegation for the layout maps. This means that when the data is held on pNFS, the NFS version 2 and version 3 servers might need to be stacked on top of the pNFS server.

3.12.2 NFSv4.1: SECINFO Changes:

This minor version aims to remove ambiguities in NFS version 4, to solve issues related to SECINFO, and to solve the problems that arise when the server returns NFS4ERR_WRONGSEC. NFS version 2 and version 3 do not have a SECINFO RPC; authentication and security are mostly handled at the RPC level. The operations that affect NFS version 2 and version 3 are the modified LOOKUP and PUTFH + LOOKUPP. If this minor version is present alongside NFS version 2 and version 3, then either the common layer should be aware of the new version of NFS, or the new version of NFS should be made aware of NFS version 2 and version 3. The new LOOKUP and LOOKUPP cross mount points as in NFS version 4, but NFS version 2 and version 3 should not cross mount points, as discussed in section 2.10. There should be no change in the behavior of NFS version 2 and version 3.

3.12.3 NFSv4.1: Directory Delegations and Notifications

The directory delegation draft [NFSv4.1Dir] proposes adding directory delegations and notifications to NFS version 4.1. This affects the working of NFS version 2 and 3. The draft adds the following operations: GET_DIR_DELEGATION, CB_NOTIFY, CB_RECALL_ANY.
NFS version 2 and version 3 operations need not result in a directory delegation recall. Instead, operations such as SETATTR, CREATE, MKDIR, RMDIR, SYMLINK, REMOVE and LINK should result in an NFS version 4.1 notification callback. The NFS version 2 and version 3 server should be able to communicate with the NFS version 4.1 server to update its information. The mapping of notifications is:

   NFS version 2 and 3 operations         NFS version 4.1 notification
   ------------------------------         ----------------------------
   SETATTR                                FILE/DIR ATTRIBUTE CHANGE
   CREATE, MKNOD, MKDIR, SYMLINK, LINK    ADD ENTRY
   REMOVE, RMDIR                          REMOVE ENTRY
   RENAME                                 RENAME ENTRY

Operations such as GETATTR, LOOKUP, ACCESS, READDIR and READDIRPLUS of NFS version 2 and version 3 will cause an NFS version 4.1 CB_NOTIFY FILE/DIR ATTRIBUTE CHANGE notification callback to be sent in order to get the latest information.

3.13 Internationalization in NFS version 4 and encoding on NFS version 2 and version 3.

NFS version 4 supports internationalization (I18N). Three UTF-8 string types are defined for NFS version 4: utf8str_cs, utf8str_cis and utf8str_mixed.

In NFS version 2 and version 3, file names and other strings are defined as opaque or as variable-length strings. There is no support for UTF-8 strings in NFS version 2 and version 3. Hence it is possible that there will be problems accessing files.

In NFS version 4 the following are defined as UTF-8: symbolic link contents, path name components, server name, ACE name, ACE "who", compound tag, fattr4_mimetype, owner, owner_group, file name, directory name, special file name etc. All these strings are now encoded as UTF-8. This might cause problems for NFS version 2 and version 3. The files and the data will be accessible, but the file names might be interpreted differently. This means that a file name shown through NFS version 4 may be shown differently by an NFS version 2 or version 3 server.
On the server, a converter will be required that converts the NFS
version 4 UTF-8 strings to the opaque or string types required by NFS
version 2 and version 3.

4. Conclusion

This document has tried to cover the interoperability of the known
versions of NFS and the local file system. We hope that the document
will be useful to implementors in designing NFS version 4.

5. Informative References

[RFC2025] Adams, C., "The Simple Public-Key GSS-API Mechanism (SPKM)",
   RFC 2025, October 1996.

[RFC1964] Linn, J., "The Kerberos Version 5 GSS-API Mechanism",
   RFC 1964, June 1996.

[ISO10646] "ISO/IEC 10646-1:1993. International Standard --
   Information technology -- Universal Multiple-Octet Coded Character
   Set (UCS) -- Part 1: Architecture and Basic Multilingual Plane."

[RFC1094] Sun Microsystems, Inc., "NFS: Network File System Protocol
   Specification", RFC 1094, March 1989.

[RFC1813] Callaghan, B., Pawlowski, B. and P. Staubach, "NFS Version 3
   Protocol Specification", RFC 1813, June 1995.

[RFC3010] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R.,
   Beame, C., Eisler, M. and D. Noveck, "NFS version 4 Protocol",
   RFC 3010, December 2000.

[RFC3454] Hoffman, P. and P. Blanchet, "Preparation of
   Internationalized Strings ("stringprep")", RFC 3454, December 2002.

[XNFS] The Open Group, Protocols for Interworking: XNFS, Version 3W,
   The Open Group, 1010 El Camino Real Suite 380, Menlo Park, CA 94025,
   ISBN 1-85912-184-5, February 1998. HTML version available:
   http://www.opengroup.org

[pNFS] Garth Gibson, Peter Corbett, "pNFS Problem Statement",
   Available: http://www.ietf.org/internet-drafts/
   draft-gibson-pnfs-problem-statement-01.txt

[NFSv4.1SECINFO] M.
   Eisler, "NFSv4.1: SECINFO Changes", Available:
   http://www.ietf.org/internet-drafts/draft-ietf-nfsv4-secinfo-02.txt

[posixacl] Marius Aamodt Eriksen, "Mapping Between NFSv4 and Posix
   Draft ACLs", Available: http://www.ietf.org/internet-drafts/
   draft-ietf-nfsv4-acl-mapping-01.txt

[NFSv4.1Dir] S. Khan, "NFSv4.1: Directory Delegations and
   Notifications", Available: http://www.ietf.org/internet-drafts/
   draft-ietf-nfsv4-directory-delegation-00.txt

5. Authors' Addresses:

Name: Aditya Pandit
Address: "Shree" 20, Ambika Housing Society, Senapati Bapat Road,
   Pune 411016, Maharashtra, India.
E-mail: adityaspandit@gmail.com
Office E-mail: adityap@calsoftinc.com

Name: Sujay Godbole
Address: 47/6 ‘A’ Mangalam Chambers co-op Housing Society, Flat No 2,
   Poud Road, Pune 411038, Maharashtra, India.
E-mail: sujay.godbole@gmail.com
Office E-mail: sujay@calsoftinc.com

Notice: Readers' comments are most welcome at our gmail addresses given
above. Please also CC the office e-mail addresses.

6. Copyright Notice

Copyright (C) The Internet Society (2004). All Rights Reserved.

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.