NFSv4                                                          T. Haynes
Internet-Draft                                              Primary Data
Intended status: Standards Track                         August 07, 2017
Expires: February 8, 2018


              Parallel NFS (pNFS) Flexible File Layout v2
                 draft-haynes-nfsv4-flex-filesv2-00.txt

Abstract

   The Parallel Network File System (pNFS) allows a separation between
   the metadata (onto a metadata server) and data (onto a storage
   device) for a file.  The flexible file layout type is an extension to
   pNFS which allows the use of storage devices in a fashion such that
   they require only a quite limited degree of interaction with the
   metadata server, using already existing protocols.  This document
   describes two extensions to the flexible file layout type to allow
   for multiple stateids for tightly coupled NFSv4 models and an
   additional security mechanism for loosely coupled models.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 8, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect



Haynes                  Expires February 8, 2018                [Page 1]

Internet-Draft             Flex File Layout v2               August 2017


   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
     1.1.  Definitions . . . . . . . . . . . . . . . . . . . . . . .   3
     1.2.  Requirements Language . . . . . . . . . . . . . . . . . .   4
   2.  XDR Description of the Flexible File Layout Type  . . . . . .   4
     2.1.  Code Components Licensing Notice  . . . . . . . . . . . .   5
   3.  Flexible File Layout Type v2  . . . . . . . . . . . . . . . .   6
     3.1.  ffv2_layout4  . . . . . . . . . . . . . . . . . . . . . .   7
   4.  Security Considerations . . . . . . . . . . . . . . . . . . .   9
     4.1.  RPCSEC_GSS and Security Services  . . . . . . . . . . . .   9
       4.1.1.  Loosely Coupled . . . . . . . . . . . . . . . . . . .   9
   5.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  10
   6.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  10
     6.1.  Normative References  . . . . . . . . . . . . . . . . . .  10
     6.2.  Informative References  . . . . . . . . . . . . . . . . .  11
   Appendix A.  Acknowledgments  . . . . . . . . . . . . . . . . . .  11
   Appendix B.  RFC Editor Notes . . . . . . . . . . . . . . . . . .  11
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  11

1.  Introduction

   In the parallel Network File System (pNFS), the metadata server
   returns layout type structures that describe where file data is
   located.  There are different layout types for different storage
   systems and methods of arranging data on storage devices.
   [flexfiles] defines the flexible file layout type used with file-
   based data servers that are accessed using the Network File System
   (NFS) protocols: NFSv3 [RFC1813], NFSv4.0 [RFC7530], NFSv4.1
   [RFC5661], and NFSv4.2 [RFC7862].

   The first version of the flexible file layout type had two issues
   which could not be addressed in [flexfiles] because of existing
   implementations.  The first issue was that under the tightly coupled
   model for an NFSv4 implementation, either a global stateid or an
   anonymous stateid needed to be used.  The second issue was that under
   the loosely coupled model, for a secure Remote Procedure Call (RPC)
   [RFC5531] implementation, each of the client, metadata server, and
   storage devices needed to implement an RPC-application-defined
   structured privilege assertion with RPCSEC_GSS version 3
   (RPCSEC_GSSv3) [RFC7861].  The second version of the flexible file
   layout type addresses both of these issues.






1.1.  Definitions

   control communication requirements:  defines for a layout type the
      details regarding information on layouts, stateids, file metadata,
      and file data which must be communicated between the metadata
      server and the storage devices.

   control protocol:  defines a particular mechanism that an
      implementation of a layout type would use to meet the control
      communication requirement for that layout type.  This need not be
      a protocol as normally understood.  In some cases the same
      protocol may be used as a control protocol and data access
      protocol.

   data file:  is that part of the file system object which contains the
      content.

   fencing:  is when the metadata server prevents the storage devices
      from processing I/O from a specific client to a specific file.

   file layout type:  is a layout type in which the storage devices are
      accessed via the NFS protocol (see Section 13 of [RFC5661]).

   layout:  informs a client of which storage devices it needs to
      communicate with (and over which protocol) to perform I/O on a
      file.  The layout might also provide some hints about how the
      storage is physically organized.

   layout iomode:  describes whether the layout granted to the client is
      for read or read/write I/O.

   layout stateid:  is a 128-bit quantity returned by a server that
      uniquely defines the layout state provided by the server for a
      specific layout that describes a layout type and file (see
      Section 12.5.2 of [RFC5661]).  Further, Section 12.5.3 of
      [RFC5661] describes the difference between a layout stateid and a
      normal stateid.

   layout type:  describes both the storage protocol used to access the
      data and the aggregation scheme used to lay out the file data on
      the underlying storage devices.

   loose coupling:  is when the metadata server and the storage devices
      do not have a control protocol present.

   metadata file:  is that part of the file system object which
      describes the object and not the content.  E.g., it could be the
      time since last modification, access, etc.





   metadata server (MDS):  is the pNFS server which provides metadata
      information for a file system object.  It also is responsible for
      generating layouts for file system objects.  Note that the MDS is
      responsible for directory-based operations.

   recalling a layout:  is when the metadata server uses a back channel
      to inform the client that the layout is to be returned in a
      graceful manner.  Note that the client has the opportunity to
      flush any writes, etc., before replying to the metadata server.

   revoking a layout:  is when the metadata server invalidates the
      layout such that neither the metadata server nor any storage
      device will accept any access from the client with that layout.

   stateid:  is a 128-bit quantity returned by a server that uniquely
      defines the open and locking states provided by the server for a
      specific open-owner or lock-owner/open-owner pair for a specific
      file and type of lock.

   storage device:  designates the target to which clients may direct
      I/O requests when they hold an appropriate layout.  See
      Section 2.1 of [pNFSLayouts] for further discussion of the
      difference between a data store and a storage device.

   tight coupling:  is when the metadata server and the storage devices
      do have a control protocol present.

1.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  XDR Description of the Flexible File Layout Type

   This document contains the external data representation (XDR)
   [RFC4506] description of the flexible file layout type version 2.
   The XDR description is embedded in this document in a way that makes
   it simple for the reader to extract into a ready-to-compile form.
   The reader can feed this document into the following shell script to
   produce the machine readable XDR description of the flexible file
   layout type version 2:

   <CODE BEGINS>

   #!/bin/sh
   grep '^ *///' "$@" | sed 's?^ */// ??' | sed 's?^ *///$??'






   <CODE ENDS>

   That is, if the above script is stored in a file called "extract.sh",
   and this document is in a file called "spec.txt", then the reader can
   do:

   sh extract.sh < spec.txt > flex_filesv2_prot.x

   The effect of the script is to remove leading white space from each
   line, plus a sentinel sequence of "///".
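
   For readers who prefer not to rely on a shell, the same extraction
   can be sketched in Python.  This is only an illustration of what the
   script above does; the function name extract_xdr is hypothetical and
   not part of this document's Code Components.

```python
import re


def extract_xdr(lines):
    """Yield XDR source lines from draft text.

    Mirrors the grep/sed pipeline: keep only lines that begin with
    optional spaces followed by the "///" sentinel, then strip the
    leading white space and the sentinel.
    """
    for line in lines:
        line = line.rstrip("\n")
        # grep '^ *///'
        if not re.match(r" *///", line):
            continue
        # sed 's?^ */// ??'  (sentinel followed by one space)
        line = re.sub(r"^ */// ", "", line, count=1)
        # sed 's?^ *///$??'  (bare sentinel becomes an empty line)
        line = re.sub(r"^ *///$", "", line, count=1)
        yield line


if __name__ == "__main__":
    import sys
    for out in extract_xdr(sys.stdin):
        print(out)
```

   Stored as a file and given this document on standard input, the
   sketch is intended to produce the same output as extract.sh.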

   The embedded XDR file header follows.  Subsequent XDR descriptions,
   with the sentinel sequence, are embedded throughout the document.

   Note that the XDR code contained in this document depends on types
   from both the flex files version 1 flex_files_prot.x file
   ([flexfiles]) and the NFSv4.1 nfs4_prot.x file ([RFC5662]).  This
   includes both NFS types that end with a 4, such as offset4, length4,
   etc., as well as more generic types such as uint32_t and uint64_t.

2.1.  Code Components Licensing Notice

   Both the XDR description and the scripts used for extracting the XDR
   description are Code Components as described in Section 4 of "Legal
   Provisions Relating to IETF Documents" [LEGAL].  These Code
   Components are licensed according to the terms of that document.

   <CODE BEGINS>

   /// /*
   ///  * Copyright (c) 2012 IETF Trust and the persons identified
   ///  * as authors of the code. All rights reserved.
   ///  *
   ///  * Redistribution and use in source and binary forms, with
   ///  * or without modification, are permitted provided that the
   ///  * following conditions are met:
   ///  *
   ///  * o Redistributions of source code must retain the above
   ///  *   copyright notice, this list of conditions and the
   ///  *   following disclaimer.
   ///  *
   ///  * o Redistributions in binary form must reproduce the above
   ///  *   copyright notice, this list of conditions and the
   ///  *   following disclaimer in the documentation and/or other
   ///  *   materials provided with the distribution.
   ///  *
   ///  * o Neither the name of Internet Society, IETF or IETF
   ///  *   Trust, nor the names of specific contributors, may be





   ///  *   used to endorse or promote products derived from this
   ///  *   software without specific prior written permission.
   ///  *
   ///  *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS
   ///  *   AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED
   ///  *   WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   ///  *   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
   ///  *   FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
   ///  *   EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
   ///  *   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
   ///  *   EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
   ///  *   NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
   ///  *   SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   ///  *   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
   ///  *   LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
   ///  *   OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
   ///  *   IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
   ///  *   ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
   ///  *
   ///  * This code was derived from RFCTBD10.
   ///  * Please reproduce this note if possible.
   ///  */
   ///
   /// /*
   ///  * flex_filesv2_prot.x
   ///  */
   ///
   /// /*
   ///  * The following include statements are for example only.
   ///  * The actual XDR definition files are generated separately
   ///  * and independently and are likely to have a different name.
   ///  * %#include <nfsv42.x>
   ///  * %#include <rpc_prot.x>
   ///  */
   ///

   <CODE ENDS>

3.  Flexible File Layout Type v2

   This document defines structures associated with the layouttype4
   value LAYOUT4_FLEX_FILES_V2 and presents the minimal XDR changes
   necessary from LAYOUT4_FLEX_FILES, which is described in
   [flexfiles].  [RFC5661] specifies the loc_body structure as an XDR
   type "opaque".  The opaque layout is uninterpreted by the generic
   pNFS client layers, but is interpreted by the flexible file layout
   type implementation.  This section defines the structure of this
   otherwise opaque value, ffv2_layout4.





3.1.  ffv2_layout4

   <CODE BEGINS>

   /// struct ffv2_data_server4 {
   ///     deviceid4               ffds_deviceid;
   ///     uint32_t                ffds_efficiency;
   ///     stateid4                ffds_stateid<>;
   ///     nfs_fh4                 ffds_fh_vers<>;
   ///     fattr4_owner            ffds_user;
   ///     fattr4_owner_group      ffds_group;
   ///     opaque_auth             ffds_auth;
   /// };
   ///

   /// struct ffv2_mirror4 {
   ///     ffv2_data_server4       ffm_data_servers<>;
   /// };
   ///

   /// struct ffv2_layout4 {
   ///     length4                 ffl_stripe_unit;
   ///     ffv2_mirror4            ffl_mirrors<>;
   ///     ff_flags4               ffl_flags;
   ///     uint32_t                ffl_stats_collect_hint;
   /// };
   ///

   <CODE ENDS>

   The ffv2_layout4 structure specifies a layout over a set of mirrored
   copies of that portion of the data file described in the current
   layout segment.

   It is possible that the file is concatenated from more than one
   layout segment.  Each layout segment MAY represent different striping
   parameters, applying respectively only to the layout segment byte
   range.

   The ffl_stripe_unit field is the stripe unit size in use for the
   current layout segment.  The number of stripes is given inside each
   mirror by the number of elements in ffm_data_servers.  If the number
   of stripes is one, then the value for ffl_stripe_unit MUST default to
   zero.  The only supported mapping scheme is sparse and is detailed in
   Section 6 of [flexfiles].  Note that there is an assumption here that
   both the stripe unit size and the number of stripes are the same
   across all mirrors.
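
   As a rough illustration of how a client might use these two values,
   the following Python sketch maps a logical file offset to an index
   into ffm_data_servers.  It assumes round-robin assignment of stripe
   units and that, under sparse mapping, the offset within the data
   file equals the logical offset; Section 6 of [flexfiles] remains the
   authoritative description.  The function name locate is
   hypothetical.

```python
def locate(offset, stripe_unit, stripe_count):
    """Return (data server index, data file offset) for a logical offset.

    Assumptions (not taken from this document): stripe units are
    assigned to data servers round-robin, and sparse mapping leaves
    the data file offset equal to the logical offset.
    """
    if offset < 0 or stripe_count < 1:
        raise ValueError("offset must be >= 0 and stripe_count >= 1")
    if stripe_count == 1:
        # With a single stripe, ffl_stripe_unit is zero and unused.
        return 0, offset
    if stripe_unit <= 0:
        raise ValueError("striped layouts need a positive stripe unit")
    stripe_unit_number = offset // stripe_unit
    index = stripe_unit_number % stripe_count
    return index, offset
```

   Because the stripe unit size and stripe count are assumed equal
   across mirrors, the same index would apply within every element of
   ffl_mirrors.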






   The ffl_mirrors field is the array of mirrored storage devices which
   provide the storage for the current stripe; see Figure 1.

                      +-----------+
                      |           |
                      |           |
                      |   File    |
                      |           |
                      |           |
                      +-----+-----+
                            |
               +------------+------------+
               |                         |
          +----+-----+             +-----+----+
          | Mirror 1 |             | Mirror 2 |
          +----+-----+             +-----+----+
               |                         |
          +-----------+            +-----------+
          |+-----------+           |+-----------+
          ||+-----------+          ||+-----------+
          +||  Storage  |          +||  Storage  |
           +|  Devices  |           +|  Devices  |
            +-----------+            +-----------+

                                 Figure 1

   The ffl_mirrors field represents an array of state information for
   each mirrored copy of the current layout segment.  Each element is
   described by an ffv2_mirror4 type.

   ffds_deviceid provides the deviceid of the storage device holding the
   data file.

   ffds_fh_vers is an array of filehandles of the data file, one for
   each NFS version available on the given storage device.  There MUST
   be exactly as many elements in ffds_fh_vers as there are in each of
   ffda_versions (see Section 4.1 of [flexfiles]) and ffds_stateid.
   Each element of the array corresponds to a particular combination of
   ffdv_version, ffdv_minorversion, and ffdv_tightly_coupled provided
   for the device.  The array allows for server implementations which
   have different filehandles for different combinations of version,
   minor version, and coupling strength.  See Section 5.3 of [flexfiles]
   for how to handle versioning issues between the client and storage
   devices.
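
   The parallel-array requirement above can be sketched in Python as
   follows.  The class and function names are hypothetical, the
   DeviceVersion class models only three of the fields of the
   [flexfiles] device version structure, and filehandles and stateids
   are treated as opaque byte strings.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DeviceVersion:
    """Subset of one ffda_versions element (illustrative only)."""
    version: int
    minorversion: int
    tightly_coupled: bool


@dataclass
class DataServer:
    """Subset of ffv2_data_server4 (illustrative only)."""
    stateids: List[bytes]     # ffds_stateid<>
    filehandles: List[bytes]  # ffds_fh_vers<>


def check_data_server(ds: DataServer, versions: List[DeviceVersion]) -> None:
    """Raise ValueError unless all three arrays have the same length."""
    if not (len(ds.filehandles) == len(versions) == len(ds.stateids)):
        raise ValueError("ffds_fh_vers, ffda_versions, and ffds_stateid "
                         "must have the same number of elements")


def select_access(ds: DataServer, versions: List[DeviceVersion],
                  want: DeviceVersion) -> Optional[Tuple[bytes, bytes]]:
    """Return the (filehandle, stateid) pair matching the wanted version."""
    check_data_server(ds, versions)
    for i, v in enumerate(versions):
        if v == want:
            return ds.filehandles[i], ds.stateids[i]
    return None
```

   A client following this sketch would reject a layout whose arrays
   disagree in length before attempting any I/O.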

   For tight coupling, ffds_stateid provides the stateids to be used by
   the client to access the file.  For loose coupling and an NFSv4
   storage device, the client may use anonymous stateids to perform I/O
   on the storage device as there is no use for the metadata server
   stateid (no control protocol).  In such a scenario, the server MUST
   set each element of ffds_stateid to the anonymous stateid.
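
   A minimal Python sketch of this rule from the metadata server's
   side is shown below.  It assumes a stateid is carried as its 16-byte
   XDR encoding and that the anonymous stateid is the one whose seqid
   and "other" fields are all zero, as described for special stateids
   in [RFC5661].  The function name fill_stateids is hypothetical, and
   NFSv3 storage devices, which have no stateids, are not modeled.

```python
# 4-byte seqid plus 12-byte "other", all zero.
ANONYMOUS_STATEID = bytes(16)


def fill_stateids(tightly_coupled_flags, issued):
    """Build the ffds_stateid array for one data server.

    tightly_coupled_flags: one bool per device version entry.
    issued: one 16-byte stateid per entry; entries for loosely
            coupled versions are ignored and replaced by the
            anonymous stateid.
    """
    if len(tightly_coupled_flags) != len(issued):
        raise ValueError("one stateid is needed per device version")
    return [sid if tight else ANONYMOUS_STATEID
            for tight, sid in zip(tightly_coupled_flags, issued)]
```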

   For loose coupling, ffds_auth provides the RPC credentials needed for
   secure access to the storage devices.  If secure access is not
   needed, i.e., the synthetic ids are sufficient, or under tight
   coupling, the server should use the AUTH_NONE flavor and a zero
   length opaque body to minimize the returned structure length.  [[AI1:
   after the lesson learned from ffds_stateid, we either need to put an
   array here or define all of the file handles to share the same
   credentials.  And as Olga points out in her email, this gets big
   fast.  Especially if we throw in many mirrored copies!  --TH]]
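
   To show why AUTH_NONE with a zero length body is the smallest
   possible ffds_auth, the following Python sketch encodes an
   opaque_auth value in XDR.  It assumes the opaque_auth definition of
   [RFC5531] (a flavor followed by a variable-length opaque body of at
   most 400 bytes); the function name encode_opaque_auth is
   hypothetical.

```python
import struct

AUTH_NONE = 0   # auth_flavor value from RFC 5531
MAX_BODY = 400  # opaque body<400> in RFC 5531


def encode_opaque_auth(flavor, body=b""):
    """XDR-encode an opaque_auth: flavor, body length, padded body."""
    if len(body) > MAX_BODY:
        raise ValueError("opaque_auth body is limited to 400 bytes")
    # XDR pads variable-length opaque data to a multiple of 4 bytes.
    pad = (-len(body)) % 4
    return struct.pack(">II", flavor, len(body)) + body + b"\x00" * pad
```

   With AUTH_NONE and an empty body the encoding is 8 bytes per
   ffv2_data_server4, whereas any real credential adds its padded body
   length for every data server in every mirror.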

4.  Security Considerations

   All of the security considerations of [flexfiles] apply here.  In
   addition, this document addresses how security mechanisms, such as
   Kerberos V5 GSS-API [RFC4121], can be applied to the loosely coupled
   model.

4.1.  RPCSEC_GSS and Security Services

4.1.1.  Loosely Coupled

   Under this coupling model, the principal used to authenticate the
   metadata file is different from that used to authenticate the data
   file.  For the metadata server, the RPC credentials would be
   generated by the same source as the client.  For RPC credentials to
   the data on the storage device, the metadata server would be
   responsible for their generation.  Such "credentials" SHOULD be
   limited to just the data file being accessed.  Using Kerberos V5
   GSS-API [RFC4121], some possible approaches would be:

   o  a dedicated/throwaway client principal name akin to the synthetic
      uid/gid schemes.

   o  authorization data in the ticket.

   o  an out-of-band scheme between the client and metadata server.

   Depending on the implementation details, fencing would then be
   controlled either by expiring the credential or by modifying the
   synthetic uid or gid on the data file.  I.e., if the credentials are
   at a finer granularity than the synthetic ids, it might be possible
   to also fence just one client from the file.







5.  IANA Considerations

   [RFC5661] introduced the "pNFS Layout Types Registry"; as such, new
   layout type numbers need to be assigned by IANA.  This document
   defines the protocol associated with the existing layout type
   number, LAYOUT4_FLEX_FILES_V2 (see Table 1).

    +-----------------------+-------+----------+-----+----------------+
    | Layout Type Name      | Value | RFC      | How | Minor Versions |
    +-----------------------+-------+----------+-----+----------------+
    | LAYOUT4_FLEX_FILES_V2 | 0x6   | RFCTBD10 | L   | 1              |
    +-----------------------+-------+----------+-----+----------------+

                     Table 1: Layout Type Assignments

6.  References

6.1.  Normative References

   [LEGAL]    IETF Trust, "Legal Provisions Relating to IETF Documents",
              November 2008, <http://trustee.ietf.org/docs/
              IETF-Trust-License-Policy.pdf>.

   [RFC1813]  Callaghan, B., Pawlowski, B., and P. Staubach, "NFS
              Version 3 Protocol Specification", RFC 1813, June 1995.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4121]  Zhu, L., Jaganathan, K., and S. Hartman, "The Kerberos
              Version 5 Generic Security Service Application Program
              Interface (GSS-API) Mechanism Version 2", RFC 4121, July
              2005.

   [RFC4506]  Eisler, M., "XDR: External Data Representation Standard",
              STD 67, RFC 4506, May 2006.

   [RFC5531]  Thurlow, R., "RPC: Remote Procedure Call Protocol
              Specification Version 2", RFC 5531, May 2009.

   [RFC5661]  Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
              "Network File System (NFS) Version 4 Minor Version 1
              Protocol", RFC 5661, January 2010.

   [RFC5662]  Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
              "Network File System (NFS) Version 4 Minor Version 1
              External Data Representation Standard (XDR) Description",
              RFC 5662, January 2010.





   [RFC7530]  Haynes, T., Ed. and D. Noveck, Ed., "Network File System
              (NFS) Version 4 Protocol", RFC 7530, March 2015.

   [RFC7862]  Haynes, T., "Network File System (NFS) Version 4 Minor
              Version 2 Protocol", RFC 7862, November 2016.

   [flexfiles]
              Halevy, B. and T. Haynes, "Parallel NFS (pNFS) Flexible
              File Layout", draft-ietf-nfsv4-flex-files-13 (Work In
              Progress), July 2017.

   [pNFSLayouts]
              Haynes, T., "Requirements for pNFS Layout Types", draft-
              ietf-nfsv4-layout-types-05 (Work In Progress), July 2017.

6.2.  Informative References

   [RFC7861]  Adamson, W. and N. Williams, "Remote Procedure Call (RPC)
              Security Version 3", RFC 7861, November 2016.

Appendix A.  Acknowledgments

   Dave Noveck inspired the need for multiple stateids for the tightly
   coupled model in [flexfiles].

   Olga Kornievskaia inspired the need for another security mechanism
   for the loosely coupled model in [flexfiles].

Appendix B.  RFC Editor Notes

   [RFC Editor: please remove this section prior to publishing this
   document as an RFC]

   [RFC Editor: prior to publishing this document as an RFC, please
   replace all occurrences of RFCTBD10 with RFCxxxx where xxxx is the
   RFC number of this document]

Author's Address

   Thomas Haynes
   Primary Data, Inc.
   4300 El Camino Real Ste 100
   Los Altos, CA  94022
   USA

   Phone: +1 408 215 1519
   Email: thomas.haynes@primarydata.com



