Draft Specification      NDMP Version 5 Requirements      June 2001

Network Working Group                                  Harald Skardal
INTERNET DRAFT                                 Network Appliance Inc.
Category: Applications
Document: draft-skardal-ndmpv5-requirements-00.txt

Requirements for Network Data Management Protocol Version 5

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or become obsolete by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

This document describes the proposed requirements for NDMP version 5. This document assumes NDMP version 4 as a starting point. The requirements focus on improved clarity and interoperability, enabling a richer set of data management applications, and broadening the scope of NDMP to become a true Internet protocol. This increased scope increases the need for improved security mechanisms.

The document provides the input to a BOF session where the NDMP community will discuss and prioritize these requirements.

The key goals of NDMP include interoperability, contemporary functionality, and extensibility.

Copyright

Copyright (C) The Internet Society (2001). All Rights Reserved.

Expires December 2001

Table of Contents

1. Overview
   1.1. Motivation
   1.2. Scope
   1.3. Audience
   1.4. Terminology
   1.5. Key Words
2. Requirements
   2.1. Restartability and Checkpoints
   2.2. Multi Source or Multi Destination Sessions
   2.3. Security
      2.3.1 Improved Authentication
      2.3.2 Firewall compatibility
   2.4. Standardization of Environment Variables
   2.5. Internationalization
   2.6. Generalizing NDMP for non-UNIX environments
   2.7. Data Management of non-file system data sets
   2.8. Snapshot management
   2.9. XML Metadata
   2.10. Time out handling
   2.11. Additional Data Path Processing
   2.12. Archive or Secondary Storage Abstractions
      2.12.1. Higher Level Archive Interfaces
      2.12.2. New Archive Media
      2.12.3. Generalizing a Unix Based Tape Environment
   2.13. Optimal Network Path Selection
   2.14. Server or Application Initiated Operations
   2.15. IPS: iSCSI, FCIP and iFCP
   2.16. Restoring From a Partially Damaged Backup
   2.17. Partial File Incremental Backup
   2.18. Distributing the Config Interface
   2.19. A Simple "Object Oriented" Architecture
   2.20. Tape Verification
   2.21. Data Stream Abstraction
   2.22. Supporting Contemporary File System Properties
   2.23. Symbolic Links
   2.24. Spooling
   2.25. File System Browsing
3. References
4. Authors and Contributors
   4.1. Document Author
   4.2. Contributors

1. Overview

1.1. Motivation

The goal of NDMP v5 is the definition of a protocol allowing data management applications to control the administrative movement of data between NDMP compliant primary, secondary and other storage systems and applications without the need for data management application software resident on storage system servers.

The control and data transfer components of the data management session are separated. The separation allows complete interoperability at a network level. The storage system vendors need only be concerned with maintaining compatibility with one, well defined protocol. The data management vendors can place their primary focus on the sophisticated central administration software for data management.

NDMP is targeted towards the process of administering the protection of data for an organization. Included are tasks such as backup and recovery, mirroring and replication, on and offsite archiving, and more.

1.2. Scope

This document is the requirements specification for Network Data Management Protocol version 5.
The primary scope of the NDMP v5 effort is to further improve simplicity, clarity and interoperability of NDMP, to enable a richer set of NDMP based data management functionality, and to broaden the scope of NDMP to become a true Internet protocol. This increased scope increases the need for improved security mechanisms.

1.3. Audience

This document is intended for use by software developers and architects who will participate in the development of version 5 of the NDMP protocol. The reader is assumed to be familiar with TCP/IP networking, and with the operation of data management software in general, and NDMP v4 in particular. The reader is not expected to have knowledge of internal backup software behavior.

1.4. Terminology

This document uses the terminology from the specification of NDMP v4. The following sections define new terms.

NDMP extensions provide additional functions beyond those provided by the core NDMP. The core NDMP includes features needed to support these NDMP extensions. See [1].

1.5. Key Words

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

2. NDMP v5 Requirements

The following sections give a functional description of new applications or capabilities that are considered for NDMP v5. They also include a description of the underlying capabilities in core NDMP that are needed to enable such functionality. Notice that we include in the v5 core the functionality that is required to support certain proposed extensions. These extensions form the infrastructure needed to enable more complex data management capabilities.
An example is restartability and checkpoint support, which will allow recovery from partially failed data management operations. The NDMP core will at a minimum include the basic mechanisms needed to support checkpoint creation and tracking, whereas various approaches to restarting from checkpoints may be implemented as one or more NDMP extensions.

One explicit goal of NDMP v5 is to exploit existing Internet standards where it is appropriate. For instance, in order to support optimal network path selection we will attempt to rely on Internet-based directory services for creating, storing, managing and accessing the information used for selecting the optimal path.

Following is a list of approximately 30 features under consideration for NDMP v5. The features are listed in prioritized order, with the most requested features first. The ranking is based on a brief review by some of the participants in the NDMP development efforts. Work will be done to further prioritize and select the subset that gives good value to the NDMP community, and that can be accomplished within a suitable development period for v5. Currently our goal for v5 completion would be fall 2002 or winter 2003.

2.1. Restartability and Checkpoints

Backup, recovery and other data management operations involve large amounts of data, and often take many hours to complete. Currently, when a failure occurs, the operation needs to be restarted from the beginning. Often many hours of time and data protection are lost.

NDMP needs to provide the underlying mechanisms that enable data management applications to create coordinated session wide synchronization points in the data stream(s), and to be able to restart the operation from a set of coordinated synchronization points.

The synchronization point infrastructure needs to apply to all types of data management sessions: backup, recovery, mirroring, etc., and it needs to enable several different implementations of such capabilities.
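
As an informal illustration of the coordinated synchronization points described in 2.1, the following Python sketch tracks which streams of a session have acknowledged each checkpoint and reports the most recent checkpoint from which every stream could restart. It is a sketch under stated assumptions, not a proposed protocol design: the class name, method names, and the use of byte offsets as stream positions are all hypothetical and do not come from NDMP v4 or from this document.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointTracker:
    """Tracks session-wide synchronization points across data streams.

    Hypothetical illustration only; none of these names are defined by NDMP.
    """

    stream_ids: frozenset
    # checkpoint id -> {stream id -> byte offset acknowledged by that stream}
    acks: dict = field(default_factory=dict, init=False)

    def acknowledge(self, checkpoint_id: int, stream_id: str, offset: int) -> None:
        """Record that one stream reached the given checkpoint at `offset`."""
        if stream_id not in self.stream_ids:
            raise ValueError(f"unknown stream: {stream_id}")
        self.acks.setdefault(checkpoint_id, {})[stream_id] = offset

    def is_coordinated(self, checkpoint_id: int) -> bool:
        """A checkpoint is coordinated once every stream has acknowledged it."""
        return set(self.acks.get(checkpoint_id, {})) == set(self.stream_ids)

    def restart_point(self):
        """Return (checkpoint id, per-stream offsets) for the latest
        coordinated checkpoint, or None if no checkpoint is complete."""
        complete = [cid for cid in self.acks if self.is_coordinated(cid)]
        if not complete:
            return None
        latest = max(complete)
        return latest, dict(self.acks[latest])


if __name__ == "__main__":
    tracker = CheckpointTracker(frozenset({"stream-a", "stream-b"}))
    tracker.acknowledge(1, "stream-a", 1000)
    tracker.acknowledge(1, "stream-b", 900)
    tracker.acknowledge(2, "stream-a", 2000)  # stream-b fails before checkpoint 2
    print(tracker.restart_point())
```

In this sketch a failure after checkpoint 2 was only partially acknowledged would restart both streams from checkpoint 1, which is one possible reading of "a set of coordinated synchronization points"; other restart strategies could be layered on the same bookkeeping.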
2.2. Multi Source or Multi Destination Sessions

Devices or network paths participating in an NDMP session typically have different bandwidth and latency properties. In order to fully utilize the storage systems and networks, NDMP should be extended to allow for more than one source or more than one destination within one NDMP session. Examples are:

- Moving data between two storage systems using more than one network connection, or

- Moving data from N high performance storage systems to M medium performance storage systems. This enables so-called speed matching.

Multiple connections can be used to shorten the backup or restore times, reduce the time lag between two asynchronous storage systems, and more. One application is so-called "tape RAID", where a backup data set is partitioned and distributed over several tape drives in parallel.

2.3. Security

2.3.1 Improved Authentication

NDMP currently supports clear text and MD5 based authentication between DMAs and servers. Use of more secure contemporary authentication mechanisms, including public key encryption and the use of encrypted passwords, must be investigated.

2.3.2 Firewall compatibility

The most significant issues for NDMP and firewall compatibility are the lack of port range control for NDMP connections (control and data) and the incompatibility of NDMP with firewalls implementing Network Address Translation (NAT).

Port Ranges: For control connections, the DMA initiates a connection from any non-reserved TCP source port (1025 to 65535) to the NDMP "well known" destination port on the data/tape server (10000). The data/tape server then allocates a new non-reserved TCP port for the NDMP connection. For data connections, the TCP source and destination ports can be allocated anywhere in the non-reserved range.

Firewalls normally disallow or restrict inbound connections based on IP address/port combinations.
Supporting a wide range of ports requires opening large "holes" in the firewall. This is something security folks dislike. Providing administrative control of the DMA and data/tape server port allocations allows the ranges to be constrained to manageable (and less vulnerable) sizes. It is more desirable to open a range of 50 ports than 64,000 ports.

For control connections the problem is most evident when the DMA attempts to connect to a data/tape server residing inside a firewall. For 3-way data connections the problem is most difficult when the data and tape servers reside behind separate firewalls. Note that for 3-way data connections the location of the DMA is not relevant. The control connection is assumed to already be established, and the data connection is really a server peer to peer connection being passed to each peer by the DMA.

NAT Incompatibility: NATs translate between registered IP addresses used for external connectivity and unregistered internal IP addresses to 1) allow use of unregistered IP addresses and 2) hide the topology of the network behind the firewall.

NDMP is incompatible with Network Address Translation (NAT) firewalls because IP address and TCP port information is conveyed as payload data between NDMP peers (connect_addr in NDMP_MOVER_LISTEN/NDMP_DATA_LISTEN replies and NDMP_MOVER_CONNECT/NDMP_DATA_CONNECT requests). This address information is not processed by the translation logic of the firewall.

The problem is evident when a data connection is attempted between two servers separated by a NAT firewall where the mover or data server listener is on the interior of the firewall. The mover or data connect requests will contain internal network IP addresses which cannot be used externally to establish the data connection.

The solution for port range control may not require a protocol change (rather just administrative control of each NDMP entity).
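
As a rough illustration of the administrative port range control mentioned above, the following Python sketch binds a listening socket to the first free port inside an administrator-supplied range instead of letting the operating system pick any non-reserved port. The function name and the example range are hypothetical; nothing here is defined by NDMP, and a real server would read the range from its own configuration.

```python
import socket


def listen_in_range(host: str, first_port: int, last_port: int) -> socket.socket:
    """Return a listening TCP socket bound to the first free port in
    [first_port, last_port]. Hypothetical helper, not part of NDMP."""
    if not (0 < first_port <= last_port <= 65535):
        raise ValueError("invalid port range")
    for port in range(first_port, last_port + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(1)
            return sock
        except OSError:
            # Port is in use or not permitted; release it and try the next one.
            sock.close()
    raise OSError(f"no free port in {first_port}-{last_port}")


if __name__ == "__main__":
    # Example: restrict data connection listeners to a 50-port window.
    listener = listen_in_range("127.0.0.1", 50000, 50049)
    print("listening on port", listener.getsockname()[1])
    listener.close()
```

With a window like this, the firewall rule only needs to admit the 50 configured ports rather than the whole non-reserved range.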
The solution for NAT compatibility seems to necessitate a protocol change.

2.4. Standardization of Environment Variables

NDMP backup methods are typically based on the design of existing tar, dump or cpio backup utilities. On Unix or Unix-style servers these utilities, as well as the NDMP data and tape servers, are controlled via environment variables. Currently there is little structure in the definitions of these environment variables, there is redundancy, and there is no definition of environment variables that are for NDMP use only.

The new requirement is therefore to introduce a system of NDMP based environment variables, with naming conventions reflecting their use: data, tape or filter services, for core NDMP or for use in standard or proprietary extensions, etc.

Another environment-related issue concerns managing time stamps. Currently incremental backups (level 0 - 9) use one global location to store the time stamp for a backup of a certain level. If multiple independent backups of the same file system are taken by multiple backup clients, generating different tape sets, a race condition occurs since all the clients will update the same time stamp. A successive incremental backup will end up with fewer updates than it should have had. The solution is to transfer the time stamp to the DMA at the conclusion of the operation; the time stamp thus becomes a per session item managed by the DMA as part of the meta data set for the operation.

2.5. Internationalization

NDMP currently uses the ASCII character set. As the scope of NDMP widens, NDMP needs to provide the infrastructure so that DMAs can be developed for non-English speaking users, using non-English languages.

2.6. Generalizing NDMP for non-UNIX environments

NDMP grew out of a Unix-based environment.
The applications and storage landscape now includes other operating systems such as those developed by Microsoft and Apple Computer, as well as custom operating systems used in network, server and storage appliances. NDMP should be generalized such that it can be more easily adapted to these environments, which have other sets of file and file system attributes, and other properties different from the classic Unix environments.

2.7. Data Management of non-file system data sets

NDMP is currently aimed at moving file system data. iSCSI and other IPS protocols bring block storage systems, data transfers and access to the Internet. Some database vendors implement relational database tables and files directly on the block device, circumventing the file system. There are several other data set formats in use today.

NDMP must enable data management of data sets that do not exist in file systems. These include volume or block based data sets or storage systems and database data sets.

2.8. Snapshot management

The snapshot management interface defines a mechanism and protocol for controlling primary storage file system images commonly referred to as snapshots.

Specifically, this interface supports the management of automated and manual snapshot creation, snapshot deletion, and snapshot directory browsing as well as full snapshot recovery and selective file recovery.

This interface provides functionality allowing snapshots to be used to implement near-line data protection solutions that offer faster backup and recovery times compared to traditional tape based secondary storage.

2.9. XML Metadata

The use of the emerging XML standard in NDMP metadata would enable easier integration between NDMP and other business and management applications. This requirement may be a component of the Internationalization requirement.

2.10. Time out handling

When one or more components of an NDMP session fail, for instance when a network connection is broken, or when a participating host fails, there needs to be a coordinated NDMP session level strategy for determining when failure has happened, how to report it or take other actions, and how to recover from the failure.

This strategy should be coordinated with the foundation for restartability and checkpoints.

2.11. Additional Data Path Processing

Data management is becoming a wide area application. In addition, new needs require additional processing of data as it is moved between storage systems.

When data is moved to an off site archive over the network, data compression may shorten the transfer time, and thus the window of time a service is unprotected, by 50% or more. For data sizes of tens or hundreds of gigabytes this time may amount to hours, and is therefore significant. Another need is virus scanning of data that is being archived or replicated between two primary storage systems.

These are examples where the data being moved in an NDMP session requires one or more additional steps of processing between source and destination. NDMP needs to be enhanced in order to allow for the configuration of additional services that process the data stream. In the following we will call these intermediate data translators or filters "(data) filter services".

2.12. Archive or Secondary Storage Abstractions

2.12.1. Higher Level Archive Interfaces

The service interface to archive devices such as the tape drive or tape library is now at a very low level. The interaction between DMA and tape is done via SCSI level commands. This implies that the DMA needs to understand vendor specific aspects of the tape drives or libraries. Since the SCSI bus on the tape host is exposed, it also leaves the tape host vulnerable if there are errors in the tape interface.
A low level tape interface slows the adoption of NDMP, and slows the development of NDMP based data management applications.

The benefit of a tape/archive model at a higher level is twofold. First, it frees the tape and library vendors from remaining mostly compatible with old tape architectures; instead they can make the low level changes necessary to yield a better balanced product. Second, the DMA vendors can omit the tape specific issues and focus on the administration of the data.

NDMP needs to provide an alternative to the existing tape interface, one that hides the device specific properties, and presents a generic archive interface that gives tape or library vendors the freedom to provide innovation and greater added value.

2.12.2. New Archive Media

Archive media other than tape, such as CDs, DVDs, and even disk based archives, are gaining some popularity today. NDMP should be generalized to allow for easy adoption of these and other emerging archive technologies.

2.12.3. Generalizing a Unix Based Tape Environment

The tape model in NDMP grew out of UNIX. Therefore the tape server architecture is heavily related to UNIX, and thus harder to implement in non-UNIX based operating systems.

NDMP needs a tape model which enables easier deployment and implementations across all the operating systems commonly hosting tape or archival devices or systems.

2.13. Optimal Network Path Selection

Due to the decrease in network cost, primary and secondary storage systems typically have multiple network interfaces, and are often connected to multiple networks or network segments. A file server may have multiple Ethernet interfaces, plus interfaces to Fibre Channel and other network fabric types.

Each interface may have different properties, based on the network type, network topology and network loading. This may impact the peak vs. guaranteed bandwidth, the reliability of the network, etc.
Depending upon the requirements of the data management application, it must be possible for the application to determine which of the available network paths and interfaces provides the best fit for an NDMP session.

NDMP based applications therefore need to provide the ability to discover the properties of the network topologies so as to enable the data management applications to configure an NDMP session according to specified service levels.

Primarily this should be achieved by interfacing to and using existing or emerging protocols for QoS management. NDMP v5 should interface to and interoperate with these new protocols.

2.14. Server or Application Initiated Operations

Enterprise applications often use caching techniques to improve performance. One implication of this is that the application needs to be "quiesced" before backup; the application's cache needs to be flushed in order to generate a data set in the storage system which is consistent, and can be replicated or backed up. Most database implementations require this.

In automated IT environments this implies that the state of, or operations on, the applications determine when data management applications should start.

One way of enabling such external control of NDMP sessions is through the primary storage system: applications can create snapshots (point in time copies of the data set), and when the data service or the DMA, by convention or by signal from the application, sees the new snapshot, it can initiate the session.

NDMP needs to provide mechanisms that allow external programs, including NDMP services, to initiate pre-configured NDMP sessions in the DMA.

2.15. IPS: iSCSI, FCIP and iFCP

A new set of technologies is being developed in the IETF for transporting block data over TCP/IP. In particular, iSCSI enables servers to mount and read/write block devices that are connected using Internet networks.
With iSCSI, primary storage devices can mount tape libraries directly over the IP network.

These new protocols need to be investigated to better understand if and how NDMP v5 can integrate, expose and exploit their capabilities.

2.16. Restoring From a Partially Damaged Backup

When a tape set becomes only slightly damaged, for instance if two tapes in a five tape backup have deteriorated, the full backup is lost. It would be of great value to enable the recovery of those elements of the tape or other backup media that are still intact.

Notice that this is a post backup scenario, thus it is not a matter of restarting the backup. Also, this problem is not solved by restarting a restore, as the missing data is completely lost. The issue here is to prepare the tapes or tape data with information that enables the DMA to parse fragments of a backup and make use of the fragments that exist.

2.17. Partial File Incremental Backup

Typical NDMP based backup methods support full and incremental backup options. Full backups transfer all files residing in the specified file system hierarchy to the secondary storage system. Incremental backups transfer only those files that have been modified since a specified date (last backup timestamp, etc.). However, both full and incremental backups transfer the complete file contents regardless of the amount of change to the file.

Many applications, such as databases, use very large files. This implies that despite minor changes in the data sets even incremental backups can generate large amounts of data.

NDMP must be extended to support new backup methods that perform incremental backups including only the modified parts of the file or volume.

2.18. Distributing the Config Interface

Most of the messages in this interface are used to configure other interfaces: data, tape, etc.
A suggestion is therefore to restructure the messages such that the configuration messages for an interface are part of that interface.

2.19. A Simple "Object Oriented" Architecture

Early discussions of "translate services" (see 2.11, "Additional Data Path Processing") uncovered that one could generalize the interface architecture of NDMP services such that all services were "core services" and implemented a core set of the interfaces/messages, and then specialized services such as tape, data, encryption, etc. were "sub-classes" of the general service class with their own specialized interfaces and messages.

Additionally, it has been pointed out that most of the messages in the CONFIG interface really belong with the interface (tape, data, mover, SCSI) that the message configures.

It is believed that re-architecture of NDMP according to these objectives would make NDMP simpler to understand, describe, implement and support.

2.20. Tape Verification

Backup data on tapes typically lies unused and unchecked for weeks or months. Archive data may sit off site for years. The tape medium is not 100% reliable, thus when a backup tape set is needed some or all of the data may have been lost.

NDMP should support the use of real or proxy data services to read and verify that the tapes themselves are intact, and that the data on the tapes is consistent in terms of data format and packaging format.

It is not clear whether this requires new support in NDMP, an extension, or can be supported via the current NDMP_DATA_START_RECOVER_FILEHIST message.

2.21. Data Stream Abstraction

Current NDMP backup methods exploit the design of existing file based backup utilities such as tar, dump or cpio. New data management methods for backup, replication, archiving and other services will necessitate that new data formats be conveyed between NDMP controlled source and destination systems.
Even though NDMP has never attempted to define the data stream structure between source and destination systems, existing NDMP v4 data channel services must be examined to ensure sufficient flexibility exists to accommodate new data management methods and associated data structures.

2.22. Supporting Contemporary File System Properties

NTFS and NFS version 4 support "composite" files which include multiple underlying objects, also called "streams". As an example, some implementations support such files as hidden directories with multiple hidden files.

During a recovery operation a problem arises when a DMA requests a file which is composed of multiple distinct objects. The DMA requests a single object, yet the NDMP session is responsible for recovering possibly a group of objects, located at different places on a tape set.

NDMP must provide the underlying protocol mechanisms required to be able to back up, mirror or recover such composite file objects.

2.23. Symbolic Links

Currently (v2-v4) the file history information does not include information about the destination of a symbolic link. This seriously limits the utility of recovering directories that include symbolic links.

The file history interface in NDMP v5 should provide the necessary mechanisms that handle symbolic links.

2.24. Spooling

One implementation of integrated checkpointing and restartability, and of multi source/destination sessions, is to partition the data stream into "segments", and to provide an intermediate "spooling service" which will manage the placement of segments on tapes, and provide both the multi source/destination multiplexing and the restartability.

2.25. File System Browsing

When looking for data to back up, NDMP relies on other file system access methods such as NFS or CIFS to browse the data set. It would be helpful if NDMP provided a native method for browsing of the primary storage system.

3. References

[1] NDMP version 4 draft specification. Work in progress. See www.ndmp.org.

4. Authors and Contributors

4.1. Document Author

Harald Skardal
Network Appliance Inc.
Email: harald.skardal@netapp.com

4.2. Contributors

Clive Hendrie
BlueArc Corporation
Email: chendrie@bluearc.com

Greg Linn
Network Appliance Inc.
Email: Greg.Linn@netapp.com

Dave Manley
Network Appliance
Email: David.Manley@netapp.com

Jim Ward
Workstation Solutions Inc.
Email: jimw@worksta.com