INTERNET-DRAFT                                                  J. Wendt
draft-wendt-direct-access-reqs-00.txt                          M. Krause
Expires: May 2002                                                     HP
                                                                B. Aboba
                                                               Microsoft
                                                               S. Bailey
                                                               Sandburst
                                                               D. Garcia
                                                                  Compaq
                                                              A. Romanow
                                                                   Cisco
                                                           November 2001


                       Direct Access Requirements
               < draft-wendt-direct-access-reqs-00.txt >


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (2001). All Rights Reserved.

Abstract

   Current networking systems generally rely on data copy operations to
   move incoming data from network protocol stack buffers into
   application buffers. These data copy operations consume significant
   portions of a host system's processor-memory bandwidth and
   instruction cycle count, resources that could be applied to running
   applications.

Wendt, et al                Expires May 2002                    [Page 1]

Internet-Draft         Direct Access Requirements            13 Nov 2001

   Existing copy-avoidance schemes are limited by specialization and
   constrained applicability. Upper layer protocol mechanisms for
   placing data directly into application buffers require hardware
   support on a per-protocol basis.

   This document defines a basic set of requirements for an
   architecture and protocol that facilitates efficient high speed
   buffer data transfer and low latency communication.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

Table of Contents

   1 Terminology
   2 Introduction
   3 Solution Requirements
     3.1 Application Compatibility
     3.2 Programming Model Support
     3.3 Implementation Methods
     3.4 Infrastructure Compatibility
     3.5 Data Transfer
     3.6 Data Integrity
     3.7 Error Reporting
   4 General Requirements
     4.1 Performance
     4.2 Recovery
     4.3 Robustness
     4.4 Scalability
     4.5 Efficiency
     4.6 Congestion Avoidance
   5 Security Considerations
     5.1 Security Requirements
     5.2 Security Discussion
   6 References
   7 Full Copyright Statement
   8 Acknowledgements
   9 Authors' Addresses

1 Terminology

   This document uses the following terms and abbreviations:

   LLP  - Lower Layer Protocol

   RDMA - Remote Direct Memory Access - a method of accessing memory on
          a remote system without interrupting the processing of the
          CPU(s) on that system.

   ULP  - Upper Layer Protocol

2 Introduction

   This document describes the requirements that need to be met in
   order to address the limitations of the Internet Protocol suite in
   very large scale high speed data transfer. The ubiquity of the IP
   suite derives largely from its flexibility, as well as the benefits
   of an open standard. One domain in which IP has not been widely used
   is application-to-application, very high speed data transfer.

   Much of today's usage of the Internet and IP networks is for
   buffer-to-buffer data transfers, often in the form of bulk data and
   inter-process communications, using a variety of Internet protocols,
   including HTTP, FTP, NFS, CIFS, and, soon, iSCSI. Gigabit and faster
   network transfers incur heavy system resource costs, including both
   CPU use and system bus bandwidth, particularly on the receiving
   side.

   The costs incurred on hosts for protocol processing and management
   have inhibited the use of IP in the high speed bulk data domain.
   Until fairly recently this domain was a specialized niche, limited
   primarily to university research and scientific supercomputing.
   However, since the Internet has become the backbone of much of
   world-wide commerce, another venue where high speed data transfer is
   critical has come to play a central role in the evolution of the
   Internet: the data center.

   Bulk data transfer overhead is dominated by the costs of copying and
   validating incoming data in order to place it at its ultimate
   destination. Small data message delivery overhead is dominated by
   the costs of protocol processing and operating system interaction.

   Of equal importance in the data center is low latency communication.
   Minimizing the latency incurred in delivery of small data messages
   is critical to client/server application performance.

   Several approaches have been used in research and industry to
   address the issue of CPU overhead and memory bandwidth costs
   associated with large scale buffer transfer and low latency
   communication. One approach has consisted of a variety of
   copy-avoidance schemes, such as Zero Copy TCP in particular OS
   implementations. These have had limited success because they are
   complex, highly specialized, and tend to be fragile.

   Another approach has been for Upper Layer Protocols (ULPs) to place
   data directly into application buffers using hardware support, such
   as SCSI/FibreChannel. This solution has the significant disadvantage
   of requiring specialized hardware for each different ULP. This makes
   solutions expensive to implement and deploy, and inflexible when
   evolution is necessary.

   A third approach has been direct data placement through RDMA
   technology. There have been both proprietary and standards-based
   solutions, such as DEC's Memory Channel, Virtual Interface
   Architecture (VIA), and InfiniBand. An RDMA protocol makes it
   possible for a network interface to place data from an incoming
   packet directly into the memory where the application wants the
   data. To reduce latency, the application communicates directly with
   the network interface using a kernel-bypass mechanism that
   eliminates operating system overhead.

   While this direct data placement approach provides a viable
   solution, it is not directly compatible with the Internet Protocol
   framework. Data centers must support multiple networking protocol
   technologies. It would be technically more efficient, commercially
   less expensive, and therefore more easily deployable to have a
   standard IP solution for high speed transfer which could be used for
   all ULPs.

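   The direct data placement idea described above can be illustrated
   with a small model. This is only a sketch under stated assumptions:
   this document defines no wire format or API, so the names Segment,
   Receiver, register and place, and the tag/offset fields, are
   hypothetical. The sketch shows a receiver writing each incoming
   payload straight into a buffer the application registered earlier,
   without staging it in an intermediate protocol stack buffer, and
   refusing writes that fall outside that buffer.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical incoming segment; field names are illustrative only."""
    tag: int        # identifies a buffer the application registered earlier
    offset: int     # byte offset within that buffer
    payload: bytes


class Receiver:
    """Toy model of a network interface that places data by tag and offset."""

    def __init__(self):
        self._buffers = {}  # tag -> writable memoryview of application memory

    def register(self, tag, buf):
        # Keep a view onto the application's bytearray, not a copy of it.
        self._buffers[tag] = memoryview(buf)

    def place(self, seg):
        view = self._buffers.get(seg.tag)
        if view is None:
            raise KeyError(f"unknown buffer tag {seg.tag}")
        end = seg.offset + len(seg.payload)
        if seg.offset < 0 or end > len(view):
            raise ValueError("segment falls outside the registered buffer")
        # Write straight into application memory; no intermediate buffer.
        view[seg.offset:end] = seg.payload


app_buffer = bytearray(8)
rx = Receiver()
rx.register(1, app_buffer)
# Segments may arrive in any order; each carries its own placement info.
rx.place(Segment(tag=1, offset=4, payload=b"WXYZ"))
rx.place(Segment(tag=1, offset=0, payload=b"abcd"))
```

   Because every segment names its destination, the receiver in this
   model needs no reassembly buffer, and the bounds check is one simple
   way to keep writes inside memory the application has exposed.
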
   The proposed WG and this document investigate the requirements that
   are necessary for a generic IP suite solution to enable efficient
   support of very large scale high speed buffer transfer and low
   latency communication in multi-gigabit networks.

3 Solution Requirements

3.1 Application Compatibility

   3.1.1 The protocol MUST efficiently support a wide variety of
   application classes, such as storage (block & file transfer),
   clustering, inter-process communication and high performance
   computing.

   3.1.2 The protocol MUST efficiently support existing high-bandwidth
   upper layer protocols, but not necessarily without any modification
   to the upper layer protocols or their implementations.

3.2 Programming Model Support

   3.2.1 The protocol MUST support efficient implementation of widely
   used buffer-to-buffer programming models, such as the Message
   Passing Interface (MPI) and the Virtual Interface Architecture
   (VIA).

   3.2.2 The protocol MUST support efficient implementation of a stream
   programming model, such as the Sockets Direct Protocol (SDP).

3.3 Implementation Methods

   3.3.1 The protocol MUST permit a wide range of implementation
   methods.

3.4 Infrastructure Compatibility

   3.4.1 The protocol MUST function correctly over all manner of IP
   infrastructure (e.g., point-to-point, dedicated LANs, wide area
   networks, the public Internet) to the extent allowed by the
   application and other protocol layers above and below it.

   3.4.2 The protocol MUST function correctly when passing through
   middleboxes (e.g., NATs, NAPTs, firewalls), to the extent allowed by
   the application and other protocol layers above and below it.

   3.4.3 The protocol MUST be compatible with both IPv4 and IPv6.

3.5 Data Transfer

   3.5.1 The protocol MUST enable a suitable implementation to deliver
   data directly from application buffers to the network under most
   circumstances.

   3.5.2 The protocol MUST enable a suitable implementation to deliver
   data directly from the network into application buffers under most
   circumstances.

3.6 Data Integrity

   3.6.1 The protocol MUST provide strong data corruption detection for
   both protocol headers and application data.

3.7 Error Reporting

   3.7.1 The protocol SHOULD report protocol-level errors. Reporting of
   protocol failure reasons facilitates diagnostic activities.

   3.7.2 The protocol MAY report host system errors resulting from
   protocol-specified operations.

4 General Requirements

4.1 Performance

   4.1.1 The protocol MUST have minimal impact on the intrinsic data
   transfer latency characteristics of the application and other
   protocol layers above and below it.

   4.1.2 The protocol MUST have minimal impact on the intrinsic data
   transfer throughput characteristics of the application and other
   protocol layers above and below it.

   4.1.3 The protocol MUST function efficiently over a wide variety of
   high-bandwidth IP network topologies (e.g., point-to-point,
   multipoint, low round-trip time and high round-trip time).

4.2 Recovery

   4.2.1 The protocol SHALL NOT be required to ensure the state of
   system resources after protocol-level or system-level failures.

   4.2.2 The protocol MUST allow suitable application implementations
   to recover system state after a protocol-level or system-level
   failure.

4.3 Robustness

   4.3.1 The protocol MUST allow solutions to be implemented and
   deployed in such a way that there is no single point of failure in
   the system.

4.4 Scalability

   4.4.1 The protocol MUST permit target implementations with a wide
   range of processing capabilities (e.g., personal computer, server,
   multiprocessor, supercomputer).

4.5 Efficiency

   4.5.1 The protocol MUST complete operations with a minimal number of
   protocol exchanges.

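   As an illustration of requirement 3.6.1, the following sketch frames
   a message so that a single check value covers both the header and
   the application data. Everything here is an assumption made for the
   example: this document names no algorithm and no header layout, the
   frame and unframe helpers are hypothetical, and CRC-32 (via Python's
   zlib.crc32) is used only as a readily available stand-in that may or
   may not meet the intended meaning of "strong".

```python
import struct
import zlib
from typing import Tuple

# Hypothetical header layout for illustration only: tag, offset and
# payload length, each a 32-bit unsigned integer in network byte order.
HEADER = struct.Struct("!III")
TRAILER = struct.Struct("!I")


def frame(tag: int, offset: int, payload: bytes) -> bytes:
    """Build header + payload, then append a CRC-32 computed over both."""
    body = HEADER.pack(tag, offset, len(payload)) + payload
    return body + TRAILER.pack(zlib.crc32(body))


def unframe(data: bytes) -> Tuple[int, int, bytes]:
    """Verify the check value, then return (tag, offset, payload)."""
    if len(data) < HEADER.size + TRAILER.size:
        raise ValueError("frame too short")
    body, trailer = data[:-TRAILER.size], data[-TRAILER.size:]
    (expected,) = TRAILER.unpack(trailer)
    if zlib.crc32(body) != expected:
        raise ValueError("checksum mismatch: header or payload corrupted")
    tag, offset, length = HEADER.unpack_from(body)
    payload = body[HEADER.size:]
    if length != len(payload):
        raise ValueError("length field does not match payload size")
    return tag, offset, payload


wire = frame(7, 16, b"hello")
tag, offset, payload = unframe(wire)
```

   Covering the header with the same check value matters for direct
   placement, since a corrupted tag or offset would otherwise steer
   intact data to the wrong location.
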
4.6 Congestion Avoidance

   4.6.1 The combination of the protocol, and the application and other
   protocol layers above and below it, should not introduce behavior
   that exacerbates network congestion.

5 Security Considerations

   Security is provided by the combination of the protocol and the
   other layers above and below it.

5.1 Security Requirements

   5.1.1 The protocol MUST NOT provide protocol-specific security
   services, such as authentication, authorization, confidentiality and
   replay protection. Note that such security services MUST be provided
   in applications of the protocol, by the application or other
   protocol layers above and below it.

   5.1.2 Implementations of the protocol MUST protect application
   buffer contents from being retrieved or modified, other than as
   specified by the protocol.

5.2 Security Discussion

   The protocol is carried over a lower layer protocol (LLP).
   Therefore, both the control and data packets of the protocol are
   vulnerable to attack. Examples of attacks include:

   1. An adversary may try to discover user identities by snooping data
      packets.

   2. An adversary may try to modify packets (both control and data).

   3. An adversary may try to hijack a protocol channel.

   4. An adversary can launch denial of service attacks by terminating
      the protocol connections, such as by sending a TCP reset.

   5. An authenticated or unauthenticated adversary can attempt to read
      or write from unauthorized memory regions.

   6. On a multi-user system, an adversary can inject packets into an
      established protocol connection opened by another user.

   In general these concerns are most typically addressed via lower
   layer protocol security mechanisms (e.g., Transport Layer Security
   [TLS] and IPsec [RFC2401]). Such mechanisms must provide the
   following functions:

   1. Allow for mutual authentication at the start of a protocol
      peer-to-peer association.

   2. Allow for preservation of the control messages once the
      association has been established.

   3. Allow for optional confidentiality protection of control
      messages. (If provided, the mechanism should allow a choice in
      the algorithm to be used.)

   4. Operate across untrusted domains between the protocol peers in a
      secure fashion.

   5. Support non-repudiation for a customer-located device
      communicating with a network operator's device.

   6. Define mechanisms to mitigate denial of service attacks.

   7. Define mechanisms to mitigate replay attacks on the control
      messages.

   Note: any associated protocol document will need to include an
   extended discussion of security requirements, offering more
   precision on each threat and giving a complete picture of the
   defense including non-protocol measures such as configuration.

6 References

   [RDMA]    Sapuntzakis, C., et al., "The Case for RDMA", Internet
             Draft draft-csapuntz-caserdma-00.txt, September 2000.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2401] Kent, S. and R. Atkinson, "Security Architecture for the
             Internet Protocol", RFC 2401, November 1998.

   [TLS]     Dierks, T. and C. Allen, "The TLS Protocol Version 1.0",
             RFC 2246, November 1998.

7 Full Copyright Statement

   The following copyright notice is copied from RFC 2026 [Bradner,
   1996], Section 10.4, and describes the applicable copyright for this
   document.

   Copyright (C) The Internet Society (2001). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works.
   However, this document itself may not be modified in any way, such
   as by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the
   procedures for copyrights defined in the Internet Standards process
   must be followed, or as required to translate it into languages
   other than English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

8 Acknowledgements

   The authors would like to acknowledge the contributions of Paul
   Culley, Edward Gardner, David Minturn, Jim Pinkerton, Renato Recio,
   and other members of the RDMA mailing list.

9 Authors' Addresses

   Jim Wendt
   Hewlett Packard Corporation
   8000 Foothills Boulevard
   Roseville, CA 95747-5668
   Phone: 916 785 5198
   Email: jim_wendt@hp.com

   Bernard Aboba
   Microsoft Corporation
   One Microsoft Way
   Redmond, WA 98052
   Phone: 425 936 6605
   Email: bernarda@microsoft.com

   Stephen Bailey
   Sandburst Corporation
   600 Federal Street
   Andover, MA 01810
   Phone: 978 689 1614
   Email: steph@sandburst.com

   Dave Garcia
   Compaq Computer Corp.
   19333 Vallco Parkway
   Cupertino, CA 95014 USA
   Phone: 408 285 6116
   Email: dave.garcia@compaq.com

   Mike Krause
   Hewlett Packard Corporation
   19420 Homestead Road 43LN
   Cupertino, CA 95014
   Phone: 408 447 3191
   Email: krause@cup.hp.com

   Allyn Romanow
   Cisco Systems, Inc.
   170 W. Tasman Drive
   San Jose, CA 95134
   Phone: 408 525 8836
   Email: allyn@cisco.com