Internet Draft                                            Julian Satran
Document: draft-satran-transport-adaptation-framework-00.txt        IBM
Expires June 2001                                     December 14, 2000


                 Transport Adaptation Framework (TAF)


Status of this Memo

   This document is an Internet-Draft and is NOT offered in accordance
   with Section 10 of RFC2026, and the author does not provide the IETF
   with any rights other than to publish as an Internet-Draft.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or made obsolete by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This draft is an attempt to outline a "technique" (I hesitate to
   call it a protocol) to adapt a transport protocol to application
   needs, outside the protocol payload, without interfering with the
   operation of the transport protocol or directly affecting its
   implementation.  It is based on an "interposer" header between the
   IP header and the transport header (TCP or UDP) and, when needed, a
   trailer that is appended to the transport PDU.  The header and
   trailer are built for specific transport flows by routines
   registered as application specific extensions and "activated" by the
   applications at the two ends of the communication channel through
   association tables.  The two transfer directions are considered as
   separate entities, and this framework does not prescribe any
   relation between them - although some applications may do so.

Introduction

   Motto: The most dangerous thing in the world is to think you
   understand something (from an old calendar)

   This draft is very preliminary.  In fact, it is more a "teaser" than
   a full-fledged specification.  It is meant to check the soundness of
   several concepts and to start a process.

   Transport protocols for IP networks are designed for a very wide
   class of applications.  Their performance, robustness and
   interoperability require extreme restraint on the part of designers.
   Changes have to be avoided, and when changes become unavoidable,
   their introduction has to be preceded by a long period of testing.

   However, IP networks are expanding rapidly into new application
   domains, and this expansion is accompanied by a plethora of new
   requirements.

   The transport protocol family has been expanded to include a new
   point-to-point transport with widely expanded functionality (SCTP),
   and work is being done on reliable multicast transport protocols.

   However, the "classical" TCP and UDP, due to their maturity and wide
   deployment, are the preferred choice for a wide variety of
   vendor-provided applications.

   Many of the new applications could benefit from functions available
   in a new transport (like SCTP), but no vendor is going to "bet the
   business" on an all-new and radically different transport protocol
   unless there is a transition plan that allows the vendor to start
   with a mature transport protocol and then migrate to a new one.

   Moreover, there is no technology that provides a "functional
   continuum" between the functions built into different transports
   and reduces the choice of base protocol to a simple choice between
   application-specific and "pre-packaged" functions.  This lack is
   hampering application development and is stressing protocols far
   beyond the original intent of their designers.

   Several attempts have been made, and more are underway, to solve a
   similar set of issues with "in band" mechanisms - i.e.
   completely contained within the transport data.

   This draft is an attempt to outline a "technique" (I hesitate to
   call it a protocol) to adapt a transport protocol to application
   needs, outside the protocol payload, without interfering with the
   operation of the transport protocol or directly affecting its
   implementation.  It is based on an "interposer" header between the
   IP header and the transport header (TCP or UDP) and, when needed, a
   trailer that is appended to the transport PDU.  The header and
   trailer are built for specific transport flows by routines
   registered as application specific extensions and "activated" by the
   applications at the two ends of the communication channel through
   association tables.  The two transfer directions are considered as
   separate entities, and this framework does not prescribe any
   relation between them - although some applications may do so.

   Unlike other "transport extension" mechanisms, this mechanism does
   not consume option space, can operate on every transport (UDP, TCP
   and even SCTP) and can be easily activated for a flow or any group
   of flows.  Several application extensions can be applied in
   sequence, in a fashion controlled by a specific application.

Acknowledgements

   The design of this scheme is close to the IPv6 header extension
   structure and to the IPsec header and its activation.  As such, I am
   grateful to the known and anonymous contributors to the IPv6 and
   IPsec designs.  As this draft was also driven by the large strides
   made by SCTP in providing a transport protocol with far richer
   functionality than TCP, I am grateful to the known and anonymous
   SCTP contributors for setting a new bar in transport protocol
   design.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC-2119].

1. Overview

   The structure of a datagram with current transports looks as
   follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  0|                           IP-Header                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
n-m|                      IPsec-Headers(opt)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
m-h|               Transport-Header (UDP, TCP, SCTP)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
h-t|                      Application payload                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
t-e|                      IPsec trailer (opt)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   For the TAF extension, a TAF header or sequence of headers is
   inserted between the IPsec-Headers and the Transport-Header, and a
   TAF trailer or sequence of trailers is inserted just before the
   IPsec trailer.  The datagram format is now:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  0|                           IP-Header                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
n-m|                      IPsec-Headers(opt)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
m-x|                       TAF-Headers(opt)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
x-h|               Transport-Header (UDP, TCP, SCTP)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
h-t|                      Application payload                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
t-y|                       TAF-Trailers(opt)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
y-e|                      IPsec trailer (opt)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1.1 TAF Header format

   The general format of a TAF-Header is:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  0|      NHT      |   TAF-type    |    Length     |   Reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
n-m|            Application Specific Fields (optional)             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1.1.1 NHT - next header type

   In every TAF header except the last, NHT specifies TAF.  In the
   last TAF header, the NHT field contains the transport protocol type
   (TCP, UDP, SCTP, etc.).

1.1.2 TAF-type

   The application transport adaptation type.  Types 0x'00' to 0x'7f'
   are registered with IANA.  Types 0x'80' to 0x'ff' are vendor
   specific application adaptation extensions.

1.1.3 Length

   Header length in 32-bit words, minus 1.

1.2 TAF operation

   Every outgoing IP packet is examined.  If it belongs to a transport
   flow or group of flows (e.g., all communications to a well known
   port) defined in the TAF-Selector-Table, the TAF operations
   specified by the table are performed and the corresponding TAF
   headers are inserted in the packet.  In addition, for some
   application extensions and connection protocols, action routines
   may save specific state to, or use state from, connection specific
   data structures.

   Every incoming IP packet is checked against the TAF-Selector-Table
   for the presence of the required TAF headers, and then the specific
   operations are performed; packets missing required headers and/or
   trailers are dropped.

   Some TAF headers don't have to appear on every packet.
   Some TAF headers imply a TAF trailer.
   Some TAF trailers do not have a header counterpart (is this a good
   idea worth exploring?).

   Input packets containing TAF headers not specified in the
   TAF-Selector-Table have the headers (and implied trailers) stripped
   before being delivered to the transport for further processing.
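
   As an illustration of the header format in section 1.1 and the
   input processing in section 1.2, the following Python sketch packs
   a TAF header and walks a chain of TAF headers.  It is not part of
   this draft's definitions.  It assumes that each of the four fixed
   fields is 8 bits wide (read from the header diagram), and it uses a
   placeholder NHT value, TAF_NHT, to mean "another TAF header
   follows", since no such value is assigned here.  The function names
   are invented for the example.

```python
import struct

# Hypothetical placeholder: the draft does not assign the NHT value
# that identifies "the next header is another TAF header".
TAF_NHT = 253
IPPROTO_TCP = 6  # IP protocol number for TCP


def pack_taf_header(nht, taf_type, app_fields=b""):
    """Build one TAF header (assumed layout: four 8-bit fixed fields)."""
    if len(app_fields) % 4 != 0:
        raise ValueError("application fields must be a multiple of 4 bytes")
    # Length is the header length in 32-bit words, minus 1, so a header
    # with no application specific fields has Length 0.
    length = len(app_fields) // 4
    if length > 0xFF:
        raise ValueError("header too long for the assumed 8-bit Length")
    return struct.pack("!BBBB", nht, taf_type, length, 0) + app_fields


def parse_taf_chain(data):
    """Walk TAF headers until NHT names a transport protocol.

    Returns (headers, transport_protocol, remaining_bytes), where
    headers is a list of (taf_type, application_fields) tuples.
    """
    headers = []
    offset = 0
    while True:
        if len(data) - offset < 4:
            raise ValueError("truncated TAF header")
        nht, taf_type, length, _reserved = struct.unpack_from(
            "!BBBB", data, offset)
        size = (length + 1) * 4
        if offset + size > len(data):
            raise ValueError("TAF header extends past end of packet")
        headers.append((taf_type, data[offset + 4:offset + size]))
        offset += size
        if nht != TAF_NHT:
            # Last TAF header: NHT carries the transport protocol type.
            return headers, nht, data[offset:]


# Two chained TAF headers followed by a stand-in for the TCP segment.
chain = (pack_taf_header(TAF_NHT, 0x10, struct.pack("!I", 99))
         + pack_taf_header(IPPROTO_TCP, 0x01)
         + b"tcp-segment")
headers, proto, rest = parse_taf_chain(chain)
```

   A real implementation would also consult the TAF-Selector-Table to
   decide which of the parsed headers are required, which are to be
   stripped, and whether the packet is to be dropped; that lookup is
   left out of this sketch.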

1.3 TAF negotiation

   For TAF to operate, the TAF-Selector-Tables at the two ends have to
   be synchronized.  The table synchronization has to be handled by the
   communicating applications through an adaptation synchronization
   protocol.  This is a two-phase protocol modeled after ISAKMP (TBD).

2. Lightweight end-to-end data integrity - TAF types 0x'00' to 0x'0f'

   Through this application extension, transport packets can carry
   stronger data integrity checks than the ones provided by the usual
   TCP and UDP checksums.

   The integrity check value (ICV) will be carried by the trailer.
   This will enable in-flight computation of the ICV.

   Separate ICVs can be computed for the transport header and the
   application payload, to enable simple transformations in transport
   proxy units and end-to-end integrity checks on the payload.

   NB - different integrity checks have to be provided for application
   level proxies, and an attempt at a generic solution for those is
   discussed later.

3. Message Framing - TAF type 0x'10'

   This application extension enables a streaming-type transport to
   mark end-to-end boundaries for messages in the stream.

   It is assumed that the application will use a "message-type" API
   (send-message, receive-message) that will convey the message
   boundaries to the protocol stack.  For reliable transports, the
   message boundary "history" will be kept for the complete send
   window, with the ACK mechanism removing the "obsolete" message
   boundaries.
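
   The boundary "history" described above can be sketched as a small
   sender-side structure.  This is only one possible reading: the class
   and method names are invented, stream positions are modeled as plain
   byte offsets from the start of the stream rather than wrapping
   sequence numbers, and the lookup used for retransmitted segments is
   an assumption about why the history is kept.

```python
from bisect import bisect_left, bisect_right


class BoundaryHistory:
    """Message-end offsets for data still inside the send window."""

    def __init__(self):
        self._ends = []      # sorted offsets of the last byte of each message
        self._next = 0       # stream offset of the next byte to be sent

    def send_message(self, length):
        """Record the boundary of a message handed over by send-message."""
        if length <= 0:
            raise ValueError("message length must be positive")
        self._next += length
        self._ends.append(self._next - 1)

    def acked(self, ack_offset):
        """Drop boundaries made obsolete by an ACK.

        ack_offset is the offset of the first byte not yet
        acknowledged, so every boundary below it can be forgotten.
        """
        obsolete = bisect_right(self._ends, ack_offset - 1)
        del self._ends[:obsolete]

    def ends_in(self, start, stop):
        """Boundaries inside the segment [start, stop), e.g. for a
        retransmission that has to carry the same boundaries again."""
        lo = bisect_left(self._ends, start)
        hi = bisect_left(self._ends, stop)
        return self._ends[lo:hi]

    def pending(self):
        return list(self._ends)


history = BoundaryHistory()
for size in (100, 50, 10):
    history.send_message(size)
segment_ends = history.ends_in(100, 160)   # boundaries in bytes 100..159
history.acked(150)                         # bytes 0..149 acknowledged
```

   Keeping the offsets sorted makes both the ACK pruning and the
   per-segment lookup a binary search; a production stack would also
   have to handle sequence number wrap-around.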

   A message boundary TAF header will look like:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  0|      NHT      |    0x'10'     |    Length     |   Reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  4|                   First-Message-End-pointer                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Next-Message-End-pointer(s) (optional)             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Message-end-pointers point to the last byte of a user message,
   relative to the start of this transport PDU payload (in a generic
   way).  If the current transport PDU does not contain a message-end
   boundary, the whole message boundary header may be omitted.

4. Remote DMA - TAF type 0x'11'

   This type of application extension enables the sender to steer
   portions of the payload into predefined locations in the receiver
   memory.

   It is assumed that the application will use an "RDMA-message-type"
   API on the sender side (RDMA-write) and a message-type API on the
   receiver side (receive-message) that will convey the message
   boundaries and the steering information to the protocol stack.  For
   reliable transports, the message steering "history" will be kept for
   the complete send window, with the ACK mechanism removing the
   "obsolete" message information.

   An RDMA TAF header will look like:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  0|      NHT      |    0x'11'     |    Length     |   Reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  4|                      First-Steering-Tag                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 12|       First-Data-Offset       |       First-Data-Length       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E|                          Reserved                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 20|                     Next-Steering-Tag(s)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 28|      Next-Data-Offset(s)      |      Next-Data-Length(s)      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E|                          Reserved                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Steering-Tag is either a complete address (or equivalent) in
   receiver memory, or a tag and offset understood by the receiver
   through a well-defined convention with the sender.

   The tag issue - Is it possible to define the steering tags so that
   tag re-computation can be done by the sender in a generic way (for
   retransmissions), or are application specific RDMA codes needed?

   The E flag indicates that the specific piece of data is a "message"
   end; if all the preceding data in the transport stream has been
   placed, a pending receive-message call at the receiver can be
   terminated.  Please note that the E flag is "invariant" at
   retransmission (it always appears at the same point in the stream).

5. Integrity Encapsulated RDMA - TAF type 0x'18'

   In addition to the RDMA information, every data piece is accompanied
   by an ICV if the E flag is set.
   The ICVs are calculated for the entire message at the first
   transmission and are kept until made obsolete by an ACK.

   The receiver is supposed to use a receive-message API with two
   options - Receive-ICV-without-check and Receive-ICV-with-check.  The
   first option is to be used by application proxies to "pass on" the
   ICV, while the second can be used by end-points.

   The RDMA-write API will also have two options - compute-ICV or
   pass-ICV.

6. IANA Considerations

   NHT and the types will be registered with IANA.

7. References and Bibliography

   [RFC793]    "Transmission Control Protocol", RFC 793.

   [RFC1122]   Braden, R. (editor), "Requirements for Internet Hosts --
               Communication Layers", RFC 1122.

   [RFC2026]   Bradner, S., "The Internet Standards Process -- Revision
               3", RFC 2026, October 1996.

   [RFC-2119]  Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC-2434]  Narten, T., and H. Alvestrand, "Guidelines for Writing
               an IANA Considerations Section in RFCs", RFC 2434,
               October 1998.

8. Author's Addresses

   Julian Satran
   IBM, Haifa Research Lab
   MATAM - Advanced Technology Center
   Haifa 31905, Israel
   Phone +972 4 829 6211
   Email: Julian_Satran@il.ibm.com

   Comments may be sent to Julian Satran

Appendix A. Rationale

   Many of the application issues we attempt to solve through the
   mechanisms described in this document can be solved through
   alternative mechanisms.  Data integrity can be provided through
   digests included in the payload data; data framing can be attempted
   using byte stuffing, markers, or higher-level PDUs aligned at fixed
   boundaries.  RDMA can be done with application specific mechanisms.
   Implementing them for high-speed networks will probably benefit from
   (if not require) hardware assists and "in kernel" support in the
   form of "shims".
   We assumed that if support at such a basic level is mandated, then
   we should at least think in terms of a generic mechanism and try to
   maintain and use the basic transport mechanisms (such as recovery).

   Using shims and building purely within the transport stream leads
   to rebuilding within a higher layer some of the functions already
   present in the transport (like recovery from transport errors in
   case of a failed data integrity check).

   Since user API changes are required for both techniques, we felt
   that transport application assists built this way will be simpler to
   implement and deploy, and that their operation will be more robust.

   One additional rationale for specialized headers was that the
   transition to IPv6 will enable the "smooth" incorporation of this
   mechanism into the IPv6 destination options.  The insertion
   technique for those options, once available, can be used for both
   IPv4 and IPv6.

Full Copyright Statement

   "Copyright (C) The Internet Society (date). All Rights Reserved.
   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."