Internet-draft                                          J. Williams
Expires: September 2001                                      Emulex
                                                       J. Pinkerton
                                                          Microsoft
                                                     C. Sapuntzakis
                                                              Cisco
                                                           J. Wendt
                                                                 HP

                        ULP Framing for TCP
                   draft-williams-tcpulpframe-00

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Copyright Notice

Copyright (C) The Internet Society (2001). All Rights Reserved.

Abstract

The proposed framing protocol accepts PDUs from a ULP (upper level
protocol) and transports them over a TCP connection. This is done in
such a manner that the receiver can recover the ULP PDUs, even if
preceding TCP segments have not yet been received. This is a
particularly powerful technique when the ULP provides PDUs which are
self-describing and which fit entirely within one TCP segment. This
allows incoming packets to be processed in the order received, and
their data to be placed directly in the ultimate destination memory.

Williams               Expires September 2001                [Page 1]

Internet-Draft           ULP Framing for TCP         23 February 2001

Introduction

The proposed framing protocol accepts PDUs from a ULP (upper level
protocol) and transports them over a TCP connection. This is done in
such a manner that the receiver can recover the ULP PDUs, even if
preceding TCP segments have not yet been received.
This is a particularly powerful technique when the ULP provides PDUs
which are self-describing and which fit entirely within one TCP
segment. This allows incoming packets to be processed in the order
received, and their data to be placed directly in the ultimate
destination memory.

To fully exploit the framing protocol, a special framing-aware TCP
implementation must be used. However, these special TCP
implementations will be fully compliant with all governing RFCs and
fully interoperable with all existing compliant TCP implementations.
In the absence of such a special TCP implementation, the protocol
will be fully functional, but will not allow for the recovery of
framing on out-of-order packets.

1. ULP Support for Framing

A ULP will submit PDUs to the framing protocol. In the standard
mode, the ULP PDUs are limited to the smaller of 2^16-1 (65535)
bytes and the size that will fit entirely within a TCP segment. The
framing protocol MUST fail any attempt to submit a ULP PDU that is
larger than will fit in a TCP segment.

The TCP maximum segment size (MSS) can shrink to 8 bytes [see
PathMTU], which leaves no room for ULP PDUs. If the MSS goes below
512 bytes, the ULP MAY instruct the framing protocol to enter an
"emergency mode." In this mode, the framing protocol MUST accept ULP
PDUs up to 512 bytes and MAY fragment the ULP PDUs across TCP
segments.

Framing-aware TCP implementations will indicate to the framing
protocol the path maximum segment size (MSS). This size may change
during the course of the connection due to changes in the path MTU.
The framing protocol MUST notify the ULP sender of changes in the
MSS. The framing protocol MUST provide the current value of the path
MSS to the ULP on request.

2. Framing Protocol

Each ULP PDU will be encapsulated in a framing PDU. The format of
the framing PDU is as follows.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             |              Key              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              Key                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                                                               |
~                                                               ~
~                            ULP PDU                            ~
|                                                               |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            ULP PDU            |      PAD (up to 3 bytes)      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The "Length" field is 16 bits and contains the length in bytes of
the ULP PDU.

The PAD field trails the ULP PDU and contains between zero and three
bytes of data. The pad data must be set to zero by the sender and
ignored by the receiver. The length of the pad is set so as to make
the length of the framing PDU a multiple of four bytes.

The KEY field is 48 bits and its usage depends on whether the
sending TCP implementation is framing-aware.

If the sending TCP implementation is NOT framing-aware (i.e., is a
conventional TCP implementation), then the framing protocol must set
the KEY to zero.

If the sending TCP implementation is framing-aware, then the KEY
value is a non-zero random value selected by the sender at
connection setup time. All framing PDUs sent on a given connection
in one direction must use the same (original) KEY value. Each
direction will in general have a different KEY value.

The length of the framing PDU in bytes will be 8 + round(L), where L
is the length of the ULP PDU and round(L) is the value of L rounded
up to a multiple of four (L itself if it is already a multiple of
four). The length of the ULP PDU may or may not be a multiple of
four. For example, a 5-byte ULP PDU yields a 16-byte framing PDU: an
8-byte header, 5 bytes of ULP PDU, and 3 bytes of pad.

3. Encapsulation of Framing PDUs within a TCP Stream

If a TCP connection supports the ULP Framing Protocol, then all data
sent on that connection must be framing PDUs. There is no provision
to mix both framed and unframed data on the same connection.
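As an illustration of the framing PDU format in Section 2, the
following C fragment sketches how a sender might build one framing
PDU. It is a minimal sketch, not part of the protocol definition.
The names framing_pdu_size and framing_encode are hypothetical, and
the use of network byte order (most significant byte first) for the
Length and KEY fields is an assumption, since the text above does
not state a byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAMING_HEADER_LEN 8u   /* 16-bit Length + 48-bit KEY */

/* round(L): L rounded up to a multiple of four. */
static size_t round_up4(size_t n)
{
    return (n + 3u) & ~(size_t)3u;
}

/* Size of the framing PDU for a ULP PDU of pdu_len bytes: 8 + round(L). */
size_t framing_pdu_size(size_t pdu_len)
{
    return FRAMING_HEADER_LEN + round_up4(pdu_len);
}

/*
 * Hypothetical helper: write one framing PDU into out.
 * key carries the 48-bit KEY in its low bits (zero for a sender whose
 * TCP is not framing-aware). Returns the number of bytes written, or 0
 * if the ULP PDU exceeds 65535 bytes, the key does not fit in 48 bits,
 * or out is too small.
 * Assumption: multi-byte fields are written most significant byte first.
 */
size_t framing_encode(uint8_t *out, size_t out_cap, uint64_t key,
                      const uint8_t *pdu, size_t pdu_len)
{
    size_t total;
    size_t pos = 0;

    if (pdu_len > 0xFFFFu || (key >> 48) != 0)
        return 0;
    total = framing_pdu_size(pdu_len);
    if (out_cap < total)
        return 0;

    /* Length field: 16 bits. */
    out[pos++] = (uint8_t)(pdu_len >> 8);
    out[pos++] = (uint8_t)(pdu_len & 0xFFu);

    /* KEY field: 48 bits. */
    for (int shift = 40; shift >= 0; shift -= 8)
        out[pos++] = (uint8_t)(key >> shift);

    /* ULP PDU, then zero pad up to a multiple of four. */
    if (pdu_len > 0)
        memcpy(out + pos, pdu, pdu_len);
    pos += pdu_len;
    while (pos < total)
        out[pos++] = 0;

    assert(pos == total);
    return total;
}
```

A caller would size its buffer with framing_pdu_size() and pass the
same KEY for every framing PDU sent in one direction of a connection.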
Two types of TCP implementations are supported, framing-aware and
non-framing-aware. The requirements for each are as follows.

3.1. Non Framing Aware TCP implementations

Conventional TCP implementations without special support for framing
are considered "non-framing-aware". In this case the KEY field of
the framing header must be set to zero. There are no requirements
beyond the standard TCP requirements.

3.2. Framing Aware TCP implementation

Framing-aware TCP implementations must notify the framing protocol
of changes in the path maximum segment size (PMSS). The framing
protocol must be able to retrieve the PMSS from the framing-aware
TCP.

Because of changes in the PMSS, there may be cases when a fully
framing-aware ULP will fail to create PDUs which fit in a TCP
segment. This can occur, for example, when retransmitting framing
segments after a path MSS change.

The use of oversize TCP segments sent by means of IP fragmentation
is discouraged due to the limited ID number size of IP and the
potential for undetected error due to ID number wrap. Framing-aware
TCP implementations should resegment at the TCP layer when necessary
to meet the requirements of the path MSS.

If a framing PDU must be split across multiple TCP segments, then
the sending TCP implementation MUST ensure that each TCP segment
containing a piece of the split framing PDU has a length which is
NOT a multiple of four. See Appendix A for an algorithm the sender
can use to ensure this property.

4. TCP Receiver Framing Recovery

Because each framing PDU contains sufficient information to
determine its length, the beginning of the next framing PDU can be
determined. Therefore each successive ULP PDU can be recovered.

Conventional TCP implementations will pass received data to the ULP
in order, so framing is easily recovered by the ULP.
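The in-order case can be sketched in C as follows. This is a minimal
illustration under stated assumptions, not a definitive
implementation: framing_next is a hypothetical helper name, the
Length field is assumed to be in network byte order, and the caller
is assumed to hold the in-order bytes in one contiguous buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: extract the framing PDU that starts at *pos in
 * an in-order byte stream buf of len bytes.
 * Returns 1 and advances *pos past the framing PDU (header, ULP PDU and
 * pad) when a complete framing PDU is available; *pdu and *pdu_len then
 * describe the ULP PDU. Returns 0, leaving *pos unchanged, when more
 * bytes are needed.
 * Assumption: the 16-bit Length is most significant byte first.
 */
int framing_next(const uint8_t *buf, size_t len, size_t *pos,
                 const uint8_t **pdu, size_t *pdu_len)
{
    size_t avail, l, frame_len;

    assert(*pos <= len);
    avail = len - *pos;
    if (avail < 8)
        return 0;                       /* header not complete yet */

    l = ((size_t)buf[*pos] << 8) | buf[*pos + 1];
    frame_len = 8 + ((l + 3) & ~(size_t)3);   /* 8 + round(L) */
    if (avail < frame_len)
        return 0;                       /* body or pad not complete yet */

    *pdu = buf + *pos + 8;              /* skip Length and KEY */
    *pdu_len = l;
    *pos += frame_len;
    return 1;
}
```

Calling framing_next() in a loop until it returns 0 yields each ULP
PDU in turn; any bytes left after *pos belong to a framing PDU that
has not fully arrived.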
Special framing-aware TCP receive implementations may allow the ULP
to do immediate data placement on TCP segments received out of
order. The receiving end can safely assume that a framing header is
aligned with the beginning of the TCP segment's payload if the
following conditions are met.

1. Standard TCP processing indicates that this is a valid,
   in-window, non-duplicate segment that does not overlap with a
   previously received segment.

2. The remote sending TCP implementation is framing-aware, as
   evidenced by a non-zero KEY on all previous framing PDUs.

3. The received TCP segment length is a multiple of four.

4. No evidence of a resegmenting middle-box has been observed on
   this connection. Evidence of a resegmenting middle-box would be a
   previously received TCP segment whose length is a multiple of
   four and which contained a piece of a split framing PDU.

5. The data contained in the TCP segment parses correctly when
   interpreted as one or more framing PDUs. In particular, all the
   KEYs are correct, and the lengths add up to the length of the
   containing TCP segment.

5. Validity of the Alignment Algorithm

The objective of the transmit and receive algorithms is to ensure
that the receiver, when processing an out-of-order TCP segment,
never assumes alignment of the framing header with the TCP segment
when in fact alignment is not the case. In the absence of a
middle-box which resegments the TCP stream, this should never occur.
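The per-segment part of the receiver test in Section 4 (conditions 3
and 5) could be sketched in C as below. This is only an illustrative
sketch: segment_parses_as_frames is a hypothetical name, network byte
order for the Length field is an assumption, and conditions 1, 2 and
4 depend on TCP connection state that the caller is assumed to have
checked already. Treating an empty payload as "not aligned" is also
a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the payload of an out-of-order TCP
 * segment can be interpreted as one or more whole framing PDUs, else 0.
 * expected_key is the 6-byte KEY seen on earlier framing PDUs from this
 * sender; an all-zero KEY means the sender is not framing-aware.
 * Assumption: the 16-bit Length is most significant byte first.
 */
int segment_parses_as_frames(const uint8_t *payload, size_t len,
                             const uint8_t expected_key[6])
{
    static const uint8_t zero_key[6] = {0, 0, 0, 0, 0, 0};
    size_t pos = 0;

    /* Condition 3: segment length is a multiple of four. */
    if (len == 0 || len % 4 != 0)
        return 0;

    /* A zero KEY marks a conventional sender: no out-of-order recovery. */
    if (memcmp(expected_key, zero_key, 6) == 0)
        return 0;

    /* Condition 5: every KEY matches and the lengths add up exactly. */
    while (pos < len) {
        size_t l, frame_len;

        if (len - pos < 8)
            return 0;                   /* truncated header */
        if (memcmp(payload + pos + 2, expected_key, 6) != 0)
            return 0;                   /* wrong KEY */

        l = ((size_t)payload[pos] << 8) | payload[pos + 1];
        frame_len = 8 + ((l + 3) & ~(size_t)3);   /* 8 + round(L) */
        if (frame_len > len - pos)
            return 0;                   /* framing PDU overruns segment */
        pos += frame_len;
    }

    assert(pos == len);
    return 1;
}
```

A return value of 1 only says that the segment looks aligned; as the
surrounding text explains, a resegmenting middle-box can in rare
cases still produce a payload that passes this check.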
In the presence of such a middle-box, every effort is made to avoid
making an invalid alignment assumption. However, in the extremely
rare case that the middle-box maintained perfect alignment until the
critical moment when an out-of-order TCP segment is received at the
destination, avoiding erroneous processing of the data depends on
the sufficiently low probability that the data stream contains, at
the non-aligned point, one or more apparently valid framing headers
whose lengths sum to the TCP segment length AND whose KEYs are
valid.

6. Security considerations

When TLS is in use, the framing protocol is best layered under TLS,
since TLS is a packet-based protocol. However, since the framing
protocol works over unmodified TCPs, it will work over connections
secured with TLS.

7. References

[ALF] D. D. Clark and D. L. Tennenhouse, "Architectural
considerations for a new generation of protocols," in SIGCOMM
Symposium on Communications Architectures and Protocols,
(Philadelphia, Pennsylvania), pp. 200--208, IEEE, Sept. 1990.
Computer Communications Review, Vol. 20(4), Sept. 1990.

[SOCKS] Leech, M., and others, "SOCKS Protocol Version 5," RFC 1928,
April 1996

[RFC1122] Braden, R., ed., "Requirements for Internet Hosts --
Communication Layers", RFC 1122, October 1989

[PathMTU] Mogul, J., and Deering, S., "Path MTU Discovery", RFC 1191

[RFC2581] Allman, M. and others, "TCP Congestion Control," RFC 2581,
April 1999

[Stevens] Stevens, W. Richard, "Unix Network Programming Volume 1,"
Prentice Hall, 1998, ISBN 0-13-490012-X

[TCP] Postel, J., "Transmission Control Protocol - DARPA Internet
Program Protocol Specification", RFC 793, September 1981

[TLS] Dierks, T.
and others, "The TLS Protocol, Version 1.0", RFC 2246

Appendix A: Segmentation algorithm for Framing-aware TCP

    SeqNextByte       = next byte to send
    SeqStartFrame     = start of the current framing protocol frame
                        // (<= SeqNextByte)
    SeqStartNextFrame = start of the next framing protocol frame
                        // (> SeqNextByte)
    PathMss           = maximum segment size with options subtracted

    // How many bytes do we have to send in the current frame
    FrameBytesLeft = SeqStartNextFrame - SeqNextByte

    if (FrameBytesLeft <= PathMss) {
        if (SeqNextByte == SeqStartFrame || FrameBytesLeft % 4) {
            // Pack as many complete framing protocol frames into
            // the segment as possible
            SegmentBytesLeft = PathMss
            do {
                copy frame into segment
                SegmentBytesLeft -= FrameBytesLeft
                update SeqStartFrame and SeqStartNextFrame to point
                    to next frame
                FrameBytesLeft = SeqStartNextFrame - SeqStartFrame
            } while (FrameBytesLeft < SegmentBytesLeft);
        } else {
            // This case happens when the remote TCP acknowledges
            // up to an even byte boundary
            send segment with FrameBytesLeft - 1 bytes
        }
    } else {
        // FrameBytesLeft > PathMss: send a piece of the frame whose
        // length is not a multiple of four, and leave a remainder
        // whose length is not a multiple of four either
        if ((PathMss % 4) && (FrameBytesLeft - PathMss) % 4) {
            send segment with the next PathMss bytes from current
                frame
        } else if (((PathMss - 1) % 4) &&
                   (FrameBytesLeft - (PathMss - 1)) % 4) {
            send segment with PathMss - 1 bytes
        } else if (((PathMss - 2) % 4) &&
                   (FrameBytesLeft - (PathMss - 2)) % 4) {
            send segment with PathMss - 2 bytes
        }
    }

Appendix B: Sockets support for framing-aware TCP senders

B.1.1. Creating a TCP socket with segmentation support

    /* TCP_FRAMING_AWARE is the socket option proposed by this draft */
    int flag = 1;
    s = socket (PF_INET, SOCK_STREAM, getprotobyname("tcp")->p_proto);
    setsockopt (s, SOL_TCP, TCP_FRAMING_AWARE, &flag, sizeof(flag));

A TCP that does not support segmentation MUST fail the setsockopt
call. The setsockopt call is not permitted once the TCP connection
is open.

B.1.2. Sending data atomically on that socket

The send or sendmsg calls should be used to write data to the TCP
stream. The EMSGSIZE error should be returned if the buffer passed
to send or sendmsg is too large to fit in a single TCP segment.

When the path MSS increases, the TCP MAY return EMSGSIZE once to
inform the client of the change.

B.1.3. Retrieving the max segment size

    /* TCP_SEND_MSS is the socket option proposed by this draft */
    getsockopt (s, SOL_TCP, TCP_SEND_MSS, &mbs, &mbslen);

This call returns the maximum segment size that can be sent without
fragmentation. The number should not count any bytes that go towards
TCP options.

Authors' Addresses

Jim Williams
Giganet, Inc.
Concord Office Center
580 Main Street
Bolton, MA 01740
US
Phone: +1 978 779 7224
EMail: jimw@giganet.com

Jim Pinkerton
Microsoft, Inc.
1 Microsoft Way
Redmond, WA 98052
USA
EMail: jpink@microsoft.com

Constantine Sapuntzakis
Cisco Systems
170 W Tasman Drive
San Jose, CA 95134
USA
Phone: +1 408 525 5497
EMail: csapuntz@cisco.com

Jim Wendt
Hewlett Packard Corporation
8000 Foothills Boulevard
MS 5668
Roseville, CA 95747-5668
USA
Phone: +1 916 785 5198
EMail: jim_wendt@hp.com

Full Copyright Statement

Copyright (C) The Internet Society (2001). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph
are included on all such copies and derivative works.
However, this document itself may not be modified in any way, such
as by removing the copyright notice or references to the Internet
Society or other Internet organizations, except as needed for the
purpose of developing Internet standards in which case the
procedures for copyrights defined in the Internet Standards process
must be followed, or as required to translate it into languages
other than English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.