Network Working Group                                          M. Allman
Internet-Draft                                                      ICSI
Expires: November 2006                                          May 2006

                        TCPx2: Don't Fence Me In
                     draft-allman-tcpx2-hack-00.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Copyright Notice

Copyright (C) The Internet Society (2006).

Abstract

In this document we aim to solve several problems caused by TCP's lack of header space for certain values by increasing the size of the header without changing the semantics of the protocol.

1. Introduction

"Oh, give me land, lots of land under starry skies above"
-- Bing Crosby

TCP [RFC793] has proven itself to be quite robust and useful across a wide range of applications. However, as the Internet has evolved, the range of values that can be held in various TCP header fields has become increasingly anemic. A number of suggestions and specifications have been put forth to mitigate particular issues. In this document, we do not address any one particular issue, but rather attempt to provide more head room in all of TCP's header fields.

The most prominent example of the size of a TCP header value being a limitation is the case of the advertised window. [RFC793] provides 16 bits for the advertised window, which provides for a window of up to 64KB. As detailed in [RFC1323], this limit has proved too small and causes severely limited performance as the capacity of networks has increased.

[RFC1323] provides the "Window Scale" (WS) option to address the standard limitation. The WS option is a shift value that is negotiated during TCP's three-way handshake. On incoming segments the advertised window value in the packet is left shifted by the negotiated number of bits, while on outgoing packets the advertised window is right shifted by the negotiated number of bits such that it fits in the allotted 16 bits. A recent study shows that roughly 27% of observed web connections use window scaling [MAF05]. Anecdotes indicate that all major operating system TCP implementations support window scaling (but clearly many do not use the WS option by default).

Increasing the size of the advertised window, in turn, means that TCP can chew through the sequence space more rapidly. Therefore, the timestamp (TS) option and the PAWS algorithm were also specified in [RFC1323] to protect the TCP sender against wrapping the sequence space too quickly and not being able to discern this case from the case of old segments arriving.

Another case where the standard TCP header is running out of room is in the "reserved bits" area. Originally, TCP had 6 reserved bits adjacent to the "flags" field. Explicit Congestion Notification (ECN) [RFC3168] and the ECN Nonce [RFC3540] consume 3 of these bits, leaving 3 unallocated bits. Whether this is a problem is debatable. On the one hand, only half the original reserved bits have been consumed in 25 years of TCP use and so one possible viewpoint is that having only 3 bits remaining is not a problem.
On the other hand, it may be beneficial to be a little less conservative with bit allocations (e.g., on a "retransmit" bit as outlined in [LK00]) and therefore more bits would be useful.

As has been discussed on the IETF discussion list [IETF05], some larger servers may be running out of port numbers due to the way port numbers are used by applications. [She04] offers one possible mechanism for starting a connection on a well-known port and then migrating it to an ephemeral port, which could be a useful technique in more fully utilizing the available port space. Others have (privately, as far as we know) suggested putting larger port numbers or connection identifiers in TCP's option space. In addition, [Tou06b] specifies a "portnames" option for TCP whereby a string in the option space of a TCP SYN is used for demultiplexing, rather than depending on a well-known port number. One of the benefits of this mechanism is to allow for the port space to be more fully utilized.

Finally, two proposals have noted that TCP's option space may be dwindling, especially with regard to further evolution. The header length encoded in TCP's standard header is a 4-bit field. The value carried in this field is the number of 32-bit words in TCP's header (standard + options). Therefore, the maximum size of a TCP header is 60 bytes (20 bytes of standard header and a maximum of 40 bytes of options). [Edd05] suggests negotiating the use of the first option as a larger replacement for the standard header length (with the standard field then being abandoned). [Koh04] proposes using the unused code-points in the current header length field (because the minimum size of the TCP header is 20 bytes) to indicate larger headers (e.g., a length of "2" indicates a header size of 148 bytes, 20 bytes of standard header and 128 bytes of option space).
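
The arithmetic behind the 60-byte limit can be checked with a short sketch. The constant and function names are illustrative assumptions and do not come from any specification; only the 4-bit field width, the 32-bit word unit, and the 20-byte standard header are taken from the text above.

```python
# Illustrative sketch; names are assumptions, not part of any TCP specification.
HLEN_BITS = 4                # width of TCP's header length field
WORD_BYTES = 4               # the field counts 32-bit words
STANDARD_HEADER_BYTES = 20   # fixed part of the [RFC793] header


def max_header_bytes(hlen_bits=HLEN_BITS):
    """Largest header size expressible in a length field of hlen_bits bits."""
    return (2 ** hlen_bits - 1) * WORD_BYTES


def max_option_bytes(hlen_bits=HLEN_BITS):
    """Bytes left for options once the standard header is accounted for."""
    return max_header_bytes(hlen_bits) - STANDARD_HEADER_BYTES


def unused_length_values():
    """Length values that describe a header shorter than the 20-byte minimum."""
    return list(range(STANDARD_HEADER_BYTES // WORD_BYTES))


print(max_header_bytes())       # 60
print(max_option_bytes())       # 40
print(unused_length_values())   # [0, 1, 2, 3, 4]
```

The five unused values are the code-points that a scheme such as the one in [Koh04] could reuse to signal a larger header.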

The proposal in this document is to allocate a new IP protocol number for a new version of TCP that is essentially the same as the current version [RFC793] except that the size of each standard field in the TCP header is doubled. While the size of the header is changed, TCP's semantics are not changed. The expectation is that current TCP implementations that process current 20-byte TCP headers will be able to process new 40-byte headers with only minimal changes to deal with the new size of each value. The protocol logic, however, will remain identical.

This document is not about specifying a "TCPng". The functionality (or lack thereof) of TCP is not changed. The intention is simply to give TCP some breathing room. The proposal is for pragmatic evolution, rather than principled engineering.

Note: This document is clearly nowhere near as fully fleshed out as it would need to be to be published as an RFC. Rather, this document is meant to capture the big-picture idea to seed discussion.

2. Specification

A new IP protocol number, TCP-NEW, will indicate that TCP with a larger header will be used. TCP stacks can then encode and decode the headers per this document or [RFC793] accordingly. In the remainder of this document we will use "TCP" to refer to the [RFC793] standardized version of TCP and "TCPx2" to refer to the proposed new protocol.
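
As a rough sketch of this demultiplexing step, a receiver could pick the fixed header size from the IP protocol number. The value used for TCP-NEW below is a placeholder chosen for illustration, since no number has been allocated, and the function name is hypothetical. Protocol number 6 is the existing assignment for TCP.

```python
IPPROTO_TCP = 6          # existing IANA assignment for TCP
IPPROTO_TCP_NEW = 253    # PLACEHOLDER (experimental value); TCP-NEW is unallocated

# Size in bytes of the fixed, option-free header for each protocol.
FIXED_HEADER_BYTES = {
    IPPROTO_TCP: 20,       # [RFC793] header
    IPPROTO_TCP_NEW: 40,   # TCPx2: every standard field doubled
}


def fixed_header_bytes(ip_protocol):
    """Return the fixed header size to parse, or raise for other protocols."""
    try:
        return FIXED_HEADER_BYTES[ip_protocol]
    except KeyError:
        raise ValueError(f"not TCP or TCPx2: protocol {ip_protocol}") from None


print(fixed_header_bytes(IPPROTO_TCP))       # 20
print(fixed_header_bytes(IPPROTO_TCP_NEW))   # 40
```

Everything after this dispatch would follow the same protocol logic for both versions, with only the field widths differing.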

The TCPx2 header is:

 0               1               2               3
 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 Source Port Number (32 bits)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Destination Port Number (32 bits)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Sequence Number (64 bits)                   |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                Acknowledgment Number (64 bits)                |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| HLen (8 bits) |     Reserved1 (15 bits)     |N C E|U A P R S F|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Window Size (32 bits)                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Reserved2 (16 bits)      |      Checksum (16 bits)       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Urgent Pointer (32 bits)                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The fields all have the same meaning as defined in [RFC793], [RFC3168] (for the "C" and "E" ECN flags) and [RFC3540] (for the "N" ECN nonce bit). "Reserved1" is the larger version of TCP's traditional reserved bits field. In addition, we have added one field to the header, denoted "Reserved2" above. The standard 16-bit Internet checksum is still used with TCPx2 and therefore the extra space derived from doubling the TCP header is reserved for future use. Reserved2 could possibly be used for a second 16-bit checksum or a new 32-bit checksum that encompasses both the Checksum and the Reserved2 fields.

To Do: TCP's SACK option [RFC2018] will need to be extended such that it holds larger sequence numbers.

To Think About: Whether there are other options that need to be altered to take into account the larger header fields.

3. Advantages

The main advantage of the proposal outlined above is that it is a straightforward change that addresses a myriad of issues that have confronted or may confront TCP. This change does not represent a high-minded engineering design, but rather a pragmatic evolution of a widely used and entrenched protocol.

The proposal in this document mitigates all the issues discussed in Section 1.

In addition, a simple doubling of the size of the TCP header will mitigate TCP's susceptibility to blind attacks [Tou06a] by making it more difficult for an attacker to guess the valid port and sequence numbers needed to insert valid control or data packets into the stream. Note that this is a mitigation at best in that doubling the field sizes simply reduces the probability that any given blind attack packet will be considered valid and increases the amount of brute force needed to ensure a blind attack packet will succeed. The proposal in this document neither aids nor hurts the security of TCP if an attacker can observe a connection.

All fields in the header can revert to being self-describing. That is, previous knowledge is not required to interpret a given field's value. This is in contrast to extensions like the Window Scale option whereby the negotiated scale factor is required to interpret every packet in the connection.

Finally, this proposal keeps all of TCP's standard information in a standard place in the header, in contrast to extending fields using options. This allows a TCP implementation (or middlebox, intrusion detection system (IDS), firewall, etc.) to quickly parse packets for needed information without digging through options to find key information. Examples include a firewall attempting to match a port number or an IDS attempting to reassemble all packets from a given stream on the fly.

4. Drawbacks

Naturally, this proposal has drawbacks.
First, TCP implementations will have to be changed to understand TCPx2. While we expect that this is straightforward, it will clearly take effort.

Second, for many common cases, TCP's current header is sufficient and using TCPx2 would simply use more overhead for the same work. That said, the overhead of a 1500 byte IPv4/TCP packet with no TCP options is 2.7% (40 bytes of IP and TCP headers), while the overhead of an IPv4/TCPx2 packet is 4% (60 bytes of headers), hardly a dramatic increase.

Finally, middleboxes such as firewalls would have to be updated to understand TCPx2 and to implement the desired local policy before the protocol would be viable for use.

5. Transition

Since hosts are simply augmenting current TCP implementations with TCPx2, the transition procedure is fairly straightforward. A TCP that wishes to use TCPx2 simply sends the SYN in an IP packet with a TCPx2 header. If that segment is not acknowledged with a SYN+ACK within a specified time period, then the SYN is retransmitted using a standard TCP header (adjusting aspects of the SYN as necessary, e.g., reducing the initial sequence number from 64 bits to 32 bits).

The timeout does not have to be based on the default RTO (3 seconds, from [RFC2988]). As long as the host transmits a TCPx2 SYN and then retransmits a TCP SYN, the retransmission is not taken as an indication of congestion. The interval between the original and this single retransmission is therefore not of large concern from a congestion control standpoint. Further, the retransmission is (at least in the near term) more likely to be caused by a firewall, middlebox or host that does not understand TCPx2 than by actual network congestion. Note that subsequent retransmissions would be done using TCP with the default RTO and backoff as specified in [RFC2988].
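
The fallback described so far might be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the `send_syn` callback, the 1-second fallback timeout, the attempt limit, and the choice to keep the low 32 bits of the initial sequence number are all hypothetical, since the text does not specify them. The 3-second initial RTO and the doubling backoff follow [RFC2988].

```python
DEFAULT_RTO = 3.0   # seconds, initial RTO from [RFC2988]


def reduce_isn(isn64):
    # ASSUMPTION: keep the low 32 bits; the draft does not say how to reduce.
    return isn64 & 0xFFFFFFFF


def open_connection(send_syn, isn64, fallback_timeout=1.0, max_tcp_attempts=3):
    """Try one TCPx2 SYN, then fall back to TCP SYNs with RTO backoff.

    send_syn(protocol, isn, timeout) is a hypothetical callback that
    transmits a SYN and returns True if a SYN+ACK arrives within timeout.
    Returns the protocol that succeeded, or None if every attempt failed.
    """
    # A single TCPx2 attempt with a short, illustrative timeout.
    if send_syn("TCPx2", isn64, fallback_timeout):
        return "TCPx2"
    # Subsequent attempts use standard TCP with the default RTO and backoff.
    rto = DEFAULT_RTO
    for _ in range(max_tcp_attempts):
        if send_syn("TCP", reduce_isn(isn64), rto):
            return "TCP"
        rto *= 2   # exponential backoff
    return None


# Usage: a simulated peer that only answers standard TCP SYNs.
attempts = []


def tcp_only_peer(protocol, isn, timeout):
    attempts.append((protocol, isn, timeout))
    return protocol == "TCP"


result = open_connection(tcp_only_peer, isn64=0x100000005)
print(result)   # TCP
```

In the simulated run, the TCPx2 SYN goes unanswered and the first TCP SYN succeeds, so only the single fallback retransmission is needed.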

Finally, we note that while waiting for 3 seconds after a TCPx2 SYN before retransmitting is likely unreasonable, implementers and operators should not make the interval so short that the TCPx2 SYN and SYN+ACK are not given a chance to propagate across the network before the retransmission.

ICMP messages indicating TCPx2 is not supported can also be used to trigger an immediate retransmit of the SYN with TCP and without any congestion control action.

6. Discussion

In discussions about this idea, some have suggested that rather than blindly doubling each header field, we carefully construct a new header format based on the needs of each header field. Of particular note is that the doubling of the Window Size field (to 32 bits) is not an actual doubling of the usable window given the wide-scale deployment of [RFC1323], which provides for 30-bit window sizes.

The notion presented in this document is admittedly a sleazy hack, as opposed to proper engineering. However, it is *one* sleazy hack that copes with a number of disparate issues that will otherwise require multiple techniques to mitigate. Therefore, it may be the pragmatic path to evolving a heavily used protocol.

7. Security Considerations

TCPx2 does not introduce any new security concerns to TCP since the protocol semantics remain unchanged. As discussed in Section 3, TCPx2 can provide for a mild increase in the robustness to blind attacks.

TCPx2 may allow traffic to circumvent firewalls that pass unknown protocols but allow only specific uses of TCP. Arguably, undefined protocols should not be passed and therefore this is not viewed as a large concern.

8. IANA Considerations

The proposal in this document calls for IANA to allocate a new IP protocol number for TCPx2.

Acknowledgments

This document was fueled by fumes from Sherwin-Williams.
The author benefited from useful discussions with Sally Floyd, Janardhan Iyengar, Mike O'Dell, Shawn Ostermann, Vern Paxson and Scott Shenker.

Normative References

[RFC793] J. Postel, Transmission Control Protocol, RFC 793, September 1981.

[RFC2018] M. Mathis, J. Mahdavi, S. Floyd, A. Romanow. TCP Selective Acknowledgment Options, RFC 2018, October 1996.

[RFC3168] K.K. Ramakrishnan, S. Floyd, and D. Black. The Addition of Explicit Congestion Notification (ECN) to IP, RFC 3168, September 2001.

[RFC3540] N. Spring, D. Wetherall, and D. Ely. Robust Explicit Congestion Notification (ECN) Signaling with Nonces, RFC 3540, June 2003.

Informative References

[Edd05] W. Eddy. Extending the Space Available for TCP Options, May 2005. Internet-Draft draft-eddy-tcp-loo-03.txt (expired, cited for acknowledgment purposes).

[IETF05] Email thread "Port numbers and IPv6" on IETF discussion list, July 2005.

[Koh04] E. Kohler. Extended Option Space for TCP, September 2004. Internet-Draft draft-kohler-tcpm-extopt-00.txt (expired, cited for acknowledgment purposes).

[LK00] R. Ludwig, R. H. Katz. The Eifel Algorithm: Making TCP Robust Against Spurious Retransmissions. ACM Computer Communication Review, 30(1), January 2000.

[MAF05] A. Medina, M. Allman, S. Floyd. Measuring the Evolution of Transport Protocols in the Internet. ACM Computer Communication Review, 35(2), April 2005.

[RFC1323] V. Jacobson, R. Braden, D. Borman. TCP Extensions for High Performance, RFC 1323, May 1992.

[RFC2988] V. Paxson, M. Allman. Computing TCP's Retransmission Timer, RFC 2988, November 2000.

[She04] T. Shepard. Reassign Port Number Option for TCP, July 2004. Internet-Draft draft-shepard-tcp-reassign-port-number-00.txt (expired, cited for acknowledgment purposes).

[Tou06a] J. Touch. Defending TCP Against Spoofing Attacks, February 2006. Internet-Draft draft-ietf-tcpm-tcp-antispoof-03.txt (work in progress).

[Tou06b] J. Touch.
A TCP Option for Port Names, April 2006. Internet-Draft draft-touch-tcp-portnames-00.txt (work in progress).

Authors' Addresses

Mark Allman
ICIR / ICSI
1947 Center Street
Suite 600
Berkeley, CA 94704-1198
Phone: +1 440 235 1792
EMail: mallman@icir.org
http://www.icir.org/mallman/

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Disclaimer of Validity

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

Copyright (C) The Internet Society (2006). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

Acknowledgment

Funding for the RFC Editor function is currently provided by the Internet Society.