Network Working Group                                          L. Eggert
Internet-Draft                                                S. Schuetz
Expires: January 10, 2005                                      S. Schmid
                                                                     NEC
                                                           July 12, 2004


              TCP Extensions for Immediate Retransmissions
                draft-eggert-tcpm-tcp-retransmit-now-00

Status of this Memo

   By submitting this Internet-Draft, I certify that any applicable
   patent or other IPR claims of which I am aware have been disclosed,
   and any of which I become aware will be disclosed, in accordance with
   RFC 3668.

   This document may not be modified, and derivative works of it may not
   be created, except to publish it as an RFC and to translate it into
   languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 10, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2004). All Rights Reserved.

Abstract

   This document describes a modification to TCP's standard
   retransmission scheme that improves performance across intermittently
   connected paths. In addition to the regular retransmission attempts
   scheduled at exponentially increasing intervals, this extension
   causes additional, speculative retransmission attempts upon receiving
   external triggers. One example of such a trigger is "first hop
   router reachable."
   This document does not define the specifics of such triggers,
   although it describes some examples. Instead, it defines how a
   conforming TCP implementation operates when it receives a trigger.

1. Introduction

   Depending on the specific path between two nodes in the Internet,
   disruptions in connectivity may be frequent. Host mobility and other
   factors can further increase the likelihood of connectivity
   disruptions. When hosts communicate with the Transmission Control
   Protocol (TCP) [1], their connections may abort during periods of
   disconnection.

   The main reason for connection aborts during periods of disconnection
   is TCP's "user timeout." It defines the maximum amount of time that
   transmitted segments may remain unacknowledged. If a disconnection
   lasts longer than the user timeout, the TCP connection will abort.
   Many TCP implementations default to user timeout values of a few
   minutes [6]. The proposed TCP Abort Timeout Option [7] allows
   conforming TCP implementations to use longer user timeout values and
   consequently tolerate long disconnections without disruption.

   Although the TCP Abort Timeout Option enables TCP connections to
   survive extended periods of disconnection, experiments have shown
   that TCP connections perform significantly worse when operating along
   paths with frequent disconnections [8][9]. This decrease in
   performance is caused by TCP's retransmission behavior after
   connectivity is restored.

   This document describes a modification of TCP's retransmission scheme
   to improve performance over a path with frequent disconnections. The
   basic idea is to trigger a speculative retransmission attempt when a
   TCP implementation receives an indication that connectivity to a
   previously disconnected peer node may have been restored.

   Section 3 discusses TCP performance over intermittently connected
   paths in more detail, Section 4 gives examples of reconnection
   triggers, and Section 5 describes the proposed "immediate
   retransmission" extension to TCP. Section 6 compares the extension
   to similar proposals [10][11][12].
   Section 7 investigates security aspects of the proposed modification
   and Section 8 summarizes and concludes this document.

2. Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [2].

3. Background

   When a disconnection occurs along the path between a host and its
   peer while the host is transmitting data, it stops receiving
   acknowledgments. After the retransmission timeout (RTO) expires, the
   host attempts to retransmit the first unacknowledged segment. TCP
   implementations that follow the recommended RTO management proposed
   in RFC 2988 [3] double the RTO after each retransmission attempt
   until it exceeds 60 seconds. This scheme causes a host to attempt to
   retransmit across established connections roughly once a minute.
   (Attempts are more frequent during the first minute or two of the
   disconnection, while the RTO is still being backed off.)

   When the disconnection ends, standard TCP implementations still wait
   until the RTO expires before attempting retransmission. Figure 1
   illustrates this behavior. Depending on when connectivity becomes
   available again, this wait can waste up to a minute of connection
   time for TCPs that implement the recommended RTO management described
   in RFC 2988 [3]. For TCP implementations that do not implement RFC
   2988, even more connection time may be lost. For example, Linux uses
   120 seconds as the maximum RTO.
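
   The waiting time described above can be made concrete with a small
   calculation. The following Python sketch is illustrative only and is
   not part of this specification. It assumes an initial RTO of 1 second
   and models the backoff as doubling with a ceiling (60 seconds by
   default). The function names and the 1-second starting value are
   assumptions made for this example.

```python
def retransmit_times(horizon, initial_rto=1.0, max_rto=60.0):
    """Times (seconds after the first unacknowledged transmission) at which
    retransmissions are attempted, up to and including `horizon`.

    Assumption: the RTO doubles after every attempt and is capped at max_rto.
    """
    times = []
    rto = initial_rto
    t = rto                          # first retransmission fires after one RTO
    while t <= horizon:
        times.append(t)
        rto = min(2 * rto, max_rto)  # exponential backoff with a ceiling
        t += rto
    return times


def wasted_time(reconnect_at, initial_rto=1.0, max_rto=60.0):
    """Seconds between connectivity returning and the next scheduled attempt."""
    rto = initial_rto
    t = rto
    while t < reconnect_at:          # skip attempts made while disconnected
        rto = min(2 * rto, max_rto)
        t += rto
    return t - reconnect_at


if __name__ == "__main__":
    print(retransmit_times(130.0))            # schedule during a 130 s outage
    print(wasted_time(128.0))                 # 60 s ceiling
    print(wasted_time(128.0, max_rto=120.0))  # 120 s ceiling (Linux example)
```

   Under these assumptions, a host whose connectivity returns 128
   seconds into the outage waits another 55 seconds with a 60-second
   ceiling, and another 119 seconds with a 120-second ceiling.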

   sequence
    number            X = successfully transmitted segment
      ^               O = lost segment
      |     :                     :              :  X
      |     :                     :              :X
      |     XO  O   O    O     O  :              X
      |    X:                     :              :
      |   X :                     :<------------>:
      |  X  :                     :    wasted    :
      | X   :                     :  connection  :
      |X    :                     :     time     :
      +-----:---------------------:--------------:-------->
            :                     :              :    time
      connectivity          connectivity        TCP
          gone                  back         retransmit

   Figure 1: Standard TCP behavior in the presence of a disconnection

   This retransmission behavior is not efficient, especially in
   scenarios where connected periods are short and disconnections
   frequent [13]. Experiments show that TCP performance across a path
   with frequent disruptions is significantly worse compared to a
   similar path without disruptions [8][9].

   In the ideal case, TCP would attempt a retransmission as soon as
   connectivity to its peer was re-established. Figure 2 illustrates
   the ideal behavior.

   sequence
    number            X = successfully transmitted segment
      ^               O = lost segment
      |     :                     :  X           :
      |     :                     :X             :
      |     XO  O   O    O     O  X              :
      |    X:                     :              :
      |   X :                     :<------------>:
      |  X  :                     :  efficiency  :
      | X   :                     : improvement  :
      |X    :                     :              :
      +-----:---------------------:--------------:-------->
            :                     :              :    time
      connectivity          connectivity        next
          gone            back = immediate    scheduled
                           TCP retransmit     retransmit

   Figure 2: Ideal TCP behavior in the presence of a disconnection

   The ideal behavior is difficult to achieve for arbitrary connectivity
   disruptions. One obviously problematic approach would use
   higher-frequency retransmission attempts to enable earlier detection
   of whether connectivity was restored. This can generate significant
   amounts of extra traffic. Other proposals attempt to trigger faster
   retransmissions by retransmitting buffered or newly-crafted segments
   from inside the network [10][11][12]. Section 6 compares these
   approaches to the "immediate retransmission" extension.

4. Examples of Reconnection Triggers

   This section describes examples for reconnection triggers, which the
   retransmission mechanism described in the next section acts upon.
   This document does not define the specifics of such triggers but
   merely discusses them to illustrate the operation of the "immediate
   retransmission" extension.

   Reconnection triggers signal TCP when connectivity to a previously
   disconnected peer may have been restored. They depend on the
   specifics of a node and its environment, for example network-layer
   mechanisms such as DHCP [14], MobileIP [15] or HIP [16]. The IETF's
   Detection of Network Attachment (DNA) working group currently
   investigates the specifics of providing such triggers [17].

   One example of a reconnection trigger is "next hop reachable." This
   indicator could occur if a combination of the following conditions is
   true, depending on host specifics:

   o  Network-layer connectivity along the path to the destination is
      restored, e.g., the outbound interface has an IP address and a
      next-hop router is known, maybe due to DHCP [14] or IPv6 router
      advertisements [18].

   o  Link-layer connectivity of the link to the next-hop router along
      the path to the destination is restored (e.g., link-layer "link
      up").

   o  Other local conditions that affect reachability of the destination
      are satisfied (e.g., IKE exchanges [19], MobileIP binding updates
      [15] or HIP readdressing [20] have completed).

   The "next hop reachable" trigger only depends on locally determinable
   information (e.g., state of directly-connected links, etc.) and does
   not require network cooperation. It can signal TCP to restart active
   connections across intermittently connected links where disruptions
   occur on the first or last hop.
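
   As an illustration of how the local conditions listed above might be
   combined, the following Python sketch evaluates a hypothetical "next
   hop reachable" indicator. It is not part of this specification. All
   names are invented for this example, and two choices are assumptions:
   that this particular host requires every condition to hold, and that
   TCP is signaled only on the transition from "not reachable" to
   "reachable."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalState:
    """Locally observable conditions (all field names are hypothetical)."""
    link_up: bool                      # link-layer "link up" toward the next hop
    has_address: bool                  # outbound interface has an IP address
    next_hop_known: bool               # a next-hop router is known
    other_conditions_met: bool = True  # e.g., IKE, MobileIP or HIP exchanges done


def next_hop_reachable(state: LocalState) -> bool:
    # Assumption: this host requires all conditions; other hosts may need fewer.
    return (state.link_up and state.has_address
            and state.next_hop_known and state.other_conditions_met)


def trigger_fires(previous: LocalState, current: LocalState) -> bool:
    # Assumption: signal TCP only on the transition to "reachable", so that
    # a stable link does not generate repeated triggers.
    return not next_hop_reachable(previous) and next_hop_reachable(current)


if __name__ == "__main__":
    down = LocalState(link_up=False, has_address=False, next_hop_known=False)
    up = LocalState(link_up=True, has_address=True, next_hop_known=True)
    print(trigger_fires(down, up))   # transition to reachable
    print(trigger_fires(up, up))     # no change in reachability
```

   Everything the sketch inspects is local state, which matches the
   property that this trigger does not require network cooperation.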
   This simple trigger has the potential to improve TCP performance in
   many cases, because connection disruptions at the first or last hop
   are arguably the most common cause of disconnections in today's
   Internet.

   A second, more general example of a reconnection trigger would be
   "end-to-end connectivity restored." If hosts have the ability to
   detect or be notified of connectivity changes inside the network
   (i.e., not only at the first or last hop), a more general trigger
   could act on those pieces of information. This can improve TCP
   performance across intermittently connected paths where disruptions
   occur at arbitrary links along the path, even inside the network.
   However, providing this more general trigger is problematic due to
   its dependence on remote information and its related issues, such as
   trust.

   Reconnection triggers are generally asymmetric, i.e., they may occur
   on one peer host but not the other. As discussed above, a local
   event at one host may trigger the "immediate retransmission"
   mechanism, while the other host is unable to detect this event across
   the network.

   Symmetric reconnection triggers are a special case and always occur
   concurrently at both communicating hosts. Examples for such
   symmetric triggers are handshake events such as IKE exchanges or HIP
   readdressing. Symmetric triggers are an important special case,
   because the retransmission procedure required in response to a
   symmetric trigger is simpler than that for an asymmetric one. The
   next section will describe this in detail.

5. TCP Immediate Retransmission Extension

   This section describes the main contribution of this document, i.e.,
   a TCP extension for immediate retransmission in response to
   reconnection triggers.

   The basic idea behind the "immediate retransmission" extension is to
   allow TCP to restart stalled connections as soon as it receives an
   indication - a reconnection trigger - that connectivity to previously
   disconnected peers may have been restored.

   This document does not specify how TCP determines which connections
   are affected by a specific reconnection trigger, i.e., for which
   connections it should initiate retransmission attempts. This is a
   property of individual reconnection triggers. For example, the "next
   hop reachable" trigger described in the previous section affects
   connections to all destinations routed through that hop.

   It is important to note that this retransmission extension does not
   modify TCP's basic congestion control, fairness properties or
   slow-start algorithms. The only difference in TCP behavior is the
   timing of retransmission events and, in some cases, a minor, fixed
   increase in the number of initially retransmitted segments. The
   "immediate retransmission" extension increases performance through
   better utilization of connected periods, not through sending traffic
   at a faster rate or modifying TCP's congestion control mechanisms.

   Hosts that implement the "immediate retransmission" TCP extension
   MUST implement the following retransmission mechanism whenever a
   reconnection trigger is received:

      When receiving a symmetric or asymmetric reconnection trigger,
      conforming TCP implementations MUST immediately initiate the
      standard retransmission procedure for connections affected by the
      reconnection trigger - just as if the RTO for those connections
      had expired.

   If the reconnection trigger is symmetric, i.e., all peers receive it
   concurrently, this simple change is sufficient to kick-start the
   relevant TCP connections.

   If the reconnection trigger is asymmetric, this simple extension is
   not always sufficient, because only one peer received the
   reconnection trigger.
   If the host receiving the trigger has no (or too little)
   unacknowledged data awaiting retransmission, it will not emit enough
   segments to cause its peer nodes, which may have unacknowledged data,
   to attempt retransmission themselves. Transmission would thus only
   resume in one direction, which is ineffective for two-way
   communication.

   To avoid this issue, conforming TCP implementations MUST perform a
   different retransmission procedure in response to an asymmetric
   reconnection trigger. TCP MUST send at least four segments that all
   acknowledge the last segment received from a peer for all connections
   affected by the reconnection trigger. These triple-duplicate ACKs
   will activate the peers' fast retransmit algorithm and cause them to
   immediately restart communication in the reverse direction, i.e.,
   before their next scheduled retransmission.

   If a TCP connection affected by a reconnection trigger has four or
   more unacknowledged data segments in the retransmission queue, it
   SHOULD piggyback the triple-duplicate ACK on the regular
   retransmissions of those data segments. In this case, the "immediate
   retransmission" TCP extension does not require additional messages,
   compared to standard TCP.

   For connections where the retransmission queue contains three or
   fewer unacknowledged data segments, TCP implementations supporting
   the "immediate retransmission" TCP extension MUST send additional
   pure ACKs until a complete triple-duplicate ACK has been sent. In
   the worst case, when the retransmission queue is empty, this scheme
   requires four additional ACKs, compared to standard TCP.

   After the peer's fast retransmit algorithm sends the assumed missing
   segment, TCP performs either fast recovery or slow-start [4],
   depending on the length of the disconnection.
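
   The segment accounting for an asymmetric trigger can be sketched as
   follows. This Python fragment is illustrative only and is not part
   of this specification. The function and constant names are
   hypothetical, and the sketch only counts segments; it does not model
   sequence numbers, windows or actual transmission.

```python
SEGMENTS_FOR_TRIPLE_DUP_ACK = 4   # one ACK plus three duplicates


def plan_asymmetric_response(queued_data_segments):
    """Return (piggybacked, pure_acks) for one affected connection.

    piggybacked: retransmitted data segments that carry the ACK
    pure_acks:   additional pure ACKs needed to complete the
                 triple-duplicate ACK
    """
    if queued_data_segments < 0:
        raise ValueError("queue length cannot be negative")
    piggybacked = min(queued_data_segments, SEGMENTS_FOR_TRIPLE_DUP_ACK)
    pure_acks = SEGMENTS_FOR_TRIPLE_DUP_ACK - piggybacked
    return piggybacked, pure_acks


if __name__ == "__main__":
    for queued in (0, 3, 4, 10):
        print(queued, plan_asymmetric_response(queued))
```

   With an empty retransmission queue the sketch yields four pure ACKs,
   the worst case named above; with four or more queued data segments
   it yields none.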
   If the reconnection trigger occurs before the RTO expires, i.e., for
   very short disconnections, TCP has not yet lost its ACK clock and can
   thus perform fast recovery. After longer disconnections, TCP falls
   back to slow-start to restart the ACK clock, just as it does at the
   beginning of a connection.

   The result of this modification is twofold. First, TCP connections
   receiving the reconnection trigger attempt retransmission of their
   unacknowledged segments before the next scheduled RTO. This
   increases utilization of connected periods. Second, TCP connections
   receiving the reconnection trigger use an existing TCP mechanism
   (triple-duplicate ACK) to signal their peer. Although the peer may
   not have received a reconnection trigger itself (e.g., the trigger
   was asymmetric), this causes it to attempt faster retransmission as
   well.

   As mentioned above, the "immediate retransmission" scheme can
   generate up to four additional segments, compared to standard TCP.
   All additional segments are pure ACKs and hence small, resulting in a
   minor total overhead. Furthermore, measurements have shown that
   increasing TCP's initial window is not problematic [21]; this may
   indicate that a minor increase in traffic at retransmission time may
   be tolerable as well.

   (NB: The authors have seen the idea of triggering retransmits based
   on connectivity events of directly-connected links attributed to Phil
   Karn, but were unable to locate a specific reference. Pointers are
   highly appreciated.)

6. Related Work

   Several other approaches try to improve TCP performance in the
   presence of connectivity disruptions [10][11][12]. They attempt to
   improve TCP startup after a disconnection by retransmitting buffered
   or newly-crafted segments from inside the network.
   These proposals can be problematic, because TCP is built on the
   assumption that segments older than the maximum segment lifetime
   (MSL) of 2 minutes [1] will never be received. When a disconnection
   lasts longer than the MSL, these proposals will either become
   ineffective or risk leaking buffered old segments onto new
   connections, violating TCP's semantics.

   The "immediate retransmission" modification also improves performance
   over a path with frequent disconnections. The basic idea is to
   schedule an additional, speculative retransmission attempt when a TCP
   implementation receives an indication that connectivity to a peer
   node has been restored. Unlike the other proposals, the "immediate
   retransmission" scheme uses regular retransmissions, i.e.,
   retransmits data that is buffered at the end systems. Because that
   data has not entered the network yet, it is not subject to the
   problematic MSL rule. Consequently, the "immediate retransmission"
   scheme remains effective even for disconnections longer than the MSL,
   without the risk of compromising connection integrity.

   Other transport-layer approaches such as the Explicit Link Failure
   Notification [22] or TCP-F [23] use specific messages generated by
   intermediate routers to inform TCP senders about disrupted paths.
   The former extends the TCP state machine with a new "stand by" state
   during which the standard retransmission timers are disabled. In
   this state, TCP periodically probes the network to detect
   connectivity reestablishment. Depending on the frequency of the
   probes and the network environment, this can cause significant
   amounts of extra traffic. TCP-F completely suspends ongoing
   connections until receiving "route reestablishment notifications"
   that indicate peer reachability. Both proposals are primarily
   designed for ad hoc networks and rely on changes to intermediate
   routers, whereas the "immediate retransmission" extension only
   requires end system support.

   ATCP [24] uses an approach similar to the Explicit Link Failure
   Notification, but discovers link failures through ICMP Destination
   Unreachable messages. Caceres and Iftode [25] propose and evaluate a
   solution similar to the TCP Retransmission Trigger that improves
   performance during MobileIP handoffs. Unlike the solution proposed
   in this document, their handoff mechanism is targeted at
   disconnections of a few seconds.

7. Security Considerations

   To protect the TCP retransmission trigger from abuse, e.g., the
   launch of denial-of-service attacks by flooding TCP with triggers, a
   control mechanism that "rate-limits" connectivity indications may be
   effective.

   This document does not currently discuss the security aspects of
   reconnection triggers and the "immediate retransmission" extension to
   TCP further.

8. Conclusion

   This document described the "immediate retransmission" extension to
   TCP's standard retransmission scheme. The new extension improves
   performance across intermittently connected paths through additional,
   speculative retransmission attempts upon receiving external triggers.
   One example of such a trigger is "first hop router reachable." This
   document did not define the specifics of such triggers, although it
   described some examples to illustrate the operation of the "immediate
   retransmission" extension, which is its main contribution.

9. Acknowledgments

   The following people have helped to improve this document through
   thoughtful suggestions and feedback: Marcus Brunner, Juergen Quittek
   and Joe Touch.

   This work is a byproduct of the Ambient Networks project supported in
   part by the European Commission under its Sixth Framework Programme.
   It is provided "as is" and without any express or implied warranties,
   including, without limitation, the implied warranties of fitness for
   a particular purpose.
   The views and conclusions contained herein are those of the authors
   and should not be interpreted as necessarily representing the
   official policies or endorsements, either expressed or implied, of
   the Ambient Networks project or the European Commission.

10. References

10.1 Normative References

   [1]  Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
        September 1981.

   [2]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [3]  Paxson, V. and M. Allman, "Computing TCP's Retransmission
        Timer", RFC 2988, November 2000.

   [4]  Allman, M., Paxson, V. and W. Stevens, "TCP Congestion Control",
        RFC 2581, April 1999.

   [5]  Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981.

10.2 Informative References

   [6]   Stevens, W., "TCP/IP Illustrated, Volume 1: The Protocols",
         Addison-Wesley, 1994.

   [7]   Eggert, L., "TCP Abort Timeout Option",
         draft-eggert-tcpm-tcp-abort-timeout-option-00 (work in
         progress), April 2004.

   [8]   Schuetz, S., "Network Support for Intermittently Connected
         Mobile Nodes", M.S. Thesis, University of Mannheim, Germany,
         June 2004.

   [9]   Schuetz, S., Eggert, L., Schmid, S. and M. Brunner, "Protocol
         Enhancements for Intermittently Connected Hosts", under
         submission (work in progress), July 2004.

   [10]  Scott, J. and G. Mapp, "Link layer-based TCP optimisation for
         disconnecting networks", ACM Computer Communication Review,
         Vol. 33, No. 5, October 2003.

   [11]  Dawkins, S., "End-to-end, Implicit 'Link-Up' Notification",
         draft-dawkins-trigtran-linkup-01 (work in progress), October
         2003.

   [12]  Karn, P., "Advice for Internet Subnetwork Designers",
         draft-ietf-pilc-link-design-15 (work in progress), December
         2003.

   [13]  Ott, J. and D. Kutscher, "Drive-Thru Internet: IEEE 802.11b for
         Automobile Users", Proc. INFOCOM 2004, March 2004.

   [14]  Droms, R., "Dynamic Host Configuration Protocol", RFC 2131,
         March 1997.

   [15]  Johnson, D., Perkins, C. and J. Arkko, "Mobility Support in
         IPv6", RFC 3775, June 2004.

   [16]  Moskowitz, R., "Host Identity Protocol Architecture",
         draft-moskowitz-hip-arch-05 (work in progress), October 2003.

   [17]  Choi, J., "Detecting Network Attachment in IPv6 Goals",
         draft-ietf-dna-goals-00 (work in progress), June 2004.

   [18]  Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6)
         Specification", RFC 2460, December 1998.

   [19]  Harkins, D. and D. Carrel, "The Internet Key Exchange (IKE)",
         RFC 2409, November 1998.

   [20]  Nikander, P., "End-Host Mobility and Multi-Homing with Host
         Identity Protocol", draft-nikander-hip-mm-01 (work in
         progress), January 2004.

   [21]  Allman, M., Hayes, C. and S. Ostermann, "An Evaluation of TCP
         with Larger Initial Windows", ACM Computer Communication
         Review, Vol. 28, No. 3, July 1998.

   [22]  Holland, G. and N. Vaidya, "Analysis of TCP Performance over
         Mobile Ad Hoc Networks", Proc. 5th Annual ACM/IEEE
         International Conference on Mobile Computing and Networking,
         1999.

   [23]  Chandran, K., Raghunathan, S., Venkatesan, S. and R. Prakash,
         "A Feedback Based Scheme For Improving TCP Performance In
         Ad-Hoc Wireless Networks", IEEE Personal Communication Systems
         (PCS) Magazine: Special Issue on Ad Hoc Networks, Vol. 8, No.
         1, February 2001.

   [24]  Liu, J. and S. Singh, "ATCP: TCP for Mobile Ad Hoc Networks",
         IEEE Journal on Selected Areas in Communication, Vol. 19, No.
         7, July 2001.

   [25]  Caceres, R. and L. Iftode, "Improving the Performance of
         Reliable Transport Protocols in Mobile Computing Environments",
         IEEE Journal on Selected Areas in Communication, Vol. 13, No.
         5, 1995.

Authors' Addresses

   Lars Eggert
   NEC Network Laboratories
   Kurfuerstenanlage 36
   Heidelberg 69115
   DE

   Phone: +49 6221 90511 43
   Fax:   +49 6221 90511 55
   EMail: lars.eggert@netlab.nec.de
   URI:   http://www.netlab.nec.de/

   Simon Schuetz
   NEC Network Laboratories
   Kurfuerstenanlage 36
   Heidelberg 69115
   DE

   Phone: +49 6221 90511 10
   Fax:   +49 6221 90511 55
   EMail: simon.schuetz@netlab.nec.de
   URI:   http://www.netlab.nec.de/

   Stefan Schmid
   NEC Network Laboratories
   Kurfuerstenanlage 36
   Heidelberg 69115
   DE

   Phone: +49 6221 90511 54
   Fax:   +49 6221 90511 55
   EMail: stefan.schmid@netlab.nec.de
   URI:   http://www.netlab.nec.de/

Appendix A. Document Revision History

   +-----------+-------------------------------------------------------+
   | Revision  | Comments                                              |
   +-----------+-------------------------------------------------------+
   | 00        | Initial version.                                      |
   +-----------+-------------------------------------------------------+

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2004). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.