Internet Engineering Task Force                                 M. Kojo
INTERNET-DRAFT                                   University of Helsinki
draft-kojo-tcpm-frto-eval-00.txt                            K. Yamamoto
Expires: December 2007                                          M. Hata
                                                             NTT Docomo
                                                           P. Sarolahti
                                                  Nokia Research Center
                                                            5 June 2007

                         Evaluation of RFC 4138

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire in December 2007.

Abstract

Forward RTO-Recovery (F-RTO), specified in RFC 4138, is an algorithm for detecting spurious retransmission timeouts with TCP and SCTP. This document describes the advantages of F-RTO and summarizes the experience gained from its implementations and the experiments conducted with it. By analyzing the implications of spurious retransmission timeouts for the regular RTO recovery and for the F-RTO algorithm, including a detailed corner-case analysis, it shows that F-RTO has no negative impact on the network when used with an appropriate response algorithm, even in the rare cases where F-RTO falsely declares a retransmission timeout spurious.
It concludes with a recommendation that F-RTO be advanced to the standards track.

Table of Contents

1. Introduction
   1.1. Conventions and Terminology
2. Problems with the Regular RTO Recovery
   2.1. Unnecessary Retransmissions
   2.2. Dishonoring the Packet Conservation Principle
   2.3. Unnecessary Reduction of the Congestion Window
   2.4. Other Problems
3. Advantages and Motivation
   3.1. Avoiding Unnecessary Retransmissions
   3.2. Adhering to the Packet Conservation Principle
   3.3. Selecting an Appropriate Congestion Control Response
   3.4. Other Advantages
   3.5. Non-spurious RTOs and Undetected Spurious RTOs
4. Experimental Results
   4.1. Initial trials in an emulated network
   4.2. F-RTO Performance over Commercial W-CDMA Networks
5. Hidden Packet Losses
   5.1. Loss of Retransmitted Segments
   5.2. Reordering
   5.3. Malicious Receiver
6. Conclusions and Recommendations
References
AUTHORS' ADDRESSES
Full Copyright Statement
Intellectual Property

1. Introduction

A temporary delay spike or a more permanent but sudden delay increase in the TCP data or ACK path may result in a spurious retransmission timeout (RTO) that triggers a premature retransmission of the first unacknowledged data segment followed by an unnecessary loss recovery in slow start. This creates severe problems with the regular RTO recovery algorithm as the late acknowledgments of the original segments trigger unnecessary retransmissions at a high rate. This introduces useless load into the network in the form of a (large) packet burst. In addition, the TCP sender will reduce its transmission rate unnecessarily because the congestion control algorithms are falsely triggered, resulting in decreased TCP performance.

When a spurious RTO occurs, a TCP sender employing the Forward RTO-Recovery (F-RTO) algorithm [RFC4138] is able to avoid the problems encountered with the regular RTO recovery by detecting that the TCP retransmission timer expired spuriously and by avoiding additional unnecessary retransmissions. In addition, the F-RTO sender may avoid the unnecessary performance degradation by restoring the congestion control state and/or may reduce the risk of falsely triggering TCP's loss recovery and congestion control again in the later phases of the connection by adapting the RTT estimators.

This document discusses the problems with the regular TCP RTO recovery when spurious RTOs are encountered and evaluates the F-RTO algorithm as a standards track alternative to the regular RTO recovery.

1.1. Conventions and Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Problems with the Regular RTO Recovery

When a spurious retransmission timeout occurs, the regular RTO recovery is incapable of avoiding unnecessary retransmissions and also fails to adhere to the packet conservation principle [Jac88], because it injects the unnecessary retransmissions into the network at a rate that is higher than the rate at which packets are leaving the network. In addition, after bursting the unnecessary retransmissions into the network, the transmission is continued at an unnecessarily low rate.

2.1. Unnecessary Retransmissions

After the retransmission of the first unacknowledged segment, triggered by a spurious retransmission timeout, has been transmitted, the late acknowledgments of the original segments arrive at the sender and trigger further unnecessary retransmissions in slow start. The first late ACK triggers an unnecessary retransmission of the next segment(s) which, in turn, get acknowledged almost immediately by the next late ACK, injecting the next segments from the retransmission queue into the network. Assuming that none of the original segments and none of the corresponding late ACKs were lost, this chain reaction results in an unnecessary retransmission of all outstanding segments.

Depending on the flight size at the time when the spurious RTO occurred, a large number of unnecessary retransmissions may get injected into the network as useless load. This often creates a significant problem in network environments where sudden delay spikes tend to appear, because such networks often offer a low or moderate transmission capacity and any excess load significantly reduces the network capacity available for delivering useful traffic.
In addition, unnecessary transmissions waste the battery power of wireless devices and introduce extra costs when the access link usage is billed per transmission volume.

2.2. Dishonoring the Packet Conservation Principle

The purpose of the TCP slow-start algorithm is to (re)start the ACK clock and probe the network for available capacity. However, when the retransmission timeout expires spuriously, the TCP sender fails to restart the ACK clock at a correct rate and is not able to probe the available network capacity correctly. This is because the acknowledgements acknowledging the retransmissions do not arrive one round-trip time (RTT) after the retransmission, as slow start assumes. Instead, the retransmissions are acknowledged by the late acknowledgements that arrive at the line rate of the bottleneck link on the end-to-end path or, in some cases, at a much higher rate. Therefore, the late acknowledgements clock out the unnecessary retransmissions within one RTT using slow start, and potentially 50 percent more data segments are transmitted to the network in one RTT than the TCP sender in steady state would have transmitted if the spurious RTO had not occurred. This violates the packet conservation principle [Jac88].

Assuming no packets are lost and delayed ACKs are in use, each late acknowledgement (except the first one) arriving after a spurious RTO triggers three unnecessary retransmissions into the network until all segments in the retransmission queue have been retransmitted. After this, the TCP sender continues by transmitting three new segments on each late acknowledgement. The number of new segments triggered by the late acknowledgements equals half of the flight size at the time when the spurious RTO occurred. This injects 50 percent more segments into the network within one RTT compared to the number of segments injected in one RTT by a TCP sender in steady state, i.e., three segments per ACK instead of two segments per ACK.
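The delayed-ACK case above can be checked with a small counting model. The following Python sketch is illustrative only and is not taken from RFC 4138 or from any TCP implementation. The function name `late_ack_burst` and its parameters are hypothetical. The model assumes that no packets are lost, that each late ACK covers exactly two original segments, that the congestion window restarts at one segment and grows by one segment per ACK, and that the sender stays in slow start for the whole burst (the slow-start threshold is ignored).

```python
def late_ack_burst(flight_size, segs_per_ack=2):
    """Count segments a regular RTO sender emits per late ACK.

    Simplified model (assumptions, not a TCP implementation):
    - segments are numbered 1..flight_size and none are lost;
    - each late ACK cumulatively acknowledges `segs_per_ack` segments;
    - after the RTO, cwnd = 1 segment and grows by 1 per ACK (slow start);
    - ssthresh is ignored, so slow start lasts for the whole burst.
    """
    cwnd = 1      # congestion window after the RTO, in segments
    snd_una = 1   # lowest unacknowledged segment
    snd_nxt = 2   # next segment to send; segment 1 was retransmitted on RTO
    sent_per_ack = []

    for _ in range(flight_size // segs_per_ack):
        snd_una += segs_per_ack          # late ACK for original segments
        snd_nxt = max(snd_nxt, snd_una)  # acked segments are not retransmitted
        cwnd += 1                        # slow start: one more segment per ACK
        in_flight = snd_nxt - snd_una    # sent since the RTO, not yet acked
        burst = max(cwnd - in_flight, 0)
        snd_nxt += burst
        sent_per_ack.append(burst)

    total = 1 + sum(sent_per_ack)        # include the RTO retransmission
    return sent_per_ack, total


per_ack, total = late_ack_burst(20)
print(per_ack)  # [2, 3, 3, 3, 3, 3, 3, 3, 3, 3]
print(total)    # 30
```

For an original flight of 20 segments this model sends 30 segments while the late ACKs arrive, which is the 50 percent excess described above: every late ACK after the first clocks out three segments instead of two.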
Assuming no packets are lost and delayed ACKs are not in use, each late acknowledgement corresponding to the first half of the original flight triggers two unnecessary retransmissions into the network. The late acknowledgements belonging to the second half of the original flight trigger a transmission of one new segment each. This means that during the first half of the RTT, the sending rate is doubled compared to the rate at which a TCP sender in steady state would have transmitted if the spurious RTO had not occurred.

The new data segments transmitted after the unnecessary retransmissions during the same RTT are likely to experience congestion, as the preceding unnecessary retransmission of the whole window of segments is likely to occupy the bottleneck link queue. This may result in a serious performance penalty, as the TCP sender is often forced to wait for a backed-off retransmission timer to expire in order to recover the lost segments. A figure depicting an example of such behavior is available at [http://www.iki.fi/pasi.sarolahti/frto/].

2.3. Unnecessary Reduction of the Congestion Window

When a spurious RTO occurs, the TCP sender enters loss recovery and reduces the congestion window and slow-start threshold. If the RTO was spurious, the reduction is likely to be unnecessary and results in sacrificed TCP performance. The impact of this unnecessary congestion control action is particularly notable in high latency environments where restoring the previous congestion window takes a long time.

2.4. Other Problems

Updating the RTO estimate on retransmitted segments is not possible due to the retransmission ambiguity problem [Zh86, KP87]. Therefore, the RTO estimate is not updated for segments that experience the unusually long delay and cause the spurious RTOs.
This means that the delayed segments are ignored in updating the RTO estimate and, in the worst case, the temporary delay spikes are never reflected in the RTO estimate, allowing a later delay spike to trigger a new spurious RTO as easily as the previous one was triggered.

3. Advantages and Motivation

3.1. Avoiding Unnecessary Retransmissions

If the TCP sender employs F-RTO, it is able to detect spurious RTOs. When F-RTO detects a spurious RTO, it retransmits only one segment unnecessarily (the first unacknowledged segment) and continues by transmitting new segments.

3.2. Adhering to the Packet Conservation Principle

If the TCP sender employs F-RTO, it is able to detect spurious RTOs and avoid the unnecessary retransmission of the whole window of data. The amount of data that the TCP sender employing F-RTO transmits during the next RTT after detecting the spurious RTO depends on the congestion control response that the TCP sender follows. Whichever response algorithm is selected, the segments clocked out by the late acknowledgements must not be transmitted in slow start, unless the TCP sender was in slow start right before the spurious RTO occurred and the RTO recovery was entered. Otherwise, the late acknowledgements would clock out the segments at a higher rate than is acceptable, as discussed in Section 2.2, and the TCP sender would not adhere to the packet conservation principle.

3.3. Selecting an Appropriate Congestion Control Response

If the F-RTO algorithm detects that the RTO was spurious, the TCP sender may revert the congestion control state back to the state it was in right before the RTO occurred. One possible option is to restore the congestion window and slow-start threshold [LG04].
This would result in transmitting at the same rate as before the RTO, avoiding the performance penalty of unnecessarily reducing the congestion window and slow-start threshold. However, reverting the congestion control parameters might not be a safe response on all occasions. For example, a spurious RTO may occur due to a make-before-break vertical handover [MK04] from a low latency path to a high latency path [HC05, DK06]. If the handover results in a spurious RTO and the bottleneck link bandwidth-delay product of the new path after the handover is smaller than that of the old path, restoring the congestion window is likely to result in congestion on the new bottleneck link.

The TCP sender may choose to take a conservative congestion control response after detecting a spurious RTO. The original F-RTO algorithm employed a conservative response algorithm after a spurious RTO was detected [SKR03]. That is, the TCP sender sets the congestion window and slow-start threshold to a value equal to half of the flight size right before the spurious RTO occurred and continues transmitting new data in congestion avoidance. This approach is always a safe response as the TCP sender halves its transmission rate, thereby taking the spurious RTO as a congestion signal.

3.4. Other Advantages

If the F-RTO algorithm declares that an RTO was spuriously triggered, it may take the RTT for the delayed segments into account when calculating the RTO estimate, except for the segment that was retransmitted upon the retransmission timer expiration. This alone may help in avoiding further spurious RTOs. However, with the capability of detecting spurious RTOs, the TCP sender may also adjust the RTO estimate explicitly in order to avoid entering loss recovery unnecessarily in the later phases of the connection [BBA06].

3.5. Non-spurious RTOs and Undetected Spurious RTOs

If the retransmission timeout is not spurious or the F-RTO algorithm is not able to detect the spurious timeout, it reverts back to the conventional RTO recovery and continues retransmitting segments in slow start. Two different cases with slightly different behavior can be observed:

(i) if the first ACK arriving after the retransmission timer expired is a duplicate acknowledgement, the F-RTO sender declares the RTO genuine and reverts back to the conventional RTO recovery.

(ii) if the first ACK arriving after the retransmission timer expired acknowledges new data, the F-RTO sender sends two previously unsent segments. Now, if the next ACK is a duplicate ACK, the F-RTO sender declares the RTO genuine and reverts to the conventional RTO recovery.

In the first case, the behavior is identical to the behavior of the conventional RTO recovery. In the second case, the behavior is similar to the conventional RTO recovery, with the only difference being that the 2nd and 3rd segments sent after the RTO are new segments.

When compared to a regular TCP implementation, the use of the F-RTO algorithm does not change the transmission rate of segments in the cases where the RTO is not declared spurious. Therefore, from the congestion control point of view, the F-RTO algorithm can be seen to be safe also in these cases.

4. Experimental Results

Additional material, such as the cited papers with the F-RTO experimentation results, can be found at [http://www.iki.fi/pasi.sarolahti/frto/].

4.1. Initial trials in an emulated network

The basic F-RTO algorithm was first introduced in [SKR03]. F-RTO performance was evaluated in a simple emulated network environment with a slow bottleneck link that was typical of the wireless environments at the time the analysis was conducted.
The original conservative F-RTO congestion control response (see Section 3.3) was used. The paper analyzed different scenarios that trigger a retransmission timeout either spuriously or genuinely. The following scenarios were investigated: i) a delay spike that triggers a spurious timeout, ii) a lost retransmission, and iii) a loss burst of an entire window of data. The paper also discussed a packet reordering scenario, although experiments with packet reordering were not conducted.

In the delay spike scenario, F-RTO significantly reduced the number of unnecessary retransmissions and also improved the data throughput. In the other cases, the use of F-RTO did not affect TCP performance negatively and did not cause any additional traffic to be sent into the network.

SACK-enhanced F-RTO with different congestion control response algorithms was evaluated in [Sar03]. Because the cases where duplicate ACKs interact with spurious timeout detection were relatively rare in practice, in many cases the basic F-RTO performed as well as the SACK-enhanced F-RTO.

4.2. F-RTO Performance over Commercial W-CDMA Networks

Experimentation on F-RTO performance over commercial W-CDMA networks and in a test environment that emulates HSDPA (High Speed Downlink Packet Access) networks has been reported in [Yam05, Hok05].

In the experimentation over the commercial networks, we downloaded data objects from a test server in the Internet through the W-CDMA mobile communications networks. The F-RTO detection algorithm with the Eifel response algorithm was implemented on the test server, an HP-UX 11i prototype. The commercial W-CDMA networks provide a maximum bearer rate of 384 kbps in the downlink and 64 kbps in the uplink. A mobile client downloaded data objects of varying size from the server in five different situations: fixed point (good and bad wireless conditions), low speed (pedestrian), medium speed (driving by car in an urban area), and high speed (a bullet train).
The object sizes were set to 6 Kbytes, 18 Kbytes, 300 Kbytes, 2 Mbytes, and 518 Mbytes (the object size of 518 Mbytes was used only on the bullet train). The experimentation took two weeks, collecting performance data for 643 connections with F-RTO and 991 connections without F-RTO.

In this experimentation, F-RTO reduced the amount of unnecessarily retransmitted data by 82 percent compared to the connections without F-RTO. Because the spurious RTOs did not occur very often, a relatively large amount of data was sent in total compared to the amount of unnecessarily retransmitted data. Therefore, just avoiding the unnecessary retransmissions did not improve TCP performance significantly. Throughput was improved primarily because F-RTO was used with the Eifel response algorithm, which fully reverts the congestion control state to the state valid prior to the spurious retransmission timeout. F-RTO with the Eifel response improved throughput by 6 percent for connections that transferred at least 2 Mbytes and experienced a spurious timeout. The network used for the experimentation has a small bandwidth-delay product of around ten segments. A larger improvement in throughput is expected in networks with a high bandwidth-delay product, such as HSDPA networks.

There are a few situations in which F-RTO cannot detect a spurious timeout: severe reordering or duplication occurring on the segment that triggered the spurious timeout, the sender having no new data to send, or the advertised window not allowing the sender to send the new data that F-RTO needs to detect the spurious timeout. In the experimentation, F-RTO was able to detect 71 percent of the spurious timeouts successfully. 28 percent of the spurious timeouts could not be detected by F-RTO because the sender had already sent the FIN segment and had no new data to send when the spurious timeout occurred.
0.7 percent of the spurious timeouts could not be detected because the advertised window prohibited transmitting new data, and 0.3 percent because the sender received duplicate acknowledgements after the spurious timeout.

Throughput of F-RTO with the Eifel response was also evaluated in the test environment that emulated HSDPA networks. The test network has a 14 Mbps bearer rate and a 300 ms round-trip time, which yields a bandwidth-delay product of about 350 segments with a segment size of 1460 bytes. To trigger a spurious timeout, acknowledgements from the server to the client were delayed intentionally in the early and middle phases of the initial slow start and after the congestion window reached the maximum window size (i.e., in the steady state). F-RTO with the Eifel response improved the throughput by 262 percent, 92 percent, and 37 percent when a spurious timeout occurred in the early slow-start phase, in the middle of the slow-start phase, and in the steady state, respectively. The results show that F-RTO with the Eifel response improves the throughput significantly in networks with a large bandwidth-delay product.

5. Hidden Packet Losses

There are a few known scenarios where a packet loss could escape F-RTO's notice and cause a false positive detection. These scenarios can be split into two cases: scenarios with a legitimate receiver where TCP communication is unaffected, and scenarios with a misbehaving receiver. In the first case the hidden packet loss is harmless if the congestion control response to the spurious timeout is conservative enough.
The second case requires receiver misbehavior, either acknowledging segments that have not been received or delaying the acknowledgements, and it is not beneficial to the receiver because, as a result, the TCP connection may become unreliable or useless, or the malicious receiver may compromise the performance of the TCP loss recovery in order to mislead the F-RTO sender. We also note that optimistically acknowledging segments that have not yet been received is possible with any regular TCP implementation, and if the receiver's motivation was to damage the TCP connection (for example, as a part of some kind of denial-of-service effort), standard TCP offers easier ways of doing that. Next we will discuss these cases in more detail.

5.1. Loss of Retransmitted Segments

RFC 4138 notes that when the timeout is declared spurious, the TCP sender cannot detect whether the unnecessary RTO retransmission was lost. In principle, the loss of the RTO retransmission should be taken as a congestion signal. Thus, there is a small possibility that the F-RTO sender will violate the congestion control rules if it chooses to fully revert the congestion control parameters after detecting a spurious timeout. The Eifel detection algorithm [RFC3522] has a similar property, while the DSACK option can be used to detect whether the retransmitted segment was successfully delivered to the receiver [RFC3708].

This behavior belongs to the first class of the above-mentioned cases; the loss of the RTO retransmission does not harm the TCP connection in any way, because the original segment has reached the receiver. With a conservative enough congestion control response this behavior is not harmful to the network, either.

5.2. Reordering

F-RTO can incorrectly declare a timeout spurious when there is reordering between the retransmitted segment and the original segments transmitted before the timeout, so that the RTO retransmission is acknowledged before the full window of original transmissions. This could happen, for example, when an original segment is lost on a high-latency connection path and the RTO retransmission of that segment traverses a different path that has a substantially lower round-trip delay. This might sound like a pathological scenario, but it could occur on a multi-radio device that is performing a vertical handover between a high-latency WWAN link and a low-latency WLAN link.

This scenario also belongs to the first class of the above-mentioned cases, as it is not harmful to the network if the congestion control response to the spurious retransmission timeout is conservative enough. We believe that in practice this kind of combination of loss, delay and reordering is very rare. In addition, this kind of reordering is less likely to occur with large TCP windows, with which the effect of a non-conservative response could be detectable.

5.3. Malicious Receiver

RFC 4138 notes in its security considerations that with F-RTO the receiver could mislead the sender into falsely declaring the RTO spurious. There are two possible ways a malicious receiver could trigger a wrong output from the F-RTO algorithm. First, the receiver can acknowledge data that it has not received so that the acknowledgement arrives at the sender after the retransmission timeout. Second, it can delay acknowledgments for segments it has received earlier and acknowledge the outstanding segments after the retransmission timer has expired and the retransmitted segment has arrived, deluding the sender into declaring the timeout spurious.
If the TCP receiver acknowledges a segment it has not really received, the sender can be led to declare the timeout spurious in step 3 of the F-RTO algorithm. However, by doing so the receiver risks the correct behavior of the connection. If both the original transmission and the retransmission of the segment are dropped, the sender incorrectly thinks that the lost segment has been delivered to the receiver and is not able to retransmit the segment again. As a result, the TCP connection is unable to proceed unless the receiver delivers the data out of order to the application, making the data delivery of the connection unreliable. In addition, this requires that the receiver times the false ACK such that it does not arrive at the sender until the retransmission timer has expired, and that the receiver suppresses any duplicate ACKs in order to prevent the sender from entering fast retransmit and fast recovery. Therefore, we believe that this kind of attack is very hard to implement successfully and a malicious receiver is unlikely to get any benefit from it. With an appropriate response this attack is not harmful to the network, either.

If the TCP receiver delays the acknowledgements of the out-of-order segments after detecting a hole in the sequence space and waits for the retransmission timer to expire and the retransmitted segment to arrive before it acknowledges the segments with cumulative acknowledgements, it may make the F-RTO sender walk through the algorithm steps so that the timeout seems spurious when it should have been genuine.
We believe this kind of attack is difficult to implement in practice, and it is likely to be of no benefit to the receiver, as it needs to force the sender to wait for an RTO to recover each of the lost segments while loss recovery with fast retransmit and fast recovery is likely to be much more efficient. In addition, this approach does not work if consecutive segments are lost unless the receiver acknowledges data that it has not received. Also, with a conservative response to the spurious timeout, this attack is of no benefit to the receiver and it is not harmful to the network, either.

6. Conclusions and Recommendations

This document analyzed the possible benefits and disadvantages of using F-RTO enhanced TCP if it were deployed globally. When a spurious retransmission timeout occurs, the regular RTO recovery wastes network resources by retransmitting the whole window of data unnecessarily. By doing so, a regular TCP also violates the packet conservation principle and is thus harmful to congestion control, despite following the letter of the specifications. As a result of a spurious timeout, the regular RTO recovery transmits 1.5 times the amount of data (that is, 50 percent more than) allowed by the congestion window at the time the spurious timeout occurred, and it imposes excessive load on the network.

F-RTO is able to avoid these unnecessary retransmissions by detecting a spurious timeout and not retransmitting segments unnecessarily. When the spurious retransmission timeout has been detected by F-RTO, the F-RTO sender with an appropriate response algorithm adheres to the packet conservation principle, because it does not transmit more segments than have left the network. Therefore, a successful detection of a spurious retransmission timeout with F-RTO can result both in reduced load on the network and in improved TCP throughput. These factors are especially important in wireless communication.
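The detection outcome summarized above can be illustrated with a short sketch of how a basic F-RTO sender classifies the first two acknowledgements after an RTO, following the two cases of Section 3.5 and the conservative response of Section 3.3. This is a hedged illustration, not the normative algorithm of RFC 4138. The names `Verdict`, `classify_rto` and `conservative_response` are hypothetical, each ACK is reduced to a boolean (True if it acknowledges new data, False if it is a duplicate), and corner cases such as the sender having no new data to send or a limiting advertised window are left out.

```python
from enum import Enum


class Verdict(Enum):
    GENUINE = "genuine"    # fall back to conventional RTO recovery
    SPURIOUS = "spurious"  # timeout declared spurious


def classify_rto(first_ack_advances, second_ack_advances):
    """Classify an RTO from the first two ACKs that follow it.

    Each argument is True if that ACK acknowledges new data and False
    if it is a duplicate ACK (a simplification made for this sketch).
    Returns the verdict and the number of new segments sent meanwhile.
    """
    if not first_ack_advances:
        # Case (i): a duplicate ACK arrives first, the RTO is genuine.
        return Verdict.GENUINE, 0
    # Case (ii): the sender transmits two previously unsent segments.
    new_segments = 2
    if not second_ack_advances:
        # A duplicate ACK at this point indicates a real loss.
        return Verdict.GENUINE, new_segments
    return Verdict.SPURIOUS, new_segments


def conservative_response(flight_size):
    """Conservative response: cwnd and ssthresh become half the flight size."""
    half = max(flight_size // 2, 1)  # the floor of 1 segment is an assumption
    return {"cwnd": half, "ssthresh": half}


verdict, sent = classify_rto(True, True)
print(verdict.name, sent)         # SPURIOUS 2
print(conservative_response(20))  # {'cwnd': 10, 'ssthresh': 10}
```

In every branch the sender emits at most two new segments before a verdict is reached, which is why a timeout that is not declared spurious leaves the transmission rate unchanged compared to the conventional recovery.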
There is one well-known scenario where a spurious timeout hides a packet loss with F-RTO: if the RTO retransmission is lost, an F-RTO sender cannot detect the segment loss. This is common to all currently known mechanisms for detecting a spurious retransmission timeout immediately after it has occurred. Missing one packet loss event is not a problem if the response algorithm is conservative enough. DSACK, which is able to detect a spurious timeout after a full window of data has been unnecessarily retransmitted, does not have this problem, but on the other hand, DSACK is not able to avoid the unnecessary retransmissions and the consequent violation of the packet conservation principle.

There are two known ways a misbehaving TCP receiver could cheat the F-RTO algorithm: (i) after detecting a packet loss, the receiver could delay acknowledging the following segments until a retransmission timer expires and the retransmitted segment arrives, and then acknowledge the outstanding data, making the timeout seem spurious to F-RTO. (ii) after the F-RTO algorithm has been triggered, the receiver could optimistically acknowledge segments that have been lost and make the RTO seem spurious to F-RTO. In the latter case the penalty to the connection is significant, because the control data at the TCP sender may go into an invalid state, causing the TCP connection to be unusable. Furthermore, with the regular TCP algorithms the receiver can also acknowledge unreceived segments before they arrive in the hope of gaining more performance, with the risk of invalidating the TCP state at the sender and making the connection unusable. If the response to a spurious retransmission timeout is conservative enough, a misbehaving receiver cannot cause extensive congestion in the network in either of the cases.
Given the benefits and disadvantages presented above, we believe F-RTO [RFC4138] is a safe algorithm to be advanced to Proposed Standard and to be deployed globally in the Internet, with the following notes:

* Because F-RTO performance with SCTP has not been studied to a significant extent, we propose that the revised version of RFC 4138 not include discussion of SCTP. However, the authors would like to encourage future experimentation with F-RTO and SCTP, applying RFC 4138.

* The research so far indicates that SACK-enhanced F-RTO provides only a limited benefit over the basic F-RTO, and only in a small subset of spurious timeouts. Also, many of the deployed implementations of the F-RTO algorithm implement only the basic F-RTO. Therefore, we propose that the revision of RFC 4138 contain only the basic F-RTO algorithm.

* While it is useful to keep the spurious timeout detection and response specifications separate, the authors would like to enable a use of the F-RTO algorithm that allows detecting a spurious timeout without applying any specific response algorithm, i.e., allowing the TCP sender to continue transmitting new data with a conservative congestion control response. In other words, after detecting a spurious retransmission timeout, the TCP sender would take the spurious timeout as a congestion signal and reduce the congestion window and slow-start threshold.

References

[BBA06] J. Blanton, E. Blanton, and M. Allman, "Using Spurious Retransmissions to Adapt the Retransmission Timeout", Internet-Draft draft-allman-rto-backoff-04.txt, December 2006. Work in progress.

[DK06] L. Daniel and M. Kojo, "Adapting TCP for Vertical Handoffs in Wireless Networks", In Proc. 31st IEEE Conference on Local Computer Networks (LCN), Tampa, FL, USA, November 15-16, 2006.

[Jac88] V. Jacobson, "Congestion Avoidance and Control", In Proceedings of ACM SIGCOMM 88.

[HC05] H.
Huang and J. Cai, "Improving TCP Performance during Soft Vertical Handoff", In Proc. 19th International Conference on Advanced Information Networking and Applications (AINA'05), volume 2, pages 329-332, Mar. 2005.

[Hok05] A. Hokamura, et al., "Performance Evaluation of F-RTO and Eifel Response Algorithms over W-CDMA packet network", Wireless Personal Multimedia Communications (WPMC'05), Sept. 2005.

[KP87] P. Karn and C. Partridge, "Improving Round-Trip Time Estimates in Reliable Transport Protocols", In Proceedings of ACM SIGCOMM 87.

[LG04] R. Ludwig and A. Gurtov, "The Eifel Response Algorithm for TCP", RFC 4015, February 2005.

[MK04] J. Manner and M. Kojo, "Mobility Related Terminology", RFC 3753, June 2004.

[RFC793] J. Postel, "Transmission Control Protocol", RFC 793, September 1981.

[RFC2119] S. Bradner, "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3522] R. Ludwig and M. Meyer, "The Eifel Detection Algorithm for TCP", RFC 3522, April 2003.

[RFC3708] E. Blanton and M. Allman, "Using TCP Duplicate Selective Acknowledgement (DSACKs) and Stream Control Transmission Protocol (SCTP) Duplicate Transmission Sequence Numbers (TSNs) to Detect Spurious Retransmissions", RFC 3708, February 2004.

[RFC4138] P. Sarolahti and M. Kojo, "Forward RTO-Recovery (F-RTO): An Algorithm for Detecting Spurious Retransmission Timeouts with TCP and the Stream Control Transmission Protocol (SCTP)", RFC 4138, August 2005.

[SKR03] P. Sarolahti, M. Kojo, and K. Raatikainen, "F-RTO: An Enhanced Recovery Algorithm for TCP Retransmission Timeouts", ACM SIGCOMM Computer Communication Review, 33(2), April 2003.

[Sar03] P. Sarolahti, "Congestion Control on Spurious TCP Retransmission Timeouts", In Proceedings of IEEE Globecom 2003, San Francisco, CA, USA, December 2003.

[Yam05] K. Yamamoto, et al., "Effects of F-RTO and Eifel Response Algorithms for W-CDMA and HSDPA networks",
Wireless Personal Multimedia Communications (WPMC'05), Sept. 2005.

[Zh86] L. Zhang, "Why TCP Timers Don't Work Well", In Proceedings of ACM SIGCOMM 86.

AUTHORS' ADDRESSES

Markku Kojo
University of Helsinki
P.O. Box 68
FI-00014 UNIVERSITY OF HELSINKI
Finland
Email: kojo@cs.helsinki.fi

Kazunori Yamamoto
NTT Docomo, Inc.
3-5 Hikarinooka, Yokosuka, Kanagawa, 239-8536, Japan
Phone: +81-46-840-3812
Email: yamamotokaz@nttdocomo.co.jp

Max Hata
NTT Docomo, Inc.
3-5 Hikarinooka, Yokosuka, Kanagawa, 239-8536, Japan
Phone: +81-46-840-3812
Email: hatama@s1.nttdocomo.co.jp

Pasi Sarolahti
Nokia Research Center
P.O. Box 407
FI-00045 NOKIA GROUP
Finland
Phone: +358 50 4876607
Email: pasi.sarolahti@nokia.com

Full Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights.
Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.