Internet Engineering Task Force                     J. Ma and R. Zhang
Internet Draft                                Nokia (China) R&D Center
Document: draft-ma-pilc-ftcp-01.txt                      November 1999
Expires: May 2000


              Fast-TCP: An enhancement to the current TCP


Status of this Memo

This document is an Internet Draft and is in full conformance with all
provisions of Section 10 of RFC2026. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas, and
its Working Groups. Note that other groups may also distribute working
documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six months.
Internet Drafts may be updated, replaced, or obsoleted by other
documents at any time. It is not appropriate to use Internet Drafts as
reference material or to cite them other than as a "working draft" or
"work in progress".

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

As a widely used transport protocol in the Internet, TCP can be
affected by congestion occurring at the nodes (i.e., routers, IP/ATM
access nodes, and ATM switches) along the path a connection travels.
Since TCP relies on an end-to-end window-based mechanism to respond to
congestion, there is limited room for intermediate network elements to
contribute explicitly to TCP control. However, in huge pipe networks,
TCP might be ineffective because it starts decreasing its rate only
after it senses packet loss, which implies that the network has
already become congested somewhere. This draft presents a general
mechanism of delaying ACKs in congested nodes to connect the traffic
control of TCP with that of network elements. This mechanism can be
considered an enhancement to the currently prevalent TCP and is
referred to as Fast-TCP.

The basic idea of Fast-TCP arises from the fact that controlling ACK
flows can indirectly affect the dynamics of TCP's behavior. An ACK
delay mechanism can be easily implemented in various network nodes,
where information on link utilization or buffer occupancy may be used
to signal congestion. Using an ACK delay policy at IP routers in a
complex network, we present simulation results showing that this
policy achieves good TCP throughput and reduces the buffer
requirement.

J. Ma and R. Zhang                                                 [1]

1. INTRODUCTION

In the Internet, a crucial issue is the high end-to-end propagation
delay of connections in long haul networks. TCP is a widely used
transport protocol in the Internet and is expected to be the dominant
protocol for broadband wide area networking by the year 2002. It is
one of the few transport protocols that has its own congestion control
mechanisms [1-2]. With an acknowledgment and time-out based congestion
control mechanism, its performance is inherently related to the
delay-bandwidth product of the connection.

TCP operates in the following way. It starts sending packets at a very
slow rate (slow start) and watches the acknowledgments (ACKs) from the
destination to see if any packets are lost. If none is lost (the
source receives the corresponding ACKs), it speeds up. It keeps
speeding up until a packet is lost, at which time it decreases its
rate (if a packet does not arrive successfully, a time-out based
approach is used to recover the lost packet). It keeps decreasing its
rate until no packet is lost. It then increases its rate again,
continually oscillating in this way. As a result, it loses packets in
one round trip time and under-utilizes the network in another. The
long recovery time of TCP degrades throughput. Therefore, if the TCP
control time cannot be shortened, TCP will cause major overloads and
outages on long haul networks.
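
The oscillation described above can be illustrated with a small
sketch. The following Python fragment is not part of the original
draft. It is a deliberately simplified model that assumes a fixed path
capacity in packets per round trip, a window that doubles every round
trip while no loss occurs, and a window that is halved once the
capacity has been exceeded. The names simulate_window, capacity and
rounds are illustrative only.

```python
def simulate_window(capacity, rounds, initial_window=1):
    """Return the per-round-trip window sizes of a simplified sender.

    Assumptions (illustrative, not taken from the draft): the window
    doubles each round trip while no loss occurs, and is halved in the
    round trip after it exceeded `capacity`, the number of packets the
    path can carry per round trip.
    """
    window = initial_window
    history = []
    for _ in range(rounds):
        history.append(window)
        if window > capacity:
            # Loss is noticed only after the path is already overloaded.
            window = max(1, window // 2)
        else:
            # No loss seen in this round trip: speed up.
            window = window * 2
    return history


if __name__ == "__main__":
    print(simulate_window(capacity=20, rounds=12))
```

With a capacity of 20 packets per round trip, the window in this model
settles into alternating between 32 (overload, packets lost) and 16
(path under-utilized), which mirrors the loss and under-utilization
cycle described in the text.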

To adapt the existing upper layer protocols to IP or IP/ATM networks,
many congestion control mechanisms [3-4] have been proposed and
analyzed in the past. However, to some degree they all appear to
suffer throughput reduction when congestion occurs in a long haul
network. In such an environment, major technological innovations in
the flow-control field may be desirable. The most important innovation
required for connections with high end-to-end propagation delay is to
reduce the control time.

Recently, various schemes [5-9] have emerged that modify TCP's ACK
streams using ABR feedback information to control window growth
behavior. These schemes suggest that throughput improvements are
possible by simply delaying the ACK packets, or spacing ACK intervals,
in the backward direction during congestion. Most notably, queue
oscillations can be smaller and packet drops can be prevented. This in
turn reduces timeouts and window reductions, improving overall
throughput. If TCP is using outgoing ABR connections, the actual
feedback received by the ABR connections (EFCI bits, ER values) can
also be incorporated into the delay computations. This implicit
coupling of the ABR and TCP control loops can reduce the amount of
buffering required at the edge devices. However, all the above schemes
are employed in the ABR service category and are located at the access
node to relieve network congestion.

In this draft, we extend the concept of delaying ACKs to wider areas,
e.g., IP networks and IP over ATM networks. This mechanism can be
considered an enhancement to the currently prevalent TCP and is
referred to as Fast-TCP or F-TCP. In general, IP networks differ from
ATM networks in a number of aspects, such as packet vs. cell and
connectionless vs. connection-oriented, which might influence the
implementation of this mechanism. For clarity, we divide the mechanism
into three functions: congestion detection, identifying ACKs, and
delaying ACKs.
We discuss the issues of implementing the mechanism in an IP router in
section 2. Similar mechanisms also work at the IP/ATM access node and
the ATM switch, respectively. As an illustration, we implement the
proposed congestion control scheme at IP routers in a complex network
and investigate its performance, giving a specific algorithm and
simulation results in section 3. It is shown that this scheme can
significantly reduce the buffer requirement without degrading TCP
throughput. Finally, the main contributions are summarized.

2. DELAYING ACKS AT CONGESTED NODE

For a TCP connection, congestion might occur in the nodes along the
path, which mostly results in packet loss and in turn degrades TCP
throughput. With the dramatic increase in LAN bandwidth and in
Internet service requests, the gateway is more likely than ever to
become the bottleneck of a network. Although each network element is
equipped with numerous traffic control and congestion control
mechanisms at various points and levels, these controls act
independently and hardly contribute to the TCP control loop.
Especially in long haul, multi-service networks, TCP performance might
degrade seriously. However, since TCP uses closed-loop control based
on the feedback of ACKs, congested nodes can avoid or quickly
alleviate congestion, as well as packet loss, by delaying ACKs. The
basic scheme is shown in Figure 1. The control loop is shortened by
using this mechanism in those intermediate nodes where congestion
occurs. This is the mechanism of F-TCP.

                         Intermediate Node
                     +------------------------+
                     |          Data          |
+---------+          | ++++++++++++++++++++++ *Congested +------------+
|   TCP   |- - - - ->| |||||||||||||||||||||***- - - - ->|    TCP     |
| Source  |          | ++++++++++++++++++++++ *          |Destination |
|         |          |                        |          |            |
|  Work-  |          | ++++++++++++++++++++++ |          |   Work-    |
| station |<- - - - -| ||||||||||||||||||     |<- - - - -|  station   |
+---------+ Delayed  | ++++++++++++++++++++++ |          +------------+
              ACK    |          ACK           |
                     +------------------------+

+-----------------------------------------------+
|               TCP control loop                |
+-----------------------------------------------+
+----------+
|  F-TCP   |
| control  |
|   loop   |
+----------+

         Figure 1: Control loop of F-TCP vs. TCP control loop

To implement the F-TCP mechanism, we divide it into three parts:
congestion detection, ACK identification, and delaying ACKs.
Congestion detection is used to signal congestion before packet loss
happens. ACK identification is required to separate the ACK flow from
normal data traffic. The rate of ACKs is then adjusted according to
the policy of delaying ACKs. In the remainder of this section, we
discuss the implementation of this mechanism in three different
network elements.

* Delaying ACKs at a Router

Routers are the key components in constructing the Internet. A router
is usually designed on a store-and-forward basis, which requires a
large buffer capacity. Accordingly, congestion can be detected by
watching the buffer occupancy and/or its rate of increase. Congestion
is signalled if the buffer occupancy (or its rate of increase) exceeds
a preset threshold. To identify an ACK, the router checks the ACK bit
in the TCP/IP packet; a packet with the ACK bit set is treated as an
ACK packet. However, ACK information might be carried in a data
packet, which must not be delayed. Thus, we need to separate the ACK
information from the data packet.
We could clear the ACK bit in the data packet and generate a new ACK
packet by copying the ACK information from the data packet. If the
node is congested and the mechanism is triggered, IP packets are
monitored and ACKs are filtered.

Moreover, another fact may invalidate the effect of F-TCP: IP networks
are connectionless, so ACK packets might travel a different path from
the forward data path. It seems difficult for routers to determine
whether a data packet and its returning ACKs share the same path.
However, in the Internet there are actually numerous situations in
which there is a unique path between two routers. Many local networks
or Intranets are connected to the Internet via a single router, where
the mechanism of delaying ACKs can be used. In particular, this
mechanism might be implemented in the router as an enhancement policy
that can be optionally enabled depending on where the router is
deployed.

The flow of ACKs is spaced according to the policy of delaying ACKs.
Two basic methods can be adopted here: rate-based and token-based. The
rate-based method explicitly calculates the leaking rate of ACKs, and
the token-based method calculates the number of tokens for sending
ACKs. Both methods are based on knowledge of resource conditions such
as spare buffer space or the rate of change of buffer occupancy.
Algorithms that explicitly avoid packet loss are still under open
discussion.

* Delaying ACKs at IP/ATM access nodes

At IP/ATM access nodes, congestion can also be detected by inspecting
buffer occupancy. ACKs are identified and separated in the same way as
at routers. The problem of ACKs and data taking different paths
becomes less important here, because an IP sub-network is usually
connected to an ATM network through a unique access node. Since ATM is
connection-oriented, link bandwidth information can be obtained and is
suitable for determining the policy of delaying ACKs. Particularly in
an ATM ABR network, the feedback of the allowed cell rate (ACR) can
greatly benefit the explicit control of delaying ACKs [7][8]. For
example, assume that at time t the ACR is equal to H; then the leaking
rate of ACKs is set to H in the normal state or to H/L in the
congested state, where L is a constant. Even in an ATM UBR network,
the mechanism of delaying ACKs can take effect. Indeed, enhancement
schemes are particularly needed in UBR networks because the UBR
service lacks traffic control.

* Delaying ACKs in ATM switch

In ATM networks, identifying an ACK is difficult because IP packets
are segmented into ATM cells, and an ATM switch forwards cells without
inspecting their payload. To identify an ACK packet, we would have to
implement an additional mechanism to inspect the payload and filter
the cells of ACKs, but this is usually unacceptable because of the
added complexity. However, given the connection-based nature of ATM,
we might simply adjust the flow rate of the backward connection
without filtering the ACK flow. Although this method may slow data
packets in the backward direction while delaying ACKs, this
degradation might be negligible when the switch is moderately
congested. Furthermore, F-TCP brings the distinct advantage of
resolving congestion quickly. In ATM networks, the policy of delaying
ACKs determines the delay values based on the link bandwidth, i.e.,
ACR information in ABR networks or leftover bandwidth in UBR networks.

3. F-TCP AT ROUTERS

In this section, we implement F-TCP at routers in a complex network
and investigate TCP performance through a simulation study.
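
Before turning to the simulation, the three functions discussed in
section 2 (congestion detection, ACK identification, and delaying
ACKs) can be summarized in a short sketch. The Python fragment below
is not part of the original draft and is only one possible reading of
it. The names classify_segment and AckPacer, the use of a raw TCP
segment without its IP header, and the numeric values in the usage
note are all assumptions made for illustration. The rate rule follows
the example given for IP/ATM access nodes: ACKs leak at H in the
normal state and at H/L under congestion.

```python
from dataclasses import dataclass

TCP_ACK_FLAG = 0x10  # ACK bit in the TCP flags byte (header byte 13)


def classify_segment(segment: bytes) -> str:
    """Classify a raw TCP segment (TCP header plus payload, no IP header).

    Returns "pure_ack" for an ACK without payload (may be delayed),
    "data_with_ack" for data carrying ACK information (must not be
    delayed as a whole), and "other" when the ACK bit is not set.
    """
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    header_len = (segment[12] >> 4) * 4  # data offset is in 32-bit words
    has_ack = bool(segment[13] & TCP_ACK_FLAG)
    has_payload = len(segment) > header_len
    if has_ack and not has_payload:
        return "pure_ack"
    if has_ack:
        return "data_with_ack"
    return "other"


@dataclass
class AckPacer:
    """Rate-based ACK delay policy (hypothetical helper, not from the draft)."""
    normal_rate: float    # H: ACKs per second in the normal state
    divisor: float        # L: constant applied under congestion
    threshold_bytes: int  # buffer occupancy above which congestion is signalled

    def congested(self, buffer_occupancy: int) -> bool:
        # Congestion detection by a simple occupancy threshold.
        return buffer_occupancy > self.threshold_bytes

    def leak_rate(self, buffer_occupancy: int) -> float:
        # H in the normal state, H/L in the congested state.
        if self.congested(buffer_occupancy):
            return self.normal_rate / self.divisor
        return self.normal_rate

    def ack_spacing(self, buffer_occupancy: int) -> float:
        # Seconds between consecutive ACKs released toward the source.
        return 1.0 / self.leak_rate(buffer_occupancy)
```

As a usage example with arbitrary values, AckPacer(normal_rate=100.0,
divisor=2.0, threshold_bytes=50000) releases ACKs at 100 per second
while the buffer holds 10000 bytes and at 50 per second once it holds
60000 bytes. A token-based variant would replace leak_rate with a
count of ACKs that may be sent in the current interval.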

3.1 Model and simulation parameters

   +-----+                     +-----+       +-----+
   | WS1 |                     | WS8 |       | WS6 |
   +-----+                     +-----+       +-----+
      ^                           ^             ^
      |                           |             |
      V                           V             V
  +-------+     +-------+     +-------+     +-------+    +-----+
  |Router1|<--->|Router2|<--->|Router3|<--->|Router4|<-->| WS7 |
  +-------+     +-------+     +-------+     +-------+    +-----+
      ^             ^             ^             ^
      |             |             |             |
      V             V             V             V
   +-----+       +-----+       +-----+       +-----+
   | WS2 |       | WS3 |       | WS4 |       | WS5 |
   +-----+       +-----+       +-----+       +-----+

                 Figure 2. Simulation configuration.

Figure 2 illustrates our simulation configuration, consisting of five
TCP sources, four routers, and three TCP destinations. The important
parameters of the simulation model are as follows:

* Sources and Destinations

  Sources: WS1, WS2, WS3, WS4 and WS5
  Destinations: WS6, WS7 and WS8
  TCP Initial Congestion Window: 1M bytes
  TCP Maximum Segment Size: 512 bytes
  TCP Receive Buffer: 2M bits

  TX_x and RX_x are the peers of a TCP connection; there are 10 TCP
  connections in the network. RX_x downloads a huge file from TX_x
  using the FTP service. All RXs send their FTP requests at the same
  time, at 10 sec. In detail:

  WS1 contains TX_1 and TX_2
  WS2 contains TX_3 and TX_4
  WS3 contains TX_5 and TX_6
  WS4 contains TX_7 and TX_8
  WS5 contains TX_9 and TX_10
  WS6 contains RX_1, RX_2, RX_3, RX_4, RX_9 and RX_10
  WS7 contains RX_7 and RX_8
  WS8 contains RX_5 and RX_6

* Routers

  Routers: Router1, Router2, Router3 and Router4
  IP Forwarding Rate: 1,200 packets/sec
  IP buffer for forward: 5M bits
  IP buffer for backward: 5M bits

* Links

  Link rate: 1.544 Mbits/sec
  Delay:
    WS1 to Router1:        1 ms
    WS2 to Router1:        1 ms
    WS3 to Router2:        9.5 ms
    WS4 to Router3:        9.5 ms
    Router3 to WS8:        9.5 ms
    WS5 to Router4:        25 ms
    Router4 to WS6:        1 ms
    Router4 to WS7:        9.5 ms
    Router1 to Router2:    1 ms
    Router2 to Router3:    1 ms
    Router3 to Router4:    1 ms

The simulation tool is OPNET.

3.2 Implementation of F-TCP

The F-TCP scheme is located at all routers.
It monitors traffic in the forward channel and controls the ACK output
rate in the backward channel. Instead of discarding packets or cells
in the forward path, the congested node delays ACKs in the backward
path and thus causes the TCP source to reduce its output rate. Since a
congested link naturally increases buffer occupancy, which may result
in buffer overflow, congestion can be detected by monitoring the
occupancy of the buffer. One of the simplest methods is to set a
threshold above which congestion is signalled. Here, identifying ACKs
is easy because no data packets travel on the backward paths. We also
use a simple policy of delaying ACKs: when no congestion occurs, ACKs
are leaked at a normal rate; otherwise, they are leaked at a fraction
of the normal rate. Here, we set a small threshold of 80 Kbytes and
set the normal rate equal to the rate of data packets in the forward
path. The fraction is set to one half, so that the rate is halved when
congestion is detected.

3.3 Simulation results and analysis

We observe the forward buffer occupancy of Router3 and Router4. The
simulation clearly shows that F-TCP reduces the forward buffer
occupancy. Without F-TCP, the forward queue length reaches a high
level and is therefore likely to result in buffer overflow. As a
result, the forward buffer capacity must be large enough to avoid
overflow and hence packet loss. In contrast, with F-TCP the forward
buffer occupancy remains stable at a low level. Thus, F-TCP can avoid
packet loss even with a small buffer. The stable queue length also
implies that the source traffic is effectively smoothed.

On the other hand, although F-TCP reduces the forward queue length, it
might increase the backward queue length. Therefore, we also examine
the results for the total buffer occupancy, that is, forward buffer
occupancy plus backward buffer occupancy.
The total buffer occupancy with F-TCP is much smaller than that
without F-TCP. In fact, since ACK packets are generally shorter than
data packets, they produce a relatively short queue even if they
accumulate in the backward buffer. Therefore, we conclude that F-TCP
can reduce the buffer requirement as well as smooth the source traffic
flows.

Finally, we investigate whether F-TCP reduces throughput when it
delays ACKs. The simulation records the received TCP sequence numbers
of RX_1 and RX_2, which are the destinations of two connections that
run from WS1 to WS6 and pass through all routers. The records show
that F-TCP has no negative effect on throughput. Although it might
sometimes slow down the ACK flow rate, it does not slow down the data
traffic on average. Moreover, since F-TCP can effectively avoid packet
loss, it is likely to achieve higher throughput than the case without
F-TCP.

4. CONCLUSIONS

In this draft we have presented a scheme of delaying ACKs to control
TCP flows in Internetworks, called Fast-TCP. We examined in general
the issues of implementing this scheme at routers, access nodes and
ATM switches. The challenges are to identify ACKs within data traffic
and to delay them. We presented an implementation of this scheme at IP
routers in a complex network and examined the variation of queue
length and TCP throughput. As the simulation results show, this scheme
is able to maintain high throughput while keeping the buffer
requirement small. The most distinctive advantage of F-TCP is that it
can relieve congestion quickly. In addition, the F-TCP scheme offers
an inexpensive way of giving the TCP source an early warning of
impending overload or congestion in the network. In particular, the
transport protocol TCP itself does not have to be amended in any way.

Note that delaying ACKs does not have a negative effect on the RTT
(Round Trip Time), because of the following observation.
The Fast-TCP mechanism acts only when congestion is detected, and it
can relieve the congestion as discussed above. Most of the time, the
system is in the normal state and no action is taken. Hence, the
normal RTT is not negatively affected. Because the unexpected problems
can be overcome and no extra RTT is needed, we conclude that the RTT
is not lengthened by the Fast-TCP mechanism.

Although in this draft we only present a simulation based on one
specific configuration, we believe that the F-TCP scheme has potential
in wider areas and will contribute to the improvement of TCP
performance in Internetworks. Further related work is ongoing, such as
parameter setting, performance evaluation, congestion control over the
whole link, and Internetworks with wireless links, multiple links or
asymmetric links. Some preliminary results can be found in [10]-[14].

REFERENCES

[1]  W. R. Stevens, "TCP/IP Illustrated, Volume 1: The Protocols,"
     Addison-Wesley, Reading, Massachusetts, 1994.

[2]  V. Jacobson, "Congestion avoidance and control," Proc. ACM
     SIGCOMM'88, pp. 314-329.

[3]  A. Romanow and S. Floyd, "Dynamics of TCP Traffic over ATM
     Networks," Proc. ACM SIGCOMM'94, pp. 79-88, August 1994.

[4]  R. Goyal, R. Jain, S. Kalyanaraman and S. Fahmy, "UBR+: Improving
     Performance of TCP over ATM-UBR Service," Proc. ICC'97, June
     1997.

[5]  S. Turner, "Maintaining high throughput during overload in ATM
     switches," Proc. INFOCOM'96, pp. 287-295, March 1996.

[6]  J. Ma, "Interworking Between TCP and ATM Flow Controls," ATM
     Forum/97-0960, December 1997.

[7]  A. Koike, "TCP flow control with ACR information," ATM
     Forum/97-0758R1, December 1997.

[8]  P. Narvaez and K.-Y. Siu, "An Acknowledgement Bucket Scheme for
     Regulating TCP Flow over ATM," Proc. of Globecom'97, November
     1997.

[9]  R. Satyavolu, et al., "Explicit rate control of TCP
     applications," ATM Forum/98-0152, February 1998.

[10] J. Ma, "A simple fast flow control for TCP/IP over satellite ATM
     network," Proc. of wmatm'98, 1998.

[11] P. Zhang and J. Ma, "Performance of TCP Traffic in the ATM
     Networks with asymmetric links," Proc. of LAN/WAN'98, May 1998.

[12] P. Zhang, L. Guo and J. Ma, "A simulation study of TCP
     performance over ADSL," Proc. of ECBIA'98, May 1998.

[13] P. Zhang, J. Wu, J. Ma and S. Cheng, "ACK delay control to
     improve TCP throughput over satellite links," Proc. of
     IEEE/ISCC'99, July 1999.

[14] Q. Wang, J. Wu, S. Cheng and J. Ma, "Fast TCP Flow Control with
     Different Services," Proc. of pacc'99, May 1999.

AUTHORS' ADDRESS

Jian Ma
Advanced Internet Technology
Nokia (China) Research Center
No. 11, He Ping Li Dong Jie
Beijing, 100013, PR China
Phone: +86 10 84229922
E-mail: jian.ma@nokia.com

Runtong Zhang
Advanced Internet Technology
Nokia (China) Research Center
No. 11, He Ping Li Dong Jie
Beijing, 100013, PR China
Phone: +86 10 84229922
E-mail: runtong.zhang@nokia.com