PILC Working Group                                       Farid Khafizov
INTERNET DRAFT                                             Mehmet Yavuz
                                                        Nortel Networks
CATEGORY: Informational                                12 November 2001
Expires: May 2002


                      TCP over CDMA2000 networks
                 draft-khafizov-pilc-cdma2000-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Abstract

   The purpose of this document is to inform the Internet community of
   the TCP optimization techniques that were found useful for
   transmitting data over the CDMA2000 1x network. Our recommendations
   are based on lab measurements and simulations of FTP data
   transmissions over CDMA2000 1x systems. The same recommendations
   could be applicable to networks with similar characteristics (e.g.,
   W-CDMA, GPRS) and therefore could be incorporated into the existing
   document addressing performance implications of 2.5G/3G networks
   [3].

   This document does not specify an Internet standard and it does not
   propose any changes to an Internet standard.

0. Introduction

   The inadequacies of TCP/IP for various link layers (including
   satellite, token ring, X.25, ATM, terrestrial wireless, etc.) have
   been pointed out repeatedly over the last 20 years. The solutions
   proposed have taken on two flavors: modifications to the peer TCP
   stacks, and proxies.
   Both solution approaches have serious problems associated with them,
   and have not been generally adopted by the IETF. Only recently has
   the PILC WG been chartered to look into certain aspects of the
   link-related issues.

   At the time of writing, the IETF has been working on a number of
   recommendations for links with various characteristics that are
   present in CDMA2000 systems. For example, links with errors [1],
   networks with asymmetry [2], and 2.5G/3G wireless networks [3] have
   features similar to those of CDMA2000 networks.

   This document describes techniques that proved useful in improving
   TCP performance over CDMA2000 1x networks. The same techniques may
   be applicable to 2.5G/3G wireless networks (e.g., W-CDMA, GPRS) in
   general.

   Our view on 2.5G/3G wireless links is that three factors make them
   different from other links:

   - they are links with errors ([1] addresses this issue)
   - they experience (some) bandwidth asymmetry ([2] covers that part)
   - they experience "bandwidth oscillation" (see Section 2)

   A careful review of the mentioned documents [1, 2, 3] reveals that
   some recommendations are too generic in nature and may not be
   beneficial for CDMA2000 networks. In addition, one important factor,
   the bandwidth oscillation that is present in CDMA2000 systems, is
   not covered in those documents. Therefore we thought it would be
   appropriate to put our findings in a separate document.

   The rest of the document is organized as follows. Section 1
   discusses the effects of high bit error rates and related techniques
   for improving TCP performance. Section 2 discusses issues related to
   bandwidth oscillation. Possible effects of bandwidth asymmetry in
   CDMA2000 networks are discussed in Section 3. Section 4 describes
   each technique that was shown to improve performance. Section 5 is a
   summary of recommendations.

1. High Bit Error Rate

   The effects of a high bit error rate (BER) on TCP performance have
   been studied extensively in the literature as well as by the IETF
   community.
   The recent RFC 3155 [1] addresses performance issues of TCP running
   on top of links with errors. Because TCP algorithms were designed
   with wireline assumptions in mind, one obvious approach is to reduce
   the TCP segment error rate by implementing an error recovery
   mechanism at the link layer over the wireless access link. In
   CDMA2000 systems, Radio Link Protocol 3 (RLP3) [4] is deployed for
   this purpose. The RLP3 error recovery mechanism works quite well
   when physical link errors are independently and identically
   distributed [5].

   As a first line of defense against high BER, we recommend aggressive
   RLP3 retransmission settings. RLP3 provides ways of configuring
   retransmission settings. We do not discuss them here, as they are
   out of the scope of this document; instead, we refer the reader to
   [4, 5] for more details.

   In a mobile environment one can observe prolonged fading conditions.
   When that happens, RLP3 may not recover physical link errors
   completely, and the TCP segment error rate could remain high
   (>10E-4). In such cases, algorithms that assume a low bit error rate
   and are not strictly needed for TCP operation should be disabled.
   For example, we found that when the TCP segment error rate is
   relatively high, the benefits of Van Jacobson header compression
   (VJC) do not justify the retransmissions and the reduction in
   throughput caused by header sequence mismatch [5]. Similar findings
   were reported in [6]. Moreover, careful analysis shows that VJC
   effectively disables the Fast Retransmit algorithm (see Section
   4.2).

   The authors of [1] recommend "minimizing the amount of time that is
   spent unnecessarily in congestion avoidance" and the Fast
   Retransmit/Fast Recovery (FRR) mechanism as a way of achieving this
   goal because it allows "quick repair of loss without giving up the
   safety of congestion avoidance".
   Analysis of our simulation and field data logs supports this claim.
   To make error recovery by FRR more efficient, the authors of [1]
   recommend a window size large enough to force the receiver to send
   three duplicate acknowledgments before the retransmission timeout
   interval expires. We agree with this recommendation and found in our
   simulations and lab measurements that a window size of 2*DBP
   (delay-bandwidth product) works quite well. However, increasing the
   window size too much is dangerous: delayed segments may cause
   spurious retransmissions, which could occur quite frequently because
   of the bandwidth oscillation effect (see Section 2).

   The effectiveness of FRR can be further improved by configuring the
   sending TCP to launch the FRR mechanism upon receiving the second
   duplicate ACK. Some operating systems, such as Solaris 2.8 and
   Windows 2000, provide mechanisms for enabling such a configuration
   [7, 8].

   Changing the MTU size affects various aspects of data transmission
   and, therefore, could yield unexpected changes in TCP performance. A
   detailed discussion of the effects of changing the MTU size is in
   Section 4.6.

   Some TCP options, such as SACK [16, 17], proved very useful during
   simulations in helping TCP to recover from the effects of high BER
   [9], especially when there are multiple segment errors in a single
   window.

2. Bandwidth Oscillation

   For some (default) network configurations, bandwidth oscillation
   proved to be the single most significant factor in reducing
   throughput. The CDMA2000 1x standard, IS-2000.2 [10], provides means
   of transmitting data over two types of traffic channels: Fundamental
   (FCH) and Supplemental (SCH). The Fundamental channel has a fixed
   low bandwidth (e.g., 9.6 or 14.4 kbps). The bandwidth of the SCH is
   a multiple of that and could be as high as 32 times the FCH
   bandwidth. To simplify notation we denote the (SCH+FCH)/FCH
   bandwidth ratio by O. The FCH is always assigned before data
   transmission begins.
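
   As a rough illustration of the ratio O, the sketch below computes O
   for an SCH that runs at an integer multiple of the FCH rate and
   compares how long one segment takes to serialize with and without
   the SCH. The 9.6 kbps FCH rate and the 32x upper bound come from the
   text; the 16x multiple, the 1500-byte segment, and the function
   names are assumptions chosen for illustration, and link-layer
   framing and retransmission overhead are ignored.

```python
FCH_KBPS = 9.6  # FCH rate quoted in the text (14.4 kbps is the other example)


def bandwidth_ratio(sch_multiple: int) -> int:
    """O = (SCH + FCH) / FCH for an SCH running at sch_multiple * FCH."""
    return sch_multiple + 1


def serialization_delay_s(segment_bytes: int, rate_kbps: float) -> float:
    """Seconds needed to clock one segment onto a link of the given rate."""
    return segment_bytes * 8 / (rate_kbps * 1000)


if __name__ == "__main__":
    segment = 1500      # bytes; hypothetical segment size
    sch_multiple = 16   # hypothetical SCH assignment (the text allows up to 32)
    o = bandwidth_ratio(sch_multiple)
    fch_only = serialization_delay_s(segment, FCH_KBPS)
    in_burst = serialization_delay_s(segment, FCH_KBPS * o)
    print(f"O = {o}")
    print(f"FCH only : {fch_only:.3f} s per segment")
    print(f"FCH + SCH: {in_burst:.3f} s per segment")
```

   Under these assumptions a segment that needs about 1.25 s on the FCH
   alone is serialized O times faster while the SCH is assigned.
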
   The SCH is assigned on an as-needed basis. When the SCH is being
   used we say that the call is in burst. There are two types of SCH
   assignment, finite and infinite [11], which will be referred to as
   finite burst and infinite burst, respectively. Infinite burst means
   that the SCH can be used for transmitting data until a release
   command is issued. The finite burst mode of operation limits SCH
   usage to one of fourteen finite time intervals [11] before it must
   be released. We denote the duration of SCH allocation by B. After
   the SCH is released, it can be acquired again after a certain delay
   (D).

   One of the ways TCP detects congestion is RTO expiration. The RTO
   computation algorithm [12] was designed to follow the round trip
   time (RTT) closely, but is known to work poorly when delay variance
   is high [13]. During the high bandwidth period (FCH+SCH) the RTT is
   low and, if B is relatively long (e.g., 5.12 seconds), the RTO
   converges to the RTT. When the SCH is released, the RTT suddenly
   increases (proportionally to O) and the low RTO expires, forcing TCP
   into the Slow Start state although none of the TCP segments were
   actually lost.

                       B
              |<--------------->|
              |-----------------|      |-------------
              |                 |      |
       SCH    |                 |  D   |
        +     |                 |<---->|
       FCH ---|                 |------|

       FCH ---------------------------------------------------

   Figure 1. Bandwidth oscillation. Full cycle time is B+D. SCH and FCH
   are used for transmitting data for time B, then SCH is released and
   only FCH carries data for time D.

   An analysis of the RTO algorithm along with an alternative (Eifel)
   algorithm is presented in [14]. The Eifel algorithm requires the
   timestamp option and at least one RTO expiration before TCP "learns"
   that a retransmission was not necessary. The Limited Transmit
   algorithm proposed in [15] calls for sending a new data segment in
   response to each of the first two duplicate ACKs.
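
   The sender-side decision that Limited Transmit adds can be sketched
   as follows. This is a minimal illustration in units of whole
   segments, not code taken from any TCP stack; the function name, its
   arguments, and the returned action strings are hypothetical. The two
   send conditions (the receiver's advertised window permits the
   segment, and outstanding data stays within cwnd plus two segments)
   follow RFC 3042 [15].

```python
def sender_action(dup_acks: int, in_flight: int, cwnd: int, rwnd: int,
                  has_new_data: bool, dupthresh: int = 3) -> str:
    """Decide what the sender does on the dup_acks-th consecutive duplicate ACK.

    All sizes are in segments. cwnd itself is never modified here.
    """
    if dup_acks < 1:
        return "wait"
    if dup_acks == dupthresh:
        return "fast_retransmit"
    if dup_acks > dupthresh:
        return "in_fast_recovery"
    # Limited Transmit: one new segment per early duplicate ACK, if allowed.
    fits_rwnd = in_flight + 1 <= rwnd
    fits_cwnd_plus_two = in_flight + 1 <= cwnd + 2
    if has_new_data and fits_rwnd and fits_cwnd_plus_two:
        return "send_new_segment"
    return "wait"


if __name__ == "__main__":
    cwnd = in_flight = 4   # hypothetical: the window is full when a loss occurs
    rwnd = 16              # hypothetical receiver window
    for n in (1, 2, 3):
        action = sender_action(n, in_flight, cwnd, rwnd, has_new_data=True)
        if action == "send_new_segment":
            in_flight += 1  # the extra segment should elicit another duplicate ACK
        print(f"duplicate ACK #{n}: {action}")
```
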
   This approach "extends" cwnd, thereby increasing the chances that
   the third duplicate ACK arrives and triggers FRR. Although it does
   not eliminate spurious retransmissions, it reduces the chances of
   TCP falling into a full slow start phase.

   Simulation results as well as lab measurements suggest that when TCP
   parameters are fixed, the level of throughput degradation (and the
   achievable throughput) is a function of the burst parameters,
   notably B and D. For some combinations the degradation of throughput
   could reach 55%. When B and/or D are low, the throughput degradation
   is less severe. However, deploying CDMA2000 1x systems with low B
   and/or D values could be impractical: higher throughput is achieved
   when B is high, while signaling delays impose limits on reducing D.
   Avoiding the finite burst mode of operation is also not practical
   because limited RF resources require time-sharing of SCH resources
   (e.g., scheduling users).

   Therefore, we had to look for techniques to reduce spurious
   retransmissions due to bandwidth oscillation. It is important to
   note that some such techniques (e.g., minimizing ACK delays) may not
   be recommended by the IETF. For wireless data transmission, however,
   the benefits of those techniques could be far greater than their
   drawbacks.

   During our simulations and field experiments we found a number of
   techniques that helped us either eliminate spurious retransmissions
   completely or reduce them significantly. One obvious method was to
   adjust the computed RTO value (or to configure the minimum RTO value
   appropriately) at the sending TCP. This technique, however, cannot
   be recommended as a practical solution. Other, more practical
   methods are:

   - increase the initial RTT value at the sending TCP to at least
     3 seconds
   - increase the window size
   - enable the timestamp option
   - reduce the delayed ACK timer at the receiving TCP to < 100 ms
     (NOTE: this adjustment may have a side effect related to
     bandwidth asymmetry).

3. Bandwidth Asymmetry

   The CDMA2000 standard specifies physical layer operations between
   the Base Station (BS) and the Mobile Terminal (MT). In CDMA2000
   systems forward (BS->MT) and reverse (MT->BS) link transmissions are
   done on separate RF carriers. Although both links are designed
   similarly, effects of bandwidth asymmetry [2] can be observed when
   the latency of SCH allocation on the link where ACKs are sent is
   high. Consider, for example, forward link transmission with a
   relatively small MTU size. The SCH on the forward link will be
   opened when the amount of TCP data reaches a certain level. Once the
   SCH is allocated, more TCP segments will be received at the other
   end and hence more ACKs will be generated per unit of time. The
   higher volume of ACKs may require an SCH on the reverse link for
   timely delivery. If that SCH allocation is delayed, the effects of
   bandwidth asymmetry could adversely impact TCP throughput. Frame
   errors on the reverse link increase ACK delay and could further
   reduce TCP throughput.

   Currently we do not have sufficient information to quantify the
   effects of bandwidth asymmetry in CDMA2000 1x networks. Although we
   do not make any recommendations, we believe that keeping the MTU
   size high or allowing quick SCH allocation on the reverse link could
   reduce the adverse effects that bandwidth asymmetry may have.

4. Improving TCP Performance over CDMA2000 1x networks

   The following techniques were evaluated and proved helpful in
   improving FTP data transfer over CDMA2000 1x networks. This was
   especially true for the finite burst mode of operation.

4.1 Applying aggressive RLP3 retransmission settings

   Discussion of RLP settings is not in the scope of this document. We
   do, however, suggest that wireless infrastructure operators use RLP
   configuration tools as the first line of defense against high bit
   error rates, which will ultimately improve TCP performance.
   RLP3 [4] configurations allow the operator to choose between
   aggressive error recovery (and hence a lower TCP segment error rate)
   and lower delay over the RLP layer. Based on the analysis presented
   in [5], we recommend that operators choose aggressive retransmission
   settings.

4.2 Disabling TCP/IP header compression

   The Van Jacobson TCP/IP header compression (VJC) algorithm [18] is
   negotiated between peer PPP layers (Figure 2). This method increases
   application layer throughput by reducing packetization overhead. In
   the absence of TCP segment errors, for MTU=1000 Bytes, enabling VJC
   increases throughput by about 4%. However, in the wireless
   environment the TCP segment error rate is usually nonzero. Since VJC
   transmits not the TCP/IP headers but only the changes in the headers
   of consecutive segments, a segment error causes the transmitting and
   receiving TCP sequence numbers to go out of sync. When this happens
   the receiver stops transmitting ACKs for the remaining segments
   received after the segment in error, which effectively disables the
   fast retransmit mechanism. As a result, the RTO timer expires and
   all segments in the TCP window need to be retransmitted. This
   phenomenon was observed in the analysis of various measurements [9].

      |  TCP  |                                 | TCP |
      |  IP   |                    | IP  |      | IP  |
      |  PPP  |                    | PPP |      |     |
      | RLP3  |     | RLP3  |      |     |      |     |
      |IS-2000|     |IS-2000|      |     |      |     |
       -------       -------        -----        -----
         MT          BTS/BSC        PDSN      Application
                                                server

   Figure 2: Protocol stack for data transmission over CDMA2000
   networks

   Note that if VJC is enabled and a TCP segment is lost, none of the
   following segments will go through until the RTO expires. Hence
   there is no benefit in having the Fast Retransmit/Fast Recovery
   mechanism at the sending TCP.

4.3 Enable SACK

   When multiple segments are lost in a single TCP window, the
   cumulative acknowledgement scheme used in TCP usually results in
   timer expiration and unnecessary retransmissions.
   The Selective Acknowledgement (SACK) method [17] corrects this
   behavior by enabling the receiver to selectively acknowledge
   received segments. Correlated channel errors frequently occur in
   wireless channels due to channel fading, and this can result in
   multiple segment errors during one TCP window. Our simulations using
   ns [20] and lab measurements showed that the SACK option can improve
   the throughput of TCP over wireless networks [9].

4.4 Enable Time-stamp option

   Enabling the time-stamp option allows more frequent RTT sampling,
   hence the RTO can follow RTT changes faster. Our simulations show
   that this can prevent some of the spurious TCP retransmissions.

4.5 Increasing Congestion Window

   Increasing the congestion window to 2*DBP helps to reduce the
   negative effects of high BER as well as of bandwidth oscillation.
   The value 2*DBP will keep the "pipe full" even when the system
   detects congestion and shrinks cwnd by half. Increasing cwnd to
   higher values is not desirable because bandwidth oscillation may
   trigger premature RTO expiration and spurious retransmission of the
   whole window.

   A smaller window size is not desirable for two reasons. First, when
   a TCP segment is in error, a larger cwnd increases the chances that
   the sender receives a third duplicate ACK, which would trigger Fast
   Retransmit of the missing segment. Second, with a larger window the
   RTO timer is more conservative, which reduces the chances of
   spurious retransmissions.

4.6 Changing MTU size

   We recommend a high MTU size that stays below the maximum MTU size
   of the wireless infrastructure network.
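
   The packetization overhead behind this recommendation can be
   estimated with simple arithmetic. The sketch below assumes an
   uncompressed 40-byte TCP/IP header (20 bytes of IPv4 plus 20 bytes
   of TCP, no options) and ignores PPP and RLP framing; the function
   names are illustrative only. The 4% overhead it yields for MTU=1000
   Bytes is consistent with the VJC gain of about 4% quoted in Section
   4.2.

```python
TCP_IP_HEADER_BYTES = 40  # assumed: 20-byte IPv4 + 20-byte TCP header, no options


def header_overhead(mtu_bytes: int,
                    header_bytes: int = TCP_IP_HEADER_BYTES) -> float:
    """Fraction of each MTU-sized packet spent on TCP/IP headers."""
    return header_bytes / mtu_bytes


def payload_efficiency(mtu_bytes: int,
                       header_bytes: int = TCP_IP_HEADER_BYTES) -> float:
    """Fraction of each MTU-sized packet that carries application data."""
    return 1.0 - header_overhead(mtu_bytes, header_bytes)


if __name__ == "__main__":
    for mtu in (576, 1000, 1500):
        print(f"MTU {mtu:4d}: header overhead {header_overhead(mtu):.2%}, "
              f"payload {payload_efficiency(mtu):.2%}")
```

   Enabling TCP options such as time-stamps (Section 4.4) enlarges the
   per-segment header; that case can be explored by passing a larger
   header_bytes value.
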
   Increasing the MTU size (for example, from 576 to 1500 Bytes) has
   the following effects on TCP performance:

   - increase of throughput due to reduction in packetization overhead
     (especially when header compression is not used)
   - decrease in throughput due to higher probability of TCP segment
     error (because TCP segments are carried by smaller size physical
     layer frames)
   - may improve throughput by reducing effects of bandwidth asymmetry
     (see Section 3)
   - may improve throughput due to increased ACKs

   Our simulations and experiments suggest that when the TCP segment
   error rate is below 1%, the dominant factor is the packet overhead
   and a higher (e.g., 1500 Bytes) MTU size is preferred.

4.7 Fast Retransmit after 2nd D-ACK

   When a TCP segment is lost due to bad RF conditions, Fast Retransmit
   should be invoked to help recover the missing segment. Usually this
   is done after the third duplicate ACK. Configuring the sending TCP
   to start Fast Retransmit after the second duplicate ACK will
   increase the chances of avoiding RTO expiration.

4.8 Minimize ACK Delay

   Minimizing ACK delay enables the sending TCP to receive ACKs "as
   soon as possible", thereby reducing the probability of RTO timer
   expiration. A minimum ACK delay also speeds up slow start, which can
   help to recover from RTO timer expirations. Our simulations show
   some improvement in throughput when ACK delay is minimized.

5. Summary of recommendations

   This document outlines performance degradations of TCP over CDMA2000
   wireless networks. Several techniques have been evaluated for
   reducing the adverse effects of high bit error rate and bandwidth
   oscillation, the two main factors characterizing data transmission
   over CDMA2000 networks. FTP data transmission simulations as well as
   lab measurements were used to measure the effects of the proposed
   techniques. Table 1 summarizes our recommendations.

   -------------------------------------------------------------------
   |  Technique                        | Section | Where
   -------------------------------------------------------------------
   | 1. Apply aggressive link level    |   4.1   | RLP3 layer
   |    error recovery mechanism       |         |
   | 2. Disable Van Jacobson           |   4.2   | PPP over IS-2000
   |    Header Compression             |         |
   | 3. Enable SACK Option             |   4.3   | TCP S and TCP R
   | 4. Enable Time-Stamp Option       |   4.4   | TCP S and TCP R
   | 5. Increase cwnd = 2*DBP          |   4.5   | TCP R
   | 6. Adjusting MTU size             |   4.6   | TCP S
   | 7. Start Fast Retransmit after    |   4.7   | TCP S
   |    the 2nd D-ACK                  |         |
   | 8. Minimize ACK delay             |   4.8   | TCP R
   -------------------------------------------------------------------

   Table 1. Current recommendations for improving TCP performance over
   CDMA2000 networks. TCP S and TCP R denote sending and receiving TCP,
   respectively.

6. Security Considerations

   This document raises no security issues. IP security (RFC 2411 [21])
   or TLS (RFC 2246 [22]) can be deployed by user devices for
   end-to-end security.

7. Acknowledgements

   This work would not have been possible without the support we
   received from Brian Troup, Michael Anderson, Eric Jerumanis, Doug
   Klymyshyn, Allan Ding, Wei Lou and many other individuals.

8. References

   [1]  S. Dawkins et al., "End-to-end Performance Implications of
        Links with Errors", RFC 3155, August 2001

   [2]  H. Balakrishnan et al., "TCP Performance Implications of
        Network Path Asymmetry", draft-ietf-pilc-asym-06.txt,
        September 2001

   [3]  H. Inamura et al., "TCP over 2.5G and 3G Wireless Networks",
        draft-ietf-pilc-2.5g3g-04, October 17, 2001

   [4]  IS-707-A-1.10, PN-4541.10, "Data Service Option for Spread
        Spectrum Systems: Radio Link Protocol Type 3", Ballot
        Resolution Version, December 1999

   [5]  F. Khafizov and M. Yavuz, "Analytical Model of RLP in IS-2000
        CDMA Networks", submitted for publication

   [6]  A.C. Auge, J.P. Aspas, "TCP/IP Over Wireless Links:
        Performance Evaluation", IEEE VTC, vol. 3, pp. 1755-1759, 1998

   [7]  "Solaris Tunable Parameters Reference Manual", Sun
        Microsystems

   [8]  D. MacDonald and W. Barkley, "Microsoft Windows 2000 TCP/IP
        Implementation Details", White Paper, Microsoft Corp., 2000

   [9]  F. Khafizov, M. Yavuz, "TCP over IS-2000", submitted for
        publication

   [10] TIA/EIA/IS-2000.2-A, "Physical Layer Standard for cdma2000
        Spread Spectrum Systems", March 2000

   [11] TIA/EIA/IS-2000.5-A, "Upper Layer (Layer 3) Signaling Standard
        for cdma2000 Spread Spectrum Systems", March 2000 (see
        "Extended Supplemental Channel Assignment Message" Section)

   [12] V. Paxson, M. Allman, "Computing TCP's Retransmission Timer",
        RFC 2988, November 2000

   [13] P. Karn, "Advice for Internet Subnetwork Designers",
        draft-ietf-pilc-link-design-06.txt, July 2001

   [14] R. Ludwig, K. Sklower, "The Eifel Retransmission Timer", ACM
        Comp. Com. Rev., Vol. 30, #3, July 2000

   [15] M. Allman et al., "Enhancing TCP's Loss Recovery Using Limited
        Transmit", RFC 3042, January 2001

   [16] M. Mathis et al., "TCP Selective Acknowledgement Options",
        RFC 2018, October 1996

   [17] S. Floyd et al., "An Extension to the Selective
        Acknowledgement (SACK) Option for TCP", RFC 2883, July 2000

   [18] V. Jacobson, "Compressing TCP/IP Headers for Low-Speed Serial
        Links", RFC 1144, February 1990

   [19] J. Mogul and S. Deering, "Path MTU Discovery", RFC 1191,
        November 1990

   [20] K. Fall and K. Varadhan (Editors), "ns manuals", 2001,
        http://www.isi.edu/nsnam/ns/

   [21] R. Thayer, N. Doraswamy and R. Glenn, "IP Security Document
        Roadmap", RFC 2411, November 1998

   [22] T. Dierks and C. Allen, "The TLS Protocol Version 1.0",
        RFC 2246, January 1999

9. Authors' Addresses

   Farid Khafizov                 Tel:   +1 972 685 4331
   Nortel Networks                EMail: faridk@nortelnetworks.com
   2201 Lakeside Boulevard
   Richardson, Texas 75083
   United States of America

   Mehmet Yavuz                   Tel:   +1 972 684 5062
   Nortel Networks                EMail: myavuz@nortelnetworks.com
   2201 Lakeside Boulevard
   Richardson, Texas 75083
   United States of America