Network Working Group                                H. Inamura (editor)
Internet-Draft                                          NTT DoCoMo, Inc.
Expires: August 30, 2002                          G. Montenegro (editor)
                                   Sun Microsystems Laboratories, Europe
                                                               R. Ludwig
                                                       Ericsson Research
                                                               A. Gurtov
                                                                  Sonera
                                                             F. Khafizov
                                                         Nortel Networks
                                                           March 1, 2002

   TCP over Second (2.5G) and Third (3G) Generation Wireless Networks
                       draft-ietf-pilc-2.5g3g-07

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on August 30, 2002.

Copyright Notice

Copyright (C) The Internet Society (2002). All Rights Reserved.

Abstract

This document describes a profile for optimizing TCP over second (2.5G) and third (3G) generation wireless networks. We describe the relevant characteristics, and discuss example deployments of the proposed profile for 2.5G and 3G wireless services. We then recommend TCP optimization mechanisms and discuss open issues. All the configuration options in this document are found in modern TCP stacks, are widely available, and may be used safely on the general Internet. Even hosts which serve predominantly non-wireless users may safely enable these options.

Table of Contents

1. Introduction
2. 2.5G and 3G Link Characteristics
   2.1 Data Rates
   2.2 Asymmetry
   2.3 Latency
   2.4 Delay Spikes
   2.5 Packet Loss Due to Corruption
   2.6 Intersystem Handovers
   2.7 Bandwidth Oscillation
3. 2.5G and 3G Deployments
   3.1 W-CDMA
   3.2 CDMA2000
   3.3 GPRS, HSCSD, and EDGE
4. TCP over 2.5G and 3G
   4.1 Large Window Size (Sender & Receiver)
   4.2 Large Initial Window (Sender)
   4.3 Limited Transmit (Sender)
   4.4 Large MTU
   4.5 Path MTU Discovery (Sender)
   4.6 Selective Acknowledgments (Sender & Receiver)
   4.7 Explicit Congestion Notification (Sender & Receiver)
   4.8 TCP Timestamp Option (Sender & Receiver)
   4.9 Disabling Van Jacobson TCP/IP Header Compression (Wireless Host)
   4.10 Summary
5. Open Issues
6. Security Considerations
7. Acknowledgements
References
Authors' Addresses
Full Copyright Statement

1. Introduction

The second generation cellular systems are commonly referred to as 2G. The 2G phase began in the 1990s when digital voice encoding replaced analog systems (1G). 2G systems are based on various radio technologies including frequency-, code- and time-division multiple access. Examples of 2G systems include GSM (Europe), PDC (Japan), and IS-95 (USA). Data links provided by 2G systems are mostly circuit-switched and have transmission speeds of 10-20 kbps uplink and downlink.

Demand for higher data rates, instant availability and data volume-based charging, as well as the lack of radio spectrum allocated for 2G, led to the introduction of 2.5G (GPRS, EDGE, PDC-P) and 3G (Wideband CDMA, cdma2000) systems. The radio technology of both Wideband CDMA (W-CDMA) (Europe, Japan) and cdma2000 (US) is based on code division multiple access, allowing for higher data rates and spectrum utilization than 2G systems. 3G systems provide both packet-switched and circuit-switched connectivity in order to address the quality of service requirements of conversational, interactive, streaming, and bulk transfer applications.

The transition to 3G is expected to be a gradual process. Initially, 3G will be deployed to introduce high capacity and high speed access in densely populated areas. Mobile users with multimode terminals will be able to utilize the existing coverage of 2.5G systems in the rest of the territory.

Much development and deployment activity has centered around 2.5G and 3G technologies. Along with objectives like increased capacity for voice channels, a primary motivation for these is data communication, and, in particular, Internet access. Accordingly, key issues are TCP performance and the several techniques which can be applied to optimize it over different wireless environments [1].
This document proposes a profile of such techniques (particularly effective for use with 2.5G and 3G wireless networks), derived from previous work at the IETF [30]. All the configuration options in this document are found in modern TCP stacks, are widely available, and may be used safely on the general Internet. Even hosts which serve predominantly non-wireless users may safely enable these options.

Two example applications of the recommendations in this document are:

o The WAP Forum [13] is an industry association that has developed standards for wireless information and telephony services on digital mobile phones. In order to address WAP functionality for high speed networks such as 2.5G and 3G networks, and to aim at convergence with Internet standards, the WAP Forum thoroughly revised its specifications. The resultant version 2.0 [18] adopts TCP as its transport protocol, and recommends TCP optimization mechanisms closely aligned with those described in this document.

o I-mode [24] is a wireless Internet service deployed on handsets in Japan. The newer version of i-mode runs on FOMA [25], an implementation of W-CDMA. I-mode over FOMA deploys the profile of TCP described in this document.

2. 2.5G and 3G Link Characteristics

Link layer characteristics of 2.5G/3G networks have significant effects on TCP performance. In this section we present various aspects of link characteristics unique to 2.5G/3G networks.

2.1 Data Rates

The main incentives for the transition from 2G to 2.5G to 3G are the increase in voice capacity and in data rates for the users. 2.5G systems have data rates of 10-20 kbps in uplink and 10-40 kbps in downlink. 3G systems are expected to have bit rates around 64 kbps in uplink and 384 kbps in downlink.
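
The bandwidth-delay product implied by these data rates can be estimated with a short calculation. The following Python sketch is illustrative only: it assumes RTT bounds of 0.2 s and 1 s (one reading of the "few hundred milliseconds to one second" range given in Section 2.3), takes the upper downlink rates quoted above, and counts 1 KB as 1000 bytes. The function name bdp_bytes is hypothetical.

```python
def bdp_bytes(rate_kbps: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes for a rate in kbps and an RTT in seconds."""
    return rate_kbps * 1000 / 8 * rtt_s

# Downlink rates come from this section; the RTT bounds are assumptions.
ASSUMED_RTT_LOW_S = 0.2
ASSUMED_RTT_HIGH_S = 1.0

for name, rate_kbps in (("2.5G", 40), ("3G", 384)):
    low = bdp_bytes(rate_kbps, ASSUMED_RTT_LOW_S)
    high = bdp_bytes(rate_kbps, ASSUMED_RTT_HIGH_S)
    print(f"{name}: {low / 1000:.1f} to {high / 1000:.1f} KB")
```

Under these assumptions the results fall within the BDP ranges this section works with; other RTT assumptions shift the bounds proportionally.
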
Considering the resulting bandwidth-delay product of around 1-5 KB for 2.5G and 8-50 KB for 3G, 2.5G links can be considered LTNs (Long Thin Networks [1]), and 3G links approach LFNs (Long Fat Networks [46]). For good TCP performance both LFNs and LTNs require maintaining a large enough window. For LFNs, a large window is necessary to utilize the available network bandwidth. LTNs need a window larger than required for 'filling the pipe,' in order to avoid retransmission timeouts in the presence of packet losses. This document recommends only standard mechanisms suitable both for LTNs and LFNs, and for any network in general. However, suggested experimental mechanisms can be targeted either for LTN [1] or LFN [46].

Data rates are dynamic due to effects from other users and from mobility. Arriving and departing users can reduce or increase the available bandwidth in a cell. Increasing the distance from the base station decreases the link bandwidth due to worsening link quality. Finally, by simply moving into another cell the user can experience a sudden change in available bandwidth. For example, if upon changing cells a connection experiences a sudden increase in available bandwidth, it can underutilize it, because during congestion avoidance TCP increases the sending rate slowly. Changing from a fast to a slow cell is normally handled well by TCP due to the self-clocking property. However, a sudden increase in RTT in this case can cause a spurious TCP timeout as described in Section 2.7. In addition, a large TCP window used in the fast cell can create overbuffering in the slow cell.

2.2 Asymmetry

2.5G/3G systems have built-in asymmetry in uplink and downlink data rates. The uplink data rate is limited by battery power consumption and complexity limitations of mobile terminals. However, the asymmetry does not exceed 3-6 times, and can be tolerated by TCP without the need for ACK congestion control or ACK filtering [47].

2.3 Latency

The latency of 2.5G/3G links is very high due to FEC combined with interleaving, processing delay and delays from the transport in the radio network [54]. A typical RTT varies from a few hundred milliseconds to one second. The high latency of 2.5G/3G links is due to the requirement of providing capacity in a wide coverage area. The associated radio channels suffer from difficult propagation environments, and powerful physical layer techniques need to be applied to enable resource-efficient communication. Besides interleaving techniques causing delays, these physical layer techniques require substantial processing power and therefore introduce additional delays. However, rapid improvements in all areas of wireless networks, ranging from radio layer techniques over signal processing to system architecture, will ultimately also lead to reduced delays in 3G wireless systems.

2.4 Delay Spikes

A delay spike is a sudden increase in the latency of the communication path. 2.5G/3G links are likely to experience delay spikes exceeding the typical RTT by several times, for the following reasons.

1. A long delay spike can occur during link layer recovery from a link outage due to temporary loss of radio coverage, for example, while driving into a tunnel or riding in an elevator.

2. During a handover the mobile terminal may have to perform some time-consuming actions before data can be transmitted in a new cell. Many wireless wide area networks in such a case try to provide seamless mobility, that is, they internally re-route packets from the old to the new base station, incurring additional delay.

3. Blocking by high-priority traffic may occur when an arriving circuit-switched call or higher priority data temporarily preempts the radio channel.
This happens because most terminals are not able to handle a voice call and a data connection simultaneously and suspend the data connection in this case. Additionally, a scheduler in the radio network can suspend a low-priority data transfer to give the radio channel to higher priority users.

Delay spikes can cause spurious TCP timeouts and unnecessary retransmissions.

2.5 Packet Loss Due to Corruption

Even in the face of data corruption, 2.5G/3G systems have a low rate of error losses thanks to link-level retransmissions. Justification for link layer ARQ is discussed in [10], [11]. In general, link layer ARQ and FEC can provide a packet service with a negligibly small probability of undetected errors (failures of the link CRC), and a low level of loss (non-delivery) for the upper layer traffic, i.e. IP. The loss rate of IP packets is low due to the ARQ, but the recovery in layer two appears as jitter to the higher layers.

2.6 Intersystem Handovers

It is likely that 3G systems will be used as a 'hot spot' technology in high population areas while 2.5G systems will provide lower speed data service elsewhere. This creates an environment where a mobile user can roam between 2.5G and 3G networks while keeping ongoing TCP connections. The inter-system handover is likely to trigger a high delay spike (Section 2.4), and can result in data loss. Additional problems arise because of context transfer, which is out of scope of this document, but is being addressed elsewhere in the IETF in activities addressing seamless mobility [48].

Intersystem handovers can adversely affect ongoing TCP connections since many features (e.g. window scaling) are negotiated at connection establishment and cannot be changed later. This is an especially valid concern if some mechanism specific to LTNs or LFNs is implemented by TCP.
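
Because window scaling is fixed at connection establishment, a host may want to pick the scale for the largest window it expects to need over the lifetime of the connection. The following Python sketch shows one way to compute the smallest sufficient shift count; it relies on the 16-bit TCP window field and on the upper bound of 14 for the shift count in the window scale option [5]. The function name min_window_shift and the example window sizes are hypothetical.

```python
MAX_UNSCALED_WINDOW = 65535   # largest value of the 16-bit TCP window field
MAX_SHIFT = 14                # upper bound on the window scale shift count

def min_window_shift(target_window_bytes: int) -> int:
    """Smallest shift count whose scaled window covers the target size."""
    for shift in range(MAX_SHIFT + 1):
        if MAX_UNSCALED_WINDOW * 2 ** shift >= target_window_bytes:
            return shift
    raise ValueError("target window exceeds what window scaling can express")

# A window sized for a 3G path fits without scaling; a larger one does not.
print(min_window_shift(50_000))
print(min_window_shift(256_000))
```

A connection opened in a 2.5G cell with a shift count chosen this way for the 3G case keeps the ability to grow its window after a handover to the faster network.
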

2.7 Bandwidth Oscillation

Given the limited RF spectrum, satisfying the high data rate needs of 2.5G/3G wireless systems requires dynamic resource sharing among concurrent data users. Various scheduling mechanisms can be deployed in order to maximize resource utilization. Time division sharing of these resources may result in TCP throughput degradation. Ideally, resources are allocated on an as-needed basis (bandwidth on demand) and released when there is no data to send. However, if multiple users transfer large amounts of data at the same time, the scheduler may have to repeatedly allocate and de-allocate resources for each user. In this section we refer to periodic allocation and de-allocation of high-speed channels as Bandwidth Oscillation. Bandwidth Oscillation effects such as spurious retransmissions were identified elsewhere (e.g., [17]) as factors that degrade throughput. Research studies [49] show that in some cases Bandwidth Oscillation can be the single most important factor in reducing throughput. When resources are released, the RTT suddenly increases and a low RTO may expire. This forces TCP to reduce the congestion window and enter Slow Start, while actually no TCP segments were lost.

The RTO computation algorithm [32] is intended to detect network congestion. It was designed to follow closely the round trip time (RTT), but is known to work poorly when delay variance is high [11]. Nevertheless, when there is very little variance in RTT and packet losses are rare, RTO converges to RTT.

For fixed TCP parameters the achievable throughput depends on the pattern of resource allocation. When the frequency of resource allocation and de-allocation is sufficiently high, there is no throughput degradation.
However, increasing the frequency of resource allocation/de-allocation may come at the expense of increased signaling, and, therefore, may not be desirable in systems which have limited capacity. Standards for 3G wireless technologies provide other mechanisms that can be used to combat the adverse effects of Bandwidth Oscillation. Furthermore, it is the consensus of the PILC WG that the best approach for avoiding adverse effects of Bandwidth Oscillation is proper wireless sub-network design [11].

In systems that do experience Bandwidth Oscillation, one can control throughput degradation by optimizing TCP parameters [49]. One obvious method is to adjust the computed RTO value (or configure appropriately the minimum RTO value) at the sending TCP. This technique, however, cannot be recommended as a practical solution. Experiments have shown that RTO algorithm implementations compliant with RFC2988 [32] (e.g., minimum RTO=1 sec and initial RTO=3 sec) reduce the number of spurious retransmissions. Although RTO timer management as specified in RFC2988 is not mandatory, restarting the retransmission timer when an ACK is received (section 5.3 of RFC2988) will further reduce (or even eliminate) spurious retransmissions. However, it may not be possible to eliminate them completely if secondary effects, such as TCP segment loss, combine with Bandwidth Oscillation.

Analysis of the RTO algorithm along with an alternative (Eifel) algorithm is presented in [17]. The Eifel algorithm requires the timestamp option and at least one RTO expiration before TCP "learns" that retransmissions were not necessary. The D-SACK option [26] also allows a TCP sender to detect spurious RTO expirations. Enabling the timestamp option enables increased RTT sampling, which can reduce spurious retransmissions due to Bandwidth Oscillation.
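
The interaction between the RTO estimator and a sudden RTT increase can be illustrated with a small simulation. The Python sketch below follows the update rules of RFC2988 [32] (smoothing gains of 1/8 and 1/4, a variance multiplier of 4, minimum RTO of 1 sec, initial RTO of 3 sec). The clock granularity of 0.1 s, the steady 500 ms RTT and the 2 s spike are assumptions chosen purely for illustration, and the class name RtoEstimator is hypothetical.

```python
class RtoEstimator:
    """Retransmission timer state updated with the rules of RFC2988."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variance
    K = 4           # variance multiplier

    def __init__(self, min_rto=1.0, initial_rto=3.0, granularity=0.1):
        self.min_rto = min_rto
        self.granularity = granularity   # clock granularity G (assumed value)
        self.srtt = None
        self.rttvar = None
        self.rto = initial_rto

    def sample(self, rtt):
        """Fold one RTT measurement (in seconds) into SRTT, RTTVAR and RTO."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated first, using the previous SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.rto = max(self.min_rto,
                       self.srtt + max(self.granularity, self.K * self.rttvar))
        return self.rto

est = RtoEstimator()
for _ in range(50):
    est.sample(0.5)            # steady 500 ms RTT (assumed)
steady_rto = est.rto
spike_rtt = 2.0                # hypothetical RTT after resources are released
print(f"RTO after steady RTT: {steady_rto:.2f} s")
print(f"Spike of {spike_rtt} s exceeds RTO: {spike_rtt > steady_rto}")
```

With a steady RTT the variance term decays and the RTO settles at the configured minimum, so any delay that exceeds that minimum expires the timer even though no segment was lost.
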
Other options that could reduce spurious RTO expirations due to Bandwidth Oscillation are to increase CWND and to reduce the delayed ACK timer at the receiving TCP to less than 100 ms (however, this technique may have side effects in case bandwidth is limited in the opposite direction).

3. 2.5G and 3G Deployments

This section provides further details on specific 2.5G/3G technologies, namely, Wideband CDMA (W-CDMA), CDMA2000 and GPRS. Other documents discuss fundamental technologies in more detail. For example, ARQ and FEC are discussed in [1], while justification for link layer ARQ is discussed in [9], [11].

3.1 W-CDMA

The International Telecommunication Union (ITU) has selected Wideband Code Division Multiple Access (W-CDMA) as one of the global telecom systems for the IMT-2000 3G mobile communications standard. W-CDMA specifications are created in the 3rd Generation Partnership Project (3GPP).

The link layer characteristics of the 3G network which have the largest effect on TCP performance over the link are error controlling schemes such as layer two ARQ (L2 ARQ) and FEC (forward error correction).

W-CDMA uses RLC (Radio Link Control) [2], a Selective Repeat and sliding window ARQ. RLC uses protocol data units (PDUs) with a 16 bit RLC header. The size of the PDUs may vary, but there is at least one implementation with a 336 bit PDU [25]. This is the unit for link layer retransmission. The IP packet is fragmented into PDUs for transmission by RLC. (For more fragmentation discussion, see Section 4.4.)

In W-CDMA, one to twelve PDUs (RLC frames) constitute one FEC frame, the actual size of which depends on link conditions and bandwidth allocation. The FEC frame is the unit of interleaving. It is the accumulation of PDUs for FEC that adds the latency mentioned in Section 2.3.

For reliable transfer, RLC has an acknowledged mode for PDU retransmission.
RLC uses checkpoint ARQ [9] with "status report" type acknowledgments: the poll bit in the header explicitly solicits the peer for a status report containing the sequence number that the peer acknowledges. The use of the poll bit is controlled by timers and by the size of available buffer space in RLC. Also, when the peer detects a gap between sequence numbers in received frames, it can issue a status report to invoke retransmission. RLC preserves the order of packet delivery.

The maximum number of retransmissions is a configurable RLC parameter that is specified by RRC [33] (Radio Resource Controller) through RLC connection initialization. The RRC can set the maximum number of retransmissions (up to a maximum of 40). Therefore, RLC can be described as an ARQ that can be configured for either HIGH-PERSISTENCE or LOW-PERSISTENCE, not PERFECT-PERSISTENCE, according to the terminology in [9].

Since the RRC manages RLC connection state, Bandwidth Oscillation (Section 2.7) can be eliminated by the RRC keeping RF resources allocated to an RLC connection that has data in its queue. This avoids resource de-allocation in the middle of transferring data.

In summary, the link layer ARQ and FEC can provide a packet service with a negligibly small probability of undetected error (failure of the link CRC), and a low level of loss (non-delivery) for the upper layer traffic, i.e. IP. Retransmission of PDUs by ARQ introduces latency and jitter to the IP flow. This is why the transport layer sees the underlying W-CDMA network as a network with a relatively large BDP (Bandwidth-Delay Product); typical values may range around 50 KB for a 384 kbps RF allocation.

3.2 CDMA2000

One of the Terrestrial Radio Interface standards for 3G wireless systems, proposed under the International Mobile Telecommunications-2000 umbrella, is cdma2000 [51].
It employs Multi-Carrier Code Division Multiple Access (CDMA) technology with a single-carrier RF bandwidth of 1.25 MHz. cdma2000 evolved from IS-95 [52], a 2G standard based on CDMA technology. The first phase of cdma2000 utilizes a single carrier and is designed to double the voice capacity of existing CDMA (IS-95) networks and to support always-on data transmission speeds of up to 316.8 kbps. At the physical layer, the standard allows transmission in 5, 10, 20, 40 or 80 ms time frames. Various orthogonal (Walsh) codes are used for channel identification and to achieve higher data rates.

Radio Link Protocol Type 3 (RLP) [53] is used with a cdma2000 Traffic Channel to support CDMA data services. RLP provides an octet stream transport service and is unaware of higher layer framing. There are several RLP frame formats. RLP frame formats with higher payload were designed for higher data rates. Depending on the channel speed, one or more RLP frames can be transmitted in a single physical layer frame.

RLP can substantially decrease the error rate exhibited by CDMA traffic channels. When transferring data, RLP is a pure NAK-based finite selective repeat protocol. The receiver does not acknowledge successfully received data frames. If one or more RLP data frames are missing, the receiving RLP makes several attempts (called NAK rounds) to recover them by sending one or more NAK control frames to the transmitter. Each NAK frame must be sent in a separate physical layer frame. When RLP supplies the last NAK control frame of a particular NAK round, a retransmission timer is set. If the missing frame is not received when the timer expires, RLP may try another NAK round. RLP may not recover all missing frames. If a frame is still missing after all NAK rounds, RLP supplies data with a missing frame to the higher layer protocols.
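
The effect of a finite number of NAK rounds on the loss seen by higher layers can be approximated with a simple probability model. The Python sketch below is not taken from the RLP specification: it assumes that every NAK triggers exactly one retransmitted copy, that copies are lost independently at the same frame error rate, that NAK frames themselves are never lost, and that the rounds send one, two and three NAKs. The function name residual_loss is hypothetical.

```python
def residual_loss(frame_error_rate: float, naks_per_round=(1, 2, 3)) -> float:
    """Probability that a frame is still missing after all NAK rounds.

    Simplified model (an assumption, not the RLP specification): each NAK
    yields one retransmitted copy, copies are lost independently, and NAK
    frames are never lost.
    """
    p_missing = frame_error_rate                   # original transmission lost
    for naks in naks_per_round:
        p_missing *= frame_error_rate ** naks      # every copy of this round lost
    return p_missing

for fer in (0.01, 0.1, 0.3):
    print(f"frame error rate {fer:.2f}: residual loss {residual_loss(fer):.2e}")
```

Even under these optimistic assumptions the residual loss is small but not zero, which is consistent with the statement that RLP may hand data with a missing frame to the higher layers.
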

3.3 GPRS, HSCSD, and EDGE

High Speed Circuit-Switched Data (HSCSD) and General Packet Radio Service (GPRS) are extensions of GSM providing high data rates for a user. Both extensions were developed first by ETSI and later by 3GPP. In GSM, a user is assigned one timeslot downlink and one uplink. HSCSD allocates multiple timeslots to a user, creating a fast circuit-switched link. GPRS is based on packet-switched technology that allows efficient sharing of radio resources among users and always-on capability. A GPRS terminal can utilize up to eight timeslots per direction. Additionally, several terminals can share timeslots. The GPRS network uses an updated base station subsystem of GSM as the access network; the GPRS core network includes the Serving GPRS Support Node (SGSN) and the Gateway GPRS Support Node (GGSN). The RLC protocol operating between a base station controller and a terminal provides ARQ capability over the radio link. The Logical Link Control (LLC) protocol between the SGSN and the terminal also has an ARQ capability utilized during handovers.

Enhanced Data for Global Evolution (EDGE) is a new modulation technique and new channel coding that increases throughput and capacity of the radio link. EDGE applied to GPRS (EGPRS) or HSCSD (ECSD) can increase the data rate threefold for a single user.

4. TCP over 2.5G and 3G

What follows is a set of recommendations for configuration parameters for protocol stacks which will be used to support TCP connections over 2.5G and 3G wireless networks. Some of these recommendations imply special configuration at the data receiver (frequently a stack at or near the wireless device), some at the data sender (frequently a host in the Internet or possibly a gateway or proxy at the edge of a wireless network), and some at both.

All the configuration options in this section are found in modern TCP stacks, are widely available, and may be used safely on the general Internet. Even hosts which serve predominantly non-wireless users may safely enable these options. System administrators are cautioned, however, that setting the MTU size (Section 4.4) and disabling Van Jacobson header compression (Section 4.9) could affect host efficiency, and changing such parameters should be done with care.

4.1 Large Window Size (Sender & Receiver)

TCP over 2.5G/3G should support appropriate window sizes based on the Bandwidth Delay Product (BDP) of the end-to-end path. The traditional TCP specification [38] limits the receiver window size to 64 KB. If the end-to-end BDP is expected to be larger than 64 KB, the window scale option [5] can overcome that limitation. If the estimated path BDP is larger than 64 KB, the window scale option may be used. Many operating systems by default use small TCP receive and send buffers of around 16 KB. Therefore, even for a bandwidth-delay product below 64 KB, the default buffer size setting should be increased at the sender and at the receiver to allow a large enough window.

4.2 Large Initial Window (Sender)

TCP controls its transmit rate using the congestion window mechanism. Traditionally, the initial value of the window is one segment. Because the delayed Ack mechanism is the standard, a TCP sender should have an increased initial congestion window of two segments [3]. This effectively cancels the delayed Ack by sending two segments at once in the first RTT of slow start, which helps avoid overhead in the initial phase of the connection. Furthermore, the increased initial window mechanism [4] is also effective, especially for the transmission of small amounts of data, which is the behavior commonly seen in such applications as Internet-enabled mobile wireless devices. For large data transfers, on the other hand, the effect of this mechanism is negligible.
[6] describes evaluations of this mechanism by measurements.

An initial congestion window size of two segments is recommended in RFC2581 [3]. RFC2414 [4] also considers the use of an initial window size larger than two segments. At the time of this writing, RFC2414 has been proposed to become a standards track RFC.

Because the delayed Ack mechanism is standard (RFC2581 [3]), and because the increased initial window option is especially effective for the small data transfers that are common for mobile wireless devices, TCP over 2.5G/3G should use an initial CWND (congestion window) of 2 segments. Senders may use a CWND larger than 2 segments as per the recommendation in RFC2414 [4].

4.3 Limited Transmit (Sender)

RFC3042 [27], Limited Transmit, extends Fast Retransmit/Fast Recovery for TCP connections with small congestion windows that are not likely to generate the three duplicate acknowledgements required to trigger Fast Retransmit. The mechanism calls for sending a new data segment in response to each of the first two duplicate acknowledgments that arrive at the sender. This mechanism is effective when the congestion window size is small or if a large number of segments in a window are lost. This may reduce the number of retransmissions triggered by the TCP retransmission timeout. Similar to the discussion in Section 4.2, this mechanism is useful when small amounts of data are to be transmitted. TCP over 2.5G/3G implementations should implement Limited Transmit.

4.4 Large MTU

One of the link layer parameters is the MTU (Maximum Transmission Unit). In TCP, the slow start mechanism tries to find an adequate rate for the network path. A larger MTU allows TCP to increase the congestion window faster [10], because the window is counted in units of segments. In links with high error rates, a smaller link PDU size increases the chance of successful transmission.
With layer two ARQ and transparent link layer fragmentation, the network layer can enjoy a larger MTU even in a relatively high BER (Bit Error Rate) condition. Without these features in the link, a smaller MTU is suggested.

TCP over 2.5G/3G should allow freedom for designers to choose MTU values ranging from small values (such as 576 bytes) to a large value that is supported by the type of link in use (such as 1500 bytes for IP packets on Ethernet). Designers are generally encouraged to choose large values.

4.5 Path MTU Discovery (Sender)

Path MTU discovery allows a sender to determine the maximum end-to-end transmission unit (without IP fragmentation) for a given routing path. RFC1191 [19] and RFC1981 [21] describe the MTU discovery procedure for IPv4 and IPv6 respectively. This allows TCP senders to employ larger segment sizes (without causing IP layer fragmentation) instead of assuming the small default MTU. TCP over 2.5G/3G implementations should implement Path MTU Discovery. Path MTU Discovery requires intermediate routers to support the generation of the necessary ICMP messages. RFC1435 [20] provides recommendations that may be relevant for some router implementations.

4.6 Selective Acknowledgments (Sender & Receiver)

The selective acknowledgment option (SACK), RFC2018 [7], is effective when multiple TCP segments are lost in a single TCP window [12]. In particular, if the end-to-end path has a large BDP and a certain packet loss rate, the probability of multiple segment losses in a single window of data increases. In such cases, SACK provides robustness beyond traditional and Reno TCP [8]. TCP over 2.5G/3G should support SACK.

In the absence of the SACK feature, the TCP may use NewReno RFC2582 [39] semantics.
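
The claim that multiple losses per window become more likely as the window grows can be checked with a short worked calculation. The Python sketch below assumes that segments are lost independently with a fixed probability, which is a simplification (losses on real paths are often correlated); the function name p_multiple_losses, the 1% loss rate and the window sizes are hypothetical.

```python
def p_multiple_losses(window_segments: int, loss_rate: float) -> float:
    """Probability that two or more segments of one window are lost,
    assuming independent losses with a fixed per-segment loss rate."""
    p_none = (1 - loss_rate) ** window_segments
    p_one = window_segments * loss_rate * (1 - loss_rate) ** (window_segments - 1)
    return 1 - p_none - p_one

# Compare small and large windows at an assumed 1% segment loss rate.
for window in (4, 16, 32):
    print(f"{window:2d} segments: {p_multiple_losses(window, 0.01):.4f}")
```

The probability grows quickly with the number of segments in flight, which is the situation in which SACK lets the sender repair several holes in one round trip.
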

4.7 Explicit Congestion Notification (Sender & Receiver)

Explicit Congestion Notification, RFC3168 [23], allows a TCP receiver to inform the sender of congestion in the network by setting the ECN-Echo flag upon receiving an IP packet marked with the CE bit(s). The TCP sender will then reduce its congestion window. Thus, the use of ECN is believed to provide performance benefits [22]. TCP over 2.5G/3G may support ECN. RFC3168 [23] also places requirements on intermediate routers (e.g. active queue management and setting of the CE bit(s) in the IP header to indicate congestion). Therefore, the potential improvement in performance can only be achieved when ECN capable routers are deployed along the path.

4.8 TCP Timestamp Option (Sender & Receiver)

Traditionally, TCPs collect one RTT sample per window of data [38]. This can lead to an underestimation of the RTT, and spurious timeouts on bandwidth-dominated paths. This holds despite a conservative retransmit timer such as the one proposed in RFC2988 [32]. In general, TCP connections with a window larger than eight segments require more frequent RTT measurements [44]. Timing every segment can be implemented with or without the TCP Timestamps option [5]. Using the TCP Timestamps option has the advantage that retransmitted segments can be used for RTT measurement, which is otherwise forbidden by Karn's algorithm [40]. Furthermore, the TCP Timestamps option is the basis for detecting spurious retransmits using the Eifel algorithm [17].

On paths where the packet transmission delay across the bottleneck link dominates the path's RTT, the queuing delay and hence the RTT can increase rapidly during the slow start phase. When RTT samples are collected once per window, the RTO value applied just before taking a new RTT sample can underestimate the current RTT.
Experiments using NS2 show that in this case a spurious timeout during fast recovery can occur even without any delay spike [42]. Enabling timestamps significantly increases the maximum delay spike tolerated by TCP without experiencing a spurious timeout.

In summary, timing every segment avoids the effect of an RTO lagging behind a rapidly increasing RTT. This decreases the likelihood of a spurious timeout. Additionally, timestamps reduce the likelihood of spurious timeouts due to bandwidth oscillation [49].

The only problematic issue about using timestamps seems to be the 12-byte overhead introduced by carrying the TCP Timestamps option and padding in the TCP header. For a small MTU size, this can be a considerable overhead. For example, for an MTU of 296 bytes the added overhead is 4%. For an MTU of 1500 bytes, the added overhead is only 0.8%.

Current TCP header compression schemes RFC1144 [43], RFC2507 [36] do not support compressing headers with options. Thus, using the TCP Timestamps option effectively disables header compression. The IETF is currently specifying a robust TCP/IP header compression scheme that supports TCP options [16][34].

The original definition of the timestamp option [5] specifies that duplicate segments below the cumulative ACK do not update the cached timestamp value at the receiver. This may lead to overestimating the RTT for retransmitted segments. A possible solution [45] allows the receiver to use a more recent timestamp from a duplicate segment. However, this suggestion allows for spoofing attacks at the TCP receiver. Therefore, careful consideration is needed in implementing this solution.

Recommendation: TCP SHOULD use the TCP Timestamps option. It allows for better RTT estimation, reduces the risk of spurious timeouts, and enables the detection of spurious retransmits using the Eifel algorithm.
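
The overhead figures quoted in Section 4.8 can be reproduced with a few lines of Python. The sketch below simply divides the 12 bytes of option and padding by the MTU; the function and constant names are illustrative, and the 576-byte case is an extra sample value rather than a figure from this section.

```python
# 12 bytes per segment: the TCP Timestamps option plus padding, as stated
# in Section 4.8.
TIMESTAMP_OPTION_BYTES = 12


def timestamp_overhead_percent(mtu_bytes: int) -> float:
    """Return the Timestamps option overhead as a percentage of the MTU."""
    if mtu_bytes <= 0:
        raise ValueError("mtu_bytes must be positive")
    return 100.0 * TIMESTAMP_OPTION_BYTES / mtu_bytes


if __name__ == "__main__":
    for mtu in (296, 576, 1500):
        print(f"MTU {mtu:4d} bytes: {timestamp_overhead_percent(mtu):.2f}% overhead")
```

The relative cost falls quickly as the MTU grows, which is consistent with the general encouragement to choose large MTU values.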
4.9 Disabling Van Jacobson TCP/IP Header Compression (Wireless Host)

The Van Jacobson TCP/IP header compression (VJC) algorithm RFC1144 [35] is negotiated between peer PPP layers. The algorithm was designed to increase application layer throughput by reducing packetization overhead [11]. For a TCP segment size of 1000 bytes, enabling VJC increases throughput by about 4%, if there is no packet loss.

It is well known (and has been shown with experimental data) that VJ TCP header compression does not perform well in the presence of packet losses [41], [49]. If a wireless link error is not recovered, it will cause TCP segment loss between peer PPP layers, and then VJ header compression does not allow TCP to take advantage of the Fast Retransmit and Fast Recovery mechanisms. The VJ header compression algorithm transmits not the TCP/IP headers but only the changes in the headers of consecutive segments. Therefore, loss of a single TCP segment on the link causes the transmitting and receiving TCP sequence numbers to fall out of sync. When a TCP segment is lost, none of the following segments will be forwarded by the link until the RTO expires [11].

As previously recommended in RFC3150 [10], VJ header compression should be disabled unless packet loss between peer PPP layers is very low. Other header compression schemes like RFC2507 [36] and Robust Header Compression [34] are meant to address deficiencies in VJ header compression. At the time of this writing, the IETF was working on multiple extensions to Robust Header Compression (negotiating Robust Header Compression over PPP, compressing TCP options, etc.) [50].

4.10 Summary
Items                                                Comments
------------------------------------------------------------------------
Large window size (sender & receiver)                based on end-to-end BDP
Window scale option (sender & receiver) [RFC1323]    Window size > 64KB
Large initial window (sender) [RFC2581]              (CWND = 2 segments)
Large initial window (sender) [RFC2414]              Optional for senders
                                                     (CWND > 2 segments)
Limited Transmit (sender) [RFC3042]
MTU larger than default IP MTU
Path MTU discovery [RFC1191, RFC1981]
Selective Acknowledgment option (SACK)
  (sender & receiver) [RFC2018]
Explicit Congestion Notification (ECN)
  (sender & receiver) [RFC3168]
Timestamp option (sender & receiver)
  [RFC1323, R.T.Braden's ID]
Disabling VJ TCP/IP Header Compression
  (wireless host) [RFC1144]

5. Open Issues

This section outlines additional mechanisms and parameter settings that may increase end-to-end performance when running TCP across 2.5G/3G networks. Note that, apart from the discussion of the RTO's initial value, those mechanisms and parameter settings are not part of any standards track RFC at the time of this writing. Therefore, they cannot be recommended for the Internet in general.

Link layer mechanisms for increasing TCP performance include enhanced TCP/IP header compression schemes [16], active queue management RFC2309 [15], link layer retransmission schemes [11], and holding on to packets during transient link outages [11].

Shortcomings of existing TCP/IP header compression schemes (RFC1144 [35], RFC2507 [36]) are that headers of handshaking packets (SYNs and FINs) and TCP option fields (e.g., SACK or timestamps) are not compressed. In fact, the presence of timestamps effectively disables header compression.
Although RFC3095 [34] does not yet address this issue, the IETF is developing schemes to compress TCP headers, including options such as timestamps and selective acknowledgements. In particular, if many short-lived TCP connections run across the link, the compression of the handshaking packets may greatly improve the overall header compression ratio.

Implementing active queue management is attractive for a number of reasons as outlined in RFC2309 [15]. One important benefit for 2.5G/3G networks is that it minimizes the amount of potentially stale data that may be queued in the network (for example, when a user is "clicking from page to page" before the download of the previous page is complete). Avoiding the transmission of stale data across the 2.5G/3G radio link saves transmission (battery) power, and increases goodput (the ratio of useful data over total data transmitted). Another important benefit of active queue management for 2.5G/3G networks is that it reduces the risk of a spurious timeout for the first data segment as outlined below.

Finding ways to avoid the path round-trip times required for TCP's connection setup and disconnect is particularly attractive for 2.5G/3G networks since these networks are commonly characterized by high delays. This would be particularly beneficial for short-lived, transactional (request/response-style) TCP sessions that typically result from browsing the Web from a smart phone. However, existing solutions such as T/TCP RFC1644 [14] have not been adopted due to known security concerns [31].

Spurious timeouts RFC3150 [10], packet re-ordering, and packet duplication may reduce TCP's performance. Thus, making TCP more robust against those events is desirable. Solutions to this problem have been proposed [17], [26], [37], and standardization work within the IETF is ongoing at the time of writing. Those solutions include
reversing congestion control state after such an event has been detected, and adapting the retransmission timer and duplicate acknowledgement threshold. The deployment of such solutions may be particularly beneficial when running TCP across wireless networks because wireless access links may often be subject to handovers and resource preemption, or the mobile transmitter may pass through a radio coverage hole. Such disrupting events may easily trigger a spurious timeout despite a conservative retransmission timer. Also, the mobility mechanisms of some wireless networks may cause packet duplication.

The algorithm for computing TCP's retransmission timer is specified in RFC2988 [32]. The standard specifies that the initial setting of the retransmission timeout value (RTO) should not be less than 3 seconds. This value might be too low when running TCP across 2.5G/3G networks. In addition to their high latencies, those networks may run at bit rates as low as about 10 kb/s, which results in large transmission delays. In this case, the RTT for the first data segment may easily exceed the initial TCP retransmission timer setting of 3 seconds. This would then cause a spurious timeout for that segment. Hence, in such situations it may be advisable to set TCP's initial RTO to a value larger than 3 seconds. Furthermore, due to the potentially large transmission delays, a TCP sender might choose to refrain from initializing its RTO from the RTT measured for the SYN, but instead take the RTT measured for the first data segment.

6. Security Considerations

In 2.5G/3G wireless networks, data is transmitted as ciphertext over the air and as cleartext between the Radio Access Network (RAN) and the core network.
IP security RFC2401 [29] or TLS RFC2246 [28] can be deployed by user devices for end-to-end security. The use of a transport gateway introduces conflicts with IPsec; however, TLS can be used in such architectures.

7. Acknowledgements

The authors would like to acknowledge the contribution to the text from the following individuals:

Max Hata, NTT DoCoMo, Inc. (hata@mml.yrp.nttdocomo.co.jp)
Masahiro Hara, Fujitsu, Inc. (mhara@FLAB.FUJITSU.CO.JP)
Joby James, Motorola, Inc. (joby@MIEL.MOT.COM)
William Gilliam, Hewlett-Packard Company (wag@cup.hp.com)
Alan Hameed, Fujitsu FNC, Inc. (Alan.Hameed@fnc.fujitsu.com)
Rodrigo Garces (rgarces2000@yahoo.com)
Peter Ford, Microsoft (peterf@Exchange.Microsoft.com)
Fergus Wills, Openwave (fergus.wills@openwave.com)
Michael Meyer (Michael.Meyer@eed.ericsson.se)

The authors gratefully acknowledge the valuable advice from the following individuals:

Gorry Fairhurst (gorry@erg.abdn.ac.uk)
Mark Allman (mallman@grc.nasa.gov)
Aaron Falk (falk@ISI.EDU)

References

[1] Montenegro, G., Dawkins, S., Kojo, M., Magret, V. and N. Vaidya, "Long Thin Networks", RFC 2757, January 2000.

[2] Third Generation Partnership Project, "RLC Protocol Specification (3G TS 25.322:)", 1999.

[3] Allman, M., Paxson, V. and W. Stevens, "TCP Congestion Control", RFC 2581, April 1999.

[4] Allman, M., Floyd, S. and C. Partridge, "Increasing TCP's Initial Window", RFC 2414, September 1998.

[5] Jacobson, V., Braden, R. and D. Borman, "TCP Extensions for High Performance", RFC 1323, May 1992.

[6] Allman, M., "An Evaluation of TCP with Larger Initial Windows", 40th IETF Meeting, TCP Implementations WG, December 1997.

[7] Mathis, M., Mahdavi, J., Floyd, S. and R. Romanow, "TCP Selective Acknowledgment Options", RFC 2018, October 1996.

[8] Fall, K. and S.
Floyd, "Simulation-based Comparisons of Tahoe, Reno, and SACK TCP", Computer Communication Review, 26(3), July 1996.

[9] Fairhurst, G. and L. Wood, "Link ARQ issues for IP traffic", Internet draft, November 2000.

[10] Dawkins, S. and G. Montenegro, "End-to-end Performance Implications of Slow Links", RFC 3150/BCP 48, July 2001.

[11] Karn, P., Falk, A., Touch, J., Montpetit, M., Mahdavi, J., Montenegro, G., Grossman, D. and G. Fairhurst, "Advice for Internet Subnetwork Designers", Internet draft, November 2000.

[12] Dawkins, S., Montenegro, G., Magret, V., Vaidya, N. and M. Kojo, "End-to-end Performance Implications of Links with Errors", RFC 3155/BCP 50, August 2001.

[13] Wireless Application Protocol, "WAP Specifications", 2001.

[14] Braden, R., "T/TCP -- TCP Extensions for Transactions", RFC 1644, July 1994.

[15] Braden, R., Clark, D., Crowcroft, J., Davie, B., Deering, S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, S., Wroclawski, J. and L. Zhang, "Recommendations on Queue Management and Congestion Avoidance in the Internet", RFC 2309, April 1998.

[16] IETF, "Robust Header Compression", 2001.

[17] Ludwig, R. and R. H. Katz, "The Eifel Algorithm: Making TCP Robust Against Spurious Retransmissions", ACM Computer Communication Review 30(1), January 2000.

[18] Wireless Application Protocol, "WAP Wireless Profiled TCP", WAP-225-TCP-20010331-a, April 2001.

[19] Mogul, J. and S. Deering, "Path MTU Discovery", RFC 1191, November 1990.

[20] Knowles, S., "IESG Advice from Experience with Path MTU Discovery", RFC 1435, March 1993.

[21] McCann, J., Deering, S. and J. Mogul, "Path MTU Discovery for IP version 6", RFC 1981, August 1996.

[22] Hadi Salim, J. and U. Ahmed, "Performance Evaluation of Explicit Congestion Notification (ECN) in IP Networks", RFC 2884, July 2000.

[23] Ramakrishnan, K., Floyd, S. and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001.

[24] NTT DoCoMo Technical Journal, "Special Issue on i-mode Service", October 1999.

[25] NTT DoCoMo Technical Journal, "Special Article on IMT-2000 Services", September 2001.

[26] Floyd, S., Mahdavi, J., Mathis, M. and M. Podolsky, "An Extension to the Selective Acknowledgement (SACK) Option for TCP", RFC 2883, July 2000.

[27] Allman, M., Balakrishnan, H. and S. Floyd, "Enhancing TCP's Loss Recovery Using Limited Transmit", RFC 3042, January 2001.

[28] Dierks, T. and C. Allen, "The TLS Protocol Version 1.0", RFC 2246, January 1999.

[29] Kent, S. and R. Atkinson, "Security Architecture for the Internet Protocol", RFC 2401, November 1998.

[30] Mitzel, D., "Overview of 2000 IAB Wireless Internetworking Workshop", RFC 3002, December 2000.

[31] de Vivo, M., O. de Vivo, G., Koeneke, R. and G. Isern, "Internet Vulnerabilities Related to TCP/IP and T/TCP", ACM Computer Communication Review 29(1), January 1999.

[32] Paxson, V. and M. Allman, "Computing TCP's Retransmission Timer", RFC 2988, November 2000.

[33] Third Generation Partnership Project, "RRC Protocol Specification (3GPP TS 25.331:)", September 2001.

[34] Bormann, C., Burmeister, C., Degermark, M., Fukushima, H., Hannu, H., Jonsson, L-E., Hakenberg, R., Koren, T., Le, K., Liu, Z., Martensson, A., Miyazaki, A., Svanbro, K., Wiebke, T., Yoshimura, T. and H. Zheng, "RObust Header Compression (ROHC): Framework and four profiles: RTP, UDP, ESP, and uncompressed", RFC 3095, July 2001.

[35] Jacobson, V., "Compressing TCP/IP Headers for Low-Speed Serial Links", RFC 1144, February 1990.

[36] Degermark, M., Nordgren, B. and S. Pink, "IP Header Compression", RFC 2507, February 1999.

[37] Blanton, E. and M.
Allman, "Using TCP DSACKs and SCTP Duplicate TSNs to Detect Spurious Retransmissions", Internet draft, August 2001.

[38] Postel, J., "Transmission Control Protocol - DARPA Internet Program Protocol Specification", RFC 793, September 1981.

[39] Floyd, S. and T. Henderson, "The NewReno Modification to TCP's Fast Recovery Algorithm", RFC 2582, April 1999.

[40] Karn, P. and C. Partridge, "Improving Round-Trip Time Estimates in Reliable Transport Protocols", ACM SIGCOMM 87, 1987.

[41] Ludwig, R., Rathonyi, B., Konrad, A. and A. Joseph, "Multi-layer tracing of TCP over a reliable wireless link", ACM SIGMETRICS 99, May 1999.

[42] Gurtov, A., "Making TCP Robust Against Delay Spikes", University of Helsinki, Department of Computer Science, Series of Publications C, C-2001-53, November 2001.

[43] Jacobson, V., "Compressing TCP/IP Headers for Low-Speed Serial Links", RFC 1144, February 1990.

[44] Stevens, W., "TCP/IP Illustrated, Volume 1: The Protocols", Addison Wesley, 1995.

[45] Braden, R., "TCP Extensions for High Performance: An Update", Internet draft, June 1993.

[46] Allman, M., Dawkins, S., Glover, D., Griner, J., Tran, D., Henderson, T., Heidemann, J., Touch, J., Kruse, H., Ostermann, S., Scott, K. and J. Semke, "Ongoing TCP Research Related to Satellites", RFC 2760, February 2000.

[47] Balakrishnan, H., Padmanabhan, V., Fairhurst, G. and M. Sooriyabandara, "TCP Performance Implications of Network Asymmetry", Internet draft, September 2001.

[48] Kempf, J., "Problem Description: Reasons For Performing Context Transfers Between Nodes in an IP Access Network", Internet draft, October 2001.

[49] Khafizov, F. and M. Yavuz, "Running TCP over IS-2000", Proc. of IEEE ICC 2002.

[50] Bormann, C., "ROHC over PPP", Internet draft, November 2001.

[51] TIA/EIA/cdma2000, "Mobile Station - Base Station Compatibility Standard for Dual-Mode Wideband Spread Spectrum Cellular Systems", Washington: Telecommunication Industry Association, 1999.

[52] TIA/EIA/IS-95 Rev A, "Mobile Station - Base Station Compatibility Standard for Dual-Mode Wideband Spread Spectrum Cellular Systems", Washington: Telecommunication Industry Association, 1995.

[53] TIA/EIA/IS-707-A-2.10, "Data Service Options for Spread Spectrum Systems: Radio Link Protocol Type 3", January 2000.

[54] Dahlman, E., Beming, P., Knutsson, J., Ovesjo, F., Persson, M. and C. Roobol, "WCDMA - The Radio Interface for Future Mobile Multimedia Communications", IEEE Trans. on Vehicular Technology, vol. 47, no. 4, pp. 1105-1118, November 1998.

Authors' Addresses

Hiroshi Inamura
NTT DoCoMo, Inc.
3-5 Hikarinooka
Yokosuka Shi, Kanagawa Ken 239-8536
Japan

EMail: inamura@mml.yrp.nttdocomo.co.jp
URI: http://www.nttdocomo.co.jp/

Gabriel Montenegro
Sun Microsystems Laboratories, Europe
29, chemin du Vieux Chêne
38240 Meylan
France

EMail: gab@sun.com

Reiner Ludwig
Ericsson Research
Ericsson Allee 1
52134 Herzogenrath
Germany

EMail: Reiner.Ludwig@Ericsson.com

Andrei Gurtov
Sonera
P.O. Box 970, FIN-00051
Helsinki, Finland

EMail: andrei.gurtov@sonera.com
URI: http://www.cs.helsinki.fi/u/gurtov/

Farid Khafizov
Nortel Networks
2201 Lakeside Blvd
Richardson, TX 75082, USA

EMail: faridk@nortelnetworks.com

Full Copyright Statement

Copyright (C) The Internet Society (2002). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.