Internet Engineering Task Force                        Janardhan Iyengar
INTERNET DRAFT                                    University of Delaware
draft-iyengar-burst-mitigation-00.txt
Expires: May 31, 2006                                      Ethan Blanton
                                                       Purdue University

                                                             Mark Allman
                                                               ICIR/ICSI

                                                       November 30, 2005


        TCP Burst Mitigation Through Congestion Window Limiting
                 draft-iyengar-burst-mitigation-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This document describes Congestion Window Limiting, a method for
   mitigating micro-bursts in TCP by limiting the congestion window
   during loss of TCP's ACK clock.

Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The reader is expected to be familiar with terminology from
   [RFC2581].

1. Introduction

   TCP injects two kinds of bursts into the network, which we refer to
   as micro-bursts and macro-bursts.

   Micro-bursts: Each TCP segment carrying a cumulative acknowledgment
      (ACK) that slides the sender's transmission window allows
      previously unsent data segments to be transmitted (when
      application data is available). These segments are (ideally)
      transmitted at the line rate of the sender's network (assuming
      the host's CPU can produce packets fast enough). We refer to
      such bursts of segments sent in response to receipt of a single
      ACK as "micro-bursts".

   Macro-bursts: While in slow start [RFC2581], a TCP sender increases
      the congestion window (cwnd) by a factor of 1.5 to 2 per
      round-trip time (RTT). The exact factor depends on whether the
      TCP receiver generates delayed acknowledgments
      [RFC1122,RFC2581], on whether the sender employs byte counting
      [RFC3465], and on network dynamics. Bursts are also caused by
      ACK compression [ZSC91], where queueing dynamics in the network
      cause ACKs to arrive closer together than they were sent. We
      refer to bursts due to slow start and ACK compression as
      "macro-bursts" since they occur on longer timescales than
      micro-bursts.

   In this document, we consider only micro-bursts. Several situations
   can cause micro-bursting:

      * Although TCP's cumulative ACK mechanism is robust to loss, ACK
        loss causes a TCP sender's transmission window to slide less
        often but by a larger amount each time, potentially triggering
        large micro-bursts in the process.

      * An application can send data in a bursty fashion, causing TCP
        to transmit micro-bursts.

      * Reordered ACKs produce an ACK stream that appears similar to
        an ACK stream with loss, causing similar micro-bursting.

   These and other causes of bursting are described in more detail in
   [AB05, JD03].

   In this document, we present one possible method for mitigating TCP
   micro-bursts called Congestion Window Limiting, which is based on
   work in [HTH01] and was originally outlined in [AB05]. Alternative
   schemes have been proposed to mitigate the impact of micro-bursts,
   as discussed in Section 3.

   We note that the question of whether or not micro-bursts need
   mitigation remains open. [JD03] suggests that TCP's bursting may
   need mitigation from the perspective of the network, while [BA05]
   suggests that micro-bursts often do not cause loss within the
   bursting connection. This document, therefore, intends to draw
   community attention to the issue of micro-bursts and to generate
   discussion and further exploration in the area.

2. Congestion Window Limiting (CWL)

   The Congestion Window Limiting (CWL) algorithm first appeared in
   [AB05] and is based on work in [HTH01]. CWL introduces a new
   parameter called "MaxBurst", which represents the largest acceptable
   micro-burst a TCP should transmit.

   Each time an ACK is received, the cwnd modification (increase or
   decrease) procedures outlined in [RFC2581] MUST be applied. When
   using CWL, the following steps must be executed before any data is
   sent in response to the received ACK.

   (1) If cwnd > (FlightSize + MaxBurst), TCP will likely send a
       micro-burst, and steps (2) and (3) are used. (If this condition
       holds, the only case where a micro-burst will not occur is when
       not enough application data is available to transmit.) If the
       condition does not hold, data should be transmitted as usual
       (skipping steps (2) and (3)).

   (2) If ssthresh < cwnd, then ssthresh MUST be set to cwnd.

   (3) Set cwnd = (FlightSize + MaxBurst).

   After these steps, available application data should be transmitted
   as allowed by the cwnd and the receiver's advertised window.

   CWL controls bursts by clamping down cwnd when the ACK clock is lost
   or interrupted. History information maintained in ssthresh allows
   exponential cwnd increase (via slow start) as the sender
   re-establishes the ACK clock.

   MaxBurst SHOULD be chosen such that bursts are no larger than those
   allowed by [RFC3390].
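
   As an illustration only, steps (1) through (3) above can be sketched
   in Python as follows. The TcpState container, the apply_cwl function,
   and the field names are hypothetical and do not come from any TCP
   stack; all quantities are assumed to be in bytes, and the sketch
   assumes the [RFC2581] cwnd update for the ACK has already been
   applied. MaxBurst is passed in as a parameter.

```python
from dataclasses import dataclass


@dataclass
class TcpState:
    """Hypothetical container for the sender variables CWL reads (bytes)."""
    cwnd: int         # congestion window
    ssthresh: int     # slow start threshold
    flight_size: int  # FlightSize: data sent but not yet acknowledged


def apply_cwl(state: TcpState, max_burst: int) -> bool:
    """Apply CWL steps (1)-(3); call before sending data for this ACK.

    Returns True if cwnd was clamped, False if it was left unchanged.
    """
    limit = state.flight_size + max_burst

    # Step (1): only act if cwnd would permit a burst beyond MaxBurst.
    if state.cwnd <= limit:
        return False

    # Step (2): keep the pre-clamp cwnd as history in ssthresh.
    if state.ssthresh < state.cwnd:
        state.ssthresh = state.cwnd

    # Step (3): clamp cwnd so at most MaxBurst new bytes can be sent.
    state.cwnd = limit
    return True


# Example: 20 segments of cwnd, but only 4 segments in flight.
MSS = 1460
s = TcpState(cwnd=20 * MSS, ssthresh=10 * MSS, flight_size=4 * MSS)
clamped = apply_cwl(s, max_burst=3 * MSS)
# clamped is True; s.cwnd is now 7 * MSS and s.ssthresh is 20 * MSS.
```

   Returning a flag rather than the new cwnd is a choice made for this
   sketch so that a caller can tell whether the clamp fired.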
   From [RFC3390], we therefore choose:

      MaxBurst = min (4*MSS, max (2*MSS, 4380 bytes))              (1)

   If useful, MaxBurst MAY be smaller than allowed by equation (1).

3. Related Work

   Congestion Window Validation [RFC2861] recommends (i) not increasing
   the cwnd when it is not fully used by an application-limited sender
   and (ii) decaying the cwnd of an idle sender after a sufficiently
   long period to avoid use of an invalid cwnd. ([RFC2861] suggests
   reducing the cwnd of an application-limited sender by half for each
   idle RTO interval.) While it attempts to protect the network from a
   sender's incorrect or stale view of available bandwidth, [RFC2861]
   steers clear of micro-burst avoidance; [RFC2861] concerns itself
   with maintaining a "valid" cwnd and considers micro-burst avoidance
   an orthogonal problem.

   CWL prevents micro-bursts by reducing cwnd when appropriate, and in
   doing so, protects the network from an application-limited sender
   with stale cwnd information. CWL also prevents a cwnd from
   increasing during application-limited periods by limiting it to
   (FlightSize + MaxBurst). Note that CWL is more aggressive in
   reducing cwnd than [RFC2861].

   Several techniques have been proposed in the past for controlling
   micro-bursts. [FF96] introduces a simple mechanism that limits the
   number of segments sent in response to an ACK. [HTH01] introduces
   an algorithm called "Use it or Lose it" (UI/LI), which modifies the
   cwnd to reflect the actual number of outstanding bytes, thereby
   controlling bursts in response to an ACK. This algorithm is used in
   SCTP [RFC2960][SA+05] and provides the basis for CWL. CWL extends
   UI/LI by modifying ssthresh, which enables a sender to slow start up
   to the last known safe cwnd (step (2) in the algorithm above).
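
   To illustrate why step (2) matters, the following Python sketch
   counts how many ACKs a sender needs to grow back to its pre-clamp
   cwnd, with and without the ssthresh update. It is a simplified model
   and not a TCP implementation: the function names are hypothetical,
   every ACK is assumed to cover one full-sized segment, and the growth
   rules are the basic slow start and congestion avoidance increments of
   [RFC2581]. The "without step (2)" case simply leaves ssthresh
   untouched after the clamp; it is an assumption made for comparison
   and is not claimed to reproduce UI/LI exactly.

```python
def max_burst(mss: int) -> int:
    """Equation (1): MaxBurst = min(4*MSS, max(2*MSS, 4380 bytes))."""
    return min(4 * mss, max(2 * mss, 4380))


def acks_to_recover(cwnd: int, ssthresh: int, target: int, mss: int) -> int:
    """Count ACKs until cwnd >= target under simplified RFC 2581 growth.

    Slow start (cwnd < ssthresh): cwnd grows by MSS per ACK.
    Congestion avoidance: cwnd grows by MSS*MSS/cwnd per ACK.
    """
    acks = 0
    while cwnd < target:
        if cwnd < ssthresh:
            cwnd += mss
        else:
            cwnd += max(1, mss * mss // cwnd)
        acks += 1
    return acks


MSS = 1460
old_cwnd = 20 * MSS      # cwnd before the ACK clock was interrupted
old_ssthresh = 4 * MSS   # ssthresh before the clamp
flight_size = 2 * MSS    # data still outstanding

# Step (3): cwnd after the clamp (2920 + 4380 = 7300 bytes).
clamped = flight_size + max_burst(MSS)

# With step (2): ssthresh is raised to the old cwnd, so the sender slow
# starts all the way back up.
cwl_acks = acks_to_recover(clamped, max(old_ssthresh, old_cwnd),
                           old_cwnd, MSS)

# Without step (2): ssthresh stays low, so growth is linear per RTT.
no_step2_acks = acks_to_recover(clamped, old_ssthresh, old_cwnd, MSS)

print(cwl_acks, no_step2_acks)  # cwl_acks is 15; the second is far larger
```

   In this model the sender with step (2) regains its old cwnd after 15
   ACKs, while the sender without it stays in congestion avoidance and
   needs many more.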

   Pacing mechanisms such as Rate Based Pacing [VH97] limit the sending
   rate and prevent bursts by pacing data into the network until the
   ACK clock is established. This solution, however, requires a new
   timer, and the parameter values used to pace out the data require
   more research.

4. Discussion

   We emphasize that the question of whether or not micro-bursts need
   mitigation remains open. While this document provides one
   mitigation scheme based on current knowledge, continued research on
   bursts and alternative mitigation mechanisms is strongly encouraged.

   Finally, we note that some TCP stacks may already implement some
   form of micro-burst mitigation, although the mechanisms used may not
   be well understood and have not been through IETF community review.
   This document presents an initial step towards encouraging better
   understood and community-reviewed micro-burst mitigation mechanisms.

5. Security Considerations

   This document calls for reducing the congestion window during loss
   of TCP's ACK clock. An attacker can therefore reduce the throughput
   of a TCP connection by causing ACK loss or by reordering data
   segments or ACKs.

6. IANA Considerations

   None.

Acknowledgments

Normative References

   [RFC2119] S. Bradner. Key words for use in RFCs to Indicate
       Requirement Levels, March 1997. BCP 14, RFC 2119.

   [RFC2581] M. Allman, V. Paxson, W. Stevens. TCP Congestion Control,
       April 1999. RFC 2581.

Informative References

   [RFC2861] M. Handley, J. Padhye, S. Floyd. TCP Congestion Window
       Validation, June 2000. RFC 2861.

   [RFC3465] M. Allman. TCP Congestion Control with Appropriate Byte
       Counting (ABC), February 2003. RFC 3465.

   [AB05] M. Allman, E. Blanton. Notes on Burst Mitigation for
       Transport Protocols. ACM Computer Communication Review, 35(2),
       April 2005.

   [BA05] E. Blanton, M. Allman. On the Impact of Bursting on TCP
       Performance.
       Proceedings of the Workshop for Passive and Active Measurement,
       March 2005.

   [FF96] K. Fall, S. Floyd. Simulation-based Comparisons of Tahoe,
       Reno, and SACK TCP. Computer Communication Review, 26(3), July
       1996.

   [HTH01] A. Hughes, J. Touch, J. Heidemann. Issues in TCP Slow-Start
       Restart After Idle. Internet draft, December 2001 (expired).
       URL: http://www.isi.edu/touch/pubs/draft-hughes-restart-00.txt.

   [JD03] H. Jiang, C. Dovrolis. Source-Level IP Packet Bursts: Causes
       and Effects. In ACM SIGCOMM/Usenix Internet Measurement
       Conference, October 2003.

   [SA+05] R. Stewart, I. Arias-Rodriguez, K. Poon, A. Caro, M. Tuexen.
       SCTP Specification Errata and Issues. Internet draft, October
       2005 (work in progress).

   [VH97] V. Visweswaraiah and J. Heidemann. Improving Restart of Idle
       TCP Connections. Technical Report 97-661, University of
       Southern California, November 1997.

   [ZSC91] L. Zhang, S. Shenker, and D. Clark. Observations on the
       Dynamics of a Congestion Control Algorithm: The Effects of
       Two-Way Traffic. ACM SIGCOMM, September 1991.

Authors' Addresses

   Janardhan Iyengar
   Protocol Engineering Lab, CIS Department
   University of Delaware
   103 Smith Hall
   Newark, DE 19716
   Email: iyengar@cis.udel.edu
   URL: http://www.cis.udel.edu/~iyengar/

   Ethan Blanton
   Purdue University Computer Sciences
   250 North University Street
   West Lafayette, IN 47907
   Email: eblanton@cs.purdue.edu
   URL: http://www.cs.purdue.edu/homes/eblanton/

   Mark Allman
   ICSI Center for Internet Research
   1947 Center Street, Suite 600
   Berkeley, CA 94704-1198
   Phone: (440) 235-1792
   Email: mallman@icir.org
   URL: http://www.icir.org/mallman/

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2005). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.