Internet Engineering Task Force                        Janardhan Iyengar 
INTERNET DRAFT                                    University of Delaware  
draft-iyengar-burst-mitigation-01.txt                        Mark Allman
Expires: July, 2006                                            ICIR/ICSI
                                                           Ethan Blanton
                                                       Purdue University
                                                           January, 2006

           TCP Burst Mitigation Through Congestion Window Limiting
                    draft-iyengar-burst-mitigation-01.txt

Status of this Memo

    By submitting this Internet-Draft, each author represents that any
    applicable patent or other IPR claims of which he or she is aware
    have been or will be disclosed, and any of which he or she becomes
    aware will be disclosed, in accordance with Section 6 of BCP 79.

    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF), its areas, and its working groups.  Note that
    other groups may also distribute working documents as
    Internet-Drafts.

    Internet-Drafts are draft documents valid for a maximum of six
    months and may be updated, replaced, or obsoleted by other documents
    at any time.  It is inappropriate to use Internet-Drafts as
    reference material or to cite them other than as "work in progress."

    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/ietf/1id-abstracts.txt.

    The list of Internet-Draft Shadow Directories can be accessed at
    http://www.ietf.org/shadow.html.

Copyright Notice

    Copyright (C) The Internet Society (2006).

Abstract

    This document describes Congestion Window Limiting (CWL), a method
    for mitigating micro-bursts in TCP by limiting the congestion window
    during interruptions in TCP's acknowledgment clock.

Terminology

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
    document are to be interpreted as described in [RFC2119].

    The reader is expected to be familiar with terminology from
    [RFC2581].


Iyengar, Allman, Blanton                                        [Page 1]

draft-iyengar-burst-mitigation-01.txt                       January 2006

1.  Introduction

    TCP dynamics and application sending patterns can cause a TCP sender
    to inject bursts into the network with potentially harmful effects
    for both the network and the sender.  Bursting can stress network
    queues causing loss in the bursting connection as well as in other
    flows sharing the stressed queues.  Bursting can also cause scaling
    on short timescales [JD03] and increase queueing delays in routers.
    This document draws from previously proposed burst mitigation
    techniques and presents one possible technique to reduce some of
    TCP's burstiness.

    In this document, we are concerned with one type of bursting which
    we call "micro-bursts".  Micro-bursts are generated by a TCP in
    response to changes in the cumulative acknowledgment point.  Each
    TCP segment carrying a cumulative acknowledgment (ACK) that slides
    the sender's transmission window allows previously unsent data
    segments to be transmitted (when application data is available).
    These segments are ideally transmitted at the line rate of the
    sender's network (assuming the host's CPU can produce packets fast
    enough).  We refer to such bursts of segments sent in response to
    receipt of a single ACK as "micro-bursts".

    TCP exhibits other bursting behaviors as well, which we collectively
    term "macro-bursts" since they tend to occur over longer
    timescales than micro-bursts.  Macro-bursts can be caused by several
    TCP and/or network phenomena, such as slow start [RFC2581] and ACK
    compression [ZSC91].  Although macro-bursts and their mitigation
    have also been the topic of much research ([AB05] briefly discusses
    this research), we limit ourselves to only micro-burst mitigation in
    this document.

    Several situations can cause micro-bursting:

      * Although TCP's cumulative ACK mechanism is robust to loss, ACK
        loss causes a TCP sender's transmission window to slide less
        often and by a larger amount each time, potentially triggering
        large micro-bursts in the process.

      * An application can send data in a bursty fashion, causing TCP to
        transmit micro-bursts.

      * Reordered ACKs cause an ACK stream that appears similar to an ACK
        stream with loss, causing similar micro-bursting.

      * In some cases, when a TCP sender exits fast recovery, a large
        number of segments are transmitted at line rate [FF96].  This
        dynamic occurs when the sender cannot transmit enough new
        segments during the recovery phase (e.g., due to ACK loss) and
        therefore stores "permission to send" until a cumulative ACK
        arrives.  This phenomenon is discussed in [FF96], where the
        "MaxBurst" mechanism is introduced to contain the consequent
        burst (see discussion in section 3).


    These and other causes of bursting are described in more detail in
    [JD03,AB05].
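
    As a rough illustration of the ACK-loss case above, the following
    sketch counts how many segments each arriving ACK releases.  It
    assumes a window-limited sender that always has data to send, so
    every newly acknowledged segment releases one new segment.  The
    function name and the ACK patterns are ours and are chosen only
    for illustration.

```python
def burst_sizes(acked_segments, ack_delivered):
    """Return the number of segments released by each ACK that arrives.

    acked_segments[i] is the number of new segments covered by the
    i-th ACK the receiver generates; ack_delivered[i] says whether
    that ACK reaches the sender.  Because ACKs are cumulative, an ACK
    that arrives also covers everything acknowledged by lost ACKs
    before it.
    """
    bursts = []
    pending = 0  # segments acknowledged only by ACKs that were lost
    for covered, delivered in zip(acked_segments, ack_delivered):
        pending += covered
        if delivered:
            bursts.append(pending)
            pending = 0
    return bursts


# Eight ACKs, each covering one segment, all delivered.
no_loss = burst_sizes([1] * 8, [True] * 8)
# The same ACK stream when only every fourth ACK survives.
with_loss = burst_sizes([1] * 8, [False, False, False, True] * 2)
print(no_loss)    # [1, 1, 1, 1, 1, 1, 1, 1]
print(with_loss)  # [4, 4]
```

    The same amount of data is released in both cases; ACK loss only
    concentrates it into fewer, larger micro-bursts.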

    In this document, we present one possible method for mitigating TCP
    micro-bursts called Congestion Window Limiting (CWL), which is based
    on work in [HTH01] and originally outlined in [AB05].  Alternate
    schemes have been proposed to mitigate the impact of micro-bursts,
    as discussed in section 3.  We note that the question of whether or
    not micro-bursts need mitigation remains open.  [JD03] suggests that
    TCP's bursting may need mitigation from the perspective of the
    network, while [BA05] suggests that micro-bursts often do not cause
    loss within the bursting connection.  By specifying a particular
    mitigation technique, this document aims to draw community
    attention to the issue of micro-bursts and to encourage discussion,
    further exploration, and experimentation in the area.

2.  Congestion Window Limiting (CWL)

    CWL introduces a new parameter called "BLimit", which represents the
    largest acceptable micro-burst a TCP should transmit.

    Each time an ACK is received that slides the transmission window,
    the congestion window (cwnd) modification (increase or decrease)
    procedures outlined in [RFC2581] MUST be applied. When using CWL,
    the following steps MUST be executed before any data is sent in
    response to the received ACK:

    (1) If cwnd > (FlightSize + BLimit), TCP will likely send a
        micro-burst, and steps (2) and (3) MUST be applied; otherwise,
        skip steps (2) and (3) and transmit data as usual.  When this
        condition holds, a micro-burst fails to occur only if too
        little application data is available to transmit.

    (2) If ssthresh < cwnd then ssthresh MUST be set to cwnd.

    (3) Set cwnd = (FlightSize + BLimit).

    After these steps, available application data should be transmitted
    as allowed by the cwnd and the receiver's advertised window.
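
    The steps above can be sketched as follows.  This is a minimal
    sketch rather than a reference implementation: the SenderState
    class and the apply_cwl function are hypothetical names, all
    quantities are in bytes, and we assume that the [RFC2581] cwnd
    update has already been applied and that FlightSize already
    reflects the arriving ACK.

```python
from dataclasses import dataclass


@dataclass
class SenderState:
    cwnd: int         # congestion window, bytes
    ssthresh: int     # slow start threshold, bytes
    flight_size: int  # FlightSize: bytes sent but not yet acknowledged


def apply_cwl(state: SenderState, blimit: int) -> bool:
    """Apply CWL steps (1)-(3); return True if cwnd was limited."""
    # Step (1): a burst larger than BLimit is possible only if cwnd
    # leaves room for more than BLimit bytes of new data.
    if state.cwnd <= state.flight_size + blimit:
        return False
    # Step (2): record the pre-reduction cwnd in ssthresh so that slow
    # start can later return to it.
    if state.ssthresh < state.cwnd:
        state.ssthresh = state.cwnd
    # Step (3): cap cwnd so at most BLimit new bytes can be sent now.
    state.cwnd = state.flight_size + blimit
    return True


MSS = 1460
s = SenderState(cwnd=20 * MSS, ssthresh=10 * MSS, flight_size=4 * MSS)
limited = apply_cwl(s, blimit=3 * MSS)
print(limited, s.cwnd // MSS, s.ssthresh // MSS)  # True 7 20
```

    In this example the ACK would have allowed a 16-segment burst;
    after CWL, 3 segments may be sent and ssthresh records the
    20-segment window for the subsequent slow start.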

    CWL controls bursts by reducing cwnd when the ACK clock is lost or
    interrupted to the point where the cumulative ACK will trigger a
    burst of segments in excess of BLimit.  History information
    maintained in ssthresh allows the connection to exponentially
    increase the cwnd (via slow start) back to the size before the
    reduction.

    BLimit SHOULD be chosen such that bursts are no larger than those
    allowed by [RFC3390].  From [RFC3390], we therefore choose:

        BLimit = min (4*MSS, max (2*MSS, 4380 bytes))            (1)

    If useful, BLimit MAY be smaller than allowed by equation (1).
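
    For concreteness, equation (1) can be evaluated for a few MSS
    values.  The function name and the MSS values below are chosen
    only for illustration.

```python
def blimit_upper_bound(mss: int) -> int:
    """Equation (1): min(4*MSS, max(2*MSS, 4380 bytes)), in bytes."""
    return min(4 * mss, max(2 * mss, 4380))


for mss in (536, 1460, 2190, 4000):
    limit = blimit_upper_bound(mss)
    print(mss, limit, limit // mss)
# 536 2144 4
# 1460 4380 3
# 2190 4380 2
# 4000 8000 2
```

    Each output line gives the MSS, the resulting BLimit in bytes, and
    the number of full-sized segments that fit within it.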

3.  Related Work

    CWL makes TCP congestion control more conservative and is therefore
    implicitly allowed by [RFC2581].

    Congestion Window Validation (CWV) [RFC2861] attempts to protect the
    network from a sender's incorrect or stale view of the available
    capacity along the path.  [RFC2861] recommends (i) not increasing
    the cwnd when it is not fully used by an application-limited sender,
    and (ii) decaying the cwnd after a sufficiently long idle period to
    avoid use of an unvalidated cwnd.  [RFC2861] suggests reducing the
    cwnd of an application-limited sender by half for each idle RTO
    interval.  While CWV can prevent micro-bursts in some situations,
    this is accidental and not part of the problem CWV is trying to
    solve.  CWL, on the other hand, aims at preventing micro-bursts by
    reducing the cwnd when appropriate, and in doing so, protects the
    network from an application-limited sender with stale cwnd
    information. CWL also prevents a cwnd from increasing during
    application-limited periods by limiting it to (FlightSize +
    BLimit). Note that CWL is more aggressive in reducing cwnd than
    [RFC2861].
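
    The difference in aggressiveness can be seen in a small sketch.
    It models only the halving per idle RTO interval summarized above,
    not the complete [RFC2861] procedure; the floor of one MSS and the
    function names are our assumptions.  The CWL case is a sender
    whose outstanding data has just been fully acknowledged
    (FlightSize = 0) and that has no further data to send.

```python
def cwv_decay(cwnd: int, idle_rtos: int, mss: int) -> int:
    """Halve cwnd once per idle RTO interval, never below one MSS."""
    for _ in range(idle_rtos):
        cwnd = max(cwnd // 2, mss)
    return cwnd


def cwl_limit(cwnd: int, flight_size: int, blimit: int) -> int:
    """cwnd after CWL step (3); unchanged if step (1) does not hold."""
    return min(cwnd, flight_size + blimit)


MSS = 1460
start = 32 * MSS

# cwnd in segments after 0, 1, 2 and 3 idle RTO intervals.
after_cwv = [cwv_decay(start, n, MSS) // MSS for n in range(4)]
# cwnd in segments once the last outstanding data is acknowledged.
after_cwl = cwl_limit(start, flight_size=0, blimit=3 * MSS) // MSS

print(after_cwv)  # [32, 16, 8, 4]
print(after_cwl)  # 3
```

    Under these assumptions CWL reduces cwnd to BLimit as soon as the
    pipe drains, while the halving takes several RTO intervals to
    reach a comparable value.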
    
    Several techniques have been proposed in the past for controlling
    micro-bursts, as follows:

      * As noted above, [FF96] introduces the "MaxBurst" mechanism.
        MaxBurst is an additional constraint that limits the number of
        data segments that can be transmitted in response to any given
        ACK.  

        CWL provides a single control for the amount of data a TCP
        connection can transmit into the network at any given point.
        This is arguably a clean approach to controlling the load
        imposed on the network.  On the other hand, by introducing a
        second control, MaxBurst provides for separation of concerns.
        In other words, limiting the sizes of micro-bursts is, in some
        sense, a different task than limiting the overall transmission
        rate to control network congestion; therefore, using two
        different controls may make sense.  A drawback of MaxBurst,
        however, is that the two transmission controllers may interact
        poorly, causing undesirable side effects.  When BLimit ==
        MaxBurst, CWL and MaxBurst perform similarly [AB05].

      * [HTH01] introduces an algorithm called "Use it or Lose it"
        (UI/LI) which modifies the cwnd to reflect the actual
        outstanding number of bytes, thereby controlling bursts in
        response to an ACK.  UI/LI is used in SCTP [RFC2960,SA+05] and
        provides the basis for CWL.  CWL extends UI/LI by modifying
        ssthresh and enabling a sender to slow start up to the last
        known safe cwnd (step (2) of the algorithm in section 2).  If
        ssthresh is not explicitly set as part of the burst mitigation
        process, the UI/LI algorithm is non-deterministic in its use of
        slow start after reducing cwnd.  [AB05] illustrates cases where
        slow start is used and cases where it is not used, simply
        depending on the state of the connection before UI/LI reduces
        the cwnd.

      * Rate-Based Pacing [VH97] limits the sending rate and prevents
        bursts by pacing data into the network until the ACK clock is
        established.  Although this solution can be very effective in
        burst mitigation in some cases, it requires a new timer and
        parameters for pacing out the data segments.
        Further, as shown in [AB05], there are cases where there is no
        natural "lull" in the connection into which segments can be
        nicely paced.  Therefore, the exact application of pacing
        requires more research.
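
    The window-based techniques in the list above can be contrasted in
    a short sketch.  It follows the characterizations given in this
    section only: MaxBurst caps what a single ACK may release without
    changing cwnd, UI/LI reduces cwnd without touching ssthresh, and
    CWL reduces cwnd and raises ssthresh.  The function names and the
    shared limit parameter are our assumptions, and all quantities are
    in segments.

```python
def maxburst(cwnd, flight_size, max_burst):
    """MaxBurst-style control: cap what this ACK may release.

    Returns (sendable_now, cwnd); cwnd itself is left untouched.
    """
    sendable_now = min(max(cwnd - flight_size, 0), max_burst)
    return sendable_now, cwnd


def uili(cwnd, ssthresh, flight_size, limit):
    """UI/LI-style control: shrink cwnd, leave ssthresh as it was."""
    if cwnd > flight_size + limit:
        cwnd = flight_size + limit
    return cwnd, ssthresh


def cwl(cwnd, ssthresh, flight_size, blimit):
    """CWL: shrink cwnd and raise ssthresh to the pre-reduction cwnd."""
    if cwnd > flight_size + blimit:
        ssthresh = max(ssthresh, cwnd)
        cwnd = flight_size + blimit
    return cwnd, ssthresh


def in_slow_start(cwnd, ssthresh):
    """[RFC2581] uses slow start while cwnd < ssthresh."""
    return cwnd < ssthresh


# cwnd = 20, FlightSize = 4, limit = 3.
sendable, cwnd_after = maxburst(20, 4, 3)
print(sendable, cwnd_after)  # 3 20

# The same reduction with two different prior values of ssthresh.
for ssthresh in (10, 5):
    u = uili(20, ssthresh, 4, 3)
    c = cwl(20, ssthresh, 4, 3)
    print(ssthresh, u, in_slow_start(*u), c, in_slow_start(*c))
# 10 (7, 10) True (7, 20) True
# 5 (7, 5) False (7, 20) True
```

    With UI/LI, whether slow start follows the reduction depends on
    the value ssthresh happened to hold beforehand; with CWL the
    sender always slow starts back toward the earlier cwnd.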

4.  Discussion

    We emphasize that the question of whether or not micro-bursts need
    mitigation remains open.  While this document provides the
    specification for one mitigation technique based on current
    knowledge, continued research on bursts and alternative mitigation
    mechanisms is strongly encouraged.

    Finally, we note that some TCP stacks may already implement some
    form of micro-burst mitigation, although the mechanisms used may not
    be well understood and have not been through IETF community
    review.  This document presents an initial step towards encouraging
    better-understood, community-reviewed micro-burst mitigation
    mechanisms.

5.  Security Considerations 

    This document calls for reducing the congestion window during loss
    of TCP's ACK clock.  An attacker can therefore reduce throughput of
    a TCP connection by causing ACK loss or reordering of data or ACKs.

6.  IANA Considerations

    None.

Acknowledgments

    Discussions with Sally Floyd have shaped some of the thinking that
    is contained in this document.

Normative References
    
    [RFC2119] S. Bradner.  Key words for use in RFCs to Indicate
        Requirement Levels, March 1997. BCP 14, RFC 2119.

    [RFC2581] M. Allman, V. Paxson, W. Stevens. TCP Congestion Control,
        April 1999. RFC 2581.


Informative References

    [RFC2861] M. Handley, J. Padhye, S. Floyd. TCP Congestion Window
        Validation, June 2000. RFC 2861.

    [RFC2960] R. Stewart, Q. Xie, K. Morneault, C. Sharp,
        H. Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang,
        V. Paxson. Stream Control Transmission Protocol, October 2000.
        RFC 2960.

    [RFC3390] M. Allman, S. Floyd, C. Partridge. Increasing TCP's
        Initial Window, October 2002. RFC 3390.

    [AB05] M. Allman, E. Blanton. Notes on Burst Mitigation for
        Transport Protocols. ACM Computer Communication Review, 35(2), 
        April 2005.

    [BA05] E. Blanton, M. Allman. On the Impact of Bursting on TCP
        Performance. Proceedings of the Workshop for Passive and Active
        Measurement, March 2005.
     
    [FF96] K. Fall, S. Floyd. Simulation-based Comparisons of Tahoe,
        Reno, and SACK TCP. Computer Communication Review, 26(3), July 
        1996.
     
    [HTH01] A. Hughes, J. Touch, J. Heidemann. Issues in TCP Slow-Start
        Restart After Idle. Internet draft 
        <draft-hughes-restart-00.txt>, December 2001 (expired).
        URL: http://www.isi.edu/touch/pubs/draft-hughes-restart-00.txt.

    [JD03] H. Jiang, C. Dovrolis. Source-Level IP Packet Bursts: Causes
        and Effects. In ACM SIGCOMM/Usenix Internet Measurement
        Conference, October 2003.

    [SA+05] R. Stewart, I. Arias-Rodriguez, K. Poon, A. Caro,
        M. Tuexen. SCTP Specification Errata and Issues. Internet draft
        <draft-ietf-tsvwg-sctpimpguide-16.txt>, October 2005 (work in
        progress).

    [VH97] V. Visweswaraiah, J. Heidemann. Improving Restart of
        Idle TCP Connections. Technical Report 97-661, University of
        Southern California, November 1997.

    [ZSC91] L. Zhang, S. Shenker, D. Clark. Observations on the
        Dynamics of a Congestion Control Algorithm: The Effects of
        Two-Way Traffic. ACM SIGCOMM, September 1991.

Authors' Addresses

    Janardhan Iyengar
    Protocol Engineering Lab, CIS Department
    University of Delaware
    103 Smith Hall
    Newark, DE 19716
    Email: iyengar@cis.udel.edu
    URL: http://www.cis.udel.edu/~iyengar/

    Mark Allman
    ICSI Center for Internet Research
    1947 Center Street, Suite 600
    Berkeley, CA 94704-1198
    Phone: (440) 235-1792
    Email: mallman@icir.org
    URL: http://www.icir.org/mallman/

    Ethan Blanton
    Purdue University Computer Sciences
    250 North University Street
    West Lafayette, IN  47907
    Email: eblanton@cs.purdue.edu
    URL: http://www.cs.purdue.edu/homes/eblanton/

Intellectual Property Statement

    The IETF takes no position regarding the validity or scope of any
    Intellectual Property Rights or other rights that might be claimed
    to pertain to the implementation or use of the technology described
    in this document or the extent to which any license under such
    rights might or might not be available; nor does it represent that
    it has made any independent effort to identify any such rights.
    Information on the procedures with respect to rights in RFC
    documents can be found in BCP 78 and BCP 79.

    Copies of IPR disclosures made to the IETF Secretariat and any
    assurances of licenses to be made available, or the result of an
    attempt made to obtain a general license or permission for the use
    of such proprietary rights by implementers or users of this
    specification can be obtained from the IETF on-line IPR repository
    at http://www.ietf.org/ipr.

    The IETF invites any interested party to bring to its attention any
    copyrights, patents or patent applications, or other proprietary
    rights that may cover technology that may be required to implement
    this standard.  Please address the information to the IETF at
    ietf-ipr@ietf.org.

Disclaimer of Validity

    This document and the information contained herein are provided on
    an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
    REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
    INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
    IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
    THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
    WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

    Copyright (C) The Internet Society (2006).  This document is subject
    to the rights, licenses and restrictions contained in BCP 78, and
    except as set forth therein, the authors retain all their rights.

Acknowledgment

    Funding for the RFC Editor function is currently provided by the
    Internet Society.
