TSVWG Working Group                                               L. Han
Internet-Draft                                                     Y. Qu
Intended status: Experimental                                     Huawei
Expires: September 4, 2018                                     T. Nadeau
                                                            Lucid Vision
                                                           March 3, 2018


        A New Congestion Control in Bandwidth Guaranteed Network
                         draft-han-tsvwg-cc-00

Abstract

   In bandwidth guaranteed networks, network resources are reserved
   before a TCP session starts transmitting data.  This draft proposes a
   new TCP congestion control algorithm used in bandwidth guaranteed
   networks.  It is an extension to the current TCP standards.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 4, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of




Han, et al.             Expires September 4, 2018               [Page 1]

Internet-Draft           New Congestion Control               March 2018


   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology and Notation
   3.  Bandwidth Guaranteed Network
   4.  New Congestion Control
     4.1.  Receiver Advertised Window Size
     4.2.  MinBandwidthWND and MaxBandwidthWND
     4.3.  Congestion Avoidance
     4.4.  Fast Retransmit and Fast Recovery
     4.5.  Timeout
     4.6.  Idle Recovery
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Acknowledgments
   Authors' Addresses

1.  Introduction

   The original IP protocol suite was designed to support best-effort
   data transmission.  As the Internet developed, congestion became a
   real problem.  To avoid congestion in the Internet, TCP uses
   congestion-avoidance algorithms to keep hosts from sending more
   traffic than the network can carry.  Over the past 40 years, various
   algorithms and optimizations have been proposed to solve this
   problem, including TCP-Reno [RFC5681], TCP-NewReno [RFC6582]
   [RFC6675], TCP-Cubic [RFC8312], and BBR
   [I-D.cardwell-iccrg-bbr-congestion-control].

   In bandwidth guaranteed networks, network resources are reserved
   before transmitting data.  This draft proposes a new congestion
   control algorithm that should be used in bandwidth guaranteed
   networks to improve TCP throughput.  The following is a list of key
   differences between this new algorithm and classic TCP congestion
   control [RFC5681]:

      It has no slow start.  After a TCP session is successfully
      established, its congestion window (cwnd) jumps to the window
      that corresponds to CIR, and the host is allowed to transmit
      data.  This is based on the assumption that network resources
      have been reserved in bandwidth guaranteed networks.

      During congestion avoidance, cwnd stays between the windows that
      correspond to CIR (Committed Information Rate) and PIR (Peak
      Information Rate).  If there is no packet loss due to congestion,
      cwnd levels off at the window that corresponds to PIR.

      OAM is used together with duplicate ACKs to detect whether a
      packet loss is due to congestion or random failure.

   This draft is organized as follows.  Section 2 defines the
   terminology used in this draft.  Section 3 provides background
   information on bandwidth guaranteed networks.  Section 4 explains
   the details of the new congestion control algorithm.

2.  Terminology and Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Some of the following terms are defined as in [RFC5681] and are
   copied here for readability.

      FULL-SIZED SEGMENT: A segment that contains the maximum number of
      data bytes permitted (i.e., a segment containing SMSS bytes of
      data).

      RECEIVER WINDOW (rwnd): The most recently advertised receiver
      window.

      CONGESTION WINDOW (cwnd): A TCP state variable that limits the
      amount of data a TCP can send.  At any given time, a TCP MUST NOT
      send data with a sequence number higher than the sum of the
      highest acknowledged sequence number and the minimum of cwnd and
      rwnd.

      SENDER MAXIMUM SEGMENT SIZE (SMSS): The SMSS is the size of the
      largest segment that the sender can transmit.  This value can be
      based on the maximum transmission unit of the network, the path
      MTU discovery [RFC1191, RFC4821] algorithm, RMSS (see next item),
      or other factors.  The size does not include the TCP/IP headers
      and options.

      RECEIVER MAXIMUM SEGMENT SIZE (RMSS): The RMSS is the size of the
      largest segment the receiver is willing to accept.  This is the
      value specified in the MSS option sent by the receiver during
      connection startup.  Or, if the MSS option is not used, it is 536
      bytes [RFC1122].  The size does not include the TCP/IP headers and
      options.

      INITIAL WINDOW (IW): The initial window is the size of the
      sender's congestion window after the three-way handshake is
      completed.

      RESTART WINDOW (RW): The restart window is the size of the
      congestion window after a TCP restarts transmission after an idle
      period.

      ssthresh: Slow Start Threshold.

      OAM: Operations, Administration, and Maintenance.

      RTT: Round-Trip Time.

      CIR: Committed Information Rate.

      PIR: Peak Information Rate.

3.  Bandwidth Guaranteed Network

   New applications such as AR/VR require the network to provide
   bandwidth guaranteed services.  Various solutions exist, including
   out-of-band signaling protocols such as RSVP [RFC2205] and NSIS
   [RFC4080], and in-band signaling as proposed in
   [I-D.han-6man-in-band-signaling-for-transport-qos].  The common
   objective of all these solutions is to have network resources and
   bandwidth reserved before data is transmitted.  The details of how
   the resources are reserved are out of the scope of this draft.  It
   is assumed, however, that in bandwidth guaranteed networks, network
   resources (bandwidth, queues, etc.) have been dedicated to the TCP
   flows, and that data is guaranteed up to the CIR rate.  When the
   data rate is between CIR and PIR, shared resources are used, and
   traffic above the CIR rate is not guaranteed.  No traffic above the
   PIR rate is allowed to enter the network.
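
   The three rate bands described above can be illustrated with a
   minimal sketch.  It is not part of the proposal: the function name
   classify_rate and the band labels are hypothetical, and rates are
   assumed to be expressed in bits per second.

```python
def classify_rate(rate_bps: float, cir_bps: float, pir_bps: float) -> str:
    """Return the service band for a sending rate (labels are hypothetical)."""
    if not 0 < cir_bps <= pir_bps:
        raise ValueError("expected 0 < CIR <= PIR")
    if rate_bps <= cir_bps:
        return "guaranteed"       # carried on the reserved resources
    if rate_bps <= pir_bps:
        return "not-guaranteed"   # excess over CIR uses shared resources
    return "not-admitted"         # above PIR: not allowed into the network
```

   For example, with CIR = 100 Mbit/s and PIR = 200 Mbit/s, a sender at
   150 Mbit/s falls into the band that uses shared resources.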

   The proposed congestion control also requires that OAM (Operations,
   Administration, and Maintenance) is used to constantly report on the
   network condition parameters.  Before a TCP session is started, OAM
   needs to detect important network parameters, such as the number of
   hops and the Round-Trip Time (RTT).  This might be done by setting
   up a measuring TCP connection.  The measuring TCP connection carries
   no user data and is used only to measure the key network parameters.
   As the network status is constantly changing, these parameters need
   to be updated after a TCP session is established.  This requires a
   sender to periodically or continuously embed OAM information in TCP
   data packets [I-D.han-6man-in-band-signaling-for-transport-qos]
   [I-D.ietf-ippm-ioam-data] to detect the current buffer depth, RTT,
   etc.

   It is important that OAM is able to detect whether any device's
   buffer depth has exceeded the pre-configured threshold, as this is
   an indication of potential congestion and packet drop.  When this
   happens, OAM should send a possible congestion alarm to the TCP
   sender.  If the retransmit timer expires on this TCP sender and a
   possible congestion alarm has been received, a packet has been
   dropped due to congestion.  Otherwise, the packet drop might be due
   to some physical failure.  The OAM details are out of the scope of
   this draft.  Please refer to other related drafts.
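
   The alarm condition can be sketched as follows.  This is only an
   illustration under stated assumptions: the draft does not define how
   buffer depths are reported, so the function name congestion_alarm,
   the list of per-device depths, and the single shared threshold are
   all hypothetical.

```python
from typing import Iterable


def congestion_alarm(buffer_depths: Iterable[int], threshold: int) -> bool:
    """Return True if any device on the path reports a buffer depth
    above the pre-configured threshold (hypothetical data model)."""
    return any(depth > threshold for depth in buffer_depths)
```

   A sender-side implementation would record the result and consult it
   when a loss is detected.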

   In summary, in bandwidth guaranteed networks, resources are reserved
   before data is transmitted, and OAM is used to collect network
   statistics.  The new congestion control proposed in this draft is
   intended for this kind of network.

4.  New Congestion Control

   [RFC5681] defines a set of TCP congestion control algorithms: slow
   start, congestion avoidance, fast retransmit, and fast recovery.
   The congestion control proposed in this draft is an extension to
   [RFC5681] and differs only in the congestion control algorithm on
   the sender side.

4.1.  Receiver Advertised Window Size

   The receiver's advertised window (rwnd) is a receiver-side limit on
   the amount of outstanding data, so a sender should not send more
   data than this window allows.  It is calculated as follows:

     rwnd = AdvertisedWND = MaxRcvBuffer - (LastByteRcvd - LastByteRead)
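
   As a worked example of this formula, the sketch below computes rwnd
   in bytes.  The function and parameter names are illustrative only,
   and the buffer figures are made-up values.

```python
def advertised_window(max_rcv_buffer: int, last_byte_rcvd: int,
                      last_byte_read: int) -> int:
    """rwnd = MaxRcvBuffer - (LastByteRcvd - LastByteRead), in bytes."""
    buffered = last_byte_rcvd - last_byte_read  # received but not yet read
    return max_rcv_buffer - buffered


# 64 KiB receive buffer, 20000 bytes received, 5000 of them already read.
print(advertised_window(65536, 20000, 5000))  # prints 50536
```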

4.2.  MinBandwidthWND and MaxBandwidthWND

   As in [RFC5681], the congestion window (cwnd) is the sender-side
   limit on the amount of data that the sender can transmit before
   receiving an acknowledgement (ACK).  Considering both the sender and
   the receiver side, the effective sending window is always the
   minimum of cwnd and rwnd:

      EffectiveWND = min(cwnd, rwnd)

   A TCP sender MUST NOT send more data than the minimum of cwnd and
   rwnd.
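
   A minimal sketch of this rule is given below.  It assumes that cwnd,
   rwnd, and the amount of unacknowledged data are all expressed in the
   same unit (bytes here); the helper usable_window is an illustration
   and is not defined by this draft.

```python
def effective_window(cwnd: int, rwnd: int) -> int:
    """EffectiveWND = min(cwnd, rwnd)."""
    return min(cwnd, rwnd)


def usable_window(cwnd: int, rwnd: int, in_flight: int) -> int:
    """Data that may still be sent given the unacknowledged data in flight."""
    return max(0, effective_window(cwnd, rwnd) - in_flight)
```

   With cwnd = 250000, rwnd = 50536, and 40000 bytes in flight, the
   sender may send 10536 more bytes before it has to wait for an ACK.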

   Slow start is commonly used in TCP at the beginning of a transfer or
   after a loss repair because the network conditions are unknown.
   This slow probing is necessary to determine the available network
   capacity and to avoid sending an inappropriately large burst of data
   into the network and causing congestion.  A detailed discussion of
   the initial window setting is provided in [RFC3390].

   RTT is the time taken to send a packet to the destination and to
   receive the response packet (ACK).  Since the network status is
   constantly changing, RTT also varies.  [RFC6298] specifies how RTT
   should be sampled and updated.  In this new algorithm, RTT is
   updated using the following formula:

      RTT = a * old RTT + (1 - a) * new RTT   (0 < a < 1)   (1)

   The initial RTT can be obtained using a measuring TCP connection, or
   configured based on historical data.
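
   Equation (1) is an exponentially weighted moving average.  The
   sketch below is one way to write it; the draft does not fix a value
   for a, so the default of 0.875 is an assumption that mirrors the 7/8
   weight [RFC6298] gives to the previous smoothed RTT.

```python
def update_rtt(old_rtt: float, new_rtt: float, a: float = 0.875) -> float:
    """RTT = a * old RTT + (1 - a) * new RTT, with 0 < a < 1."""
    if not 0.0 < a < 1.0:
        raise ValueError("a must be strictly between 0 and 1")
    return a * old_rtt + (1.0 - a) * new_rtt
```

   With an old RTT of 20 ms and a new sample of 28 ms, the updated RTT
   is about 21 ms, so a single outlier moves the estimate only
   slightly.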

   In a bandwidth guaranteed network, resources are already allocated
   and the network status is known through OAM
   [I-D.han-6man-in-band-signaling-for-transport-qos], so it is safe to
   remove slow start and allow a host to start sending traffic at the
   rate of CIR after the TCP session is established.

   Two window sizes are important: MinBandwidthWND and MaxBandwidthWND.
   They are calculated as follows:

      MinBandwidthWND = CIR * RTT/MSS    (2)
      MaxBandwidthWND = PIR * RTT/MSS    (3)
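
   The following worked example evaluates equations (2) and (3).  The
   draft does not state units, so this sketch assumes that CIR and PIR
   are in bits per second, RTT is in seconds, MSS is in bytes, and the
   resulting window is a whole number of segments, rounded down.  The
   name bandwidth_window and the sample figures are illustrative only.

```python
def bandwidth_window(rate_bps: float, rtt_s: float, mss_bytes: int) -> int:
    """Window in segments: rate * RTT / MSS (assumed units, rounded down)."""
    bytes_per_rtt = rate_bps / 8.0 * rtt_s   # bits/s -> bytes per RTT
    return int(bytes_per_rtt // mss_bytes)


cir_bps, pir_bps = 100e6, 200e6   # sample CIR and PIR
rtt_s, mss_bytes = 0.020, 1460    # sample RTT and MSS

min_bandwidth_wnd = bandwidth_window(cir_bps, rtt_s, mss_bytes)
max_bandwidth_wnd = bandwidth_window(pir_bps, rtt_s, mss_bytes)
print(min_bandwidth_wnd, max_bandwidth_wnd)  # prints 171 342
```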

   In bandwidth guaranteed networks, after a TCP session is established,
   the sender can start transmitting data at an initial window size,
   which is equal to MinBandwidthWND:

      cwnd = MinBandwidthWND
      IW = min (cwnd, rwnd)

   If the receiver window (rwnd) is not a limiting factor, the sender
   will start sending data at CIR rate.  This is a key difference from
   the classic TCP slow start, which starts with an initial window of
   only a few segments [RFC5681].
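
   To make the difference concrete, the sketch below places the initial
   window upper bound of [RFC5681] next to the initial window of this
   proposal.  It assumes that cwnd and rwnd are compared in the same
   unit (segments here); the function names and the sample values are
   illustrative only.

```python
def rfc5681_initial_window(smss: int) -> int:
    """Upper bound on IW in bytes, following RFC 5681, Section 3.1."""
    if smss > 2190:
        return 2 * smss
    if smss > 1095:
        return 3 * smss
    return 4 * smss


def guaranteed_initial_window(min_bandwidth_wnd: int, rwnd: int) -> int:
    """cwnd = MinBandwidthWND; IW = min(cwnd, rwnd)."""
    cwnd = min_bandwidth_wnd
    return min(cwnd, rwnd)
```

   For an SMSS of 1460 bytes, [RFC5681] allows at most 4380 bytes
   (three segments).  With a sample MinBandwidthWND of 171 segments and
   an rwnd of 300 segments, the sender in this proposal starts with 171
   segments instead.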

4.3.  Congestion Avoidance

   In TCP-Reno, a TCP enters congestion avoidance mode after slow-start.
   In bandwidth guaranteed networks, there is no slow-start, so a TCP
   enters congestion avoidance mode right after the initial start.

   During congestion avoidance, cwnd is increased by one approximately
   once per round-trip time, when a valid ACK packet is received, until
   it reaches MaxBandwidthWND.

     /* Applied approximately once per RTT on a valid ACK. */
     if (cwnd < MaxBandwidthWND) {
       cwnd += 1;                  /* grow toward the PIR window */
     } else {
       cwnd = MaxBandwidthWND;     /* never exceed the PIR window */
     }

   Once the cwnd reaches MaxBandwidthWND, it stays constant at
   MaxBandwidthWND until packet loss is detected.  This is another
   major difference from [RFC5681], where the cwnd keeps increasing
   during congestion avoidance until the TCP sender detects segment
   loss.

   This means a TCP sender is never allowed to send data at a rate
   higher than PIR, which is different from TCP Reno.
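
   The flat top of the window can be seen in a small simulation.  This
   is a sketch only: it assumes cwnd is counted in segments, that one
   increase is applied per RTT, and that no loss occurs.  The names
   on_rtt_elapsed and simulate are hypothetical.

```python
from typing import List


def on_rtt_elapsed(cwnd: int, max_bandwidth_wnd: int) -> int:
    """One congestion-avoidance step, applied about once per RTT."""
    if cwnd < max_bandwidth_wnd:
        return cwnd + 1
    return max_bandwidth_wnd


def simulate(min_wnd: int, max_wnd: int, rtts: int) -> List[int]:
    """Trace cwnd from MinBandwidthWND over a number of loss-free RTTs."""
    cwnd = min_wnd
    trace = [cwnd]
    for _ in range(rtts):
        cwnd = on_rtt_elapsed(cwnd, max_wnd)
        trace.append(cwnd)
    return trace


print(simulate(5, 8, 6))  # prints [5, 6, 7, 8, 8, 8, 8]
```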

4.4.  Fast Retransmit and Fast Recovery

   As defined in [RFC5681], a TCP receiver SHOULD send an immediate
   duplicate ACK when an out-of-order segment arrives.  The TCP sender
   detects and repairs loss based on incoming duplicate ACKs.  If three
   duplicate ACKs are received, the sender takes this as an indication
   that a segment has been lost and retransmits the lost segment.

   In TCP-Reno [RFC5681], after the fast retransmit of what appears to
   be the lost segment, fast recovery is used to continue transmitting
   new segments at a reduced rate determined by ssthresh.

   In the new congestion control algorithm, upon receiving duplicate
   ACKs, fast retransmit and fast recovery follow the rules below:

   o  When a sender receives the first and second duplicate ACKs, as in
      [RFC5681], the cwnd is not changed, and the sender continues to
      send traffic.

   o  When a sender receives the third duplicate ACK, the
      retransmission timer has not expired, and an OAM congestion alarm
      has previously been received, it is likely that a segment was
      lost due to congestion.  The sender retransmits the lost segment
      and sets the cwnd to MinBandwidthWND.

   o  When a sender receives the third duplicate ACK but no OAM
      congestion alarm has previously been received, the segment is
      considered lost due to random failure, not congestion.  In this
      case the cwnd is not changed.

   In [RFC5681], in case of network congestion the new cwnd is set to
   ssthresh, which is usually half of the old cwnd.  In this new
   congestion control, when a segment loss due to congestion is
   detected as described above, the new cwnd is set to MinBandwidthWND
   as in equation (2).
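
   The duplicate ACK rules of this section can be sketched as a small
   handler.  This is an illustration under stated assumptions, not a
   definitive implementation: the Sender class and its field names are
   hypothetical, the retransmission timer is assumed not to have
   expired, and the lost segment is assumed to be retransmitted on the
   third duplicate ACK in both the congestion and the random failure
   case.  Behavior beyond the third duplicate ACK is not specified in
   this draft and is left out.

```python
from dataclasses import dataclass


@dataclass
class Sender:
    cwnd: int
    min_bandwidth_wnd: int
    dup_acks: int = 0
    oam_alarm: bool = False   # set when OAM has reported possible congestion


def on_duplicate_ack(s: Sender) -> bool:
    """Process one duplicate ACK; return True if a fast retransmit is due."""
    s.dup_acks += 1
    if s.dup_acks != 3:
        return False                      # 1st and 2nd: cwnd unchanged
    if s.oam_alarm:
        s.cwnd = s.min_bandwidth_wnd      # loss attributed to congestion
    # No alarm: loss attributed to random failure, cwnd stays as it is.
    return True                           # assumed: retransmit in both cases
```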

4.5.  Timeout

   If a retransmission timer [RFC6298] in a TCP sender expires in a
   bandwidth guaranteed network, this most likely indicates a physical
   failure, regardless of whether duplicate ACKs have been received.

   In this case, the cwnd is set to one, and the TCP sender retransmits
   the lost segment.  This packet also serves to probe the network
   status.  If there really is a network failure, no ACK will be
   received and the retransmission timer will expire again.  Receiving
   the expected ACK after the retransmission means the network has
   recovered, and the cwnd is then set to MinBandwidthWND as in
   equation (2).
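
   The timeout behavior can be sketched as two event handlers.  The
   TimeoutState class, its probing flag, and the handler names are
   hypothetical and only serve to illustrate the sequence described
   above; cwnd is assumed to be counted in segments.

```python
from dataclasses import dataclass


@dataclass
class TimeoutState:
    cwnd: int
    min_bandwidth_wnd: int
    probing: bool = False   # True while waiting for the ACK of the probe


def on_retransmission_timeout(s: TimeoutState) -> None:
    """Timer expired: shrink to one segment and probe with a retransmit."""
    s.cwnd = 1
    s.probing = True


def on_ack_of_retransmission(s: TimeoutState) -> None:
    """Expected ACK arrived: the network has recovered, resume at CIR."""
    if s.probing:
        s.cwnd = s.min_bandwidth_wnd
        s.probing = False
```

   If the timer expires again before the ACK arrives, calling
   on_retransmission_timeout a second time leaves cwnd at one and the
   sender keeps probing.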

4.6.  Idle Recovery

   [RFC5681] specifies that a TCP session should use slow start to
   restart transmission after an idle period longer than one
   retransmission timeout, and that the RW (Restart Window) is the
   minimum of IW and cwnd.

   In this proposal, the same rule is still followed.  However, because
   no slow start is needed in bandwidth guaranteed networks and the IW
   in this new congestion control is set to MinBandwidthWND, a TCP
   sender can start transmitting data at CIR rate after a long idle
   period.
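
   A minimal sketch of the restart rule follows.  It assumes that the
   idle time and the retransmission timeout are in seconds and that
   windows are in segments; the function names are illustrative only.

```python
def restart_window(iw: int, cwnd: int) -> int:
    """RW = min(IW, cwnd), as in RFC 5681."""
    return min(iw, cwnd)


def cwnd_after_idle(cwnd: int, min_bandwidth_wnd: int,
                    idle_s: float, rto_s: float) -> int:
    """Apply the restart window when idle for more than one RTO."""
    if idle_s > rto_s:
        return restart_window(min_bandwidth_wnd, cwnd)  # IW = MinBandwidthWND
    return cwnd
```

   With a cwnd of 342 segments, a MinBandwidthWND of 171 segments, and
   an idle period longer than the retransmission timeout, the sender
   restarts at 171 segments, which corresponds to CIR.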

5.  IANA Considerations

   This document has no IANA actions.

6.  Security Considerations

   This proposal makes no change to the underlying security of TCP.
   More information about TCP security concerns can be found in
   [RFC5681].

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

7.2.  Informative References

   [RFC2205]  Braden, R., Ed., Zhang, L., Berson, S., Herzog, S., and S.
              Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1
              Functional Specification", RFC 2205, DOI 10.17487/RFC2205,
              September 1997, <https://www.rfc-editor.org/info/rfc2205>.

   [RFC3390]  Allman, M., Floyd, S., and C. Partridge, "Increasing TCP's
              Initial Window", RFC 3390, DOI 10.17487/RFC3390, October
              2002, <https://www.rfc-editor.org/info/rfc3390>.

   [RFC4080]  Hancock, R., Karagiannis, G., Loughney, J., and S. Van den
              Bosch, "Next Steps in Signaling (NSIS): Framework",
              RFC 4080, DOI 10.17487/RFC4080, June 2005,
              <https://www.rfc-editor.org/info/rfc4080>.

   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
              RFC 4960, DOI 10.17487/RFC4960, September 2007,
              <https://www.rfc-editor.org/info/rfc4960>.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
              <https://www.rfc-editor.org/info/rfc5681>.

   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
              "Computing TCP's Retransmission Timer", RFC 6298,
              DOI 10.17487/RFC6298, June 2011,
              <https://www.rfc-editor.org/info/rfc6298>.

   [RFC6582]  Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The
              NewReno Modification to TCP's Fast Recovery Algorithm",
              RFC 6582, DOI 10.17487/RFC6582, April 2012,
              <https://www.rfc-editor.org/info/rfc6582>.

   [RFC6675]  Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M.,
              and Y. Nishida, "A Conservative Loss Recovery Algorithm
              Based on Selective Acknowledgment (SACK) for TCP",
              RFC 6675, DOI 10.17487/RFC6675, August 2012,
              <https://www.rfc-editor.org/info/rfc6675>.

   [RFC8312]  Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
              R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
              RFC 8312, DOI 10.17487/RFC8312, February 2018,
              <https://www.rfc-editor.org/info/rfc8312>.

   [I-D.cardwell-iccrg-bbr-congestion-control]
              Cardwell, N., Cheng, Y., Yeganeh, S., and V. Jacobson,
              "BBR Congestion Control", draft-cardwell-iccrg-bbr-
              congestion-control-00 (work in progress), July 2017.

   [I-D.han-6man-in-band-signaling-for-transport-qos]
              Han, L., Li, G., Tu, B., Xuefei, T., Li, F., Li, R.,
              Tantsura, J., and K. Smith, "IPv6 in-band signaling for
              the support of transport with QoS", draft-han-6man-in-
              band-signaling-for-transport-qos-00 (work in progress),
              October 2017.

   [I-D.ietf-ippm-ioam-data]
              Brockners, F., Bhandari, S., Pignataro, C., Gredler, H.,
              Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov,
              P., Chang, R., and D. Bernier, "Data Fields for In-situ
              OAM", draft-ietf-ippm-ioam-data-01 (work in progress),
              October 2017.

Acknowledgments

   The authors wish to thank xxxx for their helpful comments and
   suggestions.

Authors' Addresses

   Lin Han
   Huawei
   2330 Central Expressway
   Santa Clara  CA 95050
   USA

   EMail: lin.han@huawei.com


   Yingzhen Qu
   Huawei
   2330 Central Expressway
   Santa Clara  CA 95050
   USA

   EMail: yingzhen.qu@huawei.com


   Thomas Nadeau
   Lucid Vision
   Hampton  NH 03842
   USA

   EMail: tnadeau@lucidvision.com