ConEx                                                     G. Karagiannis
Internet-Draft                                      University of Twente
Intended status: Informational                             April 5, 2011
Expires: October 5, 2011


     Calculating Exposed Congestion by using TCP Throughput Changes
           draft-karagiannis-conex-congestion-calculation-00

Abstract

   This document describes a possible implementation complexity problem
   in the current ConEx concept defined by the ConEx WG and proposes a
   solution to it.  The problem arises because measuring the congestion
   rate at the sender side is complex.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 29, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
   2.  Implementation Complexity Problem on Congestion Rate
       Calculation at the Sender
   3.  Solution on Calculating Congestion Rate by using Throughput
       Changes Calculated at the Sender
   4.  IANA Considerations
   5.  Security Considerations
   6.  Conclusions
   7.  Acknowledgements
   8.  Comments Solicited
   9.  References
     9.1.  Normative References
     9.2.  Informative References

1.  Introduction

   This document describes a possible implementation complexity problem
   in the current ConEx concept defined by the ConEx WG.  The problem
   arises because measuring the congestion rate at the sender side is
   complex.  This document also proposes a solution: the sender derives
   the congestion rate from changes in the TCP throughput that it
   calculates itself.

1.1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   In addition to the terminology specified in
   [draft-ietf-conex-abstract-mech-01] and
   [draft-ietf-conex-concepts-uses-01], the following terms are used and
   defined in this document.

   o  transport path throughput calculated at the sender: The per-flow
      transport sending rate as a function of the congestion rate,
      round-trip time, and segment size.  The transport path throughput
      is calculated at the sender using the TCP throughput equation
      specified in [RFC5348].  Note that [RFC5348] refers to the
      congestion rate as the loss event rate.  According to [RFC5348], a
      loss event is defined as one or more lost or marked packets from a
      window of data, where a marked packet refers to a congestion
      indication (CE) from Explicit Congestion Notification (ECN)
      [RFC3168].

2.  Implementation Complexity Problem on Congestion Rate Calculation at
    the Sender

   In the context of ConEx, measuring the congestion rate at the sender
   side is complex; see Figure 1.  A possible solution is given in
   [draft-briscoe-tsvwg-re-ecn-tcp-09]:

   "6.1.1.  RECN mode: Full Re-ECN capable transport

   In full RECN mode, for each half connection, both the sender and the
   receiver each maintain an unsigned integer counter we will call ECC
   (echo congestion counter).  The receiver maintains a count of how
   many times a CE marked packet has arrived during the half-connection.
   Once a RECN connection is established, the three TCP option flags
   (ECE, CWR & NS) used for ECN-related functions in other versions of
   ECN are used as a 3-bit field for the receiver to repeatedly tell the
   sender the current value of ECC, modulo 8, whenever it sends a TCP
   ACK.  We will call this the echo congestion increment (ECI) field.

   This overloaded use of these 3 option flags as one 3-bit ECI field is
   shown in Figure 7.  The actual definition of the TCP header,
   including the addition of support for the ECN nonce, is shown for
   comparison in Figure 6.
   This specification does not redefine the names of these three TCP
   option flags, it merely overloads them with another definition once a
   flow is established.

     0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   |               |           | N | C | E | U | A | P | R | S | F |
   | Header Length | Reserved  | S | W | C | R | C | S | S | Y | I |
   |               |           |   | R | E | G | K | H | T | N | N |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

     Figure 6: The (post-ECN Nonce) definition of bytes 13 and 14 of
                             the TCP Header

     0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   |               |           |           | U | A | P | R | S | F |
   | Header Length | Reserved  |    ECI    | R | C | S | S | Y | I |
   |               |           |           | G | K | H | T | N | N |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

     Figure 7: Definition of the ECI field within bytes 13 and 14 of
      the TCP Header, overloading the current definitions above for
                         established RECN flows.

   Receiver Action in RECN Mode

   Every time a CE marked packet arrives at a receiver in RECN mode, the
   receiver transport increments its local value of ECC and MUST echo
   its value, modulo 8, to the sender in the ECI field of the next ACK.
   It MUST repeat the same value of ECI in every subsequent ACK until
   the next CE event, when it increments ECI again.

   The increment of the local ECC values is modulo 8 so the field value
   simply wraps round back to zero when it overflows.  The least
   significant bit is to the right (labelled bit 9).

   A receiver in RECN mode MAY delay the echo of a CE to the next
   delayed-ACK, which would be necessary if ACK-withholding were
   implemented.

   Sender Action in RECN Mode

   On the arrival of every ACK, the sender compares the ECI field with
   its own ECC value, then replaces its local value with that from the
   ACK.
   The difference D (D = (ECI + 8 - ECC mod 8) mod 8) is assumed to be
   the number of CE marked packets that arrived at the receiver since it
   sent the previously received ACK (but see below for the sender's
   safety strategy).

   Whenever the ECI field increments by D (and/or d drops are detected),
   the sender MUST clear the RE flag to "0" in the IP header of the next
   D' data packets it sends (where D' = D + d), effectively re-echoing
   each single increment of ECI.  Otherwise the data sender MUST send
   all data packets with RE set to "1".

   As a general rule, once a flow is established, as well as setting or
   clearing the RE flag as above, a data sender in RECN mode MUST always
   set the ECN field to ECT(1).  However, the settings of the extended
   ECN field during flow start are defined in Section 6.1.4.

   As we have already emphasised, the re-ECN protocol makes no changes
   and has no effect on the TCP congestion control algorithm.  So, the
   first increment of ECI (or detection of a drop) in a RTT triggers the
   standard TCP congestion response, no more than one congestion
   response per round trip, as usual.  However, the sender re-echoes
   every increment of ECI irrespective of RTTs.

   A TCP sender also acts as the receiver for the other half-connection.
   The host will maintain two ECC values S.ECC and R.ECC as sender and
   receiver respectively.  Every TCP header sent by a host in RECN mode
   will also repeat the prevailing value of R.ECC in its ECI field.  If
   a sender in RECN mode has to retransmit a packet due to a suspected
   loss, the re-transmitted packet MUST carry the latest prevailing
   value of R.ECC when it is re-transmitted, which will not necessarily
   be the one it carried originally.", copied from
   [draft-briscoe-tsvwg-re-ecn-tcp-09].

   This solution appears complex to implement, because it requires
   protocol changes that modify the semantics of the TCP header, i.e.,
   the semantics of the NS, CWR and ECE flags.
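
   To illustrate the per-connection state that the quoted mechanism
   asks of a sender, the counter arithmetic can be sketched in Python as
   follows.  This is only a minimal sketch under stated assumptions and
   is not taken from [draft-briscoe-tsvwg-re-ecn-tcp-09]: the names
   eci_delta, RecnSender, on_ack and next_re_flag are hypothetical, and
   the sender's safety strategy and the flow-start rules mentioned in
   the quote are not modelled.

```python
"""Sketch of the RECN sender-side counter handling quoted above.

All names here are hypothetical; only the formula for D and the RE flag
rule come from the quoted text.
"""


def eci_delta(eci: int, ecc: int) -> int:
    """Return D = (ECI + 8 - ECC mod 8) mod 8, as given in the quote."""
    return (eci + 8 - ecc % 8) % 8


class RecnSender:
    """Assumed holder for the sender state of one half-connection."""

    def __init__(self) -> None:
        self.ecc = 0              # local echo congestion counter (ECC)
        self.re_clear_budget = 0  # data packets still to send with RE = 0

    def on_ack(self, eci: int, drops: int = 0) -> int:
        """Process the 3-bit ECI field of an ACK plus d detected drops."""
        d_marks = eci_delta(eci, self.ecc)
        self.ecc = eci  # replace the local value with that from the ACK
        self.re_clear_budget += d_marks + drops  # D' = D + d
        return d_marks

    def next_re_flag(self) -> int:
        """Return the RE flag value for the next data packet."""
        if self.re_clear_budget > 0:
            self.re_clear_budget -= 1
            return 0  # re-echo one congestion event
        return 1      # default: RE set to "1"
```

   Even in this reduced form, the sender has to track a wrapping 3-bit
   counter and a budget of packets to re-mark, in addition to the header
   changes themselves.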

   +---------+                                               +---------+
   |Transport|             +-----------+                     |Transport|
   | Sender  |>=Data=Path=>|(Congested)|>=====Data=Path=====>| Receiver|
   |         |             |  Network  |>-Congestion-Signal->|---.     |
   |         |             |  Device   |                     |   |     |
   |         |             +-----------+                     |   |     |
   |         |                                               |   |     |
   |         |<==Feedback=Path==============================<|   |     |
   |     ,---|<--Transport Layer returned Congestion Signal-<|<--'     |
   |     V   |                                               |         |
   ||-------||                                               |         |
   ||Congest||                                               |         |
   ||rate   ||             +-----------+                     |Transport|
   ||calcul.||>=Data=Path=>|(Congested)|>=====Data=Path=====>| Receiver|
   ||       |->(new)Conex->|  Network  |-(new)Conex signal-->|         |
   |+-------+|             |  Device   |  (carried in data   |         |
   |         |             +-----------+  packet headers)    |         |
   +---------+                                               +---------+

              Figure 1: Overview ConEx architecture, based on
                    [draft-ietf-conex-abstract-mech-01]

3.  Solution on Calculating Congestion Rate by using Throughput Changes
    Calculated at the Sender

   This document proposes the following solution to this problem; see
   Figure 2.

   +---------+                                               +---------+
   |Transport|             +-----------+                     |Transport|
   | Sender  |>=Data=Path=>|(Congested)|>=====Data=Path=====>| Receiver|
   |         |             |  Network  |>-Congestion-Signal->|---.     |
   |         |             |  Device   |                     |   |     |
   |         |             +-----------+                     |   |     |
   |         |                                               |   |     |
   |         |<==Feedback=Path==============================<|   |     |
   |     ,---|<--Transport Layer returned Congestion Signal-<|<--'     |
   |     |   |                                               |         |
   ||-------||                                               |         |
   ||Throug.||                                               |         |
   ||calc.  ||                                               |         |
   ||-------||                                               |         |
   |     |   |                                               |         |
   |     |   |                                               |         |
   |     V   |                                               |         |
   ||-------||                                               |         |
   ||Conges.||                                               |         |
   || rate  ||             +-----------+                     |Transport|
   ||calcul.||>=Data=Path=>|(Congested)|>=====Data=Path=====>| Receiver|
   ||       |->(new)Conex->|  Network  |-(new)Conex signal-->|         |
   |+-------+|             |  Device   |  (carried in data   |         |
   |         |             +-----------+  packet headers)    |         |
   +---------+                                               +---------+

       Figure 2: Overview ConEx architecture, when calculating exposed
               congestion rate by using TCP throughput changes

   When the sender needs to reduce its sending rate, it can calculate
   the exposed congestion rate by subtracting the TCP throughput
   calculated during a Round Trip Time (RTT), i.e., RTT_i, from the TCP
   throughput calculated at the same sender during the previous RTT,
   i.e., RTT_(i-1), where i is an integer equal to or greater than 1.
   The TCP throughput at the sender side can be calculated using, e.g.,
   the following equation specified in [RFC5348]:

   "The throughput equation for X_Bps, TCP's average sending rate in
   bytes per second, is:

                                       s
   X_Bps = ----------------------------------------------------------
           R*sqrt(2*b*p/3) + (t_RTO * (3*sqrt(3*b*p/8)*p*(1+32*p^2)))

   Where:

      X_Bps is TCP's average transmit rate in bytes per second.  (X_Bps
      is the same as X_calc in RFC 3448.)

      s is the segment size in bytes (excluding IP and transport
      protocol headers).

      R is the round-trip time in seconds.

      p is the loss event rate, between 0 and 1.0, of the number of loss
      events as a fraction of the number of packets transmitted.

      t_RTO is the TCP retransmission timeout value in seconds.

      b is the maximum number of packets acknowledged by a single TCP
      acknowledgement.", copied from [RFC5348].

   {More details will be included in a next version of this draft.}

4.  IANA Considerations

   This memo includes no request to IANA.

5.  Security Considerations

   The security considerations described in
   [draft-ietf-conex-abstract-mech-01] also apply to this document.

6.  Conclusions

   {to be done}

7.  Acknowledgements

   {to be done}

8.  Comments Solicited

   Comments and questions are encouraged and very welcome.  They can be
   addressed to the IETF Congestion Exposure (ConEx) working group
   mailing list, and/or to the author.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [draft-ietf-conex-abstract-mech-01]
              Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx)
              Concepts and Abstract Mechanism",
              draft-ietf-conex-abstract-mech-01 (work in progress),
              March 2011.

9.2.  Informative References

   [draft-briscoe-tsvwg-re-ecn-tcp-09]
              Briscoe, B., Jacquet, A., Moncaster, T., and A. Smith,
              "Re-ECN: Adding Accountability for Causing Congestion to
              TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-09 (work in
              progress), October 2010.

   [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
              Friendly Rate Control (TFRC): Protocol Specification",
              RFC 5348, September 2008.

Authors' Addresses

   Georgios Karagiannis
   University of Twente
   P.O. Box 217
   7500 AE Enschede,
   The Netherlands

   EMail: g.karagiannis@ewi.utwente.nl