ConEx G. Karagiannis Internet-Draft University of Twente D. Papadimitriou Alcatel-Lucent Intended status: Experimental May 6, 2011 Expires: November 6, 2011 Exposing Conex Throughput draft-karagiannis-conex-throughput-exposure-01 Abstract This document describes a possible implementation complexity problem associated with the current Conex concept defined by the ConEx WG and it proposes a solution to this. This problem occurs due to the fact that it is complex to measure the congestion rate, at each intermediate Conex enabled device. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on November 6, 2011. Copyright Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Karagiannis Expires November 6, 2011 [Page 1] Internet-Draft Exposing Conex Throughput May 2011 Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 3 2. Implementation Complexity Problem . . . . . . . . . . . . . . . 4 3. Conex Throughput Solution . . . . . . . . . . . . . . . . . . . 4 3.1. Requirements for the Conex signal . . . . .. . . . . . . . 6 3.2. Codepoint Encoding . . . . . . . . . . . . . . . . . . . . 7 3.3. Conex Throughput Components . . . . . . . . . . . . . . . 7 3.3.1. Modified Senders . . . . . . . . . . . . . . . . . . . 7 3.3.2. Intermediate Conex Enabled Devices . . . . . . . . . 8 3.3.3. Modified Receivers . .. . . . . . . . . . . . . . . . 8 3.4 Additional Challenges. . . . . . . . . . . . . . . . . . . . 8 4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 5. Security Considerations . . . . . . . . . . . . . . . . . . . 9 6. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 9 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 9 8. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 10 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 10 9.1. Normative References . . . . . . . . . . . . . . . . . . . 10 9.2. Informative References . . . . . . . . . . . . . . . . . . 10 Karagiannis Expires November 6, 2011 [Page 2] Internet-Draft Exposing Conex Throughput May 2011 1. Introduction The ConEx working group is defining how IP packets will carry additional ConEx information. This document describes a possible implementation complexity problem associated with the current Conex concept defined by the ConEx WG and it proposes an alternative method to overcome this problem. This problem occurs due to the fact that it is relatively complex to measure the congestion rate, at each intermediate Conex enabled device. A ConEx enabled device can be for example, a policer or an audit device which requires to determine TCP transmission state from its sequence number so as to declare conformance to the measured congestion rate induced by this flow. This document outlines a method to overcome this problem by measuring and monitoring the per flow throughput on each intermediate ConEx enabled device instead of measuring and monitoring the per flow congestion rate on each intermediate Conex enabled device. 1.1. Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. In addition to the terminology specified in [draft-ietf-conex- abstract-mech-01] and [draft-ietf-conex-concepts-uses-01] the following terms are used and defined in this document. O Transport path throughput calculated at the sender: The per flow transport sending rate as a function of the congestion rate, round- trip time, and segment size. The transport path throughput calculated at the sender using the TCP throughput equation that is specified in [RFC5348]. Note that in [RFC5348] the term congestion rate is denoted as loss event rate. According to [RFC5348] a loss event is defined as one or more lost or marked packets from a window of data, where a marked packet refers to a congestion indication (CE) from Explicit Congestion Notification (ECN) [RFC3168]; O Throughput exposure signal (ConEx-Throughput): a ConEx signal encoded in IP packet headers from the sender to the network to exposure the transport path throughput calculated at the sender. O Intermediate ConEx enabled device: each Conex enabled device located on a communication path between a sender and a receiver. Karagiannis Expires November 6, 2011 [Page 3] Internet-Draft Exposing Conex Throughput May 2011 O Measured per flow throughput at intermediate ConEx enabled devices: the per flow rate of packets that are passing through the output buffer of an intermediate Conex enabled device and are not dropped, (2) are not ECN CE marked. Each ConEx enabled device is only holding state on a flow if it first receives a Throughput exposure signal. O Throughput exposure rate at intermediate ConEx devices: the per flow rate of the received throughput exposure signals (Conex- Throughput). Each ConEx enabled device is only holding state on a flow if it first receives a Throughput exposure signal (Conex- Throughput). 2. Implementation Complexity Problem The mentioned problem is related to the fact that in the context of Conex it is relatively complex to measure the congestion rate, see [RFC5348] at each intermediate Conex enabled device, see Figure 1. A ConEx enabled device, can be a policer or an audit device. In particular, the main solution proposed in [draft-ietf-conex-abstract- mech-01] on measuring the congestion rate on an audit device is: "The auditor could monitor TCP flows or aggregates of flows, only holding state on a flow if it first sends a Credit or a Re-Echo-Loss marking. The auditor could detect retransmissions by monitoring sequence numbers. It would assure that (volume of retransmitted data) <= (volume of data marked Re-Echo-Loss). Traffic would only be auditable in this way if it conformed to the standard TCP protocol and the IP payload was not encrypted (e.g. with IPsec). ", from [draft-ietf-conex-abstract-mech-01]. In other words, an audit device needs to keep per flow (individual or aggregated) state after it receives a credit or Re-echo-Loss marking in order to measure the per flow congestion rate. In addition to this the audit device must be able to read the TCP header in order to observe the TCP header numbers. This is complex to be implemented and it is not always possible, e.g., when the IP payload is encrypted (e.g. with IPsec). In this way, the audit device is also able to take into consideration also the number of ECN CE marked packets that were dropped by nodes located upstream. Note that in the opinion of the authors of this draft the same method of measuring the congestion rate at a policer is required as the one applied at the audit device. 3. ConEx Throughput Method This document provides a method to this problem, see Figure 2, by measuring and monitoring the per flow throughput (or its variation) on each intermediate Conex enabled device instead of measuring and monitoring the per flow congestion rate on each intermediate Conex enabled device. Karagiannis Expires November 6, 2011 [Page 4] Internet-Draft Exposing Conex Throughput May 2011 In addition to this, a sender calculates the per flow transport path throughput, which describes the sending rate as a function of the congestion rate, round-trip time, and segment size. The transport path throughput calculated at the sender can be accomplished using the TCP throughput equation specified in [RFC5348]. Moreover, the sender uses the throughput exposure signal (Conex- Throughput) that is a ConEx signal encoded in IP packet headers from the sender to the network to exposure the transport path throughput calculated at the same sender. Each intermediate ConEx enabled device measures the per flow Throughput (or its variation), which is the per flow rate of packets that are passing through the output buffer of an intermediate ConEx enabled device and (1) not dropped, (2) are not ECN CE marked. Each Conex enabled device is only holding state on a flow if it first receives a Throughput exposure signal (Conex-Throughput). Note that in this way the intermediate Conex enabled device can also take into account the ECN CE marked packets that were dropped by nodes located upstream. Furthermore, each intermediate Conex enabled device measures the throughput exposure rate at intermediate Conex devices, which is the per flow rate of the received throughput exposure signals (Conex- Throughput). Each ConEx enabled device is only holding state on a flow if it first receives a Throughput exposure signal (Conex- Throughput). It is important to note that the per flow throughput measured by intermediate ConEx enabled devices located closer to the sender can be higher than the per flow throughput measured by Conex enabled devices located closer to the receiver. Identical use cases and applications can be used as the ones described in [draft-ietf-conex-concepts-uses-01], using the same semantics related to the exposed and measured congestion. The only difference is now that the intermediate Conex enabled devices can monitor and process the "Measured per flow throughput" and the "Throughput exposure rate", instead of monitoring and processing the measured congestion rate and the signaled exposed congestion rate, respectively. An example of this is that each intermediate Congestion enabled device ensures that the "Throughput exposure rate" is always less or equal to the "Measured per flow throughput". {More details will be provided in a next version of this draft} Karagiannis Expires November 6, 2011 [Page 5] Internet-Draft Exposing Conex Throughput May 2011 +---------+ +---------+ |Transport| +-----------+ |Transport| | Sender |>=Data=Path=>|(Congested)|>=====Data=Path=====>| Receiver| | | | Network |>-Congestion-Signal->|---. | | | | Device | | | | | | +-----------+ | | | | | | | | | |<==Feedback=Path==============================<| | | | ,---|<--Transport Layer returned Congestion Signal-<|<--' | | V | | | ||-------|| | | ||Congest|| | | ||rate || +-----------+ |Transport| ||calcul.||>=Data=Path=>|(Congested)|>=====Data=Path=====>| Receiver| || |->(new)Conex->| Network |-(new)Conex signal)->| | |+-------+| | Device | (carried in data | | | | +-----------+ packet headers) | | +---------+ +---------+ Figure 1: Overview ConEx architecture, based on [draft-ietf-conex-abstract-mech-01] +---------+ +---------+ |Transport| +-----------+ |Transport| | Sender |>=Data=Path=>|(Congested)|>=====Data=Path=====>| Receiver| | | | Network |>-Congestion-Signal->|---. | | | | Device | | | | | | +-----------+ | | | | | | | | | |<==Feedback=Path==============================<| | | | ,---|<--Transport Layer returned Congestion Signal-<|<--' | | V | | | ||-------|| | | ||Througp|| | | || || +-----------+ |Transport| ||calcul.||>=Data=Path=>|(Congested)|>=====Data=Path=====>| Receiver| || |->(new)Conex->| Network |-(Throughp. signal)->| | |+-------+| | Device | (carried in data | | | | +-----------+ packet headers) | | +---------+ +---------+ Figure 2: Overview ConEx architecture, with Conex throughput exposure 3.1. Requirements for the ConEx Signal The following requirements apply to the Conex-Throughput signal, which are in line with most of the requirements presented in [draft-ietf-conex-abstract-mech-01]: Karagiannis Expires November 6, 2011 [Page 6] Internet-Draft Exposing Conex Throughput May 2011 o) The ConEx Signal SHOULD be visible to internetwork layer devices along the entire path from the transport sender to the transport receiver. o) The ConEx Signal SHOULD be useful under only partial deployment. o) The ConEx Signal SHOULD be timely. o) The ConEx Signal SHOULD be accurate (i.e., such that the signaled throughput is represented accurately). Note that in case of throughput exposure timeliness and accuracy shall be bound by min;max limits in order to prevent non-significant deviations (e.g. relative variations would be worth considering in this respect). 3.2. Codepoint Encoding This encoding involves signaling the following codepoint: Conex-Throughput: a ConEx signal encoded in IP packet headers from the sender to the network to exposure the transport path throughput calculated at the sender. 3.3. Conex Throughput Components The same Conex enabled devices can be used as the ones specified in [draft-ietf-conex-abstract-mech-01]. 3.3.1 Modified Senders The senders should be able to support a TCP based congestion control protocol, e.g., [RFC5681] in combination with [RFC3168], [RFC5348]. In addition, the sender MUST be able to support the following functionality: O calculates the transport path throughput calculated: The per flow transport sending rate as a function of the congestion rate, round- trip time, and segment size. The transport path throughput calculated at the sender using the TCP throughput equation specified in [RFC5348]. Note that in [RFC5348] the term congestion rate is denoted as loss event rate. According to [RFC5348] a loss event is defined as one or more lost or marked packets from a window of data, where a marked packet refers to a congestion indication (CE) from Explicit Congestion Notification (ECN) [RFC3168]. Karagiannis Expires November 6, 2011 [Page 7] Internet-Draft Exposing Conex Throughput May 2011 O generates and encodes ConEx-Throughput signals in IP packet headers from the sender to the network to exposure the transport path throughput calculated at the same sender. 3.3.2 Intermediate Conex Enabled Devices The same intermediate Conex enabled devices could be used as the intermediate Conex enabled devices specified in [draft-ietf-conex- abstract-mech-01], e.g., Policer and Audit. The only difference is now that the intermediate Conex enabled devices MUST monitor and MUST process the "Measured per flow throughput" and the "Throughput exposure rate", instead of monitoring and processing the measured congestion rate and the signaled exposed congestion rate, respectively. 3.3.3 Modified Receivers The receivers should be able to support a TCP based congestion control protocol, e.g., [RFC5681] in combination with [RFC3168], [RFC5348]. In addition, the receiver MUST be able to support the following functionality: O Measure per flow throughput: the per flow rate of packets that are Received and (1) are not dropped, (2) are not ECN CE marked. O Measure throughput exposure rate at intermediate Conex devices: the per flow rate of the received throughput exposure signals. {More details will be provided in a next version of this draft} 3.4 Additional Challenges The additional challenges associated with this approach are listed below: o) The Internet operation has to enable multi-domain information exchanges to effectively enable network-to-host information passing as detailed in this document. This condition is of particular importance for congestion control schemes that make use of explicit rate feedback. On the other hand, multi-domain environment shall be considered as non-cooperative. Karagiannis Expires November 6, 2011 [Page 8] Internet-Draft Exposing Conex Throughput May 2011 o) It is important to emphasize that network support of rate signaling suffers from the same scalability problems as identified in [RFC2208]. Indeed, in-band rate signaling or out-of-band per-flow traffic specification signaling (like in the Resource Reservation Protocol (RSVP)) results in similar scalability issues due to the per-connection state maintenance. As noticed in Section 1 of [RFC6077], due to the non-cooperative nature of multi-domain environments, it seems unlikely that network flow state could be avoided (up to a certain extend) given the network's per-packet flow rate instructions that would need to be compared against variations in the actual flow rate, which is inherently not a per-packet metric. These issues have been outstanding ever since integrated services (IntServ) was identified as unscalable in 1997 [RFC2208]. Any subsequent attempt to involve network elements in limiting flow rates (including XCP, RCP, etc.) have run up against the same scalability issues when intending deployment on the public/worldwide Internet. o) Security is a challenge for multi-domain exchange of explicit rate/congestion signals, whether in-band or out-of-band. At domain boundaries, authentication and authorization issues can arise whenever congestion control information is exchanged. 4. IANA Considerations This memo includes no request to IANA. Note to RFC Editor: this section may be removed on publication as an RFC. 5. Security Considerations The security consideration sections are the same as the security considerations associated with [draft-ietf-conex-abstract-mech-01]. {More details will be provided in a next version of this draft>} 6. Conclusions {To be done} 7. Acknowledgements {To be done} Karagiannis Expires November 6, 2011 [Page 9] Internet-Draft Exposing Conex Throughput May 2011 8. Comments Solicited Comments and questions are encouraged and very welcome. They can be addressed to the IETF Congestion Exposure (ConEx) working group mailing list , and/or to the authors. 9. References 9.1. Normative References [draft-ietf-conex-abstract-mech-01] M. Mathis, B. Briscoe, "Congestion Exposure (ConEx) Concepts and Abstract Mechanism", draft-ietf- conex-abstract-mech-01, (work in progress), March 2011. [RFC2119] S. Bradner, "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. 9.2. Informative References [draft-ietf-conex-concepts-uses-01] T. Moncaster, J. Leslie, B. Briscoe, R. Woundy, D. McDysan, "ConEx Concepts and Use Cases", draft- ietf-conex-concepts-uses-01, (draft in progress), March 2011. [RFC5348] S. Floyd, M. Handley, J. Padhye, J. Widmer, "TCP Friendly Rate Control (TFRC): Protocol Specification", RFC 5348, September 2008. [RFC6077] D.Papadimitriou (Ed.), M.Welzl, M.Scharf, B.Briscoe, Open Research Issues in Internet Congestion Control, RFC 6077, February 2011. [RFC3168] K. Ramakrishnan, S. Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001. [RFC5681] M. Allman, V. Paxson, and E. E. Blanton, "TCP Congestion Control", RFC 5681, September 2009. Karagiannis Expires November 6, 2011 [Page 10] Internet-Draft Exposing Conex Throughput May 2011 Authors' Addresses Georgios Karagiannis University of Twente P.O. Box 217 7500 AE Enschede, The Netherlands EMail: g.karagiannis@ewi.utwente.nl Dimitri Papadimitriou (editor) Alcatel-Lucent Copernicuslaan, 50 2018 Antwerpen, Belgium Phone: +32 3 240 8491 EMail: dimitri.papadimitriou@alcatel-lucent.com Karagiannis Expires November 6, 2011 [Page 11]