Internet Engineering Task Force                                   X. Wei
INTERNET-DRAFT                                                    L. Zhu
Intended Status: Standards Track                     Huawei Technologies
Expires: April 11, 2015                                          L. Deng
                                                            China Mobile
                                                         October 8, 2014

                       Tunnel Congestion Feedback
             draft-wei-tsvwg-tunnel-congestion-feedback-03

Abstract

This document describes a mechanism to calculate congestion of a
tunnel segment based on RFC 6040 recommendations, and a feedback
protocol by which to send the measured congestion of the tunnel from
egress to ingress router. A basic model for measuring tunnel
congestion and feedback is described, and a protocol for carrying the
feedback data is outlined.

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

Copyright and License Notice

Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.

Wei                      Expires April 11, 2015                 [Page 1]
INTERNET DRAFT      Tunnel Congestion Feedback           October 8, 2014

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.
Code Components extracted from this document must include Simplified
BSD License text as described in Section 4.e of the Trust Legal
Provisions and are provided without warranty as described in the
Simplified BSD License.

Table of Contents

1. Introduction
2. Conventions and Terminology
   2.1 Conventions
   2.2 Terminology
3. Problem Statement
   3.1 3GPP network scenario
   3.2 Network Function Virtualization Scenario
   3.3 Data Center Tenancy Scenario
4. Congestion Control Model
   4.1 Congestion Calculation
   4.2 Data Information
   4.3 Congestion Feedback
   4.4 Congestion Control
5. Congestion Feedback Protocol
   5.1 Properties of Candidate Protocol
   5.2 IPFIX Extensions for Congestion Feedback
   5.3 Other Protocols
6. Benefits
7. Security Considerations
8. IANA Considerations
9. References
   9.1 Normative References
   9.2 Informative References
Authors' Addresses

1. Introduction

In current Internet practice, IP encapsulation is a common technique
for building overlay networks. For example, mobile networks
encapsulate the inner IP header and the application-layer headers
within an outer IP header, a UDP header and a GTP-U header. This
encapsulation also supports mobility, QoS control, bearer management
and other functions specific to the mobile network. Some
organizations use tunnels that encrypt the inner IP header, with
private keys or certificates, to set up a VPN (virtual private
network) over a WAN (wide area network).

Congestion occurs when the offered traffic exceeds the capacity of
any segment of the transmission path; it can result from transport
constraints or from interface or processor overload. Network end
points generally observe congestion as packet loss or unexpected
delay. End-to-end congestion mechanisms, e.g. ECN [RFC3168] and ECN
handling for tunnels [RFC6040], have been specified in the IETF. When
IP packets are encapsulated, the inner IP headers are carried over a
transport protocol such as TCP or UDP, which affects explicit
congestion feedback, because the receiver echoes ECN marks in TCP
acknowledgments. Moreover, network elements such as the tunnel
ingress and egress may be unable to recognize packet loss and
performance degradation when the network segment is encapsulated in
an IP and UDP header chain. This causes a management problem when the
tunnel segment is treated as an independent administrative domain and
the network operator intends to keep network operation reliable.

This document describes a mechanism for feeding back congestion
observed in IP tunnels. Common tunnel deployments such as mobile
backhaul networks, VPNs and other IP-in-IP tunnels can become
congested as a result of sustained high load.
Network providers use a number of methods to deal with high load
conditions, including proper network dimensioning, policies for
preferential flow treatment and selective offloading, among others.
The mechanism proposed in this document is expected to complement
them and to provide congestion information that allows better
policies and decisions to be made.

The model and general solution proposed in Section 4 consist of
identifying congestion marks set in the tunnel segment, and feeding
back the congestion information from the egress to the ingress of the
tunnel. Measuring congestion of a tunnel segment is based on counting
outer packet CE marks for packets that have ECT marks in the inner
packet. This proposal depends on statistical marking of congestion
and uses the method described in RFC 6040 [RFC6040], Appendix C.

In Section 5 the desired properties of the protocol that conveys the
congestion information are outlined, and IPFIX [RFC5101] is explored
further as a candidate protocol for these extensions.

2. Conventions and Terminology

2.1 Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].

2.2 Terminology

Tunnel: A channel over which encapsulated packets traverse a network.

Encapsulation: The process of adding control information to a packet
as it passes through the layered model.

Encapsulator: The tunnel endpoint function that adds an outer IP
header to tunnel a packet. The encapsulator is considered the
"ingress" of the tunnel.

Decapsulator: The tunnel endpoint function that removes an outer IP
header from a tunneled packet. The decapsulator is considered the
"egress" of the tunnel.

Outer header: The header added to encapsulate a tunneled packet.

Inner header: The header encapsulated by the outer header.
E2E: End to End.

VPN: Virtual Private Network, a technology for using the Internet or
another intermediate network to connect computers to isolated remote
computer networks that would otherwise be inaccessible.

GRE: Generic Routing Encapsulation.

IPFIX: IP Flow Information Export, an IETF protocol to export flow
information from routers and other devices.

RED: Random Early Detection.

NFV: Network Functions Virtualization, an alternative design approach
for building complex IT applications, particularly in the
telecommunications and service provider industries, that virtualizes
entire classes of function into building blocks that may be
connected, or chained, together to create services.

VNF: Virtualized Network Function, which may consist of one or more
virtual machines running different software and processes, and which
forms the building block for NFV.

SFC: Service Function Chain, a group of VNFs connected in a specific
sequence/map using the NFV approach in order to deliver a specific
service.

3. Problem Statement

Congestion control plays a significant role in network performance
management, and sustained congestion can degrade the subscriber's
experience. Current solutions to network congestion mainly rely on
end-to-end methods, e.g. ECN [RFC3168], in which the traffic sender
is in charge of reducing its rate when the network is congested.
However, depending on end hosts to resolve congestion is not always
reliable: some end hosts may not support ECN, and even where end
hosts support ECN, some traffic, e.g. UDP-based traffic, may not use
it. Although the congestion happens in the operator's network, if the
congestion information is invisible to the operator, it is hard for
network administration to act on the traffic that causes the
congestion.
To improve network performance, the operator should take the network
congestion situation into account in traffic management.

Many kinds of tunnels are widely deployed in current networks, and in
some scenarios all traffic is transmitted through designated
tunnel(s). Because the ingress and egress of a tunnel are usually
deployed by the operator, it is easy for the operator to enforce its
policy there, for example gating, flow control and dropping. The
tunnel feedback mechanism should make it feasible for the operator to
collect congestion information for the encapsulated segment. After
obtaining congestion information, the operator can take it into
consideration when making traffic management policy at the tunnel
ingress.

RFC 6040 specifies how ECN should be handled for tunneling. In
addition, RFC 6040, Appendix C provides guidance on calculating the
congestion experienced in the tunnel itself. However, there is no
standardized mechanism by which the congestion information inside the
tunnel can be fed back from the egress to the ingress router.

In the following sub-sections, some network tunnel scenarios are
discussed.

3.1 3GPP network scenario

Tunnels, including GRE [RFC2784], GTP [TS29.060], IP-in-IP [RFC2003]
and IPSec [RFC4301], are widely deployed in 3GPP networks. In 3GPP
networks, tunnels carry end-user flows within the backhaul network,
as shown in Figure 1. IP backhaul networks such as those of mobile
networks are provisioned and managed to provide the subscribed levels
of end-user service. These networks are traffic engineered, and have
defined mechanisms for providing differentiated services and QoS per
user or flow. Policies that configure per-user flow attributes in
these networks have traditionally been based on monitoring and static
configuration.
Currently, these networks are increasingly used for applications that
demand high bandwidth. The nature of the flows and the length of
end-user sessions can lead to significant variability in aggregate
bandwidth demands and latency. In such cases, it would be useful to
have more dynamic feedback of congestion information. In addition,
while the eNB, S-GW and P-GW are administered by one mobile operator,
the mobile backhaul that carries the IP/UDP/GTP encapsulation is
usually administered by a backhaul service operator. This aggregate
congestion feedback could be used to determine flow handling and
admission control.

              \|/
               |
               |
             +-|---+           +------+         +------+
  +--+       |     |  Tunnel1  |      | Tunnel2 |      |  Ext
  |UE|-(RAN)-| eNB |===========| S-GW |=========| P-GW |--------
  +--+       |     |    RAN    |      |  Core   |      | Network
             +-+---+  Backhaul +---+--+ Network +---+--+

          Figure 1: Example - Mobile Network and Tunnels

3.2 Network Function Virtualization Scenario

Telecoms networks contain an increasing variety of proprietary
hardware appliances, which makes launching new network services
increasingly difficult and adds complexity to integrating and
deploying these appliances in a network.

Network Functions Virtualisation (NFV) aims to address these problems
by decoupling the software from dedicated hardware platforms and
running it on a range of industry standard server hardware for
various network services, through IT virtualization technology that
can be moved to, or instantiated in, various locations in the network
as required. In this way, it is expected to provide significant
benefits for network operators (reduced expenditures for network
construction and maintenance) and their customers (shortened
time-to-market for new network services).

Furthermore, service functions are preferably deployed and managed in
a data center manner, rather than being inserted on the
data-forwarding path between communicating peers as they are today.
The SFC WG is currently working on a new framework
[I-D.boucadair-sfc-framework] to cope with this highly dynamic
routing problem, in which the data traffic of a network service must
traverse a group of virtualized network function nodes (VNFs), each
of which could be applied at any layer within the network protocol
stack (network layer, transport layer, application layer, etc.).

As shown in Figure 2, in an SFC-enabled domain (e.g. within or across
a network operator's deployed data centers), a PDP (Policy Decision
Point) is the central entity responsible for maintaining SFC Policy
Tables (rules by which the boundary nodes decide which IP flow
traverses which service function path), and for enforcing appropriate
policies in SF Nodes and SFC Boundary Nodes.

Beginning at the Ingress node, at each hop of a given service
function path (as decided by a matched SFC policy rule/map), if the
next function node is not an immediate (L3) neighbor, packets are
encapsulated and forwarded to the corresponding downstream function
node, as shown in Figure 3.

. . . . . . . . . . . . . . . . . . . . . . . . . .
.             SFC Policy Enforcement              .
.               +-------+                         .
.               |       |-----------------+       .
.       +-------|  PDP  |                 |       .
.       |       |       |-------+         |       .
.       |       +-------+       |         |       .
. . . . | . . . . . | . . . . . | . . . . | . . . .
. . . . | . . . . . | . . . . . | . . . . | . . . .
.       |           |           |         |       .
.       v           v           v         v       .
.  +---------+ +---------+  +-------+ +-------+   .
.  |SFC_BN_1 | |SFC_BN_n |  | SF_1  | | SF_m  |   .
.  +---------+ +---------+  +-------+ +-------+   .
.               SFC-enabled Domain                .
. . . . . . . . . . . . . . . . . . . . . . . . . .

          Figure 2: SFC Policy Enforcement Scheme

                     Network Service
+----------+           +----------+           +----------+
|  VNF#1   | tunnel#1  |  VNF#2   |  tunnels  |  VNF#n   |
| Instance |-----------| Instance |- ... ... -| Instance |
+----------+           +----------+           +----------+
                            ^
                            | Virtualization
+--------------------------------------------------------+
|                Virtualization Platform                 |
+--------------------------------------------------------+

  Figure 3: Example - Mobile Network service chaining and Tunnels

However, running VNFs on commodity platforms can introduce additional
points of failure beyond those inherent in a single specialized
server, and therefore poses additional challenges for reliability.

[I-D.zong-vnfpool-problem-statement] proposes pooling techniques in
response, which require maintaining a backup mapping among the
running VNF instances of a given service function, and choosing among
them for a specific data flow.

Network capacity could be used more efficiently in case of local
congestion if this choice were based on ECN feedback as well as on
the running status and/or physical resource availability of a
candidate VNF instance.

3.3 Data Center Tenancy Scenario

In a multi-tenant data center, network resources are shared among
several tenants. To provide functional isolation and at the same time
guarantee scalability for tenants, tunnel-based isolation mechanisms,
e.g. VxLAN and STT, are used.

In this scenario, the hypervisor or vSwitch acts as the tunnel
endpoint for the traffic between VMs, and the VMs are unaware of the
tunnels. In other words, congestion indications such as ECN marks set
by network entities of the data center are not visible to the VMs. To
deal with this situation, two solutions could be used:

Solution 1: Using tunnel translation, the hypervisor or vSwitch marks
the inner IP header according to the ECN field in the outer IP header
before it transmits packets to the VM.
Solution 2: Using the congestion control mechanism provided in this
document between hypervisors or vSwitches to perform congestion
control for the VMs' traffic.

4. Congestion Control Model

This section presents the basic congestion control model; each aspect
of the model is described in detail in the following subsections.

The congestion control model provides the network administrator with
a method to manage the data traffic in its network domain. The basic
model consists of the following components: Ingress, Egress,
Feedback, Meter, Collector and Manager.

As shown in Figure 4, network traffic enters the tunnel through the
tunnel ingress and passes through en-route routers, which mark
packets according to the ECN mechanism specified in RFC3168, to the
tunnel egress. The egress collects the congestion level information
encountered in the tunnel and feeds it back to the corresponding
ingress. After receiving the congestion information, the ingress
takes actions to control the traffic passing through the path between
the ingress and egress, in order to reduce the congestion level in
the tunnel.

At the egress, a module named Meter is used to estimate the
congestion level in the tunnel, as described in Section 4.1. A
congestion information feedback module, called Feedback, is used to
control the congestion information feedback procedure.

The Meter module in the Egress node counts the congestion marks it
receives. The Feedback module calculates the amount of congestion and
feeds back the congestion information to the Ingress node.

The Collector at the Ingress receives the congestion information that
is fed back from the Feedback module.
The Manager implements functions such as admission control and
traffic engineering according to the congestion level experienced in
the tunnel, in order to control the traffic and reduce the congestion
level. The detailed actions taken by the Manager are out of the scope
of this document.

                    congestion feedback signal
             #########################################
       +-----#-------+                        +------#----+
       |     #       |                        |      #    |
       |     #       |                        |      #    |
       |     V       |                        |      #    |
       | +---------+ |    +--------------+    | +--------+|
       | |Collector| |    |              |    | |Meter   ||
traffic| +---------+ |    |              |    | +-----+--+|traffic
======>| |Manager  | |======================> | |Feedback||======>
       | +---------+ |    |   Routers    |    | +--------+|
       |             |    |(ECN-enabled) |    |           |
       +-------------+    +--------------+    +-----------+

                  Figure 4: Basic Feedback Model

To support traffic management and congestion information feedback in
a tunnel, this document discusses three main issues: calculation of
congestion level information, feeding back the congestion information
from egress to ingress, and implementation of congestion control.

The tunnel ingress and egress are assumed to be compliant with
RFC6040, and the tunnel interior routers are assumed to be compliant
with RFC3168. In addition, it should be noted that these tunnels may
carry ECT or Not-ECT traffic. A well defined mechanism for aggregate
congestion calculation should be able to work in the presence of all
kinds of traffic and would benefit from a common feedback mechanism
and protocol.

4.1 Congestion Calculation

This section discusses how to calculate the congestion level
experienced in the tunnel and provides an example of the calculation.
In this document, calculation of congestion in the tunnel is based on
the method described in RFC 6040, Appendix C.

The egress can calculate congestion using moving averages.
The proportion of packets not marked in the inner header that have a
CE marking in the outer header is considered to have experienced
congestion in the tunnel. Note that these packets are ECN capable and
were not congestion-marked before entering the tunnel. Since routers
implementing RED randomly select a percentage of packets to mark,
this method can be effectively used to expose congestion in the
tunnel.

When the ingress is RFC6040 compliant, the packets collected by the
egress can be divided into 4 categories, shown in Figure 5. The tag
before "|" stands for the ECN field in the outer header, and the tag
after "|" stands for the ECN field in the inner header.

o "Not-ECT|Not-ECT" indicates traffic that does not support ECN, for
  example UDP and Not-ECT marked TCP;

o "CE|CE" indicates ECN capable packets that were CE-marked before
  entering the tunnel;

o "CE|ECT" indicates ECN capable packets that were CE-marked in the
  tunnel;

o "ECT|ECT" indicates ECN capable packets that have not experienced
  congestion in the tunnel (or outside the tunnel).

+--------------------------+
|     Not-ECT|Not-ECT      |
+--------------------------+
|          CE|CE           |
+--------------------------+
|          CE|ECT          |
+--------------------------+
|         ECT|ECT          |
+--------------------------+

  Figure 5: ECN marking categories by outer/inner packet

Out of the total number of packets, if the quantity of CE|ECT packets
is A and the quantity of ECT|ECT packets is B, then the congestion
level (C) can be calculated as follows:

   C = A / (A + B)

As an example, consider 100 packets over which to calculate the
moving average as shown in RFC 6040, Appendix C. Say that 12 packets
have CE|ECT marks, indicating that they have experienced congestion
in the tunnel, and that 58 packets have ECT|ECT marks, indicating
that there was no congestion in either the tunnel or elsewhere. The
egress can calculate the congestion level as:

   C = 12 / (12 + 58) = 12/70 = 17.14% (about 17% congestion)

4.2 Data Information

This section discusses the congestion-related information that should
be conveyed from egress to ingress.

(1) Congestion volume. This information indicates how much congestion
    has been experienced in the tunnel by the traffic passing through
    it. Both ECT packets and Not-ECT packets pass through the tunnel.
    In case of congestion, ECT packets are CE-marked instead of
    dropped, and the tunnel egress can observe these CE-marked
    packets. Not-ECT packets, however, are dropped, and the tunnel
    egress cannot observe the dropped packets, so it is hard for the
    egress to calculate the precise number of congested packets.
    Following the analysis in Section 4.1, the congestion volume is
    preferably expressed as a percentage, e.g. 17.14%.

(2) Egress identifier. To control congestion in a certain tunnel, the
    ingress needs to know which traffic should be controlled,
    especially when the ingress has established tunnels with
    different egresses. The egress identifier should therefore be
    transmitted to the ingress together with the congestion volume.
    This identifier is usually the identifier of the tunnel or the
    address of the tunnel egress.

4.3 Congestion Feedback

This sub-section focuses on the feedback procedure. The congestion
feedback procedure conveys congestion status from egress to ingress.
The feedback protocol is discussed in the next section.

To reduce the load that this procedure places on the network,
especially when the feedback signal follows the same path as the data
traffic, feedback occurs only when congestion happens. In other
words, the egress does not send a feedback signal if there is no
congestion.
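
The calculation of Section 4.1 can be sketched in a few lines of
Python. This is only an illustration, not part of the proposed
mechanism: the function names, the category labels and the way the 30
remaining packets of the 100-packet example are split are assumptions
made here, and the classification assumes an RFC 6040 compliant
ingress in normal mode, so that the outer ECN field starts as a copy
of the inner one.

```python
from collections import Counter

# Two-bit ECN field values from RFC 3168.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def classify(outer_ecn: int, inner_ecn: int) -> str:
    """Map an (outer, inner) ECN pair to one of the four categories."""
    if inner_ecn == NOT_ECT:
        return "not_ect"            # traffic that does not support ECN
    if inner_ecn == CE:
        return "ce_before_tunnel"   # CE|CE: marked before the tunnel
    if outer_ecn == CE:
        return "ce_in_tunnel"       # CE|ECT: marked inside the tunnel
    return "unmarked"               # ECT|ECT: no congestion seen


def congestion_level(packets) -> float:
    """Return C = A / (A + B) for an iterable of (outer, inner) pairs."""
    counts = Counter(classify(outer, inner) for outer, inner in packets)
    a = counts["ce_in_tunnel"]  # A: CE|ECT packets
    b = counts["unmarked"]      # B: ECT|ECT packets
    return a / (a + b) if (a + b) else 0.0


# 12 CE|ECT and 58 ECT|ECT packets as in the example; the split of the
# other 30 packets (20 Not-ECT, 10 CE|CE) is assumed for illustration.
window = ([(CE, ECT_0)] * 12 + [(ECT_0, ECT_0)] * 58
          + [(NOT_ECT, NOT_ECT)] * 20 + [(CE, CE)] * 10)
level = congestion_level(window)
print(f"{level:.4f}")  # prints 0.1714
```

Packets that were Not-ECT or already CE-marked before the tunnel do
not enter the ratio, which is why the result depends only on A and B.
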
Also, the egress ignores ephemeral congestion and only feeds back
congestion information if the congestion level goes higher than a
specified threshold (TH1) and/or lasts for a specified period of time
(T1).

When the egress detects a congestion level higher than TH1 for a
period of T1, it sends the feedback signal to the ingress
periodically (with period T2) until the congestion level is lower
than TH1.

4.4 Congestion Control

After the ingress receives congestion information from the egress, it
takes actions to try to reduce the congestion. For example, the
ingress could choose to drop some packets or to perform certain
traffic engineering. Network policy usually has an impact on what
action is taken. For example, which packets to drop may be decided by
the agreement between subscriber and network administrator. The
specific choice of congestion alleviation measures taken by the
ingress is out of scope of this document.

The ingress continues to apply control actions until there is no
congestion feedback from the egress.

5. Congestion Feedback Protocol

Different networks deploy different tunnel protocols. Congestion
feedback can be done either by utilizing the existing tunnel protocol
or by using an alternative protocol. For example, in 3GPP networks
GTP (GPRS Tunnel Protocol) [TS29.060] is used as the tunnel protocol
to transmit traffic between network entities. Because GTP is easy to
extend with additional information elements, GTP itself would be a
good choice for congestion feedback there. In some other networks an
independent protocol could be used for congestion feedback, for
example in networks using tunnel protocols such as IP-in-IP [RFC2003]
or GRE [RFC2784].

Currently, this section mainly focuses on independent protocols for
congestion feedback.
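
As an illustration of the reporting rule in Section 4.3, the sketch
below shows one way an egress could decide when to send feedback. It
is a minimal sketch under assumptions: the class and method names are
hypothetical, the "and" reading of TH1 and T1 is used, time is passed
in explicitly as seconds, and a level exactly equal to TH1 is treated
as not congested, which the text above leaves open.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FeedbackTrigger:
    th1: float  # congestion-level threshold TH1 (fraction, e.g. 0.10)
    t1: float   # seconds the level must stay above TH1 before reporting
    t2: float   # seconds between reports while congestion persists
    above_since: Optional[float] = field(default=None, init=False)
    last_report: Optional[float] = field(default=None, init=False)

    def should_report(self, level: float, now: float) -> bool:
        """Return True when a feedback message should be sent at `now`."""
        if level <= self.th1:
            # No congestion (or it has cleared): reset and stay silent.
            self.above_since = None
            self.last_report = None
            return False
        if self.above_since is None:
            self.above_since = now
        if now - self.above_since < self.t1:
            return False  # still treated as ephemeral congestion
        if self.last_report is None or now - self.last_report >= self.t2:
            self.last_report = now
            return True
        return False


trigger = FeedbackTrigger(th1=0.10, t1=2.0, t2=2.0)
decisions = [trigger.should_report(0.17, t) for t in (0.0, 1.0, 2.0, 3.0, 4.0)]
print(decisions)  # [False, False, True, False, True]
```

With these example values the egress stays silent for the first two
seconds, then reports every two seconds until the level drops to TH1
or below, at which point the state is reset.
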
There are two choices for such an independent protocol: one is to
define a new dedicated protocol from scratch, the other is to
evaluate and reuse existing protocol(s).

5.1 Properties of Candidate Protocol

To feed back congestion efficiently, some properties are desirable in
the feedback protocol.

1. Congestion friendliness. The feedback traffic coexists with other
   traffic, so when congestion happens in the network, the feedback
   traffic should be reduced, so that the feedback itself does not
   congest the network further when the network is already getting
   congested. In other words, the feedback frequency should adjust to
   the network's congestion level.

2. Extensibility. The authors consider that using an existing
   protocol, or extensions to an existing protocol, is preferable.
   The ability of a protocol to support modular extensions to report
   the congestion level as feedback is a key attribute of the
   protocol under consideration.

3. Compactness. In different situations, different congestion
   information may need to be conveyed. To reduce network load, the
   information to be conveyed should be selectable, i.e. it should be
   possible to convey only the required information.

4. In/Out of band signal. The feedback message could follow the same
   path as the network data traffic, referred to as an in-band
   signal, or follow a different path from the network data traffic,
   referred to as an out-of-band signal.

5.2 IPFIX Extensions for Congestion Feedback

This section outlines IPFIX extensions for feedback of congestion.
The authors consider IPFIX a suitable protocol that is reasonably
easy to extend to carry tunnel congestion reporting. The Feedback
module acts as the IPFIX Exporter, and the Collector module acts as
the IPFIX Collector.
Because SCTP is the preferred transport for IPFIX, IPFIX has the
foundation for congestion-friendly behavior. In addition, SCTP allows
partially reliable delivery [RFC3758]: IPFIX message channels can be
tagged so that SCTP does not retransmit certain losses. This makes
feedback safe during high levels of congestion in the reverse
direction and helps avoid congestion collapse.

When congestion occurs in the network, the Exporter (Egress) can
reduce the IPFIX traffic, so that the feedback itself does not
congest the network further when the network is already getting
congested. When the Exporter detects network congestion, it can also
reduce the IPFIX traffic frequency to avoid adding congestion to the
network while still conveying the congestion status sufficiently.

Because the template mechanism in IPFIX is flexible, it allows the
export of only the required information. Sending only the required
information can also reduce network load.

The basic procedure for feedback using IPFIX is as follows:

(1) The exporter informs the collector how to interpret the IEs in
    the IPFIX message by using a template. The collector accepts the
    template passively; which IEs to send is configured by other
    means that are not included in the IPFIX specification.

(2) The exporter meters the traffic and sends the congestion level to
    the collector.

Congestion feedback using IPFIX is shown in the figures below. There
are two variations of the congestion feedback model using IPFIX. In
the first one, shown in Figure 6(a), congestion information is sent
directly from egress to ingress, and the ingress makes decisions
according to this information. In the second case, shown in Figure
6(b), congestion information is sent to a mediation controller
instead of the tunnel ingress; the controller is in charge of making
decisions according to network congestion and of controlling the
behavior of the ingress node, for example by reducing traffic or
forbidding new traffic flows.
In this model the congestion information from egress to controller is
conveyed by IPFIX, but how the controller controls the behavior of
the ingress is out of scope of this document.

                        IPFIX
     |-----------------------------------------|
     |                                         |
     |                                         |
     |                                         V
+----------+           tunnel            +-----------+
|Egress    |=============================|Ingress    |
|(Exporter)|                             |(Collector)|
+----------+                             +-----------+

                  (a) Direct Feedback.

       IPFIX   +-----------+
     --------->|Controller |#####################
     |         |(Collector)|                    #
     |         +-----------+                    #
     |                                          #
+----------+            tunnel            +-----V-+
|Egress    |==============================|Ingress|
|(Exporter)|                              +-------+
+----------+

                  (b) Mediated Feedback.

          Figure 6: IPFIX Congestion Feedback Models

To support feeding back congestion information, some extensions to
the IPFIX protocol are necessary. Based on the congestion-related
information defined in Section 4.2 (Data Information), a new IE
conveying the congestion level is defined for IPFIX.

Definition of the new IE indicating the congestion level:

   Description: The congestion level calculated by the exporter.
   Abstract Data Type: float32
   Data Type Semantics: quantity
   ElementId: TBD
   Status: current

The example below shows how IPFIX can be used for congestion
feedback.

(1) Sending Template Set

The exporter uses a Template Set to inform the collector how to
interpret the IEs in the following Data Set. The congestion level is
a float32 and is therefore shown with a field length of 4 octets.

+------------------------+--------------------+
|Set ID=2                |Length=n            |
+------------------------+--------------------+
|Template ID=257         |Field Count=m       |
+------------------------+--------------------+
|exporterIPv4Address=130 |Field Length=4      |
+------------------------+--------------------+
|collectorIPv4Address=211|Field Length=4      |
+------------------------+--------------------+
|CongestionLevel=TBD1    |Field Length=4      |
+---------------------------------------------+
|Enterprise Number=TBD2                       |
+---------------------------------------------+

(2) Sending Data Set

The exporter meters the traffic and sends the congestion information
to the collector in a Data Set.

+------------------+-------------------+
|Set ID=257        |Length=n           |
+--------------------------------------+
|192.0.2.12                            |
+--------------------------------------+
|192.0.2.34                            |
+--------------------------------------+
|0.1714                                |
+--------------------------------------+

+--------+                            +---------+
|Exporter|                            |Collector|
+--------+                            +---------+
    |                                      |
    |                                      |
    |       (1)Sending Template Set        |
    |------------------------------------->|
    |                                      |
+--------+                                 |
|metering|                                 |
+--------+                                 |
    |       (2)Sending Data Set            |
    |------------------------------------->|
    |                  .                   |
    |                  .                   |
    |                  .                   |
    |                                      |
    |                                      |

          Figure 7: IPFIX Congestion Flow

Before sending congestion information to the collector, the exporter
sends a Template Set to the Collector. The Template Set specifies the
structure and semantics of the subsequent Data Set containing
congestion-related information. The Collector interprets the Data
Sets that follow according to the Template Set that was sent
previously.

The Exporting Process transmits the Template Set in advance of any
Data Sets that use that Template ID, to help ensure that the
Collector has the Template Record before receiving the first Data
Record. Data Records that correspond to a Template Record may appear
in the same and/or subsequent IPFIX Message(s).
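
The Template Set and Data Set of the example can be sketched as byte
strings with Python's struct module. This is an illustration under
assumptions, not a complete exporter: the IPFIX Message Header and
the transport are omitted, the element ID 1 and enterprise number
32473 are placeholders standing in for TBD1 and TBD2, the congestion
level is encoded as a 4-octet float32, and the function names are
hypothetical.

```python
import socket
import struct

TEMPLATE_SET_ID = 2            # Set ID reserved for Template Sets
TEMPLATE_ID = 257              # Template ID used in the example
EXPORTER_IPV4_ADDRESS = 130    # IE exporterIPv4Address
COLLECTOR_IPV4_ADDRESS = 211   # IE collectorIPv4Address
CONGESTION_LEVEL_IE = 1        # placeholder for TBD1 (assumption)
ENTERPRISE_NUMBER = 32473      # placeholder for TBD2 (assumption)
ENTERPRISE_BIT = 0x8000        # marks an enterprise-specific IE


def template_set() -> bytes:
    """Build the Template Set: three field specifiers, network order."""
    record = struct.pack("!HH", TEMPLATE_ID, 3)
    record += struct.pack("!HH", EXPORTER_IPV4_ADDRESS, 4)
    record += struct.pack("!HH", COLLECTOR_IPV4_ADDRESS, 4)
    # Enterprise-specific field: E bit set, then the enterprise number.
    record += struct.pack("!HHI", ENTERPRISE_BIT | CONGESTION_LEVEL_IE,
                          4, ENTERPRISE_NUMBER)
    return struct.pack("!HH", TEMPLATE_SET_ID, 4 + len(record)) + record


def data_set(exporter: str, collector: str, level: float) -> bytes:
    """Build a Data Set with one record matching the template above."""
    record = (socket.inet_aton(exporter) + socket.inet_aton(collector)
              + struct.pack("!f", level))
    return struct.pack("!HH", TEMPLATE_ID, 4 + len(record)) + record


def parse_data_set(data: bytes):
    """Decode a Data Set as the Collector would, using the template."""
    set_id, length = struct.unpack("!HH", data[:4])
    exporter, collector, level = struct.unpack("!4s4sf", data[4:length])
    return (set_id, socket.inet_ntoa(exporter),
            socket.inet_ntoa(collector), level)


tmpl = template_set()
data = data_set("192.0.2.12", "192.0.2.34", 0.1714)
set_id, exporter, collector, level = parse_data_set(data)
print(len(tmpl), len(data), set_id, exporter, collector, round(level, 4))
```

The Data Set carries only the three configured fields, which reflects
the compactness property of Section 5.1: the template decides what is
exported, and nothing else is sent.
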

   The Exporter meters the traffic passing through it and generates
   flow records. At this point, the Exporter may cache the records and
   then send cumulative congestion information to the Collector. When
   the Exporter detects that the network is heavily congested, it can
   reduce the feedback frequency to avoid adding more congestion to the
   network.

   On receiving congestion-related information, the Collector makes
   decisions to control the traffic entering the tunnel in order to
   reduce tunnel congestion.

5.3 Other Protocols

   A thorough evaluation of other protocols has not been performed at
   this time.

6. Benefits

   This section briefly discusses the benefits that tunnel congestion
   control would bring.

   Tunnel congestion control is a kind of local congestion control:
   each tunnel is treated as an independent administrative domain in
   terms of congestion feedback and control, and the control responds
   only to congestion that occurs in the tunnel. Tunnel congestion
   control is complementary to end-to-end (e2e) ECN control.

   Tunnel congestion feedback provides the network administrator with
   network congestion level information that can be used as an input
   for local network management, rather than relying solely on e2e
   congestion control or blind traffic throttling.

   If the tunnel is congested, allowing new traffic to enter wastes
   resources, because the new packets may eventually be dropped in the
   tunnel. It is more efficient to control new traffic at the ingress.

7. Security Considerations

   This document describes tunnel congestion calculation and feedback.
   For feeding back congestion, the security mechanisms of IPFIX are
   expected to be sufficient. No additional security concerns are
   expected.

8. IANA Considerations

   IANA assignment of parameters for the IPFIX extension may need to be
   considered in this document.

9. References

9.1 Normative References

   [RFC2003]  Perkins, C., "IP Encapsulation within IP", RFC 2003,
              October 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
              Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
              March 2000.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, September 2001.

   [RFC3758]  Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P.
              Conrad, "Stream Control Transmission Protocol (SCTP)
              Partial Reliability Extension", RFC 3758, May 2004.

   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
              Internet Protocol", RFC 4301, December 2005.

   [RFC5101]  Claise, B., Ed., "Specification of the IP Flow
              Information Export (IPFIX) Protocol for the Exchange of
              IP Traffic Flow Information", RFC 5101, January 2008.

   [RFC6040]  Briscoe, B., "Tunnelling of Explicit Congestion
              Notification", RFC 6040, November 2010.

   [I-D.boucadair-sfc-framework]
              Boucadair, M., et al., "Service Function Chaining:
              Framework & Architecture",
              draft-boucadair-sfc-framework-00 (work in progress),
              October 2013.

   [I-D.zong-vnfpool-problem-statement]
              Zong, N., et al., "Virtualized Network Function (VNF)
              Pool Problem Statement",
              draft-zong-vnfpool-problem-statement-02 (work in
              progress), January 2014.

9.2 Informative References

   [TS29.060] 3GPP TS 29.060: "General Packet Radio Service (GPRS);
              GPRS Tunnelling Protocol (GTP) across the Gn and Gp
              interface".

Authors' Addresses

   Xinpeng Wei
   Beiqing Rd. Z-park No.156, Haidian District,
   Beijing, 100095, P. R. China
   E-mail: weixinpeng@huawei.com

   Zhu Lei
   Beiqing Rd. Z-park No.156, Haidian District,
   Beijing, 100095, P. R. China
   E-mail: lei.zhu@huawei.com

   Lingli Deng
   Beijing, 100095, P. R. China
   E-mail: denglingli@gmail.com