Internet Engineering Task Force                          G. Karagiannis
Internet-Draft                                     University of Twente
Intended status: Informational                              L. Westberg
Expires: September 8, 2010                            G. Apostolopoulos
                                                               Ericsson
                                                             A. Holtzer
                                                                    TNO
                                                          March 8, 2010

       PCN Boundary Node Behaviour for the HOSE Mode of Operation
              draft-karagiannis-pcn-hose-edge-behaviour-01

Abstract

   Pre-congestion notification (PCN) is a means for protecting quality
   of service for inelastic traffic admitted to a Diffserv domain.  The
   overall PCN architecture is described in RFC 5559.  This memo is one
   of a series describing possible boundary node behaviours for a PCN
   domain.  The behaviour described here is denoted as the HOSE model.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 8, 2010.

Karagiannis, et al.    Expires September 8, 2010               [Page 1]
Internet-Draft    PCN HOSE Boundary Node Behaviour           March 2010

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document.  Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   BSD License.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Table of Contents

   1.  Introduction
     1.1.  Terminology
   2.  Assumed Core Network Behaviour for HOSE
   3.  Node Behaviours
     3.1.  Overview
     3.2.  Behaviour of the PCN-Egress-Node
       3.2.1.  PCN-Egress-Node Role in Normal and Flow Admission
       3.2.2.  PCN-Egress-Node Role in Flow Termination
     3.3.  Behaviour of the PCN decision point
       3.3.1.  PCN decision point Role in Normal and Flow Admission
       3.3.2.  PCN decision point Role in Flow Termination
     3.4.  Behaviour of the PCN-Ingress-Node
       3.4.1.  PCN-Ingress-Node Role in Flow Admission
       3.4.2.  PCN-Ingress-Node Role in Flow Termination
   4.  Specification of Diffserv Per-Domain Behaviour
     4.1.  Applicability
     4.2.  Technical Specification
     4.3.  Attributes
     4.4.  Parameters
     4.5.  Assumptions
     4.6.  Example Uses
     4.7.  Environmental Concerns
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   The main objective of Pre-Congestion Notification (PCN) is to
   support the quality of service (QoS) of inelastic flows within a
   Diffserv domain in a simple, scalable, and robust fashion.  Two
   mechanisms are used: admission control and flow termination.
   Admission control is used to decide whether to admit or block a new
   flow request, while flow termination is used in abnormal
   circumstances to decide whether to terminate some of the existing
   flows.  To support these two features, the overall rate of
   PCN-traffic is metered on every link in the domain, and PCN-packets
   are appropriately marked when certain configured rates are exceeded.
   These configured rates are below the rate of the link, thus
   providing notification to boundary nodes about overloads before any
   congestion occurs (hence "pre-congestion" notification).  The level
   of marking allows boundary nodes to make decisions about whether to
   admit or terminate flows.  For more details see [RFC5559].

   Boundary node behaviours specify a detailed set of algorithms and
   edge node behaviours used to implement the PCN mechanisms.  Since
   the algorithms depend on specific metering and marking behaviour at
   the interior nodes, it is also necessary to specify the assumptions
   made about interior node behaviour.
   Finally, because PCN uses DSCP values to carry its markings, a
   specification of boundary node behaviour must include the per-domain
   behaviour (PDB) template specified in [RFC3086], filled out with the
   appropriate content.  This document describes this behaviour for the
   HOSE model of operation; see, e.g., [DuGo99].

   In this document the term HOSE refers to the aggregation of incoming
   traffic from all ingress edges, associated with one traffic class
   (i.e., one PHB), towards one egress edge.  This type of HOSE model
   is equivalent to the Multipoint-to-Point (MP2P) type of aggregation.

   The HOSE model enforces bandwidth limits without the need to
   maintain ingress-egress-aggregate state for each ingress-egress
   pair.  Instead, every edge maintains one aggregated state per
   traffic class, i.e., per PHB (Per Hop Behaviour), used in the PCN
   domain.  Moreover, the HOSE model is able to provide solutions for
   the flow-based ECMP (Equal Cost Multi Path) problem for both the
   admission control and the flow termination procedures.  The
   assumption made here is that ECMP performs flow-based multipath
   routing; see [RFC2991] and [RFC2992].

1.1.  Terminology

   In addition to the terms defined in [RFC5559], this document uses
   the following terms:

   PCN decision point
      The node that decides which flows to admit and which to
      terminate.  In a given network deployment, this may be the egress
      node or a centralized control node.  Regardless of the location
      of the decision point, the ingress node is the point where the
      decisions are enforced.

   Admission block state
      The state ("admit" or "block") derived by the PCN-egress-node
      based on PCN packet marking statistics.

   Flow termination state
      The operating state of the PCN edges during periods of severe
      overload and congestion.
   Normal state
      The operating state of the PCN edges during periods when they are
      operating neither in Admission block state nor in Flow
      termination state.

   Admission block decision threshold rate
      A rate of ThM-marked packets belonging to one PHB.  The
      PCN-egress-node compares it with the measured rate of ThM-marked
      packets that it receives for the same PHB.  If the measured ThM
      rate is higher than this threshold rate, then the Normal state
      changes to Admission block state.

   "egress_state"
      A two-bit parameter used to specify the operational state of the
      PCN-egress-node:

         Egress Normal state:            "egress_state" = 0
         Egress Admission block state:   "egress_state" = 1
         Egress Flow termination state:  "egress_state" = 2

   "admit"
      A one-bit Boolean parameter used to specify whether a flow is
      admitted or rejected:

         Admit a flow:   "admit" = 1
         Reject a flow:  "admit" = 0

   "request_threshold_marked"
      A one-bit parameter used to specify whether an on-path admission
      control signaling request message is PCN marked:

         Request message is ThM (or ETM) marked:
            "request_threshold_marked" = 1
         Request message is PCN unmarked:
            "request_threshold_marked" = 0

2.  Assumed Core Network Behaviour for HOSE

   This section describes the behaviour assumed for nodes of the
   PCN-domain when acting in their role as PCN-interior-nodes.  The
   HOSE mode of operation assumes that:

   o  encoding of PCN status within individual packets is based on
      [draft-ietf-pcn-3-state-encoding-01] (or on
      [draft-ietf-pcn-3-in-1-encoding-01]), extended to provide a third
      PCN encoding state;
   o  the PCN domain satisfies the conditions specified in the
      applicable encoding extension document;

   o  each link has been configured with a PCN-threshold-rate having a
      value equal to the PCN-admissible-rate for the link;

   o  each link has been configured with a PCN-excess-rate having a
      value equal to the PCN-supportable-rate for the link;

   o  PCN-interior-nodes perform threshold-marking and
      excess-traffic-marking of packets according to the rules
      specified in [RFC5670], and any additional rules specified in the
      applicable encoding extension document, with the following
      recommendations:

      o  when the interior node is overloaded, it is RECOMMENDED that
         the interior node preferentially drop unmarked packets instead
         of marked packets.  This is needed because the marked packets
         are used at the egress to calculate the excess rate during
         flow termination.  The excess rate can be calculated
         accurately at the egress only when marked packets are not
         dropped in the core network.

      o  the signaling messages that pass through a PCN-interior-node
         are treated, with respect to PCN encoding, identically to the
         data packets processed by that node.  However, the signaling
         messages SHOULD be processed with a higher priority than data
         packets.  This gives the signaling messages a higher chance of
         not being dropped in situations of severe overload.

   According to [RFC5670], the encoding extension documents should
   specify the allowable transitions between marking states.  However,
   to be absolutely clear, these allowable transitions are specified
   here.  At any interior node, the following rules govern the
   permitted transitions:

   o  It MUST NOT change the not-PCN codepoint to any other codepoint.

   o  It MAY change any Not-marked codepoint to either the
      Threshold-marked or the Excess-traffic-marked codepoint.
   o  It MUST NOT change a Not-marked codepoint to the not-PCN
      codepoint.

   o  A Not-marked codepoint MUST NOT be changed to any other
      Not-marked codepoint.

   o  It MAY change the ThM codepoint to the ETM codepoint, but it MUST
      NOT change the ThM codepoint to any other codepoint.

   o  It MUST NOT change the ETM codepoint to any other codepoint.

   In every case a codepoint can remain unchanged.  The precise rules
   governing which valid transition to use are set out in [RFC5670].

3.  Node Behaviours

3.1.  Overview

   The HOSE model assumes that on-path admission control signalling
   messages, e.g., RSVP PATH, are used from a PCN-ingress-node towards
   a PCN-egress-node.

   The HOSE mode of operation supports flow admission based on the
   received rate of ThM-marked traffic belonging to the same PHB.  If
   the ThM-marked rate is higher than a predefined value, then the
   state at the PCN-egress-node changes from Normal to Admission block
   state.  The PCN-egress-node MUST be able to identify signalling
   request messages and to separate them from received data packets.

   A flow is rejected if the following two conditions both hold:

   o  the PCN decision point is informed that the PCN-egress-node
      operates in Admission block state;

   o  the PCN decision point is informed that the admission control
      signalling request message is ThM marked.

   The first condition ensures that the aggregation level of the ThM
   packets measured at the PCN-egress-node is high enough to establish
   reliably that the PCN-egress-node operates in the Admission block
   state.  The second condition ensures that the admission control
   signalling request message itself passed through a PCN-interior-node
   that is in a congestion situation.

   The PCN-egress-node informs the PCN decision point about the status
   of these two conditions.  If the two conditions are satisfied, then
   the PCN decision point informs the PCN-ingress-node that the flow is
   rejected.
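   As an illustration only, the rejection rule above can be written as
   a small Python predicate.  The names EgressState and reject_flow are
   hypothetical and not part of this document.  The sketch also treats
   the Flow termination state like the Admission block state for a
   marked request, which follows the reject expression in Section
   3.3.1 rather than the two conditions stated here; a request that is
   unmarked is never rejected by this sketch.

```python
from enum import IntEnum


class EgressState(IntEnum):
    """Values of the "egress_state" parameter (Section 1.1)."""
    NORMAL = 0
    ADMISSION_BLOCK = 1
    FLOW_TERMINATION = 2


def reject_flow(egress_state: EgressState,
                request_threshold_marked: bool) -> bool:
    """Return True when the decision point should reject the request.

    Assumption: a marked request is rejected in both the Admission block
    and the Flow termination state, as in the Section 3.3.1 expression.
    """
    egress_blocking = egress_state in (EgressState.ADMISSION_BLOCK,
                                       EgressState.FLOW_TERMINATION)
    return egress_blocking and request_threshold_marked
```

   For example, a ThM-marked request arriving while the egress is in
   Normal state is not rejected, because the first condition does not
   hold.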
   In the situation that the decision point is co-located with the
   PCN-egress-node, it is assumed that the PCN decision point tasks are
   accomplished by the PCN-egress-node.  Furthermore, in this
   situation, the PCN-ingress-node can be notified that the flow is
   rejected by means of an on-path admission control signalling reply
   message, e.g., an RSVP PATHErr message.

   If these two conditions are not both satisfied, then the flow is
   accepted and the PCN-ingress-node is notified to accept the flow.
   When the decision point is co-located with the PCN-egress-node, the
   PCN-ingress-node can be notified that the flow is accepted by means
   of an on-path admission control signalling reply message, e.g., an
   RSVP RESV message.

   Flow termination is triggered when the PCN-egress-node receives
   excess-traffic-marked (ETM) packets belonging to the same PHB.  The
   PCN-egress-node calculates the rate of PCN-traffic contained in
   received packets that are excess-traffic-marked.  Moreover, the
   PCN-egress-node records the address identity of the PCN-ingress-node
   and the identity of all the flows arriving at the PCN-egress-node
   that are ETM marked.  Only these flows, which are the ones passing
   through the severely overloaded PCN-interior-node(s), are candidates
   for termination.

   When operating in flow termination, the PCN-egress-node sends the
   rate of the received excess-traffic-marked packets to the PCN
   decision point at the end of each fixed-length measurement interval.
   In addition, the PCN-egress-node sends the identity and the reserved
   bandwidth of all the flows arriving at the PCN-egress-node that are
   ETM marked.
   The edge nodes keep per-flow state and therefore know the bandwidth
   that each flow has reserved.

   The PCN decision point uses the received excess-traffic-marked (ETM)
   rate to calculate how many flows have to be terminated.  By using
   the information received from the PCN-egress-node, it can translate
   the ETM rate to be terminated into a number of flows.  The PCN
   decision point informs each PCN-ingress-node about which flows
   should be terminated by sending each of them a list of the flows
   that have been selected for termination.

   In the situation that the decision point is co-located with the
   PCN-egress-node, it is assumed that the PCN-egress-node accomplishes
   the PCN decision point tasks described above.  Furthermore, in this
   case, the list of flows that have been selected for termination can
   be sent in a reliable way using an on-path flow termination
   signalling notify message, e.g., an RSVP-TE Notify message.  An
   alternative is that the PCN-egress-node sends a signalling notify
   message for each flow that has to be terminated.  The
   PCN-ingress-node, using admission control/flow termination
   signalling procedures, must terminate these flows.

   This HOSE edge behaviour is able to provide solutions for the ECMP
   (Equal Cost Multi Path) problem for both the admission control and
   the flow termination procedures, provided that ECMP performs
   flow-based multipath routing.

   ECMP [RFC2991] is a load balancing mechanism that allows packet
   flows to be routed via multiple redundant paths when they have the
   same (intermediate) destination and equal-cost routes to this
   destination exist.  ECMP is supported by various routing protocols,
   such as OSPF and IS-IS, and may therefore be present in transport
   networks that also support PCN functionality.  When PCN is used
   with, e.g.,
   the baseline-encoding [RFC5696], the interior-node topology of the
   transport network is hidden from the PCN edge nodes.  Therefore, in
   case of ECMP, the fact that several redundant paths may exist
   between a single ingress-egress pair will not be noticed by PCN.
   Instead, when packets of a flow are PCN-marked, the flow admission
   control mechanism of PCN may, for example, refuse new flows on all
   of those paths in the transport network, while in fact only one of
   them may be congested.

   The mechanism proposed in this HOSE edge behaviour makes it possible
   for the edge nodes to identify the paths that are actually congested
   and to block new flows on, or (in case of flow termination)
   terminate flows that use, only those particular paths.

   The assumption made here is that ECMP performs flow-based multipath
   routing using a deterministic algorithm, such that all packets
   associated with a flow and used in the mechanism, including probe
   packets, are sure to travel via the same path as the flow they are
   associated with.  This assumption is valid when, for example, the
   ECMP mechanism described in [RFC2991] and [RFC2992] is used.

3.2.  Behaviour of the PCN-Egress-Node

   The egress node observes all PCN-traffic that it receives, i.e.,
   ThM-marked traffic, excess-traffic-marked (ETM) traffic, and PCN
   unmarked traffic, in order to determine the state in which it is
   operating.  Based on this operation state, the PCN-egress-node can
   assist the PCN decision point to admit, reject, or terminate a flow.

   It is considered that the PCN-egress-node, in addition to the
   PCN-related functions described briefly in section 4.3 of [RFC5559],
   is able to support the following:

   o  it measures the following counts and rates during fixed-length
      measurement intervals with a duration of T, where T is suggested
      to be in the range of 50 to 100 ms:

      NM_count:
         Number of bytes of PCN-traffic contained in received packets
         which are neither threshold-marked nor excess-traffic-marked.
      ThM_count:
         Number of bytes of PCN-traffic contained in received packets
         which are threshold-marked.

      ETM_count:
         Number of bytes of PCN-traffic contained in received packets
         which are excess-traffic-marked.

      NM_rate:
         Rate of PCN-traffic contained in received packets which are
         neither threshold-marked nor excess-traffic-marked;
         NM_rate = NM_count/T.

      ThM_rate:
         Rate of PCN-traffic contained in received packets which are
         threshold-marked; ThM_rate = ThM_count/T.

      ETM_rate:
         Rate of PCN-traffic contained in received packets which are
         excess-traffic-marked; ETM_rate = ETM_count/T.

   o  it is configured with Ablock_TH, the admission block detection
      threshold rate.  This is a predefined ThM_rate value that is used
      to detect whether the PCN-egress-node should change from Normal
      state to Admission block state or vice versa.

   o  it is considered that the signaling messages sent from the
      PCN-ingress-node towards the PCN-egress-node follow the data
      path, i.e., the same communication path as the data packets sent
      from the same PCN-ingress-node towards the same PCN-egress-node.

   o  it is able to differentiate signaling messages from data packets.

   o  it is able to identify flows and to classify packets into flows.

   o  it is able to determine the identity (address information) of the
      PCN-ingress-node that forwarded each flow.  For example, in RSVP
      this can be provided using RSVP PHOP.

   o  it is considered that the signaling messages used for admission
      control and flow termination purposes are PCN encoded in the same
      way as data packets.  However, the signaling messages SHOULD be
      processed with a higher priority than data packets.  This gives
      the signaling messages a higher chance of not being dropped in
      situations of severe overload.

   The operation states and events in PCN-egress-nodes are shown in
   Figure 1.
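   The per-interval measurements listed above can be sketched in
   Python as follows.  This is a minimal illustration, not a normative
   implementation: the class IntervalCounters, its method names, and
   the marking labels "NM", "ThM" and "ETM" are hypothetical, and the
   interval T is passed in seconds.

```python
from dataclasses import dataclass


@dataclass
class IntervalCounters:
    """Byte counters for one measurement interval of duration T."""
    nm_count: int = 0   # bytes in packets neither ThM nor ETM marked
    thm_count: int = 0  # bytes in threshold-marked packets
    etm_count: int = 0  # bytes in excess-traffic-marked packets

    def add_packet(self, marking: str, size_bytes: int) -> None:
        """Account one received PCN packet under its marking."""
        if marking == "NM":
            self.nm_count += size_bytes
        elif marking == "ThM":
            self.thm_count += size_bytes
        elif marking == "ETM":
            self.etm_count += size_bytes
        else:
            raise ValueError(f"unknown PCN marking: {marking!r}")

    def rates(self, interval_s: float) -> dict:
        """Return NM_rate, ThM_rate and ETM_rate as count / T."""
        return {
            "NM_rate": self.nm_count / interval_s,
            "ThM_rate": self.thm_count / interval_s,
            "ETM_rate": self.etm_count / interval_s,
        }
```

   A fresh IntervalCounters object would be created at the start of
   each measurement interval, and rates() called with T at its end.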
          ---------------------------------------------
         |                  event B                     |
         |                                              V
     ----------            -------------           -------------
    | Normal   |  event A | Admission   | event B | Flow        |
    | state    |--------->| block       |-------->| Termination |
    |          |          | state       |         | state       |
     ----------            -------------           -------------
         ^                   |       ^                   |
         |      event C      |       |      event D      |
          -------------------         -------------------

                    Figure 1: States of operation

   o  event A: this event occurs when the PCN-egress-node measures a
      ThM rate that is higher than or equal to the admission block
      detection threshold rate (Ablock_TH).  Note that this predefined
      ThM threshold rate can be set to a default value equal to 1% of
      the capacity of the lowest-capacity link within the whole PCN
      domain.

   o  event B: this event occurs when the PCN-egress-node receives
      packets that are ETM marked.

   o  event C: this event occurs when the rate of incoming ThM
      bytes/packets decreases below Ablock_TH.

   o  event D: this event occurs when the egress does not receive any
      ETM-marked packets during an interval T.

   The following sections give details of the egress node operation in
   admission control and flow termination.

3.2.1.  PCN-Egress-Node Role in Normal and Flow Admission

   When the PCN-egress-node operates in Normal state or in Admission
   block state, the NM_count, NM_rate, ThM_count and ThM_rate variables
   are calculated in each measurement interval T.  Furthermore, the
   following situations can be identified:

   o  if the PCN-egress-node operates in Normal state and it receives
      an admission control signaling request message forwarded by a
      PCN-ingress-node to check whether a flow can be admitted, then
      the PCN-egress-node MUST inform the PCN decision point using the
      following parameters:

      o  the value of "egress_state" = 0 (Normal state)

      o  the address identity of the flow that initiated the admission
         control signaling request message
      o  the address identity of the PCN-ingress-node from which the
         admission control signaling request message has been sent

      o  the address identity of the PCN-egress-node that sends this
         information

   o  if the PCN-egress-node operates in Admission block state and it
      receives an admission control signaling request message that is
      PCN unmarked, then the PCN-egress-node MUST inform the PCN
      decision point using the following parameters:

      o  the value of "egress_state" = 1 (Admission block state)

      o  the value of "request_threshold_marked" = 0 (PCN unmarked)

      o  the address identity of the flow that initiated the admission
         control signaling request message

      o  the address identity of the PCN-ingress-node from which the
         admission control signaling request message has been sent

      o  the address identity of the PCN-egress-node that sends this
         information

   o  if the PCN-egress-node operates in either Admission block state
      or Flow termination state and it receives an admission control
      signaling request message that is ThM (or ETM) marked, then the
      PCN-egress-node MUST inform the PCN decision point using the
      following parameters:

      o  the value of "egress_state" = 1 (in Admission block state) or
         "egress_state" = 2 (in Flow termination state)

      o  the value of "request_threshold_marked" = 1 (ThM or ETM
         marked)

      o  the address identity of the flow that initiated the admission
         control signaling request message

      o  the address identity of the PCN-ingress-node from which the
         admission control signaling request message has been sent

      o  the address identity of the PCN-egress-node that sends this
         information

3.2.2.  PCN-Egress-Node Role in Flow Termination

   When the PCN-egress-node detects an ETM packet, it changes its
   operation state to Flow termination state.  As a result of this
   transition, it immediately resets NM_count and ThM_count and begins
   a new measurement interval.
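   As an illustration only, the transitions of Figure 1 (events A to
   D) could be evaluated as in the following Python sketch.  The
   function name next_egress_state and its parameters are
   hypothetical.  Two simplifications are assumed: the state is
   re-evaluated once at the end of each measurement interval, whereas
   this document switches to Flow termination state as soon as an ETM
   packet is detected; and event D is taken to lead back to the
   Admission block state, which is how the arrows in Figure 1 are read
   here, since the event description does not name the target state.

```python
NORMAL, ADMISSION_BLOCK, FLOW_TERMINATION = 0, 1, 2  # "egress_state" values


def next_egress_state(state: int, thm_rate: float, etm_bytes: int,
                      ablock_th: float) -> int:
    """Apply events A-D of Figure 1 for one measurement interval.

    state      current "egress_state"
    thm_rate   ThM_rate measured during the interval
    etm_bytes  ETM_count measured during the interval
    ablock_th  Ablock_TH, the admission block detection threshold rate
    """
    if etm_bytes > 0:
        # Event B: ETM-marked packets were received (or the node simply
        # stays in Flow termination state if it was already there).
        return FLOW_TERMINATION
    if state == FLOW_TERMINATION:
        # Event D: no ETM-marked packets during T (assumed target state).
        return ADMISSION_BLOCK
    if state == NORMAL and thm_rate >= ablock_th:
        return ADMISSION_BLOCK  # event A
    if state == ADMISSION_BLOCK and thm_rate < ablock_th:
        return NORMAL  # event C
    return state
```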
   On entering the Flow termination state, the PCN-egress-node also
   begins to collect the ETM_count and ETM_rate variables.  The
   ETM_rate variable represents the bandwidth that causes an overload
   on a communication path within the PCN domain.

   In flow termination, inaccuracies in excess rate measurements might
   occur due to the delay between the metering and marking event at
   the PCN-interior-nodes, the decisions made at the PCN-egress-nodes,
   and the termination of flows performed by the PCN-ingress-nodes;
   see section 6 in [CsTa05].  Furthermore, an additional trip time
   from the PCN-ingress-node to the overloaded PCN-interior-node can
   elapse before the overload at that node decreases and the overload
   situation is solved.  This is because, immediately before receiving
   the flow termination notification, the PCN-ingress-node may have
   sent out packets associated with the flows that were selected for
   termination.  If this were not taken into account,
   PCN-interior-nodes would continue marking packets until the
   overload situation is solved, and in the end more flows would be
   terminated than necessary, i.e., an over-reaction would take place.

   In order to compensate for these inaccuracies, the PCN-egress-nodes
   use a sliding window memory to keep track of the ETM_rate measured
   in a number of previous measurement intervals.  At the end of a
   measurement interval T, the bandwidth that needs to be terminated
   (denoted below as termination_PCN_marking_rate) is calculated as
   follows.  The ETM_rate value measured during one T interval is
   decreased by the sum of the values already stored in the sliding
   window memory, since that bandwidth is already being handled in the
   flow termination control loop.  The sliding window memory consists
   of an integer number of cells; its maximum size, i.e., the maximum
   number of cells, is denoted by n.
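   The subtraction described above can be sketched in Python as
   follows.  This is a minimal illustration under stated assumptions:
   the class name TerminationWindow is hypothetical, all cells are
   assumed to start at zero, the value stored in the newest cell is
   the calculated termination_PCN_marking_rate (as in the update rule
   given in the pseudo-code later in this section), and the result is
   not clamped at zero because this document does not say whether it
   should be.

```python
from collections import deque


class TerminationWindow:
    """Sliding window memory with n cells, all assumed to start at zero."""

    def __init__(self, n: int = 1) -> None:
        self.cells = deque([0.0] * n, maxlen=n)

    def update(self, etm_rate: float) -> float:
        """Return termination_PCN_marking_rate for the interval just ended.

        The measured ETM_rate is reduced by the sum of the stored cells,
        then the result becomes the newest cell.
        """
        termination_rate = etm_rate - sum(self.cells)
        # Appending to a full deque discards the oldest cell automatically.
        self.cells.append(termination_rate)
        return termination_rate
```

   With n = 2, for instance, measured ETM rates of 100, 150 and 150 in
   three consecutive intervals yield termination rates of 100, 50 and
   0, because bandwidth already reported for termination is not
   reported again.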
   Guidelines for configuring the sliding window parameters are given
   in [CsTa05].  However, based on several experiments performed with
   the sliding window applied at the PCN-egress-node, the recommended
   sliding window size at the egress is 1, i.e., n = 1.

   At the end of each measurement interval, the newest calculated
   ETM_rate is pushed into the memory with maximum size n, and the
   oldest cell is dropped.

   If M[i] is the ETM_rate stored in the i-th memory cell
   (i = [1..n]), then at the end of every fixed-length measurement
   interval the termination_PCN_marking_rate variable, which gives the
   bandwidth that has to be terminated, is calculated as follows:

      Sum_Mi = 0
      For i = 1 to n
      {
         Sum_Mi = Sum_Mi + M[i]
      }

      termination_PCN_marking_rate = ETM_rate - Sum_Mi

   where Sum_Mi is calculated as above.

   Next, the sliding memory is updated as follows:

      For i = 1..(n-1): M[i] = M[i+1]
      M[n] = termination_PCN_marking_rate

   The PCN-egress-node records the identity of the PCN-ingress-node
   that forwarded each flow, the ETM_rate, and the identity of all the
   flows with ETM marking arriving at the PCN-egress-node.

   Subsequently, at the end of each fixed-length measurement interval,
   the PCN-egress-node sends the bandwidth that has to be terminated,
   i.e., the termination_PCN_marking_rate variable, to the PCN decision
   point.  In addition to the termination_PCN_marking_rate information,
   the PCN-egress-node MUST send the following information:

   o  the value of "egress_state" = 2 (in Flow termination state)

   o  a list with: (1) flow identifiers, (2) reserved bandwidth

   o  the address identity of the PCN-egress-node that sends this
      information

   o  the address identity of the PCN-ingress-node through which each
      individual flow passed and for which excess-traffic-marked
      packets have been observed.
      These can be used by the decision point when it selects flows
      for termination.

3.3.  Behaviour of the PCN decision point

   The decision point can fulfil two roles, one for flow admission and
   the other for flow termination.

3.3.1.  PCN decision point Role in Normal and Flow Admission

   When the PCN decision point receives a report associated with an
   admission control signalling request message from the
   PCN-egress-node, it MUST parse and check the following parameters:

   o  If the following expression is TRUE, then the PCN decision point
      MUST accept the request:

         (egress_state == 0) OR
         [(egress_state == 1) AND (request_threshold_marked == 0)]

      In this case the PCN decision point uses the additional
      information that has been sent by the PCN-egress-node (see
      Section 3.2.1) and sends an admission control signaling reply
      message that carries the value "admit" = 1 towards the
      PCN-ingress-node that initiated the request.  In addition, the
      PCN decision point includes the address identity of the
      PCN-egress-node that sent the "egress_state" information.

      In the situation that the decision point is co-located with the
      PCN-egress-node, it is assumed that the PCN decision point tasks
      are accomplished by the PCN-egress-node.  Furthermore, in this
      situation, the PCN-ingress-node can be notified that the flow is
      accepted by means of an on-path admission control signalling
      reply message, e.g., an RSVP RESV message.
   o  If the following expression is TRUE, then the PCN decision point
      MUST reject the request:

         (request_threshold_marked == 1) AND
         [(egress_state == 1) OR (egress_state == 2)]

      In this case the PCN decision point uses the additional
      information that has been sent by the PCN-egress-node (see
      Section 3.2.1) and sends an admission control signaling reply
      message that carries the value "admit" = 0 towards the
      PCN-ingress-node that initiated the request.  In addition, the
      PCN decision point includes the address identity of the
      PCN-egress-node that sent the "egress_state" information.

      In the situation that the decision point is co-located with the
      PCN-egress-node, it is assumed that the PCN decision point tasks
      are accomplished by the PCN-egress-node.  Furthermore, in this
      situation, the PCN-ingress-node can be notified that the flow is
      rejected by means of an on-path admission control signalling
      reply message, e.g., an RSVP PATHErr message.

3.3.2.  PCN decision point Role in Flow Termination

   Not all operators will wish to deploy flow termination.  Hence
   deactivation of flow termination at the decision point MUST be a
   configurable option.

   When the PCN decision point receives a flow termination report
   message from the PCN-egress-node, it MUST parse and check all the
   received information:

   o  the bandwidth that has to be terminated, i.e.,
      termination_PCN_marking_rate

   o  the value of "egress_state" = 2 (in Flow termination state)

   o  a list with: (1) flow identifiers, (2) reserved bandwidth

   o  the address identity of the PCN-egress-node that sends this
      information

   o  the address identity of the PCN-ingress-node through which each
      individual flow passed and for which excess-traffic-marked
      packets have been observed.
   These can be used by the decision point when it selects flows for
   termination.

   The PCN decision point uses the above information to ensure that a
   PCN-egress-node operates in the Flow termination state and that
   only the flows that are received by the same PCN-egress-node and
   that pass through the severely overloaded interior node(s) are
   candidates for termination.  Furthermore, the reserved bandwidth
   parameter (per notified flow) is used to translate the calculated
   bandwidth to be terminated into a number of flows.

   The selection of the flows to be terminated is described in the
   pseudo-code given below, which is realized by the function denoted
   below as calculate_terminate_flows().

      terminated_bandwidth = 0;
      while terminated_bandwidth < termination_PCN_marking_rate
      {
         terminate_bandwidth_class =
            termination_PCN_marking_rate - terminated_bandwidth;
         calculate_terminate_flows();
         sum_bandwidth_terminate;
         terminated_bandwidth =
            sum_bandwidth_terminate + terminated_bandwidth;
      }

   The terminate_bandwidth_class variable represents the calculated
   bandwidth that has to be translated into a number of flows that
   should be terminated.  The calculate_terminate_flows() function
   uses the terminate_bandwidth_class value and translates this
   bandwidth value into the number of flows that have to be
   terminated.  Only the ETM marked flows, which are the ones passing
   through the severely overloaded interior node(s), are candidates
   for termination.

   After the flows to be terminated are selected, the
   sum_bandwidth_terminate value is calculated.  This is the sum of
   the bandwidth associated with the flows that will certainly be
   terminated.

   The constraint on finding the total number of flows that have to be
   terminated is that sum_bandwidth_terminate(priority_class) should
   be smaller than or approximately equal to the variable
   terminate_bandwidth_class.
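   The selection loop above can be sketched in Python as follows.
   This is only an illustration under stated assumptions, not a
   normative algorithm: the ReportedFlow record, the function
   select_flows_for_termination() and the greedy "largest reservation
   first" order are hypothetical choices, since this memo does not
   prescribe how calculate_terminate_flows() picks flows.  The sketch
   also adds an explicit exit for the case where no remaining
   candidate fits, which the pseudo-code leaves implicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedFlow:
    """One entry of the flow list in a termination report (hypothetical layout)."""
    flow_id: str
    reserved_bandwidth: float  # same unit as termination_pcn_marking_rate
    ingress_address: str


def calculate_terminate_flows(candidates, terminate_bandwidth_class):
    """Pick flows whose summed bandwidth stays at or below the target.

    Assumption: greedy, largest reservation first. The memo only
    requires the sum to be smaller than or approximately equal to
    terminate_bandwidth_class.
    """
    selected = []
    sum_bandwidth_terminate = 0.0
    ordered = sorted(candidates, key=lambda f: f.reserved_bandwidth, reverse=True)
    for flow in ordered:
        if sum_bandwidth_terminate + flow.reserved_bandwidth <= terminate_bandwidth_class:
            selected.append(flow)
            sum_bandwidth_terminate += flow.reserved_bandwidth
    return selected, sum_bandwidth_terminate


def select_flows_for_termination(reported_flows, termination_pcn_marking_rate):
    """Mirror the while loop of the pseudo-code.

    reported_flows is assumed to hold only the ETM marked flows
    reported by the PCN-egress-node.
    """
    remaining = list(reported_flows)
    chosen = []
    terminated_bandwidth = 0.0
    while terminated_bandwidth < termination_pcn_marking_rate:
        terminate_bandwidth_class = termination_pcn_marking_rate - terminated_bandwidth
        selected, sum_bandwidth_terminate = calculate_terminate_flows(
            remaining, terminate_bandwidth_class
        )
        if not selected:
            # Added safeguard (not in the memo): no remaining flow fits.
            break
        chosen.extend(selected)
        remaining = [f for f in remaining if f not in selected]
        terminated_bandwidth += sum_bandwidth_terminate
    return chosen, terminated_bandwidth
```

   With four reported flows of 400, 300, 200 and 100 bandwidth units
   and a termination_PCN_marking_rate of 650, this sketch selects the
   400 and 200 unit flows, so that 600 units are terminated and the
   constraint of not exceeding the requested rate is respected.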
   Finally, the PCN decision point informs each PCN-ingress-node about
   which flows should be terminated by sending to each of them a list
   with the flow IDs of the flows that have been selected for
   termination.

   If the decision point is co-located with the PCN-egress-node, it is
   assumed that the PCN-egress-node accomplishes the PCN decision
   point tasks described above.  Furthermore, in this case, the list
   of flows that have been selected for termination can be sent in a
   reliable way using an on-path flow termination signalling notify
   message, e.g., an RSVP-TE Notify message.  An alternative solution
   is that the PCN-egress-node sends a signalling notify message for
   each flow that has to be terminated.  The term "reliable" means in
   this context that the PCN-egress-node should be informed, e.g., by
   an acknowledgement, that the flow termination signalling notify
   message has been successfully received by the PCN-ingress-node.

3.4.  Behaviour of the PCN-Ingress-Node

   The PCN-related functions of the PCN-ingress-node are described
   briefly in Section 4.2 of [RFC5559].  This section focuses on the
   specific behaviour associated with admission and flow termination.

   It is considered that the PCN-ingress-node, in addition to the
   PCN-related functions described briefly in Section 4.2 of
   [RFC5559], is able to support the following:

   o  it is considered that a signaling protocol can be used for
      admission control and flow termination, to admit and reject new
      flows and terminate ongoing flows.  Furthermore, it is
      considered that the signaling messages use the same flow ID
      information and PCN encoding states as the data packets
      associated with these signalling messages.

   o  it is considered that the signaling messages sent from the
      PCN-ingress-node towards the PCN-egress-node follow the data
      path, i.e., the same communication path as the data packets sent
      from the same PCN-ingress-node towards the same PCN-egress-node.
   o  it is able to differentiate signaling messages from data
      packets.

   o  it is able to identify flows and to classify packets into flows.

   o  it is able to identify the identity (address information) of the
      PCN-egress-node that notifies the PCN-ingress-node about
      admission control and flow termination decisions.

   o  it is considered that the signaling messages that are used for
      admission control and flow termination purposes are PCN encoded
      in an identical way as data packets.  However, the signaling
      messages SHOULD be processed with a higher priority than data
      packets.  This gives the signaling messages a higher chance of
      not being dropped in situations of severe overload.

   The operation states and events in PCN-ingress-nodes are shown in
   Figure 2.

       ----------                 -------------
      | Normal   |   event B     | Flow        |
      | state    |-------------->| termination |
      |          |               | state       |
       ----------                 -------------
           ^                            |
           |         event E            |
            ----------------------------

        Figure 2: State of operation at a PCN-ingress-node

   The events used in Figure 2 and applied for PCN-ingress-nodes are:

   o  event B: this event occurs when the PCN-ingress-node receives a
      Flow termination signaling notification message, e.g., RSVP-TE
      Notify, from the PCN-egress-node/decision point indicating that
      one or more flows have to be terminated.  In the Flow
      termination state, if the PCN-ingress-node is able to identify,
      for each new admission flow request received from outside the
      PCN domain, the PCN-egress-node to which the request is
      destined, then the PCN-ingress-node SHOULD block all new
      admission flow requests received from outside the PCN domain
      that are destined to PCN-egress-nodes operating in the Flow
      termination state.  Otherwise, the PCN-ingress-node SHOULD block
      all new admission flow requests received from outside the PCN
      domain, regardless of the PCN-egress-node to which they are
      destined.
   o  event E: this event occurs once the signaling protocol informs
      the PCN-ingress-node that the termination of all notified flows
      is completed.

3.4.1.  PCN-Ingress-Node Role In Flow Admission

   The PCN-ingress-node receives an admission control signalling
   request message belonging to a signalling protocol that is external
   to PCN.  Subsequently, the PCN-ingress-node sends the admission
   control signalling request message towards the PCN-egress-node.

   When the PCN-ingress-node receives an admission control signalling
   report message, e.g., RSVP RESV, that includes a report indicating
   "admit", it admits the flow that requested access to the PCN
   domain.

   When the PCN-ingress-node receives an admission control signalling
   report message, e.g., RSVP PATHErr, that includes a report
   indicating "block", it rejects the flow that requested access to
   the PCN domain.

3.4.2.  PCN-Ingress-Node Role In Flow Termination

   The PCN-ingress-node changes its operation state from Normal to
   Flow termination state when it receives a Flow termination
   signaling notification message, e.g., RSVP-TE Notify, from the
   PCN-egress-node / PCN decision point that requires one or more
   flows to be terminated.  The flow IDs of the flows that must be
   terminated are carried in a list within the flow termination
   signaling notification message.  The signaling protocol SHOULD be
   used to terminate all flows with flow IDs included in the received
   flow ID list.
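   The two-state operation of Figure 2 can be illustrated with the
   following Python sketch.  It is a simplified model under stated
   assumptions rather than a specification: the class PcnIngressNode,
   its method names and the can_identify_egress flag are hypothetical,
   and the sketch assumes that event E is raised once every notified
   flow has been reported as terminated by the signaling protocol.

```python
from enum import Enum


class IngressState(Enum):
    NORMAL = "normal"
    FLOW_TERMINATION = "flow_termination"


class PcnIngressNode:
    """Hypothetical model of the PCN-ingress-node states in Figure 2."""

    def __init__(self, can_identify_egress=True):
        self.state = IngressState.NORMAL
        # Whether the node can map a new request to its PCN-egress-node.
        self.can_identify_egress = can_identify_egress
        self.pending_flow_ids = set()      # flows still being terminated
        self.terminating_egresses = set()  # egress nodes that sent a notify

    def on_termination_notify(self, egress_address, flow_ids):
        """Event B: a flow termination notification lists one or more flows."""
        flow_ids = set(flow_ids)
        if not flow_ids:
            return  # nothing to terminate, stay in the current state
        self.pending_flow_ids |= flow_ids
        self.terminating_egresses.add(egress_address)
        self.state = IngressState.FLOW_TERMINATION

    def on_flow_terminated(self, flow_id):
        """Signaling confirms one flow is torn down; event E when none remain."""
        self.pending_flow_ids.discard(flow_id)
        if self.state is IngressState.FLOW_TERMINATION and not self.pending_flow_ids:
            # Simplification: all egress nodes are cleared together.
            self.terminating_egresses.clear()
            self.state = IngressState.NORMAL

    def should_block(self, egress_address=None):
        """Decide whether a new admission flow request is blocked."""
        if self.state is IngressState.NORMAL:
            return False
        if self.can_identify_egress and egress_address is not None:
            return egress_address in self.terminating_egresses
        # Destination egress unknown: block every new request.
        return True
```

   In this model a node that can identify the destination
   PCN-egress-node blocks only requests towards egress nodes that
   triggered termination, while a node constructed with
   can_identify_egress=False blocks every new request until event E
   returns it to the Normal state.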
   Moreover, in the Flow termination state (see event B), if the
   PCN-ingress-node is able to identify, for each new admission flow
   request received from outside the PCN domain, the PCN-egress-node
   to which the request is destined, then the PCN-ingress-node SHOULD
   block all new admission flow requests received from outside the PCN
   domain that are destined to PCN-egress-nodes operating in the Flow
   termination state.  Otherwise, the PCN-ingress-node SHOULD block
   all new admission flow requests received from outside the PCN
   domain, regardless of the PCN-egress-node to which they are
   destined.

4.  Specification of Diffserv Per-Domain Behaviour

   This section provides the specification required by [RFC3086] for a
   per-domain behaviour.

4.1.  Applicability

   This section draws heavily upon points made in the PCN architecture
   document, [RFC5559].

   The HOSE boundary node behaviour specified in this document is
   applicable to inelastic traffic where the QoS for admitted flows is
   protected primarily by admission control at the ingress to the
   domain.  In exceptional circumstances (e.g., due to network
   failures) already-admitted flows may be terminated to protect the
   quality of service of the remaining ongoing flows.

4.2.  Technical Specification

   The technical specification of the HOSE per-domain behaviour is
   provided by the contents of [RFC5559], [RFC5696], [RFC5670], the
   specification of the encoding extension (e.g.,
   [draft-ietf-pcn-3-state-encoding-01],
   [draft-ietf-pcn-3-in-1-encoding-01]) and the present document.

4.3.  Attributes

   The basic attributes are low loss and low jitter.

4.4.  Parameters

   The relevant parameters are loss and jitter.

4.5.  Assumptions

   It is assumed that a specific portion of link capacity has been
   reserved for PCN traffic.

   It is assumed that recovery from overloads by flow termination
   should happen within 1-3 seconds.

4.6.  Example Uses

   The HOSE behaviour may be used to carry real-time traffic,
   particularly voice and video.

4.7.  Environmental Concerns

   In some markets, traffic pre-emption is considered to be
   impermissible.  In such environments, flow termination would not be
   enabled.

5.  Security Considerations

   [RFC5559] provides a general description of the security
   considerations for PCN.  This memo introduces no new
   considerations.

6.  IANA Considerations

   This memo includes no request to IANA.

7.  Acknowledgements

   Parts of the content used in this memo are drawn from
   [draft-westberg-pcn-load-control-05].  Therefore, we would like to
   acknowledge the authors of that draft, who are: L. Westberg,
   A. Bhargava, A. Bader, G. Karagiannis, H. Mekkes.

   The template and parts of the text that are used in this memo are
   based on the template used in
   [draft-ietf-pcn-cl-edge-behaviour-00].  Therefore, we would like to
   acknowledge the authors of that draft, who are: A. Charny,
   F. Huang, G. Karagiannis, M. Menth, T. Taylor.

   We would also like to acknowledge Ronald in 't Velt from TNO for
   providing useful comments.

8.  References

8.1.  Normative References

   [RFC5696]  Moncaster, T., Briscoe, B., and M. Menth, "Baseline
              Encoding and Transport of Pre-Congestion Information",
              RFC 5696, November 2009.

   [RFC5670]  Eardley, P., "Metering and marking behaviour of
              PCN-nodes", RFC 5670, November 2009.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2991]  Thaler, D. and C. Hopps, "Multipath Issues in Unicast
              and Multicast Next-Hop Selection", RFC 2991,
              November 2000.

   [RFC2992]  Hopps, C., "Analysis of an Equal-Cost Multi-Path
              Algorithm", RFC 2992, November 2000.

   [RFC5559]  Eardley, P., "Pre-Congestion Notification (PCN)
              Architecture", RFC 5559, June 2009.

8.2.  Informative References

   [CsTa05]   Csaszar, A., Takacs, A., Szabo, R., and T.
              Henk, "Resilient Reduced-State Resource Reservation",
              Journal of Communication and Networks Vol. 7, Num. 4,
              December 2005.

   [draft-ietf-pcn-3-in-1-encoding-01]
              Briscoe, B., "PCN 3-State Encoding Extension in a
              single DSCP", Work in Progress, February 2010.

   [draft-ietf-pcn-3-state-encoding-01]
              Moncaster, T., Briscoe, B., and M. Menth, "A PCN
              encoding using 2 DSCPs to provide 3 or more states",
              Work in Progress, February 2010.

   [draft-ietf-pcn-cl-edge-behaviour-02]
              Taylor, T., Charny, A., Huang, F., Karagiannis, G., and
              M. Menth, "PCN Boundary Node Behaviour for the
              Controlled Load (CL) Mode of Operation", Work in
              Progress, March 2010.

   [draft-westberg-pcn-load-control-05]
              Westberg, L., Bhargava, A., Bader, A., Karagiannis, G.,
              and H. Mekkes, "LC-PCN: The Load Control PCN Solution",
              Work in Progress, November 2008.

   [DuGo99]   Duffield, N. and P. Goyal, "A Flexible Model for
              Resource Management in Virtual Private Networks", Proc.
              of ACM/SIGCOMM pp. 95 - 108, December 1999.

   [RFC3086]  Nichols, K. and B. Carpenter, "Definition of
              Differentiated Services Per Domain Behaviors and Rules
              for their Specification", RFC 3086, April 2001.

Authors' Addresses

   Georgios Karagiannis
   University of Twente
   P.O. BOX 217
   7500 AE Enschede,
   The Netherlands

   EMail: g.karagiannis@ewi.utwente.nl


   Lars Westberg
   Ericsson AB
   SE-164 80 Stockholm
   Sweden

   EMail: Lars.Westberg@ericsson.com


   George Apostolopoulos
   Ericsson
   4333 Still Creek Dr.
   Burnaby, BC, V5C 6C6
   Canada

   EMail: george.apostolopoulos@ericsson.com


   Arjen Holtzer
   TNO Information and Communication Technology
   Brassersplein 2
   Delft 2612CT
   The Netherlands

   EMail: arjen.holtzer@tno.nl