Network Working Group G. Ash Internet Draft AT&T Intended status: Informational D. McDysan Verizon Expires: July, 2011 January 8, 2011 Generic Connection Admission Control (GCAC) Algorithm Specification for IP/MPLS Networks Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on July 8, 2011. Copyright Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the BSD License. Abstract This document presents a generic connection admission control (GCAC) reference model and algorithm for IP/MPLS-based networks. Service provider (SP) IP/MPLS networks need an MPLS GCAC mechanism, for example, to reject voice over Internet Protocol (VoIP) calls when additional calls would adversely affect calls already in progress. Ash, McDysan [Page 1] Internet Draft GCAC Algorithm Specification January 2011 Without MPLS GCAC, connections on congested links will suffer degraded quality. The MPLS GCAC algorithm can be optionally implemented in vendor equipment and deployed by service providers. MPLS GCAC interoperates between vendor equipment and across multiple service provider domains. The MPLS GCAC algorithm uses available standard mechanisms for MPLS based networks, such as RSVP, DSTE, PCE, NSIS, DiffServ, and OSPF. The MPLS GCAC algorithm does not include aspects of CAC that might be considered vendor proprietary implementations, such as detailed path selection mechanisms. MPLS GCAC functions are implemented in a distributed manner to deliver the objective QoS for specified QoS constraints. The source is able to compute a source route with high likelihood that MPLS GCAC via elements along the selected path will in fact admit the request. MPLS GCAC is applicable to any service or flow that must meet an objective QoS (delay, jitter, packet loss rate) for a specified quantity of traffic. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 2 2. MPLS GCAC Reference Model & Algorithm Summary . . . . . . . . 3 3. MPLS GCAC Algorithm . . . . . . . . . . . . . . . . . . . . . 7 3.1 Bandwidth Allocation Parameters . . . . . . . . . . . . . 8 3.2 GCAC Algorithm . . . . . . . . . . . . . . . . . . . . . . 10 4. Informative References . . . . . . . . . . . . . . . . . . . . 13 5. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 14 Appendix A: Example MPLS GCAC Implementation Including Path Selection, Bandwidth Management, QoS Signaling, & Queuing . . . . . . . . . . . . . . . . . . . . . . . 14 A.1 Example of Path Selection & Bandwidth Management Implementation . . . . . . . . . . . . . . . . . . . . . . 15 A.2 Example of QoS Signaling Implementation . . . . . . . . . 21 A.3 Example of Queuing Implementation . . . . . . . . . . . . 23 1. Introduction This document presents a generic connection admission control (GCAC) reference model and algorithm for IP/MPLS-based networks. Service provider (SP) IP/MPLS networks need an MPLS GCAC mechanism, for example, to reject voice over Internet Protocol (VoIP) calls when additional calls would adversely affect calls already in progress. Without MPLS GCAC, connections on congested links will suffer degraded quality. Given the capital constraints in some SP networks, over-provisioning is not acceptable. MPLS GCAC supports all access technologies, protocols, and services, while meeting performance objectives with a cost-effective solution and operates across routing areas, autonomous systems, and service provider boundaries. This document defines an MPLS GCAC reference model, algorithm, and functions implemented in one or more types of network elements in different domains that operate together in a distributed manner to deliver the objective QoS for specified QoS constraints, such as Ash, McDysan [Page 2] Internet Draft GCAC Algorithm Specification January 2011 bandwidth. With MPLS GCAC the source router/server is able to compute a source route with high likelihood that MPLS GCAC via elements along the selected path will in fact admit the request. MPLS GCAC includes nested CAC actions, such as RSVP aggregation, nested RSVP-TE for scaling between provider edge (PE) routers, and pseudowire (PW) CAC within traffic engineered tunnels. MPLS GCAC focuses on MPLS and PW level CAC functions, rather than application level CAC functions. MPLS GCAC is applicable to any service or flow that must meet an objective QoS (latency, delay variation, loss) for a specified quantity of traffic. This would include, for example, most real-time/RTP services (voice, video, etc.) as well as some non-real-time services. Real-time/RTP services are typically interactive, relatively persistent traffic flows. Other services subject to MPLS GCAC could include, for example, manually provisioned label switched paths (LSPs) or PWs, automatic bandwidth assignment for applications that automatically build LSP meshes among PE routers. MPLS GCAC is applicable to both access and backbone networks, for example, to slow speed access networks and to broadband DSL, cable, and fiber access networks. In Section 2 we describe the MPLS GCAC reference model, and in Section 3 we specify the MPLS GCAC algorithm based on the principles in the reference model and requirements. Appendix A gives an example of MPLS GCAC implementation including path selection, bandwidth management, QoS signaling, and queuing implementation. 2. MPLS GCAC Reference Model & Algorithm Summary Figure 1 illustrates the reference model for the MPLS GCAC algorithm. MPLS GCAC is applicable to any service or flow for which MPLS GCAC is required to meet a given QoS. As such, the reference model applies to most real-time/RTP services (voice, video, etc.) as well as some non-real-time services. Real-time/RTP services are typically interactive, relatively persistent traffic flows. Non-real-time applications subject to MPLS GCAC could include, for example, manually provisioned LSPs or PWs, and automatic bandwidth assignment for applications that automatically build LSP meshes among PE routers. The reference model also applies to MPLS GCAC when MPLS is used in access networks, which include, for example, slow speed access networks and broadband DSL, cable, and fiber access networks. Endpoints will be IP/PBXs and individual-usage SIP/RTP end devices (hard and soft SIP phones, IADs), and this traffic will enter and leave the core via possibly bandwidth-constrained access networks, which may also be MPLS aware, but may use some other admission control technology. The basic elements considered in the reference model are the MPLS GCAC edge function (GEF), GCAC core functions (GCF), the PE routers, autonomous system border routers (ASBR), and provider (P) routers. As illustrated in Figure 1, the GEF interfaces to the application at Ash, McDysan [Page 3] Internet Draft GCAC Algorithm Specification January 2011 The source and destination PE, and the GCF exist at the PE, P and ASBR routers. GEF has an end-to-end focus and deals with whether individual connection requests fit within an MPLS tunnel, and GCF has a hop-by-hop focus and deals with whether an MPLS tunnel can be established across specific core network elements on a path. The GEF functionality may be implemented in the PE, ASBR, or in a stand-alone network element. The source/destination routers (or external devices through a router interface) support both GEF and GCF, while internal routers (or external devices through a router interface) support GCF. In Figure 1, the GEF handles both signaling and bearer control. Inputs to the GEF and GCF include the following, where most are inputs to both GEF and GCF except as noted: Traffic Description Bandwidth per RFC 4124 DSTE class type (GEF, GCF) Bandwidth for LSP from RFC 3270 (GEF, GCF) Aggregated RSVP bandwidth requirements from RFC 4804 (GEF) Variance Factor (GEF, GCF) Service Class/CoS & QoS Class Type (CT) from RFC 4124 (GEF, GCF) Signaled or configured EXP-PHB mapping from RFC 3270 (GEF, GCF) Signaled or configured PHB from RFC 3270 (GEF, GCF) QoS requirements from NSIS/Y.1541 [RFC 5961, RFC 5974, RFC 5975, RFC 5976] (GEF) Priority Admission priority (high, normal, best effort) from NSIS/Y.1541 [RFC 5961, RFC 5974, RFC 5975, RFC 5976] (GEF, GCF) Preemption priority from RFC 4124 (GEF, GCF) Request type Primary tunnel (GEF, GCF) Backup tunnel and fraction of capacity reserved for backup (GEF, GCF) Oversubscription method (see RFC 3270) Over/under-subscribe requested capacity (GEF, GCF) Over/under-subscribe available bandwidth (GEF, GCF) These inputs can be received by the GEF and GCF from a signaling interface, such as SIP or H.323, RSVP, from an NMS, be derived from measured traffic levels, or from elsewhere. Figure 1 illustrates the GEF to GEF MPLS GCAC algorithm to determine whether there is sufficient bandwidth to complete a connection. The originating GEF receives a connection request including the above input parameters over the input interface, for example, via an RSVP bandwidth request as specified in RFC 4804. The GEF a) determines whether there is enough bandwidth on the route between the originating and terminating GEFs via routing and signaling communication with the GCF functions at the P, PE and ASBR network elements along the path to accommodate the connection, b) communicates the accept/reject decision on the input interface for the connection request, and c) keeps account of network resource Ash, McDysan [Page 4] Internet Draft GCAC Algorithm Specification January 2011 allocations by tracking bandwidth use and allocations per COS. Optionally, the GEF may dynamically adjust the tunnel size by signaling communication with the GCF functions at nodes along the candidate paths. For example, the GEF could a) maintain per-COS tunnel capacity based on aggregated connection requests and respond on a connection-by-connection basis based on the available capacity, b) periodically adjust the tunnel capacity upward, when needed, and downward when spare capacity exists in the tunnel, and c) use a 'make before break' mechanism to adjust tunnel capacity in order to minimize disruption to the bearer traffic. ,-. ,-. ,--+ `--+--'- --'\ +----+_____+------+ { +----+ +----+ `. +------+ |GEF1| | |______| P |___| P |______| | | |-----| PE1 | { +----+ +----+ /+| PE2 | | | | |==========================>| ASBR | +-:--+ | |<==========================| | _|..__ +------+ { DSTE/MAR Tunnels ; +------+ _,' \-| ./ -'._ !| | Access \ / +----+ \, !| | Network | \_ | P | | !| | / `| +----+ / !| `--. ,.__,| | IP/MPLS Network / !| '`' '' ' .._,,'`.__ _/ '---' !| | '`''' !| C1 !| ,-. ,-. !| ,--+ `--+--'- --'\ !| +----+_____+------+ { +----+ +----+ `. +------+ |GEF2| | |______| P |___| P |______| | | |-----| PE4 | { +----+ +----+ /+| PE3 | | | | |==========================>| ASBR | +-:--+ | |<==========================| | _|..__ +------+ { DSTE/MAM Tunnels ; +------+ _,' \-| ./ -'._ | Access \ / +----+ \, | Network | \_ | P | | | / `| +----+ / `--. ,.__,| | IP/MPLS Network / '`' '' ' .._,,'`.__ _/ '---' | '`''' C2 CUSTOMER I/F PARAMETERS: BW, QoS, COS, priority LEGEND: --------- ASBR: autonomous system border router BW: bandwidth COS: class of service DSTE: DiffServ-aware MPLS traffic engineering Ash, McDysan [Page 5] Internet Draft GCAC Algorithm Specification January 2011 GCAC: generic connection admission control GCF: GCAC core function GEF: GCAC edge function I/F: interface MAM: maximum allocation model MAR: maximum allocation with reservation P: provider router PE: provider edge router --- connection signaling ___ bearer/media flows Note: PE, P, ASBR, GEF elements all support GCF functions Figure 1 - MPLS GCAC Reference Model In the reference model, DSTE [RFC 4124] tunnels are configured between the GEFs based on the traffic forecast and current network utilization. These guaranteed bandwidth DSTE tunnels are created using RSVP-TE [RFC 3209]. DSTE bandwidth constraints models are applied uniformly within each domain, such as the maximum allocation with reservation bandwidth constraints model (MAR) [RFC 4126], maximum allocation bandwidth constraints model (MAM) [RFC 4125], and the Russian dolls bandwidth constraints model (RDM) [RFC 4127]. An IGP such as OSPF or ISIS is used to advertise bandwidth availability by CT for use by the GCF to determine MPLS tunnel bandwidth allocation and admission on core (backbone) links. These DSTE tunnels are configured based on the forecasted traffic load and, when needed, LSPs for different CTs can take different paths. In the MPLS GCAC the GEFs implement RSVP aggregation (RFC 4804) for scalability. The GEF RSVP aggregator keeps a running total of bandwidth usage on the DSTE tunnel, adding the bandwidth requirements during connection setup and subtracting during connection teardown. The aggregator determines whether or not there is sufficient Bandwidth for the connection from that originating GEF to the destination GEF. The destination GEF also checks whether there is enough bandwidth on the DSTE tunnel from the destination GEF to the originating GEF. The aggregate bandwidth usage on the DSTE tunnel is also available to the DSTE bandwidth constraints model. If the available bandwidth is insufficient, then the GEF sends a PATH message through the tunnel to the other end, requesting bandwidth using GCF functions, and if successful the source would then complete a new explicit route with a PATH message along the path with increased bandwidth, again invoking GCF functions on the path. If the size of the DSTE tunnel cannot be increased on the primary and alternate LSPs, then when the DSTE tunnel bandwidth is exhausted, the GEF aggregator sends a message to the endpoint denying the reservation. If the DSTE tunnels are underutilized, the tunnel bandwidth may be reduced periodically to an appropriate level. In the case of a basic single class TE scenario, there is a single TE Ash, McDysan [Page 6] Internet Draft GCAC Algorithm Specification January 2011 tunnel rather than multiple-CT DSTE tunnels, otherwise the above GCAC functions remain the same. The reference model implements separate queues with DiffServ based on EXP bits. For example, these queues may include two expedited forwarding (EF) priority queues, with the highest priority assigned to emergency traffic (ETS) and the second priority assigned to normal priority real-time traffic (alternatively, there could be a single EF queue with dual policers [RFC 5865]). Several assured forwarding (AF) queues may be used for various data traffic, for example, premium private data traffic, premium public data traffic, and a separate best-effort queue is used for the best-effort traffic. Several DSTE tunnels may share the same physical link, and therefore share the same queue. The MPLS GCAC algorithm increases the likelihood that the route selected by the GEF will succeed, even when the LSP traverses multiple service provider networks. Path computation is not part of the GCAC algorithm, rather it is considered as a vendor proprietary function, although standard IP/MPLS functions may be included in path computation, such as the following: a) Path computation element (PCE) [RFC 4655, RFC 4657, RFC 5440] to implement inter-area/inter-AS/inter-SP path selection algorithms, including alternate path selection, path reoptimization, backup path computation to protect DSTE tunnels, and inter-area/inter-AS/inter-SP traffic engineering. b) Backward recursive path computation (BRPC) [RFC 5441]. c) Per domain path computation (PDPC) [RFC 5152]. d) MPLS fast reroute (FRR) [RFC 4090] to protect DSTE LSPs against failure. e) MPLS crankback [RFC 4920] to trigger alternate path selection and enable explicit source routing. 3. MPLS GCAC Algorithm This section specifies the MPLS GCAC algorithm, which includes MPLS GCAC bandwidth allocation, LSP path selection and DSTE bandwidth management, and QoS signaling and implementation. Inputs to the GEF and GCF include the following, where most are inputs to both GEF and GCF except as noted: Traffic Description Bandwidth per RFC 4124 DSTE class type (GEF, GCF) Bandwidth for LSP from RFC 3270 (GEF, GCF) Aggregated RSVP bandwidth requirements from RFC 4804 (GEF) Variance Factor (GEF, GCF) Service Class/CoS & QoS Class Type (CT) from RFC 4124 (GEF, GCF) Ash, McDysan [Page 7] Internet Draft GCAC Algorithm Specification January 2011 Signaled or configured EXP-PHB mapping from RFC 3270 (GEF, GCF) Signaled or configured PHB from RFC 3270 (GEF, GCF) QoS requirements from NSIS/Y.1541 [RFC 5961, RFC 5974, RFC 5975, RFC 5976] (GEF) Priority Admission priority (high, normal, best effort) from NSIS/Y.1541 [RFC 5961, RFC 5974, RFC 5975, RFC 5976] (GEF, GCF) Preemption priority from RFC 4124 (GEF, GCF) Request type Primary tunnel (GEF, GCF) Backup tunnel and fraction of capacity reserved for backup (GEF, GCF) Oversubscription method (see RFC 3270) Over/under-subscribe requested capacity (GEF, GCF) Over/under-subscribe available bandwidth (GEF, GCF) These inputs can be received by the GEF and GCF from a signaling interface, such as SIP or H.323, RSVP, from an NMS, be derived from measured traffic levels, or from elsewhere. MPLS GCAC is performed at the GEF during the connection setup attempt phase to determine if a connection request can be accepted without violating existing connections' QoS and throughput requirements. To enable routing to produce paths that will likely be accepted, it is necessary for nodes to advertise some information about their internal CAC states. Requiring nodes to expose detailed and up-to-date CAC information, however, may result in unacceptably high rate of routing updates. MPLS GCAC advertises CAC information that is generic (e.g., independent of the actual path selection algorithms used) and rich enough to support any CAC. MPLS GCAC defines a set of parameters to be advertised and a common admission interpretation of these parameters. This common interpretation is in the form of an MPLS GCAC algorithm to be performed during MPLS LSP path selection to determine if a link or node can or cannot be included for consideration. The algorithm uses the advertised MPLS GCAC parameters (available from the topology database) and the characteristics of the connection being requested (available from QoS signaling) to determine if a link/node will likely accept or reject the connection. A link/node is included if the MPLS GCAC algorithm determines that it will likely accept the connection, and excluded otherwise. 3.1 Bandwidth Allocation Parameters MPLS GCAC bandwidth allocation parameters for each DSTE CT are as defined within DSTE [RFC 4126], OSPF-TE extensions [RFC 4203], and ISIS-TE extensions [RFC 5307]. The following parameters are available from DSTE/TE extensions, advertised by the IGP, and available to the GEF and GCF [RFC 4124]. Note that the approach presented in this section is adapted from PNNI Appendix B [PNNI]. Ash, McDysan [Page 8] Internet Draft GCAC Algorithm Specification January 2011 MRBk maximum reservable bandwidth on link k specifies the maximum bandwidth that may be reserved; this may be greater than the maximum link bandwidth, in which case the link may be oversubscribed. BWCck bandwidth constraint for CTc on link k = allocated (minimum guaranteed) bandwidth for CTc on link k. UCBck unreserved link bandwidth on CTc on link k specifies the amount of bandwidth not yet reserved for CTc Also available are administrative weight [RFC 2328], TE metric [RFC 3630], and administrative group (also called color) 4-octet mask [RFC 3630]. The following quantities can be derived from information advertised by the IGP and otherwise available to the GEF and GCF: RBWck reserved bandwidth on CTc on link k (0 = c = MaxCT-1), RBWck = total amount of bandwidth reserved by all established LSPs that belong to CTc = BWCck - UCBck. ULBk unreserved link bandwidth on link k specifies the amount of bandwidth not yet reserved for any CT, ULBk = MRBk - sum [RBWck (0 = c = MaxCT-1)]. The GCAC algorithm assumes that a DSTE bandwidth constraints model is used uniformly within each domain (e.g., MAR [RFC 4126], MAM [RFC 4125], or RDM [RFC 4127]. EANTC testing [EANTC] has shown that interoperability is problematic when different DSTE bandwidth constraints models are used by different network elements within a domain. Specific testing of MAM and RDM across different vendor equipment, showed the incompatibility. However, while the characteristics of the 3 DSTE bandwidth constraints models are quite different, it is necessary to specify interworking between them even though it could be complex. The following DSTE local variables are also available to GCF: RBTk reservation bandwidth threshold for link k. ULBCck unreserved link bandwidth on CTc on link k specifies the amount of bandwidth not yet reserved for CTc, ULBCck = ULBk - delta0/1(CTck) * RBTk where delta0/1(CTck) = 0 if RBWck < BWCck delta0/1(CTck) = 1 if RBWck = BWCck MRBCck maximum reservable link bandwidth for CTc on link k specifies the amount of bandwidth not yet reserved for CTc, MRBCck = MRBk - delta0/1(CTck) * RBTk where Ash, McDysan [Page 9] Internet Draft GCAC Algorithm Specification January 2011 delta0/1(CTck) = 0 if RBWck < BWCck delta0/1(CTck) = 1 if RBWck = BWCck Note that these bandwidth parameters must be configured in a consistent way within domains and across domains. GEF routing of LSPs is based on ULBCck, where ULBk is available and RBTk can be accounted for by configuration, e.g., RBT typically = .05 * MRBk. The following parameters are also defined and available to GCF, and are assumed to be locally configured to be a consistent value across all nodes and domain(s): SBWck sustained bandwidth for CTc on link k (aggregate of existing connections); SBWck = factor * RBWck where factor is based on standard 'demand overbooking' factors. VFck variance factor for CTc on link k; VFck is BWMck normalized by variance of SBWck; VFck is based on typical traffic variability statistics. Note that different demand overbooking factors can be specified for each CT, e.g., no overbooking might be used for constant bit-rate services, while a large overbooking factor might be used for bursty variable bit-rate services. We specify demand overbooking rather than link overbooking for the GCAC algorithm; one advantage is the demand overbooking is compatible with source routing used by the GCAC algorithm. Also defined is BWMck bandwidth margin for CTc on link k; BWMck = RBWck - SBWck GEF uses BWCck, RBWck, UCBck,, SBWck, BWMck, and VFck for LSP/IGP routing. GEF also needs to track per-CT LSP bandwidth allocation and reserved bandwidth parameters, which are defined as follows: RBWLcl reserved bandwidth for CTc on LSP l UBWLcl unreserved bandwidth for CTc on LSP l 3.2 GCAC Algorithm The assumption behind the MPLS GCAC is that the ratio between BWMck, which represents the safety margin the node is putting above the SBWck, and the standard deviation of the SBWck defined below does not change significantly as one new aggregate flow is added on the link. Any ingress node doing path selection can then compute the new standard deviation of the aggregate rate (from the old value and the aggregate flow's traffic descriptors) and an estimate of the new BWMck. From this, the increase in bandwidth required to carry the new aggregate flow can be computed and compared to BWCck. Ash, McDysan [Page 10] Internet Draft GCAC Algorithm Specification January 2011 To expand on the discussion above, let RBWck denote the reserved bandwidth capacity, i.e., the amount of bandwidth that has been allocated to existing aggregate flows for CT c on link k by the actual CAC used in the node. BWMck is the difference between RBWck and the aggregate sustained bandwidth (SBWck) of the existing aggregate flows. SBWck can be either the sum of existing aggregate flows' declared sustainable bandwidth (SBWi for aggregate flow i) or a smaller - possibly measured or estimated - value. Let MRBCck denote the maximum reservable bandwidth that is usable by aggregate flows for CT c on link k. The following diagram illustrates the relationship among MRBCck, RBWck, BWMck, SBWck and ULBCck: |<-- BWMck-->|<----- ULBCck ----->| |---------------|------------|--------------------| 0 SBWck RBWck MRBCck The assumption is that BWMck is proportional to some measure of the burstiness of the traffic generated by the existing aggregate flows, this measure being the standard deviation of the aggregate traffic rate defined as the square root of the sum of SBWi(PBWi - SBWi) over all existing aggregate flows, where SBWi and PBWi are declared sustainable and peak bandwidth for aggregate flow i, respectively. This assumption is based on the simple argument that RBWck needs to be some multiple of the standard deviation above the mean aggregate traffic rate to guarantee some levels of packet loss ratio and packet queuing time. Depending on the actual CAC used, the BWMck-to-standard-deviation ratio may vary as aggregate flows are established and taken down. It is reasonable to assume, however, that around some sufficiently large value of RBWck, this ratio will not vary significantly. What this means is a link can advertise its current BWMck-to-standard-deviation ratio (actually in the form of VF, which is the square of this number), and the MPLS GCAC algorithm Can use this number to estimate how much bandwidth is required to carry an additional aggregate flow. The MPLS GCAC algorithm is derived as follows: Consider an aggregate flow bandwidth change request DBWi with peak bandwidth PBWi and sustainable bandwidth SBWi, and a link with the following MPLS GCAC parameters: ULBCck, BWMck, and VFck for CT c and link k. Denote the variance (i.e., square of standard deviation) of the aggregate traffic rate by VARk (not advertised). Denote other unadvertised MPLS GCAC quantities by RBWck and SBWck. Then, VARk = SUM SBWi*(PBWi-SBWi) (1) over existing aggregate flows i and Ash, McDysan [Page 11] Internet Draft GCAC Algorithm Specification January 2011 BWMck**2 VFck = ---------- (2) VARk Using the above equation, VARk can be computed from the advertised VFck and BWMck as: VARk = BWMck**2/VFck. Let DBWi be the additional bandwidth capacity needed to carry the Flow with aggregate bandwidth DBWi. The MPLS GCAC algorithm Basically computes DBWi from the advertised MPLS GCAC parameters and the new aggregate flow's traffic descriptors, and compares it with ULBCck. If DBWi =< ULBCck then the link is included for path selection consideration; otherwise, it is excluded, i.e., Include link k ULBCck = DBWi (3) < Exclude link k Let BWMcknew denote the bandwidth margin if the new aggregate flow were accepted. Denote other 'new' quantities by RBWcknew, SBWcknew, and VARnew. Then, DBWi = BWMcknew - BWMck + SBWi (4) since BWMcknew = RBWcknew - SBWcknew, BWMck = RBWck - SBWck, and SBWcknew - SBWck = SBWi. Substituting (4) into (3), rearranging terms, and squaring both sides yield: Include link k [ULBCck+BWMck-SBWck]**2 = BWMcknew**2 (5) < Exclude link k Using the MPLS GCAC assumption made earlier, BWMcknew**2 can be computed as: BWMcknew**2 = VFck * VARnew, (6) Where VARnew = VARk + SBWck * (PBWi-SBWi). (7) Substituting (2), (6) and (7) into (5) yields: Include link k [ULBCck+BWMck-SBWi]**2 = BWMck**2 + VFck*SBWi(PBWi-SBWi), (8) < Exclude link k Ash, McDysan [Page 12] Internet Draft GCAC Algorithm Specification January 2011 and moving BWMck**2 to the left-hand side and rearranging terms yield Include link k [ULBCck-SBWi] * [ULBCck-SBWi+2*BWMck] = VFck*SBWi(PBWi-SBWi) (9) < Exclude link k Equation (9) represents the constrained shortest path first (CSPF) method implemented by most vendors and deployed by most service providers in MPLS networks. Note that VF and BWM are frequently assumed to be zero, and if so then if SBWi = ULBCck the link is included. In general DBWi is between SBWi and PBWi. So the above test is not necessary for the cases ULBCck = PBWi and ULBCck < SBWi. In the former case, the link is included; in the latter case, the link is excluded. Exclude Include |<--- link ---->|<-- Test (9)-->|<--- link ----->| |---------------|---------------|----------------| ULBCck SBWi PBWi MPLS GCAC must not reject a best effort (BE, unassigned bandwidth) aggregate flow request based on bandwidth availability but it may reject based on other reasons such as number of BE flows exceeding a chosen threshold. MPLS GCAC defines only one parameter for BE service category - maximum bandwidth (MBW) - to advertise how much capacity is usable for BE flows. The purpose of advertising this parameter is twofold: MBW can be used for path optimization, and MBW = 0 is used to indicate that a link is not accepting any (additional) BE flows. Demand overbooking of LSP bandwidth is employed and must be compliant with RFC 4124 and RFC 3270 to over/under-subscribe requested capacity. It is simplest to use only one oversubscription method, i.e., the GCAC algorithm assumes oversubscription of demands per CT, both within domains and for interworking between domains. The motivation is that interworking may be infeasible between domains if use different overbooking models are used. Note that the same assumption was made for DSTE bandwidth constraints models, in that the GCAC algorithm assumes a consistent DSTE bandwidth constraints model is used within each domain, and interoperability of bandwidth constraints models across domains. 4. Informative References [PNNI] ATM Forum Technical Committee, "Private Network-Network Interface Specification Version 1.1 (PNNI 1.1)," af-pnni-0055.002, April 2002. [EANTC] "Multi-vendor Carrier Ethernet Interoperability Event," Carrier Ethernet World Congress 2006, Madrid Spain, September 2006. Ash, McDysan [Page 13] Internet Draft GCAC Algorithm Specification January 2011 [TQO] Ash, G., "Traffic Engineering & QoS Optimization of Integrated Voice & Data Networks," Elsevier, 2006. [FEEDBACK] Ashwood-Smith, P., et al., "Improving Topology Data Base Accuracy with Label Switched Path Feedback in Constraint Based Label Distribution Protocol," IETF work in progress. 5. Authors' Addresses Gerald Ash (Editor) AT&T Email: gash5107@yahoo.com Dave McDysan Verizon 22001 Loudoun County PKWY Ashburn, VA 20147 Email: dave.mcdysan@verizon.com Appendix A: Example MPLS GCAC Implementation Including Path Selection, Bandwidth Management, QoS Signaling, & Queuing Figure 2 illustrates an example of the integrated voice/data MPLS GCAC method in which bandwidth is allocated on an aggregated basis to the individual DSTE CTs. In the example method, CTs have different priorities including high-priority, normal-priority, and best-effort priority services CTs. Bandwidth allocated to each CT is protected by bandwidth reservation methods, as needed, but bandwidth is otherwise shared among CTs. Each originating GEF monitors CT bandwidth use on each MPLS LSP [RFC 3031] for each CT, and determines when CT LSP bandwidth needs to be increased or decreased. In Figure 2, changes in CT bandwidth capacity are determined by GEFs based on an overall aggregated bandwidth demand for CT capacity (not on a per-connection/per-flow demand basis). Based on the aggregated bandwidth demand, GEFs make periodic discrete changes in bandwidth allocation, that is, either increase or decrease bandwidth on the LSPs constituting the CT bandwidth capacity. For example, if aggregate flow requests are made for CT LSP bandwidth that exceeds the current DSTE tunnel bandwidth allocation, the GEF initiates a bandwidth modification request on the appropriate LSP(s), which may entail increasing the current LSP bandwidth allocation by a discrete increment of bandwidth denoted here as DBW, where DBW is the additional amount needed by the current aggregate flow request. The bandwidth admission control for each link in the path is performed by the GCF function based on the status of the link using the bandwidth allocation procedure described below, where we further describe the role of the different parameters such as reserved bandwidth threshold RBT shown in Figure 2 in the admission control procedure. Also, the GEF periodically monitors LSP bandwidth use, and if bandwidth use falls below the current LSP allocation the GEF initiates a bandwidth modification request to decrease the LSP bandwidth allocation down to the current level of bandwidth utilization. Ash, McDysan [Page 14] Internet Draft GCAC Algorithm Specification January 2011 HIGH-PRIORITY-CT LSP +----+======================+----+======================+----+ |GEF1|NORMAL-PRIORITY-CT LSP| VN | |GEF2| | |======================| |======================| | | |LOW-PRIORITY/BE-CT LSP| | | | +----+======================+----+======================+----+ LEGEND ------ BE - BEST EFFORT CT - CLASS TYPE GEF- GCAC EDGE FUNCTION LSP - LABEL SWITCHED PATH VN - VIA NODE o distributed bandwidth allocation method applied on a per-class-type (CT) LSP basis o GEF allocates bandwidth to each CTc LSP based on demand - GEF decides CTc LSP bandwidth increase based on + aggregate flow sustained bandwidth SBWi & variance factor VFck + routing priority (high, normal, best-effort) + CTc reserved bandwidth RBWck & bandwidth constraint BWCck + link reserved bandwidth threshold RBTk & unreserved bandwidth ULBk - GEF periodically decreases CTc LSP bandwidth allocation based on bandwidth use o VNs send crankback message to GEF if DSTE-MAR bandwidth allocation rules not met o link(s) not meeting request excluded from TE topology database before attempting another explicit route computation Figure 2 Per-Class-Type (CT) LSP Bandwidth Management GEF uses SBWi, VFck, RBWck, BWCck, RBTk, and ULBk for LSP bandwidth allocation decisions and IGP routing, and uses RBWcl and UBWcl to track per-CT LSP bandwidth allocation and reserved bandwidth. In making a CT bandwidth allocation modification, the GEF determines the CT priority (high, normal, or best-effort), CT bandwidth-in-use, and CT bandwidth allocation thresholds. These parameters are used to determine whether network capacity can be allocated for the CT bandwidth modification request. A.1 Example of Path Selection & Bandwidth Management Implementation Ash, McDysan [Page 15] Internet Draft GCAC Algorithm Specification January 2011 In OSPF link-state flooding is used to make status updates. This is a state dependent routing (SDR) method where CSPF is typically used to alter LSP routing according to the state of the network. In general, SDR methods calculate a path cost for each connection request based on various factors such as the load-state or congestion state of the links in the network. In contrast, the example MPLS GCAC algorithm uses event dependent routing (EDR), where LSP routing is updated locally on the basis of whether connections succeed or fail on a given path choice. In the EDR learning approaches, the path last tried, which is also successful, is tried again until congested, at which time another path is selected at random and tried on the next connection request. EDR path choices can also be changed with time in accordance with changes in traffic load patterns. Success-to-the-top (STT) EDR path selection, illustrated in Figure 3, uses a simplified decentralized learning method to achieve flexible adaptive routing. The primary path path-p is used first if available, and a currently successful alternate path path-s is used until it is congested. In the case that path-s is congested (e.g., bandwidth is not available on one or more links), a new alternate path path-n is selected at random as the alternate path choice for the next connection request overflow from the primary path. Bandwidth reservation is used under congestion conditions to protect traffic on the primary path. STT-EDR uses crankback when an alternate path is congested at a via node, and the connection request advances to a new random path choice. In STT-EDR, many path choices can be tried by a given connection request before the request is rejected. Figure 3 illustrates the example MPLS GCAC operation of STT-EDR path selection and admission control combined with per-CT bandwidth allocation. GEF A monitors CT bandwidth use on each CT LSP, and determines when CT LSP bandwidth needs to be increased or decreased. Based on the bandwidth demand, GEF A makes periodic discrete changes in bandwidth allocation, that is, either increase or decrease bandwidth on the LSPs constituting the CT bandwidth capacity. If aggregate flow requests are made for CT LSP bandwidth that exceeds the current LSP bandwidth allocation, GEF A initiates a bandwidth modification request on the appropriate LSP(s). |<----- ULBk <= RBTk ---->| LSP-p |------------------------|-------------------------| A B E |<-- ULBk <= RBTk -->| LSP-s |---------------|--------------------|-------------| A C D E Example of STT-EDR routing method: 1. if node A to node E bandwidth needs to be modified (say increased by DBW) primary LSP-p (e.g., LSP A-B-E) is tried first 2. available bandwidth tested locally on each link in LSP-p, if bandwidth not available (e.g., unreserved bandwidth on link BE less than reserved bandwidth threshold & this CT is above its Ash, McDysan [Page 16] Internet Draft GCAC Algorithm Specification January 2011 bandwidth allocation), crankback to node A 3. if DBW is not available on one or more links of LSP-p, then the currently successful LSP-s (e.g., LSP A-C-D-E) is tried next 4. if DBW is not available on one or more links of LSP-s, then a new LSP is searched by trying additional candidate paths until a new successful LSP-n is found or the candidate paths are exhausted 5. LSP-n is then marked as the currently successful path for the next time bandwidth needs to be modified Figure 3 STT-EDR Path Selection & Per-CT Bandwidth Allocation For example, in Figure 3 if the LSR-A to LSR-E bandwidth needs to be modified, say increased by DBW, the primary LSP-p (A-B-E) is tried first. The bandwidth admission control for each link in the path is performed based on the status of the link using the bandwidth allocation procedure described below, where we further describe the role of the reserved bandwidth RBWck shown in Figure 3 in the admission control procedure. If the first choice LSP cannot admit the bandwidth change, node A may then try an alternate LSP. If DBW is not available on one or more links of LSP-p, then the currently successful LSP-s A-C-D-E (the 'STT path') is tried next. If DBW is not available on one or more links of LSP-s, then a new LSP is searched by trying additional candidate paths (not shown) until a new successful LSP-n is found or all of the candidate paths held in the cache are exhausted. LSP-n is then marked as the currently successful path for the next time bandwidth needs to be modified. DBW is set to the additional amount of bandwidth required by the aggregate flow request. If all cached candidate paths are tried without success, the search then generates a new CSPF path. If a new CSPF calculation succeeds in finding a new path, that path is made the stored path and the bottom cached path falls off the list. If all cached paths fail and a new CSPF path cannot be found, then the original stored LSP is retained. New requests go through the same routing algorithm again, since available bandwidth, etc. has changed and new requests might be admitted. Also, GEF A periodically monitors LSP bandwidth use (e.g., once each 2 minute interval), and if bandwidth use falls below the current LSP allocation, the GEF initiates a bandwidth modification request to decrease the LSP bandwidth allocation down to the currently used bandwidth level. Bandwidth reservation occurs in STT-EDR with PATH/RESV messages per application of RFC 4804. In the STT-EDR computation most of the time the primary path and stored path will succeed and no CSPF calculation needs to be done. Therefore the STT-EDR algorithm achieves good throughput performance while significantly reducing link-state flooding control load [TQO]. An analogous method was proposed earlier in the MPLS working group [FEEDBACK], where feedback based on failed path routing attempts is kept by the TE data base and used when running CSPF. Ash, McDysan [Page 17] Internet Draft GCAC Algorithm Specification January 2011 In the example GCAC method, bandwidth allocation to the primary and alternate LSPs uses the MAR bandwidth allocation procedure, as described below. Path selection uses a topology database that includes available bandwidth on each link. From the topology database pruned of links that do not meet the bandwidth constraint the GEF determines a list of shortest paths by using a shortest path algorithm (e.g., Bellman-Ford, Dijkstra methods). This path list is determined based on administrative weights of each link, which are communicated to all nodes within the routing domain (e.g. administrative weight = 1 + e x distance, where e is a factor giving a relatively smaller weight to the distance in comparison to the hop count). Analysis and simulation studies of a large national network model show that 6 or more primary and alternate cached paths provide the best overall performance. PCE [RFC 4655, RFC 4657, RFC 5440] is used to implement inter-area/inter-AS/ inter-SP path selection algorithms, including alternate path selection, path reoptimization, backup path computation to protect DSTE tunnels, and inter-area/inter-AS/inter-SP traffic engineering. The DSTE tunnels are protected against failure by using MPLS Fast Reroute [RFC 4090]. OSPF TE extensions [RFC 4203] are used to support the TE database (TED) required for implementation of the above PCE path selection methods. The example MPLS GCAC method incorporates the MAR bandwidth constraint model [RFC 4126] incorporated within DSTE [RFC 4124]. In DSTE/MAR, a small amount of reserved bandwidth RBTk governs the admission control on link k. Associated with each CTc on link k are the allocated bandwidth constraints BWCck to govern bandwidth allocation and protection. The reservation bandwidth on a link, RBTk, can be accessed when a given CTc has reserved bandwidth RBWck below its allocated bandwidth constraint BWCck. However, if RBWck exceeds its allocated bandwidth constraint BWCck, then the reservation bandwidth threshold RBTk cannot be accessed. In this way, bandwidth can be fully shared among CTs if available, but is otherwise protected by bandwidth reservation methods. Therefore, bandwidth can be accessed for a bandwidth request = DBW for CTc on a given link k based on the following rules: For an LSP on a high priority or normal priority CTc: If RBWck = BWCc: admit if DBW = ULBk If RBWck > BWCc: admit if DBW = ULBk - RBTk; or, equivalently: If DBW = ULBCck, admit the LSP. where Ash, McDysan [Page 18] Internet Draft GCAC Algorithm Specification January 2011 ULBCck = idle link bandwidth on link k for CTc = ULBk - delta0/1(CTck) x RBWk delta0/1(CTck) = 0 if RBWck < BWCck delta0/1(CTck) = 1 if RBWck = BWCck For an LSP on a best-effort priority CTc: allocated bandwidth BWCc = 0; DiffServ queuing serves best-effort packets only if there is available link bandwidth. In setting the bandwidth constraints for CTck, for a normal priority CTc, the bandwidth constraints BWCck on link k are set by allocating the maximum reservable link bandwidth MRBk in proportion to the forecast or measured traffic load bandwidth TRAF_LOAD_BWck for CTc on link k. That is: PROPORTIONAL_ BWck = TRAF_LOAD_ BWck/[S (c) {TRAF_LOAD_ BWck, c=0, MaxCT-1}] x MRBk For normal priority CTc: BWCck = PROPORTIONAL_ BWck For a high priority CT, the bandwidth constraint BWCck is set to a multiple of the proportional bandwidth. That is: For high priority CTc: BWCck = FACTOR * PROPORTIONAL_ BWck where FACTOR is set to a multiple of the proportional bandwidth (e.g., FACTOR = 2 or 3 is typical). This results in some over-allocation ('overbooking') of the link bandwidth, and gives priority to the high priority CTs. Normally the bandwidth allocated to high priority CTs should be a relatively small fraction of the total link bandwidth, a maximum of 10-15 percent being a reasonable guideline. As stated above, the bandwidth allocated to a best-effort priority CTc is set to zero. That is: For best-effort priority CTc: BWCck = 0 Analysis and simulation studies show that the level of reserved capacity RBTk in the range of 3-5% of link capacity provides the best overall performance. We give a simple example of the MAR bandwidth allocation method. Assume that there are two class-types: CT0, CT1, and a particular link with Ash, McDysan [Page 19] Internet Draft GCAC Algorithm Specification January 2011 MRB = 100 with the allocated bandwidth constraints set as follows: BWC0 = 30 BWC1 = 50 These bandwidth constraints are based on the forecast traffic loads, As discussed above. Either CT is allowed to exceed its bandwidth constraint BWCc as long a there is at least RBW units of spare bandwidth remaining. Assume RBT = 10. So under overload, if RBW0 = 20 RBW1 = 70 Then for this loading UBW = 100 - 20 - 70 = 10 If a bandwidth increase request = 5 = DBW arrives for Class Type 0 (CT0), then accept for CT0 since RBW0 < BWC0 and DBW (= 5) < ILBW (= 10). If a bandwidth increase request = 5 = DBW arrives for Class Type 1 (CT1), then reject for CT1 since RBW1 > BC1 and DBW (= 5) > ILBW - RBT = 10 - 10 = 0. Therefore CT0 can take the additional bandwidth (up to 10 units) if the demand arrives, since it is below its BWC value. CT1, however, can no longer increase its bandwidth on the link, since it is above its BWC value and there is only RBT=10 units of idle bandwidth left on the link. If best effort traffic is present, it can always seize whatever idle bandwidth is available on the link at the moment, but is subject to being lost at the queues in favor of the higher priority traffic. On the other hand, if a request arrives to increase bandwidth for CT1 by 5 units of bandwidth (i.e., DBW = 5). We need to decide whether to admit this request or not. Since for CT1 RBW1 > BWC1 (70 > 50), and DBW > UBW - RBT (i.e., 5 > 10 - 10) the bandwidth request is rejected by the bandwidth allocation rules given above. Now let's say a request arrives to increase bandwidth for CT0 by 5 units of bandwidth (i.e., DBW = 5). We need to decide whether to admit this request or not. Since for CT0 RBW0 < BWC0 (20 < 30), and DBW < UBW (i.e., 5 < 10) The example illustrates that with the current state of the link and Ash, McDysan [Page 20] Internet Draft GCAC Algorithm Specification January 2011 the current CT loading, CT1 can no longer increase its bandwidth on the link, since it is above its BWC1 value and there is only RBW=10 units of spare bandwidth left on the link. But CT0 can take the additional bandwidth (up to 10 units) if the demand arrives, since it is below its BWC0 value. For the example GCAC the method for bandwidth additions and deletions to LSPs in is as follows. The bandwidth constraint parameters defined in the MAR method [RFC 4126] do not change based on traffic conditions. In particular, these parameters defined in [RFC 4126], as described above, are configured and do not change until reconfigured: MRBk, BWCck, and RBTk. However, the reserved bandwidth variables change based on traffic: RBWck, ULBk, and UCBck. The RBWck and bandwidth allocated to each DSTE/MAR tunnel is dynamically changed based on traffic: it is increased when the traffic demand increases (using RSVP aggregation) and it is periodically decreased when the traffic demand decreases. Furthermore, if tunnel bandwidth cannot be increased on the primary path, an alternate LSP path is tried. When LSP tunnel bandwidth needs to be increased to accommodate a given aggregate flow request, the bandwidth is increased by the amount of the needed additional bandwidth, if possible. The tunnel bandwidth quickly rises to the currently needed maximum bandwidth level, wherein no further requests are made to increase bandwidth, since departing flows leave a constant amount of available or spare bandwidth in the tunnel to use for new requests. Tunnel bandwidth is reduced every 120 seconds by 0.5 times the difference between the allocated tunnel bandwidth and the current level of the actually utilized bandwidth (i.e., the current level of spare bandwidth). Analysis and simulation modeling results show that these parameters provide the best performance across a number of overload and failure scenarios. A.2 Example of QoS Signaling Implementation The example GCAC method uses next steps in signaling (NSIS) algorithms for signaling MPLS GCAC QoS requirements of individual flows. NSIS QoS signaling is being specified in the IETF NSIS working group, and extends RSVP signaling by defining a two-layer QoS signaling model: o NSIS transport layer protocol (NTLP) [RFC 5961] o NSIS QoS signaling layer protocol (QoS-NSLP) [RFC 5974] RFC 5975 defines a QoS specification (QSPEC) object, which contains the QoS parameters required by a QoS model (QOSM) [RFC 5976]. A QOSM specifies the QoS parameters and procedures that govern the resource management functions in a QoS-aware router. Multiple QOSMs can be supported by the QoS-NSLP, and the QoS-NSLP allows stacking of QSPEC parameters to accommodate different QOSMs being used in different domains. As such, NSIS provides a mechanism for interdomain QoS signaling and interworking. Ash, McDysan [Page 21] Internet Draft GCAC Algorithm Specification January 2011 The QSPEC parameters defined in [RFC 5975] include, among others: TRAFFIC DESCRIPTION Parameters: o Parameters CONSTRAINTS Parameters: o , , , Parameters o Parameter o Parameter o Parameter o Parameter o & Parameters The ability to achieve end-to-end QoS through multiple Internet domains is also an important requirement. MPLS GCAC end-to-end QoS signaling ensures that end-to-end QoS is met by applying the Y.1541-QOSM [RFC 5976], as now illustrated. The QoS GEF initiates an end-to-end, inter-domain QoS RESERVE message containing the QoS parameters, including for example, , , , and perhaps other parameters for the flow. The RESERVE message may cross multiple domains, each node on the data path checks the availability of resources and accumulating the delay, delay variation, and loss ratio parameters, as described below. If an intermediate node cannot accommodate the new request the reservation is denied. If no intermediate node has denied the reservation, the RESERVE message is forwarded to the next domain. If any node cannot meet the requirements designated by the RESERVE message to support a QoS parameter, for example, it cannot support the accumulation of end-to-end delay with the parameter, the node sets a flag that will deny the reservation. Also, parameter negotiation can be done, for example, by setting the to a lower class than specified in the RESERVE message. When the available must be reduced from the desired , say because the delay objective has been exceeded, then there is an incentive to respond to the GEF with an available value for delay in the parameter. For example, if the available is 150 ms (still useful for many applications) and the desired QoS is 100 ms (according to the desired Class 0), then the response would be that Class 0 cannot be achieved and Class 1 is available (with its 400 ms objective). In addition, the response includes an available = 150 ms, making acceptance of the available more likely. A.3 Example of Queuing Implementation In this MPLS GCAC example queuing behaviors for the CT traffic priorities incorporates DiffServ mechanisms and assumes separate Ash, McDysan [Page 22] Internet Draft GCAC Algorithm Specification January 2011 queues based on EXP/COS bits. The queuing implementation assumes 3 levels of priority, high, normal, and best effort. These queues include two EF priority queues [RFC 3246, 5865], with the highest priority assigned to emergency traffic (GETS/ETS/E911) and the second priority assigned to normal priority real-time (e.g., VoIP) traffic. Separate AF queues [RFC 2597] are used for data services, such as premium private data and premium public data traffic, and a separate best-effort queue is assumed for the best-effort traffic. All queues have static bandwidth allocation limits applied based on the level of forecast traffic on each link, such that the bandwidth limits will not be exceeded under normal conditions, allowing for some traffic overload. In the MPLS GCAC method high-priority, normal-priority, and best-effort traffic share the same network, under congestion the DiffServ priority-queuing mechanisms push out the best-effort priority traffic at the queues so that the normal-priority and high-priority traffic can get through on the MPLS-allocated LSP bandwidth. Ash, McDysan [Page 23]