Network Working Group                              Jun Kyun Choi (ICU)
Internet Draft                                         Dipnarayan Guha
Category: Informational
Expiration Date: August 2006                                March 2006

Fast End-to-End Restoration Mechanism with SRLG using Centralized Control
draft-choi-pce-e2e-centralized-restoration-srlg-05.txt

Status of this Memo

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

Copyright (C) The Internet Society (2006).

Abstract

This draft describes the concept of the Shared Risk Link Group (SRLG) based logical ring configuration and a recovery method using ring-SRLG for the purpose of PCE-based backup path computation. In this restoration architecture, backup paths can be easily established through the end-to-end path that follows from the logical ring configuration. It guarantees the establishment of a backup path disjoint from the working path at all levels.
To take advantage of bandwidth considerations and fast restoration mechanisms, a centralized Controller is used to provide dedicated protection to Optical Transport Networks using the SRLG concept.

A robust and efficient signaling protocol, PCEMP, is used to distribute the mapping table from the Controller (PCE node) to the nodes in the optical transport network and to report a failure from a node to the corresponding Controller using PCEMP message exchanges. This draft is intended to explore the possibility of explicitly including PCE-based backup path computation within the scope of the PCE WG Charter.

Table of Contents

   1  Terminology
   2  Introduction
   3  Network Architecture for centralized control using SRLG
      3.1  Introduction to the centralized Controller
      3.2  Network structure
      3.3  Control structure
      3.4  Control plane hierarchy architecture for SRLG based protection and PCE based recovery
   4  Logical ring configuration based on SRLG
      4.1  Logical ring with SRLG
      4.2  Segment wise logical ring using the centralized Controller in the PCE node
      4.3  Resource allocation with SRLG by the Controller
   5  Integrated Layer survivability and recovery mechanisms
      5.1  PCE-based Protection and Recovery Mechanisms
      5.2  Protocol based Ring Recovery mechanisms using the Controller in the PCE node
   6  Key features of PCEMP
      6.1  Other features of PCEMP
      6.2  Protocol level hierarchy architecture on the control plane
      6.3  Role of CC and SPC
      6.4  Protocol implementation considerations using PCEMP
      6.5  Inter-domain point-to-multipoint considerations
   7  Conclusion
   8  Security Considerations
   9  IANA Considerations
   10 Acknowledgements
   11 Intellectual Property Considerations
   12 Normative References
   13 Informational References
   14 Authors' Addresses
   15 Full Copyright Statement

1. Terminology

This memo makes use of the following terms:

1. Path Computation Element (PCE): an entity that is responsible for computing/finding inter/intra domain LSPs. This entity can simultaneously act as a client and a server. Several PCEs can be deployed in a given Autonomous System (AS).

2. Path Computation Element node (PCE Node): a network processing unit comprising a PCE unit. This can be embedded in a router or a switch.

3. Domain: denotes an Autonomous System (AS) within the scope of this draft.

2. Introduction

With the rapid growth of the Internet, the advance of wavelength division multiplexing (WDM) technology, and the integration of various communication technologies, the communication network is evolving to include huge bandwidth-intensive network applications. Survivability refers to the ability of the network to transfer interrupted service onto spare network capacity to circumvent a point of failure in the network, and it is a critical requirement for IP over WDM networks. In a WDM network, a link failure, fiber cut, or node outage may be due to human error or natural disaster, leading to the loss of a large amount of data and the simultaneous failure of all the optical paths that traverse the fiber. We therefore have to develop an appropriate recovery architecture and strategies that minimize data loss when a failure occurs on a path in WDM-based GMPLS (Generalized Multi-Protocol Label Switching) networks, and that offer fast recovery, with speeds comparable to SONET, and versatile survivability functions.

Recovery techniques are broadly classified by computation timing as pre-computed or dynamic, and by their type of rerouting as link-based, partial path-based or path-based. In dynamic techniques, a search for a backup path is initiated upon occurrence of a failure. A backup path is computed based on the availability of resources at the time of failure. While dynamic techniques provide better resource utilization, they suffer from long delays in searching for the backup path and rerouting the traffic onto it, and there is no guarantee that the connection can be restored upon failure. Dynamic techniques provide a best-effort type of service. In protection (pre-computed) techniques, the primary and backup routes are computed and resources are reserved for backup paths before the connection is established. Upon occurrence of a failure the backup path is established and traffic is immediately routed onto the backup path. A pre-computed method avoids long delays in setting up backup paths upon failure.
The pre-computed techniques also provide a guarantee that a connection can be restored in the event of failure.

According to the range of rerouting, recovery techniques are classified into link-based, segment-wise and path-based recovery. Link-based techniques reroute disrupted traffic around the failed link. This approach requires the ability to identify a failed link at both ends. It also makes recovery more difficult in the event of a node failure. Furthermore, it limits the choice of backup path and thus may use more capacity, while path-based techniques replace the whole path between the two endpoints of a demand. Path-based techniques have better resource utilization, while link-based (span-based) techniques have better recovery time. Therefore, we focus on path-based recovery, also called end-to-end recovery.

Most backbone networks have a mesh physical topology. However, mesh-based schemes have some shortcomings. They are not as fast in failure recovery as ring-based schemes; complicated working path and backup path routing arrangements are used to achieve optimality; and the optimization procedures used for mesh-based schemes are so computationally intensive that they are virtually impossible to solve for very large networks. SONET networks are, for the most part, protected in the form of rings. The rings are interconnected in order to provide overall network connectivity and protection. It is possible to design a fast and simple recovery strategy for a ring network, so ring protection switching is by now well established and robust. Therefore, we need the ring concept in the mesh optical network.

This draft describes the ring configuration based on SRLG information and the approach of using a centralized Controller that enables fast restoration in optical transport networks.
Using the centralized Controller guarantees the establishment of a restoration path that is end-to-end disjoint from the failed working path, and helps in achieving near real-time end-to-end restoration in optical transport networks. This forms the basis for including PCE-based backup path computation within the scope of the PCE WG Charter.

3. Network Architecture for centralized Control using SRLG

[Figure 1: the ASCII diagram is not recoverable from the source. Its labels show three Centralized Controllers (PCE nodes) in the Control/Management Network, interconnected via PCEMP; Control Channels carrying PCEMP between each Controller and the OXCs of the Optical Transport Network; a Data Channel between OXCs; and a failure point marked "Xfail".]

Figure 1. Network Architecture for Centralized Control using SRLG

Generalized Multi-protocol Label Switching (GMPLS) enables service providers to build networks with the flexibility of IP, the reliability of SONET/SDH and the scalability of optics, at costs that allow services to be offered at extremely competitive prices. GMPLS supports a concept of common control of packet, TDM, wavelength and fiber services, and is a key enabler of the new network architecture model.

3.1 Introduction to the centralized Controller:

This draft addresses the coordination of IP and optical network survivability based on a consolidated network element, the Controller, and a truly integrated control plane. This centralized Controller in the PCE node enables an integrated network architecture where each network layer can freely exchange topology and resource information. This allows network performance to be globally optimized across all layers. In addition, a single control plane and a central Controller that manages all network layers greatly simplify network management tasks.

As far as control and management types are concerned, they can be classified into three categories: centralized, distributed and hybrid control/management. Each control and management type has its own advantages and disadvantages, but typical telecommunication networks and automatic switched optical networks (ASON) defined in ITU-T follow a centralized control/management architecture in which the control plane is separate from the data plane. In this draft, for path establishment and protection, we consider the control plane to be separated logically from the data plane. Separate Controllers, or control/management networks comprising a series of Controllers, could be connected to each other through signaling channels and appropriate control message exchanges. If the domains of the control/management networks increase, a hierarchical control/management structure could be applied. This can be fully realized in the centralized Controller through logical functional blocks.

We do not restrict the interconnection architecture of the optical transport networks, such as the overlay model, peer model and augmented model. There is a channel interface between the Controller and any node in the optical transport network, and this can be realized using a number of methods, such as the Simple Network Management Protocol (SNMP) or the General Switch Management Protocol (GSMP).
In this architecture, the Controller is responsible for path calculation and recovery. Some of the recovery functions are also assigned to nodes in the optical network for the purpose of control and load balancing. The Controller thus forms the building block of a PCE node for the purpose of PCE-based backup path computation.

3.2 Network Structure:

In this draft, we develop the network architecture with a hierarchical structure following the existing network management architecture. We focus on the backbone part of networks, where link capacity is at least OC-48 (2.5 Gb/s). Each network node is assumed to have OXC and IP router capabilities in the same hardware setup, which results in the support of multiple traffic types at the same location. The traffic manager in each optical network node also manages multiple traffic types. Each node can communicate directly with the centralized Controller to report its status. As mentioned in Section 3.1, this could be achieved via a network management standard, such as SNMP or GSMP.

The Controller takes care of network nodes within the same administrative domain. It also has the responsibility of centralizing the domain network management service and integrating the management of the transport network in its respective domain. This structure permits scalability, as well as internetworking of different administrative domains.

3.3 Control Structure:

Network elements within each domain communicate with one another via a common control plane. We assume a dedicated out-of-band control channel between two adjacent nodes, and between each node and the centralized Controller. The common control plane can be implemented based on the GMPLS standard.
The Resource Reservation Protocol (RSVP) and Constraint-Based Routing Label Distribution Protocol (CR-LDP) extensions for GMPLS can provide traffic engineering in this unified network architecture. Moreover, neighbor discovery and link state update can employ routing protocol Link State Advertisements (LSAs), such as those of the Intermediate System to Intermediate System (IS-IS) and Open Shortest Path First (OSPF) extensions for GMPLS.

3.4 Control Plane hierarchy architecture for SRLG Protection and Recovery:

In the integrated control plane proposed here, three levels of functional control hierarchy are mapped into one centralized Controller node and implemented as a single unit. The functional blocks involved in the Controller node are: the network processor (the network management system with extended functionalities), the domain processor (the network element management system with extended functionalities), and the node processor.

In the first level, the network processor acts as an interface between users and all sub-network domains. Its main functionality is to oversee the provisioning of new connections across multiple sub-networks and to maintain the network-wide topological view.

The domain processor supervises tasks within a sub-network domain, such as service provisioning and network status monitoring. It handles requests for connection setup and teardown, and computes explicit paths that meet the SLA of each request. The network monitor observes the overall network health and detects failure and repair events. The databases maintained by the domain processor include the domain topology, the domain link state database gathered via the LSA protocol within its domain, and the domain connection database, which keeps track of all established connections in the domain.
The node processor manages specific functionalities that can be done in a distributed manner at each node, such as overload handling, failure recovery, and status monitoring. It also detects sudden link overloads, takes countermeasures, and provides rapid protection and restoration capability in times of failure. The databases maintained by the node processor are the local link state and the local connection databases. The local link state is obtained automatically via the neighbor discovery protocol, while the list of local connections is obtained from all connections that traverse the node.

The structure of the topology database may contain the combined IP and optical layer topology in a unified form. The link state database contains information not only about link connectivity, but also about the shared-risk link group (SRLG) the link belongs to, the resources available on that link, the link protection type, and the link status. This extra information is defined in the LSA extensions for GMPLS.

4. Logical ring configuration based on SRLG

In this part, we describe some of the logical ring configuration methods for optical networks and the functions of the centralized Controller in providing real-time protection and recovery. Ring-based schemes are essentially extensions of the self-healing ring to the mesh topology, and the study of logical rings in mesh networks is well developed.

4.1 Logical Ring with SRLG:

Shared Risk Link Groups (SRLGs) allow the definition of resources or groups of resources that share the same risk of failure [6]. The knowledge of SRLGs may be used to compute diverse paths that can be used for protection in optical networks. The concept of SRLG has been used to compute a path that is disjoint from a set of links sharing the same risk. When two or more links share the same risk, it may be the case that when one link fails, the others fail at the same time.
Proper planning needs to be done for the network to recover from failures due to these risks. The risks are generally represented by SRLGs. The SRLG concept adds another dimension to the existing constraint-based path computation methods traditionally used in hierarchical networks.

Existing logical ring architectures for recovery do not consider SRLG information for the survivability of working paths and backup paths, and are generally configured based on topology information and network characteristics. If the link from the first ingress node is broken, the network cannot provide LSP SRLG disjointness. This is a rather strong bottleneck in supporting the survivability of connections with different bandwidth requirements and QoS constraints. The existing logical ring configuration does not take into account the probability of resource failure and the risk of the link. Therefore, the disjoint path may, in some cases, be difficult to compute, and hence the probability of backup path failure increases even though a backup path may exist. We need to consider the possibility of failure of the logical ring configuration at the connection setup stage.

We propose a network architecture based on the concept of the logical ring with SRLG for reliable transmission, set up in the pre-configuration stage using the centralized Controller concept. The proposed network with ring-SRLG is the set of SRLGs with contribution weights per link, used to avoid backup path failure and guarantee the survivability of traffic. The Controller manages the entire domain network, as discussed in Section 3.4.

The description follows a logical ring configuration with SRLG for the purpose of restoration in mesh networks. In this architecture, backup paths can be easily established end-to-end using the logical ring configuration.
It guarantees the establishment of a backup path that is disjoint from the primary path that is set up. The logical ring with ring-SRLG has both a primary path and a backup path in the same ring with one ring-SRLG. Ring-SRLG must support two-way connectivity, which supports the logical ring architecture in an OXC-based mesh network and helps in protection and recovery using the centralized Controller concept. Based on a given SRLG table, which is configured at each node by the centralized Controller, one can make rings between a source node and a destination node and the many intermediate nodes between the two. To extend network scalability, a distributed domain management system must be used, which is determined by the centralized Controller.

In order to configure the SRLG-based logical ring, a control unit handling algorithm may be set up in the centralized Controller, which then configures the SRLG-based logical ring as well as GMPLS signaling for LSP setups. Our restoration signaling on the SRLG-based logical ring can allow dynamic network configuration instead of static configuration by operators or management systems. The use of signaling with SRLG via the centralized Controller vastly reduces the complexity of network configuration.

4.2 Segment wise logical ring using the centralized Controller:

As a network becomes large, the ring pattern may also become large, so applying a single ring does not promote efficiency in terms of end-to-end delay and recovery time. In this section we propose a method, called the segment-wise ring [7], that can be effectively applied to real networks without those problems. Additionally, it can support fast recovery and can partially handle multiple simultaneous failures. The main concept of the segment-wise ring is to partition a large network into several small networks and to configure a ring in each small network.
This is one of the major functionalities of the centralized Controller, which effectively partitions the addressed domain using the three functional blocks: the Network Processor, the Domain Processor and the Node Processor. Sub-networks are chosen according to network provisioning criteria such as physical layer conditioning, call demands or QoS demands, which may arise from the user. The following shows the segment-wise logical ring architecture, with the centralized Controller managing the different sub-networks:

   Subnetwork 1      Subnetwork 2    Subnetwork 3
+-----------------+--------------+------------------+
|       +--+      |     +--+     |       +--+       |
|       |PCE|     |     |PCE|    |       |PCE|      |
|     //+--+\\    |    /+--+\    |     //+--+\\     |
|    //      \\   |   /      \   |    //      \\    |
|   //        \\  |  /        \  |   //        \\   |
| +--+          +---+          +---+           +--+ |
| |ON|          |ON |          |ON |           |ON| |
| +--+          +---+          +---+           +--+ |
|   \          /  |  \\      //  |   \          /   |
|    \        /   |   \\    //   |    \        /    |
|     \      /    |    \\  //    |     \      /     |
|       +--+      |     +--+     |       +--+       |
|       |ON|      |     |ON|     |       |ON|       |
|       +--+      |     +--+     |       +--+       |
+-----------------+--------------+------------------+
  // : Working path     / : backup path

Figure 2. Segment-wise ring architecture

4.3 Resource allocation with SRLG by the Controller:

The source node can pre-compute the ring configuration based on the SRLG information that it receives from the centralized Controller during primary path setup. The network architecture with the concept of the logical ring with SRLG for reliable transmission is established via pre-configuration. To discuss the survivability of the logical topology, we assume that the logical topology is redundant (two-connected); that is, the logical topology remains connected when a physical link goes down. The ingress nodes should have the SRLG history and the Ring-SRLG combined with the logical ring. This can be received from the centralized Controller, as described in Sections 3.2 and 3.4.
The Controller can pre-compute the ring architecture before a failure, based on network topology information and SRLG contribution weight factors, and can also configure the ring architecture after a failure by allocating resources via signaling for backup purposes. This information is also conveyed to the individual ingress nodes in the domain.

5. Integrated Layer Survivability and Recovery Mechanisms:

In this section we discuss the integrated layer protection mechanisms appropriate for the central Controller.

5.1 Protection and Recovery Mechanisms:

A request for LSP establishment from a client network is handled by the Domain Processor of the centralized Controller, as described in Section 3.4. This is mapped by the Controller to the optical transport network and conveyed to the corresponding nodes. When the Controller receives the request, it will try to compute a logical ring encompassing the ingress and the egress node based on the requested traffic parameters and the SRLG properties. There are two things the Controller can do, depending on the number of connection setup requests and the number of already established connections in a domain. It can either distribute the LSP mapping information to the participating nodes in the optical transport domain and clear the domain link state database and domain connection database, or provide the mapping links to the participating nodes.

i) Distribution of direct mapping information to nodes in the optical network:

In this method, the role of the Controller is to find a logical ring, to distribute the mapping table and SRLG information between a primary path and a backup path to the nodes in the optical network, and to trigger the establishment of a backup path once a failure occurs. Each node is responsible for maintaining the mapping table and establishing the primary and backup paths by using signaling messages. When a node detects a failure, it reports the failure to its corresponding domain Controller.
If end-to-end path protection is used, the Controller signals the ingress nodes to trigger the changeover from the primary path to the backup path. If the backup path is already established, the ingress node simply switches the traffic flow from the primary path to the backup path. However, if there is no established backup path, the ingress node tries to set up a backup path in the network when it receives the trigger from the Controller. Since the ingress node has been maintaining the route object for the backup path received from the Controller, the path setup message from the ingress node will propagate via a route that is disjoint from the primary path.

ii) Distribution of mapping information link to nodes in the optical network:

In this method, the nodes in the optical transport network perform failure detection and switching operations based on the pointers provided by the forwarding table given by the corresponding domain processor. The Controller performs the roles of finding a logical ring as well as signaling to establish a primary and a backup path. When a failure is reported from a node in the optical network to the corresponding Controller, the Controller should look up its internal table in the domain link state and domain connection databases, which maintains the backup path mapped to the failed primary path. By using the Backup Route Object [7], the Controller tries to establish the backup path. Once this is done, the setup message of the backup path will be sent from the Controller managing the ingress node domain to the Controller managing the egress node domain. When the backup path is established through the logical ring configuration, each Controller on the path should send the forwarding table to the corresponding node to configure its forwarding databases.
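
The Controller-side bookkeeping described in this section can be illustrated with a minimal Python sketch. It is not part of PCEMP or of this draft's procedures: the topology, the link and SRLG identifiers, and the names shortest_path, Controller, setup and report_failure are all hypothetical. The sketch also assumes a simple two-step heuristic (compute the primary path, then exclude every SRLG it uses), which may fail to find a disjoint pair in some topologies even when one exists.

```python
from collections import deque

# Hypothetical topology: link id -> (node_a, node_b, set of SRLG ids).
LINKS = {
    "L1": ("A", "B", {1}),
    "L2": ("B", "D", {2}),
    "L3": ("A", "C", {3}),
    "L4": ("C", "D", {4}),
    "L5": ("A", "E", {1}),  # shares SRLG 1 with L1 (e.g. same conduit)
    "L6": ("E", "D", {5}),
}


def shortest_path(links, src, dst, banned_srlgs=frozenset()):
    """Fewest-hop path as a list of link ids, skipping links in banned SRLGs."""
    adj = {}
    for link_id, (a, b, srlgs) in links.items():
        if srlgs & banned_srlgs:
            continue  # link shares a risk with the path we must avoid
        adj.setdefault(a, []).append((b, link_id))
        adj.setdefault(b, []).append((a, link_id))
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nxt, link_id in adj.get(node, []):
            if nxt not in prev:
                prev[nxt] = (node, link_id)
                queue.append(nxt)
    if dst not in prev:
        return None
    path, node = [], dst
    while prev[node] is not None:
        node, link_id = prev[node]
        path.append(link_id)
    path.reverse()
    return path


class Controller:
    """Keeps the primary/backup mapping table for one domain (illustrative)."""

    def __init__(self, links):
        self.links = links
        self.mapping = {}  # lsp id -> {"primary", "backup", "active"}

    def setup(self, lsp_id, src, dst):
        primary = shortest_path(self.links, src, dst)
        if primary is None:
            return None
        risk = set().union(*(self.links[l][2] for l in primary))
        backup = shortest_path(self.links, src, dst, frozenset(risk))
        if backup is None:
            return None  # no SRLG-disjoint backup found by this heuristic
        self.mapping[lsp_id] = {
            "primary": primary, "backup": backup, "active": "primary",
        }
        return self.mapping[lsp_id]

    def report_failure(self, failed_link):
        """Node-reported failure: switch affected LSPs to their backup."""
        switched = []
        for lsp_id, entry in self.mapping.items():
            if entry["active"] == "primary" and failed_link in entry["primary"]:
                entry["active"] = "backup"
                switched.append(lsp_id)
        return switched
```

In this sketch, setting up an LSP from "A" to "D" selects L1 and L2 as the primary path, and the backup avoids L5 as well as L1 and L2 because L5 shares SRLG 1 with the primary. A reported failure on L2 then switches that LSP to its backup entry.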

5.2 Protocol based Ring Recovery Mechanisms using the Controller in the PCE node:

ITU-T G.otnpro.2 [9] provides protection switching by using the logical ring concept. However, it does not take into account the probability of resource failure and the risk of the link arising from a lack of truly diverse fiber routes. We may need to consider the possibility of failure of the logical ring configuration at the connection setup stage. The APS protocol (ITU-T G.otnpro.2) can be used between the Controller and the optical network nodes in the ring topology. It is well suited to overcoming the bottleneck of the usual 50 ms communications delay between the Controller and nodes in the optical transport network.

A robust and efficient signaling protocol should be used to distribute the mapping table from the Controller to the nodes in the optical transport network and to report a failure from a node to the corresponding Controller. Signaling for restoration is also needed along the primary path and the backup path at the time of connection setup. GMPLS mechanisms are similar to those used for setting up primary paths and backup paths. In order to support our mechanism, GSMP or APS may be extended, or a totally new protocol could be proposed. PCEMP, suggested in [10], can be a suitable protocol mechanism for this purpose.

6. Key features of PCEMP

This section summarizes the key features of the PCEMP protocol. PCEMP is a generic domain routing and path computation protocol that runs on any PCE unit that is capable of computing a path based on an ordered graph. PCEMP uses data vector techniques for path computation. Individual link state advertisements (LSAs) are mapped onto the computation units directly at TE-LSP setup time.
Each PCE unit maintains this mapping information through the controller unit, and the mapping synchronization of the Link State Databases (LSDBs) is performed using PCEMP finite state machines. From these central controller sub-units, each PCE constructs a routing table by calculating a shortest data vector tree, the root being the calculating PCE node itself.

6.1 Other features of PCEMP

The other features of the PCEMP protocol are:

1. Central Controller (CC). The central controller acts as the originator of the network's local information environment. The controller also acts in the global scenario of inter-domain PCEs and inter-layer networks. It serves as the key functional point of the data vector driven algorithm for all PCE information and link state synchronizations by coordinating LSP advertisements from other PCEs and LSRs (all PCE peers).

2. Soft PCE Controller (SPC). A Soft PCE Controller is an entity designed primarily for protection and fast route establishment in conjunction with the Central Controller. The SPC's primary functionality is to provide a robust and real-time path computation adjacency without crossover delays for the data driven algorithm mechanism. It also enables LSP state and path computation state retention in case of nodal faults and hardware failures.

3. Support for peer adjacency through non-participating interior nodes. PCEMP treats these nodes opaquely and is able to maintain the PCE adjacencies across inter-domain and inter-layer networks. The protocol is generic and can be easily carried over existing routing mechanisms across non-supporting network clouds. No additional configuration updates are necessary for initial discovery by PCEs attached to such networks, as the data driven mechanism is flow based.

4. PCE domain areas (PCEDA).
PCEMP allows the formation of distinct PCE domain areas in a specific domain for end-to-end peer participation. This is useful for several reasons. It is in line with the protocol architecture, which provides a granularity of data protection within an autonomous system and isolation of data to local branches of the tree. It also helps the design of PCE units that use soft memory techniques, and it reduces the algorithm operation costs.

5. Data driven mapping of external routing information. In PCEMP, each external route is imported into the PCE domain area using a separate data driven computation strategy. This reduces the amount of instantaneous re-computation of routing traffic data. It also enables partial controller database updates when there is a partial external route change.

6. Three level functional control hierarchy. PCEMP has a three level controller hierarchy: intra-PCE-domain, inter-PCE-domain and external-PCE-domain. This is discussed in the context of the Centralized Controller.

7. Virtual link mapping. This is done via the real-time configuration of logical local links based on the data-driven strategy of the algorithm. The mechanism is thus made topology independent and generic.

8. Soft Computation Memory (SCM). This is a novel feature in terms of path computation in the CC and SPC. It helps split the tree into a combination of sparse subtrees for fast computation. SCM can be used to assign metrics of path computation as well as to compute data-driven, flow based mappings in the PCE.

9. Data-driven routing metric. In PCEMP, the computation metrics are assigned to the outbound router interfaces and the soft memory cycles in the PCE controller unit. The cost of a path is then the weighted sum of the path's component interfaces and the soft memory cycles. The routing and external path metrics can be assigned externally.

10. Flow-based routing. Separate sets of paths can be computed for each type of service.
This is done by assigning flow-based metrics to each outgoing router interface.

6.2 Protocol level hierarchy architecture on the control plane

In an integrated control plane, three levels of functional control hierarchy are mapped into one PCE node in the core of the Network Processing engine and implemented as a single unit. PCEMP thus has a three level controller hierarchy: intra-PCE-domain, inter-PCE-domain and external-PCE-domain. The functional blocks involved in the PCE node are: the network processor (the network management system with extended functionalities), the domain processor (the network element management system with extended functionalities), and the node processor. These are invoked by the PCEMP state machines and comprise the fundamental protocol level hierarchies on the control plane.

In the first level, the network processor acts as an interface between users and all sub-network domains. Its main function is to oversee the provisioning of new connections across multiple sub-networks and to maintain the network-wide topological view by reducing the computational domains to PCEDAs.

The domain processor supervises tasks within a sub-network domain, such as service provisioning and network status monitoring. It handles requests for connection setup and teardown, and computes explicit paths that meet the SLA of each request. The network monitor observes the overall network health and detects failure and repair events. The databases maintained by the domain processor include the domain topology, the domain link state database gathered via the LSA protocol within its domain, and the domain connection database, which keeps track of all established connections in the domain.
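As a rough illustration of how a domain processor could rank candidate explicit paths, the sketch below applies the data-driven routing metric of Section 6.1 (item 9), in which the cost of a path is a weighted sum of the outbound interface metrics and the soft memory cycles. The class name, field names, function names and default weights are assumptions made for this example only; they are not defined by PCEMP [10].

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Hop:
    """One hop of a candidate path (hypothetical representation)."""
    interface_metric: float    # metric assigned to the outbound router interface
    soft_memory_cycles: float  # soft memory cycles charged in the PCE controller unit


def path_cost(hops: Sequence[Hop],
              w_interface: float = 1.0,
              w_memory: float = 1.0) -> float:
    """Weighted sum of interface metrics and soft memory cycles along a path.

    The two weights are illustrative assumptions; the draft only states
    that the cost is a weighted sum of these two kinds of components.
    """
    return sum(w_interface * h.interface_metric + w_memory * h.soft_memory_cycles
               for h in hops)


def cheapest_path(candidates: Sequence[Sequence[Hop]],
                  w_interface: float = 1.0,
                  w_memory: float = 1.0) -> Sequence[Hop]:
    """Return the candidate path with the lowest weighted cost."""
    return min(candidates, key=lambda p: path_cost(p, w_interface, w_memory))


if __name__ == "__main__":
    short_but_busy = [Hop(10.0, 2.0), Hop(5.0, 1.0)]
    long_but_idle = [Hop(4.0, 0.0), Hop(4.0, 0.0), Hop(4.0, 0.0)]
    print(path_cost(short_but_busy, w_memory=0.5))
    print(cheapest_path([short_but_busy, long_but_idle]) is long_but_idle)
```

Keeping the weights as parameters reflects the statement that routing and external path metrics can be assigned externally; a flow-based variant would simply use a different set of metrics per type of service.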
The node processor manages specific functions that can be performed in a distributed manner at each node, such as overload handling, failure recovery, and status monitoring. It also detects sudden link overloads, takes countermeasures, and provides rapid protection and restoration capability in times of failure. The databases maintained by the node processor are the local link state and the local connection databases. The local link state is obtained automatically via the neighbor discovery protocol, while the list of local connections is obtained from all connections that traverse the node. These are implemented using soft memory concepts and synchronized using PCEMP.

For the purpose of establishing a guaranteed disjoint backup path and fast restoration techniques in the participating PCEDAs, it is essential that the large scale data processing in the CC and SPC have minimum overhead and processing delay. The CC manages the entire domain network, as discussed before. In this architecture, backup paths can be easily established end-to-end using the logical configuration in the SPCs using PCEMP. Based on the data driven routing metric table, which is configured at each PCE node by the CC, one can make a robust real-time path between a source node and a destination node and many intermediate nodes between the two. This is done using soft decision PCEMP algorithms [10].

6.3 Role of CC and SPC

The CC is responsible for handling data vectors and the synchronization of different link states. The SPC acts as a fast path computation mechanism in case of crossover and faults. This conjunction makes the PCE unit's functioning much more efficient. This has the same implications as the LSA protocol used for peer advertisement and topology discovery.
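The requirement for a backup path that is disjoint from the working path can be illustrated with a small sketch. It uses a common two-step heuristic: compute the working path with Dijkstra's algorithm, then remove every link that shares an SRLG with the working path and compute the backup path on what remains. This is not the soft decision PCEMP algorithm of [10]; the data layout and function names are assumptions made for this example, and the heuristic can fail to find a disjoint pair in some topologies even when one exists.

```python
import heapq
from typing import Dict, FrozenSet, List, Optional, Tuple

# Hypothetical link table: (node_a, node_b) -> (cost, set of SRLG identifiers).
# Links are treated as bidirectional.
Links = Dict[Tuple[str, str], Tuple[float, FrozenSet[int]]]


def shortest_path(links: Links, src: str, dst: str,
                  banned_srlgs: FrozenSet[int] = frozenset()) -> Optional[List[str]]:
    """Dijkstra over the links that do not belong to any banned SRLG."""
    adj: Dict[str, List[Tuple[str, float]]] = {}
    for (u, v), (cost, srlgs) in links.items():
        if srlgs & banned_srlgs:
            continue  # link shares risk with the working path, so skip it
        adj.setdefault(u, []).append((v, cost))
        adj.setdefault(v, []).append((u, cost))

    dist = {src: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def srlgs_on_path(links: Links, path: List[str]) -> FrozenSet[int]:
    """Collect every SRLG identifier used by the links of a path."""
    used = set()
    for u, v in zip(path, path[1:]):
        _, srlgs = links.get((u, v)) or links[(v, u)]
        used |= srlgs
    return frozenset(used)


def working_and_backup(links: Links, src: str, dst: str):
    """Return (working, backup); backup is None if no SRLG-disjoint path is found."""
    working = shortest_path(links, src, dst)
    if working is None:
        return None, None
    backup = shortest_path(links, src, dst, srlgs_on_path(links, working))
    return working, backup


if __name__ == "__main__":
    ring: Links = {
        ("A", "B"): (1.0, frozenset({1})),
        ("B", "C"): (1.0, frozenset({2})),
        ("A", "D"): (2.0, frozenset({3})),
        ("D", "C"): (2.0, frozenset({4})),
    }
    print(working_and_backup(ring, "A", "C"))
```

On a four-node logical ring such as the one in the usage example, the working path takes one side of the ring and the backup path takes the other, provided the two sides do not share an SRLG.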
A significant improvement from using PCEMP is that the number of routers that can be attached to a single PCEDA is quite arbitrary. As traffic increases, the SPC eases the stringency of backup path computation, and the CC-SPC combination guarantees a disjoint alternate path calculation with the minimum crossover time. This follows from Section 6.2.

+---------+    +---------+    +---------+    +---------+    +---------+
|   PCE   |    |   PCE   |    |   PCE   |    |   PCE   |    |   PCE   |
| +-----+ |    | +-----+ |    | +-----+ |    | +-----+ |    | +-----+ |
| |PCEMP| |    | |PCEMP| |    | |PCEMP| |    | |PCEMP| |    | |PCEMP| |
| +-----+ |    | +-----+ |    | +-----+ |    | +-----+ |    | +-----+ |
|   ||    |====|   ||    |====|   ||    |====|   ||    |====|   ||    |
| +-----+ |    | +-----+ |    | +-----+ |    | +-----+ |    | +-----+ |
| |PCEMP| |    | |PCEMP| |    | |PCEMP| |    | |PCEMP| |    | |PCEMP| |
| +-----+ |    | +-----+ |    | +-----+ |    | +-----+ |    | +-----+ |
+---------+    +---------+    +---------+    +---------+    +---------+
     |                                                           ^
   PCEMP                                                       PCEMP
                             Control plane
-----V-----------------------------------------------------------|-----
                              Data plane
+---------+    +---------+    +---------+    +---------+    +---------+
| Sender  |--->|   N1    |--->|   N2    |--->|   N3    |--->|Receiver |
|   App   |    |         |    |         |    |         |    |   App   |
+---------+    +---------+    +---------+    +---------+    +---------+

App = Application
N1, N2, N3 = node
==== = Signaling Messages
---> = Data flow Messages

Figure 3. PCE based backup path computation architecture

Figure 3 shows a simple PCE based backup path computation architecture.

6.4 Protocol implementation considerations using PCEMP

     |------------(1)------------->|
     |                             |
     |<-----------(2)--------------|
     |                             |
     |------------(3)------------->|
     |                             |
     |<-----------(4)--------------|
     |                             |
     |------------(5)------------->|
     |                             |
     |<-----------(6)--------------|
Centralized                       OXC
Controller

Figure 4. PCEMP message exchanges between the Controller and the OXC

For the Centralized Controller-OXC connection, we can use PCEMP messaging to implement SRLG based fast protection and restoration using PCE techniques. The following sample message sequence helps the Controller and the OXC nodes maintain soft PCEMP status. Details about the protocol messages can be found in [10].

(1): The Controller sends a PCEMP Common Header with COMPUTE_PATH enabled in the PCEMode field and ACK REQUESTED enabled in the PCEFlag field. The PCEMP message MAY include a PCE SUBOBJECT to inform the responder (OXC) about the initiator's (Centralized Controller's) PCEMP-LOCAL-INITIATOR-ADDRESS. In this way the OXC is initialized with the soft-memory based computation for PCEMP FSMs.

(2): The OXC receives this message and is configured with the soft-memory based path computation states. (If the OXC does not support this, it may need to be configured administratively.) It sends back an ACK message with the responder's (OXC's) PCEMP-LOCAL-RESPONDER-ADDRESS.

(3): The Controller sends a PCEMP Common Header with ESTABLISH_PATH enabled in the PCEMode field and the Protection mode set in the PCEStatus field of the Common Header. It also sends a PCEMP ESTABLISH message with the following parameters:

1. Flag field set to VARIABLE. The MAX_PCE_TIME_FIT is to be negotiated as discussed in [10]. If it cannot be negotiated, the PCEMP ERROR message SHOULD be generated with the ERROR CODE field set to OUT OF TIME, and the Controller node MUST issue a fresh PCEMP ESTABLISH message with the Flag field set to STATIC.

2. The SRLG and bandwidth support parameters are carried in the PCE SUBOBJECT. If this object is absent, a PCEMP ERROR message MAY be generated, or the responder might wish to choose a different granularity of protection.
It MAY send an ESTABLISH message with the PCE SUBOBJECTs containing the level of protection required, the protection supported and the bandwidth/SRLG conditions supported, upon whose receipt the Controller MAY issue a PCEMP TEARDOWN message to stop PCEMP message exchanges altogether.

(4): The OXC sends a PCEMP RESPOND message with the corresponding PCE Descriptor ID and a Backup Path Computation ID in the PCE SUBOBJECT. This is matched and stored statically for the lifetime of the path computation between the Controller and the OXC, so that this ID remains static between them until the path computation is over. If the PCE Descriptor ID changes value, a PCEMP ERROR message MUST be generated with the ERROR CODE field set to PROTOCOL ERROR, carrying its own saved PCE Descriptor ID of the affected hop in the PCEMP NEGOTIATE OBJECT. It MUST also set the Protection mode in the PCEStatus field of the corresponding outgoing PCEMP Common Header.

(5): The Centralized Controller issues a PCEMP TEARDOWN message with the RequestType field set to CHANGE IN COMPUTATION STYLE, so that the OXC remains PCEMP enabled and the path computation state remains active.

(6): The OXC sends a PCEMP Common Header with COMPUTE_PATH enabled in the PCEMode field. The PCEMP message MAY include a PCE SUBOBJECT to inform the responder (Centralized Controller) about the initiator's (OXC's) PCEMP-LOCAL-INITIATOR-ADDRESS.

6.5 Inter-domain point-to-multipoint considerations

The protocol implementation follows the previous section, with a few modifications. Before Step (3) of Section 6.4, the Controller needs to send a PCEMP Common Header with COMPUTE_LSP_TYPE set in the PCEMode field and the Peer-to-peer mode set to 0 in the PCEStatus field, as in [10]. A BANDWIDTH OBJECT MUST also be inserted in the corresponding PCEMP ESTABLISH message.
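The Controller-OXC exchange of Section 6.4, together with the point-to-multipoint variations described in this section, can be summarized as an ordered message list. The sketch below is only a reading aid: the Msg class, the step labels "pre-3", "3-err" and "3-retry", and the string values are assumptions made for this example, and the sender of the PCEMP ERROR message is assumed to be the OXC. The actual message formats are defined in [10].

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Msg:
    """Hypothetical, simplified view of one PCEMP message."""
    step: str
    sender: str
    pce_mode: str = ""   # PCEMode field of the Common Header, if relevant
    body: str = ""       # message type following the Common Header
    fields: Dict[str, str] = field(default_factory=dict)


def exchange(p2mp: bool = False, negotiation_ok: bool = True) -> List[Msg]:
    """Build the sample Controller-OXC sequence for steps (1) to (6)."""
    msgs = [
        Msg("1", "Controller", pce_mode="COMPUTE_PATH",
            fields={"PCEFlag": "ACK REQUESTED"}),
        Msg("2", "OXC", body="ACK"),
    ]
    if p2mp:
        # Extra header sent before step (3) in the point-to-multipoint case.
        msgs.append(Msg("pre-3", "Controller", pce_mode="COMPUTE_LSP_TYPE",
                        fields={"Peer-to-peer mode": "0"}))

    establish = {"PCEStatus": "Protection", "Flag": "VARIABLE"}
    if p2mp:
        establish["BANDWIDTH OBJECT"] = "present"
    msgs.append(Msg("3", "Controller", "ESTABLISH_PATH", "ESTABLISH", establish))

    if not negotiation_ok:
        # MAX_PCE_TIME_FIT could not be negotiated: report the error, then
        # the Controller re-issues ESTABLISH with the Flag field set to STATIC.
        msgs.append(Msg("3-err", "OXC", body="ERROR",
                        fields={"ERROR CODE": "OUT OF TIME"}))
        msgs.append(Msg("3-retry", "Controller", "ESTABLISH_PATH", "ESTABLISH",
                        {**establish, "Flag": "STATIC"}))

    msgs.append(Msg("4", "OXC", body="RESPOND",
                    fields={"PCE SUBOBJECT":
                            "PCE Descriptor ID, Backup Path Computation ID"}))
    msgs.append(Msg("5", "Controller", body="TEARDOWN",
                    fields={"RequestType": "CHANGE IN COMPUTATION STYLE"}))

    if p2mp:
        # Step (6) is replaced in the point-to-multipoint case.
        msgs.append(Msg("6", "OXC", pce_mode="COMPUTE_LSP_TYPE",
                        fields={"Peer-to-peer mode": "0",
                                "BANDWIDTH OBJECT": "present"}))
    else:
        msgs.append(Msg("6", "OXC", pce_mode="COMPUTE_PATH"))
    return msgs


if __name__ == "__main__":
    for m in exchange(p2mp=True, negotiation_ok=False):
        print(m.step, m.sender, m.pce_mode or m.body, m.fields)
```

Representing the sequence as data rather than as live network code keeps the example independent of any transport and makes the differences between the point-to-point and point-to-multipoint cases easy to compare.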
The participating PCE peers thus learn that the path computation session is a point-to-multipoint one. The paths may even be aggregated for multipoint-to-multipoint TE LSP path computation across domains. In this mode of operation, Step (6) of Section 6.4 is replaced with the OXC (or OXCn, in the more general case) sending a PCEMP Common Header with COMPUTE_LSP_TYPE enabled in the PCEMode field and the Peer-to-peer mode set to 0 in the PCEStatus field. A BANDWIDTH OBJECT MUST also be added, as discussed in [10]. The BANDWIDTH OBJECT SHOULD be retained across inter-domain LSPs.

7. Conclusion

Network survivability is a critical requirement in high-speed networks. Recovery mechanisms that provide fast recovery and efficient use of capacity are therefore needed. Our proposed network architecture uses the centralized Controller to provide high backup path survivability through ring-SRLG, which groups traffic driven logical rings and shared resources in GMPLS based networks. Ring-SRLG with the centralized Controller can guarantee the survivability of backup paths, subject to constraints with respect to the other logical ring configurations. Our proposed backup paths can be easily established through the end-to-end path, which follows the logical ring configuration. This guarantees the establishment of a backup path disjoint from the working path.

We have shown that our proposed integrated provisioning, which combines protection efforts from both IP and optical layers, is favorable over the traditional provisioning approach. The integrated protection effort achieves efficient resource allocation in terms of total bandwidth reservation, bandwidth utilization, and connection blocking probability. The level of improvement largely depends on the type of pre-configured underlying lightpath protection.
While we expect that the choice of lightpath protection should depend on the nature of service requests, we leave it up to the service provider to make this choice. The scheme takes advantage of information sharing across network layers, which is facilitated by GMPLS. The proposed scheme uses GMPLS capabilities to provide end-to-end survivability against network failures. This integrated provisioning scheme can deliver rapid service provisioning dynamically on demand. The consolidated effort simplifies the provisioning process and reduces network management complexity by eliminating the cumbersome coordination of provisioning in separate network layers. PCEMP, along with the PCE framework, can be an efficient mechanism for fast backup path computation.

8. Security Considerations

The use of the PCEMP architecture is relatively secure, as the PCEDAs are computed and distributed internally within the PCE unit. An increase in inter-domain information flows and the facilitation of inter-domain path establishment through PCEMP does not increase the existing vulnerability to security attacks. It should be remembered that PCEMP works by an invoked logic scheme local to each participating PCE unit, and the protocol is invoked only when there is a significant change in the data profile within the time of goodness of fit. However, it is expected that the PCE solutions will address the security issues mentioned in [Ash] in detail using authentication and security techniques.

9. IANA Considerations

This document makes no requests for IANA action.

10. Acknowledgements

This work was supported by the Ministry of Information and Communications (MIC), Republic of Korea.

11. Intellectual Property Considerations

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

12. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3667] Bradner, S., "IETF Rights in Contributions", BCP 78, RFC 3667, February 2004.

[RFC3668] Bradner, S., "Intellectual Property Rights in IETF Technology", BCP 79, RFC 3668, February 2004.

13. Informational References

[1] Mannie, E. et al., "Generalized Multi-Protocol Label Switching Architecture", Internet Draft, draft-ietf-ccamp-gmpls-architecture-07.txt, November 2003.

[2] Papadimitriou, D. and Mannie, E.
(Editors), "Analysis of Generalized Multi-Protocol Label Switching (GMPLS) based Recovery Mechanisms (Including Protection and Restoration)", Internet Draft, draft-ietf-ccamp-gmpls-recovery-analysis-03.txt, October 2004.

[3] Mannie, E. and Papadimitriou, D. (Editors), "Recovery (Protection and Restoration) Terminology for Generalized Multi-Protocol Label Switching (GMPLS)", Internet Draft, draft-ietf-ccamp-gmpls-recovery-terminology-04.txt, October 2004.

[4] Lang, P. and Rajagopalan, B. (Editors), "Generalized Multi-Protocol Label Switching (GMPLS) Recovery Functional Specification", Internet Draft, draft-ietf-ccamp-gmpls-recovery-functional-02.txt, October 2004.

[5] Papadimitriou, D. et al., "Shared Risk Link Groups Inference and Processing", Internet Draft, draft-papadimitriou-ccamp-srlg-processing-02.txt, December 2003.

[6] Czezowski, P. et al., "Optical Network Failure Recovery Requirements", Internet Draft, draft-czezowski-optical-recovery-reqs-01.txt, December 2003.

[7] Choi, J.K. et al., "Signaling Extension for the End-to-End Restoration with SRLG", Internet Draft, draft-choi-ccamp-e2e-restoration-srlg-01.txt, October 2004.

[8] Lang, J.P., Rekhter, Y., Papadimitriou, D., "RSVP-TE Extensions in support of End-to-End GMPLS-based Recovery", Internet Draft, draft-lang-ccamp-gmpls-recovery-e2e-signaling-03.txt, August 2004.

[9] ITU-T SG15 G.otnpro.2, Work In Progress.

[10] Choi, J.K., Guha, D. et al., "Path Computation Element Metric Protocol (PCEMP)", draft-choi-pce-metric-protocol-04.txt, March 2006 (work in progress).

[RFC2702] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., and McManus, J., "Requirements for Traffic Engineering over MPLS", RFC 2702, September 1999.

[RFC3209] Awduche, D., et al., "Extensions to RSVP for LSP Tunnels", RFC 3209, December 2001.

[RFC3473] Berger, L., et al., "Generalized Multi-Protocol Label Switching (GMPLS) Signaling - Resource ReserVation Protocol-Traffic Engineering (RSVP-TE) Extensions", RFC 3473, January 2003.

[INTER-AREA] Le Roux, J., Vasseur, JP, Boyle, J., "Requirements for Support of Inter-Area and Inter-AS MPLS Traffic Engineering", draft-ietf-tewg-interarea-mpls-te-req-00.txt, March 2004 (work in progress).

[INTER-AS] Zhang, R., Vasseur, JP., et al., "MPLS Inter-AS Traffic Engineering requirements", draft-ietf-tewg-interas-mpls-te-req-06.txt, January 2004 (work in progress).

[MRN] Papadimitriou, D., et al., "Generalized MPLS Architecture for Multi-Region Networks", draft-vigoureux-shiomoto-ccamp-gmpls-mrn-04.txt, February 2004 (work in progress).

Le Roux, J.L., Ed., "Requirements for Path Computation Element (PCE) Discovery", draft-ietf-pce-discovery-reqs-03.txt, February 2006.

Ash, J., and Le Roux, J.L., Ed., "PCE Communication Protocol Generic Requirements", draft-ietf-pce-comm-protocol-gen-reqs-04.txt, February 2006.

14. Authors' Addresses

Jun Kyun Choi
Information and Communications University (ICU)
103-6 Munji-Dong, Yuseong-gu, Daejeon, 305-732, Republic of Korea
Phone: +82-42-866-6122
Email: jkchoi@icu.ac.kr

Dipnarayan Guha
Email: dg236@cornell.edu

15. Full Copyright Statement

Copyright (C) The Internet Society (2006). All Rights Reserved.

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.
This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.