IETF Ping Pan Internet Draft Mohana Srinivas Rajan Rao Biao Lu (Infinera) Sam Aldrin (Huawei) Expires: September 14, 2011 March 14, 2011 Supporting Shared Mesh Protection in MPLS-TP Networks draft-pan-shared-mesh-protection-01.txt Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. This document may not be modified, and derivative works of it may not be created, and it may not be published except as an Internet-Draft. This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. This document may not be modified, and derivative works of it may not be created, except to publish it as an RFC and to translate it into languages other than English. This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that Pan et.al Expires September 14, 2011 [Page 1] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html This Internet-Draft will expire on September 14, 2011. Copyright Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Abstract Shared mesh protection is a common protection and recovery mechanism in transport networks, where multiple paths can share the same set of network resources for protection purposes. In the context of MPLS-TP, it has been explicitly requested as a part of the overall solution (Req. 67, 68 and 69 in RFC5654 [1]). It's important to note that each MPLS-TP LSP may be associated with transport network resources. In event of network failure, it may require explicit activation on the protecting paths before switching user traffic over. In this memo, we define a lightweight signaling mechanism for protecting path activation in shared mesh protection-enabled MPLS-TP networks. Pan et.al Expires September 14, 2011 [Page 2] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 Table of Contents 1. Introduction...................................................3 2. Background.....................................................4 3. Problem Definition.............................................5 4. Protection Switching...........................................6 5. Activation Operation Overview..................................7 6. Protocol Definition............................................9 6.1. Activation Messages.......................................9 6.2. Message Encapsulation....................................10 6.3. Reliable Messaging.......................................12 6.4. Message Scoping..........................................12 7. Processing Rules..............................................13 7.1. Enable a protecting path.................................13 7.2. Disable a protecting path................................13 7.3. Get protecting path status...............................14 7.4. Acknowledgement with STATUS..............................14 7.5. Preemption...............................................14 8. Security Consideration........................................15 9. IANA Considerations...........................................15 10. Normative References.........................................15 11. Acknowledgments..............................................15 1. Introduction Shared mesh protection is a common protection and recovery mechanism in transport networks, where multiple paths can share the same set of network resources for protection purposes. In the context of MPLS-TP, it has been explicitly requested as a part of the overall solution (Req. 67, 68 and 69 in RFC5654 [1]).Its operation has been further outlined in Section 4.7.6 of MPLS-TP Survivability Framework [2]. It's important to note that each MPLS-TP LSP may be associated with transport network resources. In event of network failure, it may require explicit activation on the protecting paths before switching user traffic over. In this memo, we define a lightweight signaling mechanism for protecting path activation in shared mesh protection-enabled MPLS-TP networks. Here are the key design goals: Pan et.al Expires September 14, 2011 [Page 3] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 1. Fast: The protocol is to activate the previously configured protecting paths in a timely fashion, with minimal transport and processing overhead. The goal is to support 50msec end-to-end traffic switch-over in large transport networks. 2. Reliable message delivery: Activation and deactivation operation have serious impact on user traffic. This requires the protocol to adapt a low-overhead reliable messaging mechanism. 3. Modular: Depending on deployment scenarios, the signaling may need to support functions such as preemption, resource re-allocation and bi-directional activation in a modular fashion. Here are some of the conventions used in this document. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [RFC2119]. 2. Background Transport network protection can be typically categorized into three types: Cold Standby: In this type of protection, the nodes will only negotiate and establish backup path after the detection of network failure. Hot Standby: The protecting paths are established prior to network failure. This is also known as "make-before-break". Upon the detection of network failure, the edge nodes will switch data traffic into pre-established backup path immediately. Warm Standby: The nodes will negotiate and reserve protecting path prior to network failure. However, data forwarding path will not be programmed. Upon the detection of network failure, the nodes will send explicit messages to relevant nodes to "wake up" the protecting path. The activation signaling defined in this memo is to support warm standby in the context of MPLS-TP. Further, the activation procedure may be triggered using the failure notification methods defined in MPLS-TP OAM specifications. Pan et.al Expires September 14, 2011 [Page 4] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 3. Problem Definition In this section, we describe the operation of shared mesh protection in the context of MPLS-TP networks, and outline some of the relevant definitions. We refer to the figure below for illustration: ----- B ------- C ---- / \ / \ A D \ / \ / ==== E === F === G === / \ / \ H K \ / \ / ----- I ------- J ---- Working paths: X = {A, B, C, D}, Y = {H, I, J, K} Protecting paths: X' = {A, E, F, G, D}, Y' = {H, E, F, G, K} The links between E, F and G are shared by both protecting paths. All paths are established via MPLS-TP control plane prior to network failure. All paths are assumed to be bi-directional. An edge node is denoted as a headend or tailend for a particular path in accordance to the path setup direction. Initially, the operators setup both working and protecting paths. During setup, the operators specify the network resources for each path. The working path X and Y will configure the appropriate resources on the intermediate nodes, however, the protecting paths, X' and Y', will reserve the resources on the nodes, but won't occupy them. Depending on network planning requirements (such as SRLG), X' and Y' may share the same set of resources on node E, F and G. The resource Pan et.al Expires September 14, 2011 [Page 5] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 assignment is a part of the control-plane CAC operation taking place on each node. At some time, link B-C is cut. Node A will detect the outage, and initiate activation messages to bring up the protecting path X'. The intermediate nodes, E, F and G will program the switch fabric and configure the appropriate resources. Upon the completion of the activation, A will switch the user traffic to X'. The operation may have extra caveat: 1. Preemption: Protecting paths X' and Y' may share the same resources on node E, F or G due to resource constraints. Y' has higher priority than that of X'. In the previous example, X' is up and running. When there is a link outage on I-J, H can activate its protecting path Y'. On E, F or G, Y' can take over the resources from X' for its own traffic. The behavior is acceptable with the condition that A should be notified about the preemption action. 2. Over-subscription (1:N): A unit of network resource may be reserved by one or multiple protecting paths. In the example, the network resources on E-F and F-G are shared by two protecting paths, X' and Y'. In deployment, the over- subscription ratio is an important factor on network resource utilization. 4. Protection Switching The entire activation and switch-over operation need to be within the range of milliseconds to meet customer's expectation [1]. This section illustrates how this may be achieved on MPLS-TP-enabled transport switches. Note that this is for illustration of protection switching operation, not mandating the implementation itself. The diagram below illustrates the operation. Pan et.al Expires September 14, 2011 [Page 6] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 +---------------+ Control | MPLS-TP | Control <=== Signaling ====| Control Plane |=== Signaling ===> +---------------+ / \ / \ (MPLS label assignment) / \ / \ +-------+ +------+ +-------+ Activation |Line | |Switch| |Line | Activation <=== Messages ===|Module |===|Fabric|===|Module |=== Messages ===> +-------+ +------+ +-------+ Typical MPLS-TP user flows (or, LSP's) are bi-directional, and setup as co-routed or associated tunnels, with a MPLS label for each of the upstream and downstream traffic. On this particular type of transport switch, the control-plane can download the labels to the line modules. Subsequently, the line module will maintain a label lookup table on all working and protecting paths. Upon the detection of network failure, the headend nodes will transmit activation messages along the MPLS LSP's. When receiving the messages, the line modules can locate the associated protecting path from the label lookup table, and perform activation procedure by programming the switching fabric directly. Upon its success, the line module will swap the label, and forward the activation messages to the next hop. In summary, the activation procedure involves efficient path lookup and switch fabric re-programming. To achieve the tight end-to-end switch-over budget, it's possible to implement the entire activation procedure with hardware-assistance (such as in FPGA or ASIC). The activation messages are encapsulated with a MPLS-TP Generic Associated Channel Header (GACH) [3]. Detailed message encoding is explained in Section 6. 5. Activation Operation Overview In this section, we describe the activation procedure using the same figure shown before: Pan et.al Expires September 14, 2011 [Page 7] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 ----- B ------- C ---- / \ / \ A D \ / \ / ==== E === F === G === / \ / \ H K \ / \ / ----- I ------- J ---- Working paths: X = {A, B, C, D}, Y = {H, I, J, K} Protecting paths: X' = {A, E, F, G, D}, Y' = {H, E, F, G, K} Upon the detection of working path failure, the edge nodes, A, D, H and K may trigger the activation messages to activate the protecting paths, and redirect user traffic immediately after. We assume that there is a consistent definition of priority levels among the paths throughout the network. At activation time, each node may rely on the priority levels to potentially preempt other paths. When the nodes detect path preemption on a particular node, they should inform all relevant nodes to free the resources. To optimize traffic protection and resource management, each headend should periodically poll the protecting paths about resource availability. The intermediate nodes have the option to inform the current resource utilization. Note that, upon the detection of a working path failure, both headend and tailend may initiate the activation simultaneously (known as bi-directional activation). This may expedite the activation time. However, both headend and tailend nodes need to coordinate the order of protecting paths for activation, since there may be multiple protecting paths for each working path (i.e., 1:N protection). For clarity, we will describe the operation from headend in the memo. The tailend operation will be available in the subsequent revisions. Pan et.al Expires September 14, 2011 [Page 8] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 6. Protocol Definition 6.1. Activation Messages The activation requires the following messages: o ENABLE: this is initiated by the headend nodes to activate a protecting path o DISABLE: this is initiated by the headend nodes to disable a protecting path and free the associated network resources o GET: this is initiated by the headend to gather resource availability information on a particular protecting path o NOTIFY: this is initiated by the intermediate nodes and terminate on the headend nodes to report preemption or protection failure conditions o STATUS: this is the acknowledgement message for ENABLE, DISABLE, GET, and NOTIFY messages, and contains the relevant status information Each activation message has the following format: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Version| Type | Reserved | Seq | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Additional Info | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ o Version: 1 o Type: o ENABLE 1 o DISABLE 2 o GET 3 o STATUS 4 Pan et.al Expires September 14, 2011 [Page 9] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 o NOTIFY 5 o Reserved: This field is reserved for future use o Seq: This uniquely identifies a particular message. This field is defined to support reliable message delivery o Additional Info: the message-specific data 6.2. Message Encapsulation Activation messages use MPLS labels to identify the paths. Further, the messages are encapsulated in GAL/GACH: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | MPLS Label stack | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | GAL | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |0 0 0 1|Version| Reserved | Activation Channel Type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ACH TLV Header | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Activation Message Payload | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ o GAL is described in [3] o Activation Channel Type is the GACH channel number assigned to the protocol. This uniquely identifies the activation messages. o ACH TLV Header contains the message length, and is described in [3] Specifically, ENABLE, DISABLE and GET messages have the following message format: Pan et.al Expires September 14, 2011 [Page 10] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Label | Exp |S| TTL | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Label (13) | Exp |S| TTL | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |0 0 0 1|Version| Reserved | Activation Channel Type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Length | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Ver(1)| Type | Reserved (0) | Seq | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Both STATUS and NOTIFY messages have the following message format: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Label | Exp |S| TTL | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Label (13) | Exp |S| TTL | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |0 0 0 1|Version| Reserved | Activation Channel Type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Length | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Ver(1)| Type | Reserved (0) | Seq | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Status Code | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Currently, the status code used for acknowledging and preemption notification has the following definition: o 1xx: OK . 101: end-to-end ack o 2xx: message processing errors . 201: no such path o 3xx: processing issues: Pan et.al Expires September 14, 2011 [Page 11] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 . 301: no more resource for the path . 302: preempted by another path . 303: system failure o 4xx: informative data: . 401: shared resource has been taken by other paths Further, for preemption notification, we may consider of using the existing MPLS-TP OAM messaging. More details will be available in the future revisions. 6.3. Reliable Messaging The activation procedure adapts a simple two-way handshake reliable messaging. Each node maintains a sequence number generator. Each new sending message will have a new sequence number. After sending a message, the node will wait for a response with the same sequence number. Specifically, upon the generation of ENABLE, DISABLE, GET and NOTIFY messages, the message sender expects to receive a STATUS in reply with same sequence number. If a sender is not getting the reply (STATUS) within a time interval, it will retransmit the same message with a new sequence number, and starts to wait again. After multiple retries (by default, 3), the sender will declare activation failure, and alarm the operators for further service. 6.4. Message Scoping Activation signaling uses MPLS label TTL to control how far the message would traverse. Here are the processing rules on each intermediate node: o On receive, if the message has label TTL = 0, the node must drop the packet without further processing o The receiving node must always decrement the label TTL value by one. If TTL = 0 after the decrement, the node must process the message. Otherwise, the node must forward the message without further processing (unless, of course, the node is headend or tailend) Pan et.al Expires September 14, 2011 [Page 12] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 o On transmission, the node will adjust the TTL value. For hop-by- hop messages, TTL = 1. Otherwise, TTL = 0xFF, by default. 7. Processing Rules 7.1. Enable a protecting path Upon the detection of network failure on a working path, the headend node identifies the corresponding MPLS-TP label and initiates the protection switching by sending an ENABLE message. ENABLE messages always use MPLS label TTL = 1 to force hop-by-hop process. Upon reception, a next-hop node will locate the corresponding path and activate the path. If the Enable message is received on an intermediate node, due to label TTL expiry, the message is processed and then propagated to the next hop of the MPLS TP LSP, by setting the MPLS TP label TTL = 1. The intermediate node may NOT respond back to the headend node with STATUS message. The headend node will declare the success of the activation only when it gets a positive reply from the tailend node. This requires that the tailend nodes must reply STATUS messages to the headend nodes in all cases. If the headend node is not receiving the acknowledgement within a time internal, it will retransmit another ENABLE message with a different Seq number. If the headend node is not receiving a positive reply within a longer time interval, it will declare activation failure. If an intermediate node cannot activate a protecting path, it will reply an NOTIFY message to report failure. When the headend node receives a NOTIFY message for failure, it must initiate DISABLE messages to clean up networks resources on all the relevant nodes on the path. 7.2. Disable a protecting path The headend removes the network resources on a path by sending DISABLE messages. Pan et.al Expires September 14, 2011 [Page 13] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 In the message, the MPLS label represents the path to be de- activated. The MPLS TTL is one to force hop-by-hop processing. Upon reception, a node will de-activate the path, by freeing the resources from the data-plane. As a part of the clean-up procedure, each DISABLE message must traverse through and be processed on all the nodes of the corresponding path. When the DISABLE message reaches to the tailend node, the tailend is required to reply with a STATUS message to the headend. The de-activation process is complete when the headend receives the corresponding STATUS message from the tailend. 7.3. Get protecting path status The operators have the option to trigger GET messages from the headend to check on the protecting path periodically or on-demand. The process procedure on each node is very similar to that of ENABLE messages on the intermediate nodes, except the GET messages should not trigger any network resource re-programming. Upon reception, the node will check the availability of resources. If the resource is no longer available, the node will reply a NOTIFY with error conditions. 7.4. Acknowledgement with STATUS The STATUS message is the acknowledgement packet to all messages, and may be generated by any node in the network. Each STATUS message must use the same sequence number as the corresponding message (ENABLE, DISABLE, GET and NOTIFY). When replying to headend, the tailend nodes must originate STATUS messages with a large MPLS TTL value (0xff, by default). 7.5. Preemption The preemption operation typically takes place when processing an ENABLE message. If the activating network resources have been used by another path and carrying user traffic, the node needs to compare the priority levels. Pan et.al Expires September 14, 2011 [Page 14] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 If the existing path has higher priority, the node needs to reject the ENABLE message by sending a STATUS message to the corresponding headend to inform the unavailability of network resources. If the new path has higher priority, the node will reallocate the resource to the new path, and send an NOTIFY message to old path's headend node to inform about the preemption. 8. Security Consideration The protection activation takes place in a controlled networking environment. Nevertheless, it is expected that the edge nodes will encapsulate and transport external traffic into separated tunnels, and the intermediate nodes will never have to process them. 9. IANA Considerations Activation messages are encapsulated in MPLS-TP with a specific GACH channel type that needs to be assigned by IANA. 10. Normative References [1] RFC 5654: Requirements of an MPLS Transport Profile, B. Niven- Jenkins, D. Brungard, M. Betts, N. Sprecher, S. Ueno, September 2009 [2] IETF draft, Multiprotocol Label Switching Transport Profile Survivability Framework (draft-ietf-mpls-tp-survive-fwk- 06.txt), N. Sprecher, A. Farrel, June 2010 [3] RFC5586 - Vigoureux,, M., Bocci, M., Swallow, G., Aggarwal, R., and D. Ward, "MPLS Generic Associated Channel", May 2009. [4] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [5] Crocker, D. and Overell, P.(Editors), "Augmented BNF for Syntax Specifications: ABNF", RFC 2234, Internet Mail Consortium and Demon Internet Ltd., November 1997. 11. Acknowledgments Authors like to thank Eric Osborne, Lou Berger, Nabil Bitar and Deborah Brungard for detailed feedback on the earlier work, and the guidance and recommendation for this proposal. Pan et.al Expires September 14, 2011 [Page 15] Internet-Draft Shared Mesh Protection in MPLS-TP March 2011 We also thank Maneesh Jain, Mohit Misra, Yalin Wang, Ted Sprague, Ann Gui and Tony Jorgenson for discussion on network operation, feasibility and implementation methodology. Authors' Addresses Ping Pan Email: ppan@infinera.com Sri Mohana Satya Srinivas Singamsetty Email: ssingamsetty@infinera.com Rajan Rao Email: rrao@infinera.com Biao Lu Email: blu@infinera.com Sam Aldrin Email: sam.aldrin@huawei.com Pan et.al Expires September 14, 2011 [Page 16]