Network Working Group S. Kini Internet Draft A. Liu Intended status: Standards Track Ericsson Expires: August 2010 February 22, 2010 Efficient Fast Re-route (FRR) using Facility backup in ring topology draft-kini-mpls-ring-frr-facility-backup-00.txt Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet- Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html This Internet-Draft will expire on August 22, 2010. Copyright Notice Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Kini & Liu Expires August 22, 2010 [Page 1] Internet-Draft Efficient Fast Re-route (FRR) using February 2010 Facility backup in ring topology Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Abstract Fast Re-route (FRR) using facility backup is a widely deployed traffic protection mechanism that can protect against link and node failure. It provides a fast protection switch mechanism to minimize traffic loss (typically sub 50msec) when the node that detects the failure locally, does the repair by re-routing the protected traffic to go over the bypass LSP. A direct application of this mechanism to a ring topology results in an inefficient use of link bandwidth. This document describes a mechanism to do it efficiently. Table of Contents 1. Introduction ..................................................3 2. Conventions used in this document .............................3 3. Problem Statement .............................................3 4. Solution ......................................................4 4.1. RSVP-TE signaling extensions .............................5 4.1.1. Session Attribute Object Flag: Local_protection_recording .................................5 4.1.2. RRO subobjects ......................................5 4.1.2.1. Bypass Session Subobject .......................6 4.1.2.1.1. Bypass_LSP_TUNNEL_IPv4 Session Suboobject .6 4.1.2.1.2. Bypass_LSP_TUNNEL_IPv6 Session Subobject ..6 4.1.2.2. Bypass Sender Template Subobject ...............7 4.1.2.2.1. Bypass_LSP_TUNNEL_IPv4 Sender Template Suboobject ...........................................7 4.1.2.2.2. Bypass_LSP_TUNNEL_IPv6 Sender Template Subobject ............................................7 4.2. Alert mechanism ..........................................8 4.3. Procedures at PLR ........................................8 4.4. Procedures at PLR-u ......................................9 5. Example .......................................................9 6. Applicability to generic topologies ..........................10 7. Future work ..................................................10 8. Security Considerations ......................................11 9. IANA Considerations ..........................................11 10. References ..................................................11 10.1. Normative References ...................................11 11. Acknowledgments .............................................12 Kini & Liu Expires August 22, 2010 [Page 2] Internet-Draft Efficient Fast Re-route (FRR) using February 2010 Facility backup in ring topology 1. Introduction Fast Re-route for TE tunnels [MPLS-FRR] is widely used to ensure sub 50msec protection of traffic in case of link or node failure. Specifically, the facility backup mode described in [MPLS-FRR] provides a scalable mechanism to protect a large number of LSPs. When applied to a ring topology this results in an in-efficient use of bandwidth. This is described in detail with a sample topology/scenario in section 3. The solution is described in section 4. An explanation of the solution by applying it to the topology/scenario in the problem statement is provided in section 5. 2. Conventions used in this document The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. 3. Problem Statement +-------L1--------L2---------L3--------L4-------+ | .........>.........B1.........>............ | | . .......<.........B2.........<.......... . | | . . . . | | . . . . | L10 ^ V ^ V L5 | . . . . | | . . . . | | . ............> >............... . | | ..............< <................. | +-------L9--------L8---------L7--------L6-------+ >**************P1**************> <**************P2**************< ...... Bypass LSP ****** Protected LSP Figure 1 Facility bypass in a Ring topology The issue with bypass LSP in a ring topology is illustrated through the example in Figure 1. L1-10 are LSRs in a ring topology and say each link in the ring has 1G bandwidth. To protect the facility L8-L7, a uni-directional bypass LSP B1 is routed along L8-L9-L10-L1-L2-L3-L4-L5-L6-L7 with L8 as PLR and L7 Kini & Liu Expires August 22, 2010 [Page 3] Internet-Draft Efficient Fast Re-route (FRR) using February 2010 Facility backup in ring topology as MP. Similarly, a uni-directional bypass LSP B2 is routed along L7-L6-L5-L4-L3-L2-L1-L10-L9-L8 with L7 as PLR and L8 as MP. A protected LSP P1 is routed along L9-L8-L7-L6 and has a bandwidth constraint of 0.6G. Another uni-directional protected LSP P2 is routed along L6-L7-L8-L9 and has bandwidth constraint of 0.6G. When facility L8-L7 fails, P1 is bypassed over B1 at PLR L8 and P2 over B2 at PLR L7. Since the aggregate bandwidth requirement on L8->L9 becomes 1.2G (0.6G for P1 + 0.6G for B2), both P1 and P2 start experiencing degraded service. If there are other uni- directional protected LSPs e.g. P3 along the path L10-L9-L8-L7- L6, P4 along the path L8-L9, etc, they would also start experiencing degraded service. Even when P1 needs bandwidth less than 0.5G, the wraparound consumes bandwidth that would be better used by other LSPs such as P4. Note that similar issues would arise if (P1, P2) was a single bi- directional protected LSP and/or (B1, B2) was a single bi- directional bypass LSP. 4. Solution LSRs along the bypass LSPs must be alerted at the time of failure to do a protection switch. The protected LSP must do a protection switch of traffic at a LSR that is furthest upstream of the PLR and has the protected LSPs and the associated bypass LSP going in opposite directions to a neighboring LSR. Such a node is referred to as PLR-upstream (PLR-u) of the protected LSP. The traffic should be protection switched to a backup path terminating on a LSR along the protected LSP that is furthest downstream from the MP and has the protected and bypass LSPs going in the opposite directions. A backup path routed along the bypass' path ensures that traffic closely follows a path as if it had been protection switched at PLR, but without the downsides of the problematic U- turn. In such a case resources may be shared between the backup LSP and the bypass LSP. This document does not preclude choosing a MP further downstream or a different path as long as other constraints of the protected LSP are met by the backup path. The backup LSP must be setup before the failure (typically done when the protected LSP is setup). Also, the association between the protected LSP and the backup LSP must be understood in advance so that when the failure trigger arrives the protection switch can be performed very fast. Towards this, the following mechanisms are defined in this document: Kini & Liu Expires August 22, 2010 [Page 4] Internet-Draft Efficient Fast Re-route (FRR) using February 2010 Facility backup in ring topology 1. A signaling mechanism towards enabling computation of the backup path for the LSP that the traffic needs to be protection switched to. This is described in section 4.1. 2. An alert mechanism that indicates to the LSRs along the bypass LSP to trigger the protection switch of the protected LSP. This is described in section 4.2. 4.1. RSVP-TE signaling extensions To enable backup path computation, the information about the bypass LSP is needed at LSRs that are upstream of the PLR. The RRO is used to record this information. This additional information is requested using the "Session Attribute" object as described in section 4.1.1. The information in the RRO is described in 4.1.2. 4.1.1. Session Attribute Object Flag: Local_protection_recording A new flag "Local protection recording desired" is defined for the "Flags" field in the Session Attribute object. This flag indicates that information about the local protection used should be included when doing a route record. 4.1.2. RRO subobjects A new RRO subobject called "Bypass Session" subobject is defined. It follows an IPv4/IPv6 address sub-object (that would have "Local Protection Available" set). This subobject describes the bypass LSP that protects against a failure detected at the interface corresponding to the preceding IPv4/IPv6 address sub- object. This subobject is carried in the RRO of a protected LSP. The encoding of this subobject is described in section 4.1.2.1. A new RRO subobject called "Bypass Sender Template" subobject is defined. It follows the "Bypass Session" subobject and is included only if the "Bypass Session" subobject does not uniquely identify the tunnel. The encoding of this subobject is described in section section 4.1.2.2. Kini & Liu Expires August 22, 2010 [Page 5] Internet-Draft Efficient Fast Re-route (FRR) using February 2010 Facility backup in ring topology 4.1.2.1. Bypass Session Subobject 4.1.2.1.1. Bypass_LSP_TUNNEL_IPv4 Session Suboobject 0 1 2 3 00 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | Bypass_LSP_TUNNEL_IPv4 Session Subobject (12 bytes) | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Reserved This field is reserved. It MUST be set to zero on transmission and MUST be ignored on receipt. Bypass_LSP_TUNNEL_IPv4 Session subobject This must be the same as the object defined in section 4.6.1.1 of [RSVP-TE]. 4.1.2.1.2. Bypass_LSP_TUNNEL_IPv6 Session Subobject 0 1 2 3 00 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | . . . Bypass_LSP_TUNNEL_IPv6 Session Subobject (36 bytes) . . . | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Reserved This field is reserved. It MUST be set to zero on transmission and MUST be ignored on receipt. Bypass_LSP_TUNNEL_IPv6 Session subobject Kini & Liu Expires August 22, 2010 [Page 6] Internet-Draft Efficient Fast Re-route (FRR) using February 2010 Facility backup in ring topology This must be the same as the object defined in section 4.6.1.2 of [RSVP-TE]. 4.1.2.2. Bypass Sender Template Subobject 4.1.2.2.1. Bypass_LSP_TUNNEL_IPv4 Sender Template Suboobject 0 1 2 3 00 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Bypass_LSP_TUNNEL_IPv4 Sender Template Subobject (8 bytes) | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Reserved This field is reserved. It MUST be set to zero on transmission and MUST be ignored on receipt. Bypass_LSP_TUNNEL_IPv4 Sender Template subobject This must be the same as the object defined in section 4.6.2.1 of [RSVP-TE]. 4.1.2.2.2. Bypass_LSP_TUNNEL_IPv6 Sender Template Subobject 0 1 2 3 00 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | . . . Bypass_LSP_TUNNEL_IPv6 Sender Template Subobject (20 bytes) . . . | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Reserved This field is reserved. It MUST be set to zero on transmission and MUST be ignored on receipt. Bypass_LSP_TUNNEL_IPv6 Sender Template subobject Kini & Liu Expires August 22, 2010 [Page 7] Internet-Draft Efficient Fast Re-route (FRR) using February 2010 Facility backup in ring topology This must be the same as the object defined in section 4.6.2.2 of [RSVP-TE]. 4.2. Alert mechanism All the LSRs along a bypass LSP must be alerted since any of them may be upstream LSRs of LSPs that are protected by that bypass. A message format that is implementable in hardware/firmware is preferred since the trigger should be generated and processed as quickly as possible. Typically, implementations of [ID.BFD] fit these requirements. Hence the mechanism in [ID.BFD-MPLS] operating in the G-Ach of the bypass LSP (as described in [RFC5586]) with the BFD control packet encoded in a label stack and ACH as described in [ID.fast-lsp-alert] is used. Thus each LSR along the bypass receives the packet. A new BFD control packet diagnostic (Diag) code "Backup LSP not carrying protected traffic" is defined. It is applicable even if the "State" of the BFD session is "Up". A transit LSR along the bypass may choose to ignore validating the "Authentication" section of the BFD control packet received via this alert mechanism for that bypass since the end point would do the authentication and that would be sufficient to authenticate the BFD session. Similar reasoning holds true for any validation of ACH TLVs (some of them are listed in [ID.ACH-TLVs]). Also, a transit LSP along the bypass that is not an upstream LSR for any of the protected LSPs, simply ignores this alert message for local processing. 4.3. Procedures at PLR A PLR runs a BFD session on the bypass LSP's G-Ach as described in [ID.BFD-MPLS] and [RFC5586]. When the BFD session is Up, it signals the association of the protected LSPs with that bypass by updating the RRO of the protected LSP as described in section 4.1. While the protection switch has not occurred at PLR and the BFD session in Up, the "Diagnostic" field in the BFD control packet is set to "Backup LSP not carrying protected traffic". When the protection switch to the bypass is effective the value "Backup LSP not carrying protected traffic" is no more applicable in the "Diagnostic" field. The PLR sends the BFD Control packets on the bypass with FLA-bit set. The PLR may choose to not set FLA-bit for every such BFD control packet. In that case, FLA-bit must be set for a period of at least the BFD "Detection Time" when any of the following events occur: 1. The association of a protected LSP with this bypass is setup or deleted. Kini & Liu Expires August 22, 2010 [Page 8] Internet-Draft Efficient Fast Re-route (FRR) using February 2010 Facility backup in ring topology 2. A protection switch or revert action for the bypass occurs (at PLR). 3. The bypass LSP is re-routed (typically with MBB) The BFD transmit timer must be treated as if it expired as soon as any of the above events occur. Additionally, to recover from packet loss, the BFD control packet should be sent a couple more times within a very short duration (typically 10 milliseconds). Under stable conditions the PLR should set FLA-bit with a much lower frequency. It is recommended that this frequency be once every 100 * "Detection Time". 4.4. Procedures at PLR-u An upstream LSR on the protected LSP that is a transit for the bypass LSP routed in a direction opposite to the protected LSP, deduces the association when processing the protected LSP's RRO. Such a LSR is a PLR-u and it creates a backup LSP (if needed) to a destination LSR as has been described earlier. It also associates a protection switch action of the protected LSP to the backup LSP with a trigger. The trigger is the receipt of a BFD control packet received as an alert packet on the bypass LSP. The protection switch action depends on the contents of the BFD control packet. If the "State" indicates that the session is Up, and the "Diagnostic" field is clear, traffic on the protected LSP is switched to the backup LSP. Else, traffic is sent along the protected LSP. If the bypass LSP is re-routed away from the PLR-u the traffic is sent along the protected LSP. Conversely if a bypass is newly created at that LSR and it acts as a PLR-u, the association between bypass LSP and protected LSP(s) is setup and a protection switch from protected LSP to backup LSP is done if depending on the contents of the BFD control packet alert packet on the bypass LSP. 5. Example Consider the scenario described in section 3. PLR (L8) creates a BFD session on B1. When the BFD session is Up, L8 appends to the RRO of P1, P3, P4, P5 the bypass LSP Tunnel Session (of B1) subobject following the interface address of the link L8->L7. L9 on interpreting this subobject for P1 and also that the corresponding LSP (B1) is going in the opposite direction as P1, functions as PLR-u for P1. Using the RROs of B1, P1 and the TED, it deduces that a backup path L9-L10-L1-L2-L3-L4-L5-L6 can be used to protect P1. Say this LSP is called B3. L9 initiates a Kini & Liu Expires August 22, 2010 [Page 9] Internet-Draft Efficient Fast Re-route (FRR) using February 2010 Facility backup in ring topology setup of B3 if it does not already exist and associates the trigger of an alert message on B1 to perform the protection switch from P1 to B3. Normally PLR-u (L9) receives a BFD control packet with FLA-bit set on the bypass with "State" as "Up" and the "Diagnostic" code as "Backup LSP not carrying protected traffic". Say L8 does not set FLA-bit on all BFD control packets on the bypass. When L8 detects failure on link L8-L7 (due to loss of signal, BFD, etc), it does a protection switch of P1 to the bypass B1. It also modifies the BFD control packet for the session on B1, by clearing the "Diagnostic" code. The FLA-bit for the session is set for a short time. When PLR-u (L9) receives the G-Ach message due to FLA-bit being set, it processes the embedded BFD control packet. Since "State" is "Up" and the "Diagnostic" code is clear, L9 does a protection switch of P1 to B3. Conversely when the fault on L8-L7 clears, L8 switches traffic on P1 out of the bypass and sends it directly to L7. It also modifies the BFD control packet for the session on B1, by setting the "Diagnostic" code to "Backup LSP not carrying protected traffic". The FLA-bit for the session is set for a short time. Again PLR-u (L9) receives the G-Ach message and processes the embedded BFD control packet. Since "State" is "Up" and the "Diagnostic" code is "Backup LSP not carrying protected traffic", L9 reverts P1 by sending traffic directly to L8 instead of the backup path. 6. Applicability to generic topologies This document uses a ring topology in isolation, purely for illustrative clarity purposes. A bypass LSP along with the segment that it is protecting forms a ring. This sub-graph can be placed in any generic topology. This problem and solution would still be equally applicable. 7. Future work The solution in its current form is applicable to point-to-point protected LSPs. However the problem applies to point-to- multipoint LSPs as well and a solution could follow a similar approach. Solutions to P2MP LSPs would be incorporated in future versions of this draft. Authentication parameters required by [ID.BFD] or the ACH TLVs, need to be communicated to all LSRs along the bypass if it Kini & Liu Expires August 22, 2010 [Page 10] Internet-Draft Efficient Fast Re-route (FRR) using February 2010 Facility backup in ring topology required to do authentication at each transit LSR. This can be simplified by extensions to [RSVP-TE]. 8. Security Considerations This document does not introduce new security considerations other than those already described in [MPLS-FRR], [ID.BFD] and [ID.fast-lsp-alert]. The optimization (optional) of not validating the authenticity of the BFD control packet (via authentication field or ACH TLVs) in a transit LSR of the bypass LSP should not introduce any new security threats. Note that this message is processed only when the "State" in the BFD control packet indicates Up. Any non-authentic packets would be detected by the LSRs at the LSP's end-points. Alarms would be raised by the end point and should be sufficient to diagnose potential problems. 9. IANA Considerations Values need to be allocated for: 1. "Flags" field position in SESSION_ATTRIBUTE object for newly defined flag "Local protection recording desired". Recommend using next unused field position 0x20. 2. "Type" field in RECORD_ROUTE object with C-Type 1 for newly defined subobjects Bypass_LSP_TUNNEL_IPv4_Session, Bypass_LSP_TUNNEL_IPv6_Session, Bypass_LSP_Tunnel_IPv4_Sender_Template and Bypass_LSP_TUNNEL_IPv6_Sender_Template. Recommend using values the next available values 4, 5, 6 and 7 respectively. 3. "Diagnostic" field in BFD control packet for newly defined code "Backup LSP not carrying protected traffic". Recommend using the first reserved value 9 for this purpose. 10. References 10.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RSVP-TE] Awduche, D., et al, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC3209, December 2001. Kini & Liu Expires August 22, 2010 [Page 11] Internet-Draft Efficient Fast Re-route (FRR) using February 2010 Facility backup in ring topology [MPLS-FRR] Pan, P., et al, "Fast Reroute Extensions to RSVP-TE for LSP Tunnels", RFC4090, May 2005. [RFC4561] Vasseur, J. P., et al, "Definition of a Record Route Object (RRO) Node-Id Sub-Object", RFC4561, June 2006. [RFC5586] Bocci, M., et al, "MPLS Generic Associated Channel", RFC 5586, June 2009. [ID.BFD] Katz, D., et al, "Bidirectional Forwarding Detection", draft-ietf-bfd-base-11, (Work in progress), Jan 2010. [ID.BFD-MPLS] Aggarwal, R., et al, "BFD For MPLS LSPs", draft- ietf-bfd-mpls-07 (Work in progress), June 2008. [ID.ACH-TLVs] Boutros, S., et al, "Definition of ACH TLV Structure", draft-ietf-mpls-tp-ach-tlv-01 (Work in progress), February 2010. [ID.fast-lsp-alert] Kini, S., Liu, A., "A fast LSP-alert mechanism", draft-kini-mpls-tp-fast-lsp-alert-00 (Work in progress), February 2010. 11. Acknowledgments The authors would like to thank Marc Rapoport and Joel Halpern for their comments. This document was prepared using 2-Word-v2.0.template.dot. Authors' Addresses Sriganesh Kini Ericsson 300 Holger Way, San Jose, CA 95134 Email: sriganesh.kini@ericsson.com Autumn Liu Ericsson 300 Holger Way, San Jose, CA 95134 Email: autumn.liu@ericsson.com Kini & Liu Expires August 22, 2010 [Page 12]