Traffic Engineering WG                                      Stephen Shew
Internet Draft                                           Nortel Networks
Document:                                                   October 1999

Fast Restoration of MPLS Label Switched Paths

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [1].

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

1. Abstract

This document describes a mechanism that quickly detects MPLS LSP failure when a link fails and that scales to all LSPs affected by the failure. Fast detection enables ingress LERs to quickly recover onto backup LSPs. An improvement in the reliability of LSPs is expected. The mechanism described relies on a node and network architecture that integrates L1/L2/L3 technology.

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [2].

Acronyms from the MPLS Architecture document [3] are used here. Specifically, we use:

   LSR - Label Switching Router
   LSP - Label Switched Path

Additionally,

   LER - Label Edge Router - an LSR which originates/terminates an LSP.

Shew                      Expires April 2000                    [Page 1]
Internet Draft    draft-shew-lsp-restoration-00.txt         October 1999

3. Introduction

MPLS provides a useful mechanism for placing traffic in IP networks, which is a key capability for traffic engineering. Label Switched Paths can be defined independently of L3 shortest paths, and this enables explicit engineering of traffic loads.

In an MPLS network supported by an underlying optical network, increasing reliance on large LSPs is a problem because the impact of an LSP failure could be extensive. Traditional SONET layer protection could be used for the L2 segments in an MPLS network, but the efficiency of bandwidth usage is an issue.

The potential for MPLS mechanisms to provide recovery performance similar to SONET has been mentioned [5], and a framework for protection and recovery in MPLS networks is described in [6]. A goal for MPLS protection, then, is to have better recovery performance when there is an L1 link failure in the network and also to be efficient with bandwidth (i.e., no reserved protection bandwidth).

This document describes a mechanism for fast LSP failure detection, which is needed before recovery procedures can be executed. It also describes an integrated network architecture in which reserved protection bandwidth is optional.

4. Problem Space

In this draft, we consider the problems of engineering the reliability of router-router links and of fast recovery of MPLS LSPs. Specifically, the problem of fast failure detection and notification of affected MPLS LSPs is addressed.

4.1 LSP failure detection

Fast recovery in MPLS is hampered by the fact that detecting an LSP failure at the ingress LER can take a long time. After a break in an LSP hop, Notification messages are propagated along the LSP intermediate nodes back to the ingress LER. Message processing occurs at each hop, and this adds delay in informing the ingress LER that the LSP has failed.

ATM has a similar problem with VC failure detection in that Release messages also have to be processed at each intermediate switch on the way back to the source node.
I.610 and I.630 are attempts to standardize fast detection and recovery methods in ATM, but they rely on support for OAM cells flowing along the VC.

In IP connectionless networks, failures affecting TCP sessions can also take a long time to detect since the end-systems must decide if the session went down. This is a consequence of the connectionless paradigm, where the only concern is maintaining connectivity. Because connectionless recovery is dependent on IP routing, detecting loss of connectivity can take seconds.

The fastest detection occurs at the local end of a link failure. Schemes that try to mend connections at the point of failure are known as "local repair" schemes. In ATM PNNI, the connection signalling procedure can crank back to an earlier intermediate point and then try to establish connectivity toward the destination. Local repair has performance advantages in maintaining connectivity but at the expense of efficiency (more hops, more bandwidth, more end-to-end delay).

4.2 Scaling of LSP Failure Notification

A second problem with a single L2 link failure is that multiple LSPs can be affected and many (hundreds of) ingress points must be informed. This is computationally expensive if MPLS signalling (LDP or RSVP) is used for each LSP. Because many LSPs can be affected by a single link failure, the magnitude of failure notification is an important issue.

4.3 Magnitude of L1 Failure

Just as a single L2 failure can affect multiple LSPs, a single L1 failure can affect multiple L2 links. Here, failure detection is not as much of a problem if L2 restoral mechanisms exist (e.g., ATM I.630), but multiple simultaneous router-router link failures have a large effect on the stability of an IP network.

When an IP network operator leases lines for router-router links, physical link diversity is a consideration.
It can be difficult to ensure that a physical link failure does not affect two or more leased lines due to the multiplexing complexities of L1 and, more recently, L0 networks.

If an IP network runs over an L2 network (e.g., ATM), there can be similar difficulty in ensuring minimum impact on router-router links when an L2 link fails. Even if both IP and L2 networks are controlled by the same organization, engineering for router-router link reliability over shared L2 links is complex.

5. Solution Motivation

This section describes several concepts that help motivate the solution presented later.

5.1 Overlay vs Integrated Networks

MPLS networks are less complex than an IP network overlaid on an L2 switched network. One of the reasons for this is that the L2/L3 topology is aligned and there is a single routing protocol that can take action when an L2 link fails. This was noted in [4].

In an overlaid network, an L2 link could be part of two switched connections that are actually router-router links. If that link fails, multiple router-router connections are affected, which triggers an IP routing protocol to update the topological and forwarding views.

If the L2 network also has a routing system (e.g., ATM PNNI), then both L3 and L2 routing systems are run. The L3 routing system will be affected by the L2 routing system in that the L2 routing system may attempt to reroute or re-establish connections. This can increase the detection time for a link failure at L3 because the L2 control layer has to declare the connection to be down first, and the L2 connection tear down procedure may have to be executed across multiple switches.

A single routing system minimizes the link failure detection time for L3 since there is no L2 control procedure that must precede the L2 connection down event.
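
The difference in failure magnitude between the two network styles can be illustrated with a small model. This is only a sketch under stated assumptions: the four-node ring, the node names, and the routes chosen for the two-hop paths are hypothetical, and L1 protection switching is deliberately ignored so that the raw impact of a single physical link failure is visible.

```python
def link(a, b):
    """Key for an undirected physical link between two L1 nodes."""
    return frozenset((a, b))

# Overlay: each router-router link rides an L1 path (a list of physical links).
# The routes of the two-hop paths are assumed for illustration.
OVERLAY = {
    ("LSR1", "LSR2"): [link("ADM1", "ADM2")],
    ("LSR2", "LSR3"): [link("ADM2", "ADM3")],
    ("LSR3", "LSR4"): [link("ADM3", "ADM4")],
    ("LSR1", "LSR4"): [link("ADM4", "ADM1")],
    ("LSR1", "LSR3"): [link("ADM1", "ADM2"), link("ADM2", "ADM3")],
    ("LSR2", "LSR4"): [link("ADM2", "ADM3"), link("ADM3", "ADM4")],
}

# Integrated: only one-hop L1 channels act as router-router links.
INTEGRATED = {rr: path for rr, path in OVERLAY.items() if len(path) == 1}

def affected(router_links, failed):
    """Router-router links whose L1 path uses the failed physical link."""
    return sorted(rr for rr, path in router_links.items() if failed in path)

cut = link("ADM2", "ADM3")
print(affected(OVERLAY, cut))     # links lost in the overlay model
print(affected(INTEGRATED, cut))  # links lost in the integrated model
```

With these assumed routes, the overlay model loses three router-router links to one fiber cut, while the integrated model loses exactly one.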

5.2 L1 Detection

As noted earlier, L1 failure detection is fast due to physical methods (loss of light, loss of carrier signal). This is an attractive property. Further, in a TDM, optical mux (SONET), or optical cross connect network, when a link fails all of the paths (at that layer) which use the link go down. Unlike higher layers, the endpoints of those paths detect the failure quickly because the signalling of the failure is very fast (e.g., AIS signals in SONET) and because the signalling is sent to each channel of the failed link.

So in L1 networks, the detection of a failed connection is fast and scales well for all connections on the failed link. To be fair, the number of L1 connections on a link is not as high as the number of, say, ATM VCs on a link. However, these detection properties are highly desirable for L2 connections on L2 links.

Fast detection is possible with ATM hardware that can handle inband OAM cells (I.630), but is not really tractable for MPLS LSPs. This is because of the variety of L2 media (especially Ethernet and PPP) and the number of packets that would have to be sent to get fast (<100 millisecond) detection. Also, I.630 is slower due to timers in some of its mechanisms (e.g., 3.5 seconds for Loss of Continuity, up to 500 ms for AIS injection).

6. L1/L2/L3 Integration Solution

A key to the solution for fast detection is the alignment of L1, L2, and L3 capabilities into a single node. This architecture and its impacts on the ability to detect LSP failure are now described.

6.1 L1/L2/L3 Integration

As was noted earlier, in MPLS LSRs, the alignment of the L3 and L2 topology brings some advantages in the speed at which the network can react to a link failure. This integration is extended to encompass L1 components in order to realize further speed advantages.

We define an L1/L2/L3 switch as an LSR combined with an L1 cross connect switch.
This could be a SONET Add/Drop Mux, an optical cross connect, or a traditional TDM switch. The integrated switch is able to originate and terminate IP traffic from the L1 cross connect. Conceptually, this is done over dedicated L1 channels between the L1 cross connect and the pure IP router function of the integrated switch.

The switch can also tandem L1 traffic through the L1 cross connect component. This is similar to the way in which LSRs can tandem L2 data in their established tandem LSPs. In the L2 case, this is a label swap. In the L1 case, it is, for example, a mapping from one time slot to another time slot on an outgoing interface.

Two L1/L2/L3 nodes are connected by a physical L1 link. A channel in that link is used as a router-router IP link. For example, this could be an OC-3 channel of an OC-48 link with PPP over SONET for the framing. This is analogous to the L2 control channel between two MPLS switches connected over an ATM interface.

A key difference between this type of network and L2/L3 networks which are overlaid on L1 networks is that the L1/L2/L3 network does not have any L1 paths which act as router-router links. In an integrated network, the L3 routing protocol has a view of both the L2 and L1 topology since those layers are now aligned.

Consider the SONET ring in Fig. 1 with 4 Add/Drop Muxes (ADMs) and an LSR attached to each ADM. A typical configuration with an L1 overlay is to fully mesh the LSRs. Of the six router-router links, four are one-hop channels between ADMs, and two are actually SONET paths which bypass an ADM. If the ring is protected and there is a fiber cut, all router-router links will be preserved, as the affected L1 paths would be rerouted at L1 over the protection bandwidth.

Converting such a network to an L1/L2/L3 network would involve the elimination of the two-hop L1 paths that act as router-router links.
The protection bandwidth could also be fully used and, as a consequence of the failure recovery method of section 6.3, the L1 protection function is not needed.

                 LSR1
                  |
             +---ADM1---+
             |          |
      LSR4--ADM4       ADM2--LSR2
             |          |
             +---ADM3---+
                  |
                 LSR3

         Fig. 1 SONET ring with 4 LSRs

Note that an L1/L2/L3 network can also be built with LSRs and TDM switches.

6.2 L1 Cut-through Paths

In earlier IP over ATM work (e.g., MPOA, LANE, NHRP, use of OSPF ARA), the notion of an "L2 cut-through" was defined. This is a VC which is set up to directly connect two IP routers/hosts for a known IP flow between those entities.

MPLS re-uses the "L2 cut-through" in a different manner. Instead of a separate L2 network around which the L3 nodes are connected over L2 cut-throughs, MPLS combines or integrates the L2 cut-throughs in the same L2/L3 network. That is, every LSR is capable of L2 switching and L3 forwarding.

Cut-through paths are distinguished from L2 paths which are used as L3 links. When an L2 path is an L3 link, it carries L3 routing control traffic and is equivalent to a PPP link between IP routers. Cut-throughs are not router-router links and don't carry routing control traffic. Thus they don't need to appear in the L3 topology database.

LSPs are used by inserting them as next hop entries in the IP forwarding table of ingress LERs. If incoming IP connectionless traffic matches a Forwarding Equivalence Class, the traffic is sent to the corresponding LSP. Once on the LSP, the traffic is label switched along the path to the end of the LSP and is independent of how L3 forwarding would have directed it.

Existing L2 overlays on L1 networks exhibit the same separation as earlier IP over ATM work. An L1 path is configured for one L2 link between two L2 switches. This L1 path is a series of channels that are connected by L1 cross connects. Typically, the service offered over L1 paths is a leased line.
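
The forwarding step described in this section, in which an ingress LER matches incoming IP traffic to a Forwarding Equivalence Class and sends it to the corresponding LSP, can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes that a FEC is simply a destination prefix, and the table contents, LSP names, and the `next_hop` helper are hypothetical.

```python
import ipaddress

# Hypothetical FEC table of an ingress LER: destination prefix -> LSP name.
FEC_TABLE = {
    ipaddress.ip_network("10.1.0.0/16"): "lsp-to-ler-b",
    ipaddress.ip_network("10.1.2.0/24"): "lsp-to-ler-c",
}

def next_hop(dst, fec_table, default="l3-forwarding"):
    """Return the LSP for the longest matching FEC prefix, else the default."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in fec_table if addr in net]
    if not matches:
        # No FEC matches: the packet is handled by ordinary L3 forwarding.
        return default
    best = max(matches, key=lambda net: net.prefixlen)
    return fec_table[best]

print(next_hop("10.1.2.7", FEC_TABLE))
print(next_hop("10.1.9.9", FEC_TABLE))
print(next_hop("192.0.2.1", FEC_TABLE))
```

Once a packet has been assigned to an LSP in this way, the intermediate nodes switch it without consulting the L3 table, which is why the path taken is independent of how L3 forwarding would have directed it.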

The same concept of MPLS integration can be applied to L1 paths. Here, in an L1/L2/L3 network, an L1 path has an LSR at every cross connect point. To use an L1 path, treat it as if it were an LSP, or overlay an LSP onto this path. That is, consider the L1 path as a cut-through.

When an incoming IP packet is matched to a Forwarding Equivalence Class associated with this L1 cut-through, the IP forwarding table entry points to the start of this L1 path. As with L2 cut-throughs, an L2 header is added. The packet is sent to this path and is then L1 switched until it reaches the end of the path. At the termination point, the packet could be L2 switched or L3 forwarded.

Like existing LSPs, packets traversing an L1 cut-through are independent of how L3 forwarding would have directed them. Also, L1 cut-throughs are not router-router links.

6.3 Fast Recovery

6.3.1 LSP Recovery

Using L1 cut-throughs in an L1/L2/L3 network enables fast detection of LSP failure. Consider two LSPs that are L1 cut-throughs:

   LSR1-LSR2-LSR3-LSR4 and
   LSR5-LSR2-LSR3-LSR6

If L1 link LSR2-LSR3 goes down, all nodes in both LSPs can detect the path failure based on L1 physical methods. Examples are loss of light (with the Alarm Indication Signal in SONET) or loss of carrier signal (TDM). In particular, the LSP endpoints can determine that the LSP is down much faster than with the protocol-based method in LDP, in which Notification messages are processed at each LSR on the paths back to the ingress and egress. For example, propagation of the physical failure takes about 5 microseconds per kilometer.

Not only is the failure detection fast, but it scales for all LSPs that are affected by a single L1 failure. In the example above, two LSPs are notified, but if there were 192 paths in an OC-192 link, then all of their endpoints could detect the link failure within a short period of time (a few milliseconds).
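
The two properties claimed above, speed and scaling, can be made concrete with a short calculation. This sketch uses the 5 microseconds per kilometer figure and the two example cut-throughs from the text; the per-hop processing time, the distances, and all function names are assumptions chosen only for illustration.

```python
PROPAGATION_US_PER_KM = 5.0  # figure quoted in the text

def l1_detection_us(distance_km):
    """Time for a physical failure indication to reach an endpoint."""
    return distance_km * PROPAGATION_US_PER_KM

def signalled_detection_us(distance_km, hops, per_hop_processing_us):
    """Propagation plus message processing at every hop (assumed cost)."""
    return l1_detection_us(distance_km) + hops * per_hop_processing_us

# The two L1 cut-throughs from the example in the text (names hypothetical).
CUT_THROUGHS = {
    "lsp-a": ["LSR1", "LSR2", "LSR3", "LSR4"],
    "lsp-b": ["LSR5", "LSR2", "LSR3", "LSR6"],
}

def endpoints_notified(cut_throughs, failed_link):
    """Endpoints of every cut-through that traverses the failed link."""
    a, b = failed_link
    hit = {}
    for name, path in cut_throughs.items():
        hops = list(zip(path, path[1:]))
        if (a, b) in hops or (b, a) in hops:
            hit[name] = (path[0], path[-1])
    return hit

print(l1_detection_us(1000))                  # 1000 km at 5 us/km
print(signalled_detection_us(1000, 3, 1000))  # assumes 1 ms per hop
print(endpoints_notified(CUT_THROUGHS, ("LSR2", "LSR3")))
```

Under these assumptions a failure 1000 km away is visible at the endpoint after 5 ms of propagation, and the same physical event reaches the endpoints of every cut-through on the link without any per-LSP signalling.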

When an LSP failure is detected, the LSR can reroute the traffic to a backup LSP. This backup LSP could be pre-defined to be link disjoint from the primary LSP, and could also be set up in advance. To avoid wasting dedicated bandwidth (i.e., a dedicated backup L1 cut-through), the backup LSP for the L1 cut-through could be an LSP created over L2 connections which share bandwidth (e.g., ATM UBR VC).

Assuming that a backup LSP is already set up, restoration of a failed LSP that is overlaid on an L1 cut-through could be implemented with similar performance to SONET Line and Ring restoration.

For LSRs that provide L3 connectionless forwarding, traffic from the failed LSP could also be immediately handled by L3 forwarding if a backup LSP is not provided.

6.3.2 Router Link Recovery

In an L1/L2/L3 network, when a physical link goes down only one router-router link is affected. This is in contrast to an overlaid network, where multiple router-router links could be affected by a single L1 link failure. The alignment of the layers can thus reduce the magnitude of the L1 failure in the L3 topology.

Note that the IP routing update process executes in parallel with the fast LSP recovery scheme. It will, however, be much slower due to protocol processing and topology database maintenance.

6.4 Applicability and Limitations

This scheme requires control of an L1 network and/or information from an L1 service provider (leased line) on L1 topology. Co-ordination of L1 changes would be important in the latter case. The MPLS network operator needs to be able to configure L1 paths or have them configured so that matching LSPs can be overlaid.
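
The requirement that L1 paths and overlaid LSPs match can be expressed as a simple consistency check. The sketch below is an assumption-laden illustration only: the inventory of cross connects, the idea of recording a co-located LSR per cross connect, and the `overlay_mismatches` helper are all hypothetical and are not part of any mechanism defined in this document.

```python
# Hypothetical inventory: the LSR (if any) co-located with each L1 cross connect.
COLOCATED_LSR = {"XC1": "LSR1", "XC2": "LSR2", "XC3": "LSR3", "XC4": None}

def overlay_mismatches(l1_path, lsp_hops, colocated):
    """List reasons why an LSP cannot be overlaid on the given L1 path."""
    problems = []
    expected = []
    for xc in l1_path:
        lsr = colocated.get(xc)
        if lsr is None:
            # An integrated network needs an LSR at every cross connect point.
            problems.append(f"{xc}: no co-located LSR")
        expected.append(lsr)
    if not problems and expected != list(lsp_hops):
        problems.append(
            f"LSP hops {list(lsp_hops)} do not match L1 path {expected}"
        )
    return problems

print(overlay_mismatches(["XC1", "XC2", "XC3"], ["LSR1", "LSR2", "LSR3"], COLOCATED_LSR))
print(overlay_mismatches(["XC1", "XC4"], ["LSR1", "LSR4"], COLOCATED_LSR))
```

An operator relying on a leased L1 service would need the provider's topology information to populate such an inventory, which is why co-ordination of L1 changes matters in that case.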

It is recognized that the number of channels in optical or TDM multiplexing is less than the number of labels available in L2 networks (e.g., VPI/VCI space on an ATM interface), so the number of LSPs is limited to hundreds, not tens of thousands.

LSPs overlaid onto L1 cut-throughs will have fixed bandwidths, unlike LSPs that share a common L2 link. They will also be bidirectional since L1 facilities come this way.

Despite the above two limitations, the solution may be a good fit in high performance IP backbones whose LSPs are 'core' LSPs which contain stacked LSPs inside them.

The solution adheres to the principle of end-system control; in this case, the LER contains the intelligence to use the L1 cut-through and the recovery procedure.

7. Intellectual Property Considerations

Nortel Networks may seek patent or other intellectual property protection for some or all of the technologies disclosed in this document. If any standards arising from this document are or become protected by one or more patents assigned to Nortel Networks, Nortel intends to disclose those patents and license them on reasonable and non-discriminatory terms.

8. Security Considerations

In addition to security issues raised in [3], if the MPLS network leases L1 services from another organization, then maintaining the alignment of L1 switches with LSRs requires that the MPLS network operator be notified of any changes in the L1 network. Otherwise, L1 cut-throughs may not be correctly set up.

9. References

[1] Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

[2] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[3] E. Rosen, A. Viswanathan, R. Callon, "Multiprotocol Label Switching Architecture", Work in Progress, August 1999.

[4] C. Semeria, J. Stewart, "Optimized Routing Software for Reliable Internet Growth", Juniper Networks white paper, July 1998.

[5] W.F. Maton, "MPLS and CA*Net 2/3", MPLS'99 Conference, June 1999.

[6] S. Makam, V. Sharma, K. Owens, C. Huang, "Protection/Restoration of MPLS Networks", work in progress, June 1999.

10. Acknowledgments

The author would like to thank Ken Hayward for discussing many of the ideas and issues in this draft.

11. Author's Addresses

Stephen Shew
Nortel Networks
PO Box 3511 Station C
Ottawa, ON
Canada K1Y 4H7
Phone: 613-763-2462
Email: sdshew@nortelnetworks.com

Full Copyright Statement

"Copyright (C) The Internet Society (date). All Rights Reserved. This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.