Network Working Group Venkata Naidu Internet Draft Marconi Expiration Date: Dec 2004 June 2004 IP Fast Reroute using Multiple Path Algorithm (MPA) draft-venkata-ipfrr-mpa-00.txt Status of this Memo By submitting this Internet-Draft, I certify that any applicable patent or other IPR claims of which I am aware have been disclosed, and any of which I become aware will be disclosed, in accordance with RFC 3668. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http:// www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on December 29, 2004. Copyright Notice Copyright (C) The Internet Society (2004). All Rights Reserved. Abstract This document describes a practical mechanism to implement IP fast reroute using multiple viable loop-free next hops, which are not necessarily shortest path to a destination. Upon a link failure, all the destinations reachable via the primary nexthop will be forwarded to pre-computed alternative loop free nexthops, if any, without much delay. The mechanism does not impose any new protocol message/capability exchange. The mechanism does not use any tunneling techniques. It is shown that simple changes to existing IGP shortest path computation and routing table structures are sufficient to achieve the desired functionality. More over, routers which implement the proposed mechanism can co-exist and can be fully backward compatible with existing unmodified routers. Venkata Expires Dec 2004 [Page 1] INTERNET DRAFT IP Fast Reroute using MPA June 2004 Table of Contents 1. Introduction ...................................................... 3 2. Terminology ....................................................... 3 3. Framework ......................................................... 4 4. Multiple Viable Loop Free NextHops ................................ 6 5. Changes to Routing Table Structure ................................ 7 6. Computing Multiple Viable Loop Free Next Hops ..................... 8 7. Conclusion ....................................................... 10 8. Security Considerations .......................................... 10 9. IANA Considerations .............................................. 10 10. Normative References ............................................ 11 11. Informative References .......................................... 11 12. Author's Address ................................................ 11 Venkata Expires Dec 2004 [Page 2] INTERNET DRAFT IP Fast Reroute using MPA June 2004 1. Introduction A primary function in today's Internet router is to determine the next hop to which a packet should be forwarded. The algorithm used by each router to obtain the next-hop information for a packet should always result in a loop free routing path to the packet's destination. An example of such an algorithm is the interior gateway link-state protocols like OSPF and ISIS. In an IGP-based router, all packets with the same destination are forwarded to a next hop that lies on a path of shortest distance to the destination. Given a set of link states, all packets between the same source and destination are forwarded along one or more paths of the same shortest distance. This document presents an algorithm based on [RECONF] that enables a packet to be forwarded to its destination along multiple alternate loop-free paths, which are not necessarily of shortest distance. The key idea of this mechanism is to maintain an efficient data structure in a router that helps compute for each destination, alternate viable next hops from the router in order to ensure loop free routing for each packet. Moreover, the data structure used in this algorithm can be implemented as an add-on software to existing link-state protocols, such as OSPF. This algorithm is totally interoperable with link-state protocols in the sense that loop free routing is still guaranteed in a network in which only some of the routers use the algorithm. 2. Terminology A network will be modeled as a directed graph where each router corresponds to a node and each link corresponds to an edge in the graph. G = (V,E) - A directed connected graph where V is the set of nodes and E is the set of edges in the graph. ei - Each edge ei (where i=0, . . ., E-1) has an associated weight (cost) of wi, which is assumed to be positive. w(A, B) - The weight of an edge connecting node A to node B is denoted by w(A, B). T - A rooted tree T is a subgraph of G such that a particular node is specified as the root and every node in G is reachable from the root through a unique directed path using only edges in T. Venkata Expires Dec 2004 [Page 3] INTERNET DRAFT IP Fast Reroute using MPA June 2004 SPT - A tree T is said to be a shortest path tree (SPT) for G if the distance of every node ni in T is the shortest distance of ni in G. Note that there can be multiple SPTs in a graph, but the shortest distance of any node is unique. Nexthop - A node Nj is called the next hop of node Ni along a directed path if Ni is directly connected to Nj via an edge of the path. d(S, D) - The shortest distance d(S, D) from node S to node D is the length of the shortest path from node S to node D. The length of a path is the sum of the weights of the edges in that path. The distance of a node Ni (where i=1, . . ., N) in T is the length of the directed path in T connecting the root to Ni. The shortest distance of a node Ni is the length of a shortest path in G connecting the root to Ni. P(S, D) - The set includes all paths between the routers S and D. D(S, D) - The subset of P(S,D) consists of all paths from S to D such that the shortest distance from each consecutive hop to the destination D decreases as the path is traversed, i.e., if Nj is the next hop of Ni along the path, then d(Nj,D) < d(Ni,D). S(S, D) - The subset of D(S,D) consists of all paths from S to D such that the shortest distance from each consecutive hop to the destination D decreases and the distance from S to each consecutive hop increases as the path is traversed, i.e., if Nj is the next hop of Ni along the path, then d(Nj,D) < d(Ni,D) AND d(S,Ni) < d(S,Nj) M(S, D) - The subset of S(S,D) consists of all shortest paths from S to D. 3. Framework In the Internet, a packet from a source traverses intermediate network routers before reaching its destination. When a packet arrives at a router, a routing table is used to determine the next hop and the corresponding outgoing link (interface) to which the packet should be forwarded. Interior gateway link-state routing protocol, such as OSPF [OSPF], computes this next-hop information in two phases. First, given the current link states in the network, each router S keeps track of a shortest path tree (SPT) rooted at S to every other node. Since the SPT determines a unique path from S to each router Ri, the first hop from S along this path will then be used as the next hop to forward any packet that has Ri as its destination. Venkata Expires Dec 2004 [Page 4] INTERNET DRAFT IP Fast Reroute using MPA June 2004 Although a router using OSPF has global information about the network topology, it determines only the local route of a packet - the next hop. When each router computes such next-hop information based on an SPT, then each packet is guaranteed to be forwarded along a loop-free path to its destination. If multiple SPTs rooted at router S co-exist in a network, there can be multiple choices of next hops from S to forward a packet, each of which will lead to a different shortest path of the same distance (also called equal cost multi-path - ECMP). In this case, the packet can be forwarded to any of these next hops, which would also result in a loop-free path. This document presents a new Multiple Path Algorithm (MPA) which generalizes this idea by allowing a router to forward packets to multiple viable next hops that are not necessarily part of a shortest path from the router to the packet's destination. The essence of MPA is the implementation of an efficient data structure in a router that helps compute alternate viable next hops for each destination. When each router forwards its packets to any of the viable next hops, all packets will be routed to their destinations on loop-free paths. Moreover, a degree of optimality is determined for each viable next hop in the MPA. Different degrees of optimality yield paths of different distances. An advantage of maintaining multiple viable next hops in a router for each destination is to speed up recovery from link failures. A router supporting MPA can recover from link failures much faster since the recovery mechanism using alternate paths can be implemented locally by the forwarding engine in the router, and there is almost no delay incurred by the link failure. On the other hand, without supporting MPA, after detecting its link failure, the router would need to first inform other routers about the status of the link failure by flooding the information throughout the network. The routers would then re-compute/update the shortest path tree (from scratch or using dynamic SPF algorithms) to determine a new alternative path. Another desirable property of our MPA scheme is that the algorithm and its data structure can be efficiently implemented as an add-on component to existing IGPs such as OSPF and ISIS. It only adds a small constant overhead to the shortest path tree computation of the underlying routing protocol. Each MPA-enhanced router computes multiple paths based on the topology information exchanged by the router protocols. Conventional routers based on shortest path routing can co-exist with MPA-enhanced routers in the same routing area and still guarantee loop-free routing for each packet. As we will discuss in more detail later, such interoperability is a consequence of the loop-free routing policy employed by the MPA scheme. In general, an MPA-enhanced router can interoperate with any router whose routing protocol uses the loop-free routing policy. Venkata Expires Dec 2004 [Page 5] INTERNET DRAFT IP Fast Reroute using MPA June 2004 4. Multiple Viable Loop Free Nexthops The first objective of MPA algorithm is to determine multiple loop free next hops (if any) from a router to each destination router in the network. This notion of viable next hop is destination as well as routing algorithm dependent. Formally, let S be a source router directly connected to another router Ni. Given a routing algorithm and a destination router D != S, N is a viable next hop for S if any packet departing N with destination D will never return to S. As we discussed earlier, the basic requirement of any routing protocol is that any packet should be routed to its destination along a loop-free path. IGPs guarantees loop free routing to each destination by imposing that each router chooses the next hop based on an SPT. If the SPT is unique, then corresponding to each destination the next hop from the router is also unique. In addition, the routing of each packet is optimized with respect to the total cost associated with the traversed links. Determining the next hop by computing an SPT is one way of guaranteeing loop-free routing. There are many different types of packet forwarding rules that when applied to each router in the network will guarantee that no packet will return to a router it has visited before. S / \ / \ N Ri . . . . . . . . . . . . . . . . . . . . D Figure 1. From source S, destination D is reachable via multiple Nexthops N and Ri. If router N is the next hop of router S along a shortest path from S to destination router D, then we must have d(N,D) < d(S,D). As in Figure 1, it turns out that if we replace N by any router Ri such that d(Ri,D) < d(S,D) as the next hop for S, alternate loop Venkata Expires Dec 2004 [Page 6] INTERNET DRAFT IP Fast Reroute using MPA June 2004 free paths can be found. In other words, it can be shown that loop free routing can be guaranteed if the shortest distance from a router to its destination is always larger than the shortest distance from the next hop of the router to the destination. More formally, such a general routing policy can be stated as: If a packet with destination D is forwarded by router S to its next hop N, then we must have d(N,D) < d(S,D). It follows that given a destination, any next hop of the router chosen according to the loop-free routing policy is a viable next hop. As we will discuss in next section, the proposed algorithm makes use of a data structure to help compute alternate viable next hops whose shortest distances to the destination are less than the shortest distance from the router to the destination. In practice, many Internet routers compute next-hop information based on shortest path routing. Since shortest path routing is a special case of the loop-free routing policy, MPA routers and conventional routers based on shortest path routing can interoperate in the same network and guarantee a loop-free route for each packet. 5. Changes to Routing Table Structure A conventional SPF algorithm (e.g., Dijkstra, Bellman-Ford, etc.) during its execution usually keeps track of certain state information about each node in the network. Typically, such information consists of: * a distance attribute * the next hop attribute from the router, and * an optional parent attribute for each node. At each instant of the algorithm execution, the distance attribute of a node usually represents the distance of the shortest path found so far from the source router to that node, while the next hop attribute and the parent attribute respectively identify the first hop and the last hop of that path. +-----------------------------------------------------------------+ | Destination | NextHop | Edge Weight | Total Distance | +-----------------------------------------------------------------+ | X | N1 | w(S, N1) | d(S,X) via N1 | | X | N2 | w(S, N2) | d(S,X) via N2 | | X | N3 | w(S, N3) | d(S,X) via N3 | | . . . | . . . | . . . | . . . | +-----------------------------------------------------------------+ Table 1: Routing Table structure for a destination X Venkata Expires Dec 2004 [Page 7] INTERNET DRAFT IP Fast Reroute using MPA June 2004 To compute alternate viable next hops, the MPA uses a data structure that obtains more state information about each node in the network than those described above without incurring significant computational complexity. More specifically, the MPA in a given router S maintains the following data structure for each destination X (see Table 1): 1. For each node X, the first entry in the data structure will contain the shortest distance dopt(S,X) from the router S to X discovered so far. This information is computed by existing SPF algorithms such as Dijkstra or Bellman-Ford. 2. In addition, the cost w(S,Ni) for each outgoing link connecting the router S to a potential next hop Ni is also maintained. 3. Furthermore, the data structure keeps track of the length of the shortest path found so far that uses Ni as the next hop from S, which is denoted by dNi(S,X). When the execution of the MPA terminates, dNi(S,X) should be equal to the sum of w(S, Ni) and the shortest distance d(Ni,X) from Ni to X. By maintaining this data structure, the MPA identifies viable next hops for destination X by making use of the following equation: IF { d(S,X) via Ni } - w(S,Ni) < d(S,X) THEN Ni is a viable Next Hop for destination X from source S. Once the MPA has obtained information about the distance attribute for each possible next hop, it can easily determine which next hops are viable for a given destination (although not all viable next hops might be discovered). By searching through different paths, the data structure at every node can be updated, increasing the number of viable next hops. As long as the shortest distance between two nodes is computed correctly, the resulting viable next hops will not lead to a routing loop. 6. Computing Multiple Viable Loop Free Next Hops The construction and update of the data structure is achieved by the MPA in conjunction with a SPF algorithm. In the following discussion, an underlying SPF algorithm is assumed to be available to compute the shortest path tree from a router to every other node in the network. While it is essential that the underlying SPF algorithm calculates the exact shortest distance from a router to every other node, the MPA does not necessarily compute all viable alternate paths to each destination. These alternate paths only provide for a greater routing choice and performance but are not essential for the correctness of the routing procedure. The only constraint is that the selected paths must not violate the loop-free routing policy. Venkata Expires Dec 2004 [Page 8] INTERNET DRAFT IP Fast Reroute using MPA June 2004 During its execution, the underlying SPF algorithm will constantly compare the distance attributes of two neighboring nodes. IGPs use commonly used SPF algorithms such as Bellman-Ford or Dijkstra are comparison-based algorithms, whose basic operation is to compare attributes between two neighboring nodes. Under any of these SPT algorithms, the distance attribute of a node will eventually converge to the shortest distance from the source to that node. In the data structure of the MPA, the distance attribute of node X (from S) computed by the SPF algorithm is contained in the corresponding field dopt(S,X). This field is initialized and updated by the underlying SPF algorithm, independently of other operations of the MPA. The other distance attributes di(S,X) are initialized to infinity. Every time the underlying SPF algorithm makes a comparison between nodes A and B, dopt(B) is set to the minimum of its old value and dopt(A) + w(A, B). At the same time, while the comparison between nodes A and B is being made, the MPA is triggered and starts making its own comparisons. For every next hop Ni from S, dNi(S,B) is set to the minimum of its old value and dNi(S,A) + w(A,B). After the execution of the shortest path algorithm is completed, the data structure in Table 1 is now updated. If the entry for a nexthop Ni satisfies the loop-free routing policy, then it can be used as a viable alternative hop to reach the destination. The number of viable next hops discovered by the MPA depends on which underlying SPF algorithm the router is using. The most commonly used SPT engine is Dijkstra's algorithm. For example, it is the one used in OSPF. If Dijkstra is used as the underlying SPF algorithm, then all the paths in S(S,X) will be probed by the MPA. Therefore, at the end of execution, dNi(S, X) corresponds to the distance of the shortest path in S(S, X) that traverses the next hop Ni. However, this algorithm might not probe the paths that are in D(S,X) but not in S(S,X). Note that D(S,X) is a super set of S(S,X). Note that the first hop of any path in D(S,X) is always a viable path, but this algorithm might not be able to find it. If we want the MPA to be able to find the next hops associated with the paths in D(S,X) as well as with those in S(S,X), a more complex search is required. A more exhaustive search of alternate next hops can be obtained by running a set of independent shortest path tree algorithms in parallel. For every next hop Ni, any shortest path algorithm (including dynamic ones) is executed using only the distance attribute dNi(S,X) in the data structure associated with the destination X. At the end of the executions, the data structure will be thoroughly updated. All of the next hops Ni that satisfy the loop-free routing policy are obtained. These next hops constitute the set of all first hops of the paths in D(S,X). The Venkata Expires Dec 2004 [Page 9] INTERNET DRAFT IP Fast Reroute using MPA June 2004 distance dNi(S,X) is the distance of the shortest path in D(S,X) that traverses next hop Ni. Once the MPA has terminated, its data structure will contain a list of viable next hops. The distance attribute for each next hop dNi(S,X) can be used by the forwarding engine to determine the desirability of the each viable next hop. The smallest dN1(S,X) corresponds to the next hop N1 of the shortest path to destination D. In general, the smaller dNi(S,X) is, the more desirable Ni is as a next hop for X, since it is most likely to lead to a shorter path to X. 7. Conclusion The multiple path algorithm is an addition to a regular shortest path algorithm used in routers. Its purpose is to obtain alternative routing paths. The multiple path algorithm is interoperable with most existing shortest path algorithms, including the Bellman-Ford and Dijkstra algorithms used in routing protocols such as RIP, OSPF and ISIS. The extra paths obtained by the multiple path algorithm are guaranteed to forward the packet in such a way that there will never be a loop in the packet's route even if other routers don't use the MPA. The data structure contained in every node determines how much the packet is advancing toward its destination. If the advance is positive, there clearly can be no loop. This metric can also be used to decide on the desirability of the path. The mechanism described in this document [RECONF] is limited in scope in a way that only directly connected neighbor routers are used as viable loop free nexthops. [IPFRR-SSPA] presents a more advanced mechanism to compute and reroute packets using dynamic SPF and tunnels to remote nexthops. 8. Security Considerations The mechanism described in this document is an add-on off-line components to existing link-state IGPs. In the worst case, the behavior and operation of the proposed algorithm is as good as existing hop-hop routing. There are no known security issues with the proposed mechanism. 9. IANA considerations There are no IANA considerations that arise from this document. Venkata Expires Dec 2004 [Page 10] INTERNET DRAFT IP Fast Reroute using MPA June 2004 10. Normative References Internet-drafts are works in progress available from http://www.ietf.org/internet-drafts/ 11. Informative References Internet-drafts are works in progress available from http://www.ietf.org/internet-drafts/ [IPFRR] Shand, M., "IP Fast Reroute Framework", draft-ietf-rtgwg-ipfrr-framework-01.txt, Work In Progress. [OSPF] Moy, J., "OSPF Version 2", RFC 2328, April 1998. [RECONF] P. Narvaez, "Routing Reconfiguration in IP Networks", Doctoral Thesis, MIT, May 2000. [IPFRR-SSPA] Venkata Naidu, "IP Fast Reroute using Stitched Shortest Paths Algorithm", draft-venkata-ipfrr-sspa-00.txt, Work In Progress. 12. Author's Address Venkata Naidu Marconi Communications 900 Chelmsford Street, Cross Point III, 4th Floor Lowell, MA 01851 Phone: (978) 275-7418 EMail: Venkata.Naidu@Marconi.com Intellectual Property Statement The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the IETF's procedures with respect to rights in IETF Documents can be found in BCP 78 and BCP 79. Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr. Venkata Expires Dec 2004 [Page 11] INTERNET DRAFT IP Fast Reroute using MPA June 2004 The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org. Disclaimer of Validity This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Copyright Statement Copyright (C) The Internet Society (2004). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. Acknowledgment Funding for the RFC Editor function is currently provided by the Internet Society. Venkata Expires Dec 2004 [Page 12]