Network Working Group G. Wilfong Internet Draft A. Basu F. B. Shepherd Expires November 2002 Lucent Technologies C.-H. L. Ong Oxford University A. Rasala MIT Eliminating IBGP Oscillations Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Abstract This document proposes an extension to the Border Gateway Protocol (BGP), that eliminates the route update oscillations and routing loops in Internal BGP (IBGP) even in the presence of route reflection. 1. Introduction We describe an extension to the Border Gateway Protocol (BGP) [1], that eliminates route oscillations and routing loops in Internal BGP (IBGP) even in the presence of route reflection [2]. The rest of the document is organized as follows: + Section 2 gives an overview of the problem; + Section 3 describes an extension to BGP; + Section 4 outlines the intellectual property rights issues; + Section 5 contains acknowledgments; and + Section 6 contains references. 2. Brief Description of the Route Oscillation Problem May 2002 Eliminating IBGP Oscillations [Page 2] For a given Autonomous System (AS), consider the routes chosen as the best routes by each router in the AS to some given fixed destination d. We call the set of such routes a routing. Then a routing is said to be a stable routing if no router would change its best route after being sent the best routes of all of its BGP neighbors. The route oscillation problem we consider occurs when some subset of routers within an Autonomous System (AS) exchange route update information forever without ever settling on a stable routing to a particular destination. There are actually two classes of possible route oscillations (1) persistent route oscillations is the case when there does not exist any stable routing (i.e., where no matter what route each router chooses, there will always be at least one router that will change to a more preferred route when its neighbors announce their current choices) and (2) transient route oscillations where there are multiple stable routings but because of timing coincidences they do not settle on one until such timing coincidences fail to occur. It is known that in certain network configurations, a combination of route reflectors [2] and the use of the Multi-Exit Discriminator (MED) attribute [1] can lead to persistent route oscillations [3]. In fact, it has been noted that the use of route reflectors alone in the absence of MEDs can lead to persistent or transient route oscillations [4]. Also, whether or not route reflectors are used, configurations using the MED attribute can lead to transient route oscillations[5]. The real problem with the potential for transient route oscillations is not that such timing coincidences could occur indefinitely but the fact that a router could adopt radically different routes depending on the timing of route update messages. This non-determinism can result in routers converging to one stable routing but later, after a re-start, the routers could choose other routes. In fact, there are cases where one stable routing actually fails to respect the semantics of the MED attribute. Moreover, such behavior can make traffic estimations and debugging less transparent. For an example of persistent route oscillations consider the configuration in Figure 1. This figure is taken from [4]. In the figure there are three route reflectors RR1, RR2 and RR3 in the same AS where C1 is a client of RR1, C2 is a client of RR2 and C3 is a client of RR3. In the left side of the figure are shown the BGP sessions. Internally, each route reflector and its client establish a BGP session, and each route reflector has a BGP session with every other route reflector. In addition, each client router has an E-BGP session with some router R in another AS. In the right side of the figure we show the hardware links within the AS and their IGP weights. It is straightforward to check for example that RR1 will constantly switch between choosing the route to R through its client C1 and through the client C2 depending on which route RR2 announces to RR1. The other route reflectors exhibit analogous behavior. Admittedly, the configuration in Figure 1 is not likely in a May 2002 Eliminating IBGP Oscillations [Page 3] reasonable network. The point though is that such behavior is possible with route reflectors and in fact it can be shown that determining if there are routes that each router can choose so that they do not constantly oscillate is "difficult" (i.e., NP-complete) [5]. 1 ******************** *********************** * * * * * * * * [RR1]****[RR2]*****[RR3] * [RR1] [RR2] [RR3] * * * * * * * * * * * * * * *1 * *1 * * * * * *5 * *5 * *5 * * * * * * * * * [C1] [C2] [C3] **[C1] [C2] [C3] * * * * * * ********[R]********* BGP sessions Physical links Figure 1. 3 An Extension to BGP Normally, BGP route selection proceeds step by step until just one route remains. That sole remaining route is that router's best route and it may or may not be announced to its BGP neighbors depending on the particular relationship between them. For example, if there is only one route with the highest LOCAL_PREF value, then that route is chosen. Otherwise, the procedure continues with additional steps such as choosing those routes with the shortest AS_PATH length and so on. If the best route is determined before the BGP route selection process reaches the tie-breaking stages involving E-BGP routes versus I-BGP routes, then the proposed extension to BGP does nothing different from normal BGP. However if there remain multiple potential routes at this stage of the route selection process (that is, at the end of the stage in the selection procedure involved with comparing MED values), then the extension says to advertise all these surviving routes to I-BGP neighbors according to the usual rules for route reflectors and client routers and then to proceed with the tie-breaking selection process to determine that router's best route as with normal BGP. May 2002 Eliminating IBGP Oscillations [Page 4] Despite the fact that this procedure would result in a router announcing routes in addition to its best route, it has been proven that even in the presence of a single layer of route reflectors, the process will result in routers choosing routes that form a stable routing [5]. Also, it is proven that this process will not result in any routing loops or routing oscillations of any kind [5]. This shows that the routing decisions due to this extension are deterministic in the sense that routers will always return to the same routes after failures and re-starts. As an example, consider again the configuration shown in Figure 1. The modified BGP process will have router RR1 always announcing the route through C1 even though it will choose the route through C2 as its best route. Similarly, RR2 will choose the route through C3 and RR3 will choose the route through C1. An obvious drawback to this modification is the fact that additional routes must be announced and more importantly stored by the routers. Given that a key problem is that router memories are being stressed by the number of routes they must keep, requiring routers to store even more routes is clearly problematic. 4. Intellectual Property Statement Lucent has filed a patent on aspects of the material discussed in this draft. 5. Acknowledgments We would like to thank Igor Faynberg for valuable assistance in preparing this draft. 6. References [1] Halabi, B., and McPherson, D., "Internet Routing Architectures", Cisco Press, 2000. [2] Bates, T., Chandra, R., Chen, E., "BGP Route Reflection - An Alternative to Full Mesh IBGP", RFC 2796, April 2000. [3] McPherson, D., Gill, V., Walton, D., and Retana, A., "BGP Persistent Route Oscillation Condition", Work in Progress (draft-ietf-ide-route-oscillation-01.txt), February 2002. [4] Griffin, T., and Wilfong, G., "Constraints for Correct Configuration of IBGP", SIGCOMM 2002. May 2002 Eliminating IBGP Oscillations [Page 5] [5] Basu, A., Ong, L., Shepherd, B., Rasala, A., and Wilfong, G., "Route Oscillations in I-BGP with Route Reflection", SIGCOMM 2002. Author's Addresses Gordon Wilfong Lucent Technologies Room 2C-454 600 Mountain Avenue Murray Hill, NJ 07974 USA E-mail: gtw@lucent.com Anindya Basu Lucent Technologies Room 2C-417 600 Mountain Avenue Murray Hill, NJ 07974 USA E-mail: anindyabasu@lucent.com F. Bruce Shepherd Lucent Technologies Room 2C-381 600 Mountain Avenue Murray Hill, NJ 07974 USA E-mail: bshep@lucent.com Dr. C.-H. L. Ong Oxford University Computing Laboratory Wolfson Building Parks Road Oxford OX1 3QD UK E-mail: Luke.Ong@comlab.ox.ac.uk April Rasala Laboratory for Computer Science, MIT 200 Technology Square Cambridge, MA 02139 USA E-mail: arasala@theory.lcs.mit.edu Full Copyright Statement Copyright (C) The Internet Society (1998). All Rights Reserved. This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the May 2002 Eliminating IBGP Oscillations [Page 6] copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English. The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns. This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. May 2002