Network working group X. Xu Internet Draft E. Guo Category: Informational Huawei Technologies Expires: January 2011 July 2, 2010 Avoiding Transient Loops when using BGP Best External draft-xu-idr-best-external-loop-avoidance-00 Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on Januray 2, 2011. Copyright Notice Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Xu, et al. Expires January 2, 2011 [Page 1] Internet-Draft Loops Avoidance for BGP Best External July 2010 Abstract This document proposes a mechanism for avoiding transient forwarding loop when using BGP best external approach [BEST-EXTERNAL] to achieve BGP fast convergence. Conventions used in this document The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [RFC2119]. Table of Contents 1. Problem Statement............................................3 2. Transient Loop Avoidance Mechanism...........................5 3. Security Considerations......................................5 4. IANA Considerations..........................................6 5. Acknowledgements.............................................6 6. References...................................................6 6.1. Normative References....................................6 6.2. Informative References..................................6 Authors' Addresses..............................................6 Xu, et al. Expires January 2, 2011 [Page 2] Internet-Draft Loops Avoidance for BGP Best External July 2010 1. Problem Statement According to the base BGP specifications, a BGP speaker can not advertise any route other than the best route for a given destination which largely hampers the fast convergence of BGP route. To improve the BGP convergence performance, the best external route approach [BEST-EXTERNAL] is proposed, which allows a BGP speaker to advertise the second-best route for a given destination which can be used by its BGP peers as a standby route immediately after the best route fails. However, due to the time routers take for propagating the topology change around the network together with the difference in time taken by routers to update their FIBs for the affected destinations, transient forwarding loops will usually occur during this transition process. AS#2 AS#3 | | | | +------+-----------+------+ | +--+--+ +--+--+ | | | R1 +-----+ R2 | | | +-\---+ +--/--+ | | \ / | | \ / | | *------* | | | R3 | | | +------+ | | AS#1 | +-------------------------+ Figure 1: Best External Use Case In the above scenario shown in Figure 1, BGP best external approach is enabled on the BGP speakers of AS#1 (i.e., R1, R2 and R3), assuming R1 and R2 each learns an eBGP route to the same destination P from AS#2 and AS#3 respectively, and the route learnt from AS#2 is selected as the best route by R1, R2 and R3. According to the best external approach, R2 will advertise its best external route to P (i.e., the route learnt from AS#3) to its iBGP peers, R1 and R3. When R1's connection to AS#2 is broken, a transient forwarding loop will be formed till the best route to P for R2 has converged to an alternate path (i.e., the route learnt from AS#3). That is: R3 still forward packets destined for P towards R1 which, in turn, will forward them to R2 due to its fast local repair process. Upon receiving those packets, R2 will still forward them back to R1 since the route learnt from its iBGP peer R1 (i.e., the route learnt from Xu, et al. Expires January 2, 2011 [Page 3] Internet-Draft Loops Avoidance for BGP Best External July 2010 AS#2) is still deemed as the best route to P... Thus, a transient forwarding loop is formed between R1 and R2 till the network is converged. Somebody may argue that in the above scenario, by using IGP within AS#1 for fast flooding of AS#2's ASBR next-hop down event, together with the Prefix Independent Convergence (PIC) core technology, the BGP convergence can be completed in a very short period. Therefore, there is absolutely no need to avoid such transient forwarding loop any more. However, besides the hardware requirement for PIC capability, such network design also requires that the BGP next-hop of the eBGP-learnt routes should not be changed when being advertised to iBGP peers. Unfortunately, there does exist some scenarios where the BGP next-hop of the eBGP-learnt routes MUST be rewritten by a BGP speaker as itself when advertising these routes to its iBGP peers. BGP/MPLS VPN [RFC4364] and Softwire Mesh [RFC5565] are two common examples. +-------------------------+ | VPN Site #1 | | | | | | +--+--+ +--+--+ | | | CE1 | | CE2 | | | +--+--+ +--+--+ | +------+-----------+------+ | | +------+-----------+------+ | +--+--+ +--+--+ | | | PE1 +-----+ PE2 | | | +--\--+ +--/--+ | | \ / | | \ / | | *------* | | | PE3 | | | +---+--+ | | Provider | AS | +------------+------------+ | +------------+------------+ | +--+--+ | | | CE1 | | | +-----+ | | VPN Site #2 | +-------------------------+ Figure 2: BGP/MPLS VPN Multi-homing Use Case Xu, et al. Expires January 2, 2011 [Page 4] Internet-Draft Loops Avoidance for BGP Best External July 2010 Taken the BGP/MPLS VPN Customer Edge (CE) multi-homing as an example (See Figure 2), assuming PE1 and PE2 each learns a route to the same destination P from CE1 and CE2 respectively, and the route learnt from CE1 is selected as the best route by PE1, PE2 and PE3. Once the link between PE1 and CE1 fails, due to fast local repair process, PE1 will immediately reroute the packets destined for P to PE2 which, in turn, forwards the packets back to PE1 since the routes on PE2 have not been converged. Thus the transient loop is formed. 2. Transient Loop Avoidance Mechanism The transient loop scenario demonstrated in Figure 1 can be avoided as long as the best external route is processed according to the approaches defined in Carrying Label Information in BGP-4 [RFC3107]. For example, R2 will assign a unique MPLS label for its best external route (i.e., the route to P learnt from AS#3) and then install it into its MPLS forwarding table. When advertising the best external route to its iBGP peers, R2 will attach the corresponding MPLS label in the BGP update. Upon receiving the best external route, R1 and R3 could either process it as normal BGP behavior or install it into their own forwarding table as a backup route to destination P. In case that the best route becomes unavailable (e.g., R1's connection to AS#2 is lost), R1 will immediately reroute the packets destined for P towards R2 with the corresponding MPLS label attached, according to the standby route advertised by R2. R2 in turn will forward the received packets to AS#3 according to the corresponding MPLS forwarding entry (i.e., its best external route to P), rather than the route entry in its routing table (i.e., its best route to P). Of course, if R1 and R2 are not directly connected, an IP or MPLS tunnel from R1 to R2 is required since the above MPLS label has only local meaning on R2. Similarly, for the transient loop scenario described in Figure 2, the PE2 should assign a unique label for the standby route (i.e., best external route) to P which is different from the label (either per-VPN label or per-route label) for the best route to P, and then install this standby route into its MPLS forwarding table. Upon receiving packets with inner MPLS label (i.e., VPN label) matching the label of the standby route to P, PE2 will forward the packets directly to CE2, other than to PE1, according to the matching MPLS forwarding table entry. Thus the transient loop will not occur any more in this case. 3. Security Considerations No additional security risk is introduced by using this loop prevention mechanism. Xu, et al. Expires January 2, 2011 [Page 5] Internet-Draft Loops Avoidance for BGP Best External July 2010 4. IANA Considerations No requirement for IANA. 5. Acknowledgements Thanks to Robert Raszuk, Stewart Bryant for their valuable comments on this loop prevention idea. 6. References 6.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC3107] Rekhter, Y. and E. Rosen, "Carrying Label Information in BGP-4", RFC 3107, May 2001. [RFC3736] Marques,P., Fernando, R., Chen, E, Mohapatra, P., " Advertisement of the best external route in BGP", draft- ietf-idr-best-external-01.txt, April 2004. [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006. [RFC5565] Wu, J., Cui, Y., Metz, C., and E. Rosen, "Softwire Mesh Framework", RFC 5565, June 2009. 6.2. Informative References [RFC5717] Shand, M., Bryant, S, "A Framework for Loop-Free Convergence", RFC 5717, January 2010. Authors' Addresses Xiaohu Xu Huawei Technologies, No.3 Xinxi Rd., Shang-Di Information Industry Base, Hai-Dian District, Beijing 100085, P.R. China Phone: +86 10 82836073 Email: xuxh@huawei.com Erwei Guo Huawei Technologies, No.156 Beiqing Rd., Z-park, Shi-Chuang-Ke-Ji-Shi-Fan-Yuan, Hai-Dian District, Beijing 100095, P.R. China Xu, et al. Expires January 2, 2011 [Page 6] Internet-Draft Loops Avoidance for BGP Best External July 2010 Phone: +86 10 59722588 Email: guoerwei@huawei.com