Internet Engineering Task Force (IETF)                          E. Chen
Internet Draft                                             P. Mohapatra
Intended Status: Informational                            Cisco Systems
Expiration Date: March 18, 2013                      September 17, 2012



      Reduction of BGP Alternate Paths from Inter-Exchange Points
                  draft-chen-bgp-path-reduction-00.txt


Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on March 18, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.



draft-chen-bgp-path-reduction-00.txt                            [Page 1]





Internet Draft     draft-chen-bgp-path-reduction-00.txt    Sept 17, 2012


Abstract

   In this document we present a mechanism that enhances the "IGP-metric
   based MED" approach so that load balancing is maintained while
   limiting the number of BGP alternate paths carried in a network.  The
   mechanism involves the use of a "scale factor" to scale down the IGP
   metrics for the purpose of setting MEDs, and is thus termed "scaled
   IGP-metric based MED".


1. Introduction

   The BGP sessions [RFC4271] between service providers are typically
   established and maintained at multiple inter-exchange points for the
   purpose of routing redundancy, load balancing, and traffic
   localization.  For a particular prefix (i.e., destination) there
   would exist multiple routes with different nexthops (corresponding to
   different peering points) in a network.  Given the large number of
   routes, and the number of inter-exchange points, the challenge is to
   utilize these peering connections efficiently while maintaining
   operational simplicity.

   One common practice is to direct packets received on an ingress
   router to the "closest" (in terms of the IGP metric to the nexthop)
   inter-exchange point.  Thus, from the network point of view, traffic
   destined to a particular prefix is distributed among the inter-
   exchange points from which the routes for the prefix are received.
   This scheme is commonly known as "shortest-exit routing" or "hot-
   potato routing".

   As described in [RFC4271], the first few steps in BGP route selection
   involve comparisons of the LOCAL_PREF value, the AS-PATH length, the
   MED value, and the IGP metric to the nexthop.  Given that the AS-PATH
   length is typically network-topology dependent and agnostic to the
   peering locations, a common implementation of "shortest-exit routing"
   is to set the LOCAL_PREF value and the MED value each to a constant
   for the routes received from all these peering points, thus using
   the IGP metric as the tie-breaker in BGP route selection.  This
   scheme offers fast routing convergence, consumes minimal network
   bandwidth for a particular network, and requires little coordination
   and cooperation between providers.

   However, the number of alternate paths carried in a network, in
   particular on the route reflectors [RFC4456], grows linearly with the
   number of peering locations.  A large number of alternate paths from
   the peering locations could become a scaling issue as described in
   [NANOG46].

   Clearly one alternative is for the service providers to use the IGP
   metric as the MED in route advertisement, and accept the MEDs from
   each other.  This approach (hereafter referred to as "IGP-metric
   based MED") is straightforward both technically and operationally.
   The amount of coordination between the providers would also be
   minimal.  However, compared with "shortest-exit routing", the
   "IGP-metric based MED" approach has the drawback of slower routing
   re-convergence, as only the paths with the lowest MED are readily
   available in the network.  In addition, only one peering point may be
   used for traffic to a given destination, which may impact load
   balancing across the peering locations.

   In this document we present a mechanism that enhances the "IGP-metric
   based MED" approach so that load balancing is maintained while
   limiting the number of BGP alternate paths carried in a network.  The
   mechanism involves the use of a "scale factor" to scale down the IGP
   metrics for the purpose of setting MEDs, and is thus termed "scaled
   IGP-metric based MED".


2. Scaled IGP-Metric Based MED

   The "Scaled IGP-Metric based MED" approach consists of the following
   procedures:

     o Conceptually divide the network and the inter-exchange points
       with a particular provider into multiple topological regions.
       There should usually be more than one inter-exchange point in
       a region so that the traffic destined toward that region will
       be load balanced across the inter-exchange points within that
       region.

     o For a route sourced (either internally or received from EBGP
       peers) within a region, advertise it with an identical MED
       across all the BGP sessions with that provider in the region;
       and advertise it with less preferred MEDs across BGP sessions
       with that provider in other regions.

   Note that only the paths with more preferred MEDs are carried in the
   network, and the external paths with less preferred MEDs would not be
   further advertised by the peering routers to the internal peers.

   The number of alternate paths (for a prefix) carried by the other
   provider that accepts such MEDs will be controlled, roughly, by the
   number of BGP sessions inside the region that sources the prefix.
   Because more than one path is present in the network, fast routing
   re-convergence is maintained.  In addition, multiple peering
   locations will be used for traffic destined to the prefix.

   Operationally this approach can be easily implemented by setting the
   MED based on a scaled IGP metric (i.e., divide the IGP metric by a
   scale factor).  The scale factor can be set as one plus the maximum
   IGP metric (or diameter) between the peering routers and other
   routers within the region.  Clearly the "scale factor" implementation
   would work better in a network with differentiated IGP metric values
   for the "inter-regional" links vs "intra-regional" links.
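
   As a minimal sketch of the computation described above, the
   following Python fragment derives a scale factor from the largest
   intra-regional IGP metric and scales an IGP metric down to a MED.
   The function names are hypothetical, and the use of integer (floor)
   division is an assumption of this sketch; this document does not
   specify a rounding rule.

```python
def scale_factor(max_intra_region_metric: int) -> int:
    """One plus the maximum IGP metric (diameter) within the region."""
    if max_intra_region_metric < 0:
        raise ValueError("IGP metric must be non-negative")
    return max_intra_region_metric + 1


def scaled_med(igp_metric: int, factor: int) -> int:
    """Scale an IGP metric down to a MED (floor division is assumed)."""
    if factor < 1:
        raise ValueError("scale factor must be at least 1")
    return igp_metric // factor


factor = scale_factor(10)  # a region whose diameter is 10
intra = [scaled_med(m, factor) for m in (0, 5, 10)]   # within the region
inter = [scaled_med(m, factor) for m in (40, 60)]     # beyond the region
print(intra, inter)
```

   Under these assumptions, every metric that stays within the region's
   diameter maps to the same MED, while larger (inter-regional) metrics
   map to larger, less preferred MEDs.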

   This approach can also be implemented by attaching "location
   communities" for routes sourced from different locations, and then
   setting the MED based on the communities.
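
   The community-based alternative might be sketched as follows.  The
   community values (65000:1, 65000:2), the region names, the two MED
   constants, and the function name are all illustrative assumptions;
   none of them is defined by this document.

```python
# Hypothetical location communities, one per region where a route is sourced.
COMMUNITY_TO_REGION = {
    "65000:1": "west",
    "65000:2": "east",
}

SAME_REGION_MED = 0     # assumed value for the more preferred MED
OTHER_REGION_MED = 100  # assumed value for the less preferred MED


def med_from_communities(route_communities, advertising_region):
    """Choose the MED for a route advertised from a given region."""
    for community in route_communities:
        if COMMUNITY_TO_REGION.get(community) == advertising_region:
            return SAME_REGION_MED
    return OTHER_REGION_MED


# A route tagged as sourced in "west", advertised from each region.
print(med_from_communities(["65000:1"], "west"))
print(med_from_communities(["65000:1"], "east"))
```

   In this sketch a route without a recognized location community
   receives the less preferred MED; that default is a design choice of
   the example, not a requirement.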

   It is also noted that both the "shortest-exit routing" and the
   "IGP-metric based MED" schemes can be considered special cases of the
   "scaled IGP-metric based MED" scheme, with the scale factor being the
   largest 32-bit unsigned integer, and 1, respectively.
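
   The two special cases can be illustrated with a short Python sketch.
   It assumes floor division when scaling, and the metric values are
   made up for illustration.

```python
MAX_UINT32 = 2**32 - 1


def scaled_med(igp_metric: int, factor: int) -> int:
    """Scaled MED, assuming floor division (an assumption of this sketch)."""
    return igp_metric // factor


metrics = [0, 10, 40, 60]  # illustrative IGP metrics toward one prefix

# Scale factor 1: the MED equals the IGP metric ("IGP-metric based MED").
as_igp_med = [scaled_med(m, 1) for m in metrics]

# Largest 32-bit scale factor: every smaller metric collapses to the
# same MED, so the receiving provider falls back to its own IGP metric
# as the tie-breaker ("shortest-exit routing").
as_shortest_exit = [scaled_med(m, MAX_UINT32) for m in metrics]

print(as_igp_med, as_shortest_exit)
```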


3. Example

   In the following figure, A1, A2, A3, and A4 are the peering routers
   in SP1; P1, P2, P3, P4 are prefixes/routes sourced at A1, A2, A3, A4
   respectively.  They are advertised at all the peering points.

   B1, B2, B3, and B4 are the peering routers in SP2; RR1, RR2, RR3, and
   RR4 are route reflectors in different clusters in SP2.

   The number above each horizontal line is the IGP metric assigned to
   that link.



        (P1)              (P2)                 (P3)            (P4)
         |                 |                    |                |
         |       10        |         30         |       20       |
         A1 -------------- A2 ----------------- A3 ------------- A4
         |                 |                    |                |
         |                 |                    |                |
         |                 |                    |                |
         B1 -------------- B2 ----------------- B3 ------------- B4
         |                 |                    |                |
         |                 |                    |                |
         RR1               RR2                  RR3              RR4
         |                 |                    |                |


   To use the "scaled IGP-metric based MED" scheme, SP1 can conceptually
   organize its network into two regions, one with the inter-exchange
   peerings of A1/B1 and A2/B2, and another with A3/B3 and A4/B4.  The
   scale factor for the first region would be 11 (one plus the A1-A2
   metric of 10), and the scale factor for the second region would be
   21 (one plus the A3-A4 metric of 20).

   As a result, the number of alternate paths would be 2 on the route
   reflectors in SP2 for each of the prefixes/routes P1 - P4.
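
   The result above can be checked with a small Python sketch of the
   example topology.  It rests on several assumptions that this
   document does not state explicitly: the MED is the IGP metric from
   the advertising router to the router sourcing the prefix, each
   advertising router applies the scale factor of its own region, the
   division is integer (floor) division, and SP2 keeps only the paths
   with the lowest MED.  All names in the code are hypothetical.

```python
# IGP metrics between adjacent SP1 peering routers, taken from the figure.
LINKS = [("A1", "A2", 10), ("A2", "A3", 30), ("A3", "A4", 20)]
ROUTERS = ["A1", "A2", "A3", "A4"]
SOURCE = {"P1": "A1", "P2": "A2", "P3": "A3", "P4": "A4"}
# Scale factor of the region each advertising router belongs to.
SCALE = {"A1": 11, "A2": 11, "A3": 21, "A4": 21}


def igp_metric(a: str, b: str) -> int:
    """Sum the link metrics along the chain between two routers."""
    i, j = sorted((ROUTERS.index(a), ROUTERS.index(b)))
    return sum(metric for _, _, metric in LINKS[i:j])


def best_med_exits(prefix: str) -> list:
    """Peering routers that advertise the prefix with the lowest MED."""
    meds = {r: igp_metric(r, SOURCE[prefix]) // SCALE[r] for r in ROUTERS}
    lowest = min(meds.values())
    return [r for r in ROUTERS if meds[r] == lowest]


paths = {p: best_med_exits(p) for p in SOURCE}
for prefix, exits in paths.items():
    print(prefix, exits)
```

   Under these assumptions, P1 and P2 are each kept with the two paths
   through A1 and A2, and P3 and P4 with the two paths through A3 and
   A4, which matches the count of 2 alternate paths per prefix.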


4. IANA Considerations

   This document requires no action from the IANA.


5. Security Considerations

   This document does not introduce any new security issues.


6. Acknowledgments

   We would like to thank Eric Rosen and Saikat Ray for their review and
   suggestions.


7. References


7.1. Normative References

   [RFC4271]   Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
               Border Gateway Protocol 4 (BGP-4)", RFC 4271, January
               2006.

   [RFC4456]   Bates, T., Chen, E., and R. Chandra, "BGP Route
               Reflection: An Alternative to Full Mesh Internal BGP
               (IBGP)", RFC 4456, April 2006.


7.2. Informative References

   [NANOG46]   McPherson, D., Amante, S., and L. Zhang, "BGP Scalability
               Considerations - The Intra-domain BGP Scaling Problem",
               NANOG-46, June 2009.


8. Authors' Addresses

   Enke Chen
   Cisco Systems, Inc.

   Email: enkechen@cisco.com


   Pradosh Mohapatra
   Cisco Systems, Inc.

   Email: pmohapat@cisco.com






































