Network Working Group                                           R. White
Internet-Draft                                             Cisco Systems
Intended status: Experimental                                   S. Hares
Expires: February 1, 2009                           NextHop Technologies
                                                               T. Hardie
                                                           July 31, 2008


                  Bounding Longer Routes to Remove TE
                  draft-white-bounded-longest-match-02

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on February 1, 2009.

Abstract

   Some ASes currently use length-based filters to manage the size of
   the routing table they use and propagate.  This draft explores an
   alternative to length-based filters which allows for more automatic
   configuration and which provides for better redundancy.

   Rather than use a filter, this draft proposes a method of modifying
   the BGP [RFC1771] longest match algorithm by setting a bound on the
   prefix lengths eligible for preference.  A bound would operate on
   long prefixes when covering route announcements are available; in
   certain circumstances it would cause a router to prefer an aggregate
   over a more specific route announcement.


1.  Introduction

   Many routes injected into the global default free zone of the
   Internet today are injected to steer traffic (or provide traffic
   engineering), rather than to provide reachability information
   directly.  In several recent discussions, it has been asserted that
   this table growth due to routes injected to provide traffic
   engineering is causing many problems within the default free zone,
   including more table instability, as these routes appear to change
   state more often than shorter prefix aggregate routes.

   While filtering all routes at some predetermined length is an
   attractive option, it can be difficult to maintain and manage large
   filter sets built around a constantly changing database.  It appears
   a more fruitful approach would be to detect routes injected for
   traffic engineering purposes, and remove them from the routing system
   automatically once they are beyond the point in the network where
   they are useful.  This draft proposes a mechanism to perform just
   this task.  When two routes with overlapping prefixes are detected,
   the longer prefix is marked, and is removed from the routing system
   at a point where it is no longer needed.  This mechanism does not
   suffer from problems caused by route withdrawals or failures, since
   routing will naturally take care of any connectivity changes.
   Various estimates have stated that removing the longer prefix routes
   within the routing table could reduce the table size by 25%.

   No actual changes to the operation of the BGP protocol at the packet
   or peering levels are required to implement this draft.  Three new
   communities are proposed: BOUNDED, NO_INSTALL, and NO_BOUNDING.


2.  Proposed Enhancements

   Two enhancements are proposed by this draft: three new communities,
   and a new way of handling overlapping prefixes received from an
   external peer.

   As each prefix is received by a BGP speaker from an external peer, it
   would be evaluated in the light of other prefixes already received.
   If two prefixes overlap in space (such as 192.168.0.0/16 and
   192.168.1.0/24), the longer prefix would be marked with a new BOUNDED
   community, and the local preference set to a very high number so that
   it would always win in any best path computations within the
   autonomous system.  The longer prefix may also be marked with a new
   community, NO_INSTALL.

   Routes marked with the new BOUNDED community MAY be filtered at the
   autonomous system edge to reduce the number of routes advertised by
   an AS.
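
   The overlap check described in this section can be sketched as
   follows.  This is a minimal illustration in Python, not part of the
   proposal: the Route structure, the mark_bounded function, and the
   string tag "BOUNDED" are hypothetical names, and the actual community
   value remains to be assigned.  Setting the local preference is left
   out of this sketch.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Route:
    """A route learned from an external peer (illustrative structure only)."""
    prefix: ipaddress.IPv4Network
    communities: set = field(default_factory=set)


def mark_bounded(routes):
    """Tag every route whose prefix is covered by a shorter prefix in `routes`."""
    for route in routes:
        for other in routes:
            # subnet_of() is also true for equal prefixes, so require that
            # the covering prefix be strictly shorter.
            if (other.prefix.prefixlen < route.prefix.prefixlen
                    and route.prefix.subnet_of(other.prefix)):
                route.communities.add("BOUNDED")  # hypothetical tag name
                break
    return routes


received = [
    Route(ipaddress.ip_network("192.168.0.0/16")),
    Route(ipaddress.ip_network("192.168.1.0/24")),
    Route(ipaddress.ip_network("10.1.0.0/16")),
]
for r in mark_bounded(received):
    print(r.prefix, sorted(r.communities))
```

   Only 192.168.1.0/24 is tagged in this run, since it is the only
   prefix covered by a shorter one.  The pairwise scan is quadratic and
   is used here only for clarity.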

2.1.  Example of Bounding the Longer Prefix

   Assume the following configuration of autonomous systems:
                    (   )
           /-------( AS2 )--------\
    (   ) /         (   )          \ (   )       (   )
   ( AS1 )                          ( AS4 )-----( AS5 )
    (   ) \         (   )          / (   )       (   )
           \-------( AS3 )--------/
                    (   )

   o  AS1 is advertising 192.168.1.0/24 to both AS2 and AS3.
   o  AS2 is advertising both 192.168.1.0/24 and 192.168.0.0/16 into
      AS4.
   o  AS3 is advertising 192.168.1.0/24 into AS4.
   o  Each connection (session) is handled by a separate router within
      each AS (for instance, AS4 peers with AS2 and AS3 on separate
      routers).

   When the peering router in AS4 between AS4 and AS2 receives both the
   192.168.1.0/24 and the 192.168.0.0/16 prefixes, it will mark the
   192.168.1.0/24 as BOUNDED, and set the local preference high, based
   on its router ID, as described in Section 2.2, and will then
   propagate this through AS4.

   The border router between AS4 and AS3 will receive the longer prefix
   from AS3, and the iBGP prefix with the high local preference with
   BOUNDED set.  Given it does not see the overlapping prefix, it will
   compare the default (lower) local preference of the externally
   learned route with the higher local preference set by the AS2/AS4
   border router, and will not advertise the 192.168.1.0/24 prefix into
   AS4 at all.

   The AS2/AS4 border router may also, on detecting the overlap, mark
   the longer prefix with a new community, NO_INSTALL, which is
   non-transitive and optional.  Routers which understand this
   community may choose not to install this prefix into the local RIB,
   in order to reduce memory consumption.

   If the link between AS1 and AS2 fails, the longer length prefix will
   be withdrawn from AS2, and thus the peering point between AS2 and AS4
   will no longer have an overlapping set of prefixes.  Within AS4, the
   border router which peers with AS2 will cease advertising the
   192.168.1.0/24 prefix, which allows the AS3/AS4 border router to
   begin advertising it into AS4, and through AS4 into AS5, restoring
   connectivity to AS1.
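
   The behavior of the AS3/AS4 border router in this example can be
   sketched as follows.  This is an illustration under stated
   assumptions rather than a definitive implementation: the Path
   structure and function names are hypothetical, a default local
   preference of 100 is assumed (the text above says only that the
   default is lower), the high value 0x8A000001 is an arbitrary
   stand-in, and only the local preference comparison of the decision
   process is modeled.

```python
from dataclasses import dataclass

DEFAULT_LOCAL_PREF = 100  # assumed default; the draft only says "default (lower)"


@dataclass(frozen=True)
class Path:
    prefix: str
    learned_from: str      # "eBGP" or "iBGP"
    local_pref: int
    bounded: bool = False


def best_path(paths):
    """Pick the path with the highest Local Preference (later tie-breakers omitted)."""
    return max(paths, key=lambda p: p.local_pref) if paths else None


def advertises_into_as4(paths):
    """The border router offers its eBGP-learned path to iBGP peers only if it is best."""
    best = best_path(paths)
    return best is not None and best.learned_from == "eBGP"


from_as3 = Path("192.168.1.0/24", "eBGP", DEFAULT_LOCAL_PREF)
from_as2_border = Path("192.168.1.0/24", "iBGP", 0x8A000001, bounded=True)

# Normal operation: the BOUNDED iBGP path wins, so the AS3-learned path stays local.
print(advertises_into_as4([from_as3, from_as2_border]))

# AS1-AS2 link fails: the AS2/AS4 border router withdraws its path.
print(advertises_into_as4([from_as3]))
```

   The first call models normal operation and the second models the
   state after the AS1/AS2 link failure, when the AS3-learned path
   becomes the only candidate.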

2.2.  Setting the Local Preference

   Since there could be multiple points at which an autonomous system
   may receive the same pair of overlapping prefixes, there must be some
   way to ensure that one of the longer prefixes wins in the BGP
   [RFC1771] decision algorithm consistently.  In practice, this means
   that each BGP speaker which receives an overlapping set of routes
   should set the local preference on the set of longer prefixes so
   there won't be two longer prefixes with matching local preferences.

   The easiest way to ensure this within an autonomous system is to set
   the local preference for longer prefixes based on some unique number
   assigned to each BGP speaker.  Given the router ID and the local
   preference are both 32 bit numbers, an ideal solution appears to be
   to simply set the local preference to the router ID of the BGP
   speaker.  The primary problem with this is that in some cases, the
   router ID of the device may be lower than some standard Local
   Preference, perhaps even lower than a standard Local Preference used
   by default throughout a network.

   To alleviate this problem, the local preference of longer prefixes
   which overlap with shorter prefixes should be set to the router ID of
   the BGP speaker, and then the high order bit of the Local Preference
   should be set, so the setting will be guaranteed to be at least 2^31
   (2,147,483,648), well above 64,000.
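
   The computation described in this section amounts to a single
   bitwise OR.  The following Python sketch illustrates it; the function
   name bounded_local_pref and the example router IDs are assumptions
   made for illustration only.

```python
import ipaddress

HIGH_ORDER_BIT = 0x80000000  # top bit of a 32-bit value


def bounded_local_pref(router_id: str) -> int:
    """Derive a Local Preference from a dotted-quad router ID, high bit set."""
    rid = int(ipaddress.IPv4Address(router_id))
    return rid | HIGH_ORDER_BIT


print(hex(bounded_local_pref("10.0.0.1")))
print(hex(bounded_local_pref("192.0.2.1")))
```

   A router ID that already has its high order bit set is left
   unchanged.  One consequence worth noting is that two router IDs which
   differ only in the high order bit would map to the same value; this
   sketch does not attempt to resolve that case.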

2.3.  The NO_INSTALL Community

   An optional optimization to bounding longer prefixes by marking them
   with a high Local Preference and the BOUNDED community is to also
   mark them with a new, non-transitive, optional community, NO_INSTALL.
   The effect of this community would be for any BGP speaker receiving a
   prefix with this community set to treat the prefix normally in the
   BGP bestpath computation, and to forward bestpaths marked as
   NO_INSTALL to iBGP peers, but to simply fail to install such prefixes
   in the local routing table.

   This would result in a smaller amount of information being stored
   and maintained in the local routing table, and in the local
   forwarding tables built from the local routing table.  If there are
   enough prefixes thus marked, the memory and computation savings could
   be significant.  BGP speakers which receive a prefix marked with
   NO_INSTALL, and which do not understand this community, simply ignore
   the community.
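
   The handling of NO_INSTALL by a speaker which understands it might
   look like the following Python sketch.  The BestPath structure, the
   process_best_path function, and the use of a dictionary as the local
   routing table are all hypothetical and chosen only to illustrate the
   behavior described in this section.

```python
from dataclasses import dataclass, field


@dataclass
class BestPath:
    prefix: str
    next_hop: str
    communities: set = field(default_factory=set)


def process_best_path(best, rib, ibgp_updates):
    """Forward the best path to iBGP peers; install it unless tagged NO_INSTALL."""
    ibgp_updates.append(best)             # still propagated inside the AS
    if "NO_INSTALL" not in best.communities:
        rib[best.prefix] = best.next_hop  # only untagged prefixes reach the RIB


rib, ibgp_updates = {}, []
process_best_path(BestPath("192.168.0.0/16", "192.0.2.1"), rib, ibgp_updates)
process_best_path(
    BestPath("192.168.1.0/24", "192.0.2.1", {"BOUNDED", "NO_INSTALL"}),
    rib, ibgp_updates)
print(sorted(rib))
print(len(ibgp_updates))
```

   Both best paths are queued for iBGP peers, but only the aggregate is
   installed in the local table.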




3.  The NO_BOUNDING Community

   In some situations, the originator of a longer length prefix might
   determine their routing will not work properly if their prefix is
   bounded at a point where it overlaps with a shorter prefix aggregate.
   To resolve this case, we propose a new transitive optional extended
   community, NO_BOUNDING.

   The NO_BOUNDING extended community consists of a type, to be
   determined through the IANA process, and a value containing the
   minimum AS Path length before which the route should not be bounded.
   If a BGP speaker determines a route could be bounded, but the route
   is marked with NO_BOUNDING, and the AS Path length is shorter than
   the minimum AS Path length noted in the NO_BOUNDING extended
   community, the speaker SHOULD NOT mark the route for bounding.

   This allows the originator of a prefix to control the bounding
   properties of the prefix.
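
   The NO_BOUNDING test can be expressed as a simple comparison.  In the
   Python sketch below, the function name may_bound is hypothetical, the
   AS Path is modeled as a plain list of AS numbers whose length is its
   element count, and the absence of the community is represented by
   None; none of these choices is specified by this draft.

```python
from typing import Optional


def may_bound(as_path: list, no_bounding_min_len: Optional[int]) -> bool:
    """Return True if a route that overlaps an aggregate may be marked BOUNDED.

    `no_bounding_min_len` is the value carried in NO_BOUNDING, or None if the
    route does not carry the community.
    """
    if no_bounding_min_len is None:
        return True
    # Bounding is suppressed while the AS Path is shorter than the minimum.
    return len(as_path) >= no_bounding_min_len


print(may_bound([64496], 3))
print(may_bound([64498, 64497, 64496], 3))
print(may_bound([64496], None))
```

   With a minimum of 3, the route is protected from bounding until its
   AS Path reaches three entries.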


4.  Benefits and Risks

   The benefits and risks associated with this proposal are discussed in
   the sections below.

4.1.  Advantages to the Service Provider

   AS4, in each of the situations, reduces the number of prefixes
   carried through the autonomous system by the number of longer
   prefixes that overlap with aggregates of those prefixes.  While one
   copy of the prefix continues to be carried through the autonomous
   system, this entry can be marked with the optional NO_INSTALL
   community, so it is not placed in the forwarding table, nor is it
   propagated outside the autonomous system.

   AS5 receives one prefix instead of two (or possibly more).

4.2.  Advantages to the Customer

   In this case, the customer is represented as AS1.  The customer will
   continue to receive some amount of traffic over both peering
   sessions, and dual homing through two Service Providers is still
   effective.  If the customer's primary link fails, the alternate link
   through AS3 will take over receiving all inbound traffic
   automatically.  With most other schemes presented to this point, the
   customer loses all benefit of dual-homing into the Internet, unless
   both connections are through one Service Provider.

4.3.  Advantages to the Internet

   Beyond the second AS hop, aggregation is preserved in all cases.
   While this would not reduce the backbone routing table by the
   dramatic amounts that other methods might, the advantages to the
   community are great, and at greatly reduced risk to customers.

4.4.  Implications for Router Processing

   This proposal clearly adds to the work which needs to be done during
   overall BGP [RFC1771] processing.  Because a check needs to be done
   for both covered and covering routes, some part of this work is
   required for routes of lengths on either side of the bound.  Should
   this become common, however, the rate of growth in the number of
   routes should be smaller, and a balance should be struck between the
   extra processing per route and the number of routes.
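
   To give a sense of the per-route work involved, the following Python
   sketch shows one possible way to perform the two checks.  It is an
   assumption-laden illustration, not a description of any router
   implementation: the table is modeled as a set of prefixes, and the
   function names find_covering and find_covered are hypothetical.

```python
import ipaddress


def find_covering(prefix, table):
    """Return the longest shorter prefix in `table` that covers `prefix`, if any.

    Walks up one bit at a time, so it costs at most one set lookup per
    prefix-length bit rather than a scan of the whole table.
    """
    net = ipaddress.ip_network(prefix)
    while net.prefixlen > 0:
        net = net.supernet()  # drop one bit of prefix length
        if net in table:
            return net
    return None


def find_covered(prefix, table):
    """Return the longer prefixes in `table` covered by `prefix` (linear scan)."""
    net = ipaddress.ip_network(prefix)
    return [n for n in table
            if n.prefixlen > net.prefixlen and n.subnet_of(net)]


table = {ipaddress.ip_network(p)
         for p in ("192.168.0.0/16", "192.168.1.0/24", "10.0.0.0/8")}
print(find_covering("192.168.1.0/24", table))
print(find_covered("192.168.0.0/16", table))
```

   The covering check is bounded by the address length, while the
   covered check in this sketch scans the table; a production
   implementation would more likely use a prefix tree for both.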

4.5.  Implications for Traffic Engineering

   The implementation of a bound risks magnifying or removing the effect
   of certain widely deployed traffic engineering methods.  If, for
   example, an AS chose to prepend its own AS number to an announcement
   in order to alter the preference for that route, a BGP neighbor
   using a bounded longest match might now see that route as eligible
   for discard in favor of an aggregate.  While it is fairly easy to
   code around that particular problem, to avoid this class of problems
   it might be preferable to allow this to apply to specific AS Sets as
   well as to all BGP neighbors.

4.6.  Implications for Convergence Time

   If the route to the AS providing the route to the aggregate should be
   lost, the more-specific must propagate into the ASes which had
   formerly heard only the aggregate.  This increases convergence time
   and may create situations in which reachability is temporarily
   compromised.  Unlike the filter case, however, normal BGP behavior
   should restore reachability without changes to the router
   configuration.  There is also a risk that during a pathological
   event the increased processing required by this change will degrade
   propagation times during those events.  This depends on both the
   speed of specific implementations and the character of the topology.


5.  Acknowledgements

   Cengiz Alaentinoglu, Alvaro Retana, Daniel Walton, David Ball, and
   Barry Greene gave valuable comments on this draft.  Jeff Hass
   suggested the NO_BOUNDING community, along with the AS Path length
   limit described in the NO_BOUNDING section.  A number of colleagues
   also gave the author valuable comments on the white board markings
   that gave rise to this paper; among them are Lane Patterson, Ian
   Cooper, Gerd Besch, Bill Norton, Diarmuid Flynn, and Sean Donelan.


6.  Security Considerations

   This document presumes that the implementation of bounded longest
   match is a knob inside a router config.  Since the use of the knob
   affects route announcements not originating within the router's AS or
   its direct neighbors, the new behavior may result in surprises to the
   announcing AS.  It is possible that this behavior might be considered
   a denial of service or mistaken for a denial of service by systems
   designed to detect black-holing on behalf of the origin AS.


7.  IANA Considerations

   This draft proposes three new communities, BOUNDED, NO_BOUNDING, and
   NO_INSTALL, for which new community values would need to be assigned.
   These should be assigned as described in [EXT-COMM].


8.  Informative References

   [BGP-TABLE]
              Bush, R., "Plenary, IETF 51",
              http://www.ietf.org/proceedings/01aug/.

   [EXT-COMM]
              Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute",
              draft-ietf-idr-bgp-ext-communities-09 (work in progress),
              January 2006.

   [RFC1771]  Rekhter, Y. and T. Li, "A Border Gateway Protocol 4
              (BGP-4)", RFC 1771, March 1995.


Authors' Addresses

   Russ White
   Cisco Systems


   Susan Hares
   NextHop Technologies
   825 Victors Way
   Ann Arbor, MI  48108


   Phone: 734-222-1610
   Fax:
   Email: skh@nexthop.com
   URI:


   Ted Hardie


Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.