Network Working Group                                           R. White
Internet-Draft                                             Cisco Systems
Expires: January 16, 2006                                      T. Hardie
                                                           July 15, 2005


                   Bounding Longest Match Considered
                  draft-white-bounded-longest-match-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 16, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   Some ASes currently use length-based filters to manage the size of
   the routing table they use and propagate.  This draft explores an
   alternative to length-based filters which allows for more automatic
   configuration and which provides for better redundancy.  Rather than
   use a filter, this draft proposes a method of modifying the BGP
   [RFC1771] longest match algorithm by setting a bound on the prefix
   lengths eligible for preference.  A bound would operate on long
   prefixes when covering route announcements are available; in certain
   circumstances it would cause a router to prefer an aggregate over a
   more specific route announcement.

1.  Introduction

   Modifying longest match would limit the rate of growth in the
   routing table seen by many BGP speakers.  The current rate of growth
   and the time to convergence represent threats to the stability of
   the Internet.  In the short term, the IETF is considering efforts to
   curb these threats while new routing paradigms that attack the
   fundamental limitations of path vector protocols are developed and
   deployed.

   A number of the practical efforts to limit the rate of growth of the
   routing table have focused on filter policies, arguing that
   aggressive filtering will return the Internet to a state in which
   provider aggregates are a majority of the routes in the routing
   table [BGP-TABLE].  This draft proposes an approach along those same
   lines, but using a bound on the longest match algorithm rather than
   a filter policy.  The authors believe that this approach can produce
   a similar (though not identical) effect while retaining full
   reachability and allowing multi-homing non-transit networks to
   achieve the main goals which have motivated their becoming
   independent ASes.

2.  Proposed Enhancements

   Two enhancements are proposed by this draft: three new communities,
   and a new way of handling overlapping prefixes received from an
   external peer.

   As each prefix is received by a BGP speaker from an external peer,
   it would be evaluated in the light of other prefixes already
   received.  If two prefixes overlap in space (such as 192.168.0.0/16
   and 192.168.1.0/24), the longer prefix would be marked with a new
   BOUNDED community, and its local preference set to a very high
   number so that it would always win in any best path computation
   within the autonomous system.  The longer prefix may also be marked
   with a new community, NO_INSTALL.  Routes marked with the new
   BOUNDED community MAY be filtered at the autonomous system edge to
   reduce the number of routes advertised by an AS.
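   The overlap test described above can be sketched as follows.  This
   is a minimal illustration in Python, not part of the proposal.  The
   function name mark_bounded, the string label "BOUNDED" (the draft
   leaves the actual community value to IANA), and the choice to
   compare only prefixes heard from a single external peer are all
   assumptions made for the example.

```python
import ipaddress

# Hypothetical label; the draft does not assign a community value.
BOUNDED = "BOUNDED"


def mark_bounded(prefixes):
    """Map each prefix to the set of communities it would carry.

    A prefix is marked BOUNDED when some shorter prefix in the same
    input covers it, e.g. 192.168.0.0/16 covers 192.168.1.0/24.
    """
    nets = [ipaddress.ip_network(p) for p in prefixes]
    marks = {}
    for net in nets:
        covered = any(
            other.version == net.version          # never mix IPv4 and IPv6
            and other.prefixlen < net.prefixlen   # "other" is the shorter one
            and net.subnet_of(other)              # and it contains "net"
            for other in nets
        )
        marks[str(net)] = {BOUNDED} if covered else set()
    return marks


# Prefixes as received from one external peer (assumed input shape).
routes = mark_bounded(["192.168.0.0/16", "192.168.1.0/24", "10.0.0.0/8"])
```

   In this sketch only 192.168.1.0/24 ends up marked, because it is the
   only prefix covered by a shorter one in the same input.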
2.1  Example of Bounding the Longer Prefix

   Assume the following configuration of autonomous systems:

                         ( )
               /-------( AS2 )--------\
        ( )   /          ( )           \  ( )         ( )
      ( AS1 )                           ( AS4 )-----( AS5 )
        ( )   \          ( )           /  ( )         ( )
               \-------( AS3 )--------/
                         ( )

   o  AS1 is advertising 192.168.1.0/24 to both AS2 and AS3.

   o  AS2 is advertising both 192.168.1.0/24 and 192.168.0.0/16 into
      AS4.

   o  AS3 is advertising 192.168.1.0/24 into AS4.

   o  Each connection (session) is handled by a separate router within
      each AS (for instance, AS4 peers with AS2 and AS3 on separate
      routers).

   When the AS4 router that peers with AS2 receives both the
   192.168.1.0/24 and the 192.168.0.0/16 prefixes, it will mark
   192.168.1.0/24 as BOUNDED, set its local preference high, based on
   its router ID, as described in Section 2.2 (Setting the Local
   Preference), and will then propagate it through AS4.

   The border router between AS4 and AS3 will receive the longer prefix
   from AS3 and, over iBGP, the same prefix carrying the high local
   preference and the BOUNDED community.  Since it does not see the
   overlapping prefix on its own external session, it will compare the
   default (lower) local preference of the externally learned route
   with the higher local preference set by the AS2/AS4 border router,
   and will not advertise the 192.168.1.0/24 prefix into AS4 at all.

   The AS2/AS4 border router may also, on detecting the overlap, mark
   the longer prefix with a new community, NO_INSTALL, which is
   non-transitive and optional.  Routers that understand this community
   may choose not to install this prefix into the local RIB, in order
   to reduce memory consumption.

   If the link between AS1 and AS2 fails, the longer prefix will be
   withdrawn from AS2, and thus the peering point between AS2 and AS4
   will no longer have an overlapping set of prefixes.  Within AS4, the
   border router which peers with AS2 will cease advertising the
   192.168.1.0/24 prefix, which allows the AS3/AS4 border router to
   begin advertising it into AS4, and through AS4 into AS5, restoring
   connectivity to AS1.

2.2  Setting the Local Preference

   Since there could be multiple points at which an autonomous system
   may receive the same pair of overlapping prefixes, there must be
   some way to ensure that one of the longer prefixes wins in the BGP
   [RFC1771] decision algorithm consistently.  In practice, this means
   that each BGP speaker which receives an overlapping set of routes
   should set the local preference on the longer prefixes so that no
   two copies of a longer prefix have matching local preferences.

   The easiest way to ensure this within an autonomous system is to set
   the local preference for longer prefixes based on some unique number
   assigned to each BGP speaker.  Given that the router ID and the
   local preference are both 32-bit numbers, an ideal solution appears
   to be to simply set the local preference to the router ID of the BGP
   speaker.

   The primary problem with this is that in some cases the router ID of
   the device may be lower than some standard Local Preference, perhaps
   even lower than a standard Local Preference used by default
   throughout a network.  To alleviate this problem, the local
   preference of longer prefixes which overlap with shorter prefixes
   should be set to the router ID of the BGP speaker, and then the high
   order bit of the Local Preference should be set, so the setting is
   guaranteed to be at least 2^31 (2,147,483,648), well above 64,000.

2.3  The NO_INSTALL Community

   An optional optimization to bounding longer prefixes by marking them
   with a high Local Preference and the BOUNDED community is to also
   mark them with a new, non-transitive, optional community,
   NO_INSTALL.
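   The marking step can be sketched as follows, using the local
   preference rule of Section 2.2 (the router ID with the high order
   bit set).  This is an illustration under stated assumptions, not an
   implementation of any router's behavior: the Route class, the
   function names, the string community labels, and the default local
   preference of 100 are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass, field

HIGH_ORDER_BIT = 0x80000000  # top bit of a 32-bit local preference


def bounded_local_pref(router_id: str) -> int:
    """Router ID (dotted quad) with the high order bit forced on."""
    return int(ipaddress.IPv4Address(router_id)) | HIGH_ORDER_BIT


@dataclass
class Route:
    """Hypothetical stand-in for a BGP route entry."""
    prefix: str
    local_pref: int = 100  # assumed default, not taken from the draft
    communities: set = field(default_factory=set)


def mark_longer_prefix(route: Route, router_id: str,
                       no_install: bool = False) -> Route:
    """Apply the BOUNDED marking, and optionally NO_INSTALL."""
    route.local_pref = bounded_local_pref(router_id)
    route.communities.add("BOUNDED")
    if no_install:
        route.communities.add("NO_INSTALL")
    return route


marked = mark_longer_prefix(Route("192.168.1.0/24"), "10.0.0.1",
                            no_install=True)
```

   One consequence of forcing the high order bit is that two router IDs
   differing only in that bit (10.0.0.1 and 138.0.0.1, for instance)
   would produce the same local preference.  The sketch does not
   attempt to resolve that case.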
   The effect of this community would be for any BGP speaker receiving
   a prefix with this community set to treat the prefix normally in the
   BGP bestpath computation, and to forward bestpaths marked as
   NO_INSTALL to iBGP peers, but to simply not install such prefixes in
   the local routing table.  This would result in a somewhat smaller
   amount of information being stored and maintained in the local
   routing table, and in the local forwarding tables built from the
   local routing table.  If there are enough prefixes thus marked, the
   memory and computation savings could be significant.

   BGP speakers which receive a prefix marked with NO_INSTALL, and
   which do not understand this community, simply ignore the community.

3.  The NO_BOUNDING Community

   In some situations, the originator of a longer prefix might
   determine that their routing will not work properly if their prefix
   is bounded at a point where it overlaps with a shorter prefix
   aggregate.  To resolve this case, we propose a new transitive
   optional extended community, NO_BOUNDING.

   The NO_BOUNDING extended community consists of a type, to be
   determined through the IANA process, and a value containing the
   minimum AS Path length before which the route should not be bounded.

   If a BGP speaker determines a route could be bounded, but the route
   is marked with NO_BOUNDING, and the AS Path length is shorter than
   the minimum AS Path length noted in the NO_BOUNDING extended
   community, the speaker SHOULD NOT mark the route for bounding.  This
   allows the originator of a prefix to control the bounding properties
   of the prefix.

4.  Benefits and Risks

   The benefits and risks associated with this proposal are discussed
   in the sections below.
4.1  Advantages to the Service Provider

   AS4, in each of the situations, reduces the number of prefixes
   carried through the autonomous system by the number of longer
   prefixes that overlap with aggregates of those prefixes.  While one
   copy of the prefix continues to be carried through the autonomous
   system, this entry can be marked with the optional NO_INSTALL
   community, so it is not placed in the forwarding table, nor is it
   propagated outside the autonomous system.

   AS5 receives one prefix instead of two (or possibly more).

4.2  Advantages to the Customer

   In this case, the customer is represented as AS1.  The customer will
   continue to receive some amount of traffic over both peering
   sessions, and dual homing through two Service Providers is still
   effective.  If the customer's primary link fails, the alternate link
   through AS3 will take over receiving all inbound traffic
   automatically.

   With most other schemes presented to this point, the customer loses
   all benefit of dual-homing into the Internet, unless both
   connections are through one Service Provider.

4.3  Advantages to the Internet

   Beyond the second AS hop, aggregation is preserved in all cases.
   While this would not reduce the backbone routing table by the
   dramatic amounts that other methods might, the advantages to the
   community are still great, and they come at greatly reduced risk to
   customers.

4.4  Implications for Router Processing

   This proposal clearly adds to the work which needs to be done during
   overall BGP [RFC1771] processing.  Because a check needs to be done
   for both covered and covering routes, some part of this work is
   required for routes of lengths on either side of the bound.  Should
   this become common, however, the rate of growth in the number of
   routes should be smaller, and a balance should be struck between the
   extra processing per route and the number of routes.
4.5  Implications for Traffic Engineering

   The implementation of a bound risks magnifying or removing the
   effect of certain widely deployed traffic engineering methods.  If,
   for example, an AS chose to prepend its own AS to an announcement in
   order to alter the preference for that route, a BGP neighbor using a
   bounded longest match might now see that route as eligible for
   discard in favor of an aggregate.  While it is fairly easy to code
   around that particular problem, to avoid this class of problems it
   might be preferable to allow the bound to be applied to specific AS
   Sets as well as to all BGP neighbors.

4.6  Implications for Convergence Time

   If the route to the AS providing the route to the aggregate should
   be lost, the more-specific must propagate into the ASes which had
   formerly heard only the aggregate.  This increases convergence time
   and may create situations in which reachability is temporarily
   compromised.  Unlike the filter case, however, normal BGP behavior
   should restore reachability without changes to the router
   configuration.

   There is also a risk that during a pathological event the increased
   processing required by this change will degrade propagation times.
   This depends on both the speed of specific implementations and the
   character of the topology.

5.  Acknowledgements

   Cengiz Alaettinoglu, Alvaro Retana, Daniel Walton, David Ball, and
   Barry Greene gave valuable comments on this draft.  Jeff Haas
   suggested the NO_BOUNDING community, along with the AS Path length
   limit described in the NO_BOUNDING section.

   A number of colleagues also gave the author valuable comments on the
   white board markings that gave rise to this paper; among them are
   Lane Patterson, Ian Cooper, Gerd Besch, Bill Norton, Diarmuid Flynn,
   and Sean Donelan.

6.  Security Considerations

   This document presumes that the implementation of bounded longest
   match is a knob inside a router config.  Since the use of the knob
   affects route announcements not originating within the router's AS
   or its direct neighbors, the new behavior may result in surprises to
   the announcing AS.  It is possible that this behavior might be
   considered a denial of service, or mistaken for a denial of service,
   by systems designed to detect black-holing on behalf of the origin
   AS.

7.  IANA Considerations

   This draft proposes three new communities, BOUNDED, NO_BOUNDING, and
   NO_INSTALL, for which new community values would need to be
   assigned.  These should be assigned as described in [EXT-COMM].

8.  Informative References

   [BGP-TABLE]  Bush, R., "Plenary, IETF 51.
                http://www.ietf.org/proceedings/01aug/".

   [EXT-COMM]   Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
                Communities Attribute",
                draft-ietf-idr-bgp-ext-communities-09 (work in
                progress), January 2006.

   [RFC1771]    Rekhter, Y. and T. Li, "A Border Gateway Protocol 4
                (BGP-4)", RFC 1771, March 1995.

Authors' Addresses

   Russ White
   Cisco Systems

   Ted Hardie

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.
   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2005).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.