Network Working Group                                      Pedro Marques
INTERNET DRAFT                                          Juniper Networks
Expiration Date: October 2003
                                                           Robert Raszuk
                                                              Dan Tappan
                                                           Cisco Systems

                                                            Luca Martini
                                                  Level 3 Communications

                                                              April 2003


       RFC2547bis networks using internal BGP as PE-CE protocol.

                    draft-marques-ppvpn-ibgp-00.txt


Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

This document defines protocol extensions and procedures for BGP PE-CE
router interaction in RFC2547bis networks. These have the objective of
making the usage of the RFC2547bis VPN transparent to the customer
network, as far as routing information is concerned.

draft-marques-ppvpn-ibgp-00.txt                                 [Page 1]

Internet Draft      draft-marques-ppvpn-ibgp-00.txt           April 2003

1. Introduction

In current deployments, when BGP is used as the PE-CE routing protocol,
these peering sessions are typically configured as an external peering
between the VPN provider AS and the customer network AS. A PE router
advertising a route received from a remote PE often remaps the customer
network autonomous-system number to its own. Otherwise the customer
network can use different autonomous-system numbers at different sites
or configure their CE routers to accept routes containing their own AS
number.

While this technique works well in situations where there are no BGP
routing exchanges between the client network and other networks, it
does have drawbacks for customer networks that use BGP internally for
purposes other than interaction between CE and PE routers.

In order to make the usage of RFC2547bis VPN services as transparent as
possible to any external interaction, it is desirable to define a
mechanism by which PE-CE routers can exchange BGP routes by means other
than external BGP.

One can consider an RFC2547bis VPN as a provider-managed backbone
service interconnecting several customer-managed sites. This model is
not universal but it is thought to be common enough to justify special
attention.

Independently of the presence of a VPN service, networks which use a
hierarchical design are typically modeled such that the top-level core
or backbone participates in a full iBGP mesh which distributes routing
information between sites via BGP route reflection [BGP-RR] or
confederations [CONFED].

This document presents two different approaches for the sake of
discussion.

2. Confederations

One possible option to resolve the issue of externally visible AS_PATH
changes introduced by the VPN service would be to use BGP
confederations such that the external session between PE and CE
becomes a confed external session. Any external BGP interaction between
the customer network and a third network would result in the border
router replacing the confed segment by the externally visible customer
AS.

When inter-AS routing exchanges are used by the VPN provider itself,
this solution would result in a new as-path segment being added
containing the VPN autonomous systems that the route transits.
This left-most as-path segment SHOULD be stripped by a PE router before
adding the AS number that represents the VPN network to the confed
segment added by the CE which advertises this route into the VPN cloud.

   65001           AS1                     AS2           65002
   +---+ +---------------------+ +--------------------+ +---+
    CE1 -- PE1 --- [vpn] -- ASBR1 -- ASBR2 -- [vpn] --- PE2 -- CE2

                              Figure 1

As an example, consider in the network represented in Figure 1 a route
advertisement originating in CE2 for a given prefix X. PE2 receives
this route from CE2 with the as-path "(65002)". PE1 will receive this
as a VPN route with an as-path attribute with the value "2 (65002)".
Before advertising this route to CE1, PE1 should remove the as-sequence
that was added to the as-path attribute and add a confederation AS
number that represents the VPN domain in the customer network (65000
for example). If CE1 (or another border router in this site) advertises
this route through an external session to another AS, it will replace
the confed sequence with the customer network AS number.

Contrary to the behavior of BGP confederations, the NEXT_HOP attribute
is not preserved in this solution. A PE router must rewrite the
NEXT_HOP attribute with its own address when advertising a route
received from a CE into the VPN domain so that the appropriate PE-PE
LSP can be calculated. A PE router advertising a PE-received route to a
CE must also rewrite the NEXT_HOP to the address used in the peering
relationship since the address of the remote PE is not significant to
the VPN. Further discussion concerning the NEXT_HOP attribute is
presented below.

3. Route Reflection

In a typical backbone/area hierarchical design, routers that attach an
area (or site) to the core use BGP route reflection to distribute
routes between the top-level core iBGP mesh and the local area iBGP
cluster.

To provide equivalent functionality in a network using a
provider-provisioned backbone, one can consider the VPN network as the
equivalent of an internal BGP route server which multiplexes
information from N VPN attachment points.

A PE router then acts as a route reflector to local CE routers. Note
that route reflection can be used hierarchically in order to avoid
direct communication between the PE and non-directly connected CEs that
may exist in the site.

BGP path attributes are manipulated in order to isolate the VPN network
from the customer network. A new BGP path attribute is defined that can
act as a path attribute stack. At the ingress to the VPN network, the
BGP attributes of the received routes are pushed into the stack. The
stack is popped by a remote PE before performing route selection on the
VRF's Adj-RIB-In.

For the purposes of VRF route selection performed at the PE, between
routes received from local CEs and remote PEs, VPN network IGP metrics
should always be considered higher (thus less preferred) than local
site metrics.

When backdoor links are present, this would tend to direct the traffic
between two sites through the backdoor link for BGP routes originated
by a remote site. However, BGP already has policy mechanisms, such as
LOCAL_PREF, to address this type of situation.

When a given CE is connected to more than one PE, it will not advertise
the route that it receives from a PE to another PE unless configured as
a route reflector, due to the standard BGP route advertisement rules.

When a CE reflects a PE-received route to another PE, the fact that the
original attributes of a route are preserved across the VPN network
prevents the formation of routing loops due to mutual redistribution
between the two networks.

3.1. Carrying internal BGP routes

In order to carry the original BGP attributes of a route received from
a CE, this document defines a new BGP path attribute:

   ATTR_SET (type code TBD)

ATTR_SET is an optional transitive attribute that carries a set of BGP
path attributes. An attribute set (ATTR_SET) can include any BGP
attribute that can occur in a BGP UPDATE message, except the MP_REACH
and MP_UNREACH attributes.

This attribute consists of a 4-byte autonomous system number plus a
variable-length sequence of BGP path attributes.

This attribute is used by a PE router to store the original set of BGP
attributes it receives from a CE. When a PE router advertises a
PE-received route to a CE, it will use the path attributes carried in
the ATTR_SET attribute.

In other words, the BGP path attributes are "pushed" into this
stack-like attribute when the route is received by the VPN network and
"popped" when the route is advertised in the PE to CE direction.

Using this mechanism isolates the customer network from the attributes
used in the VPN network and vice versa. Attributes such as the route
reflection cluster list are segregated such that customer network
cluster identifiers won't be considered by the VPN network route
reflectors and vice versa.

The autonomous system number present in the ATTR_SET attribute is
designed to prevent a route originating in the iBGP of a given
autonomous system from being leaked into a different autonomous system.
It should contain the autonomous system of the customer network that
originates the given set of attributes.

The NEXT_HOP attribute SHOULD NOT be included in an ATTR_SET.

4. Next-hop handling

When RFC2547bis VPNs are not in use, the NEXT_HOP attribute in iBGP
routes carries the address of the border router advertising the route
into the domain. An important component of BGP route selection is the
IGP distance to the NEXT_HOP of the route.
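As an illustration of the ATTR_SET layout described in Section 3.1, the following is a minimal Python sketch that builds and parses only the attribute value: a 4-byte autonomous system number followed by a sequence of encoded path attributes. It is not part of the draft. The function names are hypothetical, the per-attribute header (flags, type code, 1- or 2-octet length) is assumed to follow the usual BGP-4 path attribute encoding, and the sketch simply drops NEXT_HOP along with MP_REACH and MP_UNREACH. The type code of ATTR_SET itself is TBD in this document, so no outer header is produced.

```python
import struct

# Standard BGP-4 path attribute type codes used by this sketch.
NEXT_HOP, MP_REACH, MP_UNREACH = 3, 14, 15
# MP_REACH/MP_UNREACH are excluded by the draft; NEXT_HOP SHOULD NOT be included.
EXCLUDED = {NEXT_HOP, MP_REACH, MP_UNREACH}
EXTENDED_LENGTH = 0x10  # attribute flag bit: length field is two octets


def encode_attr(flags, type_code, value):
    """Encode one path attribute as flags, type code, length, value."""
    if len(value) > 255:
        flags |= EXTENDED_LENGTH
    fmt = "!BBH" if flags & EXTENDED_LENGTH else "!BBB"
    return struct.pack(fmt, flags, type_code, len(value)) + value


def build_attr_set_value(origin_as, attrs):
    """Hypothetical helper: 4-byte customer AS, then the pushed attributes.

    attrs is a list of (flags, type_code, value_bytes) tuples as received
    from the CE.
    """
    body = b"".join(
        encode_attr(flags, type_code, value)
        for flags, type_code, value in attrs
        if type_code not in EXCLUDED
    )
    return struct.pack("!I", origin_as) + body


def parse_attr_set_value(data):
    """Hypothetical helper: recover (origin_as, attrs) from the value."""
    (origin_as,) = struct.unpack_from("!I", data, 0)
    offset, attrs = 4, []
    while offset < len(data):
        flags, type_code = struct.unpack_from("!BB", data, offset)
        offset += 2
        if flags & EXTENDED_LENGTH:
            (length,) = struct.unpack_from("!H", data, offset)
            offset += 2
        else:
            length = data[offset]
            offset += 1
        attrs.append((flags, type_code, data[offset:offset + length]))
        offset += length
    return origin_as, attrs
```

An ingress PE would call something like build_attr_set_value when a route enters the VPN network (the "push"), and an egress PE would call parse_attr_set_value before advertising toward a CE (the "pop").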
When a VPN service is used to provide interconnection between different
sites, since the VPN network runs a different IGP domain, metrics
between the VPN and customer networks are not comparable.

However, the most important component of a metric is the inter-area
metric, which is known to the VPN network. The intra-area metric is
typically negligible. The use of route reflection, for instance,
requires metrics to be configured so that inter-cluster/area metrics
are always greater than intra-cluster metrics.

The approach taken by this document is to rewrite the NEXT_HOP
attribute at the PE-CE boundary. PE routers take into account the PE-PE
IGP distance calculated by the VPN network IGP when selecting between
routes advertised from different PEs.

An advantage of the proposed method is that the customer network can
run independent IGPs at each site.

5. Exchanging routes between different VPN customer networks.

A given VPN customer network SHOULD use internal or external BGP
sessions consistently for peering sessions where the same autonomous
system is used.

In scenarios such as what is commonly referred to as an "extranet" VPN,
routes MAY be advertised to both internal and external VPN attachments,
belonging to different autonomous systems.

        +-----+                 +-----+
        | PE1 |-----------------| PE2 |
        +-----+                 +-----+
         /   \                     |
   +-----+   +-----+            +-----+
   | CE1 |   | CE2 |            | CE3 |
   +-----+   +-----+            +-----+
     AS 1      AS 2               AS 1

Consider the example given above where the (PE1, CE1) and (PE2, CE3)
sessions are iBGP. In RFC2547 VPNs, a route received from CE1 above may
be distributed to the VRFs corresponding to the attachment points for
CE2 and CE3.

The desired result in such a scenario is to present the internal peer
(CE3) with a BGP advertisement that contains the same BGP path
attributes received from CE1, and to present the external peer (CE2)
with a BGP advertisement that would correspond to a situation where
AS 1 and AS 2 have an external BGP session between them.

In order to achieve this goal, the following set of rules applies:

When advertising an iBGP-originated route to iBGP, a PE router MUST
check that the autonomous system contained in the ATTR_SET attribute
matches the autonomous system of the CE to which the route is being
advertised.

If the autonomous systems match, the route is advertised with the
attributes contained in the ATTR_SET attribute. Otherwise, in the case
of an autonomous-system mismatch, the set of attributes to be
advertised to the CE in question shall be constructed as follows:

   1. The path attributes are set to the attributes contained in the
      ATTR_SET attribute.

   2. Internal BGP specific attributes are discarded (LOCAL_PREF,
      ORIGINATOR_ID, CLUSTER_LIST, etc).

   3. The autonomous system contained in the ATTR_SET attribute is
      prepended to the as-path following the rules that would apply to
      an external BGP peering between the source and destination ASes.

   4. Internal BGP specific attributes corresponding to the
      configuration of the destination AS (LOCAL_PREF) are added.

When advertising an iBGP-originated route to eBGP, a PE router shall
apply steps 1 to 3 defined above and subsequently prepend its own
autonomous-system number to the AS_PATH attribute (i.e. both the
originator and VPN network AS numbers are prepended).

When advertising an eBGP-originated route to iBGP, a PE router MUST
prepend its own AS number before adding iBGP-only path attributes
(LOCAL_PREF).
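The rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the specification: attributes are modeled as a plain dictionary, AS_PATH is simplified to a flat list of AS numbers (a single AS_SEQUENCE), and the class name, function names, and the default LOCAL_PREF of 100 are all hypothetical.

```python
from dataclasses import dataclass

# Attributes that are meaningful only inside one AS (step 2).
IBGP_ONLY = {"LOCAL_PREF", "ORIGINATOR_ID", "CLUSTER_LIST"}


@dataclass
class AttrSet:
    origin_as: int  # autonomous system carried in ATTR_SET
    attrs: dict     # original customer path attributes


def _as_if_external(attr_set):
    """Steps 1 to 3: copy, drop iBGP-only attributes, prepend origin AS."""
    attrs = {k: v for k, v in attr_set.attrs.items() if k not in IBGP_ONLY}
    attrs["AS_PATH"] = [attr_set.origin_as] + list(
        attr_set.attrs.get("AS_PATH", [])
    )
    return attrs


def ibgp_route_to_ibgp_ce(attr_set, ce_as, local_pref=100):
    """iBGP-originated route advertised on an iBGP PE-CE session."""
    if attr_set.origin_as == ce_as:
        # Same AS: advertise exactly the attributes stored in ATTR_SET.
        return dict(attr_set.attrs)
    attrs = _as_if_external(attr_set)
    attrs["LOCAL_PREF"] = local_pref  # step 4, assumed local policy value
    return attrs


def ibgp_route_to_ebgp_ce(attr_set, pe_as):
    """iBGP-originated route advertised on an eBGP PE-CE session."""
    attrs = _as_if_external(attr_set)
    attrs["AS_PATH"] = [pe_as] + attrs["AS_PATH"]
    return attrs


def ebgp_route_to_ibgp_ce(attrs, pe_as, local_pref=100):
    """eBGP-originated route advertised on an iBGP PE-CE session."""
    out = dict(attrs)
    out["AS_PATH"] = [pe_as] + list(attrs.get("AS_PATH", []))
    out["LOCAL_PREF"] = local_pref  # iBGP-only attribute added last
    return out
```

Applied to the extranet example, a route from CE1 carried with origin AS 1 would reach CE3 (AS 1, iBGP) with its attributes unchanged, while CE2 (AS 2, eBGP) would see an AS_PATH beginning with the provider AS followed by AS 1, with LOCAL_PREF and CLUSTER_LIST removed.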

In all cases where an iBGP-originated route is processed, attributes
present on the VPN route other than the NEXT_HOP attribute are ignored,
both from the point of view of route selection in the VRF Adj-RIB-In
and route advertisement to a CE router.

6. Acknowledgments

We would like to thank Yakov Rekhter for his comments and suggestions.
We would also like to acknowledge Luyuan Fang, who provided valuable
input into this work.

7. References

[BGP-BASE]   Y. Rekhter, T. Li, S. Hares, "A Border Gateway Protocol 4
             (BGP-4)", draft-ietf-idr-bgp4-20.txt, 03/03.

[RFC2547bis] Rosen et al., "BGP/MPLS VPNs",
             draft-ietf-ppvpn-rfc2547bis-03.txt, 10/02.

[BGP-RR]     Bates, Chandra, and Chen, "BGP Route Reflection: An
             alternative to full mesh IBGP", RFC 2796.

[CONFED]     P. Traina, D. McPherson, J. Scudder, "Autonomous System
             Confederations for BGP", RFC 3065.

8. Authors' Addresses

Pedro Marques
Juniper Networks
1194 N. Mathilda Ave.
Sunnyvale, CA 94089
Email: roque@juniper.net

Robert Raszuk
Cisco Systems, Inc.
Al. Jerozolimskie 146C
02-305 Warsaw, Poland
Email: rraszuk@cisco.com

Dan Tappan
Cisco Systems Inc.
300 Beaver Brook Rd.
Boxborough MA 01719
Email: tappan@cisco.com

Luca Martini
Level 3 Communications, LLC.
1025 Eldorado Blvd.
Broomfield, CO, 80021
Email: luca@level3.net