Network Working Group                                             Li Bin
Internet Draft                                                Dong Weisi
Expires: May 2003                                              Li Defeng
                                                     Huawei Technologies
                                                       November 06, 2002


          Hierarchy of Provider Edge Device in BGP/MPLS VPN

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   In a BGP/MPLS VPN deployment, a PE (Provider Edge) device must
   maintain all the VPN routes of the VPNs to which it belongs.  When
   a PE aggregates many VPNs and its capacity is limited, it becomes a
   bottleneck.  A second problem is that the current BGP/MPLS VPN
   model is a "Plane Model": the performance demanded of a PE device
   is the same no matter which network layer the device belongs to.
   A typical network, however, follows a "Core-Convergence-Access
   (Edge)" model.  Device performance is highest in the core layer
   and lowest in the access layer, the scale of the network is large
   in the access layer and small in the core layer, and routes are
   aggregated at every layer.  In the Plane Model, a PE device that
   is pushed toward the edge layer therefore has to maintain more VPN
   routes, which makes it difficult to extend PE devices to the edge
   layer.

   This document defines a model of Hierarchy of Provider Edge Device
   (HoPE) in BGP/MPLS VPN, in which a hierarchical PE is composed of
   several devices, each taking on a different part of the functions
   of the former centralized PE.  We call this the "Hierarchy Model".
   In this model the routing and switching performance requirements
   are strict for PE devices in the higher layer and loose for PE
   devices in the edge layer.

   One HoPE can be composed of an SPE (Superstratum PE) and the UPEs
   (Under-layer PEs) connected to that SPE, or of a higher-level SPE
   and the HoPEs connected to it, which together build up a new HoPE.
   This construction is called nesting of HoPE, and it can be repeated
   many times.  The former HoPE connects to the higher-level SPE in
   the role of a UPE, and the new HoPE can also connect single UPEs.

Table of Contents

   1.  Introduction
   2.  Working Principle
   2.1 VPN routes
   2.2 Control Flow (Route Advertising and Label Distribution)
   2.3 Data Flow (Label Operation and Packet Forwarding)
   3.  Interface between UPE and SPE
   4.  Nesting of HoPE
   5.  Multi-Homing UPE
   6.  Backdoor link between UPEs
   7.  The Forwarding Procedure in Some Special Cases
   8.  Security Consideration
   9.  Acknowledgements
   10. References
   11. Authors' Addresses
   Full Copyright Statement

1. Introduction

   This document defines a model of Hierarchy of Provider Edge Device
   in BGP/MPLS VPN, in which a hierarchical PE is composed of several
   devices, each taking on a different part of the functions of the
   former centralized PE.  We call this the "Hierarchy Model".  In
   this model the routing and switching performance requirements are
   strict for PE devices in the higher layer and loose for PE devices
   in the edge layer.

   A PE device can connect not only to Customer Edge (CE) devices but
   also to another PE device or, more generally, to an MPLS VPN
   network.  PE devices connected in this way form the "Hierarchy of
   PE".  The PE device that takes the position the CE device occupies
   in the Plane Model is called the Under-layer PE (UPE), and the PE
   device to which a UPE connects is called the Superstratum PE (SPE).
   The architecture is called Hierarchy of PE (HoPE) and is shown in
   Figure 1.

+----------+  +----+                +---+
|VPN1 Site1|--|    |----------------|   |
+----------+  |    |  +----------+  |   |  +-------+
+----------+  |UPE1|  |VPN1 Site4|--|   |  |       |  +--+  +----------+
|VPN2 Site1|--|    |  +----------+  |   |  |       |--|PE|--|VPN1 Site3|
+----------+  +----+  +----------+  |   |--| MPLS  |  +--+  +----------+
                      |VPN1 Site4|--|SPE|  |NETWORK|
+----------+  +----+  +----------+  |   |  |       |  +--+  +----------+
|VPN1 Site2|--|    |  +-------+     |   |  |       |--|PE|--|VPN2 Site3|
+----------+  |    |--| MPLS  |-----|   |  |       |  +--+  +----------+
+----------+  |UPE2|  |NETWORK|     |   |  +-------+
|VPN2 Site2|--|    |  +-------+     +---+
+----------+  +----+

                Figure 1: Hierarchy of PE Architecture

   Several UPEs and an SPE form the Hierarchy of PE, which provides
   the conventional function of a PE in the Plane Model.  Their
   respective functions are as follows:

   A UPE maintains only the routes of the VPN sites directly connected
   to it.  It does not maintain the routes of remote VPN sites, or
   maintains only aggregate routes for them.  An SPE maintains the
   routes of all the sites directly connected to it and of the sites
   directly connected to the UPEs that are directly connected to it.

   A UPE assigns inner MPLS labels to the routes of the sites directly
   connected to it and advertises those labels to the SPE together
   with the VPN routes through MP-BGP.  The SPE does not advertise the
   routes of remote sites to the UPE.  It advertises only the VRF
   default route or an aggregate route to the UPE, and that route
   carries an MPLS label.

   Either MP-IBGP or MP-EBGP can be used between UPE and SPE.  When
   MP-IBGP is used, the SPE should be the Route Reflector (RR) for all
   the UPEs connected to it, and each UPE acts as a client of this RR,
   but the SPE does not act as the RR for other PEs.  When MP-EBGP is
   used between UPE and SPE, the AS number of the UPEs should be a
   private AS number (64512~65535).  In fact, the Hierarchy of PE can
   be handled with the rules of Confederation, in which case every
   confederation AS is composed of only one BGP speaker, the UPE
   connected to the SPE.

   Packet forwarding between SPE and UPE is based on labels, so only
   one interface is needed to connect them.  This interface can be a
   physical interface, a sub-interface such as a VLAN or PVC, or a
   tunnel such as GRE or an LSP.  When a tunnel is used between UPE
   and SPE, an IP network or MPLS network can be deployed between
   them.

   A Hierarchy of PE takes on all the functions of a normal PE, and
   from the outside there is no difference between the two, so this
   "special" PE can coexist with other PEs in the MPLS network.

2. Working Principle

   This section specifies the working principle of the Hierarchy of
   PE, including the maintenance of VPN routes, the distribution of
   MPLS labels, and packet forwarding.

2.1 VPN routes

   An SPE establishes an MP-BGP session with each UPE.  If the two are
   administered by the same service provider, the session can be
   MP-IBGP; otherwise it should be MP-EBGP.
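   As a rough illustration of the session choice above and of the
   roles given in Section 1, the sketch below describes the session
   for one UPE-SPE pair.  The function name plan_session, the keys of
   the returned dictionary, and the same_provider flag are
   hypothetical and are not defined by this document.  The private AS
   range is the one stated in Section 1.

```python
# Private AS range as stated in Section 1 of this document.
PRIVATE_AS_MIN = 64512
PRIVATE_AS_MAX = 65535


def plan_session(same_provider: bool, upe_as: int) -> dict:
    """Describe the MP-BGP session between one UPE and its SPE.

    Hypothetical helper: the keys of the returned dict are illustrative only.
    """
    if same_provider:
        # MP-IBGP: the SPE reflects routes for its UPEs only, not for other PEs.
        return {
            "session": "MP-IBGP",
            "spe_role": "route-reflector",
            "upe_role": "rr-client",
            "upe_as_is_private": None,  # not relevant for an internal session
        }
    # MP-EBGP: the document recommends (does not mandate) a private AS on the UPE,
    # so the check is reported as a flag instead of raising an error.
    return {
        "session": "MP-EBGP",
        "spe_role": "ebgp-peer",
        "upe_role": "ebgp-peer",
        "upe_as_is_private": PRIVATE_AS_MIN <= upe_as <= PRIVATE_AS_MAX,
    }


if __name__ == "__main__":
    print(plan_session(True, 100))
    print(plan_session(False, 64600))
```

   Returning a flag for the private AS check, instead of rejecting the
   configuration, reflects the "should" wording used in this document.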

   In the MP-IBGP case, if sites attached to different UPEs connected
   to the same SPE belong to the same VPN, the SPE acts as the Route
   Reflector for the relevant UPEs; otherwise the SPE acts as the
   aggregating PE for all the UPEs connected to it.  As in the normal
   Plane Model BGP/MPLS VPN (RFC 2547bis), Route-Target lists are used
   to select the right VPN routes from other PEs.  Because only the
   SPE exchanges route information with other PEs, each UPE should
   send its import route target list to the SPE.  The SPE merges these
   lists into the HoPE-wide import route target list, and with it
   selects the VPN routes that belong to VPNs having sites connected
   to the SPE, either directly or through a UPE.

   The HoPE-wide import route target list can be configured statically
   or derived dynamically between the SPE and the UPEs.  The dynamic
   mechanism is as follows:

   Each UPE advertises an ORF (Outbound Route Filter) [BGP-ORF] to the
   SPE in a Route Refresh (RFC 2918) message.  The ORF entry includes
   an extended community list whose content is the union of the import
   route target lists of all the VRFs on the UPE.  The SPE merges the
   import route target lists received from all the UPEs connected to
   it and derives the HoPE-wide import route target list.

   In the MP-EBGP case, the SPE derives the HoPE-wide import route
   target list with the same mechanism.  In general, a UPE should use
   a private AS number in the VPN routes it advertises to the SPE.
   When the SPE advertises these routes to other PEs, it should strip
   the private AS number.

   An SPE is permitted to connect to some of its UPEs through MP-IBGP
   and to the others through MP-EBGP.

   A UPE selects its own VPN routes by matching its own import route
   target list against the export route target list attached to each
   VPN route.
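   The derivation of the HoPE-wide import route target list and the
   route selection based on it can be sketched as follows.  This is a
   minimal illustration, not a protocol implementation: the names
   VpnRoute, hope_wide_import_rts, matches and upes_to_notify are
   hypothetical, route targets are modelled as plain strings such as
   "100:1", and the ORF exchange itself is assumed to have already
   delivered each UPE's list to the SPE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VpnRoute:
    prefix: str
    export_rts: frozenset  # export route targets attached to the route


def hope_wide_import_rts(spe_import_rts, upe_import_rts):
    """Merge the SPE's own import RTs with the lists received from its UPEs.

    upe_import_rts maps a UPE name to the union of the import RTs of all
    VRFs on that UPE (the content the UPE would carry in its ORF entry).
    """
    merged = set(spe_import_rts)
    for rts in upe_import_rts.values():
        merged |= set(rts)
    return frozenset(merged)


def matches(import_rts, route):
    """A route is selected when at least one export RT is in the import list."""
    return not route.export_rts.isdisjoint(import_rts)


def upes_to_notify(upe_import_rts, route):
    """Names of the UPEs whose own import list matches the route."""
    return sorted(u for u, rts in upe_import_rts.items() if matches(rts, route))


if __name__ == "__main__":
    # Illustrative values only.
    upes = {"UPE1": {"100:1", "100:2"}, "UPE2": {"100:1"}}
    wide = hope_wide_import_rts({"100:3"}, upes)
    route = VpnRoute("10.2.0.0/16", frozenset({"100:2"}))
    print(sorted(wide), matches(wide, route), upes_to_notify(upes, route))
```

   Keeping the per-UPE lists alongside the merged list lets the SPE
   both filter routes arriving from other PEs and decide which UPEs
   are concerned by a given route.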

   The SPE advertises the VRF default route or aggregate routes
   (through which the UPE forwards VPN packets to the SPE) to the UPE.
   This default VRF route can be formed dynamically or configured
   statically.  When it is formed dynamically, it can be filtered by
   the ORF mechanism described above.

2.2 Control Flow (Route Advertising and Label Distribution)

   This section specifies how the network distributes labels for the
   routes of the VPN sites.  Figure 2 shows the control flow between
   VPN1 Site1 and VPN1 Site2.  The control flow (route and label
   distribution) differs in the two directions.  Each direction has
   four steps, labeled (1) through (4) in the direction from VPN1
   Site2 to VPN1 Site1 and (5) through (8) in the direction from VPN1
   Site1 to VPN1 Site2.

               |  (4)  |  (3)   |     (2)      | (1)  |
               |<------|<-------|<-------------|<-----|
               v       v        v              v      v
+----------+ +---+   +----+   +---+   +---+   +--+  +---+ +----------+
|VPN1 Site1|-|CE1|---|UPE |---|SPE|---| P |---|PE|--|CE2|-|VPN1 Site2|
+----------+ +---+   +----+   +---+   +---+   +--+  +---+ +----------+
               ^       ^        ^              ^      ^
               |  (5)  |  (6)   |     (7)      | (8)  |
               |------>|------->|------------->|----->|

      Figure 2: The control flow (route and label distribution)

   In Figure 2, steps (1) through (8) have the following meanings.

   Steps (1) through (4) describe the procedure in the direction from
   VPN1 Site2 to VPN1 Site1:

   (1) CE2 advertises a route of VPN1 Site2 to the PE.

   (2) The PE assigns an inner MPLS label to this route and advertises
       the route to the SPE through MP-BGP, with the inner label and
       the relevant export route target list attached.

   (3.a) The SPE matches the export route target list of the received
       route against the HoPE-wide import route target list to decide
       whether to import the VPN route.  In this case they match, so
       the SPE imports the received VPN route.

   (3.b) At the same time, the SPE matches the export route target
       list of the received route against the import route target
       lists advertised for the VRFs on the UPEs connected to it.  In
       this case the import route target list of the UPE VRF that
       corresponds to VPN1 Site1 (called VRF1) matches, so the SPE
       advertises the default VRF1 route to the UPE, with the inner
       MPLS label distributed by PE attached.

   (4) The UPE advertises this route to CE1 through the routing
       protocol between them (RIP, OSPF, BGP, or a default route).

   Steps (5) through (8) describe the procedure in the direction from
   VPN1 Site1 to VPN1 Site2:

   (5) CE1 advertises a route of VPN1 Site1 to the UPE.

   (6) The UPE assigns an inner label to this route and advertises the
       route to the SPE through MP-BGP with the inner label attached.

   (7) The SPE replaces the inner label assigned by the UPE with
       another inner label assigned by the SPE itself, then advertises
       the route to the PE through MP-BGP with the new inner label
       attached.

   (8) The PE distributes the route to CE2 with no label attached.

   An LSP must of course be established between SPE and PE in each
   direction.  This mechanism is the same as in RFC 2547bis.  The
   procedure for forwarding VPN packets with the two labels is
   detailed in Section 2.3.

2.3 Data Flow (Label Operation and Packet Forwarding)

   This section specifies how VPN packets are forwarded in the network
   using the route and label information derived in Section 2.2.
   Figure 3 shows the data flows between VPN1 Site1 and VPN1 Site2.
   The data flow (label operation and packet forwarding) differs in
   the two directions.  Each direction has five steps, labeled (1)
   through (5) in the direction from VPN1 Site1 to VPN1 Site2 and (6)
   through (10) in the direction from VPN1 Site2 to VPN1 Site1.

               | (10)  |  (9)   |  (8)  | (7)  | (6)  |
               |<------|<-------|<------|<-----|<-----|
               v       v        v       v      v      v
+----------+ +---+   +----+   +---+   +---+   +--+  +---+ +----------+
|VPN1 Site1|-|CE1|---|UPE |---|SPE|---| P |---|PE|--|CE2|-|VPN1 Site2|
+----------+ +---+   +----+   +---+   +---+   +--+  +---+ +----------+
               ^       ^        ^       ^      ^      ^
               |  (1)  |  (2)   |  (3)  | (4)  | (5)  |
               |------>|------->|------>|----->|----->|

      Figure 3: Data Flow (Label Operation and Packet Forwarding)

   In Figure 3, steps (1) through (10) have the following meanings.

   Steps (1) through (5) describe forwarding in the direction from
   VPN1 Site1 to VPN1 Site2:

   (1) When a VPN packet from VPN1 Site1 destined for VPN1 Site2
       arrives at CE1, CE1 forwards it to the UPE based on the default
       route or on the route learned through the dynamic routing
       protocol between CE1 and the UPE, as described in step (4) of
       Section 2.2.

   (2) The UPE pushes the inner label based on the default VRF route
       and forwards the VPN packet to the SPE.  The inner label and
       the default VRF route are those described in step (3) of
       Section 2.2.

   (3) The SPE pops the inner label pushed by the UPE in step (2),
       looks up the VRF route, and pushes the new inner label assigned
       by the PE in step (2) of Section 2.2.  It then pushes the outer
       label distributed by the P router and forwards the VPN packet
       to P.  In the backbone network, every P router on the LSP from
       SPE to PE swaps the outer label and forwards the packet along
       the LSP until it reaches the PE or, optionally, until the
       penultimate P router pops the outer label.

   (4) The P router at the penultimate hop to the PE pops the outer
       label and forwards the VPN packet to the PE.

   (5) The PE pops the inner label and forwards the VPN packet to CE2,
       which forwards it into VPN1 Site2 using the routes of that
       site.

   Steps (6) through (10) describe forwarding in the direction from
   VPN1 Site2 to VPN1 Site1:

   (6) When a VPN packet from VPN1 Site2 destined for VPN1 Site1
       arrives at CE2, CE2 forwards it to the PE based on the default
       route or on the route learned through the dynamic routing
       protocol between CE2 and the PE, as described in step (8) of
       Section 2.2.

   (7) The PE pushes the inner label assigned by the SPE in step (7)
       of Section 2.2, based on the VRF route, then pushes the outer
       label distributed by the P router and forwards the VPN packet
       to P.  In the backbone network, every P router on the LSP from
       PE to SPE swaps the outer label and forwards the packet along
       the LSP until it reaches the P router at the penultimate hop to
       the SPE.

   (8) The P router at the penultimate hop to the SPE optionally pops
       the outer label and forwards the VPN packet to the SPE.

   (9) The SPE swaps the inner label, replacing the inner label
       assigned by the SPE with the inner label assigned by the UPE in
       step (6) of Section 2.2, then forwards the VPN packet to the
       UPE.

   (10) The UPE pops the inner label and forwards the VPN packet to
       CE1, which forwards it into VPN1 Site1 using the routes of that
       site.

3. Interface between UPE and SPE

   A UPE can connect to an SPE over any type of interface or
   sub-interface, and even over a tunnel interface.  In the tunnel
   case the UPE can reach the SPE across an IP or MPLS network, and
   because SPE and UPE are MP-BGP peers, routes can be advertised
   directly over the TCP connection between them.

   In the MP-EBGP case, UPE and SPE can peer across the IP or MPLS
   network using multi-hop EBGP.  Labeled VPN packets forwarded
   between UPE and SPE must then pass through a tunnel.  If the tunnel
   is GRE, MPLS encapsulation must be supported.  If the tunnel is an
   LSP, the network between UPE and SPE must be an MPLS network, and
   LDP/CR-LDP or RSVP-TE must be supported on the UPE and the SPE.

4. Nesting of HoPE

   One HoPE can be composed of an SPE and the UPEs connected to that
   SPE, or of a higher-level SPE and the HoPEs connected to it, which
   together build up a new HoPE.  This construction is called nesting
   of HoPE, and it can be repeated many times.  The former HoPE
   connects to the higher-level SPE in the role of a UPE, and the new
   HoPE can also connect single UPEs.  Figure 4 shows a three-layer
   HoPE, in which the PE in the middle layer is called the MPE
   (Middle-level PE).

+----------+  +----+                +---+
|VPN1 Site1|--|    |----------------|   |
+----------+  |    |  +----------+  |   |  +-------+
+----------+  |UPE1|  |VPN1 Site4|--|   |  |       |  +--+  +----------+
|VPN2 Site1|--|    |  +----------+  |   |  |       |--|PE|--|VPN1 Site3|
+----------+  +----+  +----------+  |   |--| MPLS  |  +--+  +----------+
                      |VPN1 Site4|--|SPE|  |NETWORK|
+----------+  +----+  +----------+  |   |  |       |  +--+  +----------+
|VPN1 Site2|--|    |  +-------+     |   |  |       |--|PE|--|VPN2 Site3|
+----------+  |    |--| MPLS  |-----|   |  |       |  +--+  +----------+
+----------+  |UPE2|  |NETWORK|     |   |  +-------+
|VPN2 Site2|--|    |  +-------+     |   |
+----------+  +----+                |   |
                                    |   |
+----------+  +----+  +----+        |   |
|VPN1 Site4|--|UPE3|--|MPE |--------|   |
+----------+  +----+  +----+        +---+

                     Figure 4: Nesting of HoPE

   MP-BGP runs between SPE and MPE and between MPE and UPE.  If
   MP-IBGP is used, the SPE acts as the route reflector for all the
   MPEs, and each MPE acts as the route reflector for all its UPEs.
   MP-BGP advertises all the VPN routes of the lower-layer PEs to the
   upper-layer PE, and advertises the default VRF route or the
   aggregate route of the upper layer to the lower-layer PE.

   The SPE therefore maintains all the VPN routes of the whole HoPE,
   each MPE maintains the VPN routes of the UPEs connected to it, and
   each UPE maintains the VPN routes of the sites connected to it.
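   The division of route state across the layers of a nested HoPE can
   be illustrated with a small recursive sketch.  It is a
   simplification under stated assumptions: the class PeNode, the
   function maintained_routes and the prefixes are hypothetical, and
   the model covers only routes learned from inside the HoPE.  Routes
   that the SPE imports from remote PEs, and the default routes sent
   downward, are left out.

```python
from dataclasses import dataclass, field


@dataclass
class PeNode:
    name: str
    site_routes: set = field(default_factory=set)  # routes of directly connected sites
    lower: list = field(default_factory=list)      # lower-layer PEs (UPEs or MPEs)


def maintained_routes(pe):
    """Routes a PE holds: those of its own sites plus everything below it."""
    routes = set(pe.site_routes)
    for child in pe.lower:
        routes |= maintained_routes(child)
    return routes


def build_example():
    """Three-layer example loosely following Figure 4 (prefixes are made up)."""
    upe1 = PeNode("UPE1", {"10.1.1.0/24"})
    upe3 = PeNode("UPE3", {"10.1.4.0/24"})
    mpe = PeNode("MPE", lower=[upe3])
    spe = PeNode("SPE", {"10.1.5.0/24"}, lower=[upe1, mpe])
    return spe, mpe, upe3


if __name__ == "__main__":
    spe, mpe, upe3 = build_example()
    for node in (upe3, mpe, spe):
        print(node.name, sorted(maintained_routes(node)))
```

   The recursion mirrors the upward advertisement: each layer holds
   the union of what the layers beneath it hold, so the state grows
   toward the SPE and stays small at the UPEs.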

   The SPE advertises the default VRF routes, with a label attached,
   to the MPE.  The MPE replaces this label with a new label and
   advertises the route with the new label attached to the UPE.

   The upper-layer PE should create a HoPE-wide global import route
   target list and filter out the VPN routes that do not belong to the
   VPNs connected to it.  The MPE merges the import route target lists
   of all the UPEs connected to it, and the SPE merges the merged
   import route target lists of all the MPEs connected to it.  This
   merging can be configured statically or derived dynamically.  In
   the latter case, the MPE should forward its import route target
   list to the SPE through ORF.

   When a VPN packet from a local site of the HoPE is forwarded to the
   UPE, the UPE looks up the relevant default VRF route, pushes the
   label, and forwards the packet to the MPE.  The MPE pops the label,
   looks up the default VRF route or aggregate route toward the SPE,
   pushes a new label, and forwards the packet to the SPE.  The SPE
   pops this label, looks up the VRF forwarding table, pushes the
   inner label and the outer label that the P router distributed for
   this route, and then follows the RFC 2547bis forwarding procedure.

   In the opposite direction, a VPN packet from a remote site is
   forwarded to the SPE through the MPLS network following the RFC
   2547bis forwarding procedure.  Because the MPE has assigned an
   inner label for the destination address of the packet, the SPE
   swaps the inner label and forwards the packet to the MPE.  For the
   same reason, the MPE swaps the inner label and forwards the packet
   to the UPE.  The UPE then pops the inner label and forwards the
   packet to the local site.

5. Multi-Homing UPE

   One UPE can connect to several SPEs, in which case it is called a
   Multi-Homing UPE.  All the SPEs advertise their default VRF routes
   to this UPE, and the UPE either selects the best one or treats them
   as ECMP (Equal Cost Multi-Path) and shares the load among them.  In
   the other direction, the UPE can advertise all its VPN routes to
   all the SPEs, or it can advertise one part of its VPN routes to one
   SPE and the remaining part to another, sharing the load among the
   SPEs.

6. Backdoor link between UPEs

   A backdoor link can be set up between UPEs so that the sites
   connected to the two UPEs can communicate directly without passing
   through the SPE.  The UPEs can be connected to the same SPE or to
   different SPEs.  They advertise VPN routes to each other through
   MP-BGP, and the control flow and data flow are the same as the RFC
   2547bis procedures.  The two UPEs can even be separated by a
   network, in which case the packets pass through a tunnel such as
   GRE or an LSP.

7. The Forwarding Procedure in Some Special Cases

   In the first case, an SPE connects two UPEs, UPE1 and UPE2, which
   connect to site1 and site2 respectively.  Site1 and site2 belong to
   the same VPN and communicate with each other through the SPE.  The
   forwarding procedure is as follows.  When a packet sent from site1
   arrives at UPE1, UPE1 pushes the label based on the default route
   and forwards the packet to the SPE.  The SPE pops this label, looks
   up the VRF route table, pushes the label assigned by UPE2, and
   forwards the packet to UPE2.  UPE2 pops the label and forwards the
   packet to site2.  The opposite direction works the same way, with
   the roles of UPE1 and UPE2 exchanged.

   In the second case, an SPE connects one UPE and one CE.  The CE
   connects to site1 and the UPE connects to site2, and site1 and
   site2 belong to the same VPN.  The forwarding procedure is as
   follows:

   (1) When an unlabeled packet sent from site1 arrives at the SPE
       through the CE, the SPE looks up the VRF route table, pushes
       the label assigned by the UPE, and forwards the packet to the
       UPE.  The UPE pops this label and forwards the packet to site2.

   (2) When a packet originating from site2 arrives at the UPE, the
       UPE forwards it to the SPE on the basis of the default or
       aggregate route.  The SPE then looks up the VRF route table,
       pops the label, and forwards the packet to site1 as an IP
       packet.

8. Security Consideration

   The level of security provided by this architecture is identical to
   that provided by RFC 2547bis.  No new security problem is
   introduced.

9. Acknowledgements

   The authors would like to thank Li Hejun, Cao Xuegui, and Chang
   Wenjun for their support.

10. References

   [RFC2026] Bradner, S., "The Internet Standards Process -- Revision
             3", BCP 9, RFC 2026, October 1996.

   [BGP-ORF] Chen, E. and Rekhter, Y., "Cooperative Route Filtering
             Capability for BGP-4",
             draft-ietf-idr-route-filter-06.txt.

   [BGP-RR]  Chen, E., "Route Refresh Capability for BGP-4", RFC 2918,
             September 2000.

   [RFC2547] Rosen, E. and Rekhter, Y., "BGP/MPLS VPNs", RFC 2547,
             March 1999.

   [2547bis] Rosen, E., Rekhter, Y. et al., "BGP/MPLS VPNs", work in
             progress.

11. Authors' Addresses

   Li Bin
   D201, HuaWei Bld.
   No.3 Xinxi Rd.
   Shang-Di Information Industry Base,
   Hai-Dian District
   BeiJing P.R.China
   Zip: 100085
   Email: l.b@huawei.com

   Dong Weisi
   C401, HuaWei Bld.
   No.3 Xinxi Rd.
   Shang-Di Information Industry Base,
   Hai-Dian District
   BeiJing P.R.China
   Zip: 100085
   Email: dongws@huawei.com

   Li Defeng
   D201, HuaWei Bld.
   No.3 Xinxi Rd.
   Shang-Di Information Industry Base,
   Hai-Dian District
   BeiJing P.R.China
   Zip: 100085
   Email: lidefeng@huawei.com

Full Copyright Statement

   Copyright (C) The Internet Society (2001).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain
   it or assist in its implementation may be prepared, copied,
   published and distributed, in whole or in part, without restriction
   of any kind, provided that the above copyright notice and this
   paragraph are included on all such copies and derivative works.
   However, this document itself may not be modified in any way, such
   as by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the
   procedures for copyrights defined in the Internet Standards process
   must be followed, or as required to translate it into languages
   other than English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on
   an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.