NVO3                                                          Weiguo Hao
                                                               Lucy Yong
                                                                S. Hares
Internet Draft                                                    Huawei
                                                               R. Raszuk
                                                           Mirantis Inc.
                                                                 L. Fang
                                                               Osama Zia
                                                               Microsoft
                                                          Shahram Davari
                                                                Broadcom
                                                               Andrew Qu
                                                                MediaTec
Intended status: Standard Track                            March 5, 2015
Expires: September 2015


       Inter-AS Option B between NVO3 and BGP/MPLS IP VPN network
                    through centralized architecture
                   draft-hao-nvo3-inter-as-vpn-00.txt

Abstract

   This draft describes a vanilla Inter-AS Option-B interconnection
   between an NVO3 network and an MPLS/IP VPN network using a
   centralized NVE-NVA architecture. The ASBR located in the NVO3
   network is called ASBR-d; NVO3 tunnel and MPLS tunnel stitching is
   performed on ASBR-d. No distributed BGP VPN protocol (RFC4364) runs
   between the NVEs and ASBR-d in the NVO3 network; instead, the NVEs
   and ASBR-d are controlled by a centralized NVA.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

Hao, et al.            Expires September 5, 2015               [Page 1]

Internet-Draft             Inter-As Option-B                 March 2015

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Reference model
   4. Option-A inter-as solution overview
   5. Vanilla Option-B inter-as solution overview
   6. Vanilla Inter-As Option-B Architecture
   7. Vanilla Inter-As Option-B Procedures
      7.1. Control plane procedures
         7.1.1. DC to WAN direction
         7.1.2. WAN to DC direction
      7.2. Data plane procedures
         7.2.1. DC to WAN direction
         7.2.2. WAN to DC direction
   8. Partial Option-B solution
         8.1.1. Control plane procedures
         8.1.2. Data plane procedures
   9. Inter-as option comparisons
   10. Security Considerations
   11. IANA Considerations
   12. References
      12.1. Normative References
      12.2. Informative References
   13. Acknowledgments

1. Introduction

   In the cloud computing era, multi-tenancy has become a core
   requirement for data centers. Because NVO3 satisfies the key
   multi-tenancy requirements, it is being deployed in an increasing
   number of cloud data center networks.

   NVO3 focuses on the construction of overlay networks that operate
   over an IP (L3) underlay transport network. It can provide layer 2
   bridging and layer 3 IP service for each tenant. VXLAN and NVGRE are
   two typical NVO3 technologies. An NVO3 overlay network can be
   controlled either through a centralized NVE-NVA architecture or
   through a distributed BGP VPN protocol.

   NVO3 scales well from relatively small networks to networks with
   several million tenant systems (TSs) and hundreds of thousands of
   virtual networks within a single administrative domain. In an NVO3
   network, a 24-bit VNID identifies each virtual network, so in theory
   up to 16M (2^24) virtual networks can be supported in a data center.
   In a data center network, each tenant may include one or more layer
   2 virtual networks, and in normal cases each tenant corresponds to
   one routing domain. Normally each layer 2 virtual network
   corresponds to one or more subnets.

   To provide cloud services to clients outside the data center, data
   center networks need to be connected to WAN networks. BGP MPLS/IP
   VPN is already widely deployed in WAN networks. Normally the
   internal data center network and the external MPLS/IP VPN network
   belong to different autonomous systems (ASes). This requires setting
   up Inter-AS connections at the Autonomous System Border Routers
   (ASBRs) between the NVO3 network and the external MPLS/IP network.

   Currently, a typical connection mechanism between a data center
   network and an MPLS/IP VPN network is similar to Inter-AS Option-A
   of RFC4364, but it has scalability issues when the data center
   network hosts a very large number of tenants. To overcome these
   issues, this draft proposes Inter-AS Option-B between an NVO3
   network and a BGP MPLS/IP VPN network.

2. Conventions used in this document

   Network Virtualization Edge (NVE) - An NVE is the network entity
   that sits at the edge of an underlay network and implements network
   virtualization functions.

   Tenant System - A physical or virtual system that can play the role
   of a host, or a forwarding element such as a router, switch,
   firewall, etc. It belongs to a single tenant and connects to one or
   more VNs of that tenant.

   VN - A VN is a logical abstraction of a physical network that
   provides L2 network services to a set of Tenant Systems.

   RD - Route Distinguisher. RDs are used to maintain uniqueness among
   identical routes in different VRFs. The route distinguisher is an
   8-octet field prefixed to the customer's IP address. The resulting
   12-octet field is a unique "VPN-IPv4" address.

   RT - Route Target. RTs are used to control the import and export of
   routes between different VRFs.

3. Reference model

   +---------------------------------------------------+
   |  +----+                    AS1                    |
   |  | TS1| -                                         |
   |  +----+   -                                       |
   |             - +----+    +----+                    |
   |             - |NVE1| -- |TOR1|---------------+    |
   |  +----+   -   +----+    +----+               |    |
   |  | TS2|-                                     |    |
   |  +----+                                      |    |
   |                                         +-------+ |
   |                            +------------| ASBR-d|-|--|
   |  +----+                    |            +-------+ |  |
   |  | TS3| -                  |                      |  |
   |  +----+   -                |                      |  |
   |             - +----+    +----+                    |  |
   |             - |NVE2| -- |TOR2|                    |  |
   |  +----+   -   +----+    +----+                    |  |
   |  | TS4|-                                          |  |
   |  +----+                                           |  |
   +---------------------------------------------------+  |
                                                          |
   +---------------------------------------------------+  |
   |                            AS2                    |  |
   |  +----+                                           |  |
   |  | CE1| -                                         |  |
   |  +----+   -                                       |  |
   |             - +----+                    +-------+ |  |
   |             - | PE1| -------------------| ASBR-w|-|--|
   |  +----+   -   +----+                    +-------+ |
   |  | CE2|-                                          |
   |  +----+                                           |
   +---------------------------------------------------+

                       Figure 1 Reference model

   Figure 1 shows an arbitrary multi-AS VPN interconnectivity scenario
   between an NVO3 network and a BGP MPLS/IP VPN network. NVE1, NVE2,
   and ASBR-d form the NVO3 overlay network inside the DC. TS1 and TS2
   connect to NVE1; TS3 and TS4 connect to NVE2. PE1 and ASBR-w form
   the MPLS/IP VPN network external to the DC. CE1 and CE2 connect to
   PE1.
   The NVO3 network belongs to AS 1, and the MPLS/IP VPN network
   belongs to AS 2.

   There are two tenants in the NVO3 network. TSs in tenant 1 can
   freely communicate with CEs in VPN-Red, and TSs in tenant 2 can
   freely communicate with CEs in VPN-Green. TS1 and TS3 belong to
   tenant 1; TS2 and TS4 belong to tenant 2. CE1 belongs to VPN-Red and
   CE2 belongs to VPN-Green. VNID 10 and VNID 20 identify tenant 1 and
   tenant 2 respectively.

4. Option-A inter-as solution overview

   In the Option-A Inter-AS solution, peering ASBRs are connected by
   multiple sub-interfaces. Each ASBR acts as a PE and treats the other
   ASBR as a CE. Virtual routing and forwarding (VRF) databases
   (RIB/FIB) are configured on the AS border routers (ASBR-d and
   ASBR-w) so that each ASBR associates each such sub-interface with a
   VRF and uses EBGP to distribute unlabeled IPv4 addresses to its
   peer. In the data plane, VLANs are used for tenant traffic
   separation. ASBR-d terminates the NVO3 encapsulation for
   inter-subnet traffic from a TS inside the DC to a CE outside the DC.

   The Option-A Inter-AS solution has the following issues:

   1. Up to 16 million (16M) gateway interfaces (virtual/physical) and
      16M EBGP sessions need to exist between the ASBRs.

   2. Up to 16M VRFs need to be supported on the border routers.

   3. Several million routing entries need to be supported on the
      border routers.

   Inter-AS Option-B between the NVO3 network and the MPLS/IP VPN
   network can be used to address these issues. Because the Option-B
   proposed in this draft interconnects heterogeneous networks across
   ASes, it differs in some respects from the traditional Inter-AS
   Option-B of RFC4364.

5. Vanilla Option-B inter-as solution overview

   Similar to the solution described in section 10, part (b) of
   [RFC4364] (commonly referred to as Option-B), the traffic that flows
   between ASBR-d and ASBR-w is placed in MPLS tunnels.
   Traffic separation among different VPNs between the ASBRs relies on
   the MPLS VPN Label. The advantage of this option is that it is more
   scalable, as there is no need for a separate interface and BGP
   session per VPN/tenant.

   For route distribution from the DC to the WAN side, an MPLS VPN
   Label is allocated on ASBR-d per VN per NVE. For route distribution
   from the WAN to the DC side, a VNID is allocated on ASBR-d for each
   MPLS VPN Label received from ASBR-w.

   In the data plane, the NVO3 tunnel and the MPLS VPN tunnel are
   stitched at ASBR-d. From the DC to the WAN side, the NVO3 tunnel is
   terminated, and the VNID is switched to an MPLS VPN Label by looking
   up the outgoing forwarding table described in section 7.1.2. From
   the WAN to the DC side, the MPLS VPN tunnel is terminated, and the
   MPLS VPN Label is switched to an NVO3 tunnel by looking up the
   incoming forwarding table described in section 7.1.1.

   ASBR-w behaves exactly as in traditional RFC4364-based Option-B. No
   VRF is created on ASBR-d.

6. Vanilla Inter-As Option-B Architecture

   Each NVE operates as a gateway for its locally connected TS(s).
   There must be a separate VNID to identify each tenant. VRFs can be
   created on each NVE to isolate the IP forwarding process between
   different tenants.

   No distributed BGP VPN protocol (RFC4364) runs between the NVEs and
   ASBR-d in the NVO3 network; the NVEs and ASBR-d are controlled by a
   centralized NVA. The NVA runs the EBGP VPN protocol with the peer
   ASBR-w and exchanges VPN routing information between the NVO3
   network and the MPLS/IP VPN network.

   The NVA maintains tenant information for all tenants. This
   information includes the tenant identification VN ID, RD and RT. The
   RD and RT can be generated automatically from the VN ID; the RT on
   the NVA must be consistent with the RT configured on the remote MPLS
   VPN PEs of the same VPN. The NVA also maintains, for each tenant,
   the MAC/IP address of every TS and the NVE to which it is attached.

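   As an illustration of the allocation rules and the tunnel stitching
   outlined in section 5, the forwarding state that ASBR-d needs can be
   sketched in Python as below. This is a minimal sketch, not part of
   the proposed solution: the class name StitchingTables, its method
   names, and the sequential allocation starting at 1000 and 10000 are
   all assumptions made for illustration, since the draft does not
   prescribe how label or VNID values are chosen.

```python
from typing import Dict, Tuple


class StitchingTables:
    """Hypothetical model of the forwarding state used for stitching at ASBR-d."""

    def __init__(self, first_label: int = 1000, first_vnid: int = 10000) -> None:
        # WAN -> DC: MPLS VPN label -> (NVE, tenant VNID).
        self.incoming: Dict[int, Tuple[str, int]] = {}
        # DC -> WAN: locally significant VNID -> MPLS VPN label from ASBR-w.
        self.outgoing: Dict[int, int] = {}
        self._label_of: Dict[Tuple[str, int], int] = {}
        self._vnid_of: Dict[int, int] = {}
        # Assumption: values are handed out sequentially from a start value.
        self._next_label = first_label
        self._next_vnid = first_vnid

    def allocate_label(self, nve: str, tenant_vnid: int) -> int:
        """Allocate one MPLS VPN label per tenant per NVE (idempotent)."""
        key = (nve, tenant_vnid)
        if key not in self._label_of:
            label = self._next_label
            self._next_label += 1
            self._label_of[key] = label
            self.incoming[label] = key
        return self._label_of[key]

    def allocate_vnid(self, remote_label: int) -> int:
        """Allocate one VNID per MPLS VPN label received from ASBR-w."""
        if remote_label not in self._vnid_of:
            vnid = self._next_vnid
            self._next_vnid += 1
            self._vnid_of[remote_label] = vnid
            self.outgoing[vnid] = remote_label
        return self._vnid_of[remote_label]

    def dc_to_wan(self, vnid: int) -> int:
        """NVO3 tunnel terminated: map the VNID to the MPLS VPN label to push."""
        return self.outgoing[vnid]

    def wan_to_dc(self, label: int) -> Tuple[str, int]:
        """MPLS tunnel terminated: map the label to the target NVE and VNID."""
        return self.incoming[label]


tables = StitchingTables()
label = tables.allocate_label("NVE1", 10)  # tenant with VNID 10 behind NVE1
vnid = tables.allocate_vnid(3000)          # label 3000 learned from ASBR-w
print(label, tables.wan_to_dc(label))
print(vnid, tables.dc_to_wan(vnid))
```

   Keeping the two directions in separate tables mirrors the text: the
   incoming table is keyed by MPLS VPN Label and the outgoing table by
   VNID, so neither lookup requires a VRF on ASBR-d.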
          ------         EBGP         --------
          |NVA | -------------------- |ASBR-w|
          ------                      --------
            .
            .   Southbound interface (OpenFlow, OVSDB, I2RS)
      ........................
      .          .           .
      .          .           .
      .          .           .
    ------     ------     --------
    |NVE1|     |NVE2|     |ASBR-d|
    ------     ------     --------

                   Figure 2 NVE-NVA Architecture

7. Vanilla Inter-As Option-B Procedures

7.1. Control plane procedures

7.1.1. DC to WAN direction

   1. The NVA allocates an MPLS VPN Label per tenant per NVE. Each
      allocated MPLS VPN Label and its corresponding NVE + VNID pair
      form an entry of the incoming forwarding table [Table 1], which
      is used to forward MPLS traffic from the WAN to the DC side.

      +--------------------+------------------+
      |   MPLS VPN Label   |    NVE + VNID    |
      +--------------------+------------------+
      |        1000        |    NVE1 + 10     |
      +--------------------+------------------+
      |        2000        |    NVE1 + 20     |
      +--------------------+------------------+
      |        1001        |    NVE2 + 10     |
      +--------------------+------------------+
      |        2001        |    NVE2 + 20     |
      +--------------------+------------------+

              Table 1 Incoming forwarding table

   2. The NVA advertises all internal data center VPN routing
      information [Table 2] to the peer ASBR-w. Each route includes the
      RD, IP prefix, RT, and MPLS VPN Label.

      +--------+--------+-----------+----------------+
      |   RD   |   RT   | IP Prefix | MPLS VPN Label |
      +--------+--------+-----------+----------------+
      |  RD-A  |  RT-A  |  TS1 IP   |      1000      |
      +--------+--------+-----------+----------------+
      |  RD-A  |  RT-A  |  TS3 IP   |      1001      |
      +--------+--------+-----------+----------------+
      |  RD-B  |  RT-B  |  TS2 IP   |      2000      |
      +--------+--------+-----------+----------------+
      |  RD-B  |  RT-B  |  TS4 IP   |      2001      |
      +--------+--------+-----------+----------------+

           Table 2 VPN routing information from DC side

   3. The NVA downloads the incoming forwarding table [Table 1] to
      ASBR-d.

7.1.2. WAN to DC direction

   1. The NVA receives VPN routing information from the peer ASBR-w.
      Assuming ASBR-w allocates MPLS VPN Labels 3000 and 4000 for
      VPN-Red and VPN-Green at PE1, the VPN routing information
      received from ASBR-w is as follows:

      +--------+--------+-----------+----------------+
      |   RD   |   RT   | IP Prefix | MPLS VPN Label |
      +--------+--------+-----------+----------------+
      |  RD-C  |  RT-A  |  CE1 IP   |      3000      |
      +--------+--------+-----------+----------------+
      |  RD-D  |  RT-B  |  CE2 IP   |      4000      |
      +--------+--------+-----------+----------------+

           Table 3 VPN routing information from WAN side

   2. The NVA allocates a VN ID for each MPLS VPN Label received from
      ASBR-w. The role of this VNID is similar to that of the incoming
      VPN Label on a traditional MPLS VPN Option-B ASBR as defined in
      [RFC4364]: it has local significance on ASBR-d, and each VNID
      corresponds to one MPLS VPN Label received from the peer ASBR-w.
      Each allocated VNID and its corresponding out VPN Label form an
      entry of the outgoing forwarding table [Table 4], which is used
      to forward NVO3 traffic from the DC to the WAN side.

      +------------------+--------------------+
      |       VNID       |   Out VPN Label    |
      +------------------+--------------------+
      |      10000       |        3000        |
      +------------------+--------------------+
      |      10001       |        4000        |
      +------------------+--------------------+

              Table 4 Outgoing forwarding table

   3. The NVA downloads the outgoing forwarding table [Table 4] to
      ASBR-d.

   4. The NVA matches the local Route Target configuration, imports
      each VPN route into the corresponding tenant, and downloads the
      routing table to the corresponding NVE. The routing table is used
      for forwarding traffic to the WAN side.

7.2. Data plane procedures

   This section describes the step-by-step procedures for forwarding
   data between TS1 and CE1 for either:

   a) DC to WAN direction IP data flows, or

   b) WAN to DC direction IP data flows.

7.2.1. DC to WAN direction

   1. TS1 sends traffic to NVE1; the destination IP is CE1's IP
      address.

   2. NVE1 looks up tenant 1's IP forwarding table and obtains the
      NVO3 tunnel encapsulation information.
      The destination outer address is ASBR-d's IP address, and the
      VNID is 10000, allocated by ASBR-d for the VPN route of CE1
      received from ASBR-w. NVE1 performs NVO3 encapsulation and sends
      the traffic to ASBR-d.

   3. ASBR-d removes the NVO3 encapsulation and obtains VNID 10000. It
      then looks up the outgoing forwarding table based on the VNID and
      obtains MPLS VPN Label 3000. Finally it pushes the MPLS VPN Label
      onto the IP traffic and sends it to ASBR-w.

   4. The traffic is then forwarded to CE1 through the regular MPLS
      VPN forwarding process.

7.2.2. WAN to DC direction

   1. CE1 sends traffic to PE1; the destination IP is TS1's IP
      address. The traffic is forwarded to ASBR-d through the regular
      MPLS VPN forwarding process. The incoming MPLS VPN Label at
      ASBR-d is 1000, allocated by ASBR-d for tenant 1 at NVE1.

   2. ASBR-d looks up the incoming forwarding table and obtains the
      NVO3 encapsulation information, then performs NVO3 encapsulation
      and sends the traffic to NVE1. The destination outer IP is NVE1's
      IP, and the VNID is 10, corresponding to tenant 1.

   3. NVE1 removes the NVO3 encapsulation, selects the local IP
      forwarding table based on VNID 10, and then sends the traffic to
      TS1.

8. Partial Option-B solution

   In this solution, a VRF is created for each tenant on ASBR-d, and no
   direct stitching of NVO3 tunnels and MPLS tunnels is performed on
   ASBR-d.

8.1.1. Control plane procedures

   In the direction from the DC to the WAN side, the NVA does not need
   to allocate an MPLS VPN Label per tenant per NVE; it only needs to
   allocate one per tenant.

   In the direction from the WAN to the DC side, the NVA does not
   allocate a new VNID for each MPLS VPN Label received from ASBR-w;
   the VPN routes from the WAN side are installed in the local VRF.

8.1.2. Data plane procedures

   In the direction from the DC to the WAN side, IP routing is
   performed: the VRF is selected based on the VNID, and the traffic is
   then MPLS encapsulated and sent to the peer ASBR-w.

   In the direction from the WAN to the DC side, the MPLS tunnel is
   terminated, the IP routing table is looked up, and the traffic is
   then NVO3 encapsulated and sent to the destination NVE.

9. Inter-as option comparisons

   This document describes several Inter-AS implementation options
   between ASBR-d and ASBR-w. The following table compares them.

   +----------------+-----------+------------------+----------------+
   |                | Option-A  | Partial Option-B |Vanilla Option-B|
   +----------------+-----------+------------------+----------------+
   | Sub-interface  |    Yes    |        No        |       No       |
   +----------------+-----------+------------------+----------------+
   | VRF            |    Yes    |       Yes        |       No       |
   +----------------+-----------+------------------+----------------+
   | Scalability    |   Worst   |      Middle      |      Best      |
   +----------------+-----------+------------------+----------------+
   | Hardware       |           |                  |                |
   | Implementation |           |                  |                |
   | at ASBR-d      |No Upgrade |    No Upgrade    |  Need Upgrade  |
   +----------------+-----------+------------------+----------------+

                 Table 5 Inter-as option comparisons

   The Option-A design uses a regular VPN handoff between ASBR-d and
   ASBR-w. One sub-interface is required per NVO3 instance between
   them, and both border routers perform the VRF lookup. The solution
   therefore has a scalability concern. Existing hardware supports this
   solution.

   Partial Option-B does not require sub-interfaces between ASBR-d and
   ASBR-w, and only ASBR-d performs the VRF lookup, so it scales better
   than Option-A. Existing hardware can support this solution.

   In the vanilla Option-B solution, there is no sub-interface between
   the border routers and no VRF table on ASBR-d or ASBR-w. Tunnel
   stitching is performed on ASBR-d. This solution therefore has the
   best scalability. From a hardware perspective, vanilla Option-B
   requires an ASBR-d hardware upgrade to support the tunnel stitching.

10. Security Considerations

   Similar to the security considerations for Inter-AS Option-B in
   [RFC4364], an appropriate trust relationship must exist between the
   NVO3 network and the MPLS/IP VPN network. VPN-IPv4 routes in the
   NVO3 network should neither be distributed to nor accepted from the
   public Internet, or from any BGP peers that are not trusted.

   For other general VPN security considerations, see [RFC4364].

11. IANA Considerations

   This document requires no IANA actions. RFC Editor: Please remove
   this section before publication.

12. References

12.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
             Networks (VPNs)", RFC 4364, February 2006.

   [RFC5512] Mohapatra, P. and E. Rosen, "The BGP Encapsulation
             Subsequent Address Family Identifier (SAFI) and the BGP
             Tunnel Encapsulation Attribute", RFC 5512, April 2009.

12.2. Informative References

   [NVA]     Black, D., et al., "An Architecture for Overlay Networks
             (NVO3)", draft-ietf-nvo3-arch-01, February 14, 2014.

   [RFC7047] Pfaff, B. and B. Davie, "The Open vSwitch Database
             Management Protocol", RFC 7047, December 2013.

   [OpenFlow1.3]
             OpenFlow Switch Specification Version 1.3.0 (Wire
             Protocol 0x04), June 25, 2012.
             (https://www.opennetworking.org/images/stories/downloads/
             sdn-resources/onf-specifications/openflow/
             openflow-spec-v1.3.0.pdf)

13. Acknowledgments

   The authors would like to thank Xiaohu Xu, Liang Xia, Shunwan
   Zhang, Yizhou Li, and Lili Wang for their valuable input.

Authors' Addresses

   Weiguo Hao
   Huawei Technologies
   101 Software Avenue,
   Nanjing 210012
   China
   Phone: +86-25-56623144
   Email: haoweiguo@huawei.com

   Lucy Yong
   Huawei Technologies
   Phone: +1-918-808-1918
   Email: lucy.yong@huawei.com

   Susan Hares
   Huawei Technologies
   Phone: +1-734-604-0323
   Email: shares@ndzh.com

   Robert Raszuk
   Mirantis Inc.
   615 National Ave. #100
   Mt View, CA 94043
   USA
   Email: robert@raszuk.net

   Luyuan Fang
   Microsoft
   Email: lufang@microsoft.com

   Osama Zia
   Microsoft
   Email: osamaz@microsoft.com

   Shahram Davari
   Broadcom
   Email: Davari@Broadcom.com

   Andrew Qu
   MediaTec
   Email: andrew.qu@mediatek.com