NVO3                                                          Weiguo Hao
                                                               Lucy Yong
                                                               Yizhou Li
Internet Draft                                                    Huawei
                                                               Feng Wang
                                                                     H3C
                                                                 W. Shao
                                                                 Tencent
                                                                 Vic Liu
                                                            China Mobile
Intended status: Informational                             June 30, 2014
Expires: December 2014


                      NVO3 Anycast Layer 3 Gateway
                    draft-hao-nvo3-anycast-gw-00.txt


Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   This document may not be modified, and derivative works of it may
   not be created, and it may not be published except as an
   Internet-Draft.

   This document may not be modified, and derivative works of it may
   not be created, except to publish it as an RFC and to translate it
   into languages other than English.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

Hao, et al.             Expires December 30, 2014               [Page 1]

Internet-Draft        NVO3 anycast Layer 3 Gateway             June 2014

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.
   It is inappropriate to use Internet-Drafts as reference material or
   to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on December 30, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Abstract

   This draft describes a centralized anycast layer 3 gateway solution
   for NVO3 networks that interwork with external networks. Compared
   to the traditional VRRP based active-standby layer 3 gateway
   solution, this solution achieves better load balancing and
   scalability.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Anycast Layer 3 Gateway
   4. ARP Handling
   5. Resilience on Gateway Node Failure
   6. Anycast GW and VRRP GW Comparison
      6.1. VRRP based layer 3 gateway solution
      6.2. Comparison
   7. Security Considerations
   8. IANA Considerations
      8.1. Normative References
      8.2. Informative References
   9. Acknowledgments

1. Introduction

   NVO3 overlay networks provide network connectivity to a set of
   Tenant Systems (TSs) [NVO3FRWK]. A data center (DC) may support
   many tenant networks [NVO3PS]. Tenant Systems often need to
   communicate with external networks. An external network may be
   another overlay network in the DC, a VPN in the WAN, or the
   Internet. In this case, a gateway (GW) is required where
   inter-network policies are placed and enforced.

   Figure 1 illustrates a popular DC infrastructure in which two DC GWs
   are deployed at the DC border. All tenant system traffic entering
   or leaving the DC, and all traffic between overlay VNs, passes
   through the DC GWs. In a large DC that supports many tenant
   networks, these GWs can become a scalability bottleneck. Although
   VRRP [RFC2338] [RFC3768] [RFC5798] may be used at the GWs to provide
   link and node redundancy, it does not resolve the scalability issue.
   Distributed GWs may be implemented on NVEs [NVO3ARCH], which may
   reduce the traffic passing through the DC GWs; however, all traffic
   entering or leaving the DC still has to go through them.

                           ,---------.
                         ,'           `.
                        (  IP/MPLS WAN  )
                         `.           ,'
                           `-+------+'
                           *          *
                        *                *
                     *                      *
                ---------                ---------
                |  GW1  |                |  GW2  |
                |       |  ************  |       |
                ---------                ---------
              *        *                  *        *
           *            *                *            *
   ---------        ---------        ---------        ---------
   | TOR1  |********| TOR2  |********| TOR3  |********| TOR4  |
   |       |        |       |        |       |        |       |
   ---------        ---------        ---------        ---------
       |                |                |                |
       |                |                |                |
   ---------        ---------        ---------        ---------
   | NVE1  |        | NVE2  |        | NVE3  |        | NVE4  |
   |       |        |       |        |       |        |       |
   ---------        ---------        ---------        ---------
     |   |            |   |            |   |            |   |
   ____ ____        ____ ____        ____ ____        ____ ____
   |T | |T |        |T | |T |        |T | |T |        |T | |T |
   |S1| |S2|        |S3| |S4|        |S5| |S6|        |S7| |S8|
   ---- ----        ---- ----        ---- ----        ---- ----

        Figure 1: Centralized layer 3 gateway in NVO3 network

   This draft proposes an anycast layer 3 gateway solution for DC GWs
   that addresses the scalability concern. To differentiate it from
   the distributed GWs in NVO3, the DC GW is referred to as a
   centralized GW. A centralized GW is a gateway network device with
   embedded NVE capability, i.e., the ability to maintain the
   inner/outer mapping and to terminate overlay tunnels.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].
   The acronyms and terminology in [RFC6325] are used herein with the
   following additions:

   Network Virtualization Edge (NVE) - An NVE is the network entity
      that sits at the edge of an underlay network and implements
      network virtualization functions.

   Tenant System - A physical or virtual system that can play the role
      of a host, or a forwarding element such as a router, switch,
      firewall, etc. It belongs to a single tenant and connects to one
      or more VNs of that tenant.

   VN - A VN is a logical abstraction of a physical network that
      provides L2 network services to a set of Tenant Systems.

3. Anycast Layer 3 Gateway

   In the anycast layer 3 gateway solution, multiple GW network devices
   provide GW functions between overlay VNs and external networks, and
   all of them share the same gateway IP and MAC address for each
   overlay VN. This shared address pair is called the gateway anycast
   IP and MAC address. To ensure that NVO3 traffic from ingress NVEs
   is load balanced across these gateways, the gateways also share the
   same outer IP address, which this document calls the device
   underlying anycast IP address.

   The gateway anycast IP address is used as the default gateway's IP
   address for all TSs in the corresponding VN. Because different VNs
   are allowed to have overlapping MAC address spaces, different
   gateway anycast IP addresses can map to the same anycast MAC
   address. That is, each VN should have a unique gateway anycast IP
   address, but the gateway MAC addresses of different VNs may all be
   the same anycast MAC address. For simplicity, it is recommended to
   configure a single anycast MAC address on each gateway device as the
   gateway MAC address of all VNs.

   When sending traffic toward a VN gateway on the GW devices, ingress
   NVEs use the device underlying anycast IP address as the outer
   destination IP address of the NVO3 packets. The VN may be an L2 VN
   or an L3 VN.

   Each GW network device announces the device underlying anycast IP
   address into the underlying IGP. If the gateways have the same
   routing cost to an ingress NVE, equal-cost multi-path (ECMP)
   forwarding in the underlay distributes the NVO3 traffic from that
   ingress NVE among the GW devices.

   When sending traffic toward a tenant system in a VN, a VN GW on a GW
   device obtains the mapping between the tenant system and its
   attached NVE from a table lookup.
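
   As an illustration only, the ingress-side behavior described above
   (a shared outer destination address plus ECMP in the underlay) can
   be sketched in Python. This is a minimal sketch under stated
   assumptions, not part of the solution itself: the Gateway class, the
   function names, the example addresses (taken from documentation
   ranges), and the per-flow SHA-256 hash are all hypothetical. The
   draft does not specify which ECMP hash the underlay uses, and in a
   real network the choice is made hop by hop by underlay routers
   rather than by a single function.

```python
import hashlib
from dataclasses import dataclass

# Assumed example value from a documentation range; not from the draft.
DEVICE_ANYCAST_IP = "192.0.2.1"


@dataclass(frozen=True)
class Gateway:
    name: str       # hypothetical device name
    igp_cost: int   # underlay routing cost seen from the ingress NVE


def outer_header(ingress_nve_ip):
    """Outer IP header an ingress NVE uses for traffic toward a VN gateway."""
    return {"src": ingress_nve_ip, "dst": DEVICE_ANYCAST_IP}


def ecmp_select(flow, gateways):
    """Choose one lowest-cost gateway for a flow.

    Assumption: the underlay hashes per flow (here SHA-256 over the
    flow tuple), so packets of one flow always reach the same device.
    """
    best = min(g.igp_cost for g in gateways)
    equal_cost = sorted((g for g in gateways if g.igp_cost == best),
                        key=lambda g: g.name)
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return equal_cost[int.from_bytes(digest[:4], "big") % len(equal_cost)]


# GW1 and GW2 advertise the anycast address at equal cost; GW3 is farther.
gateways = [Gateway("GW1", 10), Gateway("GW2", 10), Gateway("GW3", 20)]

# 200 flows that differ only in source port: (src, dst, proto, sport, dport).
flows = [("10.0.0.1", "198.51.100.7", 6, 40000 + i, 443) for i in range(200)]
chosen = [ecmp_select(f, gateways) for f in flows]
used = {g.name for g in chosen}
print(sorted(used))
```

   Because every GW device advertises the same outer address, the
   ingress NVE in this sketch keeps no per-gateway state; which device
   receives a given flow is decided entirely by the underlay, and only
   devices at the lowest routing cost are candidates.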
   If the VN is an L3 VN, the GW device encapsulates the packet with
   the NVE IP address as the outer destination IP address and its
   device underlying anycast IP address as the outer source IP address.
   If the VN is an L2 VN, the GW device first inserts an inner MAC
   header with its anycast MAC address as the source MAC address and
   the tenant system's MAC address as the destination MAC address; it
   then encapsulates the packet with the NVE IP address as the outer
   destination IP address and its device underlying anycast IP address
   as the outer source IP address.

   The NVA [NVO3ARCH] maintains the TS/NVE mappings for each VN and
   pushes them to the NVEs and GW network devices. To support the
   anycast L3 GW, the NVA holds the mapping between the VN GW anycast
   IP address and the device underlying anycast IP address for an L3
   VN, or the mapping between the VN GW anycast MAC address and the
   device underlying anycast IP address for an L2 VN.

4. ARP Handling

   To avoid flooding ARP requests in each VN, NVEs can use the mapping
   information from a Network Virtualization Authority (NVA) to respond
   to ARP requests. For an L3 VN, upon receiving an ARP request for a
   VN GW anycast IP address, the local NVE intercepts it and uses its
   own MAC address in the reply. For an L2 VN, upon receiving an ARP
   request for a VN GW anycast IP address, the local NVE snoops it and
   uses the VN GW anycast MAC address in the reply. Note that NVEs may
   maintain the mapping between the VN GW anycast IP and MAC addresses
   locally, or obtain it from the NVA.

5. Resilience on Gateway Node Failure

   The anycast L3 gateway solution is resilient to the failure of a GW
   network device. If a GW network device fails, the IGP updates the
   link status and the host routes, so that NVO3 encapsulated traffic
   destined to the device underlying anycast IP address reaches only
   the remaining GW network devices.

6. Anycast GW and VRRP GW Comparison

6.1. VRRP based layer 3 gateway solution

   The Virtual Router Redundancy Protocol (VRRP) [RFC2338] [RFC3768]
   [RFC5798] is designed to eliminate the gateway as a single point of
   failure. VRRP is an election protocol that dynamically assigns
   responsibility for a virtual router to one of the VRRP routers on a
   layer 2 VN. Any of the virtual router's IP addresses on a LAN can
   then be used as the default first hop router by end-hosts. The
   layer 3 gateway acting as VRRP master is responsible for forwarding
   packets destined to the virtual router. If the VRRP master fails,
   the VRRP backup takes over.

   The VRRP based solution has the following issues:

   1. Inefficient network bandwidth usage. Only the VRRP master
      gateway forwards traffic; the VRRP backup is idle most of the
      time.

   2. One VRRP session per VN. A VRRP session among the physical
      layer 3 gateways has to be established for each layer 2 VN. A
      large number of layer 2 VNs causes a heavy CPU workload on each
      layer 3 gateway.

6.2. Comparison

+-------------------------+---------------------+----------------------+
| Dimension               | VRRP                | Anycast gateway      |
|                         |                     | solution             |
+-------------------------+---------------------+----------------------+
| Network bandwidth usage | Low                 | High                 |
+-------------------------+---------------------+----------------------+
| Keep alive workload     | VRRP session per VN | None                 |
+-------------------------+---------------------+----------------------+
| Network resilience      | VRRP switchover     | Underlying network   |
|                         |                     | convergence          |
+-------------------------+---------------------+----------------------+

7. Security Considerations

   N/A

8. IANA Considerations

   N/A

8.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

8.2. Informative References

   [NVO3ARCH] Black, D., Narten, T., et al., "An Architecture for
              Overlay Networks (NVO3)", draft-ietf-nvo3-arch-01, work
              in progress.

   [NVO3FRWK] Lasserre, M., Morin, T., et al., "Framework for DC
              Network Virtualization", draft-ietf-nvo3-framework-03,
              work in progress.

   [RFC2338]  Knight, S., et al., "Virtual Router Redundancy Protocol",
              RFC 2338, April 1998.

   [RFC3768]  Hinden, R., Ed., "Virtual Router Redundancy Protocol
              (VRRP)", RFC 3768, April 2004.

   [RFC5798]  Nadas, S., Ed., "Virtual Router Redundancy Protocol
              (VRRP) Version 3 for IPv4 and IPv6", RFC 5798, March
              2010.

9. Acknowledgments

   The authors wish to acknowledge the important contributions of Zhang
   Chengsong.

Authors' Addresses

   Weiguo Hao
   Huawei Technologies
   101 Software Avenue,
   Nanjing 210012
   China
   Phone: +86-25-56623144
   Email: haoweiguo@huawei.com

   Lucy Yong
   Phone: +1-918-808-1918
   Email: lucy.yong@huawei.com

   Yizhou Li
   Huawei Technologies
   101 Software Avenue,
   Nanjing 210012
   China
   Phone: +86-25-56625375
   Email: liyizhou@huawei.com

   Feng Wang
   H3C Technologies
   Email: imfeng@h3c.com

   Wade Shao
   Tencent
   Email: wadeshao@tencent.com

   Vic Liu
   China Mobile
   32 Xuanwumen West Ave, Beijing, China
   Email: liuzhiheng@chinamobile.com