Network Working Group                                           L. Yong
Internet Draft                                                    L. Xia
Category: Standards Track                                         Huawei
Expires: May 2014                                      November 11, 2013


                    Network Virtualization Edge (NVE)
                         draft-yong-nvo3-nve-01

Abstract

   This document specifies Network Virtualization Edge (NVE) data
   plane functionality for Network Virtualization Overlays (NVO3).
   These functional specifications are necessary for interoperability
   between an NVE and its attached tenant systems and between NVEs.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 11, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.

Table of Contents

   1. Introduction
      1.1. Conventions used in this document
      1.2. Terminology
   2. NVE Design Considerations
   3. Tenant Systems
      3.1. Virtual Machines and Bare Metal Servers
      3.2. Network Service Appliances
      3.3. Gateways
   4. Network Virtualization Edge (NVE)
      4.1. NVE Service Type
         4.1.1. L2 NVE Service
         4.1.2. L3 NVE Service
         4.1.3. L2/L3 NVE Service
      4.2. Overlay Tunnel between NVEs
      4.3. Multi-Tenancy Support
      4.4. Route Path Control
      4.5. Split-NVE
      4.6. Multi-Homing Support
      4.7. OAM Tools on NVE
   5. Operational Considerations
      5.1. VM Mobility
      5.2. NVE Service Type Choice for a Tenant Network
      5.3. Gateway vs. Distributed Gateway
   6. Security Considerations
   7. Acknowledgements
   8. IANA Considerations
   9. References
      9.1. Normative References
      9.2. Informative References

1. Introduction

   The Network Virtualization Edge (NVE) is a component of Network
   Virtualization Overlay (NVO3) technology.  This component is
   described in the NVO3 framework [NVO3FRWK] and architecture
   [NVO3ARCH].  This document specifies NVE data plane functionality.
   These functional specifications are necessary for interoperability
   between an NVE and its attached tenant systems and between NVEs.

   The data plane functionality described in this document is
   independent of the NVE or NVO3 control plane functionality; control
   plane functionality is therefore outside the scope of this
   document.  However, the specifications in this document can support
   any control plane solution and are helpful for control plane
   protocol development.

   The essence of NVE data plane functionality is packet forwarding.
   An NVE receives a packet from a tenant system via a virtual access
   point (VAP), processes it, and forwards it either to a peer NVE via
   an overlay tunnel or to a local VAP; it likewise receives a packet
   from a peer NVE, processes it, and forwards it to a tenant system
   via a VAP.  In this process, an NVE may modify the packet header
   and/or insert or remove a tunnel header prior to forwarding.  The
   tunnel encapsulation protocol is outside the scope of this
   document.
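   This forwarding behavior can be summarized with a short sketch.
   The following Python fragment is illustrative only and is not part
   of any NVO3 specification; the table layouts, the VAP and VN ID
   representations, and the helper names are assumptions made purely
   for illustration.

      # Illustrative sketch of the NVE forwarding behavior described
      # above.  Table layouts and field names are assumptions.
      from dataclasses import dataclass

      @dataclass
      class Frame:
          dst: str          # inner destination (MAC or IP, per service)
          payload: bytes

      @dataclass
      class Encapsulated:
          outer_dst: str    # IP address of the peer NVE
          vn_id: int        # VN ID carried in the overlay header
          inner: Frame

      class NVE:
          def __init__(self):
              self.vap_to_vni = {}   # VAP -> VN ID
              self.fib = {}          # (VN ID, dst) -> ("vap", vap) or
                                     #                 ("nve", peer NVE IP)

          def from_tenant(self, vap, frame):
              """Packet received from a tenant system on a VAP."""
              vni = self.vap_to_vni[vap]
              kind, nexthop = self.fib[(vni, frame.dst)]
              if kind == "vap":
                  return ("deliver", nexthop, frame)   # local VAP
              # overlay tunnel toward the peer NVE
              return ("tunnel", Encapsulated(nexthop, vni, frame))

          def from_underlay(self, pkt):
              """Packet received from a peer NVE over an overlay tunnel."""
              kind, nexthop = self.fib[(pkt.vn_id, pkt.inner.dst)]
              assert kind == "vap"   # after decapsulation, receiver is local
              return ("deliver", nexthop, pkt.inner)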
   In order for the NVO3 data plane to work properly, some
   configuration on NVEs is necessary.  It can be done manually or
   automatically; how this configuration is performed is outside the
   scope of this document.

1.1. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119
   [RFC2119].

1.2. Terminology

   This document uses the terms defined in the NVO3 framework
   [NVO3FRWK] and architecture [NVO3ARCH] documents.

2. NVE Design Considerations

   The NVE design goals are:

   1. The solution supports multi-tenancy over a common underlying
      network.

   2. The solution supports different types of tenant systems and
      requires no change to tenant system configuration and behavior.

   3. The solution is agnostic to NVE location, i.e. it works
      regardless of whether the NVE is co-located with the tenant
      systems on a server/device or physically separated from them.

   4. No change to tenant system configuration and behavior is
      required whether a gateway or a distributed gateway is used.

   5. The solution must support virtual machine (VM) mobility.

   6. The solution must be scalable: it must support a VN having many
      NVEs, each of which may have many attached tenant systems, and
      it must support an NVE being a member of several VNs with many
      attached tenant systems belonging to the same or different VNs.

   Note that the NVO3 architecture [NVO3ARCH] defines both NVE and NVA
   entities; achieving goals 5 and 6 depends on both the NVE and the
   NVA.  This document focuses only on NVE data plane functionality.
   The interactions between NVE and NVA, between NVAs, and between
   hypervisor and NVE are outside the scope of this document.

3. Tenant Systems

   The NVO3 architecture [NVO3ARCH] defines several types of tenant
   systems.  The following sections describe these tenant systems in
   terms of their role and networking behavior.

3.1. Virtual Machines and Bare Metal Servers

   A tenant system may be a virtual machine on a server or a bare
   metal server.  For a virtual machine, a guest OS runs on the tenant
   system and application software runs on top of the guest OS.  For a
   bare metal server, a host OS runs on the server and application
   software runs on top of it.  The networking behavior of such a
   tenant system is summarized below; a sketch follows the list.

   . A tenant system (TS) is configured (manually or automatically)
     with a subnet such as 10.1.1.0/24, a default gateway (GW) IP
     address, and the TS MAC and IP addresses.  Note that the TS IP
     address MUST be an address in the subnet.  How these addresses
     are configured is outside the scope of this document.

   . A tenant system learns the MAC address of a peer in the same
     subnet by using the ARP protocol and may also learn it from the
     source MAC address of received packets.

   . A tenant system learns the GW MAC address by using the ARP
     protocol.  The GW entity MUST support the ARP protocol.

   . A tenant system may cache destination IP-to-MAC bindings of
     interest for packet forwarding.

   . For intra-subnet forwarding, a tenant system puts the destination
     TS MAC address on the packet.

   . For inter-subnet forwarding, a tenant system puts the GW MAC
     address on the packet.

   . A tenant system may filter received packets and accept only those
     whose destination MAC address matches its own MAC address.
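   The intra-subnet vs. inter-subnet behavior in the list above can be
   sketched as follows.  This Python fragment is illustrative only;
   the addresses, the ARP cache representation, and the function name
   are assumptions made for the example.

      # Illustrative sketch: a TS puts the peer's MAC on intra-subnet
      # frames and the configured GW MAC on inter-subnet frames.
      import ipaddress

      def choose_dst_mac(dst_ip, subnet, gw_mac, arp_cache):
          """Return the destination MAC a tenant system writes on a frame."""
          if ipaddress.ip_address(dst_ip) in ipaddress.ip_network(subnet):
              return arp_cache[dst_ip]   # intra-subnet: MAC learned via ARP
          return gw_mac                  # inter-subnet: default GW MAC

      # Example with the subnet used above (10.1.1.0/24); MACs are made up.
      arp_cache = {"10.1.1.20": "00:00:5e:00:53:20"}
      assert choose_dst_mac("10.1.1.20", "10.1.1.0/24",
                            "00:00:5e:00:53:01", arp_cache) == "00:00:5e:00:53:20"
      assert choose_dst_mac("192.0.2.7", "10.1.1.0/24",
                            "00:00:5e:00:53:01", arp_cache) == "00:00:5e:00:53:01"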
3.2. Network Service Appliances

   A network service appliance, such as a load balancer or a firewall,
   can act as a tenant system and provide a service to one or more VNs
   (via distinct VAPs).  A network service appliance can be
   implemented on a physical device, a bare metal server, or a virtual
   machine on a server; the last is often referred to as a Virtualized
   Network Function (VNF).

   A tenant system acting as a network service appliance may have a
   configuration and behavior different from those of a host as
   described in Section 3.1.  The tenant system may attach to an NVE
   via a local VLAN interface, an IP interface, or a directly attached
   port/vport, and act as a middle box in a VN.  Typically, the
   configuration or policy on the local NVE determines which VN
   traffic or traffic flows are forwarded to the tenant system; in
   other words, the local NVE does not forward packets to the tenant
   system based on the destination address of the packets.  Such a
   tenant system may also be a member of multiple VNs via distinct
   virtual access points (VAPs) and may forward received traffic back
   into the same VN or into a different VN.  The tenant system may
   even modify received packets prior to forwarding them, e.g. for
   Network Address Translation (NAT).

3.3. Gateways

   A gateway may be used to interconnect two NVO3 VNs, an NVO3 VN and
   other networks (virtual or physical), an NVO3 VN and the Internet,
   or a combination of these.  A gateway can also interconnect two
   NVO3 VNs that are implemented with different NVE service types.  A
   gateway may be implemented on a physical device, a bare metal
   server, or a VM.  A gateway that interconnects two NVO3 VNs may
   also be implemented as a distributed gateway.  A distributed
   gateway is a gateway function implemented on the NVEs so that
   traffic between the VNs can be forwarded at the local NVE; as a
   result, path optimization is gained.  This document describes the
   distributed gateway function at NVEs.

   A gateway often integrates several other network service appliance
   functions, such as NAT, firewall, and policy-based forwarding, for
   the interconnection; this means that inter-VN traffic receives
   these special treatments.

   A gateway can be implemented as a tenant system, in which case it
   is like the network service appliance described in Section 3.2.  A
   gateway can also be implemented on a network device, i.e. with an
   embedded NVE component, which means that the device has to support
   NVE capability and interwork with other NVEs seamlessly.  The rest
   of this document describes a gateway as an entity that can be
   implemented in either way.

4. Network Virtualization Edge (NVE)

   The NVO3 framework [NVO3FRWK] defines three NVE service types: L2
   service, L3 service, and L2/L3 service.  A tenant network can be
   implemented with one of these NVE service types or with a
   combination of them together with a gateway.

4.1. NVE Service Type

4.1.1. L2 NVE Service

   An L2 VN can be implemented with the L2 NVE service type; it
   provides an L2 broadcast domain to the tenant systems in the VN.  A
   tenant system attaches to the NVE via a VLAN, a directly attached
   port, or a virtual port.  For scalability and performance reasons,
   an NVE should not forward ARP request messages to remote NVEs.  If
   the tenant system of interest is at a remote NVE, the local NVE
   sends back an ARP response containing the remote MAC address.  An
   NVE can obtain the remote tenant system MAC address via the NVA
   [NVO3ARCH].  An L2 overlay is used between NVEs (an NVE gets the
   inner/outer address mapping from the NVA).  Note that an L2 virtual
   network overlay provides a single broadcast domain.

   Figure 1 illustrates a VN with the L2 NVE service type.  As shown
   in the figure, TSs attach to an NVE via the VAPs that are
   associated with the L2 VNIx on the NVE.  The L2 overlay is used
   between the NVEs that are members of the VN.  The VN ID is encoded
   in the overlay header on the packets.

        +------------+                     +------------+
        | +------+   |-~-~-~-~-~-~-~-~-~-~-|   +------+ |
    TSs---+|L2VNIx+---{    L2 Overlay     }---+L2VNIx|+---TSs
        | +------+   |-~-~-~-~-~-~-~-~-~-~-|   +------+ |
        +------------+                     +------------+
             NVE1                               NVE2

                   Figure 1 L2 NVE Service Model

   The L2 NVE service supports a broadcast domain in the VN.
   Transporting broadcast/multicast traffic among NVEs requires a way
   to map the VN broadcast/multicast traffic to an underlying IP
   multicast solution; an underlying IP network may not have a
   multicast solution deployed and would have to rely on PIM to
   support multicast transport.  Alternatively, an NVE can send the VN
   broadcast/multicast traffic to the remote NVEs using unicast outer
   IP addresses on the packets, i.e. by replicating the packets toward
   the underlying network.  This method does not require the
   underlying network to support multicast transport.  How the NVA
   conveys such mapping information to NVEs is outside the scope of
   this document.
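   The unicast replication described above (often called head-end
   replication) can be sketched as follows.  This Python fragment is
   illustrative only; the per-VN member list and the data structures
   are assumptions, and in NVO3 the membership information would be
   learned from the NVA.

      # Illustrative sketch of head-end replication for broadcast/
      # multicast traffic in an L2 VN: the ingress NVE sends one
      # unicast copy to every remote member NVE of the VN.
      def replicate_bum(vn_id, frame, member_nves, local_nve_ip):
          """Return (outer destination IP, VN ID, frame) copies to send."""
          copies = []
          for peer_ip in member_nves.get(vn_id, []):
              if peer_ip != local_nve_ip:    # never reflect back locally
                  copies.append((peer_ip, vn_id, frame))
          return copies

      members = {5001: ["192.0.2.1", "192.0.2.2", "192.0.2.3"]}
      print(replicate_bum(5001, b"arp-request", members, "192.0.2.1"))
      # -> two unicast copies, one per remote member NVE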
   To interconnect with external virtual or physical networks, or with
   another overlay or non-overlay virtual network, a gateway is
   necessary.  The gateway participates in the L2 VN and performs the
   required traffic treatment between the interconnected networks.

   The L2 NVE service applies to both non-IP and IP applications,
   although IP-based applications can also be supported in other ways
   (see Section 4.1.2).  Whether an NVE may use data plane learning of
   the mapping between a remote tenant system MAC address and the
   remote NVE IP address, and whether a remote tenant system may use
   ARP to announce its MAC address, are for further study.

4.1.2. L3 NVE Service

   An L3 VN can be implemented with the L3 NVE service type; it
   provides an L3 routing domain for the tenant systems in the VN.
   The VAP can be an IP interface, a directly attached port/vport, or
   a VLAN interface.  An NVE acts as the first-hop router for the
   attached tenant systems.  For a VLAN interface, i.e. an Ethernet
   access interface, the NVE needs to support the ARP protocol,
   terminate all ARP request messages, and reply with its own MAC
   address to the tenant system.  An NVE tracks the tenant system IP
   and MAC address mapping if L2 access is used, unless special
   configuration is done on the NVE (see Section 3.2).

   When a tenant system forwards packets to an attached NVE, the NVE
   receives either IP packets or Ethernet frames carrying the NVE MAC
   address as the destination MAC, depending on the VAP interface
   type.  The NVE performs an IP table lookup based on the destination
   IP address of the packet.  If the packet needs to be forwarded to
   another NVE, the NVE sends it via the L3 overlay (the NVE obtains
   the inner/outer address mapping from the NVA).  If the packet needs
   to be forwarded to a local VAP and the VAP is an Ethernet access,
   the NVE inserts the destination MAC address on the packet prior to
   sending it to the tenant system; if the VAP is an IP interface, the
   NVE sends the IP packet directly.

   How an NVE obtains local tenant system IP and MAC addresses is
   outside the scope of this document.  The rule of thumb is that such
   a method MUST NOT require any change on the tenant system side.
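   The L3 forwarding decision described above can be sketched as
   follows.  This Python fragment is illustrative only; the table
   layout, the VAP types, and the field names are assumptions made for
   the example.

      # Illustrative sketch of L3 NVE forwarding: IP lookup, then either
      # L3 overlay toward the peer NVE or delivery on a local VAP (with
      # MAC rewrite when the VAP is an Ethernet access).
      def l3_forward(nve, vn_id, ip_packet):
          entry = nve["route_table"][(vn_id, ip_packet["dst_ip"])]
          if entry["type"] == "remote":
              # inner/outer mapping learned from the NVA
              return {"encap": True, "outer_dst": entry["peer_nve"],
                      "vn_id": vn_id, "inner": ip_packet}
          vap = entry["vap"]
          if nve["vap_type"][vap] == "ethernet":
              # first-hop router role: write the tenant system's MAC
              ip_packet = dict(ip_packet, dst_mac=entry["ts_mac"])
          return {"encap": False, "vap": vap, "inner": ip_packet}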
   The tenant systems connected to an L3 VN can be on the same or
   different subnets.  A special case is a VN that provides the
   connectivity for all intra- and inter-subnet traffic, which yields
   optimal paths.  Depending on the tenant network requirements,
   subnet- or flow-based policies may be configured for route
   constraints, or route path control policies may be configured.
   Figure 2 depicts the L3 VN model.

        +------------+                     +------------+
        | +------+   |-~-~-~-~-~-~-~-~-~-~-|   +------+ |
    TSs---+|L3VNIx+---{    L3 Overlay     }---+L3VNIx|+---TSs
        | +------+   |-~-~-~-~-~-~-~-~-~-~-|   +------+ |
        +------------+                     +------------+
             NVE1                               NVE2

                   Figure 2 L3 NVE Service Model

   If an L3 VN needs to interconnect with external networks, the
   Internet, or another VN, a gateway MUST be used.  To avoid address
   collision, either IP address space partitioning or IP address
   translation between the two VNs at a gateway is necessary.  With
   the former, the two virtual networks in effect form one routing
   domain sharing one IP address space.

   For the interconnection of two L3 VNs, NVEs may function as a
   distributed gateway and perform both intra-VN routing/forwarding
   and inter-VN routing/forwarding.  Figure 3 depicts an L3 NVE with a
   distributed gateway.  As shown in the figure, TSs attach to an NVE
   via the VAPs that are associated with the L3 VNIx or L3 VNIy on the
   NVE.  The L3 overlay is used between the NVEs that are members of
   VNx and VNy.  The VNx and VNy IDs are encoded in the overlay header
   on the packets.  Section 5.3 discusses operational considerations
   in the use of a gateway or a distributed gateway.

        +------------+                     +------------+
        | +------+   |-~-~-~-~-~-~-~-~-~-~-|   +------+ |
    TSs---+|L3VNIx+---{ L3 Overlay(VNx)   }---+L3VNIx|+---TSs
        | +--+---+   |-~-~-~-~-~-~-~-~-~-~-|   +--+---+ |
        |  +-+-+     |                     |    +-+-+   |
        |  |dGW|     |                     |    |dGW|   |
        |  +-+-+     |                     |    +-+-+   |
        | +--+---+   |-~-~-~-~-~-~-~-~-~-~-|   +--+---+ |
    TSs---+|L3VNIy+---{ L3 Overlay(VNy)   }---+L3VNIy|+---TSs
        | +------+   |-~-~-~-~-~-~-~-~-~-~-|   +------+ |
        +------------+                     +------------+
             NVE1                               NVE2

               Figure 3 L3 NVE Service w/dGW Model

   Note: an L3 VN, as a routing domain, does not support broadcast and
   multicast.  The applicability of MVPN [RFC4834] to NVO3 is for
   further study.  It follows that the L3 NVE service type supports
   only IP-based applications.

4.1.3. L2/L3 NVE Service

   The L2/L3 NVE service type is used either for multiple L2 VNs with
   a distributed L3 gateway function on the NVEs, or for L2 VNs and L3
   VNs interconnected with a distributed L3 gateway function on the
   NVEs.

   In the former case, the VAP may be VLAN or port/vport based.
   Tenant systems on the same L2 VN or on different L2 VNs may be on
   the same or different NVEs.  The tenant systems on the same L2 VN
   are in one broadcast domain and can communicate without constraint;
   the implementation is like the L2 NVE service described above.  For
   traffic across L2 VNs, i.e. from one L2 VN to another, the tenant
   system sends the packet to the GW MAC address that is configured on
   the local NVE.  The same GW MAC address should be used on all NVEs
   (the use of different GW MAC addresses on NVEs is for further
   study).  The NVE performs an IP table lookup and obtains the
   destination VN, the destination MAC address, and the outer address
   or the VAP address in that VN; it then replaces the VN ID and the
   destination MAC address on the packet and, in addition, adds the
   outer addresses if the packet is sent to a peer NVE, or sends it to
   the VAP to which the destination tenant system attaches.
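   The inter-VN lookup and rewrite described above can be sketched as
   follows.  This Python fragment is illustrative only; the gateway
   table layout and the field names are assumptions made for the
   example.

      # Illustrative sketch of the distributed-gateway step: route the
      # packet out of the source VN, rewrite the VN ID and destination
      # MAC, then encapsulate toward a peer NVE or deliver locally.
      def inter_vn_forward(gw_table, src_vn, frame):
          dst_vn, dst_mac, location, nexthop = gw_table[(src_vn,
                                                         frame["dst_ip"])]
          frame = dict(frame, dst_mac=dst_mac)     # rewrite destination MAC
          if location == "remote":
              # new VN ID in the overlay header, outer address of peer NVE
              return {"vn_id": dst_vn, "outer_dst": nexthop, "inner": frame}
          return {"vap": nexthop, "inner": frame}  # destination TS is local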
   An L2 overlay is used between the NVEs (an NVE gets the inner/outer
   address mapping from the NVA).  The operator can configure the
   communication policy between L2 VNs, not only for unicast traffic
   but also for broadcast/multicast traffic; the latter allows
   broadcast/multicast traffic to be sent to another L2 VN.  The
   policy can be associated with VNs instead of subnets or flows.

   In the latter case, the L2 VN implementation on the NVEs is like
   the L2 NVE service described in Section 4.1.1 and the L3 VN
   implementation on the NVEs is like the L3 NVE service described in
   Section 4.1.2.  The distributed L3 gateway function on the NVEs is
   used for the local inter-VN traffic described above, and policy can
   be used for inter-VN traffic control.

   Figure 4 illustrates the model for the L2/L3 NVE service type.  The
   NVEs in this case maintain both tenant system IP and MAC addresses.

        +------------+                       +------------+
        | +------+   |-~-~-~-~-~-~-~-~-~-~-~-|   +------+  |
    TSs---+|L2VNIx+---{  L2 Overlay(VNx)    }---+L2VNIx|+---TSs
        | +--+---+   |-~-~-~-~-~-~-~-~-~-~-~-|   +--+---+  |
        |  +-+---+   |                       |   +-+---+   |
        |  |L3dGW|   |                       |   |L3dGW|   |
        |  +-+---+   |                       |   +-+---+   |
        | +--+-----+ |-~-~-~-~-~-~-~-~-~-~-~-| +---+----+  |
    TSs---+|L2|3VNIz+-{ L2|L3 Overlay(VNz)  }-+L2|3VNIz|+---TSs
        | +--------+ |-~-~-~-~-~-~-~-~-~-~-~-| +--------+  |
        +------------+                       +------------+
             NVE1                                 NVE2

                  Figure 4 L2/L3 NVE Service Model

   A tenant network with multiple L2 VNs and L3 VNs interconnected
   with the distributed L3 gateway function MUST share the same IP and
   MAC address space.  If the tenant network further interconnects
   with other tenant networks, external networks, or the Internet via
   a gateway, either IP and/or MAC address partitioning or address
   translation MUST be used.  An L2/L3 VN interconnecting to the
   Internet is, in general, an instance of the latter.

4.2. Overlay Tunnel between NVEs

   An NVE may implement an L2 overlay or an L3 overlay depending on
   the NVE service type.  A tunnel between two NVEs may be over one
   underlying network segment/domain or may span multiple network
   domains.  Both NVEs need to use the same encapsulation protocol to
   encapsulate/decapsulate packets to/from the tunnel between them.
   There are several encapsulation methods in the industry, such as
   VXLAN [VXLAN] and NVGRE [NVGRE].  If two NVEs do not support a
   common encapsulation method, a gateway may be used for
   encapsulation translation.  (It would be beneficial for DC
   operators and the IETF to influence the industry toward one
   encapsulation protocol for the virtual network, which would
   simplify the solution and lower its cost.)  A P2MP or MP2MP tunnel
   may be used for multicast traffic transport.

   According to the NVO3 architecture, an NVE relies on the NVA to
   obtain the inner/outer address mapping, and the underlying network
   supports IP connectivity between two NVEs.  How the NVA obtains the
   mapping information is outside the scope of this document.

   The encapsulation process on an NVE also needs to convey the packet
   characteristics in a VN to the underlying network, i.e. to encode
   or translate the packet parameters in the inner header into the
   outer header, so that the packet gets the same treatment in the
   underlying network [NVO3DPREQ].  Examples are the CoS value and the
   entropy information calculation.  Details will be provided in the
   next version.
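   The outer-header construction described above can be sketched as
   follows.  This Python fragment is illustrative only; a VXLAN-like
   UDP encapsulation is assumed purely as an example, the inner CoS is
   copied into the outer DSCP, and a flow hash is used as the outer
   UDP source port to provide entropy for underlay ECMP/LAG.

      # Illustrative sketch: preserve packet treatment (CoS) and add
      # entropy when building the outer header.  Constants and the
      # header layout are example assumptions, not a specification.
      import zlib

      def build_outer_header(inner, local_ip, peer_ip, vn_id):
          flow = "{}|{}|{}".format(inner["src_ip"], inner["dst_ip"],
                                   inner.get("proto", 0))
          entropy = 49152 + (zlib.crc32(flow.encode()) % 16384)
          return {
              "outer_src_ip": local_ip,
              "outer_dst_ip": peer_ip,
              "outer_dscp": inner.get("dscp", 0),  # same QoS treatment
              "src_port": entropy,                 # entropy for the underlay
              "dst_port": 4789,                    # VXLAN port, as an example
              "vn_id": vn_id,
          }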
4.3. Multi-Tenancy Support

   It is very important for NVO3 to support multi-tenancy over a
   common physical infrastructure, i.e. to ensure independent address
   spaces in the individual tenant networks that do not communicate
   directly (they communicate only with address translation or via the
   Internet) and to ensure traffic isolation among them.  An NVE MUST
   maintain separate forwarding tables to support overlapping
   addresses.  Since a tenant network may consist of one or several
   virtual networks, it is important for the tenant or DC operator to
   manage the address allocation for the virtual networks in a tenant
   network so as to avoid address collision.

   A tunnel between NVEs may carry the traffic of different virtual
   networks.  The VN ID in the overlay header provides the traffic
   segregation.

4.4. Route Path Control

   A tenant network may be implemented as a full mesh among NVEs with
   no route path policy at all.  As a result, there is a single
   overlay hop between the sending TS and the destination TS.

   A tenant network may contain some tenant systems that are
   designated as network service appliances.  In that case, the tenant
   may want some tenant traffic to pass through a network service
   appliance prior to delivery to the destination tenant system, and
   some not.  For example, a tenant network is implemented with three
   VNs in a DC: one L2 VN for the Web Tier, a second L2 VN for the
   Application (App) Tier, and a third L2 VN for the Database (DB)
   Tier.  The policies are illustrated in Figure 5: traffic from the
   Web Tier to the App Tier MUST pass through the tenant firewall on a
   tenant system, say A; traffic from the App Tier to the Web Tier can
   be routed directly; traffic between the App Tier and the DB Tier
   MUST pass through the tenant firewall on a tenant system, say B;
   and no communication is allowed between the Web Tier and the DB
   Tier.

      Web Tier ----FW A----> App Tier ----FW B----> DB Tier
               <-----------           <----FW B-----

          Figure 5 Policies on a Three-Tier Tenant Network

   In this case, the NVE to which the firewall tenant system A is
   attached is configured to forward all traffic from the Web Tier to
   system A via a VAP.  System A processes the packets and may forward
   them to the App Tier via another VAP on the NVE that is associated
   with the App Tier; the NVE forwards the packets to the tenant
   systems in the App Tier on behalf of system A.  Traffic from the
   App Tier can be forwarded to the Web Tier directly; the NVE simply
   obtains the inner/outer address mapping and translates the VN IDs
   on the packets prior to forwarding them to the peer NVEs.  Thus, in
   the pass-through firewall case, the inner/outer address mapping
   that an NVE gets from the NVA is not the destination tenant address
   (inner) paired with that tenant's NVE address; instead, the outer
   address is the address of the NVE to which system A attaches.

   In the NVO3 architecture, such route path control can be
   implemented in the NVA, in the NVE, or in both.  How to implement
   such policies on an NVE is either beyond the scope of this document
   or for further study.
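   The Figure 5 policies can be expressed as a small steering table on
   the NVEs involved.  The Python fragment below is illustrative only;
   the VN names, the policy table, and the forward_normally() callback
   are assumptions made for the example.

      # Illustrative sketch of policy-based steering for the three-tier
      # example: Web->App traffic is handed to firewall A's VAP, App/DB
      # traffic to firewall B's VAP, and Web<->DB traffic is dropped.
      POLICY = {
          ("web", "app"): "vap-fw-a",   # must pass firewall A first
          ("app", "db"):  "vap-fw-b",   # must pass firewall B first
          ("db",  "app"): "vap-fw-b",
          ("web", "db"):  None,         # no communication allowed
          ("db",  "web"): None,
      }

      def steer(src_vn, dst_vn, packet, forward_normally):
          key = (src_vn, dst_vn)
          if key not in POLICY:
              return forward_normally(dst_vn, packet)  # e.g. App -> Web
          vap = POLICY[key]
          if vap is None:
              return "drop"
          return ("deliver", vap, packet)              # hand to the firewall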
4.5. Split-NVE

   A split-NVE may be used in several use cases [NVO3ARCH].  Some NVE
   functions reside on an NVE spoke and others on an NVE hub; an
   overlay tunnel is used between the NVE spoke and the NVE hub.

   One useful splitting structure in the data plane simplifies the
   forwarding table on the NVE spoke: the spoke maintains only local
   forwarding entries while the NVE hub maintains the complete
   forwarding table.  An NVE spoke simply sends packets to the NVE hub
   if the receiving point is not local.  It is possible that an NVE
   hub has no directly attached TS but connects to many NVE spokes; in
   this case, all the NVE spokes and NVE hubs are members of one VN.

   Another useful structure is intra-NVE and inter-NVE splitting.  The
   intra-NVEs forward packets if the sending TS and the receiving TS
   are in the same VN, and forward packets to the inter-NVE if they
   are not.  The inter-NVE forwards packets between different VNs and
   is often called a gateway.  Note that this design may cause packet
   hair-pinning if the sending and receiving TSs, in two different
   VNs, are on the same intra-NVE.  See Section 5.3.

   Split-NVE applies to all NVE service types.  There is no
   configuration or behavior change between a TS and its attached NVE
   regardless of whether split-NVE is used; however, the communication
   performance and the tenant network cost may differ.  Splitting the
   control plane functionality of an NVE is outside the scope of this
   document.

4.6. Multi-Homing Support

   When an NVE is physically separated from its attached tenant
   systems, a tenant system may attach to one or more NVEs via the
   VAPs.  A detailed description will be provided in a future version.

4.7. OAM Tools on NVE

   It is necessary for an NVE to support some OAM tools.  A tool can
   be turned on when a tunnel or VN is set up, or turned on
   dynamically according to operational needs.  The tools an NVE may
   support will be listed after the completion of the NVO3 OAM
   requirements [NVO3OAM].

5. Operational Considerations

5.1. VM Mobility

   VM mobility provides benefits to the DC operator in terms of
   resource optimization and performance tuning.  If a tenant system
   on a VM runs a guest OS and application software on top of it, the
   VM can be moved from one NVE to another without network impact.  If
   the tenant system runs network service appliance software, it may
   not move easily from one NVE to another, because special
   configuration and policy on the NVE and the tenant system are
   needed to make it work (see Section 3.2).  Furthermore, if a VN is
   implemented with a special route graph instead of a full-mesh route
   topology among the NVEs, moving a VM from one NVE to another may
   require a route graph change, which in turn may require
   configuration changes on the NVEs.

   A VN ID is used to segregate the traffic of different VNs in the
   data plane, i.e. "on the wire".  An NVE implementation may use a
   domain-wide global VN ID or an egress-NVE-assigned local VN ID in
   the data plane.  If a local VN ID is used, then when a VM moves
   from one NVE to another, the sending NVE not only has to obtain the
   address of the new NVE the VM has moved to, i.e. the outer
   addresses, but also has to obtain the new VN ID that the new NVE
   allocates for the VN.  In other words, the ingress NVE has to
   modify both the overlay header and the outer header when a VM moves
   from one NVE to another.  If a domain-wide VN ID is used on the
   NVEs, an ingress NVE only needs to modify the outer header when a
   VM moves.  Although a locally allocated VN ID adds implementation
   complexity for VM mobility, it has the advantage of associating the
   VN ID with other context at the egress NVE to facilitate egress NVE
   packet processing.  Thus, both methods should be allowed.
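   The ingress-NVE state change on a VM move can be sketched as
   follows.  This Python fragment is illustrative only; the mapping
   table shape and the addresses are assumptions made for the example.

      # Illustrative sketch: with a domain-wide VN ID only the outer
      # address (the new egress NVE) changes; with an egress-assigned
      # local VN ID the overlay header must be updated as well.
      def on_vm_move(mapping, ts_addr, new_nve_ip,
                     new_local_vn_id=None, domain_wide_vn_id=True):
          entry = mapping[ts_addr]
          entry["outer_dst"] = new_nve_ip        # always: new egress NVE
          if not domain_wide_vn_id:
              entry["vn_id"] = new_local_vn_id   # also: VN ID from new NVE
          return entry

      mapping = {"10.1.1.20": {"outer_dst": "192.0.2.1", "vn_id": 5001}}
      on_vm_move(mapping, "10.1.1.20", "192.0.2.9")            # global VN ID
      on_vm_move(mapping, "10.1.1.20", "192.0.2.9", 7002,
                 domain_wide_vn_id=False)                      # local VN ID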
5.2. NVE Service Type Choice for a Tenant Network

   To be provided in a future version of this document.

5.3. Gateway vs. Distributed Gateway

   A gateway is used, in general, to interconnect two networks.  A
   gateway can be implemented on a physical device, a bare metal
   server, or a VM on a server.  A gateway in NVO3 interconnects a
   virtual network overlay with another network.  The other network
   can be a physical network, a virtual network, a virtual network
   overlay, or the Internet, and is often called an external network.
   A distributed gateway is a gateway function implemented on the NVEs
   so that the traffic between two tenant systems in different virtual
   networks can be routed on the local NVE directly.  The big benefit
   of using a distributed gateway is path optimization, which is
   important for some cloud applications and underlying networks.

   To interconnect two networks, a gateway may integrate other network
   service appliance functions such as NAT and firewall, and may
   handle policy-based forwarding.  This means that inter-VN traffic
   is subject to this additional treatment; a tenant often uses this
   as a rule in an application networking design.  Implementing a
   distributed gateway means that the NVEs also need to support such
   capabilities, which can become complex and impractical.

   Should a tenant use a gateway or a distributed gateway to
   interconnect two VNs?  The general recommendations are:

   . If a VN overlay interconnects to an external network that is a
     physical network, a virtual network, or the Internet, using a
     gateway is practical.

   . If a VN overlay in a DC interconnects to another VN overlay in
     another DC, the inter-VN traffic will pass through DC GWs, i.e.
     the traffic pattern is north-south, and it is good to use a
     gateway.

   . If a VN overlay interconnects to another VN overlay within a DC,
     i.e. the traffic pattern is east-west, and there is only a light
     network service appliance function and/or light policy for the
     inter-VN traffic, using a distributed gateway is better;
     otherwise, use a gateway.

   An L2 VN overlay may interconnect with another VN overlay and also
   with a WAN or the Internet.  In this case, the L3 distributed
   gateway on the NVE is not only used for the VN interconnections
   within the NVEs but is also a member of an L3 overlay of which the
   gateway connecting to the WAN or Internet is a member.  Figure 6
   illustrates an example.  In this example there are three virtual
   networks: VNx, VNi, and VNy.  VNx and VNy are L2 overlays; VNi is
   an L3 overlay.  VNx is present on NVE1 and on an L2GW device.  The
   L2GW device is also on VLANa, to which some bare metal servers
   attach; thus the tenant systems belonging to VNx and the bare metal
   servers on VLANa are on one L2 virtual network.  (This is a useful
   configuration during a DC migration.)  VNy also has tenant systems
   on NVE1.  NVE1 has the L3dGW function, and the inter-VN traffic on
   NVE1 passes through the L3dGW.  VNi terminates on the L3dGW at NVE1
   and on a vGW at the DCGW; the NVEs get the vGW information from the
   NVA.  The tenant systems on NVE1 can reach the WAN via VNi, i.e.
   the L3dGW on NVE1 interconnects the three VNs.  The vGW on the DCGW
   terminates the traffic from VNi and passes it to the WAN, and vice
   versa.  Note that policy may be applied at the L3dGW to control the
   inter-VN traffic routing; for instance, VNx can exchange traffic
   with VNy and the WAN, while VNy can exchange traffic with VNx only.
   Other policies and/or network appliances can be placed at the DCGW
   and associated with the vGW on the DCGW.

      +----------+                  +--------+    .+--+.
      | +------+ |-~-~-~-~-~-~-~-~-~|        |   /     \   Metal
   TSs--+|L2VNIx+-{ L2 Overlay(VNx) }[L2VNIx]|--| VLANa |--Servers
      | +--+---+ |-~-~-~-~-~-~-~-~-~+--------+   \+---+./
      |    |     |                     L2GW
      |  +-+---+ |-~-~-~-~-~-~-~-~-~+-------+     .'^^\
      |  |L3dGW+--{ L3 Overlay(VNi) }-[vGW]-+--->{ WAN }
      |  +-+---+ |-~-~-~-~-~-~-~-~-~+-------+     ._._./
      |    |     |                     DCGW
      | +--+---+ |-~-~-~-~-~-~-~-~-~|  +-------+  |
   TSs--+|L2VNIy+-{ L2 Overlay(VNy) }--+L2VNIy |--+--TSs
      | +------+ |-~-~-~-~-~-~-~-~-~|  +-------+  |
      +----------+                  +-------------+
          NVE1                           NVE2

        Figure 6 Example of Gateways and Distributed GW Usage

   A gateway and the distributed gateway function on NVEs can also
   work together to provide the inter-VN connection.  This is useful
   when some NVEs in a VN can support the distributed gateway function
   and others cannot.  A VN can have both a distributed gateway on
   some NVEs and a gateway; the traffic between two tenant systems may
   then be forwarded locally if the attached NVE supports the
   distributed gateway function, or may hair-pin via a gateway if the
   local NVE does not.  This may happen during a network migration
   period.  How to configure and manage a gateway and a distributed
   gateway is beyond the scope of this document.
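   The inter-VN policy described in the Figure 6 example can be
   sketched as a small permit table at the L3dGW.  The Python fragment
   below is illustrative only; the VN names are taken from the
   example, and the table format is an assumption.

      # Illustrative sketch of the example policy: VNx may exchange
      # traffic with VNy and with the WAN (reached via VNi), while VNy
      # may exchange traffic with VNx only.
      ALLOWED = {
          "VNx": {"VNy", "VNi"},   # VNi leads to the WAN via the vGW
          "VNy": {"VNx"},
      }

      def permit(src_vn, dst_vn):
          return dst_vn in ALLOWED.get(src_vn, set())

      assert permit("VNx", "VNi") and permit("VNy", "VNx")
      assert not permit("VNy", "VNi")   # VNy may not reach the WAN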
6. Security Considerations

   To be provided in a future version of this document.

7. Acknowledgements

   The authors would like to thank Qin Wu for the review and valuable
   comments.

8. IANA Considerations

   This document does not require any IANA action.

9. References

9.1. Normative References

   [RFC2119]   Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4834]   Morin, T., "Requirements for Multicast in Layer 3
               Provider-Provisioned Virtual Private Networks (PPVPNs)",
               RFC 4834, April 2007.

9.2. Informative References

   [NVO3ARCH]  Black, D., Narten, T., et al., "An Architecture for
               Overlay Networks (NVO3)", draft-narten-nvo3-arch-01,
               work in progress.

   [NVO3DPREQ] Bitar, N., et al., "NVO3 Data Plane Requirements",
               draft-ietf-nvo3-dataplane-requirements-01, work in
               progress.

   [NVO3FRWK]  Lasserre, M., Morin, T., et al., "Framework for DC
               Network Virtualization", draft-ietf-nvo3-framework-03,
               work in progress.

   [NVO3OAM]   Ashwood, P., et al., "NVO3 Operational Requirements",
               draft-ashwood-nvo3-operational-requirement, work in
               progress.

   [NVGRE]     Sridharan, M., et al., "NVGRE: Network Virtualization
               using Generic Routing Encapsulation",
               draft-sridharan-virtualization-nvgre-03, work in
               progress.

   [VXLAN]     Mahalingam, M., Dutt, D., et al., "VXLAN: A Framework
               for Overlaying Virtualized Layer 2 Networks over Layer
               3 Networks", draft-mahalingam-dutt-dcops-vxlan-05, work
               in progress.

Authors' Addresses

   Lucy Yong
   Huawei Technologies, USA

   Email: lucy.yong@huawei.com

   Frank Liang Xia
   Huawei Technologies

   Email: frank.xialiang@huawei.com