Network Working Group                                           Z. Qiang
Internet Draft                                                  Ericsson
Intended status: Informational                          February 6, 2015
Expires: August 2015


                    Tenant Traffic Handling in NVO3
                draft-zu-nvo3-ts-traffic-handling-00.txt


Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on August 6, 2015.

Z. Qiang                Expires August 6, 2015                  [Page 1]
Internet-Draft        Data Plane Handling in NVO3          February 2015

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Abstract

   This draft discusses considerations on how to handle tenant traffic
   in the NVO3 architecture, and several related issues that need to be
   considered when designing an NVO3-based virtualized data center
   network for multiple tenants.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Terminology
   4. Tenant Traffic Forwarding
   5. L2CP
      5.1. STP/RSTP/MSTP
      5.2. LACP and MC-LAG
   6. ARP and Neighbor Discovery
   7. Routing protocol
   8. Security Considerations
   9. IANA Considerations
   10. References
      10.1. Normative References
      10.2. Informative References
   11. Acknowledgments

1. Introduction

   A high-level overview of a possible architecture for building NVO3
   overlay networks is presented in [nvo3-arch]. The corresponding
   control plane requirements are documented in [hypervisor-nve-cp] and
   [nve-nva-cp-req].

   Tenant traffic, including the Layer 2 Control Protocols (L2CP)
   specified by IEEE 802.1 and the Layer 3 Control Protocols (L3CP)
   specified by the IETF, needs to be handled carefully in an NVO3
   network. This document provides some considerations on how tenant
   traffic should be handled by NVO3, and discusses several related
   issues that arise from the required tenant traffic handling
   procedures.

   Section 4 provides some considerations on how to forward generic TS
   traffic over NVO3. Section 5 discusses the handling of L2CP messages
   received from the TS. Section 6 covers ARP and ND message
   optimization. Section 7 lists the issues and possible alternatives
   when dynamic IP routing is supported by a TS.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS. Lower case uses of these words are not to be
   interpreted as carrying RFC-2119 significance.

3. Terminology

   This document uses the same terminology as found in the NVO3
   Framework document [nvo3-framework] and [hypervisor-nve-cp].

4. Tenant Traffic Forwarding

   In NVO3, an L2 NVE implements Ethernet LAN emulation, an
   Ethernet-based multipoint service similar to an IETF VPLS [RFC4761]
   [RFC4762] or EVPN [EVPN] service. It forwards multicast and unicast
   L2 traffic between the TSs. From the Tenant System's perspective,
   the NVE behaves like an L2 bridge as specified in IEEE 802.1Q
   [IEEE 802.1Q].

   An L3 NVE provides a virtualized IP forwarding service, similar to
   an IETF IP VPN, e.g. BGP/MPLS IP VPN [RFC4364].
   An L3 NVE provides inter-subnet layer 3 switching/routing for the
   TS. The NVE is the first-hop or next-hop router for the attached TS.

   In NVO3, it is very common to provide both L2 and L3 services to a
   TS. Logically, the TS is attached to an NVE that provides both L2
   and L3 functions. In an implementation, the L2 NVE function and the
   L3 NVE function may be collocated. The L2 NVE function provides
   intra-subnet traffic forwarding. The L3 NVE function provides
   inter-subnet traffic forwarding.

   In NVO3, to avoid flooding issues, the inner-outer address mapping
   table is built using the NVA-NVE control signaling
   [nve-nva-cp-req]. Both L2 and L3 data forwarding are based on a
   lookup in the inner-outer address mapping table (and on forwarding
   policies).

   The data forwarding procedure is similar for the L2 NVE and the L3
   NVE. Upon receiving a unicast packet from the TS, the NVE performs a
   lookup in the inner-outer address mapping table using the received
   destination IP/MAC address. If a mapping is found, the received
   packet is encapsulated and forwarded to the destination NVE. If no
   mapping is found, the received unknown unicast packet should be
   dropped. As an alternative, the inner-outer address mapping table
   update procedure may be triggered using the NVA-NVE control
   signaling [nve-nva-cp-req].

   However, an attacker may generate a large number of unknown unicast
   packets from a compromised VM, which may result in a
   denial-of-service (DoS) attack. Therefore, for security reasons, the
   inner-outer address mapping table update procedure shall not be
   triggered too often. One simple way to avoid this kind of security
   issue is to apply rate limiting when processing TS traffic with
   unknown destination addresses.

   Discussions: As specified in [nvo3-sec-req], rate limiting shall be
   supported on the NVA query procedure triggered by any received
   unknown data packets.

5. L2CP

   For an L2 NVE, the VAP is an emulation of a physical Ethernet port.
   It shall have the capability to handle any L2CP.

5.1. STP/RSTP/MSTP

   The Spanning Tree Protocol (STP) is an L2 protocol that ensures a
   loop-free topology for any bridged Ethernet local area network. STP
   was originally standardized as IEEE 802.1D. It is deprecated as of
   802.1D-2004 in favor of the Rapid Spanning Tree Protocol (RSTP). The
   Multiple Spanning Tree Protocol (MSTP) defines an extension to RSTP
   to further develop the usefulness of VLANs.

   In an NVO3 network, the L2 forwarding/switching function provided by
   the NVE is based on the destination MAC address and the inner-outer
   address mapping table. The NVEs do not introduce any loops in the L2
   connections among the TSs if the NVE inner-outer address mapping
   table is configured correctly. Therefore, there is no need to use
   any L2CP for that purpose among the participating NVEs of a TS.

   However, STP/RSTP/MSTP may be used by the TS, including in the
   multi-homing case. In an NVO3 network, the NVE does not need to
   propagate any STP messages to the remote NVEs. However, the NVE may
   need to learn the Root Bridge MAC address and Bridge Priority of the
   root of the Internal Spanning Tree (IST) of the attached layer 2
   segment by listening to the BPDUs.

   Discussions: The NVE does not need to forward STP messages, but it
   may need to participate in STP.

5.2. LACP and MC-LAG

   Link Aggregation [IEEE 802.1AXbk-2012] is a mechanism for making
   multiple point-to-point links between a pair of devices appear to be
   a single logical link between those devices. MC-LAG
   [IEEE 802.1aq-2012], or Multi-Chassis Link Aggregation Group, is a
   type of LAG with constituent ports that terminate on separate
   chassis, thereby providing node-level redundancy.

   LACP may be used between the TS and its attached NVE. MC-LAG may be
   used if the TS is attached to multiple NVEs.
   In both cases, an L2 NVE may have to be involved in the Link
   Aggregation procedure. When MC-LAG is used, the Inter-Chassis
   Communication Protocol (ICCP) [RFC7275] needs to be enabled.

   Discussions: The NVE may need to support the Link Aggregation
   procedure.

6. ARP and Neighbor Discovery

   For an L2 service, the NVE is not required to support any special
   processing of ARP [RFC0826] or IPv6 Neighbor Discovery (ND)
   [RFC4861] in the NVO3 architecture. The NVE may forward ARP or ND
   messages using its multicast capability. However, as a performance
   optimization, an NVE does not need to propagate ARP or ND messages.
   To avoid ARP/ND flooding, it can intercept ARP or ND requests
   received from its attached TSs and respond based on the information
   configured in the inner-outer address mapping table.

   Discussions: To avoid ARP/ND flooding, the NVE may need to respond
   to the received messages based on the inner-outer address mapping
   table.

   Upon receiving an ARP or ND request from a TS, the NVE sends back an
   ARP or ND response with the requested MAC address. The NVE may act
   as an ARP or ND proxy when responding to the ARP or ND request. If
   the NVE does not have the MAC information requested in the received
   ARP or ND request, it may query the NVA using the NVA-NVE control
   signaling [nve-nva-cp-req].

   However, an attacker may generate a large number of ARP/ND request
   packets from a compromised VM, which may result in a
   denial-of-service (DoS) attack. Therefore, for security reasons, the
   inner-outer address mapping table update procedure shall not be
   triggered too often. One simple way to avoid this kind of security
   issue is to apply rate limiting when processing TS ARP/ND request
   messages.

   Discussions: As specified in [nvo3-sec-req], the NVE shall
   rate-limit the NVA query messages triggered by received ARP/ND
   request messages with unknown MAC addresses.

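   As an illustration only, the ARP/ND proxy behavior and the rate
   limit on NVA queries described above could be sketched as follows.
   This is a minimal sketch under stated assumptions, not part of any
   NVO3 specification: the names ArpProxy, TokenBucket and query_nva
   are hypothetical, the token bucket is just one possible
   rate-limiting mechanism, and the NVA query is represented by a
   caller-supplied function rather than a real NVA-NVE protocol.

```python
import time
from typing import Callable, Dict, Optional


class TokenBucket:
    """Assumed rate limiter: up to `burst` queries at once, refilled at
    `rate` tokens per second."""

    def __init__(self, rate: float, burst: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = burst
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ArpProxy:
    """Hypothetical per-VAP ARP/ND proxy state kept by an NVE."""

    def __init__(self, query_nva: Callable[[str], Optional[str]],
                 limiter: TokenBucket) -> None:
        # Simplified view of the mapping table: inner IP -> inner MAC.
        self.mapping: Dict[str, str] = {}
        self.query_nva = query_nva  # stands in for NVA-NVE signaling
        self.limiter = limiter

    def handle_request(self, target_ip: str) -> Optional[str]:
        """Return the MAC for the proxy reply, or None to send no reply."""
        mac = self.mapping.get(target_ip)
        if mac is not None:
            return mac               # answered locally, nothing is flooded
        if not self.limiter.allow():
            return None              # over budget: do not query the NVA
        mac = self.query_nva(target_ip)
        if mac is not None:
            self.mapping[target_ip] = mac
        return mac
```

   With rate=1.0 and burst=2.0, for instance, a TS can trigger at most
   two back-to-back NVA queries and about one per second thereafter.
   Requests for addresses already present in the mapping table are
   answered locally and do not consume any of that budget.
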
   In multi-homing NVE scenarios, a TS may be reachable via more than
   one NVE. In this case, if ARP/ND proxy is supported at the
   participating NVEs of the same network segment where a TS is
   attached, all participating NVEs may be aware of the location of the
   traffic's destination. Therefore, each participating NVE may offer
   its own MAC address for the same destination IP address in the
   ARP/ND reply message, which creates a race condition. The NVA may
   need to select one NVE at each network segment to avoid the race.
   Only the selected NVE can respond to ARP/ND requests on the attached
   network segment.

   Discussions: The NVA may need a proper way to select one NVE per
   network segment of a TS as the ARP/ND proxy for given destination IP
   addresses, to avoid the race condition.

   During VM mobility, a VM may be moved from one layer-2 segment to
   another layer-2 segment, assuming IP address preservation is
   supported. To optimize the ARP or ND update procedure, both the
   source NVE and the target NVE can have the same MAC address
   configured at the VAP where the TS is attached.

   Discussions: The NVA may need a proper way to configure the
   participating NVEs with the same MAC address on the VAP of the same
   VN at each network segment. However, in multi-homing NVE scenarios,
   the NVA may need a proper way to configure the participating NVEs on
   the VAP of the same VN at each network segment to avoid duplicate
   MAC address issues.

7. Routing protocol

   An IP routing protocol may be used by the TS for dynamic IP
   forwarding. A routing protocol specifies how routers communicate
   with each other, disseminating information that enables them to
   select routes between any two nodes on a computer network.

   In NVO3, there are different deployment options to support layer 3
   services: a centralized GW function, a distributed GW function, or a
   combination of both.

   If the layer 3 service is provided by an NVO3 Centralized Gateway
   function, the TS routing function and the NVO3 Centralized Gateway
   function appear as router adjacencies to each other. A routing
   protocol may be used between the routers for the overlay data plane.
   Any TS routing messages (e.g. routing update messages from a vR
   function installed in a VM of the TS) will be handled by the NVO3
   Centralized Gateway function. Whenever routing rules are installed
   or updated, the NVO3 Centralized Gateway function may update its
   routing distribution policies and forward data packets accordingly.
   User data packets will be forwarded by the attached NVE to the NVO3
   Centralized Gateway function. The NVO3 Centralized Gateway function
   will then make the layer 3 routing decision: either discard the
   packet or tunnel it to the destination NVE where the destination VM
   is attached. In this case, the NVE functions, both source and
   destination, only need to support layer 2 functions.

   If the layer 3 service is provided by the Distributed GW function
   embedded in the L3 NVE, dynamic routing updates can be an issue.
   From the tenant's point of view, the Distributed GW function appears
   as the next-hop router to the TS routing functions, e.g. vR
   functions installed in a VM of the TS. The Distributed GW function
   embedded in the L3 NVE may need to support one or more routing
   protocols (e.g. BGP/OSPF/RIP) to learn of any installation or update
   of TS routing rules. This allows an L3 NVE and the attached TS
   router to learn IP route updates from each other. However, as TS
   packet forwarding in the L3 NVE is based on the inner-outer address
   mapping table configured by the NVA using the NVA-NVE control
   protocol, any TS routing update may trigger corresponding
   inner-outer address mapping table updates, not only in the attached
   L3 NVE but also in the remote participating L3 NVEs. With the NVO3
   architecture specified in [nvo3-arch], it is an open issue how these
   dynamic updates can be done.

   Figure 1 is an example of the interactions between the distributed
   GW function and the vR in the TS. TS1 is attached to NVE1 and NVE2,
   and TS2 is attached to NVE3. Both TS1 and TS2 may have a v-Routing
   function enabled in the VM. The distributed GW function is supported
   in NVE1, NVE2 and NVE3. In this example, the vR in TS1 and the dGW
   in NVE1/NVE2 are routing peers. The vR in TS2 and the dGW in NVE3
   are routing peers. Two forwarding paths are available between TS1
   and TS2. TS packets between TS1 and TS2 are forwarded using the
   NVE1-NVE3 tunnel or the NVE2-NVE3 tunnel, based on the inner-outer
   address mapping table, which is configured by the NVA.

                        +-----+
                        | NVA |
                        +-----+

          +-----------------------------------+
          |             +-----------------+   |
          |             |                 |   |
      +---+---+     +---+---+           +-+---+-+
      | NVE1  |     | NVE2  |           | NVE3  |
      | dGW   |     | dGW   |           | dGW   |
      +---+---+     +---+---+           +---+---+
          |             |                   |
          +----+  +-----+                   |
               |  |                         |
             +-+--+-+                    +--+--+
             | TS1  |                    | TS2 |
             | vR   |                    | vR  |
             +------+                    +-----+

               Figure 1 Example of TS dynamic routing

   The issue is that the TS may want to change its routing policies at
   any time. For instance, NVE3 may initially be configured with a
   routing policy that any traffic from TS2 to TS1 shall use the
   NVE3-NVE1 tunnel. At some point, TS1 may want to change that policy.
   It may want to use the route via NVE2 for traffic with TS2. Or it
   may have a new route available, e.g. a new subnet installed in TS1,
   and want to use the route via NVE2 for the newly installed subnet.

   For any of the above routing updates, both NVE1 and NVE2 may be
   informed by TS1 using a routing protocol, e.g. OSPF. Upon receiving
   the routing update messages, both NVE1 and NVE2 shall process them
   and may update their inner-outer address mapping tables accordingly.
   However, this inner-outer address mapping table update only covers
   traffic forwarded from TS1 to TS2. The problem is how to update NVE3
   for traffic forwarded from TS2 to TS1, since NVE3 needs to update
   its inner-outer address mapping table accordingly as well.

   Discussion: Alternatives are:

   - Alt A: Limit the usage of the distributed GW function in NVO3.
     The centralized GW function is used for a TS where dynamic IP
     routing is enabled. The distributed GW function is only used for a
     TS where dynamic IP routing is not enabled. With this limitation,
     there is no NVO3 control plane impact.

   - Alt B: Use NVE-NVE interaction messages to flood the peer L3 NVEs.
     For instance, the L3 NVE may inform the peer NVEs of the received
     routing update information. However, in this case, should the peer
     NVE update its inner-outer address mapping table without the NVA's
     involvement? This may challenge the NVA's centralized control
     role, and it may also raise security concerns.

   - Alt C: Use the NVA-NVE signaling to update the peer L3 NVEs. In
     this case, the L3 NVE shall not forward any routing update
     information to any peer NVEs, to avoid flooding. Instead, it shall
     always inform the NVA about any routing changes. The NVA will then
     use the NVA-NVE signaling to update the inner-outer address
     mapping table at the peer NVE.

   - Alt D: Collocate the NVA and GW function. With this alternative,
     the TS routing policies (i.e. RIB) are managed by the collocated
     GW function. It is assumed that the NVA is synchronized with the
     collocated GW function. The Distributed GW function embedded in
     the NVE is installed with the TS IP forwarding policies (i.e. FIB
     or inner-outer address mapping table). The TS routing messages
     will be terminated at the collocated GW function, which is the
     next-hop router of the TS routing function.
     Whenever TS routing is installed or updated, the collocated GW
     function may update the routing policies (i.e. RIB), and the NVA
     will notify the distributed GW functions of the updated
     inner-outer address mapping table using the NVA-NVE control
     signaling.

8. Security Considerations

   This is a discussion document that provides input to the NVO3
   requirement documents and does not in itself introduce any new
   security concerns.

9. IANA Considerations

   No actions are required from IANA for this informational document.

10. References

10.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2234] Crocker, D. and Overell, P. (Editors), "Augmented BNF for
             Syntax Specifications: ABNF", RFC 2234, Internet Mail
             Consortium and Demon Internet Ltd., November 1997.

10.2. Informative References

   [overlay-problem-statement]
             Narten, T., Gray, E., Black, D., Fang, L., Kreeger, L.,
             and M. Napierala, "Problem Statement: Overlays for Network
             Virtualization",
             draft-ietf-nvo3-overlay-problem-statement-04 (work in
             progress), July 31, 2013.

   [hypervisor-nve-cp]
             Li, Y., Yong L., Kreeger, L., Narten, T., and D. Black,
             "Hypervisor to NVE Control Plane Requirements",
             draft-ietf-nvo3-hpvr2nve-cp-req-00 (work in progress),
             July 1, 2014.

   [nvo3-framework]
             Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y.
             Rekhter, "Framework for DC Network Virtualization",
             draft-ietf-nvo3-framework-09 (work in progress), July 4,
             2014.

   [nve-nva-cp-req]
             Kreeger, L., D. Dutt, T. Narten, D. Black, "Network
             Virtualization NVE to NVA Control Protocol Requirements",
             draft-ietf-nvo3-nve-nva-cp-req-02 (work in progress),
             April 24, 2014.

   [nvo3-arch]
             D. Black, J. Hudson, L. Kreeger, M. Lasserre, T. Narten,
             "An Architecture for Overlay Networks (NVO3)",
             draft-ietf-nvo3-arch-01 (work in progress), February 14,
             2014.

   [nvo3-sec-req]
             S. Hartman, D. Zhang, M. Wasserman, Z. Qiang, "Security
             Requirements of NVO3",
             draft-ietf-nvo3-security-requirements-04 (work in
             progress), January 12, 2015.

   [IEEE 802.1Q]
             "Virtual Bridged Local Area Networks", 2005.

   [IEEE 802.1AXbk-2012]
             "IEEE Standard for Local and metropolitan area
             networks--Link Aggregation Amendment 1: Protocol
             addressing".

   [IEEE 802.1aq-2012]
             "IEEE Standard for Local and metropolitan area
             networks--Media Access Control (MAC) Bridges and Virtual
             Bridged Local Area Networks--Amendment 20: Shortest Path
             Bridging".

   [EVPN]    Sajassi, A. et al, "BGP MPLS Based Ethernet VPN",
             draft-ietf-l2vpn-evpn (work in progress).

   [RFC4761] Kompella, K. et al, "Virtual Private LAN Service (VPLS)
             Using BGP for auto-discovery and Signaling", RFC 4761,
             January 2007.

   [RFC4762] Lasserre, M. et al, "Virtual Private LAN Service (VPLS)
             Using Label Distribution Protocol (LDP) Signaling",
             RFC 4762, January 2007.

   [RFC0826] Plummer, D., "Ethernet Address Resolution Protocol: Or
             converting network protocol addresses to 48.bit Ethernet
             address for transmission on Ethernet hardware", STD 37,
             RFC 826, November 1982.

   [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
             Networks (VPNs)", RFC 4364, February 2006.

   [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
             "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
             September 2007.

   [RFC7275] L. Martini, S. Salam, A. Sajassi, "Inter-Chassis
             Communication Protocol for Layer 2 Virtual Private Network
             (L2VPN) Provider Edge (PE) Redundancy", RFC 7275, June
             2014.

11. Acknowledgments

   Many people have contributed to the development of this document
   and many more will probably do so before we are done with it. While
   we cannot thank all contributors, some have played an especially
   prominent role.
   The following have provided essential input: Suresh Krishnan.

Authors' Addresses

   Zu Qiang
   Ericsson
   8400, boul. Decarie
   Ville Mont-Royal, QC,
   Canada

   Email: Zu.Qiang@Ericsson.com