Internet DRAFT - draft-gu-nvo3-overlay-cp-arch

draft-gu-nvo3-overlay-cp-arch






Network Working Group                                              Y. Gu
Internet-Draft                                                    W. Hao
Intended status: Standards Track                                  Huawei
Expires: January 10, 2013                                   July 9, 2012


Analysis of external assistance to NVE and consideration of architecture
                    draft-gu-nvo3-overlay-cp-arch-00

Abstract

   Draft [overlay-cp] has introduced some control plan requirements and
   characteristics.  From NVE's perspective, this draft describes what
   assistance is needed to make NVE satisfy the requirements and
   characteristics introduce in [overlay-cp].  Not all of these
   assistance is necessarily achieved by an external controller.  Some
   of the assistance requirements can be regarded as a complementarity
   requirements to [overlay-cp] . while others are requirements to an
   assistance Database.  This draft also provide considerations on how
   the network virtualization architecture should be like and how these
   assistance can be fulfilled.  The target is to help the working group
   to figure out the architecture of overlay control plane, instead of
   providing solutions.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 10, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents



Gu & Hao                Expires January 10, 2013                [Page 1]

Internet-Draft   NVO3 overlay control plane architecture       July 2012


   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminologies and concepts . . . . . . . . . . . . . . . . . .  3
   3.  The fundamental requirements and characteristics . . . . . . .  5
     3.1.  Assistance to NVE  . . . . . . . . . . . . . . . . . . . .  6
       3.1.1.  Assistance from TES  . . . . . . . . . . . . . . . . .  6
     3.2.  Access Control List  . . . . . . . . . . . . . . . . . . .  7
     3.3.  QoS  . . . . . . . . . . . . . . . . . . . . . . . . . . .  7
     3.4.  DHCP Snooping  . . . . . . . . . . . . . . . . . . . . . .  7
     3.5.  NVE to VNI Registration  . . . . . . . . . . . . . . . . .  7
     3.6.  VNI to Multicast Addr Mapping  . . . . . . . . . . . . . .  8
     3.7.  Synchronization  . . . . . . . . . . . . . . . . . . . . .  8
   4.  Implementation Options and Architecture considerations . . . .  8
     4.1.  Exclusively using External Controller  . . . . . . . . . .  9
     4.2.  Hybrid of External Controller and Centralized Database . . 10
       4.2.1.  Brief introduction of VDP profile database and
               work flow  . . . . . . . . . . . . . . . . . . . . . . 10
       4.2.2.  Example Architecture and Work Flow . . . . . . . . . . 12
   5.  Summary  . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
   6.  Security Considerations  . . . . . . . . . . . . . . . . . . . 13
   7.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 14
     7.1.  Normative Reference  . . . . . . . . . . . . . . . . . . . 14
     7.2.  Informative Reference  . . . . . . . . . . . . . . . . . . 14
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 14

















Gu & Hao                Expires January 10, 2013                [Page 2]

Internet-Draft   NVO3 overlay control plane architecture       July 2012


1.  Introduction

   Draft [overlay-cp] has introduced some control plan requirments and
   characteristics.  From NVE's perspective, this draft describes what
   assistance is needed to make NVE statisfy the requirements and
   characteristics introduce in [overlay-cp].  Not all of these
   assistance is necessarily acheived by an external controller.  Some
   of the assistance requirements can be regarded as a complementarity
   requirements to [overlay-cp] . while others are requirements to an
   assistance Database.  This draft also provide considerations on how
   the network virtualization architecture should be and how these
   assistance can be fulfilled.  The target is to help the working group
   to figure out the architecture of overlay control plane, instead of
   providing solutions.


2.  Terminologies and concepts

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The document uses terms defined in [framework]and [overlay-cp].

   VN: Virtual Network.  This is a virtual L2 or L3 domain that belongs
   a tenant.

   VNI: Virtual Network Instance.  This is one instance of a virtual
   overlay network.  Two Virtual Networks are isolated from one another
   and may use overlapping addresses.

   Virtual Network Context or VN Context: Field that is part of the
   overlay encapsulation header which allows the encapsulated frame to
   be delivered to the appropriate virtual network endpoint by the
   egress NVE.  The egress NVE uses this field to determine the
   appropriate virtual network context in which to process the packet.
   This field MAY be an explicit, unique (to the administrative domain)
   virtual network identifier (VNID) or MAY express the necessary
   context information in other ways (e.g. a locally significant
   identifier).

   VNID: Virtual Network Identifier.  In the case where the VN context
   has global significance, this is the ID value that is carried in each
   data packet in the overlay encapsulation that identifies the Virtual
   Network the packet belongs to.

   NVE: Network Virtualization Edge.  It is a network entity that sits
   on the edge of the NVO3 network.  It implements network



Gu & Hao                Expires January 10, 2013                [Page 3]

Internet-Draft   NVO3 overlay control plane architecture       July 2012


   virtualization functions that allow for L2 and/or L3 tenant
   separation and for hiding tenant addressing information (MAC and IP
   addresses).  An NVE could be implemented as part of a virtual switch
   within a hypervisor, a physical switch or router, a Network Service
   Appliance or even be embedded within an End Station.

   Underlay or Underlying Network: This is the network that provides the
   connectivity between NVEs.  The Underlying Network can be completely
   unaware of the overlay packets.  Addresses within the Underlying
   Network are also referred to as "outer addresses" because they exist
   in the outer encapsulation.  The Underlying Network can use a
   completely different protocol (and address family) from that of the
   overlay.

   Data Center (DC): A physical complex housing physical servers,
   network switches and routers, Network Service Appliances and
   networked storage.  The purpose of a Data Center is to provide
   application and/or compute and/or storage services.  One such service
   is virtualized data center services, also known as Infrastructure as
   a Service.

   VM: Virtual Machine.  Several Virtual Machines can share the
   resources of a single physical computer server using the services of
   a Hypervisor (see below definition).

   Hypervisor: Server virtualization software running on a physical
   compute server that hosts Virtual Machines.  The hypervisor provides
   shared compute/memory/storage and network connectivity to the VMs
   that it hosts.  Hypervisors often embed a Virtual Switch (see below).

   Virtual Switch: A function within a Hypervisor (typically implemented
   in software) that provides similar services to a physical Ethernet
   switch.  It switches Ethernet frames between VMs' virtual NICs within
   the same physical server, or between a VM and a physical NIC card
   connecting the server to a physical Ethernet switch.  It also
   enforces network isolation between VMs that should not communicate
   with each other.

   Tenant: A customer who consumes virtualized data center services
   offered by a cloud service provider.  A single tenant may consume one
   or more Virtual Data Centers hosted by the same cloud service
   provider.

   Tenant End System: It defines an end system of a particular tenant,
   which can be for instance a virtual machine (VM), a non-virtualized
   server, or a physical appliance.

   Virtual Access Points (VAPs): Tenant End Systems are connected to the



Gu & Hao                Expires January 10, 2013                [Page 4]

Internet-Draft   NVO3 overlay control plane architecture       July 2012


   Tenant Instance through Virtual Access Points (VAPs).  The VAPs can
   be in reality physical ports on a ToR or virtual ports identified
   through logical interface identifiers (VLANs, internal VSwitch
   Interface ID leading to a VM).

   VN Name: A globally unique name for a VN.  The VN Name is not carried
   in data packets originating from End Stations, but must be mapped
   into an appropriate VN-ID for a particular encapsulating technology.
   Using VN Names rather than VN-IDs to identify VNs in configuration
   files and control protocols increases the portability of a VDC and
   its associated VNs when moving among different administrative domains
   (e.g. switching to a different cloud service provider).

   VSI: Virtual Station Interface.  Typically, a VSI is a virtual NIC
   connected directly with a VM.  [Qbg]


3.  The fundamental requirements and characteristics

   In this section, we make a summary of the fundamental requirements
   and characteristics made in [overlay-cp].

   Summary of requirements:

   o  Inner to Outer address mapping

   o  Underlying Network Multi-Destination Delivery Address(es)

   o  VN Connect/Disconnect Notification

   o  VN Name to VN-ID Mapping

   Summary of characteristics:

   o  As few local caching state as better

   o  Fast acquisition of needed state

   o  Fast detection/update of stale cached state information

   o  Minimize processing overhead

   o  Highly scalable

   o  Minimize the complexity of the implementation

   o  Extensible




Gu & Hao                Expires January 10, 2013                [Page 5]

Internet-Draft   NVO3 overlay control plane architecture       July 2012


   o  Simple protocol configuration

   o  Do not rely on IP Multicast

   o  Flexible mapping sources

3.1.  Assistance to NVE

   In this section, we describe the assistance to NVE as an addition to
   the requirements enumerated in the above section.  Meanwhile the
   additional requirements must satisfy the required characteristic.  We
   call it assistance, instead of control plane requirements, since the
   assistance can be achieved by a controller, or a database, which is
   not traditionally in concept of control plane.

   In following section, more than one options to enable these
   assistance are introduced.  No matter what kind of control plane
   components are finally adopted by the working, the assistance
   requirements must be satisfied.

3.1.1.  Assistance from TES

   In draft [tes-nve-mechanism], some requirements and possible
   mechanisms to enable the requirements are described.  These
   requirements are the assistance that TES can provides, maybe together
   with external entities, e.g. controllers or profile Database.  A
   summary is enumerated here.

   REQUIREMENT-1:  The TNP (TES to NVE notification mechanism and
       protocol) MUST support TES to notify NVE about the VM's status,
       including but not limited to Start up, Shut down, Emigration and
       Immigration.

   REQUIREMENT-2:  The TNP MUST support TES to notify NVE about the VM's
       VN Clue, which can be one identifier or a combination of several
       indentifier.

   REQUIREMENT-3:  The TNP MUST support TES to notify NVE about the VM's
       inner address.  The inner address MUST include one or both of MAC
       address of VM's virtual NIC and VM's IP address.  And it SHOULD
       be extensible to carry new address type.

   REQUIREMENT-4:  The TNP MUST support NVE to notify TES about the VM's
       local tag.  The local Tag type supported by TNP MUST include IEEE
       802.1Q tag.  And it SHOULD be extensible to carry other type of
       local tag.





Gu & Hao                Expires January 10, 2013                [Page 6]

Internet-Draft   NVO3 overlay control plane architecture       July 2012


   REQUIREMENT-5:  The TNP SHOULD support NVE to notify TES about the
       VM's traffic PCP value.

   The following sections are the assistance the NVE needs but can be
   provided by entities other than TES, e.g. by an external controller
   or a database.  These assistance requirements are complementarity to
   those introduced in . [overlay-cp]

3.2.  Access Control List

   While VAP identify the a new membership, be a VM or a physical
   server, NVE needs to get the Access Control List to the member.  The
   ACL maybe associate with a specific member or associate with a
   specific VNI.  If the ACL is associate with a specific VNI, NVE only
   needs to get the ACL at the first time the NVE is associate with the
   VNI.

   If the ACL changes, e.g. rules change or deleting, the assistance
   subject must be able to notify NVE to update the ACL.

   While the member migrates to a new NVE, the NVE must be able to get
   the ACL as soon as possible.

3.3.  QoS

   Similar to ACL, NVE needs to get the QoS policies while a new member
   is associated with the NVE.  In order to achieve QoS policies, not
   only the NVE but also the network devices on traffic path other than
   NVE need to be aware of the QoS policies.  But in the NVO3 working
   group, we only focus on NVE.

   While the member migrates to a new NVE, the NVE must be able to get
   the QoS policies as soon as possible.

3.4.  DHCP Snooping

   While DHCP Snooping function is enabled on NVE, a DHCP snooping table
   item is created by the access NVE.  While VM migrates to a new NVE,
   the VM may not resend a DHCP request since the migration is
   transparent to the VM and the IP address must be the same.  In this
   case, the new NVE must be able to get the DHCP Snooping information
   created by the original NVE by some way.  And the original NVE must
   be able to delete the DHCP Snooping information timely.

3.5.  NVE to VNI Registration

   While the first membership to a specific VNI is created on NVE, NVE
   need to register the association to an external entity.  The reason



Gu & Hao                Expires January 10, 2013                [Page 7]

Internet-Draft   NVO3 overlay control plane architecture       July 2012


   for this is to enable an a global view of which NVEs belongs to a
   specific VNI.  Every NVE must be aware of NVE to VNI mapping for
   multicast in a single VNI or to update the QoS/ACL policies.  For
   example, all NVEs responsible to at least one member belong to a
   particular VNI have to be notified of updated ACL or QoS policies
   related to this VNI.

3.6.  VNI to Multicast Addr Mapping

   NVE can get the inner to outer address mapping through control plane
   assistance or through data plane learning.  In the case of latter,
   NVE must be able to learn the VNI to Multicast address mapping in
   order to forward unknown unicast and broadcast traffic.

3.7.  Synchronization

   This assistance a general requirement.  For whatever information NVE
   get from external entity, while the origin of the information is
   changing, all relevant NVE who have local copy of the information
   must be able to synchronize with the origin.  Some examples of the
   information are ACL, QoS, Inner to Outer address mapping, VN Name to
   VNID mapping, and NVEs to VNI global view.


4.  Implementation Options and Architecture considerations

   The combination of requirements in Section 3 and Section 4 are the
   assistance that NVE need in order to fulfill the overlay forwarding
   in a way satisfying the characteristic in Section 3.  Not all of the
   assistance is necessarily regarded as requirements to an external
   controller.  In fact, there are more than one way to enable these
   requirements.  In this section, we introduce 2 kinds of assistance
   subject to enable the above requirements.  These should not be
   regarded as solution proposals, but considerations on overlay control
   plan components.

   In this draft, we only consider the situation where external NVE is
   embedded on network devices and VMs access to NVE via hypervisor.
   But for other cases, the mechanism introduced here can also be used,
   with necessary prune.

   Two assistance subjects are introduced, including external controller
   and centralized database.  It's not feasible to use only database,
   e.g. it's hard for database to synchronize mapping and QoS/ACL
   polices among all VNI-relevant NVEs.  But a centralized database can
   offload much work from controller.





Gu & Hao                Expires January 10, 2013                [Page 8]

Internet-Draft   NVO3 overlay control plane architecture       July 2012


4.1.  Exclusively using External Controller

   Only an external controller is used to assist NVE for virtualization
   network forwarding.  The controller might have a database on it or
   directly attached.
                         +------Control Protocol-----+
                         |                           |
                         |                           |
           +------------+---------+           +------------+
           | +----------+-------+ |           | Controller |-----+
           | |  Overlay Module  | |           +-------+----+     |
           | +---------+--------+ |                   | Database |
           |           |VN context|                   +----------+
           |           |          |
           |  +--------+-------+  |
           |  |     VNI        |  |
      NVE1 |  +-+------------+-+  |
           |    |   VAPs     |    |
           +----+------------+----+
                |            |
         -------+------------+-----
                |            |
                |            |
               Tenant End Systems

                  Fig1. Architecture with only controller

   The working flow is as follows.
   TES/VM         NVE                       Controller
     |--start up-->|
      or immigrate |<-get mappings and policies-->|
                     (VNID, inner to outer, etc)
               locally create          locally record
               caches                 NVE-VNI mapping

     |-data frame->|
                   |--encapsulation---->

     |-emigrate--->|
                   |--notify VM emigration-------->|
              locally update           locally update
              caches                  NVE-VNI mapping

                   |<-synch mappings and policies--|
              locally update
              caches

                Fig2. Work flow with controller assists NVE



Gu & Hao                Expires January 10, 2013                [Page 9]

Internet-Draft   NVO3 overlay control plane architecture       July 2012


4.2.  Hybrid of External Controller and Centralized Database

4.2.1.  Brief introduction of VDP profile database and work flow

   Take Profile Database introduced in IEEE 802.1Qbg as an example of
   the Centralized Database.  In IEEE 802.1Qbg, a database is mentioned
   on how to assist the VDP protocol.  It's not standardized in IEEE
   802.1Qbg, but is a fundamental knowledge while VDP is defined.
   Please refer to to find out the brief protocol introduction of VDP.
   The following figure shows what is profile database and how it works.
   [tes-nve-mechanism]
   +-------------------+
   | +----+  +-------+ |Step4 +---------+
   | | VM |--|Hyper- | |------| Bridge  |--------+
   | +----+  |Visor  | | VDP  +---------+        |
   |         +-------+ |        |       Database |
   +-------------------+        |       protocol |
          | Step3         Step5 |          +--------+
          |                     |      +---|Network |
   +-----------+  API    +---------+   |   |Admin   |
   |VM Manager |---------| Profile |---+   +--------+
   +-----------+  Step2  | Database| Step1
                         +---------+

                        Fig3. VDP Profile Database

   A profile database is a centralized database, which is used to store
   profile of VSI type and VM.  A VSI type is a set of policies or
   resource definition that can be shared by all VMs that choose to use
   this VSI type.  VSI type can be regarded as an instance of Virtual
   Network.  The profile is quite flexible, and it can be organized in a
   way shown in the following figure and include one or more of the
   following information.  There can be other kind of profile
   organization format.  The profile is very easy to extend to include
   more information.
















Gu & Hao                Expires January 10, 2013               [Page 10]

Internet-Draft   NVO3 overlay control plane architecture       July 2012


   +-------------------------------------------------------+
   |VSI type|Profile type   |        description           |
   +--------+---------------+------------------------------+
   |VN1     |Priority       | The priority of traffic      |
   |        |QoS            | QoS policies for the VSI type|
   |        |ACL            | ACL rules for the VSI type   |
   |        |Bandwidth      | Bandwidth of the traffic     |
   |        |Multicast Addr | The multicast addr for all   |
   |        |               | VMs belong to the VN         |
   |        |VNID           | A global unique ID for this  |
   |        |               | VN                           |
   +--------+---------------+------------------------------+
   |VN2     |Priority       | The priority of traffic      |
   |        |QoS            | QoS policies for the VSI type|
   |        |ACL            | ACL rules for the VSI type   |
   |        |Bandwidth      | Bandwidth of the traffic     |
   |        |Multicast Addr | The multicast addr for all   |
   |        |               | VMs belong to the VSI type   |
   |        |VNID           | A global unique ID for this  |
   |        |               | VN                           |
   +-------------------------------------------------------+

                    Fig4. Profile organization example

   A mapping between VSI type and VM is also managed on the database.
   +----------------------------------------------------------+
   |VSI type|VM list| Profile type|     description           |
   +--------+-------+-------------+---------------------------+
   |VN1     |VM1    |MAC Addr     | The MAC Addr of VM's vNIC.|
   |        |       |VID          | The VID to which the VM is|
   |        |       |             | associated.               |
   |        |       |Inner Addr   | The inner addr of the VM, |
   |        |       |             | which can be IPv4/v6 addr.|
   |        |       |Outer Addr   | The outer addr of the VM, |
   |        |       |             | which can be IPv4/v6 addr.|
   |        |-------+-------------+---------------------------+
   |        |VM2    |MAC Addr     | The MAC Addr of VM's vNIC.|
   |        |       |VID          | The VID to which the VM is|
   |        |       |             | associated.               |
   |        |       |Inner Addr   | The inner addr of the VM, |
   |        |       |             | which can be IPv4/v6 addr.|
   |        |       |Outer Addr   | The outer addr of the VM, |
   |        |       |             | which can be IPv4/v6 addr.|
   +--------+-------+-------------+---------------------------+

                       Fig5. VSI type to VM mapping

   The work flow of VDP with profile database is as follows.



Gu & Hao                Expires January 10, 2013               [Page 11]

Internet-Draft   NVO3 overlay control plane architecture       July 2012


   o  Step1: Network Administrator creates VSI type database.

   o  Step2: VM Manager query available VSI type and obtain a VSI type
      instance.

   o  Step3: VM Manager creat a VM on physical server and push VSI type
      information to Hypervisor

   o  Step4: While VM is in start up/shut down/emigrate/immigrate
      status, VDP messages are exchanged between hypervisor and bridge.

   o  Step5: Bridge retrieve VSI type information from profile database.

4.2.2.  Example Architecture and Work Flow

                         +------Control Protocol-----+
                         |                           |
                         |                           |
           +------------+---------+           +------------+
           | +----------+-------+ |           | Controller |
           | |  Overlay Module  | |           +------------+
           | +---------+--------+ |
           |           |VN context|
           |           |          |            +-----------+
           |  +--------+-------+  |  Database  | Profile   |
           |  |     VNI        |  |------------| Database  |
      NVE1 |  +-+------------+-+  |  API       +-----------+
           |    |   VAPs     |    |
           +----+------------+----+
                |            |
         -------+------------+-----
                |            |
                |            |
               Tenant End Systems

                        Fig6. Example architecture















Gu & Hao                Expires January 10, 2013               [Page 12]

Internet-Draft   NVO3 overlay control plane architecture       July 2012


   TES/VM         NVE                       database
   |--start up-->|
    or immigrate |<-get mappings and policies->|
                   (VNID, inner to outer, etc)
             locally create
             caches                              Controller
                 |--register NVE-VNI mapping-------->|
                                             locally update
                                            NVE-VNI mapping
   |-data frame->|
                 |--encapsulation---->

   |-emigrate--->|
                 |--notify VM emigration------------>|
            locally update                   locally update
            caches                          NVE-VNI mapping

                                               |-syn->|
                                      while mappings and/or
                                        policies is updated

                 |<-synch mappings and policies------|

                 |<-get mappings and policies->|
                   (VNID, inner to outer, etc)
            locally update
            caches

                          Fig7. Example work flow


5.  Summary

   Compared the mechanism in Sec 4.1 and 4.2, we can get the following
   results.  From architecture view, exclusive controller has simpler
   architecture with few interaction requirements, and simpler work
   flow.

   From performance view and reusing of existed protocols, hybird
   mechanism is able to offload the query of static information to
   database, which can optimize the performance of controller and make
   the system more extensible.


6.  Security Considerations

   TBA




Gu & Hao                Expires January 10, 2013               [Page 13]

Internet-Draft   NVO3 overlay control plane architecture       July 2012


7.  References

7.1.  Normative Reference

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", March 1997.

   [Qbg]      "IEEE P802.1Qbg Edge Virtual Bridging".

7.2.  Informative Reference

   [framework]
              Marc Lasserre, Marc., Balus, Florin., Morin, Thomas.,
              Bitar, Nabil., and Yakov. Rekhter,
              "draft-lasserre-nvo3-framework-02", June 2012.

   [overlay-cp]
              Kreeger, L., Dutt, D., Narten, T., Black, D., and M.
              Sridharan, "draft-kreeger-nvo3-overlay-cp-00", Jan 2012.

   [tes-nve-mechanism]
              Gu, Y., "The mechanism and protocol between TES and NVE to
              facilitate NVO3", July 2012.


Authors' Addresses

   Gu Yingjie
   Huawei
   No. 101 Software Avenue
   Nanjing, Jiangsu Province  210001
   P.R.China

   Phone: +86-25-56625392
   Email: guyingjie@huawei.com


   Weiguo Hao
   Huawei












Gu & Hao                Expires January 10, 2013               [Page 14]