Network Working Group                                              Y. Gu
Internet-Draft                                                     Y. Li
Intended status: Standards Track                                  Huawei
Expires: April 22, 2013                                     Oct 19, 2012


            The mechanism and signalling between TES and NVE
                   draft-gu-nvo3-tes-nve-mechanism-01

Abstract

   This draft introduces the interaction required between a TES and an
   NVE when the NVE is located in a box external to the TES.  The
   signaling between TES and NVE has to be designed carefully to reflect
   all the interaction requirements.  This document describes the
   relevant considerations for such a design and also provides a basic
   analysis of potentially reusable protocols.  Currently this draft
   focuses on the general interaction procedures, with their relevant
   parameters, and on signaling design considerations.  It may be
   extended with more detailed signalling design and/or solution
   recommendations in the future as NVO3's work progresses.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 22, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents



Gu & Li                  Expires April 22, 2013                 [Page 1]

Internet-Draft          NVO3 TES to NVE mechanism               Oct 2012


   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminologies and concepts . . . . . . . . . . . . . . . . . .  6
   3.  TES to NVE Interaction . . . . . . . . . . . . . . . . . . . .  9
     3.1.  Interaction Intentions . . . . . . . . . . . . . . . . . .  9
     3.2.  VM Lifetime Events . . . . . . . . . . . . . . . . . . . .  9
       3.2.1.  VM Creation  . . . . . . . . . . . . . . . . . . . . .  9
       3.2.2.  VM Pre-associate with NVE  . . . . . . . . . . . . . . 10
       3.2.3.  VM Associate with NVE  . . . . . . . . . . . . . . . . 10
       3.2.4.  VM Suspension  . . . . . . . . . . . . . . . . . . . . 10
       3.2.5.  VM Resume  . . . . . . . . . . . . . . . . . . . . . . 11
       3.2.6.  VM Migration . . . . . . . . . . . . . . . . . . . . . 11
       3.2.7.  VM Termination . . . . . . . . . . . . . . . . . . . . 11
       3.2.8.  VM Full Lifecycle Sketch . . . . . . . . . . . . . . . 11
     3.3.  Events, Interaction and Parameters . . . . . . . . . . . . 13
       3.3.1.  VM Pre-association . . . . . . . . . . . . . . . . . . 13
       3.3.2.  VM Association . . . . . . . . . . . . . . . . . . . . 14
       3.3.3.  VM Suspension  . . . . . . . . . . . . . . . . . . . . 15
       3.3.4.  VM Resume  . . . . . . . . . . . . . . . . . . . . . . 15
       3.3.5.  VM Emigration  . . . . . . . . . . . . . . . . . . . . 16
       3.3.6.  VM Immigration . . . . . . . . . . . . . . . . . . . . 16
       3.3.7.  VM Termination . . . . . . . . . . . . . . . . . . . . 17
       3.3.8.  Keep-alive . . . . . . . . . . . . . . . . . . . . . . 17
       3.3.9.  NVE Local Changes  . . . . . . . . . . . . . . . . . . 18
     3.4.  Signalling Design Considerations . . . . . . . . . . . . . 18
       3.4.1.  General Requirements . . . . . . . . . . . . . . . . . 18
       3.4.2.  Consideration  . . . . . . . . . . . . . . . . . . . . 19
       3.4.3.  Signalling States Machine  . . . . . . . . . . . . . . 19
   4.  Security Considerations  . . . . . . . . . . . . . . . . . . . 20
   5.  Appendix 1: Mechanism Analysis . . . . . . . . . . . . . . . . 20
     5.1.  IEEE 802.1Qbg  . . . . . . . . . . . . . . . . . . . . . . 20
       5.1.1.  Brief Introduction . . . . . . . . . . . . . . . . . . 21
     5.2.  BGP  . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
     5.3.  External Controller  . . . . . . . . . . . . . . . . . . . 23
   6.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 23
     6.1.  Normative Reference  . . . . . . . . . . . . . . . . . . . 23
     6.2.  Informative Reference  . . . . . . . . . . . . . . . . . . 23
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 24

1.  Introduction

   A Tenant End System (TES) is the physical host where a tenant deploys
   its applications.  A tenant's applications can be deployed directly
   on a physical server or on a virtual machine residing on a physical
   server.  A tenant's virtual network, or virtual data center, is an
   overlay network that is built on the underlying network but is
   logically independent of it.  A Network Virtualization Edge (NVE)
   implements the virtualization functions that encapsulate and
   decapsulate a tenant's packets; these functions allow for L2 and/or
   L3 tenant separation and for hiding tenant addressing information
   (MAC and IP addresses).  A Tenant End System attaches to an NVE node
   either directly or via a switched network (typically Ethernet).  TES
   and NVE can be on the same physical server or on separate devices.
   Fig1 to Fig3 show the different NVE location cases.  When TES and NVE
   are on the same physical server, they interact via a proprietary
   internal interface that does not require a standard signaling
   protocol.  That scenario is therefore not the target of this
   document.  For all other scenarios, as long as the signaling between
   TES and NVE is visible to network developers, it is in the scope of
   this draft.  We examined the different locations of the NVE to make
   sure that the signaling interaction between NVE and TES covers as
   many scenarios as possible.

   o  (NVE Location 1) NVE and TES are co-located in a physical server.
      The VM connects to the NVE on the Hypervisor.  In this case, some
      mechanism is needed to let the Hypervisor know of VM changes,
      including addition, deletion and migration.  The VM and the
      Hypervisor, as well as any network service appliance, are
      controlled by the VM Manager.  The VM Manager is aware of every VM
      identity and event, hence it can easily notify the NVE through
      some internal interface.  A publicly available standard protocol
      is not necessary in this case.  Refer to Fig1.

           +-------------+------------+
           |  +--------------------+  |
           |  |  +--------------+  |  |
           |  |  |Overlay Module|  |  |
           |  |  +----+---------+  |  |
           |  |        | VN context|  |
           |  |  +-----+-------+   |  |
           |  |  |      VNI    |   |  |
           |  |  +-+---------+-+   |  |
           |  |    |  VAPs   |     |  |
           |  +----+---------+-----+  |
           |       |         |        |
           |    +--+---------+---+    |
           |    |       VM       |    |
           |    +----------------+    |
           |                          |
           +--------------------------+
               Tenant End Systems

                                   Figure 1

   o  (NVE Location 2) TES connects to NVE on an external network entity
      next to it.  The VM is controlled by the VM Manager, while the NVE
      is controlled by some other management entity such as a network
      management system.  Hence a proprietary protocol between TES and
      NVE may not fit all the scenarios.  A standard protocol to signal
      between TES and NVE is mandatory in this case.  Refer to Fig2.

                        +------- L3 Network --------+
                        |                           |
                        |       Tunnel Overlay      |
           +------------+---------+       +---------+------------+
           | +----------+-------+ |       | +---------+--------+ |
           | |  Overlay Module  | |       | |  Overlay Module  | |
           | +---------+--------+ |       | +---------+--------+ |
           |           |VN context|       | VN context|          |
           |           |          |       |           |          |
           |  +--------+-------+  |       |  +--------+-------+  |
           |  |     VNI        |  |       |  |       VNI      |  |
      NVE1 |  +-+------------+-+  |       |  +-+-----------+--+  | NVE2
           |    |   VAPs     |    |       |    |    VAPs   |     |
           +----+------------+----+       +----+-----------+-----+
                |            |                 |           |
         -------+------------+-----------------+-----------+-------
                |            |     Tenant      |           |
                |            |   Service IF    |           |
           +----+------------+--------+    +---+-----------+-------+
           |   +----------------+     |    |  +---------------+    |
           |   |   Hypervisor   |     |    |  |  Hypervisor   |    |
           |   +--------+-------+     |    |  +-------+-------+    |
           |            |             |    |          |            |
           |    +-------+------+      |    |   +------+------+     |
           |    |       VM     |      |    |   |     VM      |     |
           |    +--------------+      |    |   +-------------+     |
           |                          |    |                       |
           +--------------------------+    +-----------------------+

               Tenant End Systems            Tenant End Systems

       Figure 2: NVE Location 2: VM connects to NVE on external network
                                    entity

   o  (NVE Location 3) TES and NVE are indirectly connected.  Refer to
      Fig3.

                      +------- L3 Network ------+
                      |                         |
                      |       Tunnel Overlay    |
         +------------+--------+       +--------+------------+
         | +----------+------+ |       | +------+----------+ |
         | | Overlay Module  | |       | | Overlay Module  | |
         | +--------+--------+ |       | +--------+--------+ |
         |          |VN Context|       |          |VN Context|
         |          |          |       |          |          |
         |  +-------+-------+  |       |   +------+-------+  |
         |  |    VNI        |  |       |   |     VNI      |  |
    NVE1 |  +-+-----------+-+  |       |   +-+----------+-+  | NVE2
         |    |   VAPs    |    |       |    |   VAPs    |    |
         +----+-----------+----+       +----+-----------+----+   /\
              |           |                 |           |        |
            ...................           ...................    |
       -----: switched network:           : switched network:    |signalling
            ...................           ...................    |
              |           |     Tenant      |           |        |
              |           |   Service IF    |           |        \/
            Tenant End Systems            Tenant End Systems

          Figure 3: Reference model when TES and NVE are indirectly
                                  connected

   In the mailing list discussion, more than one mechanism for use
   between TES and NVE was discussed, including VDP (VSI Discovery and
   Configuration Protocol), BGP and others.  This draft does not assert
   which protocol is better.  We believe that each candidate protocol
   can, with some revision or updating, be used to exchange the
   necessary events and information between TES and NVE.  The final
   decision on which one to use depends not only on functionality but
   also on other aspects, e.g. how lightweight it is to implement on a
   server, how widely it is deployed in the industry, and its efficiency
   and performance.

   This draft first presents the recommended procedures for TES and NVE
   signalling, the key parameters of each step, and the issues that need
   to be addressed.  Then a set of signaling design considerations is
   provided, which can be used as design requirements for a future
   signalling definition.  In the appendix, we give a brief analysis of
   two existing protocols and also show how they can be revised to adapt
   to TES and NVE signaling.


2.  Terminologies and concepts

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The document uses terms defined in [framework].

   VN: Virtual Network.  This is a virtual L2 or L3 domain that belongs
   to a tenant.

   VNI: Virtual Network Instance.  This is one instance of a virtual
   overlay network.  Two Virtual Networks are isolated from one another
   and may use overlapping addresses.

   Virtual Network Context or VN Context: Field that is part of the
   overlay encapsulation header which allows the encapsulated frame to
   be delivered to the appropriate virtual network endpoint by the
   egress NVE.  The egress NVE uses this field to determine the
   appropriate virtual network context in which to process the packet.
   This field MAY be an explicit, unique (to the administrative domain)
   virtual network identifier (VNID) or MAY express the necessary
   context information in other ways (e.g. a locally significant
   identifier).

   VNID: Virtual Network Identifier.  In the case where the VN context
   has global significance, this is the ID value that is carried in each
   data packet in the overlay encapsulation that identifies the Virtual
   Network the packet belongs to.

   NVE: Network Virtualization Edge.  It is a network entity that sits
   on the edge of the NVO3 network.  It implements network
   virtualization functions that allow for L2 and/or L3 tenant
   separation and for hiding tenant addressing information (MAC and IP
   addresses).  An NVE could be implemented as part of a virtual switch
   within a hypervisor, a physical switch or router, a Network Service
   Appliance or even be embedded within an End Station.

   Underlay or Underlying Network: This is the network that provides the
   connectivity between NVEs.  The Underlying Network can be completely
   unaware of the overlay packets.  Addresses within the Underlying
   Network are also referred to as "outer addresses" because they exist
   in the outer encapsulation.  The Underlying Network can use a
   completely different protocol (and address family) from that of the
   overlay.

   Data Center (DC): A physical complex housing physical servers,
   network switches and routers, Network Service Appliances and
   networked storage.  The purpose of a Data Center is to provide
   application and/or compute and/or storage services.  One such service
   is virtualized data center services, also known as Infrastructure as
   a Service.

   VM: Virtual Machine.  Several Virtual Machines can share the
   resources of a single physical computer server using the services of
   a Hypervisor (see the definition below).

   Hypervisor: Server virtualization software running on a physical
   compute server that hosts Virtual Machines.  The hypervisor provides
   shared compute/memory/storage and network connectivity to the VMs
   that it hosts.  Hypervisors often embed a Virtual Switch (see below).

   Virtual Switch: A function within a Hypervisor (typically implemented
   in software) that provides similar services to a physical Ethernet
   switch.  It switches Ethernet frames between VMs' virtual NICs within
   the same physical server, or between a VM and a physical NIC card
   connecting the server to a physical Ethernet switch.  It also
   enforces network isolation between VMs that should not communicate
   with each other.

   Tenant: A customer who consumes virtualized data center services
   offered by a cloud service provider.  A single tenant may consume one
   or more Virtual Data Centers hosted by the same cloud service
   provider.

   Tenant End System: It defines an end system of a particular tenant,
   which can be for instance a virtual machine (VM), a non-virtualized
   server, or a physical appliance.

   Virtual Access Points (VAPs): Tenant End Systems are connected to the
   Tenant Instance through Virtual Access Points (VAPs).  The VAPs can
   be in reality physical ports on a ToR or virtual ports identified
   through logical interface identifiers (VLANs, internal VSwitch
   Interface ID leading to a VM).

   VN Name: A globally unique name for a VN.  The VN Name is not carried
   in data packets originating from End Stations, but must be mapped
   into an appropriate VN-ID for a particular encapsulating technology.
   Using VN Names rather than VN-IDs to identify VNs in configuration
   files and control protocols increases the portability of a VDC and
   its associated VNs when moving among different administrative domains
   (e.g. switching to a different cloud service provider).

   VSI: Virtual Station Interface.  Typically, a VSI is a virtual NIC
   connected directly with a VM.  [Qbg]

3.  TES to NVE Interaction

3.1.  Interaction Intentions

   When the TES is a non-virtualized physical server, a single physical
   interface on the NVE is attached exclusively to a single tenant and
   the attachment doesn't change very frequently.  In this case, the NVE
   can be pre-configured with the tenant's network properties and
   policies to execute the appropriate packet processing.  When a
   physical server moves, which means that the server changes its
   attachment point to the network, the new NVE to which the server is
   going to attach in the new location can also be pre-configured.  In
   this case, there is no need for signalling between TES and NVE.

   When the TES is a virtualized server with multiple VMs, interaction
   between TES and NVE becomes necessary.  A physical interface on the
   NVE can be attached to multiple VMs, which could belong to the same
   or to different tenants, and VMs can be moved to new locations
   without a physical shutdown, which means that the NVE cannot learn of
   a VM's attachment and/or detachment by checking the physical port.
   As described in [framework], the NVE needs to establish a Virtual
   Network Instance for each tenant virtual network attached to it
   through a physical interface, so the NVE must be able to know which
   tenants are attached to it and which VMs belong to each tenant.
   The NVE must therefore be able to 1) identify and distinguish VMs
   attached to the NVE through the same physical interface; 2) identify
   which tenant each VM belongs to; 3) get the network policies that are
   associated with the tenant.  That is why interaction signalling
   between TES and NVE is needed.  Of course, the signalling between TES
   and NVE is not limited to the above intentions.  When looking into
   the detailed processing of VM events, we will find more signalling
   functionality and processing on TES and NVE.
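
   As an illustration only, and not as part of any protocol definition,
   the three capabilities listed above can be pictured as a lookup table
   held by the NVE and keyed by physical interface and MAC address.  The
   names AttachmentTable and Attachment, and the field layout, are
   assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Attachment:
    """What the NVE must know about one VM interface (hypothetical layout)."""
    vm_id: str                      # identifies the VM behind the port
    vnid: int                       # tenant virtual network the MAC belongs to
    policies: list = field(default_factory=list)  # e.g. ACL/QoS entries


class AttachmentTable:
    """Maps (physical port, MAC) to the VM, tenant network and policies."""

    def __init__(self):
        self._entries = {}

    def learn(self, port, mac, attachment):
        # Populated by signalling, since the port state alone reveals nothing.
        self._entries[(port, mac.lower())] = attachment

    def lookup(self, port, mac):
        # Returns None when the NVE has not been told about this VM.
        return self._entries.get((port, mac.lower()))

    def tenants_on_port(self, port):
        # VNIDs for which a VNI must exist because of VMs on this port.
        return {a.vnid for (p, _), a in self._entries.items() if p == port}
```

   Keying on the (port, MAC) pair reflects the point that several VMs,
   possibly of different tenants, can share one physical interface.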

3.2.  VM Lifetime Events

   Not every VM has to pass through all the listed VM lifetime events.
   A VM experiences at least two of the following events, in some
   combination.

3.2.1.  VM Creation

   The VM Manager instructs the hypervisor to schedule resources on the
   server for a particular VM, including CPU, Memory, Storage and
   Network resources.  After the VM is created on the server, the VM has
   the necessary resources and is ready to be launched.  The creation of
   a VM doesn't necessarily mean that the VM is running.  The VM can be
   created but left unlaunched for as long as the manager would like, or
   it can be created and launched at once.  Launching a VM is just like
   starting up a physical computer.

   Though VM creation is a very important event for a VM, the attached
   NVE needn't be aware of this event.

3.2.2.  VM Pre-associate with NVE

   The VM Manager can decide when to launch a VM and connect the VM to
   the network.  Before the VM connects to the network, the operator
   needs to provision the VM's network properties and policies on the
   NVE that the VM is attached to.  Examples of network properties are
   the VM MAC address and the tenant virtual network identifier.
   Examples of policies are ACL and QoS.  These properties and policies
   are not activated on the NVE until the VM Manager instructs the VM to
   connect to the network.  This is called Pre-association.
   Pre-association is an optional event.

3.2.3.  VM Associate with NVE

   This event means that the VM is going to connect to the network.  The
   NVE has to get the VM's network properties and policies, assign
   resources and install these properties and policies.  If there is a
   Pre-association before Association, the NVE can reduce the time
   needed for Association.  Once the VM is associated, it can use
   network resources as a physical server does.

   Association can happen with or without Pre-association.  If there is
   a Pre-association before Association, the NVE already has the network
   properties and policies stored, or even installed.  If the network
   properties and policies in the Association message are the same as
   those in the Pre-association, the NVE can activate the installed
   network properties and policies.  If they are different, the
   previously reserved resources should be released and the new network
   properties and policies installed and activated.
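
   The decision described above can be sketched as follows.  This is a
   minimal illustration under stated assumptions: the Profile and
   VmEntry types, the associate() function and its return strings are
   hypothetical, and only the Pre-association cases are modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    """Network properties and policies carried in a message (assumed shape)."""
    vnid: int
    macs: frozenset
    policies: frozenset


@dataclass
class VmEntry:
    profile: Profile
    active: bool = False   # False while only pre-associated


def associate(entries: dict, vm_id: str, requested: Profile) -> str:
    """Apply an Association for vm_id and report which path was taken."""
    entry: Optional[VmEntry] = entries.get(vm_id)
    if entry is None:
        # No Pre-association: install and activate in one step.
        entries[vm_id] = VmEntry(requested, active=True)
        return "installed-and-activated"
    if entry.profile == requested:
        # Pre-associated with identical data: only activation remains.
        entry.active = True
        return "activated-preinstalled"
    # Data changed: drop the old reservation, then install the new one.
    entries[vm_id] = VmEntry(requested, active=True)
    return "replaced-and-activated"
```

   The middle branch is the one that shortens Association when a
   matching Pre-association has already been processed.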

3.2.4.  VM Suspension

   Creating and terminating a VM may take a considerable amount of time.
   Instead of performing these operations, operators can suspend a
   virtual machine for the required time and quickly resume it later.
   Suspending a VM is similar to putting a real computer into sleep
   mode.  When a VM is suspended, its current state (including the state
   of all applications and processes running in the VM) is stored.  When
   the suspended virtual machine is resumed, it continues operating from
   the point at which it was suspended.

3.2.5.  VM Resume

   This event activates a suspended VM.  The suspended applications
   start again from the state in which the VM was suspended.  It is not
   always predictable when a suspended VM will be resumed.

3.2.6.  VM Migration

   There are two kinds of VM migration: hot migration (or live
   migration) and offline migration.  The processing of offline
   migration is similar to terminating the VM on one server and creating
   it on another server.  The applications running on the VM are
   interrupted and then restarted at the new location.  In live
   migration, the VM is migrated from one location to another while it
   is running, and the running applications should not be visibly
   disrupted.  There is no termination or creation during live
   migration, so it is highly important to make the NVE aware of the
   migration so that the corresponding network properties and policies
   can be correctly obtained, installed and activated at the new
   location, and removed from the old location.  Otherwise, there might
   be a security risk, and running applications may be affected or even
   interrupted.

   There are two sub-types of VM migration event: VM emigration and VM
   immigration.

   o  VM Emigrating: The VM is emigrating from this server.  Hence, all
      the relevant resources on the server and the attached NVE are
      disabled, but not removed yet, and are ready to be removed once
      the VM is successfully migrated.  If the VM fails to immigrate at
      the new location, the VM has to be resumed at the old location
      using the states and policies that the old NVE had disabled.

   o  VM Immigrating: The VM is immigrating to this server.  The server
      and the attached NVE have prepared the necessary resources and are
      ready to enable the VM's properties and policies once the VM is
      successfully migrated.
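
   The two sub-events can be sketched as per-NVE bookkeeping, shown
   below as a minimal illustration.  The MigrationTracker class and the
   PolicyState values are hypothetical names, and releasing the prepared
   resources after a failed immigration is an assumption of this sketch
   rather than a stated requirement.

```python
from enum import Enum


class PolicyState(Enum):
    ENABLED = "enabled"
    DISABLED = "disabled"   # kept on the NVE but not applied
    PREPARED = "prepared"   # installed on the target NVE, not yet applied


class MigrationTracker:
    """Tracks per-VM policy state on one NVE during live migration."""

    def __init__(self):
        self.state = {}

    def start_emigration(self, vm_id):
        # Old NVE: disable but keep, so a failed migration can fall back.
        self.state[vm_id] = PolicyState.DISABLED

    def finish_emigration(self, vm_id, success):
        if success:
            self.state.pop(vm_id, None)              # now safe to remove
        else:
            self.state[vm_id] = PolicyState.ENABLED  # VM resumes here

    def start_immigration(self, vm_id):
        # New NVE: prepare resources ahead of the VM's arrival.
        self.state[vm_id] = PolicyState.PREPARED

    def finish_immigration(self, vm_id, success):
        if success:
            self.state[vm_id] = PolicyState.ENABLED
        else:
            self.state.pop(vm_id, None)              # assumed clean-up
```

   Keeping the disabled entry on the old NVE until the outcome is known
   is what makes the fallback to the old location possible.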

3.2.7.  VM Termination

   All applications and processing on the VM are terminated.  All the
   VM's resources on the server, including CPU, Memory, Storage and
   network resources, are released.  The VM no longer exists.

3.2.8.  VM Full Lifecycle Sketch

   Not every VM has to pass through all the lifetime events enumerated
   above.  The simplest VM life has only VM Creation, VM Association
   with NVE and VM Termination.  The most complex VM life has all the
   events listed above.  In this section, we show a sketch of a VM's
   full lifecycle with all the listed events.  This is helpful for
   future signalling design.

 /~~~~~~~~~~~~\             /~~~~~\
 |VM Terminate|--Aged out-->|NULL |
 \~~~~~~~~~~~~/             \~~~~~/
      ^                        |
 VM Terminate                  v
      |                 /~~~~~~~~~~~\
      +-----------------|VM Creation|<---------.
      |                 \~~~~~~~~~~~/          |
      |                       |              Fail
      |                       v                |
      |              /~~~~~~~~~~~~~~~~\        |
      +--------------|VM Pre-Associate|--------.
      |              |with NVE        |<-------.
      |              \~~~~~~~~~~~~~~~~/        |
      |                       |              Fail
      |                       v                |
      +----------------/~~~~~~~~~~~~~\<--------|-----------------.
      |   .----------->|VM Associate |---------.                 |
      |   |            |with NVE     |<--------.                 |
      |   |            \~~~~~~~~~~~~~/         |  Successful Immigration
      |VM Resume         | or | or  |          |     to this server
      |   |              |    .---. .---.      |                 |
      |   |              v        |     |      |           /~~~~~~~~~~~~~~\
      +---|-----/~~~~~~~~~~~~~\   |     .------|---------->|VM Immigrating|
      |   .-----|VM Suspension|   |            |           \~~~~~~~~~~~~~~/
      |         \~~~~~~~~~~~~~/   |            |                  |
      |                           |  Failed Immigration           |
      |                           |   to other server             |
      |                           v            |                  |
      |                    /~~~~~~~~~~~~~\     |     Failed Immigration
      +--------------------|VM Emigrating|-----.       to this server
      |                    \~~~~~~~~~~~~~/                        |
      |                           |                               |
      |          Successful Immigration to other server           |
      |                           |                               |
      +---------------------------.                               |
      |                                                           |
      +-----------------------------------------------------------.

                    Figure 4: VM Full Lifecycle Sketch
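
   One possible reading of Figure 4 as a transition table is sketched
   below.  The state and event names are illustrative, the direct edge
   from creation to association is taken from the statement that
   Pre-association is optional, and the edge from association to
   immigration is reproduced as drawn.  Other readings of the figure
   are possible.

```python
from enum import Enum, auto


class S(Enum):
    NULL = auto()
    CREATED = auto()
    PRE_ASSOCIATED = auto()
    ASSOCIATED = auto()
    SUSPENDED = auto()
    EMIGRATING = auto()
    IMMIGRATING = auto()
    TERMINATED = auto()


# (state, event) -> next state; event names are illustrative.
TRANSITIONS = {
    (S.NULL, "create"): S.CREATED,
    (S.CREATED, "pre_associate"): S.PRE_ASSOCIATED,
    (S.CREATED, "associate"): S.ASSOCIATED,       # Pre-association optional
    (S.PRE_ASSOCIATED, "associate"): S.ASSOCIATED,
    (S.PRE_ASSOCIATED, "fail"): S.CREATED,
    (S.ASSOCIATED, "fail"): S.PRE_ASSOCIATED,
    (S.ASSOCIATED, "suspend"): S.SUSPENDED,
    (S.SUSPENDED, "resume"): S.ASSOCIATED,
    (S.ASSOCIATED, "emigrate"): S.EMIGRATING,
    (S.EMIGRATING, "migration_failed"): S.ASSOCIATED,
    (S.EMIGRATING, "migration_succeeded"): S.TERMINATED,
    (S.ASSOCIATED, "immigrate"): S.IMMIGRATING,   # as drawn in Figure 4
    (S.IMMIGRATING, "migration_succeeded"): S.ASSOCIATED,
    (S.IMMIGRATING, "migration_failed"): S.TERMINATED,
    (S.TERMINATED, "aged_out"): S.NULL,
}
# "terminate" is accepted from the states on the left-hand rail.
for _s in (S.CREATED, S.PRE_ASSOCIATED, S.ASSOCIATED, S.SUSPENDED):
    TRANSITIONS[(_s, "terminate")] = S.TERMINATED


def step(state, event):
    """Return the next state, or raise ValueError for an undefined move."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not defined in {state.name}") from None


def run(events, state=S.NULL):
    """Apply a sequence of events starting from the given state."""
    for event in events:
        state = step(state, event)
    return state
```

   For instance, run(["create", "associate", "terminate"]) walks the
   simplest life described in this section and ends in S.TERMINATED.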

3.3.  Events, Interaction and Parameters

   In this section, we describe the interaction, parameters and special
   concerns for each VM event.  The interaction is strongly related to
   the VM lifetime events, but the mapping is not one-to-one; for
   example, there is no interaction for VM Creation.  For VM events, the
   interaction is initiated by the hypervisor on behalf of a VM and sent
   to the VNI on the attached NVE.  This is not always the case,
   however, since the NVE may also initiate interaction if changes
   happen on the NVE that must be learned by particular VMs.

3.3.1.  VM Pre-association

   o  Interaction: This event triggers the Hypervisor to compose a
      Pre-association message and send it to the NVE.  On receiving the
      Pre-association message, the NVE needs to authorize the VM and/or
      Hypervisor, obtain the VM's network properties and policies, and
      install the properties and policies on the NVE.

   o  Parameters: The signalling from TES to NVE should at least include
      the following mandatory parameters.

      *  Operation, i.e.  Pre-association.

      *  VMID, a globally unique ID in the Data Center for a VM.  A VM
         can have more than one MAC address and belong to more than one
         VNID, so a VMID is necessary for the NVE to associate the VNIDs
         and MACs with the particular VM.

      *  VNID(s), a globally unique ID in the Data Center for a tenant's
         virtual network.

      *  MAC addresses.  A VM may have more than one MAC address, and
         may also belong to more than one virtual network, so the MAC
         address(es) and VNID(s) should be presented in a way that lets
         the NVE identify which MAC addresses belong to which VNID.

      *  Policies, including ACL, QoS, Priority, etc.  When more than
         one VNID is associated with the VM, each policy should
         explicitly indicate the VNID to which it belongs.

   o  Response: After the NVE processes the Pre-association message, it
      responds to the TES with the processing result.  The response can
      be SUCCESS, or FAIL with an indicated reason such as FAILED
      AUTHORIZATION, CONFLICT POLICIES (e.g. the provisioned policies
      conflict with other existing policies on the NVE), or
      NON-SUFFICIENT RESOURCES (e.g. the NVE does not have enough
      resources to install the provisioned policies).
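
   A minimal sketch of the Pre-association exchange follows, assuming a
   message layout that groups MAC addresses and policies under the VNID
   they belong to.  The type names (PreAssociation, VnBinding, Result),
   the handler signature and the resource model based on free slots are
   all hypothetical; no encoding is defined by this draft.

```python
from dataclasses import dataclass, field
from enum import Enum


class Result(Enum):
    SUCCESS = "SUCCESS"
    FAILED_AUTHORIZATION = "FAILED AUTHORIZATION"
    CONFLICT_POLICIES = "CONFLICT POLICIES"
    NON_SUFFICIENT_RESOURCES = "NON-SUFFICIENT RESOURCES"


@dataclass
class VnBinding:
    """MACs and policies scoped to one VNID, so ownership is unambiguous."""
    vnid: int
    macs: list
    policies: list = field(default_factory=list)


@dataclass
class PreAssociation:
    vm_id: str
    bindings: list                  # one VnBinding per virtual network
    operation: str = "Pre-association"


def handle_pre_association(msg, authorized_vms, free_slots, installed):
    """Toy NVE handler: authorize, check resources, then store the bindings.

    Policy conflict detection is omitted from this sketch.
    """
    if msg.vm_id not in authorized_vms:
        return Result.FAILED_AUTHORIZATION
    if len(msg.bindings) > free_slots:
        return Result.NON_SUFFICIENT_RESOURCES
    installed[msg.vm_id] = msg.bindings   # stored, not yet activated
    return Result.SUCCESS
```

   Nesting MACs and policies inside each VnBinding is one way to meet
   the requirement that the NVE can tell which MAC addresses and which
   policies belong to which VNID.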

3.3.2.  VM Association

   o  Interaction: This event will trigger Hypervisor to compose an
      Association message, and then Hypervisor sends the message to NVE.
      Association can happen with or without a Pre-association message.

      *  If there is a Pre-association message before Association, NVE
         needs to compare the information provided by Pre-association
         and Association.  If they are the same, NVE can activate the
         pre-installed resources.  If they are different, NVE needs to
         do some additional work depending on what information has
         changed from Pre-association to Association.  For example, if
         a policy or VNID has changed, NVE needs to update its stored
         information.

      *  If there is no Pre-association message before Association, NVE
         needs to do authorization, obtain the VM's network properties
         and policies, and install and activate the properties and
         policies on NVE.

      *  If there is another successful Association message before this
         Association, NVE needs to compare the information provided by
         the previously provisioned Association and this Association.
         If everything is the same, NVE does nothing except update the
         VM's timer.  If there is any difference, NVE needs to do some
         additional work, depending on what information has changed.
         For example, if policies or VNID have changed, NVE needs to
         update its stored information.

   o  Parameters: The signalling from TES to NVE should at least include
      the following mandatory parameters.

      *  Operation, i.e.  Association.

      *  VMID

      *  VNID(s)

      *  MAC addresses

      *  Policies

   o  Response: After NVE processes the Association message, it responds
      to TES with the processing result.  The response can be SUCCESS or
      FAIL with an indicated reason such as FAILED AUTHORIZATION,
      CONFLICT POLICIES (e.g. the provisioned policies conflict with
      other existing policies on NVE), or NON-SUFFICIENT RESOURCES (e.g.
      the NVE does not have enough resources to install the provisioned
      policies).
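
   The three Association cases above can be sketched as a single
   handler.  This is only an illustration under stated assumptions: the
   table layout, the function name handle_association and the returned
   action strings are hypothetical, and authorization is omitted for
   brevity.

```python
def handle_association(table, vmid, props, now):
    """Process an Association for `vmid` on a hypothetical NVE.

    `table` maps VMID -> {"props": dict, "active": bool, "last_seen": float}.
    Returns a label describing what the NVE did (labels are illustrative).
    """
    entry = table.get(vmid)
    if entry is None:
        # No Pre-association seen: install and activate in one step.
        table[vmid] = {"props": props, "active": True, "last_seen": now}
        return "installed"

    # An entry exists, either pre-installed (inactive) or already associated.
    action = "refreshed" if entry["active"] else "activated"
    if entry["props"] != props:
        # Something changed since the earlier message, e.g. policy or VNID.
        entry["props"] = props
        action = "updated"
    entry["active"] = True
    entry["last_seen"] = now  # restart the VM's timer
    return action
```

   A repeated, identical Association therefore only restarts the VM's
   timer, while a changed one replaces the stored properties.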




3.3.3.  VM Suspension

   o  Interaction: This event will trigger Hypervisor to compose a
      Suspension message or an Association message with Suspension
      indication, and then Hypervisor sends the message to NVE.
      Suspension must happen after a successful Association.  On
      receiving a Suspension message, NVE deactivates, but does not
      remove, the VM's resources and prepares for the next Resume
      message.  In the Suspension state, NVE acts similarly to how it
      acts in the Pre-association state.  The FDB can be aged out during
      VM suspension.

   o  Parameters: The signalling from TES to NVE should at least include
      the following mandatory parameters.

      *  Operation, i.e.  Suspension or an Association message with
         Suspension indication

      *  VMID

   o  Response: After NVE processes the Suspension message, it responds
      to TES with the processing result.  The response can be SUCCESS or
      FAIL.  If it's FAIL, it may be because the NVE is too busy to
      process the message.

3.3.4.  VM Resume

   o  Interaction: This event will trigger Hypervisor to compose a
      Resume message or an Association message with Resume indication,
      and then Hypervisor sends the message to NVE.  Resume is supposed
      to happen after a successful Suspension message; otherwise, NVE
      responds with a SUCCESS message and takes no further action on
      the message.  On receiving a Resume message, NVE activates the
      VM's resources.

   o  Parameters: The signalling from TES to NVE should at least include
      the following mandatory parameters.

      *  Operation, i.e.  Resume or an Association message with Resume
         indication

      *  VMID

   o  Response: After NVE processes the Resume message, it responds to
      TES with the processing result.  The response can be SUCCESS or
      FAIL.  If it's FAIL, it may be because the NVE is too busy to
      process the message.
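
   The Suspension and Resume handling described in Sections 3.3.3 and
   3.3.4 can be sketched as below.  The table layout and function names
   are hypothetical.  Returning FAIL for a Suspension that does not
   follow a successful Association is an assumption of this sketch; the
   text only requires that Suspension happen after Association.

```python
def suspend(table, vmid):
    """Deactivate, but keep, the VM's resources on a hypothetical NVE."""
    entry = table.get(vmid)
    if entry is None or entry["state"] != "associated":
        # Assumption: a Suspension without prior Association is rejected.
        return "FAIL"
    entry["state"] = "suspended"  # resources stay installed
    return "SUCCESS"


def resume(table, vmid):
    """Reactivate a suspended VM; otherwise answer SUCCESS and do nothing."""
    entry = table.get(vmid)
    if entry is not None and entry["state"] == "suspended":
        entry["state"] = "associated"
    return "SUCCESS"
```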





3.3.5.  VM Emigration

   o  Interaction: This event will trigger Hypervisor to compose an
      Emigration message or an Association message with Emigration
      indication, and then Hypervisor sends the message to NVE.
      Emigration can happen after Pre-association, Association,
      Suspension or Resume.

   o  On receiving a VM Emigration message or indication, NVE
      deactivates the VM's resources.  However, NVE does not
      immediately remove the VM's resources and state, because an
      emigration may fail if the immigration on the remote server or
      NVE fails.  In that case, the emigrating VM may need to continue
      its work on the current server.  NVE will wait for a subsequent
      Termination message before removing the VM's resources and state
      on NVE.

   o  Parameters: The signalling from TES to NVE should at least include
      the following mandatory parameters.

      *  Operation, i.e.  Emigration or an Association message with
         Emigration indication.

      *  VMID

   o  Response: After NVE processes the VM Emigration message, it
      responds to TES with the processing result.  The response can be
      SUCCESS or FAIL.  If it's FAIL, it may be because the NVE is too
      busy to process the message.
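
   The reason for keeping state until Termination can be illustrated
   with a small sketch.  All names here are hypothetical.  Modelling
   the "continue on the current server" case as a fresh Association
   (reassociate) and returning FAIL for an unknown VMID are assumptions
   of this sketch, not behaviour specified by this draft.

```python
def emigrate(table, vmid):
    """Deactivate the VM's resources but retain them for a possible rollback."""
    entry = table.get(vmid)
    if entry is None:
        return "FAIL"  # assumption: unknown VMID is rejected
    entry["active"] = False
    return "SUCCESS"


def reassociate(table, vmid):
    """Assumed rollback path: migration failed, VM keeps running here."""
    table[vmid]["active"] = True
    return "SUCCESS"


def terminate(table, vmid):
    """Remove all resources and state for the VM."""
    table.pop(vmid, None)
    return "SUCCESS"
```

   Because emigrate() leaves the entry in place, a failed migration can
   be recovered without re-provisioning policies from scratch.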

3.3.6.  VM Immigration

   o  Interaction: This event will trigger Hypervisor to compose an
      Immigration message, or a Pre-association/Association message
      with Immigration indication, referred to as Immigration(Pre-asso)
      and Immigration(Asso).  NVE's reaction to VM Immigration is
      similar to its reaction to Pre-association or Association.  If the
      result of Immigration processing is FAIL, the VM will not migrate
      to the new location and will continue its work on the old server.
      The VM Manager may have to find another new location for the VM
      to migrate to.

   o  It is meaningful to distinguish Immigration from Pre-association
      and Association.  [statemigration-framework] shows the problem of
      a VM's flow-coupled state migration in the case of VM live
      migration.  The Immigration message can be an indication or
      trigger for the flow-coupled state migration on middleboxes.

   o  Parameters: The signalling from TES to NVE should at least include
      the following mandatory parameters.




      *  Operation, i.e.  Immigration or a (Pre-)Association message
         with Immigration indication.

      *  VMID

      *  VNID(s)

      *  MAC addresses

      *  Policies

   o  Response: After NVE processes the Immigration message, it responds
      to TES with the processing result.  The response can be SUCCESS or
      FAIL with an indicated reason such as FAILED AUTHORIZATION,
      CONFLICT POLICIES (e.g. the provisioned policies conflict with
      other existing policies on NVE), or NON-SUFFICIENT RESOURCES (e.g.
      the NVE does not have enough resources to install the provisioned
      policies).

3.3.7.  VM Termination

   o  Interaction: This event will trigger Hypervisor to compose a
      Termination message.  NVE will release the VM's resources on NVE
      and remove all state about this VM.

   o  Parameters: The signalling from TES to NVE should at least include
      the following mandatory parameters.

      *  Operation, i.e.  Termination

      *  VMID

   o  Response: After NVE processes the Termination message, it responds
      to TES with the processing result.  The response can be SUCCESS or
      FAIL.  If it's FAIL, it may be because NVE is too busy to process
      the Termination message; however, the VM can be terminated on the
      server anyway.

3.3.8.  Keep-alive

   This is not a VM lifetime event.  Since the resources on NVE are
   precious, if an associated, pre-associated or suspended VM stays
   idle for a pre-defined time, NVE will remove the VM's resources, so
   that NVE can serve other active VMs.  In order to keep the VM's
   resources on NVE, Hypervisor has to send a Keep-alive message, or a
   Pre-association/Association message with Keep-alive indication.  NVE
   will update the VM's timer upon receiving the Keep-alive message.

   Parameters: The signalling from TES to NVE should at least include
   the following mandatory parameters.

   o  Operation, i.e.  Keep-alive or a (Pre-)Association message with
      Keep-alive indication.

   o  VMID
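
   The idle timer described above can be sketched as a simple aging
   table.  The class and method names are hypothetical, timestamps are
   plain numbers supplied by the caller, and the choice to expire an
   entry once the idle time reaches the timeout is an assumption of
   this sketch.

```python
class AgingTable:
    """Tracks the last time each VMID was refreshed on a hypothetical NVE."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.last_seen = {}

    def touch(self, vmid, now):
        """Called on Keep-alive or any message that restarts the VM's timer."""
        self.last_seen[vmid] = now

    def age_out(self, now):
        """Remove and return the VMIDs that have been idle for too long."""
        expired = sorted(
            vmid
            for vmid, seen in self.last_seen.items()
            if now - seen >= self.idle_timeout
        )
        for vmid in expired:
            del self.last_seen[vmid]  # here the NVE would free resources
        return expired
```

   A Hypervisor that wants to keep an idle VM's resources installed
   would therefore send Keep-alive at an interval shorter than the
   timeout.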

3.3.9.  NVE Local Changes

   When a VM associates with a VNID on NVE, NVE will generate locally
   significant indicators for the VM and VNIDs, e.g. VID.  If the
   indicators were sent to Hypervisor in a previous response and they
   change later on, NVE needs to create an Association message or a
   dedicated message with the changed indicators and send it to
   Hypervisor, and Hypervisor will respond with the processing result.

   Note: Although we use the names of VM lifetime events as the names
   of messages in this section, it does not mean that there should be a
   dedicated message for each event in the future signalling.  Some of
   the events can be carried in one signalled message with different
   operation types, for example, an Association message with
   Immigration indication or an Association message with Suspension
   indication.

3.4.  Signalling Design Considerations

3.4.1.  General Requirements

3.4.1.1.  Basic Requirements

   REQUIREMENT-1:  The TNS (TES to NVE Signalling) MUST support TES to
       notify NVE about the VM's events, including but not limited to
       Pre-Association, Association, Emigration, Immigration and
       Termination.

   REQUIREMENT-2:  The TNS MUST support TES to notify NVE about the VM's
       VNID, which can be one identifier or a combination of several
       identifiers.

   REQUIREMENT-3:  The TNS MUST support TES to notify NVE about the VM's
       address.  The address MUST include one or both of the MAC address
       of the VM's virtual NIC and the VM's IP address.  It SHOULD be
       extensible to carry new address types.

   REQUIREMENT-4:  The TNS MUST support NVE to notify TES about the VM's
       local tag.  The local tag types supported by TNS MUST include the
       IEEE 802.1Q tag.  It SHOULD be extensible to carry other types
       of local tag.




3.4.1.2.  Extension Requirements

   REQUIREMENT-5:  The TNS SHOULD support NVE to notify TES about the
       VM's traffic PCP value.

   In a typical DC, where a physical server connects to an adjacent
   bridge, the data frame from the server can be tagged with PCP or
   untagged.  If a data frame is untagged, it can be tagged with PCP on
   the adjacent bridge.  In a virtualized DC, the adjacent bridge is the
   Hypervisor.  There are three options to deal with the PCP tag: 1) the
   data frame is tagged with PCP by the VM, 2) the data frame is tagged
   with PCP by the Hypervisor, and 3) the data frame is tagged with PCP
   by NVE.

   In a cloud service, the VM can belong to anybody and it may want a
   higher priority than it should have.  The VM can tag its data frames
   with a higher PCP value and get better service.  Based on the
   assumption that the PCP provided by the VM is not reliable, it is
   more reasonable to let the network define the PCP value based on the
   VM's priority, and enable bridges to tag the PCP value, as in 2) or
   3).

   This problem is similar to that of the local VID, which can be tagged
   either by Hypervisor or by NVE.  The benefit of tagging PCP at the
   Hypervisor is that it reduces the load on NVE.
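
   Options 2) and 3) amount to overwriting the PCP bits of the IEEE
   802.1Q tag with a value chosen by the network.  The sketch below
   shows that bit manipulation on the 16-bit Tag Control Information
   (TCI) field, whose layout is PCP (3 bits), DEI (1 bit) and VID (12
   bits).  The function names are hypothetical, and how the network
   selects the PCP value for a VM is outside the sketch.

```python
def get_pcp(tci):
    """Extract the 3-bit PCP from a 16-bit 802.1Q TCI value."""
    return (tci >> 13) & 0x7


def set_pcp(tci, pcp):
    """Return `tci` with its PCP replaced, leaving DEI and VID untouched."""
    if not 0 <= pcp <= 7:
        raise ValueError("PCP must fit in 3 bits")
    return (tci & 0x1FFF) | (pcp << 13)
```

   For example, a frame that a VM tagged with PCP 7 on VID 100 can be
   rewritten by the Hypervisor or NVE to the PCP value assigned to that
   VM without disturbing the VID.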

3.4.2.  Consideration

   To be added.

3.4.3.  Signalling State Machine

   The interaction should be stateful.  Both Hypervisor and NVE need to
   record their signalling state.  The main states are Pre-association,
   Association, Suspension, and Termination.  The following diagram
   shows the state machine of TES to NVE signalling.  Only reasonable
   situations are listed in the diagram.  In the future, more
   situations will be added to the state machine.















           |------------------->/```\----------------------|
           |                    \~~~/                      |
           |                      |Pre-Asso                |
           |                      |or                      |
           |                      |Immigration(Pre-Asso)   |
     /~~~~~~~~~~~\  Aged out      v                        |
     |Termination|<----|  /~~~~~~~~~~~~~~~~\              Asso
     \~~~~~~~~~~~/<-\  ---|Pre-Association |               or
          ^          \    \~~~~~~~~~~~~~~~~/        Immigration(Asso)
          |           \           |                        |
      Aged out    Aged out        |Asso                    |
          or          or          |or                      |
     Termination   Termination    |Immigration(Asso)       |
          |              \----|   v                        |
    /~~~~~~~~~~~\Suspension/~~~~~~~~~~~~~\                 |
    |Suspension |<---------| Association |<----------------|
    \~~~~~~~~~~~/--------->\~~~~~~~~~~~~~/
                  Resume      /        ^
                             /          \
    /~~~\                   |            |
    \~~~/  States           |-Emigration-|
                                 or
                            Immigration(Asso)
    ------ Message

               Figure 5: TES to NVE signalling State Machine
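
   One possible reading of Figure 5 as a transition table is sketched
   below.  The unlabeled state at the top of the figure is called
   "Init" here, which is a name introduced for this sketch only, and
   the arrow from Termination back to that state is not modelled.
   Rejecting any event that has no arrow in the figure is likewise an
   assumption of the sketch.

```python
INIT = "Init"  # hypothetical name for the unlabeled start state
PRE = "Pre-Association"
ASSO = "Association"
SUSP = "Suspension"
TERM = "Termination"

# (current state, message or event) -> next state, as read from Figure 5.
TRANSITIONS = {
    (INIT, "Pre-Asso"): PRE,
    (INIT, "Immigration(Pre-Asso)"): PRE,
    (INIT, "Asso"): ASSO,
    (INIT, "Immigration(Asso)"): ASSO,
    (PRE, "Asso"): ASSO,
    (PRE, "Immigration(Asso)"): ASSO,
    (PRE, "Aged out"): TERM,
    (ASSO, "Suspension"): SUSP,
    (ASSO, "Emigration"): ASSO,
    (ASSO, "Immigration(Asso)"): ASSO,
    (ASSO, "Aged out"): TERM,
    (ASSO, "Termination"): TERM,
    (SUSP, "Resume"): ASSO,
    (SUSP, "Aged out"): TERM,
    (SUSP, "Termination"): TERM,
}


def step(state, event):
    """Return the next state, or raise ValueError if the figure has no arrow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state} on {event}") from None


def run(events, state=INIT):
    """Apply a sequence of events starting from `state`."""
    for event in events:
        state = step(state, event)
    return state
```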


4.  Security Considerations

   There are some considerations on security in [overlay-cp].  Most of
   the considerations are about the mechanism between NVE and an
   external controller, and attacks on underlying networks, which
   cannot be resolved by the mechanism between TES and NVE alone.  One
   security issue related to the mechanism between TES and NVE is the
   authentication of a VM that announces its association with a
   particular VN.  There is a hypervisor between VMs and NVEs, and
   neither VMs nor hypervisors are always reliable.  For example, a
   compromised hypervisor may modify the VN Name, or an identifier
   serving a similar purpose, in order to associate with a VN that it
   doesn't belong to.


5.  Appendix 1: Mechanism Analysis

5.1.  IEEE 802.1Qbg






5.1.1.  Brief Introduction

   VDP has four basic TLV types.

   o  Pre-Associate: Pre-Associate is used to pre-associate a VSI
      instance with a bridge port.  The bridge validates the request and
      returns a failure Status in case of errors.  Successful pre-
      association does not imply that the indicated VSI Type will be
      applied to any traffic flowing through the VSI.  The pre-associate
      enables faster response to an associate, by allowing the bridge to
      obtain the VSI Type prior to an association.

   o  Pre-Associate with resource reservation: Pre-Associate with
      Resource Reservation involves the same steps as Pre-Associate, but
      on successful pre-association also reserves resources in the
      Bridge to prepare for a subsequent Associate request.

   o  Associate: The Associate TLV Type creates and activates an
      association between a VSI instance and a bridge port.  The Bridge
      allocates any required bridge resources for the referenced VSI.
      The Bridge activates the configuration for the VSI Type ID.  This
      association is then applied to the traffic flow to/from the VSI
      instance.

   o  Deassociate: The de-associate TLV Type is used to remove an
      association between a VSI instance and a bridge port.  Pre-
      Associated and Associated VSIs can be de-associated.  De-associate
      releases any resources that were reserved as a result of prior
      Associate or Pre-Associate operations for that VSI instance.

|1       |2        |3       |4       |7       |8     |9      |25         |26         |25+M
|--------+---------+--------+--------+--------+------+-------+-----------+-----------|
|TLV type|TLV info | Status |VSI Type|VSI Type|VSIID |VSIID  |Filter Info|Filter Info|
|(7bits) |strlength|(1octet)|  ID    |version |format|(16oct)|   format  |(M octets) |
|        | (9bits) |        |(3oct)  |(1oct)  |(1oct)|       |  (1 octet)|           |
|--------+---------+--------+--------+--------+------+-------+-----------+-----------|
|                           |<------VSI type&instance------->|<-------Filter-------->|
|                           |<--------------------VSI attributes-------------------->|
|<---TLV header--->|<-------------TLV information string = 23+M octets-------------->|

                       Figure 6: VDP TLV definitions
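
   As a rough illustration of the layout in Figure 6, the sketch below
   packs the fields in the order shown.  It follows the figure only and
   has not been checked against the IEEE 802.1Qbg text.  The function
   name is hypothetical, big-endian (network) byte order is assumed,
   and the numeric field values used in any example are placeholders
   rather than code points assigned by the standard.

```python
import struct


def pack_vdp_tlv(tlv_type, status, vsi_type_id, vsi_type_version,
                 vsiid_format, vsiid, filter_format, filter_info):
    """Build a TLV following Figure 6: 2-octet header + (23 + M) octets."""
    if len(vsiid) != 16:
        raise ValueError("VSIID must be 16 octets")
    info = (
        bytes([status])
        + vsi_type_id.to_bytes(3, "big")           # VSI Type ID, 3 octets
        + bytes([vsi_type_version, vsiid_format])  # 1 octet each
        + vsiid                                    # 16 octets
        + bytes([filter_format])                   # 1 octet
        + filter_info                              # M octets
    )
    if not 0 <= tlv_type < 2 ** 7 or len(info) >= 2 ** 9:
        raise ValueError("type must fit in 7 bits and length in 9 bits")
    header = (tlv_type << 9) | len(info)  # 7-bit type, 9-bit length
    return struct.pack("!H", header) + info
```

   With a 4-octet filter info field, the information string is 27
   octets long and the whole TLV occupies 29 octets.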

   Some important flag values in VDP request:

   o  M-bit (Bit 5): Indicates that the user of the VSI (e.g., the VM)
      is migrating (M-bit = 1) or provides no guidance on the migration
      of the user of the VSI (M-bit = 0).  The M-bit is used as an
      indicator relative to the VSI that the user is migrating to.



   o  S-bit (Bit 6): Indicates that the VSI user (e.g., the VM) is
      suspended (S-bit = 1) or provides no guidance as to whether the
      user of the VSI is suspended (S-bit = 0).  A keep-alive Associate
      request with S-bit = 1 can be sent when the VSI user is suspended.
      The S-bit is used as an indicator relative to the VSI that the
      user is migrating from.

   The filter information field supports the following formats:

   o  VID
   +---------+------+-------+--------+
   | #of     | PS   | PCP   | VID    |
   |entries  |(1bit)|(3bits)|(12bits)|
   |(2octets)|      |       |        |
   +---------+------+-------+--------+
             |<--Repeated per entry->|

                                   Figure 7

   o  MAC/VID
   +---------+--------------+------+-------+--------+
   | #of     |  MAC address | PS   | PCP   | VID    |
   |entries  |  (6 octets)  |(1bit)|(3bits)|(12bits)|
   |(2octets)|              |      |       |        |
   +---------+--------------+------+-------+--------+
             |<--------Repeated per entry---------->|

                                   Figure 8

   o  GroupID/VID
   +---------+--------------+------+-------+--------+
   | #of     |  GroupID     | PS   | PCP   | VID    |
   |entries  |  (4 octets)  |(1bit)|(3bits)|(12bits)|
   |(2octets)|              |      |       |        |
   +---------+--------------+------+-------+--------+
             |<--------Repeated per entry---------->|

                                   Figure 9

   o  GroupID/MAC/VID
   +---------+-----------+-------------+------+-------+--------+
   | #of     | GroupID   | MAC address | PS   | PCP   | VID    |
   |entries  |(4 octets) | (6 octets)  |(1bit)|(3bits)|(12bits)|
   |(2octets)|           |             |      |       |        |
   +---------+-----------+-------------+------+-------+--------+
             |<--------------Repeated per entry--------------->|

                                  Figure 10



   In each format, the null VID can be used in the VDP Request.  In this
   case, the Bridge is expected to supply the corresponding local VID
   value in the VDP Response.
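
   The MAC/VID format of Figure 8 can be sketched as follows.  The
   function name is hypothetical.  The sketch assumes big-endian byte
   order, assumes that PS occupies the most significant bit of the
   2-octet PS/PCP/VID group as drawn in the figure, and assumes that
   the null VID is encoded as 0, as in IEEE 802.1Q.

```python
import struct

NULL_VID = 0  # assumption: null VID as defined by IEEE 802.1Q


def pack_mac_vid_filter(entries):
    """Pack [(mac, ps, pcp, vid), ...] as a 2-octet count plus 8-octet entries."""
    out = struct.pack("!H", len(entries))
    for mac, ps, pcp, vid in entries:
        if len(mac) != 6:
            raise ValueError("MAC address must be 6 octets")
        if ps not in (0, 1) or not 0 <= pcp <= 7 or not 0 <= vid <= 0xFFF:
            raise ValueError("PS, PCP or VID out of range")
        # PS (1 bit), PCP (3 bits), VID (12 bits) share one 16-bit word.
        out += mac + struct.pack("!H", (ps << 15) | (pcp << 12) | vid)
    return out
```

   A request that leaves the VID choice to the Bridge would carry
   NULL_VID in the entry and read the assigned local VID from the
   corresponding field of the response.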

   The VSIID in a VDP request, which identifies a VM, can be in one of
   the following formats: IPv4 address, IPv6 address, MAC address, UUID
   or locally defined.

   +--------------------------------------------------+----------------+
   | VDP features                                     | Requirements   |
   |                                                  | Matching       |
   +--------------------------------------------------+----------------+
   | Pre-Associate/ Pre-Associate with resource       | Requirement-1  |
   | reservation/ Associate/ Deassociate              |                |
   | M-bit/S-bit                                      | Requirement-1  |
   | VSI type&instance in VDP request                 | Requirement-2  |
   | Filter Info                                      | Requirement-3  |
   | VID info in VDP response                         | Requirement-4  |
   | PCP in VDP response                              | Requirement-5  |
   +--------------------------------------------------+----------------+

                               VDP TLV types

5.2.  BGP

   [server2nve] gives a brief analysis of how BGP can be reused for TES
   and NVE signalling.  Please refer to it for more information.

5.3.  External Controller


6.  References

6.1.  Normative Reference

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [Qbg]      "IEEE P802.1Qbg Edge Virtual Bridging".

6.2.  Informative Reference

   [framework]
              Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y.
              Rekhter, "draft-ietf-nvo3-framework-00", September 2012.

   [overlay-cp]
              Kreeger, L., Dutt, D., Narten, T., Black, D., and M.
              Sridharan, "draft-kreeger-nvo3-overlay-cp-00", Jan 2012.

   [server2nve]
              Kompella, K.,
              "draft-dunbar-nvo3-overlay-mobility-issues-00", July 2012.

   [statemigration-framework]
              Gu, Y., Shore, M., and S. Sivakumar, "A Framework and
              Problem Statement for Flow-associated Middlebox State
              Migration", October 2012.


Authors' Addresses

   Gu Yingjie
   Huawei
   No. 101 Software Avenue
   Nanjing, Jiangsu Province  210001
   P.R.China

   Phone: +86-25-56625392
   Email: guyingjie@huawei.com


   Yizhou Li
   Huawei
   No. 101 Software Avenue
   Nanjing, Jiangsu Province  210001
   P.R.China

   Phone:
   Email: liyizhou@huawei.com

















