Network Working Group                                  Bhumip Khasnabish
Internet-Draft                                             ZTE USA, Inc.
Intended status: Informational                                   Bin Liu
Expires: July 3, 2013                                    ZTE Corporation
                                                              Baohua Lei
                                                               Feng Wang
                                                           China Telecom
                                                            Dec 30, 2012


  Mobility and Interconnection of Virtual Machines and Virtual Network
                                Elements
                 draft-khasnabish-vmmi-problems-03.txt

Abstract

   In this draft, we discuss the challenges and requirements related to
   the migration, mobility, and interconnection of Virtual Machines
   (VMs) and Virtual Network Elements (VNEs).  A VM migration scheme
   that works across IP subnets is needed in order to share virtual
   computing resources across multiple network administrative domains.
   Many technologies are involved in VM migration across DCs.  These
   technologies are classified and discussed according to their
   locations in the inter-DC and intra-DC network.  For seamless online
   migration in various scenarios, many problems need to be resolved in
   the control plane, and the VM migration process should be adapted
   accordingly.  We also describe the limitations of various types of
   virtual local area network (VLAN) technologies and virtual private
   networking (VPN) technologies that are traditionally expected to
   support such migration, mobility, and interconnection.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 3, 2013.




Bhumip Khasnabish, et al.  Expires July 3, 2013                 [Page 1]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.1.  Conventions Used in this Document  . . . . . . . . . . . .  4
   2.  Terminology and Concepts . . . . . . . . . . . . . . . . . . .  4
   3.  Control & Mobility Related Problem Specifications  . . . . . .  6
     3.1.  Summarization of Mobility in Virtualized Environments  . .  7
     3.2.  VM Migration Mobility Problems across IP Subnets/WAN . . .  7
       3.2.1.  IP Tunnel Problems . . . . . . . . . . . . . . . . . .  9
       3.2.2.  IP Allocation Strategy Problems  . . . . . . . . . . . 10
       3.2.3.  Routing Synchronization Strategy Problems  . . . . . . 12
       3.2.4.  The Migration Protocol State Machine of VM Online
               Migration across Subnets . . . . . . . . . . . . . . . 12
       3.2.5.  Resource Gateway Problems  . . . . . . . . . . . . . . 13
       3.2.6.  Optimized Location of Default Gateway  . . . . . . . . 13
       3.2.7.  Other Problems . . . . . . . . . . . . . . . . . . . . 13
     3.3.  VM Mobility Problems Implemented on the VR Device  . . . . 13
     3.4.  Security and Authentication of VMMI  . . . . . . . . . . . 13
     3.5.  The Virtual Network Model  . . . . . . . . . . . . . . . . 14
     3.6.  The Processing Flow  . . . . . . . . . . . . . . . . . . . 14
     3.7.  The NVE/OBP Location Problems  . . . . . . . . . . . . . . 15
       3.7.1.  NVE/OBP on the Server  . . . . . . . . . . . . . . . . 16
       3.7.2.  NVE/OBP on the ToR . . . . . . . . . . . . . . . . . . 17
       3.7.3.  Hybrid Scenario  . . . . . . . . . . . . . . . . . . . 19
   4.  Technology Problems involved in the VM Migration . . . . . . . 19
     4.1.  TRILL  . . . . . . . . . . . . . . . . . . . . . . . . . . 20
     4.2.  SPB  . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
     4.3.  Data Center Interconnection Fabric Related Problems  . . . 20
     4.4.  Types and Applications of VPN Interconnections between
           DCs which Provide DCI  . . . . . . . . . . . . . . . . . . 21
       4.4.1.  Types of VPNs  . . . . . . . . . . . . . . . . . . . . 21
       4.4.2.  Applications of L2VPN in DCs . . . . . . . . . . . . . 21



Bhumip Khasnabish, et al.  Expires July 3, 2013                 [Page 2]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


       4.4.3.  Applications of L3VPN in DCs . . . . . . . . . . . . . 22
     4.5.  The Actual Number of Available Isolated Domains  . . . . . 22
     4.6.  Problems of the Number of Management
           Devices/Management Domains of DC in the NVO3 Network . . . 22
     4.7.  Limitation of TCAM capacity  . . . . . . . . . . . . . . . 23
     4.8.  SDN  . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
     4.9.  LISP . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
   5.  Other Problems Related to IETF and IEEE  . . . . . . . . . . . 23
     5.1.  Review of VXLAN, NVGRE, and NVO3 . . . . . . . . . . . . . 23
     5.2.  The East-West Traffic Problem  . . . . . . . . . . . . . . 25
     5.3.  The MAC, IP, and ARP Explosion Problems  . . . . . . . . . 26
     5.4.  Suppressing Flooding within a VLAN . . . . . . . . . . . . 27
     5.5.  Packet Encapsulation Problems  . . . . . . . . . . . . . . 27
   6.  Acknowledgement  . . . . . . . . . . . . . . . . . . . . . . . 27
   7.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 27
   8.  Security Considerations  . . . . . . . . . . . . . . . . . . . 28
   9.  IANA Consideration . . . . . . . . . . . . . . . . . . . . . . 28
   10. Normative References . . . . . . . . . . . . . . . . . . . . . 28
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 28
































Bhumip Khasnabish, et al.  Expires July 3, 2013                 [Page 3]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


1.  Introduction

   There are many challenges related to VM migration and the
   interconnection of VMs among two or more data centers (DCs).  The
   technologies used for VM migration and DC interconnection should
   support the required levels of performance, security, and
   scalability, along with simplicity and cost-effective management,
   operations, and maintenance.

   In this draft, the issues and requirements for moving the virtual
   machines are summarized with reference to the necessary conditions
   for migration, business needs, state classification, security, and
   efficiency.

   In this draft, the requirements for VMMI technologies that are
   useful on large-scale Layer-2 networks and on segmented IP
   networks/WANs are discussed.  A VM migration scheme that works
   across IP subnets/WANs is therefore needed in order to share virtual
   computing resources across multiple network administrative domains.
   This will make a wider range of VM migration possible and allow
   migration of VMs to different types of DCs.  It can be adapted to
   different types of physical networks, different network topologies,
   and various protocols.  For seamless online migration in these
   scenarios, a very intelligent VM migration orchestration is needed
   in the control plane.  We summarize the requirements of virtual
   networks for VM migration, virtual networking, and operations in
   DCI/overlay modes.

1.1.  Conventions Used in this Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].


2.  Terminology and Concepts

   o  ARP: Address Resolution Protocol

   o  DC: Data Center

   o  DC GW: Data Center Gateway

   o  DCI: Data Center Interconnection

   o  DCS: Data Center Switch





Bhumip Khasnabish, et al.  Expires July 3, 2013                 [Page 4]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   o  FDB: Forwarding DataBase

   o  HPC: High-Performance Computing

   o  IDC: Internet Data Center

   o  IP: Internet Protocol

   o  IP VPN: Layer 3 VPN, defined in L3VPN working group

   o  LISP: Locator ID Separation Protocol

   o  NVO3: Network Virtualization Overlays (Over Layer-3)

   o  OBP: Overlay network boundary point

   o  OTV: Overlay Transport Virtualization

   o  PBB: Provider Backbone Bridge

   o  PM: Physical Machine

   o  QoS: Quality of Service

   o  STP: Spanning Tree Protocol

   o  TNI: Tenant Network Identifier

   o  ToR: Top of the Rack

   o  TRILL: Transparent Interconnection of Lots of Links

   o  VLAN: Virtual Local Area Network

   o  VM: Virtual Machine

   o  VMMI: Virtual Machine Mobility and Interconnection

   o  VN: Virtual Network

   o  VNI: Virtual Network Identifier

   o  VNE: Virtual Network Entity (a virtualized layer-3/network entity
      with associated virtualized ports and virtualized processing
      capabilities)

   o  VPN: Virtual Private Network




Bhumip Khasnabish, et al.  Expires July 3, 2013                 [Page 5]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   o  VPLS: Virtual Private LAN Service

   o  VR: Virtual Router (a logical device that emulates the
      capabilities of a physical router in software, using the hardware
      layer's resources and capabilities)

   o  VRRP: Virtual Router Redundancy Protocol

   o  VSE: Virtual Switching Entity (a virtualized layer-2/switch
      entity with associated virtualized ports and virtualized
      processing capabilities)

   o  VSw: Virtual Switch

   o  WAN: Wide Area Network

   o  communication agent: NVO3 communication agent (an entity that
      forwards traffic between NVO3 and non-NVO3 environments)


3.  Control & Mobility Related Problem Specifications

   Overall, the requirements of VM migration bring the following
   challenges to the forefront of data center operations and
   management:

   (A) How can the existing technologies be made compatible with each
   other, including the various multi-tenant network technologies,
   network interface technologies between VMs and ToRs, access-layer
   network technologies, and networking technologies within the DC and
   across DCs, so that they can work seamlessly with overlay network
   technologies?  Such compatibility would allow a large number of
   individual tenant networks to be accommodated in the DC, enable
   communication between the isolated domains, and allow VMs to be
   migrated online seamlessly within the management domain.

   (B) When VMs are migrated from one DC to another within one
   administrative domain, (i) how to ensure that the necessary
   conditions for migration are satisfied, (ii) how to ensure that a
   successful migration occurs without service disruption, and (iii)
   how to ensure a successful rollback when any unforeseen problem
   occurs in the migration process.

   (C) When VMs are migrated from one administrative domain to another,
   how can seamless communication between the domains be achieved?
   There are several different solutions (such as VXLAN, NVGRE, etc.)
   for the current Layer-2 (L2) based DC interconnection technology,
   and each can solve different problems in different scenarios.  If
   the packet encapsulation mapping rules of the different solutions
   can be unified, it is bound to promote



Bhumip Khasnabish, et al.  Expires July 3, 2013                 [Page 6]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   seamless migration of VMs among DCs along with the desired
   integration in cloud computing and networking.

   (D) How can IP-based technologies be utilized to accomplish the
   migration of VMs over a Layer-3 (L3) network?  For example, VPN
   technology can be used to carry L2 and L3 traffic across the IP/MPLS
   core network.

   We discuss the above in more detail in the following sections.  A
   related draft [DCN Ops Req] discusses data center network and
   operations requirements.

3.1.  Summarization of Mobility in Virtualized Environments

   Mobility refers to the movement of a VM from one server to another
   server within one DC or across DCs, while maintaining the VM's
   original IP and MAC address throughout the process.  When a VM is
   migrated to a new location, it should maintain the existing client
   sessions, and the state of the VM sessions should be copied to the
   new location.  VM mobility does not change the VLAN/subnet connection
   to the VM, and it requires that the serving VLAN is extended to the
   new location of the VM.

   In order to support VM mobility, it is required to allow VMs to be
   migrated easily and repeatedly -- that is as often as needed by the
   applications and services -- among a large (more than two) number of
   DCs.  Seamless migration of VMs in mixed IPv4 and IPv6 VPN
   environments should be supported by using appropriate DC GWs.

   Some widely used VM migration tools require that management programs
   on the source server and destination server are directly connected
   via an L2 network.  The objective is to facilitate the implementation
   of smooth VM migration.

   The participating source server and destination server in the VM
   migration process may be located in different DCs.  It may be
   required to extend the Layer-2 network beyond what is covered by the
   L2 network of the source DC.  This may create islands of the same
   VLAN in different (geographically dispersed) DCs.

   In addition, optimal forwarding in a VLAN that supports VM mobility
   may involve traffic management across multiple DCs.  Seamless
   mobility of VMs across DCs may not always achieve optimal intra-VLAN
   forwarding and routing.

3.2.  VM Migration Mobility Problems across IP Subnets/WAN

   There are many existing, implementable solutions for migrating VMs
   within a traditional LAN (a non-large-scale Layer-2 network).  These



Bhumip Khasnabish, et al.  Expires July 3, 2013                 [Page 7]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   solutions include Xen, KVM, and VMware, which all implement VM image
   file sharing based on NFS, so that only CPU and memory state needs
   to be migrated.  These are Layer-2 VM migration technologies.  The
   advantage of this implementation is that the VM's IP addresses do
   not need to change after the migration.  With the development and
   popularization of DCs and virtualization technologies, the number of
   servers and the network environment in a single LAN will limit the
   scalability of the virtual computing environment.

   In addition, when re-configuring VLANs in the traditional DC
   network, STP (MSTP) can lead to VLAN isolation.  This is a very
   serious problem in the DC network, especially in the storage
   network, because the storage network demands uninterrupted service.

   With the evolution of new technologies, such as various
   virtualization technologies, large-scale Layer-2 technology, DCI
   technology, and overlay network technology, the network environment
   for VM migration will become more complex.  In order to realize
   virtual computing resource sharing and online VM migration across
   multiple management domains using these technologies as the
   foundation, a VM migration scheme based on NVO3 network technology
   is needed to adapt to the complex network environment.  This will
   make a wider range of VM migration possible and can allow migration
   of VMs to different types of DCs.  It can be adapted to different
   types of physical networks, different network topologies, and
   various protocols.

   For example, in the process of VM migration in a DC, there are
   scenarios in which a VM in a traditional three-tier topology is
   migrated through the WAN to a Fat-Tree topology, or to a variety of
   other topologies.  For seamless online migration in these scenarios,
   a very intelligent VM online migration mechanism needs to be
   implemented in the control plane.

   If VM migration is only implemented in the L2 domain, people are
   concerned about expanding the number of VLANs or isolated domains,
   such as the roughly 16,000,000 isolated domains in PBB.

   Limitless and seamless VM online migration across overlay-based IP
   subnets means that the following issues need to be addressed in
   order to achieve our goal of creating a true virtual network
   environment that is separated from the physical network:

   o  Isolation domain mapping rules.

   o  Migration across IP subnets.




Bhumip Khasnabish, et al.  Expires July 3, 2013                 [Page 8]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   o  VM migration in the overlay network needs to be adapted to the
      heterogeneous network topology.

   o  How does the source network environment (i.e., the network
      between the VM or its server and the connected ToR) adapt its
      configuration to the destination network environment?

   o  Network redirection technology, IP-in-IP technology, and dynamic
      IP tunnel configuration will be used to allow online VM migration
      across subnets.

   A module that allocates IP addresses to VMs is needed; it can manage
   each IP allocation in the virtual network.  The IP allocations
   should not conflict with each other and should keep the path cost of
   routing and forwarding as small as possible.  It is necessary to
   know the DC network topology, its routing protocols, and real-time
   results of the path cost in order to realize the minimum path cost.
   The network topologies of different DCs are not necessarily the
   same; for example, the network topologies and routing protocols of a
   traditional DC and a Fat-Tree DC are different.  The addition of
   related protocol processing in the control plane is needed for
   seamless VM migration between them; otherwise, online VM migration
   cannot be implemented across DCs or across IP subnets.  The scheme
   of IP-in-IP tunneling resolves the contradiction between the
   unchanged IP addresses during the VM migration and the changed IP
   addresses when VMs are migrated across IP subnets.  Therefore, the
   VM mobility problem can be solved only after the above-mentioned
   problems have been solved.

   Service providers can implement VM migration by upgrading their
   software to support new protocols; the hardware devices do not need
   to be upgraded.

   These problems are described below:


3.2.1.  IP Tunnel Problems

   During the VM migration, it is required to establish IP-in-IP
   tunnels.  The purpose is to ensure that users/applications have no
   perception of the migration process and that their IP addresses at
   the relevant layer remain the same.  The scheme of IP-in-IP
   tunneling resolves the contradiction between the unchanged VM IP
   addresses during the migration and the changed server IP addresses
   when VMs migrate across IP subnets.  The OBP is involved in setting
   up the IP tunnels.  According to the NVO3 control plane protocols,
   there are two positions for the OBP (NVE/VTEP) in the DC: on the
   server and on the ToR.  Placing the OBP on the server minimizes its
   correlation with the network elements in the specific network
   topology.  It will face more problems if the OBP



Bhumip Khasnabish, et al.  Expires July 3, 2013                 [Page 9]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   is placed on the ToR.  The NVE is preferably placed on the server
   (unless there are other, stronger reasons).  It will create a
   virtual network for VM communications, and the traffic between VMs
   will not be directly exposed on the wire and the switches.

   Placing the OBP on the server can reduce the coupling with the DC
   topology to a certain extent, but the two cannot be completely
   decoupled.

   The disadvantage of the network connection solutions for online VM
   migration across different subnets is that the network configuration
   of the VM needs to be changed after the migration, and the migration
   process is not transparent.  Therefore, transparent VM migration
   needs to be implemented, and network connection redirection
   technology needs to be considered.

   Since users cannot utilize the VMs because the network access point
   changes during online migration of VMs across subnets, a network
   connection redirection scheme based on Proxy Mobile IP (PMIP) can be
   used.  A VM that is migrated to an external subnet is regarded as a
   mobile node, and its IP address is not changed.  All packets to/from
   the VM are transmitted through the bi-directional tunnel between the
   external network and the home network, in order to implement online
   transparent migration across subnets.  After the necessary data has
   been migrated, a tunnel directly connected to the location of the
   VM's new server is re-established as quickly as possible.

   The source VM and the destination VM need to be activated
   simultaneously and must be dynamically configured with an IP tunnel.
   In order to make the VM migration process completely transparent
   (including transparent to the VMs' applications and the outside
   users), the migration environment of the VMs should be regarded as a
   mobile network environment, and the migrated VM is regarded as a
   mobile node.  The mobile agent function of the host should be taken
   full advantage of when communicating with the external network.
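
   As an illustration only (the addresses, interface roles, and the use
   of Scapy below are assumptions for this sketch, not part of any
   cited specification), an IP-in-IP encapsulation that keeps the VM's
   inner addresses unchanged while the outer addresses track the
   hosting servers might look as follows:

      # Minimal IP-in-IP encapsulation sketch (Python/Scapy).
      # The inner header carries the VM addresses, which never change;
      # the outer header carries the (changing) server/OBP addresses.
      from scapy.all import IP, ICMP, send

      VM_IP       = "10.1.1.23"      # migrating VM (assumed address)
      PEER_VM_IP  = "10.1.2.45"      # its correspondent VM
      HOME_OBP_IP = "192.0.2.10"     # OBP in the home/source subnet
      NEW_OBP_IP  = "198.51.100.7"   # OBP in the destination subnet

      inner = IP(src=VM_IP, dst=PEER_VM_IP) / ICMP()
      outer = IP(src=HOME_OBP_IP, dst=NEW_OBP_IP, proto=4)  # 4 = IPIP
      send(outer / inner)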

3.2.2.  IP Allocation Strategy Problems

   In the packet encapsulation described above, the IP address of the
   VM is a critical entity.  In a small network, its allocation is
   based on DHCP.  With the expansion of the network scale, IP address
   conflicts are more likely to occur.  When a VM is migrated to
   another network, its IP address may conflict with the IP address of
   a VM or physical host in the destination network; such duplicate IP
   addresses cause confusion and migration failure.

   Therefore, a management module allocating IP addresses to VMs is



Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 10]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   needed, which can manage each IP allocation in the virtual network.
   The IP allocations should not conflict with each other and should
   keep the path cost of routing and forwarding as small as possible.

   The allocated IP addresses should not conflict with the currently
   assigned IP network segments of the VM clusters.  In addition, they
   should not conflict with the IP network segments where the physical
   hosts are located, nor with the destination IP network segments
   after the migration.  Therefore, the IP address allocation
   information needs to be synchronized.  Of course, synchronization
   across the whole network is not necessary as long as there are ways
   to ensure that no conflict exists.  Moreover, the allocation method
   needs to introduce as little network overhead as possible and to
   consider insufficient IP addresses in the destination network
   segments.
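
   A minimal sketch of such a conflict check (the segment lists, the
   pool, and the function below are illustrative assumptions, not an
   existing protocol or tool):

      import ipaddress

      # Segments a candidate VM address must not fall into (examples).
      RESERVED_SEGMENTS = [
          ipaddress.ip_network("10.1.0.0/24"),   # existing VM cluster
          ipaddress.ip_network("192.0.2.0/24"),  # physical hosts
          ipaddress.ip_network("10.2.0.0/24"),   # destination segment
      ]
      ALLOCATED = set()  # addresses already handed out

      def allocate(pool):
          # Return the first free pool address outside all reserved
          # segments; a real allocator would also minimize path cost.
          for addr in ipaddress.ip_network(pool).hosts():
              if addr in ALLOCATED:
                  continue
              if any(addr in seg for seg in RESERVED_SEGMENTS):
                  continue
              ALLOCATED.add(addr)
              return addr
          raise RuntimeError("address pool exhausted")

      print(allocate("10.3.0.0/28"))   # e.g. 10.3.0.1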

   When IP addresses are allocated to hosts based on the DHCP protocol,
   the addresses in the address pool are allocated from the smallest to
   the largest.  An insufficient number of addresses in the pool may
   lead to conflicts with the IP addresses already assigned to VMs,
   which hinders VM migration.  In addition, because addresses are
   allocated from smallest to largest, the addresses assigned to VMs
   may affect the routing protocol's choice of a better path.

   For specific architectures like Fat-Tree in particular, the specific
   network topology and the protocol architecture of the specific
   routing strategy (such as OSPF) should be taken into account.  The
   VM migration process must be adapted to these aspects and cannot
   simply copy the purely Layer-2 migration approach.  VM migration is
   therefore inherently related to network topologies and network
   routing protocols.

   In the Fat-Tree topology, the IP addressing and IP allocation
   methods of the network servers and switches are related to the
   routing protocols.  Two routing methods can be chosen: the OSPF
   protocol (the OSPF domain cannot be too large) or a fixed routing
   configuration.

   Since the VM needs to be assigned an IP address in the destination
   DC, the routing protocols used in the destination DC need to be
   known in order to prevent IP conflicts.  For example, the Fat-Tree
   topology may use the OSPF routing protocol (in which case a newly
   added network node is assigned an IP address using DHCP) or a fixed
   IP routing configuration.  In the former case, the number and
   distribution of reserved IP addresses in the IP address pool differ
   from the latter.  Therefore, a scheme is required that knows the
   adopted network topology and address allocation strategy, the IP
   usage of each segment, the remaining number of IP addresses, etc.
   This information cannot be acquired purely by the existing DHCP
   protocol.

   Different routing strategies have different routing management



Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 11]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   mechanisms for VM migration across DCs, for the following reasons:
   (a) the uniqueness of IP address assignment and IP tunnel
   establishment is involved, and (b) global unified management issues
   are involved.  These problems are discussed later.

   In the addressing method of the fixed routing configuration, the IP
   address assigned to a device located within the DC actually contains
   location information.  The type of the corresponding device can
   easily be determined from its IP address, and the location of the
   device in the topology can also be readily inferred.

   So these addresses must be avoided in the automated allocation of IP
   addresses to the VMs.

   The function of DHCP protocol needs to be greatly enhanced, or the
   protocols and tools of IP address allocation need to be re-designed.

   Moreover, negotiation is needed before migration.  Because it may be
   necessary to migrate the VMs back after the migration process has
   finished, the IP addresses of the source/destination communication
   agents should be considered for reservation in order to make the
   migration process smooth.  The reserved IP addresses relate to the
   network topology and the IP address allocation strategy in the
   source/destination network environment.

   For the control plane protocols in the NVO3 network, IP addresses
   should be allocated reasonably according to the network topology and
   routing protocols adopted in the source and destination DCs, in
   order to achieve seamless VM migration and, as far as possible,
   optimal paths.

   In addition, the above-mentioned problems also apply to the PortLand
   network topology, which is similar to the Fat-Tree network topology.

   Future server-centric network topologies, such as the DCell/BCube
   network topologies, also need to achieve compatibility in the
   control plane.

3.2.3.  Routing Synchronization Strategy Problems

   In order to ensure the normal data forwarding after the VM migration,
   the routing synchronization between the source network and
   destination network is needed.

3.2.4.  The Migration Protocol State Machine of VM Online Migration
        across Subnets

   As for the routing strategy discussed earlier, compared to the
   migration in the same IP subnet, the IP allocation strategy and



Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 12]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   routing synchronization strategy will be changed.  Therefore, the
   state and handling of routing updates must be included in the state
   machine of VM migration across subnets, in the preparation phase
   before the VM migration.

   If the VM is allowed to move across subnets, network redirection
   technology should be used.  The advantage of IP-in-IP technology is
   its good compatibility with network equipment, which only needs a
   software upgrade.
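
   A minimal sketch of such a state machine (the states, their names,
   and the ordering are illustrative assumptions, not part of a defined
   protocol), with routing updates handled in the preparation phase and
   rollback reachable from every phase:

      from enum import Enum, auto

      class MigState(Enum):
          PREPARE        = auto()  # check conditions, negotiate IPs
          ROUTING_UPDATE = auto()  # sync routes, pre-build IP tunnel
          COPY           = auto()  # iterative memory/state copy
          SWITCHOVER     = auto()  # pause, final copy, redirect
          CLEANUP        = auto()  # remove old tunnel and state
          ROLLBACK       = auto()  # undo on unrecoverable failure

      TRANSITIONS = {
          MigState.PREPARE:        {MigState.ROUTING_UPDATE,
                                    MigState.ROLLBACK},
          MigState.ROUTING_UPDATE: {MigState.COPY, MigState.ROLLBACK},
          MigState.COPY:           {MigState.SWITCHOVER,
                                    MigState.ROLLBACK},
          MigState.SWITCHOVER:     {MigState.CLEANUP,
                                    MigState.ROLLBACK},
      }

      def advance(current, nxt):
          # Reject transitions the cross-subnet state machine forbids.
          if nxt not in TRANSITIONS.get(current, set()):
              raise ValueError("illegal transition")
          return nxt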

3.2.5.  Resource Gateway Problems

   A resource gateway is needed to record IP address resources that have
   been used, and IP network segments which the used IP addresses belong
   to.

3.2.6.  Optimized Location of Default Gateway

   The VM's default gateway should be in a close topological proximity
   to the ToR that is connected to the server presently hosting that VM.

3.2.7.  Other Problems

   Migration across domains imposes new requirements on network
   protocols; for example, the ARP response mechanism is no longer
   applicable in the WAN.  In addition, some packets will be lost
   during the migration, which is not acceptable for parallel
   computing.  There are also problems such as sharing computing
   resources across multiple administrative domains, etc.

3.3.  VM Mobility Problems Implemented on the VR Device

   Each virtual router (VR) has a logically independent routing table
   and forwarding table.  It supports the overlapping of private IP
   addresses and public IP addresses.  Multiple VRs can be logically
   formed on one physical router.  Each VR individually runs its own
   instances of the routing protocols and has its dedicated I/O ports,
   cache, address space, routing and forwarding tables, and network
   management software.  However, it is not implemented in a way that
   supports multi-tenancy, and therefore it does not support live
   migration of VMs.

3.4.  Security and Authentication of VMMI

   During the VM migration process, it is required to give proper
   considerations to the security related matters; this includes solving
   traffic roundabout issues, ensuring that the firewall functionalities
   are appropriately enacted, and so on.




Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 13]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   Therefore, in addition to authorization and authentication,
   appropriate policies and measures to check/enforce the security level
   must be in place while migrating VMs from one DC to another,
   especially from a private DC to a public DC in the Cloud [NIST 800-
   145, Cloud/DataCenter SDO Survey].

   For example, when a VM is migrated to the destination DC network,
   the corresponding switch port connected to the VM and its host
   server should apply the port policy of the source switch.  The
   completion time of the VM migration and the time at which the policy
   is issued must be synchronized.  If the former is earlier than the
   latter, the services may not get a timely response; if the former is
   later than the latter, the required level of network security may
   not be in place for a period of time.

   What may be helpful in such an environment is the creation and
   maintenance of a reasonable interactive state machine.

3.5.  The Virtual Network Model

   Based on the above problems, two requirements are added to the
   virtual network model.  First, the routing information is adjusted
   automatically according to the physical location of the VM after the
   VM is migrated to a new subnet.  Second, a logical entity, namely
   the "virtual network communication agent", is added, which is
   responsible for data routing, storage, and forwarding in inter-
   subnet communications.  The agent can be dynamically created and
   revoked, and can run on each server.

   The communication nodes in the overlay layer are called overlay
   entities, which comprise all the VMs and communication agents.  Each
   VN in the overlay layer can be customized as required.  A VN is
   composed of the specified VMs and communication agents.  The VMs and
   communication agents may come from different physical networks.
   They are connected through private tunnels established by the
   communication agents.  The positions of the communication agents in
   the physical network can be divided into two categories: on the
   server or on a network device (such as the ToR).

3.6.  The Processing Flow

   During the process, VM migration messages will trigger topology
   updates of the VM clusters in the source virtual network and the
   destination virtual network.  It is therefore required that both
   ends learn each other's network topology, routing protocols, and IP
   address assignment rules, so that the VM can be assigned a unique IP
   address.  The routing information of the communication agents is
   updated.  The communication agent captures the corresponding VM's
   packets, encapsulates them into the data section of new packets, and
   adds the necessary control information
   section of the packets, and adds the necessary control information



Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 14]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   (such as self-defined forwarding rules).  After the encapsulation,
   these packets are transferred to the destination network through the
   tunnels between the communications agents.  The communications agent
   in the destination network de-capsulates the packets and processes
   the information, and then delivers the packets to the destination
   network.  The data transfer process across subnets is now completed.

   According to the above processing flow, the modules that need to be
   modified can be divided by function as follows: routing management,
   MAC capture, tunnel packet encapsulation, tunnel forwarding, tunnel
   packet de-capsulation, and forwarding in the destination network.
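
   The division of functions can be sketched as follows (the function
   names, the self-defined header format, and the JSON-encoded control
   information are illustrative assumptions, not an existing API):

      import json

      def encapsulate(vm_frame, rules):
          # Tunnel packet encapsulation: prepend self-defined control
          # information and carry the captured frame as the data
          # section of the tunnel packet.
          header = json.dumps(rules).encode()
          return len(header).to_bytes(2, "big") + header + vm_frame

      def decapsulate(packet):
          # Tunnel packet de-capsulation in the destination network:
          # recover the control information and the original frame.
          hlen = int.from_bytes(packet[:2], "big")
          rules = json.loads(packet[2:2 + hlen].decode())
          return rules, packet[2 + hlen:]

      # Round trip: the source agent encapsulates, the destination
      # agent de-capsulates and forwards according to the rules.
      rules, frame = decapsulate(
          encapsulate(b"vm-frame-bytes", {"next_hop": "agent-B"}))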

3.7.  The NVE/OBP Location Problems

   VMs communicate with each other through the interconnecting network,
   either within the same domain or between different domains.  The
   processing differs according to the NVE/OBP position.

   Since it is transparent to the network topology and the L2/L3
   protocols, NVE/OBP on the server should be the default configuration
   mode.

   Assume that a set of VMs and the network that interconnects them are
   allowed to communicate with each other; the MAC source and
   destination addresses in the Ethernet headers of the packets
   exchanged among these VMs are then preserved.  This is L2-based VM
   communication within a LAN.  Every VM should have its own IP
   address.  If a VM belongs to more than one domain, it will have
   multiple IP addresses and multiple logical interfaces, which is
   similar to the model of L3 switches.

   Different VM clusters are distinguished by the VLAN mechanism within
   the same L2 physical domain.  In the case of VM communication across
   IP subnets, the packets are encapsulated at the NVE, delivered
   directly to the peer NVE, and then transferred to the destination
   VM.

   Once migration across an L3 network occurs, some scenarios will
   cause the MAC source address to be modified.

   It is also possible that a VM may belong to different VM cluster
   networks at the same time, with the two VM clusters distinguished by
   their VLANs (VN IDs).

   In the above case, from the perspective of the overlay network, the
   VLAN is mapped to a VNI.  Different VNI domains are isolated from
   each other.  To communicate between different VNI domains, the
   packets are routed according to the outer layer addresses, and then
   the outer headers are stripped.  The packets are then looked up



Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 15]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   according to the inner layer addresses.

   Since NVEs may belong to different domains, when an NVE communicates
   with another NVE in the same domain, the VLAN-IDs of the packets
   exchanged should be the same.  In order to simplify processing, the
   VLAN-IDs are allowed to be removed.  However, when an NVE
   communicates with another NVE in a different domain, the VLAN-IDs of
   the packets exchanged may be different.

   Note that 'Two VMs' in the following scenarios refers to 'Two VMs
   which are communicating with each other'.


3.7.1.  NVE/OBP on the Server

   Scenario classification:

   Note that 'belonging to the same VN ID' refers to 'within the same
   subnet'.

   (1) Two VMs are on the same server and belong to the same VN ID.
   In this case, the packets are forwarded with the Layer-2 mechanism.

   VM migration processing scenario: this scenario is not in the scope
   of VM migration.

   (2) Two VMs are on the same server but belong to different VN IDs.
   In this case, the packets are forwarded with the Layer-3 mechanism.

   VM migration processing scenario: this scenario is not in the scope
   of VM migration.

   (3) Two VMs are on different servers but belong to the same VN ID.
   In this case, the packets are encapsulated on the local NVE and
   forwarded according to the outer layer addresses with the L2
   mechanism.  After they are delivered to the peer NVE, the outer
   layer addresses are stripped, and then the packets are forwarded
   according to the inner layer addresses.

   VM migration processing scenario: it can be processed in the same
   way as migration across IP subnets, and an IP-in-IP tunnel is
   needed.

   (4) Two VMs are on different servers and belong to different VN IDs.
   In this case, the packets are encapsulated on the local NVE and
   forwarded according to the outer layer addresses with the L3
   mechanism.  After they are delivered to the peer NVE, the outer
   layer addresses are stripped, and then the packets are forwarded
   according to the inner layer addresses.



Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 16]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   VM migration processing scenario: it is similar to the third
   scenario.

   In the above cases, the processing of the outer layer information of
   the packets, including the L2 information (such as destination MAC,
   source MAC, and VLAN ID) and the L3 information, follows the
   existing mechanisms, because the underlying network is transparent
   to the overlay layer.

   In the case of the NVE on the server, the routing protocols (unicast
   and multicast) may not be changed on the underlying network, but the
   routing design of the overlay layer is subject to considerable
   restrictions.
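
   The four cases above can be summarized by a small dispatch function
   (a sketch only; the predicates and the returned labels are
   illustrative assumptions rather than defined protocol behavior):

      def forwarding_mode(same_server, same_vni):
          # Select the forwarding behavior between two communicating
          # VMs when the NVE/OBP resides on the server.
          if same_server:
              # Cases (1) and (2): purely local L2 or L3 forwarding;
              # no overlay encapsulation, out of VM migration scope.
              return "local-L2" if same_vni else "local-L3"
          # Cases (3) and (4): encapsulate on the local NVE, forward
          # on the outer addresses (L2 if the VNI is the same, L3
          # otherwise); the peer NVE strips the outer header and
          # forwards on the inner addresses.  Migration is handled as
          # across IP subnets, with an IP-in-IP tunnel.
          return "overlay-L2" if same_vni else "overlay-L3"

      assert forwarding_mode(True, True) == "local-L2"     # case (1)
      assert forwarding_mode(False, False) == "overlay-L3" # case (4)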

3.7.2.  NVE/OBP on the ToR

   In the case of the NVE on the ToR, the NVE needs to handle the VIDs
   of various packets.  Once a VM is migrated, the rules of the source
   network also need to be migrated, causing physical network
   configuration changes.  Therefore, a set of rules is required to
   deal with this.

   For the source of the VIDs used by the VM, please refer to Section
   4.5 (The Actual Number of Available Isolated Domains).  Various
   rules and the usage range of the VIDs need to be set.  The VLAN-ID
   used by a given VM refers to the VLAN-ID carried by the traffic that
   is originated by that VM within the same L2 physical domain.

   When a VM is communicating with a VM in a different VLAN, there are
   two ways to handle the traffic before the communication messages
   enter the processing module in the overlay layer.  The first is a
   Layer-2 implementation: one possible solution is that the ToR port
   to which the server hosting the VM is connected belongs to two or
   more VLANs.  The second is a Layer-3 implementation: the VM has
   multiple IP addresses and logical interfaces, and the number of
   addresses and interfaces is related to the number of communication
   parties.  These methods are similar to the approaches of
   conventional Layer-2 forwarding and Layer-3 routing.  After this
   processing, the packets enter the processing module in the overlay
   layer, where the header information is processed.

   Scenario classification:

   (1) Two VMs are on servers connected to ports on the same ToR, and
   these ports belong to the same VLAN.
   In this case, the packets are forwarded according to the underlying
   network address with the L2 mechanism.




Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 17]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   VM migration processing scenario: the VM is migrated in the same way
   as within the same VLAN.

   (2) Two VMs are on servers connected to ports on the same ToR, and
   these ports belong to different VLANs.
   In this case, the packets are forwarded according to the underlying
   network addresses with the L3 mechanism.

   VM migration processing scenario: it can be processed in the same
   way as migration across IP subnets, and an IP-in-IP tunnel is
   needed.

   (3) Two VMs are on servers connected to ports on different ToRs, and
   these ports belong to the same VLAN.
   In this case, the packets are encapsulated on the ToRs and then
   enter the processing in the underlying network.  If the ToRs are
   directly L2 connected, the packets are forwarded according to the
   outer layer addresses with the L2 mechanism; if the ToRs are not
   directly L2 connected, the packets are routed according to the outer
   layer addresses with the L3 mechanism.  After they are delivered to
   the peer NVE, the outer layer addresses are stripped, and then the
   packets are forwarded according to the inner layer addresses.

   VM migration processing scenario: it is similar to the second
   scenario.

   (4) Two VMs are on servers connected to ports on different ToRs, and
   these ports belong to different VLANs.
   In this case, the packets are encapsulated on the ToRs and processed
   in the underlying network.  Since the ToRs are not directly L2
   connected, the packets are routed according to the outer layer
   addresses with the L3 mechanism.  After they are delivered to the
   peer NVE, the outer layer addresses are stripped, and then the
   packets are forwarded according to the inner layer addresses.

   VM migration processing scenario: it is similar to the second
   scenario.





Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 18]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012



3.7.3.  Hybrid Scenario

   The hybrid scenario should be considered in the communication model
   of NVO3 network.  This may include the situation when NVE is on the
   Server in some part of the network, and NVE is on ToR in other part
   of the network.  The normal communication between two parts of the
   network should be covered under the hybrid scenario discussion.


4.  Technology Problems involved in the VM Migration

   As mentioned above, when VMs are migrated from one server to another,
   the source and destination server may be within the same DC, or in
   the DCs at different geographic locations.  The remote data centers
   can be interconnected through different DCI technologies.  Overall,
   the technologies involved in the VM migration across DCs include
   those within the DC.  The key technologies involved in the VM
   migration are broadly categorized according to different locations of
   the elements in DCs as follows:

   1) a) VM/network-aware VM migration technologies, which make the
   network automatically perceive the network behavior of the VM/host
   (such as QoS, VLAN, etc.).  The network behavior is mainly perceived
   by the switch to which the server hosting the VM is directly
   connected.  Such technologies include IEEE EVB/VDP, Port Extender
   (802.1Qbh), and so on.

   b) Service-aware VM migration technologies, e.g., a distributed
   cluster of VMs may be used for Unified Communications Services
   (UCS), and when any UCS-related VMs need to be migrated, a set of
   VMs from the clusters of the same category can be used.

   2) The network-interconnection-driven technologies within the DC,
   including IETF TRILL, IEEE 802.1Q/SPBV, and PBB/SPBM.

   3) The technologies based on interconnecting the internal exit
   devices across DCs, including IETF VPLS and PBB-EVPN.

   4) The ToR/host interconnection technologies within the DC as well
   as across DCs, including NVO3, which the IETF is currently
   standardizing.

   The above-mentioned classification is based on the locations of the
   various elements in the DC.  The technologies in the second or third
   class can logically become one of the foundations of the



Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 19]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   technologies in the fourth class.

   The above-mentioned VM migration technologies may be considered to be
   within the scope of NVO3 WG.

4.1.  TRILL

   Large-scale Layer-2 and multi-path interconnection in the DC network
   can be implemented through TRILL.  The transparent TRILL network can
   be used to ensure that the MAC and IP addresses remain unchanged
   during the VM migration.  However, in order to guarantee service
   continuity during the VM migration, the control plane of NVO3
   exchanges messages with the control plane of TRILL so that the
   position state can be quickly updated after the VM migration.
   Service traffic can quickly be redirected to the new network address
   of the VM after the migration, and tenants need not have any
   perception of the VM migration.

4.2.  SPB

   As with TRILL, large-scale Layer-2 and multi-path interconnection in
   the DC network can be implemented through SPB.  The transparent SPB
   network can be used to ensure that the MAC and IP addresses remain
   unchanged during the VM migration.  However, in order to guarantee
   service continuity during the VM migration, the control plane of
   NVO3 exchanges messages with the control plane of SPB so that the
   position state can be quickly updated after the VM migration.
   Service traffic can quickly be redirected to the new network address
   of the VM after the migration, and tenants need not have any
   perception of the VM migration.

4.3.  Data Center Interconnection Fabric Related Problems

   One of the most important factors that directly impact the VMMI is
   connectivity among the relevant data centers.  There are many
   features that determine this required connectivity.  These features
   of connectivity include bandwidth, security, quality of service, load
   balancing capability, etc.  These features are frequently used to
   decide whether a VM can join a host in real time or whether it needs
   to join a VRF in a certain unit of VMs.

   The requirements related to the above are as follows:
   o The negative impact of ARP, MAC and IP entry explosion on the
   individual network which contains a large number of tenants should be
   minimized by DC and DC-interconnection technologies.

   o The link capacity of both intra-DC and inter-DC network should be
   effectively utilized.  Efficient utilization of the link capacity



Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 20]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   requires that traffic be forwarded on the shortest path between two
   VMs, both within the DC and across DCs.

   o Support of east-west traffic between tenants' applications located
   in different DCs.

   Many mature VPN technologies can be utilized to provide connectivity
   between DCs.  The extension of VLAN and virtual domain between DCs
   may also be utilized for this purpose.


4.4.  Types and Applications of VPN Interconnections between DCs which
      Provide DCI

4.4.1.  Types of VPNs

   Related technologies for Layer-3 VPN: BGP/MPLS IP Virtual Private
   Networks (VPNs), RFC 4364, etc.

   Related technologies for Layer-2 VPN: PBB + L2VPN, TRILL + L2VPN,
   VLAN + L2VPN, NVGRE [draft-sridharan-virtualization-nvgre-00], PBB
   VPLS, E-VPN, PBB-EVPN, VPLS, VPWS, etc.

4.4.2.  Applications of L2VPN in DCs

   It is a very common practice to use L2 interconnection technologies
   for DC interconnection across geographical regions.  Note that VPN
   technology is also used to carry L2 and L3 traffic across the IP/MPLS
   core network.  This technology can be used in the same DC to support
   scalability or interconnection across L3 domains.  VPLS is commonly
   used for IP/MPLS connection over WAN and it supports transparent LAN
   services.  IP VPN, including BGP / MPLS IP VPN and IPSec VPN, has
   been used in a common IP/MPLS core network to provide virtual IP
   routing instances.

   The implementation of PBB plus L2VPN can take advantage of some of
   the existing technologies.  It is flexible to use a VPN network in
   the cloud computing environment, and it can support a sufficient
   number of VPN connections/sessions (networking resources), which is
   much larger than the 4K-VLAN mode of L2VPN.  The resulting effect is
   therefore similar to that of VXLAN.

   Note that PBB can not only support access to more than 16M virtual
   LAN instances, it can also separate the tenants and provide
   different domains through isolated MAC address spaces.

   The use of PBB encapsulation has one major advantage: since the VMs'
   MAC addresses will not be processed by the ToRs and core switches,
   the MAC table sizes of the ToRs and core switches may be reduced by
   two orders of magnitude; the specific number is related to the
   number of VMs in each server and the VMs' virtual interfaces.



Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 21]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   One solution to the problems in the DC is to deploy other
   technologies in the existing DC network.  A service provider can
   separate its VLAN domains into different VLAN islands; in this way,
   each island can support up to 4K VLANs.  The VLAN domains can be
   interconnected via VPLS, and at the same time the DC GWs can be used
   as the VPLS PEs.

   If the existing VLAN-based solutions are retained only in the VSw,
   while the number of tenants in some VLAN islands is more than 4K,
   the service provider needs to deploy VPLS deeper in the DC network.
   This is equivalent to supporting L2VPN from the ToRs, and using the
   existing VPLS solutions to enable MPLS on the ToR and core DC
   elements.

4.4.3.  Applications of L3VPN in DCs

   IP VPN technology can also be used for DC network virtualization.
   For example, multi-tenant L3 virtualization can be achieved by
   assigning a different IP VPN instance to each tenant who needs L3
   virtualization in a DC network.

   There are many advantages to using IP VPN as an L3 virtualization
   solution within the DC compared to using existing virtual routing
   technology.  Some of the advantages are as follows:
   (1) It supports many VRF-to-VRF tunneling options with different
   operational models: BGP/MPLS IP VPN, IP or L3 VPN over GRE, etc.
   (2) The IP VPN instances used for cloud services below the WAN can
   connect directly to the IP VPN in the WAN.

4.5.  The Actual Number of Available Isolated Domains

   The isolation of VMs is achieved through the VNI.  One way to
   acquire the value of the VNI is through tag mapping in the
   underlying network.  The VNI may be derived from the 12-bit VLAN ID,
   or from other related 24-bit information (e.g., the 24-bit PBB I-SID
   tag mapping, or other information that can be mapped to a VNI).
   This depends on the technical solution adopted by the Provider Edge
   Bridge and the degree of chip support.  If the VNI is derived from
   the 12-bit VLAN ID, the actual number of available VN IDs is 4096;
   if it is derived from a 24-bit identifier, the actual number of
   available VN IDs is 16M.  This is the compatibility issue between
   the isolated domains in the overlay layer and the isolated domains
   in the underlying network.
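
   As a simple illustration of the two cases (the identifier widths are
   the only inputs; everything else below is just arithmetic):

      VLAN_ID_BITS = 12         # 802.1Q VLAN ID mapped to the VNI
      ISID_BITS    = 24         # PBB I-SID (or other 24-bit tag)

      print(2 ** VLAN_ID_BITS)  # 4096 available VN IDs
      print(2 ** ISID_BITS)     # 16777216 (~16M) available VN IDs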

4.6.  Problems of the Number of Management Devices/Management Domains of
      DC in the NVO3 Network

   The settings of the management domains should be done in concert
   with the control plane of the NVO3 network.  This relates to the
   management domain of the VM.  The range of the VM's management
   domain can be considered from two aspects.

   First, it relates to whether only one management center (e.g.



Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 22]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   vCenter) is required.  Taking vCenter as an example, a single
   vCenter under a 32-bit OS can control 200 ESX hosts and 2,000 VMs.
   If the number of ESX hosts and VMs in the DC exceeds these
   quantities, multiple vCenters should be in place.

   Second, it relates to the institutional settings.  If the company
   operates across the country or across the WAN, a single vCenter can
   manage through connections across the WAN.  However, multiple
   vCenters may be needed for hierarchical settings of a global
   organizational structure or for the required security settings.

4.7.  Limitation of TCAM capacity

   Regardless of the locations of the communication agents in the
   overlay network, 16M isolated domains clearly exceed the maximum
   TCAM capacity that current hardware can support, so it is necessary
   to consider a hierarchical virtual network.

4.8.  SDN

   If SDN technology is used for VM migration in the NVO3 network, it
   also has to face the problem of the limited TCAM capacity that
   hardware can support, and the problem is even more prominent.

4.9.  LISP

   If LISP is used as an option in the NVO3 control plane, it will face
   the problem of notifying the ITR to switch quickly to the new
   Locator IP after the migration.  For example, suppose a VM has just
   migrated from site A to site B.  The ITR does not know this, and
   still uses the old Locator IP to send the packets.  The applications
   are certainly interrupted and will not be restored until the ITR
   acquires the new Locator IP.  The length of this interruption has a
   great influence on time-sensitive operations.
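
   As a rough, illustrative estimate only (the numbers below are
   assumptions, not measurements), the interruption is bounded by how
   long the ITR keeps using the stale locator:

      # Without an explicit notification (e.g., a solicit-map-request),
      # the ITR keeps the old locator until its map-cache entry expires
      # and a new map-request completes.
      residual_map_cache_ttl_s = 60.0   # assumed remaining TTL
      map_request_rtt_s        = 0.2    # assumed mapping-system RTT
      print(residual_map_cache_ttl_s + map_request_rtt_s)  # 60.2 s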


5.  Other Problems Related to IETF and IEEE

5.1.  Review of VXLAN, NVGRE, and NVO3

   In order to solve the problem of an insufficient number of VLANs in
   the DC, technologies like VXLAN and NVGRE have adopted two major
   strategies: one strategy is encapsulation and the other is
   tunneling.

   Both VXLAN and NVGRE use encapsulation and tunneling to create a
   large number of virtual subnets, which can be extended across
   Layer-2 and Layer-3 networks.  This solves the limitation on the
   number of VLANs defined by IEEE 802.1Q and helps achieve shared



Bhumip Khasnabish, et al.  Expires July 3, 2013                [Page 23]

Internet-Draft  Mobility and Interconnection of VM & VNE        Dec 2012


   load-balancing in multi-tenant environment in both public and private
   networks.

   The VXLAN technology was introduced in 2011 to address the numbering
   restrictions of 802.1Q VLANs.  Technologies like MAC-in-MAC and
   MAC-in-GRE also extend the number of VLANs; however, VXLAN
   additionally attempts to address, more effectively, the inadequate
   utilization of link resources and the monitoring of packets after
   header re-encapsulation.
   The frame format of VXLAN is essentially the same as that of OTV and
   LISP, although these three solutions solve different problems of DC
   interconnection and VM migration.  In VXLAN, the packet is
   encapsulated MAC-in-UDP and the addressing is extended to 24 bits,
   which is an effective solution to the restriction on the number of
   VLANs.  The UDP encapsulation enables the logical virtual network to
   be extended to different subnets, and it also supports the migration
   of VMs across subnets.  The change to the frame structure adds the
   field used for extending the VLAN space, as sketched below.
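
   The sketch below (Python, illustrative only) shows the MAC-in-UDP
   idea: the original Ethernet frame is prefixed with an 8-byte VXLAN
   header carrying a 24-bit VNI and then carried as a UDP payload.  The
   field layout follows the VXLAN draft; the function name is an
   assumption.

      import struct

      VXLAN_FLAG_VNI_VALID = 0x08      # "I" flag: VNI field is valid

      def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
          """Prepend the 8-byte VXLAN header (flags, reserved, VNI,
          reserved); the result is then sent as the payload of an
          outer UDP datagram."""
          assert 0 <= vni < 2 ** 24
          header = struct.pack("!BBBB", VXLAN_FLAG_VNI_VALID, 0, 0, 0)
          header += struct.pack("!I", vni << 8)  # 24-bit VNI + 8 reserved bits
          return header + inner_frame

      # Usage: encapsulated = vxlan_encapsulate(ethernet_frame, vni=5000)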

   Note that VXLAN solves a different problem than OTV.  OTV solves the
   problem of DC interconnection by building an IP tunnel between
   different data centers through MAC-in-IP encapsulation.  VXLAN mainly
   solves the problem of limited VLAN resources in DCs caused by the
   increase in the number of tenants; the key is the expansion of the
   VNI field to increase the number of VLANs.  Both technologies can be
   applied to VM migration, since the two packet formats are almost the
   same and compatible with each other.

   NVGRE specifies a 24-bit Tenant Network Identifier (TNI) and resolves
   some issues related to supporting multiple tenants in the DC network.
   It uses GRE to create independent virtual Layer-2 networks that can
   extend across the subnet borders which confine the physical Layer-2
   network.  Endpoints supporting NVGRE insert the TNI indicator into
   the GRE header to separate the tenant networks.

   NVGRE and VXLAN solve the same problem, and the two technologies were
   proposed almost at the same time.  However, there are some
   differences between them:

   VXLAN not only adds the VXLAN header (carrying the VNI) but also adds
   an outer UDP encapsulation to the packet, which facilitates live
   migration of VMs across subnets.  In addition, because of the use of
   UDP, differentiated services can be supported for tenants in the same
   subnet.  Both proposals are built on the assumption that
   load-balancing is a necessary condition for efficient operation:
   VXLAN varies the source UDP port number to achieve load-balancing,
   while NVGRE uses the remaining 8 bits of the GRE Key field.  However,
   there may be opportunities to improve the capability of the control
   plane of both mechanisms in the future.

5.2.  The East-West Traffic Problem

   Let us first discuss the background of the East-West traffic problem.
   There are a variety of applications in the DC, such as distributed
   computing, distributed storage, and distributed search.  These
   applications and services need frequent exchanges of transactions
   between the business servers across the DCs.  In the traditional
   three-tier network model, a data stream between servers first has to
   flow north-south before it finally flows east-west.  In order to
   improve the forwarding efficiency of these data streams, it is
   necessary to update the existing network model and network forwarding
   technology.  Among others, the Layer-2 multi-path technology
   currently being studied is one direction for solving this problem.

   Distributed computing is the basis of the transformation of existing
   IT services.  It allows scalable and efficient use of the sometimes
   underutilized computing and storage resources scattered across the
   data centers.

   In typical data centers, the average server utilization is often low.
   Virtualization and distributed computing can address the capacity
   limitation of a single server in demanding environments via on-demand
   utilization of resources, without impacting performance.  However,
   distributed computing and services that draw on resources across the
   DCs also produce substantial horizontal traffic flows: applying
   distributed computing technology on the servers produces a large
   number of interactive traffic streams between servers.  In addition,
   the type of DC influences the traffic model both within and across
   data centers.

   The first type of DC is run by telecom operators, who usually not
   only operate DCs but also supply bandwidth to Level-2 ISPs.  The
   second type is run by large traditional ISP companies.  The third
   type is built by IT enterprises that invest in the construction of
   DCs.  The fourth type is high-performance computing (HPC) centers
   built by universities, research institutes, and other organizations.
   Note that in these types of DCs the north-south traffic flow is
   significantly smaller than the horizontal flow, and this brings the
   greatest challenges to network design and installation.  In addition
   to the normal traffic due to distributed computing, storage,
   communications, and management, hot backup and live VM migration
   produce sudden lateral bursts of traffic and the associated
   challenges.

   There are two potential solutions to the distributed horizontal flow
   of traffic, as described below.
   A. The first is to solve the problem of east-west traffic within the
   server clusters by exploiting representative technologies such as
   vSwitch, DCell, BCube, and DCTCP.
   B. The second is the network-based solution.  The tree structure of
   the traditional DC network is not inherently efficient for horizontal
   traffic flows.  The problem can be addressed in two ways: (i) in the
   direction of radical change, transforming the tree structure into a
   multi-path structure, and (ii) in the direction of mild improvement,
   changing large L2 trees into smaller L2 trees and meeting the
   requirements by expanding the interconnection capacity of the upper
   nodes, clustering/stacking systems, and link trunking.

   A related constraint is that stacking technology across data centers
   requires specialized interfaces, and the feasible transmission
   distance is limited.

   The problems related to the above include the following: (a) although
   TRILL resolves the multi-path problem at Layer 2, it negatively
   impacts the multi-path properties of Layer 3, because the Virtual
   Router Redundancy Protocol (VRRP) supports only one active default
   router, which means that the multi-path characteristics cannot be
   fully utilized at Layer 3; and (b) TRILL does not define how to deal
   with the problem of overlapping namespaces.


5.3.  The MAC, IP, and ARP Explosion Problems

   Network devices within data centers encounter many problems in
   supporting the conventional communication framework because they need
   to accommodate a huge number of IP addresses, MAC addresses, and ARP
   entries.

   Each blade server usually supports at least 16-40 VMs, and each VM
   has its own MAC address and IP address.  Entities such as the disk,
   memory, FDB table, and MAC table cause an increase in convergence
   time.  In order to accommodate this large number of servers,
   different options for the network topology, for example, a fat-tree
   topology or a conventional network topology, may be considered.  A
   rough sense of the resulting address scale is sketched below.
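
   The back-of-envelope sketch below (Python, illustrative only) uses
   the 16-40 VMs-per-blade range quoted above; the rack and blade counts
   are assumptions chosen purely for illustration.

      def addresses_needed(racks: int, blades_per_rack: int,
                           vms_per_blade: int) -> int:
          """Each VM contributes one MAC/IP pair and a potential ARP entry."""
          return racks * blades_per_rack * vms_per_blade

      print(addresses_needed(100, 40, 16))  # 64,000 entries at the low end
      print(addresses_needed(100, 40, 40))  # 160,000 entries at the high end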

   The number of ARP packets grows not only with the number of virtual
   L2 domains or ELANs instantiated on a server, but also with the
   number of VMs in each such domain.  Therefore, scenarios like
   overload of ARP entries on the servers/hypervisors, exhaustion of ARP
   entries on the routers/PEs, and processing overload of L3 service
   appliances must be efficiently resolved; otherwise, these problems
   will easily propagate throughout the Layer-2 switching network.

   Consequently, resolving these problems requires (a) automated
   management of MAC/IP/ARP in DCs, and (b) network deployments that
   reduce the explosion of MAC address requirements in DCs.

5.4.  Suppressing Flooding within a VLAN

   Efficient operation of DCs in an NVO3 network requires that flooding
   of broadcast, multicast, and unknown unicast frames within a VLAN
   (which may be caused by improper configuration) be reduced.

5.5.  Packet Encapsulation Problems

   In order to achieve seamless migration of VMs across DCs that support
   different VLAN expansion mechanisms, unification of packet
   encapsulation methods is required.


6.  Acknowledgement

   The following experts have provided valuable comments on earlier
   versions of this draft: Thomas Narten, Christopher Liljenstolpe,
   Steven Blake, Ashish Dalela, Melinda Shore, David Black, Joel M.
   Halpern, Vishwas Manral, Lizhong Jin, Juergen Schoenwaelder, Donald
   Eastlake, and Truman Boyes.  We express our sincere thanks to them,
   and hope that they will continue to provide suggestions in the
   future.


7.  References

   [PBB-VPLS]
              Balus, F., et al., "Extensions to VPLS PE Model for
              Provider Backbone Bridging",
              draft-ietf-l2vpn-pbb-vpls-pe-model-04.txt (work in
              progress), October 2011.

   [DCN Ops Req]
              Dalela, A., "Datacenter Network and Operations
              Requirements", draft-dalela-dc-requirements-00.txt (work
              in progress), December 2011.

   [VPN Applicability]
              Bitar, N., "Cloud Networking: Framework and VPN
              Applicability",
              draft-bitar-datacenter-vpn-applicability-01.txt (work in
              progress), October 2011.

   [VXLAN]    Mahalingam, M., "VXLAN: A Framework for Overlaying
              Virtualized Layer 2 Networks over Layer 3 Networks",
              draft-mahalingam-dutt-dcops-vxlan-01.txt (work in
              progress), February 2012.

   [NIST 800-145]
              Mell, P. and Grance, T., "The NIST Definition of Cloud
              Computing", NIST Special Publication 800-145,
              http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf,
              September 2011.

   [Cloud/DataCenter SDO Survey]
              Khasnabish, B. and JunSheng, C., "Cloud/DataCenter SDO
              Activities Survey and Analysis",
              draft-khasnabish-cloud-sdo-survey-02.txt (work in
              progress), December 2011.

   [NVGRE]    Sridharan, M., "NVGRE: Network Virtualization using
              Generic Routing Encapsulation",
              draft-sridharan-virtualization-nvgre-00.txt (work in
              progress), September 2011.

   [NVO3]     Narten, T., "NVO3: Network Virtualization", l2vpn-9.pdf,
              November 2011.


8.  Security Considerations

   To be added later, on an as-needed basis.


9.  IANA Consideration

   The extensions discussed in this draft are related to the DC
   operations environment.


10.  Normative References

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, February 2006.


Authors' Addresses

   Bhumip Khasnabish
   ZTE USA,Inc.
   55 Madison Avenue, Suite 160  Morristown, NJ 07960
   USA

   Phone: +001-781-752-8003
   Email: vumip1@gmail.com, bhumip.khasnabish@zteusa.com










   Bin Liu
   ZTE Corporation
   15F, ZTE Plaza, No.19 East Huayuan Road,Haidian District
   Beijing  100191
   P.R.China

   Phone: +86-10-59932098
   Email: richard.bohan.liu@gmail.com,liu.bin21@zte.com.cn


   Baohua Lei
   China Telecom
   118, St. Xizhimennei, Office 709, Xicheng District
   Beijing
   P.R.China

   Phone: +86-10-58552124
   Email: leibh@ctbri.com.cn


   Feng Wang
   China Telecom
   118, St. Xizhimennei, Office 709, Xicheng District
   Beijing
   P.R.China

   Phone: +86-10-58552866
   Email: wangfeng@ctbri.com.cn






















