Network Working Group                                 Bhumip Khasnabish
Internet-Draft                                             ZTE USA, Inc.
Intended status: Informational                                   Bin Liu
Expires: December 31, 2012                               ZTE Corporation
                                                              Baohua Lei
                                                               Feng Wang
                                                           China Telecom
                                                           June 29, 2012

   Requirements for Mobility and Interconnection of Virtual Machine and
                        Virtual Network Elements
                  draft-khasnabish-vmmi-problems-01.txt

Abstract

   In this draft, we discuss the challenges and requirements related to
   the migration, mobility, and interconnection of Virtual Machines
   (VMs) and Virtual Network Elements (VNEs).  We also describe the
   limitations of various types of virtual local area networking (VLAN)
   and virtual private networking (VPN) techniques that are
   traditionally expected to support such migration, mobility, and
   interconnection.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 31, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Conventions used in this document
   2.  Terminology and Concepts
   3.  Network-Related Problem Specification
     3.1.  The Evolution Problems of the Logical Network Topology in
           VMMI Environments
     3.2.  Cloud Service Virtualization Requirements
       3.2.1.  Requirements for Logical Elements
       3.2.2.  Requirements for Resource Allocation Gateway (RA GW)
               Function
       3.2.3.  Performance Requirements
       3.2.4.  Fault Tolerance Capability Requirements
       3.2.5.  Network Model
       3.2.6.  Types and Applications of VPN Interconnection between
               DCs which Provide Cloud Services
         3.2.6.1.  Types of VPNs
         3.2.6.2.  Applications of L2VPN in DCs
         3.2.6.3.  Applications of L3VPN in DCs
       3.2.7.  VN Requirements
       3.2.8.  Packet Encapsulation Problems
       3.2.9.  Network Bandwidth Efficiency Problem of Resource Use
       3.2.10. VM Migration Problem in Mixed IPv4 and IPv6
               Environments
         3.2.10.1. Real-time Perception of Availability of Global
                   Network and Storage Resources
         3.2.10.2. The Real-time Perception of Globally Available
                   Network Resources and Requested Network Resources
                   for Matching with Storage Resources
         3.2.10.3. The Real-time Perception of Globally Requested
                   Network Resources for Matching with Storage
                   Resources
       3.2.11. Selection of Migration
         3.2.11.1. Requirements with Different Network Environments
                   and Protocols
         3.2.11.2. Requirements for Live Migration of Virtual
                   Machines
       3.2.12. Access and Migration of VMs without Users' Perception
         3.2.12.1. VM Migration Problems and Strategies in the WAN
                   with Traffic Roundabout as a Prerequisite
         3.2.12.2. VM Migration Problems and Strategies in the WAN
                   without Traffic Roundabout as a Target
       3.2.13. Review of VXLAN, NVGRE, and NVO3
       3.2.14. The East-West Traffic Problem
       3.2.15. Data Center Interconnection Fabric Related Problems
       3.2.16. MAC, IP, and ARP Explosion Problems
       3.2.17. Suppressing Flooding within VLAN
       3.2.18. Convergence and Multipath Support
       3.2.19. Routing Control - Multicast Processing
       3.2.20. Problems and Requirements related to DMTF
   4.  Control & Mobility Related Problem Specification
     4.1.  General Requirements and Problems of State Migration
       4.1.1.  Foundation of Migration Scheduling
       4.1.2.  Authentication for Migration
       4.1.3.  Consultation for Assessing Migratability
       4.1.4.  Standardization of Migration State
     4.2.  Mobility in Virtualized Environments
     4.3.  VM Mobility Requirements
       4.3.1.  Summarization of Mobility
       4.3.2.  Problem Statement
   5.  Network Management Related Problem Specification
     5.1.  Data Center Maintenance
     5.2.  Load Balancing after VM Migration and Integration
     5.3.  Security and Authentication of VMMI
     5.4.  Efficiency of Data Migration and Fault Processing
     5.5.  Robustness Problems
       5.5.1.  Robustness of VM Migration
       5.5.2.  Robustness of VNE
   6.  Acknowledgement
   7.  References
   8.  Security Considerations
   9.  IANA Consideration
   10. Normative References
   Authors' Addresses
1.  Introduction

   There are many challenges related to VM migration and
   interconnection among two or more data centers (DCs).  The
   techniques that can be used for VM migration and data center
   interconnection should support the required levels of performance,
   security, and scalability, along with simplicity and cost-effective
   management, operations, and maintenance.

   In this draft, the issues and requirements for moving virtual
   machines are summarized with reference to the necessary conditions
   for migration, business needs, state classification, security, and
   efficiency.  We then list the requirements for VM migration in the
   current mixed IPv4 and IPv6 environment.  On the choice of the
   migration solution, the requirements for techniques that are useful
   on large-scale Layer-2 networks and on segmented IP networks are
   discussed.  We summarize the requirements of virtual networks for
   VM migration, virtual networking, and operations in DCI modes.

   In the following sections of this draft, we first describe the
   general challenges at a high level, and then analyze the
   requirements for VM migration.  We then discuss the commonly used
   solutions and their limitations, along with the desired features of
   a potential reference solution.  A more detailed solution survey
   will be presented in a companion draft.

1.1.  Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in [RFC2119].

2.  Terminology and Concepts

   o ACL: Access Control List

   o ARP: Address Resolution Protocol

   o DC: Data Center

   o DCB/DCBR: Data Center Border Router

   o DC GW: Data Center Gateway

   o DCI: Data Center Interconnection

   o DCS: Data Center Switch

   o FDB: Forwarding DataBase

   o HPC: High-Performance Computing

   o IDC: Internet Data Center

   o IGMP: Internet Group Management Protocol

   o IOMMU: Input/Output Memory Management Unit

   o IP: Internet Protocol

   o IP VPN: Layer-3 VPN, defined in the L3VPN working group

   o ISATAP: Intra-Site Automatic Tunnel Addressing Protocol

   o LISP: Locator/ID Separation Protocol

   o MatrixDCN: Matrix-based fabric for Data Center Networks

   o NHRP: Next Hop Resolution Protocol

   o NVO3: Network Virtualization Overlays (over Layer-3)

   o OTV: Overlay Transport Virtualization

   o PaaS: Platform as a Service

   o PIM: Protocol Independent Multicast

   o PBB: Provider Backbone Bridge

   o PM: Physical Machine

   o QoS: Quality of Service

   o RA GW: Resource Allocation GateWay

   o STP: Spanning Tree Protocol

   o TNI: Tenant Network Identifier

   o ToR: Top of the Rack
   o TRILL: Transparent Interconnection of Lots of Links

   o VLAN: Virtual Local Area Networking

   o VM: Virtual Machine

   o VMMI: Virtual Machine Mobility and Interconnection

   o VN: Virtual Network

   o VNI: Virtual Network Identifier

   o VNE: Virtual Network Entity (a virtualized layer-3/network entity
     with associated virtualized ports and virtualized processing
     capabilities)

   o VPN: Virtual Private Network

   o VPLS: Virtual Private LAN Service

   o VRRP: Virtual Router Redundancy Protocol

   o VSE: Virtual Switching Entity (a virtualized layer-2/switch
     entity with associated virtualized ports and virtualized
     processing capabilities)

   o VSw: Virtual Switch

   o WAN: Wide Area Network

3.  Network-Related Problem Specification

   In this section, we describe the background of VM and VNE migration
   between data centers.

   Why do VMs and VNEs need to be migrated?  First of all, in case of
   overload and during any natural disaster, business-critical data
   center applications need to be migrated to other data centers as
   quickly as possible.  As a pre-condition of data center migration
   and/or integration, some of the applications can be migrated
   without interruption from one data center to another.  Because of
   considerations such as address resources, cooling, and physical
   space in the primary data center, some of the virtual machines can
   be migrated to the backup data center(s) even under normal
   operating conditions.

   Secondly, through seamless management of VM migration, it may be
   possible to save operations, maintenance, and upgrade costs.  For
   example, a legacy server may be physically large, whereas a
   present-day server may be relatively small.  VM migration allows
   users to consolidate workloads onto a single server and retire a
   set of legacy servers, and thus saves a substantial amount of
   physical rack space.  In addition, a virtual machine server
   presents unified "virtual hardware", unlike a legacy server, which
   may have a number of different hardware resources.  After
   migration, the server can be managed through a unified interface.
   We note that by using virtual machine software, such as the
   high-availability tools provided by VMware, it is possible -- when
   a server shuts down due to a failure -- to automatically switch to
   another virtual server in the network without causing any
   disruption in operation.

   In short, migration of VMs under many desirable scenarios has the
   advantages of lowering operations costs, simplifying maintenance,
   improving system load balancing, enhancing system error tolerance,
   and optimizing system-wide power and space management.

   In general, a data center architecture consists of the following
   components:

   o Gateways (Data Center Gateway, Resource Allocation Gateway)

   o Core router / switch

   o Aggregation layer switch

   o Access layer ToR switch

   o Virtual switch

   o Interconnection network between DCs

   o Servers

   o Firewall system, etc.
   Overall, the requirement of VM migration brings the following
   challenges to the forefront of data center operations and
   management:

   (A) How to accommodate a large number of tenants in each isolated
   network in a data center;

   (B) From one DC to another within one administrative domain, (i)
   how to ensure that the necessary conditions of migration are
   satisfied, (ii) how to ensure that a successful migration occurs
   without service disruption, and (iii) how to ensure successful
   rollback when any unforeseen problem occurs in the migration
   process.

   (C) From one administrative domain to another, how to solve the
   problem of seamless communication between the domains.  There are
   several different solutions based on the current Layer-2 (L2) DC
   interconnect technology, and each can solve different problems in
   different scenarios.  In an L2 network, VXLAN
   [draft-mahalingam-dutt-dcops-vxlan-01] is used to resolve the VLAN
   number limitation problem, and NVGRE
   [draft-sridharan-virtualization-nvgre-00] attempts to solve similar
   problems but artificially causes interoperability problems between
   domains.  If unification of the packet encapsulations used in the
   different solutions can be achieved, it is bound to promote
   seamless migration of VMs among DCs, along with the desired
   integration of cloud computing and networking.

   (D) How to utilize IP-based technology to support migration of VMs
   over a Layer-3 (L3) network.  For example, VPN technology can be
   used to carry L2 and L3 traffic across the IP/MPLS core network.

   (E) How to resolve the problems related to mobility and portability
   of VMs among DCs is also an important aspect to consider.

   We discuss the above in more detail in the following sections.  A
   related draft [DCN Ops Req] discusses data center network and
   operations requirements.

3.1.  The Evolution Problems of the Logical Network Topology in VMMI
      Environments

   The question is whether there is any relation between VM migration
   and the topology of the network within a data center.  In simple
   implementations, seamless VM migration should be realized over a
   Layer-2 network.  Since a large number of VMs and their
   applications run in the same Layer-2 domain, VM migration may put
   significant stress on the bandwidth utilization of the data center
   switching network.  In order to improve bandwidth utilization, it
   is required to upgrade the load-balancing capability of a network
   which has numerous equal-cost multiple paths (ECMP) between
   different points.  Multi-root trees (such as Fat Tree, MatrixDCN,
   and other network topologies) and several protocols support ECMP;
   it can be achieved by configuring the appropriate routing, or
   through TRILL or SPB.  However, implementing TRILL or SPB requires
   elimination or upgrading of the existing equipment.  If we can
   encode the positions of nodes in a Fat Tree or MatrixDCN topology
   in their IP or MAC addresses, we can realize seamless and
   transparent VM migration within the data center, even on the
   premise that the large Layer-2 network is composed of existing
   low-end switching equipment.
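   To make the position-encoding idea concrete, the following is a
   minimal sketch in Python.  It assumes a hypothetical Fat Tree
   addressing convention of the form 10.<pod>.<switch>.<host>; the
   concrete scheme used by any given fabric may differ.

      from ipaddress import IPv4Address

      # Hypothetical Fat Tree addressing: 10.<pod>.<edge-switch>.<host>.
      # Encoding a node's position in its address lets a switch derive
      # forwarding decisions locally, without large forwarding tables.

      def position(addr: str) -> tuple[int, int, int]:
          """Return the (pod, edge_switch, host) encoded in an address."""
          octets = IPv4Address(addr).packed
          return octets[1], octets[2], octets[3]

      def same_pod(a: str, b: str) -> bool:
          """True if two servers share a pod, so traffic can stay
          below the core layer."""
          return position(a)[0] == position(b)[0]

      assert position("10.3.2.5") == (3, 2, 5)
      assert not same_pod("10.3.2.5", "10.4.1.9")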
   Note that although Ethernet and IP protocols are meant to support
   arbitrary topologies, these Layer-2 and Layer-3 network protocols
   are not flexible enough for use in data center environments.  The
   lack of flexibility may result in lack of scalability, management
   difficulties, inflexible communications, and poor fault tolerance.
   These ultimately result in lack of support for flexible VM
   migration in increasingly larger and more complex Layer-2 networks.
   However, if we can solve these problems, we will be able to achieve
   flexible migration of VMs in scalable, fault-tolerant Layer-2 data
   center networks.

   Some solutions are moving in the direction of solving the problems
   above; several new topological models and routing architectures
   have been proposed.  These include the Fat Tree fabric and the
   MatrixDCN fabric.  MatrixDCN is a new style of network fabric for
   data center networks.  These fabrics can support super-large-scale
   networks with more than 1,000,000 servers without performance
   degradation.  Furthermore, through ECMP technology, MatrixDCN can
   eliminate the bandwidth bottleneck problems of the canonical
   tree-structured data center networks.  The MatrixDCN fabric is
   described in [Matrix DCN, I-D.sun-matrix-dcn].

3.2.  Cloud Service Virtualization Requirements

   The following sub-sections present the requirements of logical and
   physical elements for Cloud/DC service virtualization and their
   operations.

3.2.1.  Requirements for Logical Elements

   o Resource Allocation Gateway (RA GW)

   Network service providers provide virtualized basic network
   resources for tenants between data centers.  Within the data
   center, the facilities include virtualized computing and
   virtualized storage resources.  The RA gateway's role is to provide
   access to the virtualized resources.  These resources are divided
   into the following three categories: networking resources,
   computing resources, and storage resources.  The RA gateway
   compares the demanded networking, computing, and storage resources
   with the available resources, identifies the corresponding
   relations, and achieves globally reasonable matching and scheduling
   of resources.  The DC GW's function, described below, is a subset
   of the RA GW functions.

   o Data Center Gateway (DC GW)

   The DC gateway provides access to the data center for different
   outside users, including Internet access and VPN connection users.
   In the existing DC network model, the DC GW may be a router with
   virtual routing capabilities, or may be a PE device of an
   IPVPN/L2VPN connection.  Core nodes which perform the roles of DC
   GWs may also provide Internet connectivity, inter-DC connectivity,
   and VPN support.

   o Core Router / Switch

   These are high-end core nodes / switches with routing capabilities
   located in the core layer, connecting aggregation layer switches.

   o Aggregation Layer Switch

   This switch aggregates traffic from the ToR switches and forwards
   the downstream traffic.  The switch can be a normal aggregation
   switch, or multiple switches virtualized into a single stack
   switch.

   o Access Layer ToR Switch

   Access layer ToR switches are usually dual-homed to the parent node
   switch.

   o Virtual Switch

   This is a virtual software switch which runs on a server.

   The requirements related to the above demand that an L2/L3 tunnel
   be terminated at one of the entities mentioned above.

3.2.2.  Requirements for Resource Allocation Gateway (RA GW) Function

   The emerging DC and network providers offer virtualized computing,
   storage, and networking resources and related services.  Tenants
   may use overlapping addresses, and they share a pool of storage and
   networking resources.

   Therefore, a virtual platform is needed, with control and
   management capabilities for virtual machines, virtual services,
   virtual storage, and virtual networks.  What tenants see is a
   subset of the above four entities.  The virtualized platform is
   built on the framework of the physical network, physical servers,
   physical switches and routers, and physical storage devices.
   Through the virtual platform, the tenants are offered globally
   scheduled resources for sharing throughout the entire system.  The
   RA GW collects information related to the system-wide availability
   of computing, storage, and networking resources.  The RA GW then
   allocates appropriate quantities of computing, storage, and
   networking resources to the tenants according to certain policies
   and the demands for resources.
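   As a rough illustration of this matching function, here is a
   minimal sketch in Python.  The resource categories follow the text
   above, while the scoring policy (preferring the tightest bandwidth
   fit) is an assumption for illustration, not a defined RA GW
   algorithm.

      from dataclasses import dataclass

      @dataclass
      class Resources:
          cpu_cores: int       # computing resources
          storage_gb: int      # storage resources
          bandwidth_mbps: int  # networking resources

      @dataclass
      class Site:
          name: str
          available: Resources

      def fits(demand: Resources, avail: Resources) -> bool:
          return (avail.cpu_cores >= demand.cpu_cores and
                  avail.storage_gb >= demand.storage_gb and
                  avail.bandwidth_mbps >= demand.bandwidth_mbps)

      def allocate(demand: Resources, sites: list[Site]) -> Site | None:
          """Pick the feasible site whose spare bandwidth is the
          tightest fit -- a stand-in for the RA GW's 'globally
          reasonable matching' policy."""
          candidates = [s for s in sites if fits(demand, s.available)]
          if not candidates:
              return None
          return min(candidates, key=lambda s:
                     s.available.bandwidth_mbps - demand.bandwidth_mbps)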
   Note that in order to prevent any single point of failure, the RA
   GW needs to have backup support.  The global resource availability
   information and scheduling information (exchanged between the
   resource allocation gateway and the backup resource allocation
   gateway) also need real-time backup.

   It is possible to provide automatic matching and scheduling of the
   virtualized resources, dynamically adjusted according to the
   operating conditions.  This can optimize the utilization of
   computing resources, networking resources such as IDC
   interconnection resources and IDC internal routing and switching
   resources, and storage resources.  It should consider the
   optimization of network path routing when matching network
   resources.  Routing selection can be based on the degree of
   matching between the required bandwidth and the bandwidth that can
   be provided, the shortest path, the service level, and the user and
   usage level.  These factors need to be considered in the
   decision-making process.

3.2.3.  Performance Requirements

   Any preferred solution should be able to easily support a large
   number of tenants sharing the data center resources.  It is also
   required to support a large (more than 4K) number of VLANs.  For
   example, there are a number of VPN applications -- VPLS or IP VPN
   -- which serve more than 10K tenants, each requiring multiple
   VLANs.  In this scenario, the availability of 4K VLANs is not
   sufficient for the tenants.

   The solution should guarantee high quality of service, and must
   ensure that a large number of network connections are not
   interrupted even during overload or minor failure conditions.  The
   connectivity should meet carrier-class reliability and availability
   requirements.

3.2.4.  Fault Tolerance Capability Requirements

   In the event of any fault or error, it is required to quickly
   recover from the error condition.  Error recovery includes network
   fault recovery, computing power recovery, VM migration recovery,
   and storage recovery.  Among them, network fault recovery and
   computing power recovery are the fundamental requirements for VM
   migration recovery and storage recovery.

   Network fault recovery: Once an error or fault condition is
   identified in virtual network connectivity, alarms should be
   triggered, and recovery using a backup virtual network should be
   automatically activated.

   Computing capability recovery: Once the computing capability fails,
   an efficient detection mechanism is needed to find the problem, so
   that services can be scheduled to the backup virtual machines
   designated for those services.
   VM migration recovery: In the event of a VM migration failure, it
   is required to automatically restore the virtual machines to their
   original state so that users' services are not adversely impacted.

   Storage recovery: In the event of storage failures, it is required
   to automatically find a backup virtual storage resource so that it
   can be enabled or activated immediately.

   The response and recovery times should be very short in order to
   minimize service delays and disruptions.

   After the VM migration, it is required to consider the impact on
   the switching network, for example, whether the new network
   environment will suffer from insufficient bandwidth.  Although an
   initial judgment is made at the consultation phase before the
   migration, it cannot guarantee that no problem will occur after the
   migration.  In addition, if the destination DC needs to activate
   standby servers and additional network resources, it may be
   worthwhile to consider allocating and activating additional server
   and network resources in advance.  And, in some cases, some routing
   policies -- on network segments and server clusters -- may need to
   be adjusted as well after migration.

3.2.5.  Network Model

   Traditionally, the DCs have their own private networks for the
   interconnection among themselves.  Alternatively, the data centers
   can use an independent WAN service provider's interconnection
   facilities for primary and/or secondary connections.

3.2.6.  Types and Applications of VPN Interconnection between DCs
        which Provide Cloud Services

3.2.6.1.  Types of VPNs

   Layer-3 VPN:
      BGP/MPLS IP Virtual Private Networks (VPNs) [RFC 4364]

   Layer-2 VPN:
      PBB + L2VPN
      TRILL + L2VPN
      VLAN + L2VPN
      NVGRE [draft-sridharan-virtualization-nvgre-00]
      PBB VPLS
      E-VPN
      PBB-EVPN
      VPLS
      VPWS

3.2.6.2.  Applications of L2VPN in DCs

   It is a very common practice to use L2 interconnection technologies
   for DC interconnection across geographical regions.  Note that VPN
   technology is also used to carry L2 and L3 traffic across the
   IP/MPLS core network.  This technology can be used within the same
   DC to support scalability or interconnection across L3 domains.
   VPLS is commonly used for IP/MPLS connection over the WAN, and it
   supports transparent LAN services.  IP VPN, including BGP/MPLS IP
   VPN and IPsec VPN, has been used in a common IP/MPLS core network
   to provide virtual IP routing instances.

   The implementation of PBB plus L2VPN can take advantage of some of
   the existing technologies.  It is flexible to use a VPN network in
   the cloud computing environment, and it can support a sufficient
   number of VPN connections/sessions (networking resources), which is
   much larger than the 4K VLANs of the plain L2VPN mode.  Therefore,
   it can achieve an effect similar to that of VXLAN.  Note that PBB
   can not only support access to more than 16M virtual LAN instances,
   it can also separate the customers and provide different domains
   through isolated MAC address spaces.

   The use of PBB encapsulation has one major advantage: since the
   VMs' MAC addresses are not processed by the ToRs and core switches,
   the MAC table size of the ToRs and core switches may be reduced by
   up to two orders of magnitude; the specific number is related to
   the number of virtual machines in each server and the number of VM
   virtual interfaces.
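   The following minimal sketch illustrates why PBB (MAC-in-MAC, IEEE
   802.1ah) hides customer MAC addresses from the backbone.  The field
   layout is simplified (the I-TAG flag bits are left zero), and all
   constants other than the standard TPIDs are illustrative.

      import struct

      def pbb_encapsulate(customer_frame: bytes, b_da: bytes,
                          b_sa: bytes, b_vid: int, i_sid: int) -> bytes:
          """Wrap a customer Ethernet frame in a simplified 802.1ah
          header.  Core switches forward on the backbone MACs
          (b_da/b_sa) and B-VID only, so their MAC tables scale with
          the number of backbone edge bridges, not with the number of
          VMs.  The 24-bit I-SID separates more than 16M service
          instances."""
          assert i_sid < 2**24 and b_vid < 2**12
          b_tag = struct.pack("!HH", 0x88A8, b_vid)   # B-TAG TPID + B-VID
          i_tag = struct.pack("!HI", 0x88E7, i_sid)   # I-TAG TPID + I-SID
          return b_da + b_sa + b_tag + i_tag + customer_frame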
   One solution to the problems in a DC is to deploy other
   technologies in the existing DC network.  A service provider can
   separate its VLAN domains into different VLAN islands; in this way,
   each island can support up to 4K VLANs.  The VLAN islands can be
   interconnected via VPLS, with the DC GWs used as the VPLS PEs.  If
   the existing VLAN-based solution is retained only in the VSw, while
   the number of tenants in some VLAN islands is more than 4K, the
   service provider needs to deploy VPLS deeper in the DC network.
   This is equivalent to supporting L2VPN from the ToRs, and using the
   existing VPLS solutions to enable MPLS on the ToR and core DC
   elements.

3.2.6.3.  Applications of L3VPN in DCs

   IP VPN technology can also be used for data center network
   virtualization.  For example, multi-tenant L3 virtualization can be
   achieved by assigning a different IP VPN instance to each tenant
   who needs L3 virtualization in a DC network.

   There are many advantages of using IP VPN as a Layer-3
   virtualization solution within the DC compared to using existing
   virtual routing DC technology.  Some of the advantages are
   mentioned below:

   (1) It supports many VRF-to-VRF tunneling options covering
   different operational models: BGP/MPLS IP VPN, IP or L3 VPN over
   GRE, etc.

   (2) The IP VPN instances used for cloud services below the WAN can
   connect directly to the IP VPNs in the WAN.

3.2.7.  VN Requirements

   The Virtual Networks (VNs) consist of the virtual IDC network and
   the virtual DC internal switching network.  These VNs are built on
   the basis of the physical networks.  VM migration is not affected
   by the physical network: as long as a VM stays within the scope of
   the VN, it is free to migrate if the necessary conditions are
   satisfied.  In addition, the network architecture and
   forwarding/switching capacity should match between the source
   network and the destination network, without causing any concern
   for the physical network.

   The physical characteristics of the network, such as VLAN, IP
   subnet, L2 protocol entities, QoS supporting entities, etc., are
   abstracted as the logical elements of a VN.  Because the VMs
   operate in the VN environment, each VM has its associated logical
   elements, such as CPU, processes, I/O, memory, disk, etc., and the
   VN also has a corresponding set of logical elements.

   In general, the VNs are isolated from each other.  The VMs within
   each VN communicate using their own internal addresses, and send
   and receive Ethernet packets.  VNs are not tied to a specific
   implementation; the implementation can use the Internet, L2VPN,
   L3VPN, GRE, etc.  From the VN layer, IP can be used to make that
   distinction.  Traffic traverses a firewall into the VN, and ACLs
   and other security policies are also needed in the access layer.

3.2.8.  Packet Encapsulation Problems

   In order to implement a virtual network (VN), a method similar to
   an overlay address is required.  The overlay address can be
   realized by VXLAN or by the I-SID of PBB+L2VPN.  The overlay
   address works as an identifier corresponding to every instance of a
   VN.  The implementation model requires that the edge switch or
   router acting as the DC GW perform the encapsulation and
   de-encapsulation of the tunnel packets.  The various VNs within the
   DC rely on the overlay address in order to distinguish and separate
   one from the other.  Each VN also contains 4K VLANs for its
   internal use.  The data packets travel to the DC interconnection
   network through the DC GW, and are encapsulated for subsequent
   transmission.

   The main issue related to the above is the support of
   encapsulation.  In an L2 network, VXLAN supports the VLAN expansion
   requirements.  In NVGRE, a similar problem is resolved in a
   different way.  Therefore, in order to achieve seamless migration
   of VMs across DCs that support different VLAN expansion mechanisms,
   unification of packet encapsulation methods is required.
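   To make the unification problem concrete, the sketch below models a
   DC GW that re-maps a VN's overlay identifier between two
   encapsulation domains.  The class names and the idea of a per-VN
   translation table are illustrative assumptions, not part of any of
   the cited proposals.

      from dataclasses import dataclass

      @dataclass
      class OverlayFrame:
          encap: str    # "vxlan" or "nvgre"
          vn_id: int    # 24-bit VNI (VXLAN) or TNI (NVGRE)
          inner: bytes  # original tenant Ethernet frame

      # Hypothetical per-VN mapping between the identifier spaces of
      # two domains; both VNI and TNI are 24 bits wide, so a 1:1
      # re-mapping is possible in principle.
      VN_TRANSLATION = {("vxlan", 0x1234): ("nvgre", 0x1234)}

      def translate(frame: OverlayFrame) -> OverlayFrame:
          """Re-encapsulate a frame at the border between two domains
          that use different VLAN-expansion mechanisms."""
          encap, vn_id = VN_TRANSLATION[(frame.encap, frame.vn_id)]
          return OverlayFrame(encap=encap, vn_id=vn_id, inner=frame.inner)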
3.2.9.  Network Bandwidth Efficiency Problem of Resource Use

   A single data center site cannot have infinite capacity, due to
   limitations of space, power, cooling, and the electrical and cable
   plant, so a large data center is usually composed of several
   geographically separated sites to ensure scalability and
   reliability.  Technologies such as OTV and VPLS-over-GRE have been
   proposed to support Layer-2 connection between different sites of
   one data center.  Inter-site bandwidth is limited compared with
   intra-site bandwidth, so it is apt to become the bottleneck of
   communications in the data center.  One proposal improves the
   utilization of inter-site bandwidth through IP compression
   technology; it describes the position and processing procedure of
   the IP compression model and its relationship with OTV and
   VPLS-over-GRE.  This problem and solution are described in
   [draft-sun-ip-compression-dcn-00].

3.2.10.  VM Migration Problem in Mixed IPv4 and IPv6 Environments

   With the proliferation of IPv6 technology, the existing IPv4
   networks will have attached IPv6 hosts.  This is driving the
   development of a series of tunnel technologies, e.g., 6to4 tunnel
   technology, ISATAP tunnel technology, and so on.  An ISATAP tunnel
   is a point-to-point automatic tunnel technology, and a 6to4 tunnel
   is a multipoint automatic tunnel technology which is mainly used
   for attaching multiple IPv6 islands over an IPv4 network to connect
   to the IPv6 network.  The ISATAP and 6to4 tunnel technologies work
   through an IPv4 address embedded in the destination address of the
   IPv6 packets, from which the tunnel endpoint is automatically
   obtained.
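   The address-embedding rule is mechanical, as the following sketch
   shows.  The 2002::/16 6to4 prefix and the ISATAP interface
   identifier ::0:5efe:a.b.c.d are the standard forms; the helper
   names are ours.

      from ipaddress import IPv4Address, IPv6Address

      def sixto4_prefix(v4: str) -> str:
          """6to4: the IPv4 address is embedded after the 2002::/16
          prefix, giving the site a 2002:V4ADDR::/48 prefix."""
          packed = IPv4Address(v4).packed
          return str(IPv6Address(b"\x20\x02" + packed + b"\x00" * 10)) + "/48"

      def isatap_address(prefix64: bytes, v4: str) -> str:
          """ISATAP: interface identifier ::0:5efe:<IPv4>, appended to
          an 8-byte /64 prefix."""
          iid = b"\x00\x00\x5e\xfe" + IPv4Address(v4).packed
          return str(IPv6Address(prefix64 + iid))

      # A tunnel endpoint can recover the IPv4 destination directly
      # from the IPv6 destination address, which is what makes these
      # tunnels automatic.
      assert sixto4_prefix("192.0.2.1") == "2002:c000:201::/48"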
   The following issues are pertinent to the migration of VMs across
   data centers in a mixed (IPv4 and IPv6) network environment.

3.2.10.1.  Real-time Perception of Availability of Global Network and
           Storage Resources

   In the current system, the status of availability of network
   resources and storage resources may not be reported in hard real
   time.  This may cause a mismatch between the reported and the
   actually available virtual machine/storage system resources in the
   data centers.  However, on a global scale, the compute and storage
   resources in the distributed data center system may need to be used
   more efficiently.  Without real-time, up-to-date information about
   system resource availability, the network resources cannot be used
   efficiently.  Therefore, a management model needs to be
   established.  This model needs to keep track of system-wide network
   resources and storage resources, and dispatch them on an as-needed
   basis.  The management model can be integrated into the framework
   of virtual machine migration currently being discussed in DMTF
   [DMTF VSMP].

   The real challenges here are how to learn about the availability of
   system-wide networking, compute, and storage resources.  A set of
   uniform methods, mechanisms, and protocols would be very useful to
   resolve these issues.

3.2.10.2.  The Real-time Perception of Globally Available Network
           Resources and Requested Network Resources for Matching with
           Storage Resources

   In mixed IPv4 and IPv6 networks, a multi-tunneling VPN gateway
   solution may be useful to resolve the problem of establishing
   communication between heterogeneous networks.  This will be helpful
   for supporting seamless communication across heterogeneous data
   centers about the availability of system-wide resources.

3.2.10.3.  The Real-time Perception of Globally Requested Network
           Resources for Matching with Storage Resources

   Access to data center virtual machine / storage resources can be
   accurately performed when we have a set of standardized APIs,
   resource (memory, storage, processing, communications, etc.)
   formats, and communication protocols.  The availability of virtual
   machine / storage system resources in the global scope needs to be
   registered, and their status needs to be reported to the resource
   management system in the cloud system.  Eventually, the resource
   management system in the cloud system is kept well informed of
   system-wide network resources.

3.2.11.  Selection of Migration

3.2.11.1.  Requirements with Different Network Environments and
           Protocols

   Currently in large-scale DCs, Layer-2 interconnection techniques
   are mainly used for the migration of virtual machines, but Layer-3
   interconnection techniques for VM migration also exist.  These two
   technologies are suitable for different implementation environments
   and scenarios.

   The former is often used for frequent data migration with strict
   requirements on data security, such as data migration and backup
   for banks, whereas the latter is commonly used for data migration
   for personal or mobile users, or bulk data transfer between
   different service providers.

   Because of users' demands for a unified management platform, it
   will become more and more important to build distributed PaaS
   across different cloud/DC service providers.  No user is willing to
   maintain too many independent platforms.  At the same time, sharing
   of resources across multiple data centers is becoming a major
   trend.  As a result, it will become very cumbersome for data center
   managers to build a large number of VPN connections for all data
   centers.  What may be needed is a portal operator, who can manage
   all the internal VPN connections between the clouds/DCs and can
   unify the scheduling of data/VM migration in order to achieve
   optimum utilization of resources.

3.2.11.2.  Requirements for Live Migration of Virtual Machines

   The scenarios for live migration of VMs across DCs include the
   following: (a) migration across IPv4 networks and across IPv6
   networks, (b) migration from IPv4 to IPv6 networks and vice versa,
   and (c) migration based on Mobile IP.

   Live migration of VMs may be more suitable for mobile applications
   for small-scale and home users.
   The complexity of the network can be fully shielded from the users,
   as long as both the source and the destination have either IPv4 or
   IPv6 addresses.  This migration paradigm can be more secure and
   applicable in a Layer-3 networking environment.

3.2.12.  Access and Migration of VMs without Users' Perception

   For VM migration without users' perception, it is required to
   achieve migration of VMs from one DC to another without causing any
   significant disruption of services.  In essence, the users should
   not be able to perceive that the VM migration has occurred.  To
   achieve this, none (or only an insignificant amount) of the
   critical data packets can be lost during the process of VM
   migration.  The following two considerations are helpful to achieve
   this:

   i.  First, taking the existence of the traffic roundabout problem
   as a prerequisite, consider how to minimize the roundabout traffic.

   ii. Second, taking the absence of traffic roundabout as a target,
   consider how to make both the migration and the traffic path
   changes imperceptible to the user.

   The following are the relevant problems and possible solutions in
   these two areas.

3.2.12.1.  VM Migration Problems and Strategies in the WAN with
           Traffic Roundabout as a Prerequisite

   [Figure: users a, b, and c attach through MAN A, MAN B, and MAN C,
   respectively, to a common backbone network.  VM-A migrates from a
   server behind the VM-A gateway in MAN A to a server behind the
   gateway in MAN B, while remote traffic still enters through the
   VM-A gateway in MAN A.]

                 Figure 1: Roundabout Traffic Scenario

3.2.12.1.1.  VM Migration Requirements

   For migration in a Layer-2 (L2) network, it is required to keep the
   VM MAC / IP addresses the same as they are in the source domain.
   This will help live VM migration and seamless inter-DC
   communications among the service providers.

3.2.12.1.2.  A Scenario

   Let us consider the scenario where a VM needs to be migrated from
   the IDC in metro A to the IDC in metro B.  There is almost no
   traffic roundabout for users within the metro (such as user a).
   For access to IDC services over the WAN, such as from a user in
   metro C, the client traffic must first reach the VM-A gateway after
   the VM migration, and is then sent to the migrated VM through the
   Layer-2 tunnel.

3.2.12.1.3.  A Possible Solution

   Through mechanisms such as the DNS service, businesses can access
   services from a location/DC which is as close as possible, and the
   roundabout routes can be minimized after migration.  However, the
   shortcoming of this approach is that, for access across the metro
   network, traffic roundabout issues remain.  This approach evades
   rather than completely solves the problem.  Moreover, additional
   processing is involved in the control of the DNS service, which
   increases the complexity of the solution.

3.2.12.2.  VM Migration Problems and Strategies in the WAN without
           Traffic Roundabout as a Target

   In this process of VM migration, in order to achieve real-time
   migration without users' perception, the entire state of the
   management programs (including firewalls) needs to migrate as the
   VMs migrate.  The state migration of the firewalls is the key to
   ensuring that the packets in the original firewalls' data flows are
   neither lost nor mis-routed during the VM migration.  Before a VM
   migrates to a new DC environment, the firewalls have recorded the
   existing VM connections' session tables.  After the VM migration,
   the firewalls in the new DC location will be used for access to the
   VM.  If the firewalls in the new location do not have the session
   tables of the original firewalls' data flows, packets will be lost
   or mis-routed: the original sessions will be disconnected, and the
   users' data flows will fail to access the VM.  To solve this
   problem, the original firewall's in-use session tables need to be
   migrated and synchronized with the session tables of the firewall
   at the new VM location.  A session table should contain at least
   the following information: source IP address, destination IP
   address, source port, destination port, protocol type, VLAN ID,
   time of expiration, and public guard information for firewall
   defense.
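   A minimal sketch of such a session entry and its transfer is shown
   below.  The field set follows the list above, while the
   serialization format and the function names are illustrative
   assumptions.

      import json
      from dataclasses import dataclass, asdict

      @dataclass
      class Session:
          src_ip: str
          dst_ip: str
          src_port: int
          dst_port: int
          protocol: int      # e.g., 6 = TCP, 17 = UDP
          vlan_id: int
          expires_at: float  # absolute expiration time
          guard_info: dict   # public guard information for defense

      def export_sessions(table: list[Session]) -> bytes:
          """Serialize the in-use session table for transfer to the
          destination firewall before the VM is switched over."""
          return json.dumps([asdict(s) for s in table]).encode()

      def import_sessions(blob: bytes) -> list[Session]:
          """Rebuild the table on the destination firewall so that
          existing flows are neither dropped nor mis-routed."""
          return [Session(**d) for d in json.loads(blob)]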
   Since the firewall's session table needs to migrate when a VM
   migrates, the deployment of the source and destination firewalls
   should be known in advance.  There are at least two kinds of
   firewall deployment.

   The first kind is centralized deployment.  In this case, the
   firewalls are placed at the connection point between the DC and the
   WAN.  Each DC has firewalls either on or adjacent to the core
   switches.  The second kind is distributed deployment.  In this
   case, the firewalls are distributed on the aggregation switches or
   access switches.  The former's advantages are convenient management
   and deployment; its disadvantage is that the firewalls can easily
   become a bottleneck because of centralized/aggregated processing.
   The latter's advantage is distributed processing of huge VM data
   flows in a large L2 network.

   After knowing the deployment of the firewalls, it is necessary to
   determine how to migrate the firewall session table from the source
   location to the destination location.  Since the location and
   number of firewalls differ between the centralized and distributed
   deployments, the mechanisms utilized to migrate the session tables
   in these two deployments are not exactly the same.  These are new
   challenges to be addressed for VM migration.

3.2.13.  Review of VXLAN, NVGRE, and NVO3

   In order to solve the problem of an insufficient number of VLANs in
   the DC, techniques like VXLAN and NVGRE have adopted two major
   strategies: one is encapsulation and the other is tunneling.  Both
   VXLAN and NVGRE use encapsulation and tunneling to create a large
   number of VLAN-like subnets, which can be extended over Layer-2 and
   Layer-3 networks.  This solves the problem of the limitation on the
   number of VLANs defined by IEEE 802.1Q, and helps achieve shared
   load balancing in multi-tenant environments in both public and
   private networks.

   The VXLAN technology was introduced in 2011, and it is designed to
   address the number restrictions of 802.1Q VLANs.  Technologies like
   MAC-in-MAC and MAC-in-GRE also extend the number of VLANs.
   However, VXLAN additionally attempts to address the issues related
   to inadequate utilization of link resources and to monitor packets
   after re-encapsulation of the header more effectively.  The frame
   format of VXLAN is the same as those of OTV and LISP, although
   these three solutions solve different problems of IDC
   interconnection and VM migration.  Also, in VXLAN, the packet is
   encapsulated in MAC-in-UDP, and the addressing is extended to 24
   bits, which is an effective solution to the restriction on VLAN
   numbers.  The UDP encapsulation enables the logical virtual network
   to extend across different subnets, and it also supports the
   migration of VMs across subnets.  The change in the frame's
   structure adds the field for extending the VLAN space.
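   For concreteness, the following sketch packs the 8-byte VXLAN
   header (flags plus the 24-bit VNI, as defined in
   [draft-mahalingam-dutt-dcops-vxlan-01]) in front of an inner
   Ethernet frame.  Deriving the outer UDP source port from a hash of
   the inner headers, for ECMP entropy, follows the practice described
   in that draft; the hash choice here is illustrative.

      import struct
      import zlib

      VXLAN_PORT = 4789  # IANA-assigned VXLAN destination port

      def vxlan_header(vni: int) -> bytes:
          """8-byte VXLAN header: flags (I bit set) + 24-bit VNI."""
          assert vni < 2**24
          return struct.pack("!II", 0x08 << 24, vni << 8)

      def entropy_source_port(inner_frame: bytes) -> int:
          """Hash the inner headers into the outer UDP source port so
          that ECMP in the underlay spreads tenant flows."""
          return 49152 + (zlib.crc32(inner_frame[:34]) % 16384)

      def vxlan_encapsulate(inner_frame: bytes, vni: int):
          """Return (udp_src_port, udp_payload) for the outer packet."""
          return (entropy_source_port(inner_frame),
                  vxlan_header(vni) + inner_frame)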
   Note that VXLAN solves a different problem than OTV.  OTV solves
   the problem of IDC interconnection by building an IP tunnel between
   different data centers through MAC-in-IP.  VXLAN mainly solves the
   problem of the limitation of VLAN resources in DCs due to the
   increase in the number of tenants; the key is the expansion of the
   VNI field to increase the number of VLANs.  Both techniques can be
   applied to VM migration, since the two packet formats are almost
   the same and completely compatible.

   NVGRE specifies a 24-bit Tenant Network Identifier (TNI) and
   resolves some issues related to supporting multiple tenants in a DC
   network.  It uses GRE to create an independent virtual Layer-2
   network, which can expand across subnet borders without requiring
   the physical Layer-2 network to expand.  Terminals supporting NVGRE
   insert the TNI indicators into the GRE headers to separate the
   TNIs.

   NVGRE and VXLAN solve the same problem, and the two technologies
   were proposed almost at the same time.  However, there are some
   differences between them: VXLAN not only adds the VXLAN header
   (VNI), but also adds an outer UDP encapsulation on the packet,
   which facilitates live migration of VMs across subnets.  In
   addition, differentiated services can be supported for tenants in
   the same subnet because of the use of UDP.  Both proposals are
   built on the assumption that load balancing is a necessary
   condition for efficient operation: VXLAN randomizes the source port
   number to achieve load balancing, while NVGRE uses the reserved 8
   bits in the GRE Key field.  However, there may be opportunities to
   improve the capabilities of the control plane for both mechanisms
   in the future.

3.2.14.  The East-West Traffic Problem

   Let us first discuss the background of the East-West traffic
   problem.  There are a variety of applications in the DC, such as
   distributed computing, distributed storage, and distributed search.
   These applications and services need frequent exchanges of
   transactions between the business servers across the DCs.
   According to the traditional three-tier network model, the data
   streams first flow north-south and only finally flow east-west.  In
   order to improve the forwarding efficiency of the data streams, it
   is necessary to update the existing network model and network
   forwarding technology.  Among others, the Layer-2 multipath
   technology being studied is one of the directions for solving this
   problem.

   Distributed computing is the basis of the transformation of the
   existing IT services.  It allows scalable and efficient use of the
   sometimes underutilized computing and storage resources scattered
   across the data centers.  In typical data centers, the average
   server utilization is often low in the existing network.
   The concepts of virtualization and distributed computing can solve
   the problem of the capacity limitation of a single server in
   demanding environments in certain DCs, via on-demand utilization of
   resources and without impacting performance.  This revolutionary
   technology of distributed computing and services using resources in
   the DCs also produces several horizontal (east-west) flows of
   traffic.  The application of distributed computing technology on
   the servers produces a large number of interactive traffic streams
   between servers.

   In addition, the type of IDC influences the traffic model both
   within and across data centers.  The first type of IDC is run by
   telecom operators, who usually not only operate DCs but also supply
   bandwidth for Level-2 ISPs.  The second type is the traditional ISP
   companies with strong capabilities.  The third type is IT
   enterprises which invest in the construction of DCs.  The fourth
   type is the high-performance computing (HPC) centers that are built
   by universities, research institutes, and other organizations.
   Note that in these types of DCs, the south-north traffic flow is
   significantly smaller than the horizontal flow, and this poses the
   greatest challenges to network design and installation.  In
   addition to the normal flow of traffic due to distributed
   computing, storage, communications, and management, hot backup and
   VM migration requirements produce sudden lateral flows of traffic,
   and associated challenges.

   There are three potential solutions to the distributed horizontal
   flow of traffic, as described below.

   A. The first one is to solve the problem of east-west traffic
   within the server clusters by exploiting representative
   technologies such as vSwitch, DCell, BCube, and DCTCP.

   B. The second solution works through the server network and
   Ethernet network, by exploiting technologies such as IEEE 802.1Qbg,
   VEPA, and UCS.

   C. The third solution is the network-based solution.  The tree
   structure of the traditional DC network is not inherently efficient
   for horizontal flows of traffic.  The problems can be solved in two
   ways: (i) the direction of radical change: radical deformations
   that change the tree structure to multipath; and (ii) the direction
   of mild improvement: change large L2 trees to small L2 trees and
   meet the requirements by expanding the interconnection capacity of
   the upper node, clustering/stacking systems, and link trunking.

   The requirements related to the above are as follows: stacking
   technology across the data center requires specialized interfaces,
   and the feasible transmission distance is limited.

   The problems related to the above include the following: (a)
   although TRILL resolves the multipath problem at the Layer-2
   protocol level, it negatively impacts the multipath properties of
   the Layer-3 protocol.  This is because the Virtual Router
   Redundancy Protocol (VRRP) supports only one active default router,
   which means that the multipath characteristics cannot be fully
   utilized at the Layer-3 protocol level.  In addition, TRILL does
   not define how to deal with the problem of overlapping namespaces,
   nor does it provide any solution to the requirement of supporting
   more than 4K VLANs.

3.2.15.  Data Center Interconnection Fabric Related Problems

   One of the most important factors that directly impact VMMI is the
   connectivity among the relevant data centers.  There are many
   features that determine this required connectivity.
   These features of connectivity include bandwidth, security, quality
   of service, load-balancing capability, etc.  They are frequently
   utilized to decide whether a VM can join a host in real time, or
   whether it needs to join a VRF in a certain unit of VMs.  This
   connectivity fabric should be open and transparent, which can be
   achieved by developing simple extensions to some of the existing
   technologies.  The scheme should have strong openness and
   compatibility; it must also be easy to deploy any required
   extensions.

   The requirements related to the above are as follows:

   o The negative impact of ARP, MAC, and IP entry explosion on an
     individual network which contains a large number of tenants
     should be minimized by DC and DC-interconnect technologies.

   o The link capacity of both the intra-DC and inter-DC networks
     should be effectively utilized.  Efficient utilization of the
     link capacity requires that traffic be forwarded on the shortest
     path between two VMs, both within a DC and across DCs.

   o Support of east-west traffic between customers' applications
     located in different DCs.

   o Management of VMs across DCs.

   o Mobility of VMs and their migration across DCs.

   Many mature VPN technologies can be utilized to provide
   connectivity between DCs.  The extension of VLANs and virtual
   domains between DCs may also be utilized for this purpose.

3.2.16.  MAC, IP, and ARP Explosion Problems

   Network devices within data centers encounter many problems in
   supporting the conventional communication framework, because they
   need to accommodate a huge number of IP addresses, MAC addresses,
   and ARP entries.  Each blade server usually supports at least 16-40
   VMs, and each VM has its own MAC address and IP address.  The
   growth of entities like the FDB table and the MAC table causes an
   increase in convergence time.  In order to accommodate this large
   number of servers, different options for the network topology, for
   example, a fat tree topology or a conventional network topology,
   may be considered.

   The number of ARP packets grows not only with the number of virtual
   L2 domains or ELANs instantiated on a server, but also with the
   number of VMs in each domain.  Therefore, scenarios like overload
   of ARP entries on the server/hypervisor, exhaustion of ARP entries
   on the routers/PEs, and processing overload of L3 service
   appliances must be efficiently resolved.  These problems can easily
   propagate throughout the Layer-2 switching network.  Consequently,
   what is needed to resolve these problems includes (a) automated
   management of MAC/IP/ARP in the IDC, and (b) network deployment
   that reduces the explosion in MAC table size requirements in DCs.
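   One commonly discussed mitigation, sketched below under our own
   naming, is a directory-based ARP proxy at the ToR or vSwitch: it
   answers ARP requests from a pre-populated mapping instead of
   flooding them, which addresses both the ARP explosion above and the
   flooding concern in the next subsection.

      # Hypothetical directory-based ARP proxy at the ToR/vSwitch
      # edge.  IP-to-MAC bindings are pushed by a management system,
      # so ARP requests are answered locally instead of being flooded
      # within the VLAN.

      ARP_DIRECTORY: dict[str, str] = {}  # "10.1.2.3" -> "52:54:00:ab:cd:ef"

      def learn(ip: str, mac: str) -> None:
          """Binding pushed by the orchestration system (e.g., on VM
          boot or after migration), not learned from flooding."""
          ARP_DIRECTORY[ip] = mac

      def handle_arp_request(target_ip: str) -> str | None:
          """Answer from the directory; None only for genuinely
          unknown hosts, the sole case where flooding may remain."""
          return ARP_DIRECTORY.get(target_ip)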
3.2.17.  Suppressing Flooding within VLAN

   Efficient operation of data centers requires that the flooding of
   broadcast, multicast, and unknown unicast frames within a VLAN
   (which may be caused by improper configuration) be reduced.

3.2.18.  Convergence and Multipath Support

   Although STP is used to solve the broadcast storm problem in loops,
   it may cause network oscillation, resulting in inefficient
   utilization of resources.  Possible solutions to this problem
   include switch virtualization, the use of TRILL and SPB, etc.
   Consequently, standardization of switch virtualization and support
   for complex network topologies in TRILL/SPB would be very helpful.

3.2.19.  Routing Control - Multicast Processing

   In order to achieve efficient operation of data centers, the
   overheads and delays due to the processing of (a) different types
   of packets, such as unicast, multicast, and broadcast, (b) ARP
   packets, and (c) load-balancing/-sharing mechanisms must be
   minimized.

   Note that STP bridging is often used to perform IGMP and/or PIM
   snooping to optimize multicast data delivery.  However, since this
   snooping mechanism operates on the local STP topology, all traffic
   goes through the root bridge of each spanning tree.  This type of
   traversal may lead to sub-optimal multicast traffic transmission.
   There are also additional overheads because each customer multicast
   group is associated with a forwarding tree throughout the Ethernet
   switching network.  Consequently, the development and
   standardization of efficient Layer-2 multicast mechanisms to
   support intra- and inter-DC VM mobility would be very useful.

3.2.20.  Problems and Requirements related to DMTF

   o Computing Resources
   It is required to standardize the format for virtualizing computing
   resources.  Best practices for utilizing a standardized format for
   the mobility and interconnection management of virtualized
   computing resources would also be very useful.

   o Storage Resources
   It is required to standardize the format for virtualizing storage
   resources.  Best practices for utilizing a standardized format for
   the mobility and interconnection management of virtualized storage
   resources would also be very useful.

   o Memory Resources
   It is required to standardize the format for virtualizing memory
   resources.  Best practices for utilizing a standardized format for
   the mobility and interconnection management of virtualized memory
   resources would also be very useful.

   o Switching Resources
   It is required to standardize the format for virtualizing switching
   resources.  Best practices for utilizing a standardized format for
   the mobility and interconnection management of virtualized
   switching resources would also be very useful.

   o Networking Resources
   It is required to standardize the format for virtualizing
   networking resources.  Best practices for utilizing a standardized
   format for the mobility and interconnection management of
   virtualized networking resources would also be very useful.

4.  Control & Mobility Related Problem Specification

4.1.  General Requirements and Problems of State Migration

4.1.1.  Foundation of Migration Scheduling

   A series of inspections needs to be done before initiating the VM
   migration process.  The hypervisor should be able to confirm which
   data centers need to be interconnected for migrating VM data over
   the network.  The hypervisor should also be able to confirm which
   subnets and servers in the current network are most suitable to
   accommodate the migrated VMs.

4.1.2.  Authentication for Migration

   For VM migration, authentication is required for all of the
   following entities: network resources, processor, memory and
   storage resources, load balancer, firewall, etc.

4.1.3.  Consultation for Assessing Migratability

   After successful authentication, it is required to check that the
   inter-DC networking resources can support the migration of VMs.
   The required resources include network bandwidth resources, storage
   resources, resource pool scheduling or management resources, and so
   on.  A sketch of these pre-migration checks follows.
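   The following sketch strings the steps of Sections 4.1.1 - 4.1.3
   together.  The check names and the order of evaluation are our own
   illustration of the consultation phase, not a standardized
   procedure.

      from typing import Callable

      # Each pre-migration check is a named predicate; all must pass
      # before the migration itself is scheduled (see also Figure 2
      # below).  The concrete checks mirror Sections 4.1.1 - 4.1.3.
      Check = tuple[str, Callable[[], bool]]

      def consult(checks: list[Check]) -> tuple[bool, list[str]]:
          """Run the consultation phase; return (ok, failed names)."""
          failed = [name for name, pred in checks if not pred()]
          return (not failed, failed)

      # Example wiring with stub predicates:
      ok, failed = consult([
          ("inter-DC connectivity", lambda: True),   # 4.1.1
          ("suitable subnet/server", lambda: True),  # 4.1.1
          ("authentication", lambda: True),          # 4.1.2
          ("bandwidth reservation", lambda: True),   # 4.1.3
          ("storage reservation", lambda: True),     # 4.1.3
      ])
      assert ok and not failed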
4.1.4. Standardization of Migration State

As an example of standardizing the VM state migration process, the related entities should be aware of each other's state. The flow of activities may be as follows: global detection -> authentication processing -> capability negotiation -> session establishment -> initialization of instance -> establishment of the beginning stage -> begin migration -> migration and migration-exception handling -> finish migration -> end stage -> destruction of instances -> (back to) global detection.

                +---------------------------+
                |                           |
               \|/                          |
      +--------------------+                |
      |  global detection  |                |
      +--------------------+                |
                |                           |
               \|/                          |
      +--------------------+                |
      |  authentication    |                |
      |  processing        |                |
      +--------------------+                |
                |                           |
               \|/                          |
      +--------------------+                |
      |  capability        |                |
      |  negotiation       |                |
      +--------------------+                |
                |                           |
               \|/                          |
      +--------------------+                |
      |  session           |                |
      |  establishment     |                |
      +--------------------+                |
                |                           |
               \|/                          |
      +--------------------+                |
      |  initialization    |  establish the |
      |  of instance       |  beginning     |
      +--------------------+  stage         |
                |                           |
               \|/                          |
      +--------------------+                |
      |  begin migration   |<-------+       |
      +--------------------+        |       |
                |                   |       |
               \|/                  |       |
            migration   Y    +------------+ |
            exception? ----->| exception  | |
                |            | processing | |
                | N          +------------+ |
               \|/                          |
      +--------------------+                |
      |  finish migration  |                |
      +--------------------+                |
                |                           |
               \|/                          |
      +--------------------+                |
      |  destruction       |  end stage     |
      |  of instances      |                |
      +--------------------+                |
                |                           |
                +---------------------------+

   Figure 2: A Flow Chart for State Migration between Data Centers

4.2. Mobility in Virtualized Environments

In order to support VM mobility, VMs must be able to migrate easily and repeatedly -- that is, as often as needed by the applications and services -- among a large number (i.e., more than two) of DCs. Seamless migration of VMs in mixed IPv4 and IPv6 VPN environments should be supported by using appropriate DC GWs.

VMs in the resource pool should support mobility. These mobile VMs can move either within a DC or from one DC to another, remote DC. The mobility can be triggered by factors such as a natural disaster, load imbalance, a cost-reduction campaign (space, electricity, etc.), and so on. When a VM is migrated to a new location, it should maintain the existing client sessions. The VM's MAC and IP addresses should be preserved, and the state of the VM's sessions should be copied to the new location.

Some widely used virtual machine migration tools require that the management programs on the source server and the destination server be directly connected via an L2 network. The objective is to facilitate smooth VM migration. One example of such a tool is VMware's vMotion virtual machine migration tool.

(1) Firstly, a vMotion ELAN may need to provide protection and load balancing across multiple DC networks.

(2) Secondly, in the current vMotion procedure, the new location of the VM must be part of the tenant ELAN domain. When the new VM is activated, a Gratuitous ARP is sent, and the MAC FIB entries in the tenant ELAN are updated to direct traffic for that VM to its new location, as illustrated in the sketch below.

(3) Thirdly, if the path requires IP forwarding, the reachability information of the VM must be updated so that traffic follows the shortest path to the VM's new location.
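As an illustration of step (2) above, the following sketch (in Python, assuming the Scapy packet library is available; the interface name and addresses are placeholders from the documentation ranges) emits the kind of Gratuitous ARP that refreshes the tenant ELAN's MAC FIB entries after a VM is activated at its new location:

   # Illustrative Gratuitous ARP after VM activation (cf. step (2));
   # assumes the Scapy packet library. Addresses are placeholders.
   from scapy.all import Ether, ARP, sendp

   vm_mac = "52:54:00:12:34:56"   # preserved MAC of the migrated VM
   vm_ip  = "192.0.2.10"          # preserved IP of the migrated VM

   garp = (Ether(src=vm_mac, dst="ff:ff:ff:ff:ff:ff") /
           ARP(op=2,                   # ARP reply ("is-at")
               hwsrc=vm_mac, psrc=vm_ip,
               hwdst="ff:ff:ff:ff:ff:ff", pdst=vm_ip))

   # Broadcast on the tenant ELAN; switches relearn the VM's MAC on
   # the port facing the new location and update their FIB entries.
   sendp(garp, iface="eth0")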
4.3. VM Mobility Requirements

4.3.1. Summary of Mobility

Mobility refers to the movement of a VM from one server to another server, within one DC or to a different DC, while the VM's original IP and MAC addresses are maintained throughout the process. VM mobility does not change the VLAN/subnet connection of the VM, and it requires that the serving VLAN be extended to the new location of the VM. In summary, a seamless mobility solution in the DC is based on IP routing, BGP/MPLS MAC-VPN, BGP/MPLS IP VPNs, and NHRP [VM-Mobility].

4.3.2. Problem Statement

The following are the major issues related to supporting seamless mobility of VMs.

The first problem is that the participating source server and destination server in the VM migration process may be located in different data centers. It may be required to extend the Layer-2 network beyond what is covered by the L2 network of the source DC. This may create islands of the same VLAN in different (geographically dispersed) data centers.

The second problem is that optimal forwarding in a VLAN that supports VM mobility may involve traffic management over multiple data centers.

The third problem is that support of seamless VM mobility across DCs may not always achieve optimal intra-VLAN forwarding.

The fourth problem is that support of seamless VM mobility across DCs may not always result in optimal routing.

5. Network Management Related Problem Specification

5.1. Data Center Maintenance

We note that the servers and the applications/services in the data center should maintain uninterrupted service during the migration process. The following are some prerequisites for providing uninterrupted service during migration (a sketch of the request-buffering step appears after this list):

o  It is required to ensure that networking and communication services remain uninterrupted between the source node and the destination node during the migration.

o  A stateful migration may be preferred. It may be desirable not to respond to users' requests until the migration completes successfully. The service management program in the source server records the current state of the VM and saves users' requests for any service/operation on the VM at the source node.

o  It is required to copy the state data of the source VM to the target VM in another DC; the new VM in the target node (DC) can then be activated to accept service requests.

o  The service management program in the source server needs to store (cache) both the operation requests and the current state of the source VM, and send them over the network to the service management program in the target server. As soon as the target server and VM become ready, the service management program in the target server delivers the received operation requests to the target VM. The target VM takes the received final state of the source VM as its initial operational parameters.

However, in real-life operations, a system malfunction may occur in any one of the above four steps/scenarios. For example, it may be difficult to ensure uninterrupted communication/networking between the source node and the destination node during the entire migration process.
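A minimal sketch of the request-buffering behavior described in the last bullet above (in Python; the class and method names are illustrative assumptions, not an actual product interface):

   # Minimal sketch of the source-side service management program's
   # buffering role (last bullet above). Names are illustrative.
   import queue

   class MigrationBuffer:
       def __init__(self):
           self.pending = queue.Queue()  # requests held during migration
           self.vm_state = None          # last recorded state of the VM

       def record_state(self, state):
           # Record the source VM's current state before cut-over.
           self.vm_state = state

       def hold_request(self, request):
           # During migration, do not answer; queue for later replay.
           self.pending.put(request)

       def drain_to_target(self, send):
           # Once the target VM is ready, ship the final state and
           # replay the buffered requests in arrival order via the
           # supplied 'send' callable (transport is out of scope).
           send(("state", self.vm_state))
           while not self.pending.empty():
               send(("request", self.pending.get()))

   buf = MigrationBuffer()
   buf.record_state({"sessions": 12, "dirty_pages": 4096})
   buf.hold_request("read /var/log/app.log")
   buf.drain_to_target(print)   # 'print' stands in for the transport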
Maintaining sustainable network QoS may be complex, and VM migration may take an excessively long time due to a lack of timely availability of the required nodal/DC resources.

If the VM migration time is excessively long, users may need to be allowed to continue using the source VM, and the data changes made during the migration must also be recorded. At the same time, measures must be taken to ensure that the amount of change in the database and applications is as small as possible. This will help achieve faster recovery, and the interruption due to VM migration will be almost imperceptible to the users.

It may be useful if the IETF proposes a standard definition of uninterrupted service for the VM migration scenario. This definition, along with its parameters, could be the basis for checking the maturity of various VM migration solutions. The definition should take into account the time that users/services can tolerate without perceiving any interruption in operation. The total time is the sum of the times required to execute the four steps/processes mentioned at the beginning of this section. It may be expected that the most mature solution for each of the steps/processes will offer the fastest and best solution to the VM migration process.

The next problem related to this topic is the physical device compatibility problem. When migrating a VM from one Physical Machine (PM) to another, if the VM depends on a special driver or hardware feature that is not available in the target PM, the migration process will fail. For example, if a VM uses IOMMU technology, which allows the VM to access real hardware directly (not emulated by the hypervisor, for high performance), and this device is not available in the target PM, the VM migration process will fail. Therefore, a basic requirement related to VM migration is a strict compatibility check between the source and target PMs before initiating the migration process; an illustrative sketch of such a check is given at the end of this section.

Another problem related to this topic is the migration of VMs between heterogeneous hypervisors. We note that some virtual network functions, such as the vSwitch in VMware, are implemented in the hypervisor.

Additional requirements related to the above are as follows: stateful and stateless VMMI processing need to be treated separately. Stateless VMMI processing refers to the fact that the protocol state of a transaction does not need to be preserved in memory. This lack of state means that if follow-up processing requires information from an earlier transaction, that information must be retransmitted, which can lead to a significant increase in the amount of data that needs to be transferred as the number of connections increases. For stateless VM migration, there is no need to transfer previous state information, and hence lightweight processing and fast response can be achieved.
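An illustrative version of the compatibility pre-check mentioned above (in Python; the capability names are invented placeholders, and a real check would compare hypervisor- and hardware-specific feature lists):

   # Illustrative source/target PM compatibility check (see the
   # physical device compatibility problem above). The capability
   # names are invented placeholders.

   def check_pm_compatibility(vm_required_caps, target_pm_caps):
       """Return the set of capabilities the target PM is missing."""
       return set(vm_required_caps) - set(target_pm_caps)

   missing = check_pm_compatibility(
       vm_required_caps=["iommu_passthrough", "sse4.2", "10g_nic"],
       target_pm_caps=["sse4.2", "10g_nic"])

   if missing:
       # Per the requirement above: refuse to start the migration.
       print("migration blocked; target PM lacks:", sorted(missing))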
5.2. Load Balancing after VM Migration and Integration

When virtual machines migrate between data centers, there may be a requirement to place computation near the users according to the "follow the sun" principle, or to balance load across multiple sites. In addition, to reduce energy consumption, cooling costs, and other similar expenses, virtual machines can be consolidated into fewer, less active data centers; this is the future trend of so-called "green" data centers. The challenge related to this topic is how to solve the resulting load-balancing problem.

For example, before the migration of a VM, the load on the source VM's server and its network traffic distribution may be locally balanced, and the load on the destination VM's server and its network traffic distribution may likewise be locally balanced. However, after the migration of the VM from the source server to the destination server, both the load conditions and the traffic distribution may remain unbalanced, even for some extended time period. Therefore, it may be useful to define and enforce a set of policies in order to allocate VMs and other networking and computing resources uniformly across data centers. Of course, the software, hardware, and networking environments of the source and destination servers should also be as similar as possible.

5.3. Security and Authentication of VMMI

During the VMMI / VM migration process, proper consideration must be given to security-related matters; this includes solving traffic-detour issues, ensuring that firewall functionalities are appropriately applied, and so on.

Therefore, in addition to authorization and authentication, appropriate policies and measures to check/enforce the security level must be in place while migrating VMs from one DC to another, especially from a private DC to a public DC in the Cloud [NIST 800-145, Cloud/DataCenter SDO Survey].

For example, when a VM is migrated to the destination DC network, the corresponding switch port of the VM and its host server should adopt the port policy of the source switch. The completion time of the VM migration and the time at which the policy is installed must be synchronized. If the former is earlier than the latter, services may not get a timely response; if the former is later than the latter, the required level of network security may not be in place for a period of time. What may be helpful in such an environment is the creation and maintenance of a well-designed interactive state machine.

5.4. Efficiency of Data Migration and Fault Processing

It may be useful to streamline the data before commencing VM migration. Incremental migration may help improve VM migration efficiency; for example, one may plan to transfer only the differential (changed) data during the VM migration process between two DCs. However, this strategy carries the risk of propagating faults between DCs.

In addition, if VM migration occurs between heterogeneous database systems, such as the transfer of data from an Oracle database on a Linux system to an SQL Server database on a Windows system, it is necessary to define the security measures and policies that apply when a fault occurs. The processing of VM migration may be slower when a database migration operation fails, and there may be a need to roll back to previous stable states for all of the databases involved in the VM migration. Similar issues are being discussed in the DMTF [DMTF VSMP] as well.

5.5. Robustness Problems

5.5.1. Robustness of VM Migration

During normal operations, VMs may encounter a series of challenges, e.g., CPU overload, memory and storage stress, disk space limitations, excessive program response time, database write failures, file system failures, etc. If any of these issues cannot be resolved in a timely fashion, it will lead to the collapse of the VM migration process. A minimal detection sketch is given below.
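A minimal sketch of detecting such conditions before or during migration (in Python; the thresholds and metric names are illustrative assumptions):

   # Minimal health check for the conditions listed in Section 5.5.1.
   # Thresholds and metric names are illustrative assumptions.

   THRESHOLDS = {
       "cpu_util":     0.95,   # fraction of CPU busy
       "memory_util":  0.90,   # fraction of memory in use
       "disk_free_gb": 5.0,    # minimum free disk space
       "resp_time_ms": 2000,   # maximum program response time
   }

   def health_issues(metrics):
       """Return the list of threshold violations in 'metrics'."""
       issues = []
       if metrics["cpu_util"] > THRESHOLDS["cpu_util"]:
           issues.append("CPU overload")
       if metrics["memory_util"] > THRESHOLDS["memory_util"]:
           issues.append("memory stress")
       if metrics["disk_free_gb"] < THRESHOLDS["disk_free_gb"]:
           issues.append("disk space limitation")
       if metrics["resp_time_ms"] > THRESHOLDS["resp_time_ms"]:
           issues.append("excessive response time")
       return issues

   # If issues persist, trigger the snapshot-based recovery described
   # next, rather than letting the migration collapse.
   print(health_issues({"cpu_util": 0.99, "memory_util": 0.50,
                        "disk_free_gb": 40.0, "resp_time_ms": 120}))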
As a part of the recovery process, the VM management process should take a snapshot of all data in the VM and copy it into a blank VM (VM template) on the current or a distant server, with the objective of preventing any service disruption. The snapshot can be stateful or stateless, depending on (a) the status, nature, and function of the owner to which the various data in the VM belong, and (b) the replication strategy. For example, for the data in a database, a stateful snapshot needs to be taken, because the database itself has the ability to record its own running state.

We note that incremental migration of VM state alone is not sufficient to guarantee service continuity; an alternative solution may be warranted. During the VM migration process, if the rate at which the VM writes (dirties) data exceeds the rate at which data is transferred from the source VM location to the destination VM location, the VM state transfer has to be paused to allow time for bulk data transfer. During this adjustment period, service downtime will occur. It is required to develop methods and mechanisms to overcome such service discontinuity.

5.5.2. Robustness of VNE

During normal operations, VNEs may encounter a series of challenges, e.g., CPU overload, memory stress, space limitations of the MAC table and the forwarding table, lack of routing convergence, excessive program response time, file system failures, etc. If any of these issues cannot be resolved in a timely fashion, it will lead to the collapse of the VNE migration.

As a part of the recovery process, the VNE management process should take a snapshot of all data in the VNE and copy it into an idle/unassigned VNE on the current or a distant node, with the objective of preventing any service disruption. The snapshot can be stateful or stateless, depending on (a) the status, nature, and function of the owner to which the various data in the VNE belong, and (b) the replication strategy.

For example, for a stateful snapshot of a VNE, both the protocol state and the contents of the forwarding table need to be captured and transferred to the new (migrated) location of the VNE.

6. Acknowledgement

The following experts have provided valuable comments on earlier versions of this draft: Thomas Narten, Christopher Liljenstolpe, Steven Blake, Ashish Dalela, Melinda Shore, David Black, Joel M. Halpern, Vishwas Manral, Lizhong Jin, Juergen Schoenwaelder, Donald Eastlake, and Truman Boyes. We express our sincere thanks to them and hope that they will continue to provide suggestions in the future.

7. References

[PBB-VPLS] Balus, F., et al., "Extensions to VPLS PE model for Provider Backbone Bridging", draft-ietf-l2vpn-pbb-vpls-pe-model-04.txt (work in progress), October 2011.

[VM-Mobility] Aggarwal, R., et al., "Data Center Mobility based on BGP/MPLS, IP Routing and NHRP", draft-raggarwa-data-center-mobility-01.txt (work in progress), September 2011.

[DCN Ops Req] Dalela, A., "Datacenter Network and Operations Requirements", draft-dalela-dc-requirements-00.txt (work in progress), December 30, 2011.

[DMTF VSMP] DMTF, "Virtual System Migration Profile", DSP1081, Version 1.0.0c, May 2010.

[VPN Applicability] Bitar, N., "Cloud Networking: Framework and VPN Applicability", draft-bitar-datacenter-vpn-applicability-01.txt (work in progress), October 2011.

[VXLAN] Mahalingam, M., et al., "VXLAN: A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Networks", draft-mahalingam-dutt-dcops-vxlan-01.txt (work in progress), February 24, 2012.

[NIST 800-145] Mell, P. and T. Grance, "The NIST Definition of Cloud Computing", NIST Special Publication 800-145, http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf, September 2011.

[Cloud/DataCenter SDO Survey] Khasnabish, B. and C. JunSheng, "Cloud/DataCenter SDO Activities Survey and Analysis", draft-khasnabish-cloud-sdo-survey-02.txt (work in progress), December 28, 2011.

[NVGRE] Sridharan, M., et al., "NVGRE: Network Virtualization using Generic Routing Encapsulation", draft-sridharan-virtualization-nvgre-00.txt (work in progress), September 2011.

[NVO3] Narten, T., "NVO3: Network Virtualization", l2vpn-9.pdf, November 2011.

[Network State Migration] Gu, Y., "draft-gu-opsawg-policies-migration-01", draft-gu-opsawg-policies-migration-01.txt (work in progress), October 2011.

[Matrix DCN] Sun, et al., "Matrix Fabric based Data Center Network", draft-sun-matrix-dcn-00.txt (work in progress), 2012.

8. Security Considerations

To be added later, on an as-needed basis.

9. IANA Considerations

The extensions discussed in this draft are related to the DC operations environment.

10. Normative References

[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006.

Authors' Addresses

Bhumip Khasnabish
ZTE USA, Inc.
55 Madison Avenue, Suite 160
Morristown, NJ 07960
USA
Phone: +001-781-752-8003
Email: vumip1@gmail.com, bhumip.khasnabish@zteusa.com

Bin Liu
ZTE Corporation
15F, ZTE Plaza, No. 19 East Huayuan Road, Haidian District
Beijing 100191
P.R. China
Phone: +86-10-59932098
Email: richard.bohan.liu@gmail.com, liu.bin21@zte.com.cn

Baohua Lei
China Telecom
118, St. Xizhimennei, Office 709, Xicheng District
Beijing
P.R. China
Phone: +86-10-58552124
Email: leibh@ctbri.com.cn

Feng Wang
China Telecom
118, St. Xizhimennei, Office 709, Xicheng District
Beijing
P.R. China
Phone: +86-10-58552866
Email: wangfeng@ctbri.com.cn