Internet DRAFT - draft-gu-statemigration-arch

draft-gu-statemigration-arch






Network Working Group                                              Y. Gu
Internet-Draft                                                    Huawei
Intended status: Standards Track                                M. Shore
Expires: August 23, 2013                            No Mountain Software
                                                            S. Sivakumar
                                                           Cisco Systems
                                                                D. Zhang
                                                     Huawei Technologies
                                                       February 19, 2013


             An Architecture for Middlebox State Migration
                    draft-gu-statemigration-arch-01

Abstract

   This draft use several motivation use cases to indicate the
   importance of the state migration work.  An architecture and
   components of a solution is given conceiving the use cases, together
   with the interfaces and data models that are required in the
   architecture.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 23, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents



Gu, et al.               Expires August 23, 2013                [Page 1]

Internet-Draft        State Migration Architecture         February 2013


   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminology and concepts . . . . . . . . . . . . . . . . . . .  3
   3.  VM migration in a virtual data center network  . . . . . . . .  5
     3.1.  Scenario 1 . . . . . . . . . . . . . . . . . . . . . . . .  5
       3.1.1.  Description  . . . . . . . . . . . . . . . . . . . . .  5
     3.2.  Scenario 2 . . . . . . . . . . . . . . . . . . . . . . . .  6
       3.2.1.  Description  . . . . . . . . . . . . . . . . . . . . .  6
   4.  Architecture and Components  . . . . . . . . . . . . . . . . .  8
   5.  Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . .  9
   6.  Message Flow . . . . . . . . . . . . . . . . . . . . . . . . . 10
   7.  Security Considerations  . . . . . . . . . . . . . . . . . . . 11
   8.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 12
     8.1.  Normative Reference  . . . . . . . . . . . . . . . . . . . 12
     8.2.  Informative Reference  . . . . . . . . . . . . . . . . . . 12
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 12



























Gu, et al.               Expires August 23, 2013                [Page 2]

Internet-Draft        State Migration Architecture         February 2013


1.  Introduction

   It has been introduced in [framework] that an end-to-end network flow
   typically traverses one or more stateful "middlebox," such as
   firewalls, NATs, and TCP or traffic optimizers.  In order to process
   the packets in the flow correctly, the middleboxes need to establish
   and maintain associated state for the flow.  In some cases (for
   example, VM migration or load balancing), the path of a flow though
   the network may change, and packets of that flow may be processed by
   new middleboxes.  The new middlebox may not be able to process the
   packets properly if they do not have the associated state information
   as instantiated in the original middlebox.

   This draft first introduces several network scenarios demonstrating
   state migration problems and then describes interfaces, components,
   and messages to address the issues raised in those scenarios.


2.  Terminology and concepts

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The document uses terms defined in [framework].

   VN: A virtual network, or "VN," is a network that consists of
   virtualized links, based on virtualized interfaces.  It may exist in
   network layer 2 (L2) or network layer 3 (L3).

   VNI: Virtual Network Instance.  This is one instance of a virtual
   overlay network.  We assume that two VNs are isolated from one
   another and may use overlapping addresses.

   NVE: Network Virtualization Edge.  It is a network entity that sits
   on the edge of an NVO3 network.  It implements network virtualization
   functions that allow for L2 and/or L3 tenant separation and for
   hiding tenant addressing information (MAC and IP addresses).  An NVE
   could be implemented as part of a virtual switch within a hypervisor,
   a physical switch or router, a Network Service Appliance or even be
   embedded within an End Station.

   Virtual Network Context or VN Context: Field that is part of the
   overlay encapsulation header which allows the encapsulated frame to
   be delivered to the appropriate virtual network endpoint by the
   egress NVE.  The egress NVE uses this field to determine the
   appropriate virtual network context in which to process the packet.
   This field MAY be an explicit, unique (to the administrative domain)



Gu, et al.               Expires August 23, 2013                [Page 3]

Internet-Draft        State Migration Architecture         February 2013


   virtual network identifier (VNID) or MAY express the necessary
   context information in other ways (e.g. a locally significant
   identifier).

   VNID: Virtual Network Identifier.  In the case where the VN context
   has global significance, this is the ID value that is carried in each
   data packet in the overlay encapsulation that identifies the Virtual
   Network the packet belongs to.

   Underlay or Underlying Network: This is the network that provides the
   connectivity between NVEs.  The Underlying Network can be completely
   unaware of the overlay packets.  Addresses within the Underlying
   Network are also referred to as "outer addresses" because they exist
   in the outer encapsulation.  The Underlying Network can use a
   completely different protocol (and address family) from that of the
   overlay.

   Data Center (DC): A physical complex housing physical servers,
   network switches and routers, Network Service Appliances and
   networked storage.  The purpose of a Data Center is to provide
   application and/or compute and/or storage services.  One such service
   is virtualized data center services, also known as Infrastructure as
   a Service.

   VM: Virtual Machine.  Several Virtual Machines can share the
   resources of a single physical computer server using the services of
   a Hypervisor (see below definition).

   Hypervisor: Server virtualization software running on a physical
   compute server that hosts Virtual Machines.  The hypervisor provides
   shared compute/memory/storage and network connectivity to the VMs
   that it hosts.  Hypervisors often embed a Virtual Switch (see below).

   Virtual Switch: A function within a Hypervisor (typically implemented
   in software) that provides similar services to a physical Ethernet
   switch.  It switches Ethernet frames between VMs' virtual NICs within
   the same physical server, or between a VM and a physical NIC card
   connecting the server to a physical Ethernet switch.  It also
   enforces network isolation between VMs that should not communicate
   with each other.

   Tenant: A customer who consumes virtualized data center services
   offered by a cloud service provider.  A single tenant may consume one
   or more Virtual Data Centers hosted by the same cloud service
   provider.

   Tenant End System: It defines an end system of a particular tenant,
   which can be, for example, a virtual machine (VM), a non-virtualized



Gu, et al.               Expires August 23, 2013                [Page 4]

Internet-Draft        State Migration Architecture         February 2013


   server, or a physical appliance.

   Virtual Access Points (VAPs): Tenant End Systems are connected to the
   Tenant Instance through Virtual Access Points (VAPs).  The VAPs can
   be in reality physical ports on a ToR or virtual ports identified
   through logical interface identifiers (VLANs, internal VSwitch
   Interface ID leading to a VM).

   VN Name: A globally unique name for a VN.  The VN Name is not carried
   in data packets originating from End Stations, but must be mapped
   into an appropriate VN-ID for a particular encapsulating technology.
   Using VN Names rather than VN-IDs to identify VNs in configuration
   files and control protocols increases the portability of a VDC and
   its associated VNs when moving among different administrative domains
   (e.g. switching to a different cloud service provider).

   VSI: Virtual Station Interface.  Typically, a VSI is a virtual NIC
   connected directly with a VM.  [bridging]


3.  VM migration in a virtual data center network

   In this section we look at middlebox flow-associated state migration
   during VM migration, using two different but related network
   scenarios for demonstration purposes.

3.1.  Scenario 1

3.1.1.  Description

   As illustrated in Figure 1, there is a virtual data center network
   supporting multiple tenants.  Three servers, Server1, Server2 and
   Server3 are connected to the network with three virtual network edge
   devices, VNE1, NVE2, and NVE3 respectively.  The packets transported
   between NVE1 and NVE3 are processed by a stateful firewall FW1 while
   the packets transported between NVE2 and NVE3 are processed by a
   stateful firewall FW2.  The security policies deployed on FW1 and FW2
   are identical.  However, they are deployed far away from each other,
   and there is no state synchronization between them at run time.  This
   condition is typical when NVE1 and NVE2 are located in different data
   centers which have firewalls deployed at the core layer or the
   convergence layer.

   The virtual machines, VM1 and VM3, belongs to the virtual network
   instance VNI1, and are located within Server 1 and Server 3
   respectively.  VM1 and VM3 communicate using TCP.

   Pre-conditions:



Gu, et al.               Expires August 23, 2013                [Page 5]

Internet-Draft        State Migration Architecture         February 2013


   VM1 moves from Server1 to Server2.

   Post-conditions:

   TCP flows between servers VM1 and VM3 must not be disrupted.

   Requirements:

   The state information associated with VM1 on the hypervisor running
   on Server1 MUST be forwarded to the hypervisor on Server2.

   NVE2 MUST distribute the mapping information regarding flows
   associated with VM1 to other NVEs.

   The state associated with the TCP sessions between VM1 and VM3 MUST
   be conveyed to FW2 so that it can process traffic between VM1 and VM3
   using the same rules instantiated on FW1.  Otherwise, FW2 will treat
   packets based on local policy, which may include discarding the
   packets.

3.2.  Scenario 2

3.2.1.  Description

   The network topology in this case is identical to scenario 1.

   Pre-conditions:

   The path between NVE1 and NVE3, which traverses FW1, is broken.

   Post-conditions:

   The communication between VM1 and VM3 are transported through NVE2.

   The communication between VM1 and VM3 using TCP transport must not be
   disrupted.

   Requirements:

   Flow-coupled state associated with TCP sessions between VM1 and VM3
   MUST be conveyed to FW2



                     +-------------------------+
                     |   +-------+             |
                     |   |  VM3  |  Server 3   |
                     |   +---+---+             |



Gu, et al.               Expires August 23, 2013                [Page 6]

Internet-Draft        State Migration Architecture         February 2013


                     |       |                 |
                     | +-----+---------------+ |
                     | |    Hypervisor       | |
                     | +---------------------+ |
                     +-------------+-----------+
                                   |
                          +--------+---------+
                    +-----+       NVE3       +-----------+
                    |     +------------------+           |
                    |                                    |
                    |                                    |
                    |                                    |
                    |                                    |
                    |                                    |
                    |        -----------------           |
                    |/  -----                 ----- \\   |
                ////|                                 \\\|\
             ///    |                                    | \\\
          ///       |                                    |    \\\
         |   +------|-------+                   +--------+------+|
       ||    |   FW1        |                   |               | ||
       |     |              |                   |        FW2    |  |
      |      +-----+--------+                   +----+----------+   |
      |            |                                 |              |
       |           |                                 |             |
       ||          |         Network                 |            ||
         |         |                                 |           |
          \\\      |                                 |        ///
             \\\   |                                 |     ///
                \\\|\                                |/////
                   | \\\-----                 -----//|
                   |         -----------------       |
                   |                                 |
   +---------------+--+                           +--+---------------+
   |       NVE1       +---------------------------+      NVE2        |
   +--------+---------+                           +--------+---------+
            |                                              |
+-----------+-------------+                    +-----------+-------------+
| +---------------------+ |                    | +---------------------+ |
| |    Hypervisor       | |                    | |    Hypervisor       | |
| +----+----------------+ |                    | +----+----------------+ |
|      |       Server 1   |                    |      |        Server 2  |
|  +---+---+              |   VM Migration     |  +---+---+              |
|  |  VM1  |--------------+--------------------+->|  VM1  |              |
|  +-------+              |                    |  +-------+              |
+-------------------------+                    +-------------------------+





Gu, et al.               Expires August 23, 2013                [Page 7]

Internet-Draft        State Migration Architecture         February 2013


   Figure 1.  A simple example of a virtual data center


4.  Architecture and Components

   This section defines the architecture of the proposed solution which
   can fulfill the state migration requirements described in the
   scenarios in Section 3.

   State Host: The entity where the flow-coupled state information is
   generated and restored.  A state Host can be a physical Firewall, a
   Virtual Firewall, a NAT device, a IPS/IDS device, etc.

   State Orchestration Agent (SOA): an application runs on a State Host
   and collaborates with the State Orchestration Master to perform state
   migration.

   State Orchestration Master (SOM): A centralized entity which
   coordinates the state migration between State Hosts.  In order to
   accomplish its work, a SOM

   o  needs to receive notification of a VM migration,

   o  identify the destination and source State Hosts, and then

   o  trigger the state migration under the assistance of the associated
      SOAs.

   The origin of these notification is not designated in this draft.

   Topology Discovery Entity (TDE): The entity which is used to collect
   the topology information of network, including the location of State
   Hosts and VMs.  In this solution, it is the job of a TDE to monitor
   the change of the network topology caused by e.g., VM Migration and
   then find out the destination State Host and source State Host
   according to its knowledge of network topology.  How a TDE creates
   the topology information of a network and how it notices any change
   of the topology is out of the scope of this draft.  There are many
   ways to achieve it in current network deployment.












Gu, et al.               Expires August 23, 2013                [Page 8]

Internet-Draft        State Migration Architecture         February 2013


                          -----------
                          |   SOM   |
                          -----------               ========
                  ************/\************        | TDE  |
                 * __________/  \__________ *       =====^==
                * / Could be indirect link \ *           |
                 ****************************          ~~~~~~~~~~
                /                            \         |Topology|
        ---------------              --------------    ~~~~~~~~~~
        | -------     |              | -------    |
        | | SOA |     |              |? SOA |    |
        | -------     |              | -------    |
        |  State Host |              |  State Host|
        ---------------              --------------
               |                             |
         *****************************************
        *                                         *
       *                                           *
      *     Could be any kind of network topology   *
       *                                           *
        *                                         *
         *****************************************
               |                             |
               |                             |
           --------                      --------
           | VM1  |                      | VM1' |
           --------                      --------

                  Figure 1: State Migration Archittecture

   Note that the connections between a State Host and a VM can vary.
   The use cases introduced above can fit into this architecture.  This
   architecture maps well to a NVO3 Data Center network or a traditional
   Data Center network.  In a virtual firewall case, the connection
   betwen a VM and a State Host is usually within a Hypervisor, and the
   TDE can be the Hypervisor itself, which obviously has the view of the
   topology of virtualized network attached to it.


5.  Interfaces

   Based on the architecture, there are several interfaces to define.
   The interfaces and the functionality of each interface are defined
   below.

   o  IF(M,T):





Gu, et al.               Expires August 23, 2013                [Page 9]

Internet-Draft        State Migration Architecture         February 2013


      *  F1: State Migration Notification.  If a TDE detects any network
         topology changes caused by a VM Migration, it notifies the SOM
         about the changes.  The messages for state migration
         notification should includes the identities of the destination
         and source State Hosts, and the identity of the migrated VM.
         The identity of a State Host can be an IP Address, a system
         name or anything that can uniquely reprsent the State Host
         within the network.  Similarly, the identity of a VM can be
         anything that State Host can uniquely identify the VM wthin the
         network (e.g., an IP address or an VNID defined in NVO3
         documents).

   o  IF(M,A) :

      *  F1: SOA authentication and authorization. when a SOA starts
         running on State Host, the SOA first sends an authentication
         and authorization request to a SOM.  In the request, SMA should
         inform the SOM of its identity, its certification, the protocol
         versions it suports and network addresses being used.  If the
         authentication succeeds and policy permits it, the SOA is
         authorized to participate in state migration.

      *  F2: State Host Registration.  Once a SOA is authenticated by
         the SOM, the SOA collects the profile of the State Host on
         which it is running and registers the State Host on the SOM.
         The profile of the State Host includes the identity of the
         State Host, the representation of state on this specific State
         Host, functions allowed on this State Host, etc.

      *  F3: State Migration Indication.  While state migration is
         required, SOM will send a request to the corresponding SOA to
         upload or download the state which needs to be migrated.

        -------               -----------               =======
        | SOA |<---IF(M,A)--->|   SOM   |<---IF(M,T)--->| TDE |
        -------               -----------               =======

                           Figure 2: Interfaces


6.  Message Flow

   Using scenario 1 as an example, the following message flow shows a
   possible message flow.







Gu, et al.               Expires August 23, 2013               [Page 10]

Internet-Draft        State Migration Architecture         February 2013


            SOA-Src              SOM        TDM        SOA-Dst

             |<---------A&A------>|<--Authorization&----->|
                                     Authentication (A&A)
             |--State Host Reg--->|<--State Host Reg------|
   ==================================================================
   VM is
   Migrated~~                     |<-Notify---|
             |<--Upload Request---|
             |---Upload State---->|
                                  |--Get ready to-------->|
                                      Receive State

                                  |<-----Ready and -------|
                                      Download Request

                                  |---Download State----->|

                          Figure 3: Message Flow


7.  Security Considerations

   Any network technology which changes state in network devices may be
   subverted and abused by bad actors.

   Because we are modifying state in particular types of network devices
   - middleboxes such as NATs, firewalls, and load balancers - we are
   concerned with specific types of attacks.

   These attacks include:

   o  Using the middlebox signaling mechanism to open firewall pinholes
      in contravention of local policy

   o  Using the middlebox signaling mechanism to close firewall
      pinholes, creating a denial of service for the network flows
      associated with those pinholes

   o  Using the topology discovery mechanism to map network topology and
      locate devices of interest

   o  Using the signaling protocol to consume middlebox resources,
      creating a denial of service attack against the entire middlebox

   o  Using the signaling mechanism to install or alter NAT table
      mappings to route traffic to an attacker




Gu, et al.               Expires August 23, 2013               [Page 11]

Internet-Draft        State Migration Architecture         February 2013


   o  Eavesdropping on signaling or discovery mechanism traffic to watch
      for changes in network topology (state migrates when a VM
      migrates)

   The primary mechanisms against attacks on a middlebox state migration
   technology are origin authentication and integrity protection.  Every
   message MUST be authenticated as having been originated by the
   ostensible sender - it must be possible to detect forgeries.  Every
   message MUST be authenticated as having been unaltered in transit.
   It may also be desirable to encrypt the traffic to provide
   protections against eavesdropping.

   Additionally, there may be operational protections against using a
   state migration mechanism as an attack vector against data center
   networks.  For example, implementing policies against permitting
   state migration signaling traffic from outside a range of permitted
   addresses or transiting any but a limited list of network filtering
   devices.


8.  References

8.1.  Normative Reference

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", March 1997.

   [bridging]
              "IEEE P802.1Qbg Edge Virtual Bridging".

8.2.  Informative Reference

   [framework]
              Gu, Y., Shore, M., and S. Sivakumar, "A Framework and
              Problem Statement for Flow-associated Middlebox State
              Migration", October 2012.


Authors' Addresses

   Yingjie Gu
   Huawei

   Phone: +86-25-56624760
   Fax:   +86-25-56624702
   Email: guyingjie@huawei.com





Gu, et al.               Expires August 23, 2013               [Page 12]

Internet-Draft        State Migration Architecture         February 2013


   Melinda Shore
   No Mountain Software
   PO Box 16271
   Two Rivers, AK  99716
   US

   Phone: +1 907 322 9522
   Email: melinda.shore@nomountain.net


   Senthil Sivakumar
   Cisco Systems
   7100-8 Kit Creek Road
   Research Triangle Park, NC
   US

   Email: ssenthil@cisco.com


   Dacheng Zhang
   Huawei Technologies

   Email: zhangdacheng@huawei.com




























Gu, et al.               Expires August 23, 2013               [Page 13]