Network Working Group                                            N. Zong
Internet-Draft                                                 L. Dunbar
Intended status: Informational                       Huawei Technologies
Expires: August 16, 2014                                        M. Shore
                                                    No Mountain Software
                                                                D. Lopez
                                                              Telefonica
                                                       February 12, 2014


       Virtualized Network Function (VNF) Pool Problem Statement
                draft-zong-vnfpool-problem-statement-03

Abstract

   Network functions are traditionally implemented on specialized
   hardware and less often on commodity servers, but there is a clear
   trend to implement a number of network functions, such as firewalls
   or load balancers, as software on virtualized computing platforms.
   These virtualized functions are called Virtualized Network Functions
   (VNFs).  We call a group of such VNFs a VNF set, which can be used to
   build network services.  The use of a VNF set to build network
   services introduces additional reliability challenges, such as
   additional points of failure and the need to coordinate the various
   VNFs in a VNF set.  This document discusses the problems related to
   the reliability of a VNF set.  It also briefly introduces a VNF
   pooling architecture and related IETF protocols.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 16, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Background
   2.  Terminology
   3.  Problems
     3.1.  From Specialized Hardware to Virtualized Network Function
     3.2.  The Concept of VNF Set
     3.3.  Problems
   4.  VNF Pooling Architecture
   5.  Related Works
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Background

   Network functions such as firewalls, load balancers, and WAN
   optimizers are conventionally deployed as specialized hardware
   servers in both network operators' networks and data center networks,
   as the building blocks of network services.
   A Virtualized Network Function (VNF) may also provide such a network
   function through an implementation as software instance(s) running on
   commodity server(s) via a virtualization layer (i.e., hypervisor).
   VNFs could potentially offer benefits such as elastic service
   offering and reduced operational and equipment costs [NFV-WP].  There
   is thus a trend to move network functions from specialized hardware
   servers to commodity servers, based on virtualized computing
   platforms, in order to build network services by using a group of
   VNFs.  For example, in Service Function Chaining (SFC), a network
   service can be built using a group of sequentially connected VNFs
   deployed at different points in the network [SFC].  We call a group
   of VNFs a VNF set, which can be used not only as an SFC, but also
   simply as multiple VNFs.

   The use of a VNF set to build network services introduces additional
   reliability challenges, such as additional points of failure and the
   need to coordinate the various VNFs in a VNF set.  A single VNF
   typically does not have built-in reliability mechanisms on its host
   (i.e., a commodity server).  Instead, there are more risk factors,
   such as software failure, server overload, and instance migration,
   that may lead to VNF instance failure.  Existing pooling and other
   redundancy mechanisms may be applied to address some reliability
   requirements of a single VNF.  However, the complexity of
   coordinating a growing number of VNFs, including stateful and
   stateless functions, and of extending the redundancy within a VNF set
   (i.e., multiple pools for multiple VNFs) requires further solution
   development.  For example, when a live VNF pool member goes out of
   service, how do adjacent entities learn which pool member will
   replace it?  How do VNFs learn the states of adjacent VNFs before
   those adjacent VNFs fail?
   How are the service states of an instance held and accessed for
   efficient synchronization with backup instances and other members of
   its pool?

   This document discusses the problems related to the reliability of a
   VNF set.  It also briefly introduces a VNF pooling architecture and
   related IETF protocols.

2.  Terminology

   Reliability: capability of a functional entity to consistently
   provide its function under various dynamic and even unexpected
   conditions such as fault, overload, etc.

   Virtualized Network Function (VNF): a VNF provides the same
   functional behavior and interfaces as the equivalent network
   function, but is deployed as software instance(s) built on top of a
   virtualization layer [NFV-TERM].

   VNF Pool: a group of VNF instances providing the same network
   function.

   VNF Pool Element: a VNF instance inside a VNF pool.

   VNF Pool User: an entity that requests network function(s) provided
   by the VNF pool(s).

   VNF Pool Manager: an entity that manages VNF pool elements, and
   interacts with the VNF pool user to provide the network function.

   VNF Set: a group of VNFs that are distributed in multiple VNF pools.

3.  Problems

3.1.  From Specialized Hardware to Virtualized Network Function

   Network functions are traditionally implemented on specialized
   hardware.  There is a trend to implement a number of network
   functions as software instances on commodity servers, via virtualized
   computing platforms.  These virtualized functions are called
   Virtualized Network Functions (VNFs).  For example, in Figure 1,
   virtual firewalls (vFWs) can be deployed as modular software
   instances on commodity servers, which could be located in Data Center
   (DC) networks, network operators' networks, or on the end user
   premises.  Compared with a traditional FW deployed as a "standalone
   box" combining specialized hardware and software, a vFW has potential
   advantages such as agility and scalability [NFV-WP].

         FW                  vFW            vFW            vFW
   +-------------+      +-----------+  +-----------+  +-----------+
   | Specialized |      |FW Software|  |FW Software|  |FW Software| ...
   |  Hardware   |----\ +-----------+  +-----------+  +-----------+
   |      +      |----/ +------------------------------------------+
   |  Software   |      |         Virtualization Platform          |
   +-------------+      +------------------------------------------+
                        +-----------+  +-----------+
                        | Commodity |  | Commodity |
                        |  Server   |  |  Server   |  ... ...
                        +-----------+  +-----------+

                      Figure 1: Example of vFW.

3.2.  The Concept of VNF Set

   We call a group of VNFs a VNF set.  A VNF set can include a single
   type of VNF or multiple types of VNF.  The following examples are all
   valid VNF sets.

   1.  n vFW instances: {vFW#1,vFW#2,...,vFW#n}.

   2.  m vFW instances and k virtual load balancer (vLB) instances:
       {vFW#1,...,vFW#m,vLB#1,...,vLB#k}.

   To be more generic, we denote by VNF-A#x the xth instance of VNF type
   A, by VNF-B#y the yth instance of VNF type B, and so on.

   A VNF set can be used for Service Function Chaining (SFC) [SFC],
   where the instances of various functions are sequentially connected
   to build a network service.  A simple example is shown in Figure 2.

                        Network Service
   +----------+           +----------+           +----------+
   | VNF-A#x  | data conn | VNF-B#y  | data conn | VNF-C#z  |
   |          |-----------|          |-----------|          |
   +----------+           +----------+           +----------+

                 Figure 2: A VNF set used as an SFC.

   Alternatively, a VNF set can also be used simply as multiple VNFs,
   where these VNFs provide network services in parallel.  An example is
   shown in Figure 3.

   +----------+        +----------+        +----------+
   | VNF-A#x  |        | VNF-B#y  |        | VNF-C#z  |
   +----------+        +----------+        +----------+
           \                |                 /
   data conn \              |data           /data conn
               \            |conn         /
                 \          |           /
                    +---------------+
                    |    Client     |
                    +---------------+

              Figure 3: A VNF set used as multiple VNFs.

   More detailed use case studies of VNF sets are documented in separate
   drafts [VNFPOOL-UC1] [VNFPOOL-UC2].

3.3.  Problems

   The use of a VNF set to build network services introduces additional
   reliability challenges, as listed below.

   1.  More potential causes of VNF instance failure.  A VNF typically
       does not have built-in reliability mechanisms on its host (i.e.,
       a commodity server).  Instead, there are more risk factors that
       may lead to VNF instance failure or transition conditions.

       1)  Hardware failure or status change such as server
           over-utilization.

       2)  Software failure at various levels, including hypervisor,
           Virtual Machine (VM), and VNF instance.

       3)  Instance scaling in/out/up/down, or migration caused by
           instance performance downgrade, server consolidation, or
           other service requirement changes.

   2.  Backup advertisement.  Although the existing pooling and other
       redundancy mechanisms may be applied to address some reliability
       requirements of a single VNF, multiple pools for multiple VNFs
       may require extended redundancy mechanisms.  For example, before
       a live VNF instance fails, one or more backup instances in the
       same VNF pool need to be selected and advertised to adjacent
       entities such as another VNF pool.  Who is responsible for
       selecting and advertising such backup instance(s) within the VNF
       set, and how is this done?

   3.  State notification.  A VNF may need to learn the states of
       adjacent VNFs before those adjacent VNFs fail.  Some critical
       states include performance downgrade due to resource contention
       between instances, instance migration, scaling in/out/up/down,
       and so on.  Who is responsible for notifying such critical states
       within the VNF set, and how is this done?

   4.  Service state synchronization.  Service states related to the
       specific function performed by a VNF, e.g., NAT translation
       table, TCP connection states, should be synchronized between a
       live VNF instance and its backup instance(s) for stateful
       failover.
       Who is responsible for collecting, holding, and accessing such
       service states, and how can this be done to achieve efficient
       synchronization?  A VNF should provide a negotiated level of
       state sharing with the necessary performance to fulfill the
       service requirements, e.g., state synchronization method, format
       of state data, location and mechanism to access state data.

   5.  Complication of VNF placement.  Multiple policies influence the
       appropriate placement of VNFs.  For example, a live VNF instance
       and its backup instance(s) should not be placed on a single
       physical server, or in locations with shared risks in the
       network.  On the other hand, it would be desirable to place the
       live and backup instances in topologically close locations.  A
       VNF set may need to collect information from the underlying
       network, e.g., by interfacing with Application-Layer Traffic
       Optimization (ALTO) [ALTO] or the Interface to the Routing System
       (I2RS) [I2RS].

   6.  Reliable transport.  The transport network should provide
       alternative paths for accessing a VNF instance, as well as for
       the aforementioned control traffic, to prevent a single point of
       failure in the network.  Transport redundancy mechanisms such as
       Multipath TCP (MPTCP) [MPTCP] and Stream Control Transmission
       Protocol (SCTP) [RFC3286] need to be identified and analyzed for
       reuse.

   Ideally, the reliability of a VNF set means that the network service
   provided by such a VNF set will continue throughout an interruption,
   and the outage of one or more VNFs will not be visible to the users
   of the VNF set.  This work initially focuses on several mechanisms
   supporting the reliable VNF set, mainly redundancy within the VNF set
   and stateful failover.  Additional mechanisms for a reliable VNF set
   might be included after future gap analysis between identified
   requirements and existing IETF technologies.
   Detailed analysis of VNF reliability can be found in [NFV-REL].  Also
   note that this work currently does not intend to resolve the service
   availability issue, although the reliability of a VNF set will
   benefit service availability.

4.  VNF Pooling Architecture

   There are a number of existing technologies for providing reliable
   and highly available functions, such as Reliable Server Pooling
   (RSerPool) [RFC5351] and the Virtual Router Redundancy Protocol
   (VRRP) [RFC5798], amongst many others.  Both technologies expose the
   service through an abstract object (e.g., pool handle in RSerPool,
   virtual router ID in VRRP) representing a group of functional
   instances.  The dynamic mapping of this abstract object to the actual
   serving instance, or the selection of the serving instance, is
   managed internally in the group to cover the failover procedure.  The
   advantage is to provide reliable and highly available functions in a
   transparent manner for both end-hosts and other service components.

   Based on this idea, we describe a preliminary VNF pooling
   architecture to address the aforementioned problems for a reliable
   VNF set.  A simple pooling diagram is depicted below.

                           +-----------------+
                           |    Pool User    |
                           +-----------------+
                               ^        ^
                               |        |
                   +-----------+        +-----------+
                   |                                |
                   v                                v
            +--------------+                +--------------+
            | Pool Manager |<-------------->| Pool Manager |
            +--------------+                +--------------+
                   ^                                ^
                   |                                |
                   v                                v
   +------------------------------+   +------------------------------+
   |+----------+     +----------+ |   | +----------+     +----------+|
   || VNF-A#1  |     | VNF-A#n  | |   | | VNF-B#1  |     | VNF-B#m  ||
   ||          | ... |          |<+---+>|          | ... |          ||
   |+----------+     +----------+ |   | +----------+     +----------+|
   |          VNF-A Pool          |   |          VNF-B Pool          |
   +------------------------------+   +------------------------------+

                 Figure 4: VNF Pooling Architecture.

   In the VNF pooling architecture, a VNF set is a group of VNFs
   distributed in multiple VNF pools.
   Each VNF pool contains a group of VNF instances (also called VNF pool
   elements) providing the same network function.  Each VNF pool also
   has a VNF pool manager that manages the pool elements, and interacts
   with the VNF pool user to provide the network function.  A VNF pool
   user can be either an application end-host or a service component
   (e.g., orchestrator in DC service) requesting the network functions.

   The VNF pooling architecture will address the problems for a reliable
   VNF set from the following perspectives.

   1.  Each VNF pool manager communicates with the VNF pool elements
       under its responsibility to transmit messages for backup instance
       selection and service state synchronization.

   2.  Different VNF pool managers from different VNF pools communicate
       with each other to transmit messages for backup instance
       advertisement and instance state notification within the VNF set.

   3.  Different VNF pool elements from different VNF pools may also
       communicate with each other to transmit messages for backup
       instance advertisement and instance state notification.

   4.  A VNF pool manager may communicate with the VNF pool user to
       obtain the policy, in order to decide on the appropriate
       placement of the VNF instances.  If needed, a VNF pool manager
       may also interface with ALTO and/or I2RS to collect information
       from the underlying network.  It is also possible that the VNF
       pool user receives information from the VNF pool manager and
       updates its policy based on the VNF pool state.

   The detailed solution will be documented in a separate draft
   [VNFPOOL-ARCH].

5.  Related Works

   1.  Reliable Server Pooling (RSerPool).  RSerPool supports high
       availability and scalability of applications through the use of
       pools of servers [RFC5351].  The main functions of RSerPool
       involve server pool management, as well as receiving requests
       from a client to bind to a desired server.
       The applicability of RSerPool to the reliable VNF set, and the
       gaps between them, are described in a separate draft
       [VNFPOOL-RSP].

   2.  Virtual Router Redundancy Protocol (VRRP).  VRRP specifies an
       election protocol that dynamically assigns responsibility for a
       virtual router to one of the VRRP routers on a LAN, called the
       Master [RFC5798].  The election process provides dynamic failover
       in the forwarding responsibility should the Master become
       unavailable.  The advantage of VRRP is a higher-availability
       default path without requiring configuration of dynamic routing
       or router discovery protocols on every end-host.

   3.  Service Function Chaining (SFC).  A service chain defines an
       ordered set of service functions that must be applied to packets
       [SFC].  A VNF set can be used as an SFC, where a group of VNFs
       are sequentially connected to implement a network service.  SFC
       and the reliable VNF set are independent of, but complementary
       to, each other in the following aspects.

       1)  SFC targets steering packets among VNFs.  The reliable VNF
           set focuses on redundancy for VNFs, e.g., selecting standby
           instances and handling instance transition/failure cases,
           without regard to how the data path is constructed.

       2)  A VNF pool manager could interact with the SFC control entity
           to either advertise the status of redundant VNF instances or
           receive the redundancy requirements from the SFC control
           entity, and so on.

       3)  A VNF set is not only used in the case of "chained VNFs", but
           is also applicable to other cases where the VNFs are not
           necessarily sequentially connected.

6.  Security Considerations

   Any technology which allows the insertion, deletion, reordering, or
   manipulation of network functions has the potential to be subverted
   by an attacker, with serious consequences.  Distributed VNFs
   introduce an additional attack vector, in which bad actors join
   several VNFs of a service.
   Replay attacks have the potential to cause denial of service, or to
   reorder, add, or remove VNFs.  VNF reliability technologies must
   provide cryptographic protections against spoofing and insertion
   attacks as well as replay attacks, in the form of client
   authentication, origin authentication on VNF reliability management
   (control plane) traffic, and replay protections.  There may be
   circumstances under which an attacker masquerading as a VNF manager
   can introduce data leakage or similar attacks, and consequently
   server authentication would be required as well.

7.  IANA Considerations

   This document has no actions for IANA.

8.  Acknowledgements

   The authors would like to thank Chidung Lac from Orange, Daniel King
   from Lancaster University, Lingli Deng and Zhen Cao from China
   Mobile, Richard Yang from Yale University, Hidetoshi Yokota from
   KDDI, Mukhtiar Shaikh from Brocade, and Susan Hares for their
   valuable comments.

9.  References

9.1.  Normative References

   TBD.

9.2.  Informative References

   [NFV-WP]   NFV Whitepaper: "Network Function Virtualization", issue
              1, 2012, http://portal.etsi.org/NFV/NFV_White_Paper.pdf.

   [SFC]      "Service Function Chaining (SFC)", .

   [NFV-TERM] ETSI GS NFV 003: "Terminology for Main Conceptional
              Entities in NFV", Version 0.0.4, 2013.

   [VNFPOOL-UC1]
              L. Xia, Q. Wu and D. King, "Use Cases and Requirements for
              Virtual Service Node Pool Management",
              draft-xia-vsnpool-management-use-case-01, August 2013.

   [VNFPOOL-UC2]
              S. Hares and K. Subramaniam, "Use Cases for Resource Pools
              with Virtual Network Functions (VNFs)",
              draft-hares-vnf-pool-use-case-00, January 2014.

   [ALTO]     "Application-Layer Traffic Optimization (alto)", .

   [I2RS]     "Interface to the Routing System (i2rs)", .

   [MPTCP]    "Multipath TCP (mptcp)", .

   [RFC3286]  L. Ong and J. Yoakum, "An Introduction to the Stream
              Control Transmission Protocol (SCTP)", RFC 3286, May 2002.

   [NFV-REL]  ETSI GS NFV REL 001: "Network Function Virtualization;
              Resiliency Requirements", Version 0.0.1, 2013.

   [RFC5351]  P. Lei, L. Ong, M. Tuexen and T. Dreibholz, "An Overview
              of Reliable Server Pooling Protocols", RFC 5351, September
              2008.

   [RFC5798]  S. Nadas, "Virtual Router Redundancy Protocol (VRRP)
              Version 3 for IPv4 and IPv6", RFC 5798, March 2010.

   [VNFPOOL-ARCH]
              TBD.

   [VNFPOOL-RSP]
              T. Dreibholz, M. Tuexen, M. Shore and N. Zong, "The
              Applicability of Reliable Server Pooling (RSerPool) for
              Virtual Network Function Resource Pooling (VNFPOOL)",
              draft-dreibholz-vnfpool-rserpool-applic-00, October 2013.

Authors' Addresses

   Ning Zong
   Huawei Technologies

   Email: zongning@huawei.com


   Linda Dunbar
   Huawei Technologies

   Email: linda.dunbar@huawei.com


   Melinda Shore
   No Mountain Software

   Email: melinda.shore@nomountain.net


   Diego Lopez
   Telefonica

   Email: diego@tid.es