Network Working Group                                            N. Zong
Internet-Draft                                                 L. Dunbar
Intended status: Informational                       Huawei Technologies
Expires: November 6, 2014                                       M. Shore
                                                    No Mountain Software
                                                                D. Lopez
                                                              Telefonica
                                                          G. Karagiannis
                                                    University of Twente
                                                             May 5, 2014


       Virtualized Network Function (VNF) Pool Problem Statement
                draft-zong-vnfpool-problem-statement-05

Abstract

   Network functions are traditionally implemented on specialized
   hardware rather than on general purpose servers, but there is a
   clear trend to implement a number of network functions, such as
   firewalls or load balancers, as software on virtualized computing
   platforms.  These virtualized functions are called Virtualized
   Network Functions (VNFs), which can be used to build network
   services.  The use of VNFs to build network services introduces
   additional reliability challenges, such as additional points of
   failure and the need to coordinate various VNFs.

   This document introduces the general idea of a VNF Pool to support
   reliable function provision by the VNFs.  We then highlight the
   reliability challenges and issues that arise when using VNFs to
   build services.  Related IETF work is also briefly described.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 6, 2014.

Zong, et al.            Expires November 6, 2014                [Page 1]
Internet-Draft         VNF Pool Problem Statement               May 2014

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Background
     3.1.  From Specialized Hardware to Virtualized Network Function
     3.2.  The Concept of VNF Set
   4.  VNF Pool
   5.  Challenges and Open Issues
     5.1.  Risk factors for unreliable instance
     5.2.  Redundancy model inside VNF
     5.3.  State synchronization inside VNF
     5.4.  Interaction between VNF and Service Control Entity
     5.5.  Reliable transport
     5.6.  Scope Considerations
   6.  Related Works
     6.1.  Reliable Server Pool (RSerPool)
     6.2.  Virtual Router Redundancy Protocol (VRRP)
     6.3.  Service Function Chaining (SFC)
   7.  Security Considerations
   8.  IANA Considerations
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   Network functions such as firewalls, load balancers, and WAN
   optimizers are conventionally deployed as specialized hardware
   servers in both network operators' networks and data center
   networks, as the building blocks of network services.

   A Virtualized Network Function (VNF) provides such a network
   function through its implementation as software instances running on
   general purpose servers via a virtualization layer (i.e., a
   hypervisor).  VNFs potentially offer benefits such as elastic
   service offering and reduced operational and equipment costs
   [NFV-WP].

   There is a trend to move network functions from specialized hardware
   servers to general purpose servers based on virtualized computing
   platforms, in order to build network services by using VNFs.  For
   example, in Service Function Chaining (SFC), a network service can
   be built using a set of sequentially connected VNF instances
   deployed at different points in the network [SFC].

   We call a general set of VNF instances a VNF set.  A VNF set can
   include one or more VNFs (e.g., virtual firewall, virtual load
   balancer, etc.), and each VNF may have a number of instances
   providing the same function.  A VNF set can be used not only as part
   of an SFC, but also merely as a set of VNFs without any specific
   topological constraint.

   In order to provide a reliable function, a VNF uses a VNF Pool to
   contain a number of VNF instances providing the same function.
   In this sense, a VNF set can be grouped into multiple VNF Pools,
   where each pool corresponds to a specific VNF, so that different
   pools provide different functions.  A VNF also has a Pool Manager
   that manages the VNF Pool and interacts with the Service Control
   Entity.  A Service Control Entity is an entity that combines and
   orchestrates a set of network functions, i.e., VNFs, to build
   network services.  The major benefit of using a VNF Pool is that
   reliability mechanisms such as redundancy management are implemented
   by the VNF Pool inside the VNF and are thus transparent to the
   Service Control Entity.  A VNF Pool-enabled VNF still acts as a
   normal VNF when orchestrated by the Service Control Entity.

   Nevertheless, the use of VNFs can pose additional challenges to the
   reliability of the provided services.  A VNF instance typically does
   not have built-in reliability mechanisms on its host (i.e., a
   general purpose server).  Instead, there are more risk factors, such
   as software failure, hardware failure, and instance migration, that
   may make a VNF instance unreliable.

   We are specifically concerned with the reliability of an individual
   VNF based on the VNF Pool managed inside the VNF.  For example, how
   should the redundancy model, e.g., active/standby, be managed for a
   VNF instance in a VNF Pool?  How are the service states of a VNF
   instance held and accessed for efficient synchronization with backup
   instances in a VNF Pool?  We also consider the information exchanged
   between the VNF and the Service Control Entity.  For example, how
   can a VNF Pool be addressed by the Service Control Entity?  When a
   live VNF instance goes out of service, how does the Service Control
   Entity learn which VNF instance will replace it, and learn the
   characteristics of the new instance?
   Note that we do not address the reliability-related control or
   routing between adjacent VNFs in the whole VNF set, as such
   coordination could be done by the Service Control Entity.  Also note
   that we only focus on reliability mechanisms based on VNF Pool.
   Other management aspects of VNFs, such as scaling and load
   balancing, are out of scope, although they are probably
   complementary to reliability in order to provide network services.

   This document introduces the general idea of a VNF Pool to support
   reliable function provision by the VNFs.  We then highlight the
   reliability challenges and issues that arise when using VNFs to
   build services.  Related IETF work is also briefly described.

2.  Terminology

   Reliability: capability of a functional entity to consistently
      provide its function under various dynamic and even unexpected
      conditions such as fault, overload, etc.

   Service Control Entity: an entity of the service provider that
      decides how to combine and orchestrate the network functions to
      build network services.  Examples of a Service Control Entity are
      an orchestrator of DC services, an SFC control plane, etc.

   Virtualized Network Function (VNF): a VNF provides the same
      functional behavior and interfaces as the equivalent network
      function, but is deployed as software instance(s) built on top of
      a virtualization layer [NFV-TERM].

   VNF Pool: a number of VNF instances providing the same network
      function.

   VNF Pool Element: a VNF instance inside a VNF Pool.

   VNF Pool Manager: an entity that manages a VNF Pool, and interacts
      with the Service Control Entity to provide the network function.

   VNF Set: a general set of VNF instances that can be grouped into
      multiple VNF Pools, where each pool corresponds to a specific VNF
      and different pools provide different functions.

3.  Background

3.1.  From Specialized Hardware to Virtualized Network Function

   Network functions are traditionally implemented on specialized
   hardware.  There is a trend to implement a number of network
   functions as software instances on general purpose servers, via
   virtualized computing platforms.  These virtualized functions are
   called Virtualized Network Functions (VNFs).  For example, in
   Figure 1, a virtual firewall (vFW) can be deployed as software
   instances on general purpose servers, which could be located in Data
   Center (DC) networks, network operators' networks, or end user
   premises.  Compared with a traditional FW deployed as a "standalone
   box" built from specialized hardware and software, a vFW has
   potential advantages such as agility and scalability [NFV-WP].

         FW                  vFW            vFW            vFW
   +-------------+      +-----------+  +-----------+  +-----------+
   | Specialized |      |FW Software|  |FW Software|  |FW Software| ...
   |  Hardware   |----\ +-----------+  +-----------+  +-----------+
   |      +      |----/ +------------------------------------------+
   |  Software   |      |         Virtualization Platform          |
   +-------------+      +------------------------------------------+
                        +-----------------+      +-----------------+
                        | General Purpose |      | General Purpose |
                        |     Server      |      |     Server      | ...
                        +-----------------+      +-----------------+

                       Figure 1: Example of vFW.

3.2.  The Concept of VNF Set

   We call a general set of VNF instances a VNF set.  A VNF set can
   include one or more VNFs, and each VNF may have a number of
   instances providing the same function.  The following examples are
   all valid VNF sets.

   1.  n vFW instances: {vFW#1,vFW#2,...,vFW#n}.

   2.  m vFW instances and k virtual load balancer (vLB) instances:
       {vFW#1,...,vFW#m,vLB#1,...,vLB#k}.

   To be more generic, we denote by VNF-A#x the xth instance of a VNF
   of type A (e.g., vFW), by VNF-B#y the yth instance of a VNF of type
   B (e.g., vLB), and so on.
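
   The instance notation above can be read as a simple data structure.
   The following is a minimal Python sketch, not part of any
   specification, that parses instance labels of the form "type#index"
   and groups a VNF set by function type, which corresponds to the
   grouping of a VNF set into VNF Pools described in this document.
   The function names parse_instance and group_into_pools are
   hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def parse_instance(label):
    """Split an instance label such as 'vFW#2' into (type, index)."""
    vnf_type, sep, index = label.partition("#")
    if not sep or not vnf_type or not index.isdigit():
        raise ValueError(f"not a VNF instance label: {label!r}")
    return vnf_type, int(index)


def group_into_pools(vnf_set):
    """Group a VNF set into pools keyed by function type (assumed model)."""
    pools = defaultdict(list)
    for label in vnf_set:
        vnf_type, index = parse_instance(label)
        pools[vnf_type].append(index)
    # Sort the indices so the result does not depend on set ordering.
    return {vnf_type: sorted(ix) for vnf_type, ix in pools.items()}


# A VNF set with three vFW instances and two vLB instances.
vnf_set = {"vFW#1", "vFW#2", "vLB#1", "vFW#3", "vLB#2"}
pools = group_into_pools(vnf_set)
```

   In this sketch each key of the resulting dictionary stands for one
   VNF Pool, and each value lists the indices of its pool elements.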

   A VNF set can be used as part of a Service Function Chain (SFC)
   [SFC], where the instances of various functions are sequentially
   connected to build a network service.  A simple example is shown in
   Figure 2.

                          Network Service
   +----------+           +----------+           +----------+
   | VNF-A#x  | data conn | VNF-B#y  | data conn | VNF-C#z  |
   |          |-----------|          |-----------|          |
   +----------+           +----------+           +----------+

              Figure 2: A VNF set used as part of an SFC.

   Alternatively, a VNF set can also be used merely as a set of VNFs,
   where the instances provide network functions in parallel.  An
   example is shown in Figure 3.

   +----------+      +----------+      +----------+
   | VNF-A#x  |      | VNF-B#y  |      | VNF-C#z  |
   +----------+      +----------+      +----------+
              \           |           /
     data conn \          |data      /data conn
                \         |conn     /
                 \        |        /
                  +---------------+
                  |    Client     |
                  +---------------+

              Figure 3: A VNF set used as multiple VNFs.

   Some more detailed use cases of VNFs are documented in other drafts
   [VNFPOOL-UC1] [VNFPOOL-UC2] [VNFPOOL-UC3].

4.  VNF Pool

   There are a number of existing technologies for providing reliable
   functions, such as Reliable Server Pool (RSerPool) [RFC5351] and
   Virtual Router Redundancy Protocol (VRRP) [RFC5798], amongst many
   others.  Both technologies provide the service with an abstract
   object (e.g., the pool handle in RSerPool, the virtual router ID in
   VRRP) representing a group of identical functional instances.  The
   dynamic mapping of such an abstract object to the actual serving
   instance is managed internally in the group to cover the failover
   procedure.  The advantage is that reliable functions are provided in
   a manner transparent to both end-hosts and service control entities.
   We adopt a similar idea, the VNF Pool, to provide reliable network
   functions, as shown in Figure 4.

                        +------------------------+
                        | Service Control Entity |
                        +------------------------+
                                ^         ^
                                |         |
                    +-----------+         +------------+
                    |                                  |
                    v                                  v
   + - - - - - - - - - - - - - - - +  + - - - - - - - - - - - - - - - +
   | VNF-A   +--------------+      |  | VNF-B   +--------------+      |
   |         | Pool Manager |      |  |         | Pool Manager |      |
   |         +--------------+      |  |         +--------------+      |
   | + - - - - - - - - - - - - - + |  | + - - - - - - - - - - - - - + |
   | |+---------+     +---------+| |  | |+---------+     +---------+| |
   | || VNF-A#1 | ... | VNF-A#n || |  | || VNF-B#1 | ... | VNF-B#m || |
   | |+---------+     +---------+| |  | |+---------+     +---------+| |
   | |        VNF-A Pool         | |  | |        VNF-B Pool         | |
   | + - - - - - - - - - - - - - + |  | + - - - - - - - - - - - - - + |
   + - - - - - - - - - - - - - - - +  + - - - - - - - - - - - - - - - +

                   Figure 4: VNF Pool Architecture.

   In the VNF Pool architecture, each VNF has a VNF Pool containing a
   number of VNF instances (or VNF Pool Elements) providing the same
   function.  In this sense, a VNF set can be grouped into multiple VNF
   Pools, where each pool corresponds to a specific VNF, so that
   different pools provide different functions.  Each VNF also has a
   Pool Manager that manages the VNF instances in the VNF Pool.  The
   Pool Manager interacts with the Service Control Entity to provide
   the network function.

   The main benefit of using a VNF Pool is that pooling mechanisms such
   as redundancy management are implemented by the VNF Pool inside the
   VNF and are thus transparent to the Service Control Entity.  The
   Service Control Entity simply interacts with the Pool Manager in
   each VNF to request and orchestrate the network functions with the
   desired reliability level.  In other words, a VNF Pool-enabled VNF
   still acts as a normal VNF when orchestrated by the Service Control
   Entity.

5.  Challenges and Open Issues

5.1.  Risk factors for unreliable instance

   A VNF instance typically does not have built-in reliability
   mechanisms on its host (i.e., a general purpose server).  Instead,
   there are more risk factors that may make a VNF instance unreliable.

   1.  Instance failure due to hardware failure or a status change such
       as server overload.

   2.  Instance failure due to software failure at various levels,
       including the hypervisor, the Virtual Machine (VM), and the VNF.

   3.  Instance migration triggered by performance degradation under
       load (e.g., CPU, memory, disk I/O), by server consolidation, or
       by other changes in service requirements.  This is distinct from
       a hard failure, although it may give the appearance of one.

5.2.  Redundancy model inside VNF

   Before a live VNF instance fails, one or more backup instances in
   the same VNF Pool need to be selected.  How should such backup
   instances be selected?  Moreover, there are policies influencing the
   appropriate selection of backup instances.  For example, placing a
   live VNF instance and its backup instances on a single physical
   server, or in locations with shared risks in the network, should be
   avoided.  On the other hand, it would be desirable to place the live
   and backup instances in geographically close locations.  Information
   from the underlying network may need to be collected via, e.g., the
   interface with Application-Layer Traffic Optimization (ALTO) [ALTO]
   or the Interface to the Routing System (I2RS) [I2RS].

5.3.  State synchronization inside VNF

   Service states related to the specific function performed by a VNF
   instance, e.g., NAT translation tables and TCP connection states,
   should be synchronized between a live VNF instance and its backup
   instances for stateful failover.  Who is responsible for collecting,
   holding, and accessing such service states, and how can this be done
   to achieve efficient synchronization?
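
   As one illustration of the question above, the following Python
   sketch shows a live instance pushing versioned service-state updates
   (here, NAT translation entries) to a backup instance, which ignores
   stale or duplicate updates.  Everything in it is an assumption made
   for illustration: the class names, the per-entry update granularity,
   and the version counter are hypothetical and are not proposed by
   this document.  It also assumes the live instance itself is
   responsible for pushing state, which is only one of several possible
   answers.

```python
class BackupInstance:
    """Holds a replica of the live instance's service state."""

    def __init__(self):
        self.state = {}
        self.version = 0

    def apply(self, version, key, value):
        # Ignore stale or duplicate updates.  A real design would also
        # need to detect gaps (missed updates) and resynchronize.
        if version <= self.version:
            return False
        self.state[key] = value
        self.version = version
        return True


class LiveInstance:
    """Serves traffic and pushes every state change to its backups."""

    def __init__(self, backups):
        self.state = {}
        self.version = 0
        self.backups = backups

    def update(self, key, value):
        self.version += 1
        self.state[key] = value
        for backup in self.backups:
            backup.apply(self.version, key, value)


backup = BackupInstance()
live = LiveInstance([backup])
# Two NAT translation entries: (private addr, port) -> (public addr, port).
live.update(("10.0.0.5", 40000), ("203.0.113.1", 50000))
live.update(("10.0.0.6", 40001), ("203.0.113.1", 50001))
```

   After these updates the backup holds the same translation entries as
   the live instance, so it could take over without losing the mapped
   connections.  The sketch says nothing about the cost of pushing each
   update synchronously, which is exactly the efficiency concern raised
   in this section.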
   A VNF instance should provide a negotiated level of state sharing
   with the performance necessary to fulfill the service requirements,
   e.g., the state synchronization method, the format of state data,
   and the location of and mechanism to access state data.

5.4.  Interaction between VNF and Service Control Entity

   Some information needs to be exchanged between a VNF and the Service
   Control Entity when the Service Control Entity orchestrates a VNF
   Pool-enabled VNF.  For example, how can a VNF Pool be addressed by
   the Service Control Entity?  A Pool Manager can advertise the
   locator (e.g., IP address) of the active instance, which is subject
   to change due to failover.  It is also possible to use a virtual
   address for the whole VNF Pool (similar to RSerPool or VRRP), and
   map between virtual and actual addresses.  Moreover, when a live VNF
   instance goes out of service, how does the Service Control Entity
   learn which VNF instance will replace it, and learn the
   characteristics of the new instance?

5.5.  Reliable transport

   The transport mechanism used to carry the pool control messages,
   e.g., for redundancy management, should provide reliable message
   delivery.  Transport redundancy mechanisms such as Multipath TCP
   (MPTCP) [MPTCP] and the Stream Control Transmission Protocol (SCTP)
   [RFC3286] will need to be evaluated for applicability.  Latency
   requirements for pool control message delivery must also be
   evaluated.

5.6.  Scope Considerations

   Ideally, the reliability goal is that the network service provided
   by the VNFs will continue throughout an interruption within the
   VNFs, and that VNF instance failure or migration will not be visible
   to external entities.  Our work on VNF Pool initially focuses on
   several reliability mechanisms that are mainly associated with the
   redundancy model of a VNF.
   Additional reliability mechanisms, including state synchronization,
   may be included after a future gap analysis between identified
   requirements [NFV-REL] and existing IETF technologies.

   We currently assume that a VNF Pool contains instances of the same
   functional type, e.g., FW, LB, etc.  In a subsequent step, after
   identifying the use cases and requirements, we may consider more
   types of VNF Pool, such as those composed of instances of different
   VNF Components (VNFCs) [NFV-SWA].

   We are specifically concerned with the reliability of an individual
   VNF based on the VNF Pool managed inside the VNF.  We do not address
   the reliability-related control or routing between adjacent VNFs in
   the whole VNF set, as such coordination could be done by the Service
   Control Entity.

   We do not work on other management aspects of VNFs such as scaling
   or load balancing, even though these aspects may be complementary to
   reliability in order to provide network services.

   We do not intend to address service availability, which usually
   involves more factors, including interruptions in various OSI layers
   and even user perception of service performance.

6.  Related Works

6.1.  Reliable Server Pool (RSerPool)

   RSerPool supports high availability and scalability of applications
   through the use of pools of servers [RFC5351].  The main functions
   of RSerPool involve server pool management, as well as receiving
   requests from a client to bind to a desired server.  The
   applicability of RSerPool to our work on VNF Pool, and the gaps, are
   described in another draft [VNFPOOL-RSP].

6.2.  Virtual Router Redundancy Protocol (VRRP)

   VRRP specifies an election protocol that dynamically assigns
   responsibility for a virtual router to one of the VRRP routers on a
   LAN, called the Master [RFC5798].  The election process provides
   dynamic failover in the forwarding responsibility should the Master
   become unavailable.
   The advantage of VRRP is a higher-availability default path without
   requiring configuration of dynamic routing or router discovery
   protocols on every end-host.

6.3.  Service Function Chaining (SFC)

   A service chain defines an ordered set of service functions that
   must be applied to packets [SFC].  Although VNFs can be used as part
   of an SFC, SFC and our work on VNF Pool have a different focus.  As
   mentioned in Section 5.6, we mostly consider the reliability of an
   individual VNF based on the VNF Pool inside the VNF.  We do not
   address the reliability-related control or routing between adjacent
   VNFs in the forwarding graph.  Moreover, according to the VNF Pool
   architecture and principles, the VNF Pools will be orthogonal to and
   invisible to the SFC.  A VNF Pool-enabled VNF still acts as a normal
   VNF when orchestrated by the SFC.  Information exchanged between the
   VNF Pool and the SFC could be operational information of the VNF
   Pool, including the pool address, pool instance characteristics, and
   so on.

7.  Security Considerations

   Any technology which allows the insertion, deletion, reordering, or
   manipulation of network functions has the potential to be subverted
   by an attacker, with serious consequences.  Distributed VNFs
   introduce an additional attack vector, in which bad actors join
   several VNFs of a service.  Replay attacks have the potential to
   cause denial of service or to reorder, add, or remove VNFs.

   VNF reliability technologies must provide cryptographic protections
   against spoofing and insertion attacks as well as replay attacks, in
   the form of client authentication, origin authentication on VNF
   reliability management (control plane) traffic, and replay
   protections.  There may be circumstances under which an attacker
   masquerading as a VNF manager can introduce data leakage or similar
   attacks, and consequently server authentication would be required as
   well.
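
   As a rough illustration of origin authentication and replay
   protection on pool control traffic, the following Python sketch tags
   each control message with an HMAC computed over a sequence number
   and the payload, and rejects messages whose tag does not verify or
   whose sequence number is not strictly increasing.  The pre-shared
   key, the message layout, and the names sign and Receiver are
   assumptions made for illustration only.  This document does not
   specify any such mechanism, and key distribution is outside the
   scope of the sketch.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes
SEQ_LEN = 8   # sequence number carried as an unsigned 64-bit integer


def sign(key, seq, payload):
    """Return seq || payload || HMAC-SHA256(key, seq || payload)."""
    body = struct.pack("!Q", seq) + payload
    return body + hmac.new(key, body, hashlib.sha256).digest()


class Receiver:
    """Accepts only authentic messages with increasing sequence numbers."""

    def __init__(self, key):
        self.key = key
        self.last_seq = 0

    def accept(self, message):
        """Return the payload, or None if the message is forged or replayed."""
        if len(message) < SEQ_LEN + TAG_LEN:
            return None
        body, tag = message[:-TAG_LEN], message[-TAG_LEN:]
        expected = hmac.new(self.key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return None  # origin authentication failed
        (seq,) = struct.unpack("!Q", body[:SEQ_LEN])
        if seq <= self.last_seq:
            return None  # replayed or out-of-order message
        self.last_seq = seq
        return body[SEQ_LEN:]


key = b"demo-key-not-for-production"  # hypothetical pre-shared key
receiver = Receiver(key)
message = sign(key, 1, b"failover: VNF-A#2 active")
```

   Calling receiver.accept(message) once returns the payload, and
   presenting the same message a second time is rejected as a replay.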

   Failing over a VNF or otherwise transferring service state raises
   issues related to the transfer of security state, including VNF
   element identity and credentials, session-associated cryptographic
   state, and so on.  Where possible, transfer of security state should
   be avoided as a matter of good practice, and this will require
   particular attention as solutions are drafted.

8.  IANA Considerations

   This document has no actions for IANA.

9.  Acknowledgements

   The authors would like to thank Chidung Lac from Orange, Daniel King
   from Lancaster University, Lingli Deng and Zhen Cao from China
   Mobile, Richard Yang from Yale University, Hidetoshi Yokota from
   KDDI, Mukhtiar Shaikh from Brocade, Qiang Zu from Ericsson, Marco
   Liebsch from NEC, and Susan Hares for their valuable comments.

10.  References

10.1.  Normative References

   TBD.

10.2.  Informative References

   [NFV-WP]   NFV Whitepaper: "Network Function Virtualization", issue
              1, 2012,
              http://portal.etsi.org/NFV/NFV_White_Paper.pdf.

   [SFC]      "Service Function Chaining (SFC)", .

   [NFV-TERM] ETSI GS NFV 003: "Terminology for Main Conceptional
              Entities in NFV", Version 0.0.4, 2013.

   [VNFPOOL-UC1]
              L. Xia, Q. Wu, D. King, H. Yokota, and N. Khan,
              "Requirements and Use Cases for Virtual Network
              Functions", draft-xia-vnfpool-use-cases-00, February
              2014.

   [VNFPOOL-UC2]
              D. King, M. Liebsch, P. Willis and J. Ryoo,
              "Virtualization of Mobile Core Network Use Case",
              draft-king-vnfpool-mobile-use-case-00, February 2014.

   [VNFPOOL-UC3]
              S. Hares and K. Subramaniam, "Use Cases for Resource
              Pools with Virtual Network Functions (VNFs)",
              draft-hares-vnf-pool-use-case-00, January 2014.

   [ALTO]     "Application-Layer Traffic Optimization (alto)", .

   [I2RS]     "Interface to the Routing System (i2rs)", .

   [MPTCP]    "Multipath TCP (mptcp)", .

   [RFC3286]  L. Ong and J. Yoakum, "An Introduction to the Stream
              Control Transmission Protocol (SCTP)", RFC3286, May 2002.

   [NFV-REL]  ETSI GS NFV REL 001: "Network Function Virtualization;
              Resiliency Requirements", Version 0.0.7, 2014.

   [NFV-SWA]  ETSI GS NFV SWA 001: "Network Function Virtualization; SW
              Architecture; Virtual Network Functions Architecture",
              Version 0.1.0, 2014.

   [RFC5351]  P. Lei, L. Ong, M. Tuexen and T. Dreibholz, "An Overview
              of Reliable Server Pooling Protocols", RFC5351, September
              2008.

   [RFC5798]  S. Nadas, "Virtual Router Redundancy Protocol (VRRP)
              Version 3 for IPv4 and IPv6", RFC5798, March 2010.

   [VNFPOOL-RSP]
              T. Dreibholz, M. Tuexen, M. Shore and N. Zong, "The
              Applicability of Reliable Server Pooling (RSerPool) for
              Virtual Network Function Resource Pooling (VNFPOOL)",
              draft-dreibholz-vnfpool-rserpool-applic-00, October 2013.

Authors' Addresses

   Ning Zong
   Huawei Technologies

   Email: zongning@huawei.com


   Linda Dunbar
   Huawei Technologies

   Email: linda.dunbar@huawei.com


   Melinda Shore
   No Mountain Software

   Email: melinda.shore@nomountain.net


   Diego Lopez
   Telefonica

   Email: diego@tid.es


   Georgios Karagiannis
   University of Twente

   Email: g.karagiannis@utwente.nl