Network Working Group                                            N. Zong
Internet-Draft                                                 L. Dunbar
Intended status: Informational                       Huawei Technologies
Expires: July 18, 2014                                          M. Shore
                                                    No Mountain Software
                                                                D. Lopez
                                                              Telefonica
                                                        January 14, 2014


       Virtualized Network Function (VNF) Pool Problem Statement
                draft-zong-vnfpool-problem-statement-02

Abstract

   There is a trend to implement network services by using a group of
   Virtualized Network Functions (VNFs) called a VNF set. A VNF set can
   offer benefits like flexible service provisioning. A VNF set also
   introduces additional points of failure, and therefore poses
   additional challenges for reliability. This document gives an
   overview of the problems related to the reliability of a VNF set. A
   VNF pooling architecture is also briefly introduced as a candidate
   solution to address the problems.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 18, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

Zong, et al.              Expires July 18, 2014                 [Page 1]

Internet-Draft         VNF Pool Problem Statement           January 2014

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Background
   2.  Terminology
   3.  Problems
   4.  VNF Pooling Architecture
   5.  Related Works
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Background

   Network functions such as packet filtering at a firewall, Deep Packet
   Inspection (DPI), Load Balancing (LB), and WAN optimization are
   conventionally deployed as specialized hardware servers in both
   network operators' networks and Data Center (DC) networks, as the
   building blocks of network services. A Virtualized Network Function
   (VNF) provides such a network function and is typically implemented
   as a software instance running on a commodity hardware server via a
   virtualization layer (i.e., hypervisor). VNFs could potentially
   offer benefits such as elastic service offering and reduced
   operational and equipment costs [NFV-WP].

   There is a trend to move network functions away from specialized
   hardware servers to commodity hardware servers, based on resource
   virtualization, to implement network services by using a group of
   VNFs.
   For example, in Service Function Chaining (SFC), a network service
   can be implemented by a group of sequentially connected VNFs deployed
   at different points in the network [SFC]. We call a group of VNFs a
   VNF set, which can be used not only as an SFC, but also solely as one
   or more pools of VNFs.

   A VNF set can introduce additional points of failure beyond those
   inherent in a single specialized server, and therefore poses
   additional challenges for reliability. A single VNF typically would
   not have built-in reliability mechanisms on its host (i.e., a
   commodity hardware server). Instead, there are more risk factors,
   such as software failure, server overload, and instance migration,
   that may lead to VNF failures.

   Currently, generalized pooling and other redundancy mechanisms may be
   applied to address some reliability requirements of a single VNF.
   However, the complexity of dealing with a growing number of VNFs,
   including stateful and stateless functions, and of extending
   redundancy across a VNF set (i.e., multiple pools for multiple VNFs)
   requires further solution development. For example, when a live VNF
   pool member goes out of service, how do adjacent entities learn which
   pool member will replace it? How do VNFs learn the state of adjacent
   VNFs to support reliability mechanisms like load sharing and switch
   before break? How is the service state of an instance held and
   accessed for efficient synchronization with backup instances and
   other members of its pool?

   In this document, we give an overview of problems related to the
   reliability of a VNF set. We also briefly introduce a VNF pooling
   architecture as a candidate solution to address the problems.

2.  Terminology

   Reliability: capability of a functional entity to consistently
   provide its function under various dynamic and even unexpected
   conditions such as fault, overload, etc.

   Virtualized Network Function (VNF): a VNF provides the same
   functional behavior and interfaces as the equivalent network
   function, but is deployed as software instances built on top of a
   virtualization platform [NFV-TERM].

   VNF Pool: a group of VNF instances providing the same network
   function.

   VNF Pool Element: a VNF instance inside a VNF pool.

   VNF Pool User: an entity that requests the network function provided
   by a VNF pool.

   VNF Pool Manager: an entity that manages pool elements, and
   interacts with the pool user to provide the network function.

   VNF Set: a group of VNFs, with each VNF corresponding to a VNF Pool.

3.  Problems

   There is a trend to implement network services by using a group of
   VNFs, based on virtualized resources. We call a group of n VNFs
   (i.e., VNF#1, VNF#2, ..., VNF#n) a VNF set. A VNF set can be used not
   only as a Service Function Chain (SFC) [SFC], where a group of VNFs
   is sequentially connected to implement a network service (as shown in
   Figure 1), but also solely as one or more pools of VNFs. A more
   detailed use case study of the VNF set is documented in a separate
   draft [VNFPOOL-UC].

                        Network Service
   +----------+           +----------+           +----------+
   |  VNF#1   | data conn |  VNF#2   | data conn |  VNF#n   |
   | Instance |-----------| Instance |- ... ... -| Instance |
   +----------+           +----------+           +----------+
                                ^
                                | Virtualization
   +--------------------------------------------------------+
   |                Virtualization Platform                 |
   +--------------------------------------------------------+

                   Figure 1: A VNF set used as an SFC.

   A VNF set can introduce additional points of failure beyond those
   inherent in a single specialized server, and therefore poses
   additional challenges for reliability.

   1.  More causes of VNF failure. A VNF typically would not have
       built-in reliability mechanisms on its host (i.e., a commodity
       hardware server).
       Instead, there are more risk factors that may lead to VNF
       transition or failure.

       1)  Hardware failure or status change such as server
           over-utilization or network congestion.

       2)  Software failure at various levels, including the
           hypervisor, Virtual Machine (VM), and VNF instance.

       3)  Instance scaling in/out/up/down, or migration caused by
           instance performance degradation, server consolidation, or
           other service requirement changes.

   2.  Backup advertisement across VNF set. Although currently
       generalized pooling and redundancy mechanisms may be applied to
       address some reliability requirements of a single VNF, multiple
       pools for multiple VNFs may require extended redundancy
       mechanisms. For example, before a live VNF instance fails, one
       or more backup instances in the same VNF pool need to be
       selected and advertised to the adjacent entities such as another
       VNF pool. Who is responsible for selecting and advertising such
       backup instances across the VNF set, and how?

   3.  State notification across VNF set. A VNF may need to learn the
       state of adjacent VNFs to support reliability mechanisms like
       load sharing and switch before break. Some critical states
       include performance degradation due to resource contention
       between instances, instance migration, scaling in/out/up/down,
       and so on. Who is responsible for notifying such critical states
       across the VNF set, and how?

   4.  Service state synchronization. The service state should be
       synchronized between a live VNF instance and its backup
       instances for stateful failover. Who is responsible for
       collecting, holding, and accessing such service state to achieve
       efficient synchronization, and how? A VNF should provide a
       negotiated level of state sharing with the necessary performance
       to fulfill the service requirements (e.g., state synchronization
       method, format of state data, location and mechanism to access
       state data).

   5.  Complication of VNF placement.
       There are multiple policies influencing the appropriate
       placement of VNFs. A live VNF instance and its backup instance
       should not be placed on a single physical server, or in
       locations with shared risks (e.g., links, network nodes) in the
       physical network. On the other hand, it would be desirable to
       place the live and backup instances in topologically close
       locations. A VNF set may need to collect information from the
       underlying network, e.g., by interfacing with Application-Layer
       Traffic Optimization (ALTO) [ALTO] or Interface to the Routing
       System (I2RS) [I2RS].

   6.  Reliable transport. The transport network should provide
       alternative paths for accessing a VNF instance, as well as for
       the aforementioned control traffic, to prevent a single point of
       failure in the physical network. Transport redundancy mechanisms
       like Multipath TCP (MPTCP) [MPTCP] and Stream Control
       Transmission Protocol (SCTP) [RFC3286] need to be identified and
       analyzed for reuse.

   Ideally, the reliability of a VNF set means that the services
   provided by such a VNF set will continue throughout an interruption
   and the outages of one or more VNFs will not be visible to the users
   of the VNF set. Initially, the WG focuses the work around several
   mechanisms supporting a reliable VNF set, which are mainly redundancy
   across a VNF set and stateful failover. Additional mechanisms for a
   reliable VNF set might be included after future gap analysis between
   identified requirements and existing IETF technologies. Detailed
   analysis of VNF reliability can also be found in [NFV-REL].

4.  VNF Pooling Architecture

   There are a number of existing technologies for providing reliable
   and highly available functions, such as Reliable Server Pooling
   (RSerPool) [RFC5351] and Virtual Router Redundancy Protocol (VRRP)
   [RFC5798].
   Both technologies provide service through an abstract object (e.g.,
   pool handle in RSerPool, Virtual Router ID in VRRP) that represents a
   group of functional instances, where the dynamic mapping of the
   abstract object to the actual serving instance, or the selection of
   the serving instance, is managed internally to the group to cover the
   failover procedure. The advantage of this idea is to provide
   reliable and highly available functions in a manner that is
   transparent to both end-hosts and other service components. Based on
   this idea, we describe a VNF pooling architecture to address the
   aforementioned problems for a reliable VNF set, as shown in Figure 2.

                           +-----------------+
                           |    Pool User    |
                           +-----------------+
                               ^         ^
                               |         |
                   +-----------+         +-----------+
                   |                                 |
                   v                                 v
           +--------------+                   +--------------+
           | Pool Manager |<----------------->| Pool Manager |
           +--------------+                   +--------------+
                   ^                                 ^
                   |                                 |
                   v                                 v
   +------------------------------+   +------------------------------+
   |+----------+     +----------+ |   | +----------+     +----------+|
   ||  VNF#1   |     |  VNF#1   | |   | |  VNF#2   |     |  VNF#2   ||
   || Instance | ... | Instance |<+---+>| Instance | ... | Instance ||
   |+----------+     +----------+ |   | +----------+     +----------+|
   |          VNF#1 Pool          |   |          VNF#2 Pool          |
   +------------------------------+   +------------------------------+

                  Figure 2: VNF Pooling Architecture.

   In the VNF pooling architecture, there are multiple VNF pools. Each
   VNF pool contains a group of VNF instances (also called VNF pool
   elements) providing the same network function. Each VNF pool also
   has a VNF pool manager that manages pool elements, and interacts with
   the pool user to provide the network function. A pool user can be
   either an application end-host or a service component (e.g.,
   orchestrator in DC service) requesting the network function.

   VNF pooling will address the problems for a reliable VNF set from the
   following perspectives.

   1.  Each VNF pool manager communicates with the VNF pool elements it
       is responsible for to transmit messages for backup instance
       selection and service state synchronization.

   2.  VNF pool managers of different VNF pools communicate with each
       other to transmit messages for backup instance advertisement and
       instance state notification across the VNF set.

   3.  VNF pool elements of different VNF pools may also communicate
       with each other to transmit messages for backup instance
       advertisement and instance state notification.

   4.  A VNF pool manager may also communicate with the pool user to
       obtain the policies influencing the appropriate placement of
       VNFs. When needed, a VNF pool manager may interface with ALTO
       or I2RS to collect information from the underlying network.

   The detailed solution will be documented in a separate VNF pooling
   architecture draft [VNFPOOL-ARCH].

5.  Related Works

   1.  Reliable Server Pooling (RSerPool). RSerPool supports high
       availability and scalability of applications through the use of
       pools of servers [RFC5351]. The main functions of RSerPool
       involve server pool management, as well as receiving requests
       from a client to bind to a desired server. The applicability and
       gaps of RSerPool with respect to a reliable VNF set are described
       in a separate draft [VNFPOOL-RSP].

   2.  Virtual Router Redundancy Protocol (VRRP). VRRP specifies an
       election protocol that dynamically assigns responsibility for a
       virtual router to one of the VRRP routers on a LAN, called the
       Master [RFC5798]. The election process provides dynamic failover
       in the forwarding responsibility should the Master become
       unavailable. The advantage of VRRP is a higher-availability
       default path without requiring configuration of dynamic routing
       or router discovery protocols on every end-host.

   3.  Service Function Chaining (SFC).
       A service chain defines an ordered set of service functions that
       must be applied to packets [SFC]. A VNF set can be used as an
       SFC, where a group of VNFs is sequentially connected to
       implement a network service. SFC and the reliable VNF set are
       independent of but complementary to each other. While SFC aims
       at defining mechanisms such as packet encapsulation and control
       plane metadata to connect the service function instances in a
       specific order, the reliable VNF set focuses on signaling related
       to reliability mechanisms such as redundancy, state transfer,
       and trust/security.

6.  Security Considerations

   Any technology which allows the insertion, deletion, reordering, or
   manipulation of network functions has the potential to be subverted
   by an attacker, with serious consequences. Distributed VNFs
   introduce an additional attack vector, in which bad actors join
   several VNFs of a service. Replay attacks have the potential to
   create denials of service, reordering, adding, or removing VNFs.

   VNF reliability technologies must provide cryptographic protections
   against spoofing and insertion attacks as well as replay attacks, in
   the form of client authentication, origin authentication on VNF
   reliability management (control plane) traffic, and replay
   protections. There may be circumstances under which an attacker
   masquerading as a VNF manager can introduce data leakage or similar
   attacks, and consequently server authentication would be required as
   well.

7.  IANA Considerations

   This document has no actions for IANA.

8.  Acknowledgements

   The authors would like to thank Daniel King from Lancaster
   University, UK; Lingli Deng, Zhen Cao from China Mobile; Richard Yang
   from Yale University, US; Hidetoshi Yokota from KDDI; Mukhtiar Shaikh
   from Brocade; LAC Chidung from Orange; and Susan Hares for their
   valuable comments.

9.  References

9.1.  Normative References

   TBD.

9.2.  Informative References

   [NFV-WP]   NFV Whitepaper: "Network Function Virtualization",
              issue 1, 2012,
              http://portal.etsi.org/NFV/NFV_White_Paper.pdf.

   [SFC]      "Service Function Chaining (SFC)".

   [NFV-TERM]
              ETSI GS NFV 003: "Terminology for Main Conceptional
              Entities in NFV", Version 0.0.4, 2013.

   [VNFPOOL-UC]
              L. Xia, Q. Wu and D. King, "Use cases and Requirements
              for Virtual Service Node Pool Management",
              draft-xia-vsnpool-management-use-case-01, August 2013.

   [ALTO]     "Application-Layer Traffic Optimization (alto)".

   [I2RS]     "Interface to the Routing System (i2rs)".

   [MPTCP]    "Multipath TCP (mptcp)".

   [RFC3286]  L. Ong and J. Yoakum, "An Introduction to the Stream
              Control Transmission Protocol (SCTP)", RFC3286, May 2002.

   [NFV-REL]  ETSI GS NFV REL 001: "Network Function Virtualization;
              Resiliency Requirements", Version 0.0.1, 2013.

   [RFC5351]  P. Lei, L. Ong, M. Tuexen and T. Dreibholz, "An Overview
              of Reliable Server Pooling Protocols", RFC5351,
              September 2008.

   [RFC5798]  S. Nadas, "Virtual Router Redundancy Protocol (VRRP)
              Version 3 for IPv4 and IPv6", RFC5798, March 2010.

   [VNFPOOL-ARCH]
              TBD.

   [VNFPOOL-RSP]
              T. Dreibholz, M. Tuexen, M. Shore and N. Zong, "The
              Applicability of Reliable Server Pooling (RSerPool) for
              Virtual Network Function Resource Pooling (VNFPOOL)",
              draft-dreibholz-vnfpool-rserpool-applic-00, October 2013.

Authors' Addresses

   Ning Zong
   Huawei Technologies

   Email: zongning@huawei.com


   Linda Dunbar
   Huawei Technologies

   Email: linda.dunbar@huawei.com


   Melinda Shore
   No Mountain Software

   Email: melinda.shore@nomountain.net


   Diego Lopez
   Telefonica

   Email: diego@tid.es