Internet Working Group                                          Y. Jiang
                                                                   W. Xu
Internet Draft                                                    Huawei
                                                                  Z. Cao
Intended status: Standards Track                            China Mobile
Expires: January 2015                                       July 4, 2014

              Fault Management in Service Function Chaining
                        draft-jxc-sfc-fm-00.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on January 4, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Jiang, et al.           Expires January 4, 2015                 [Page 1]

Internet-Draft            SFC Fault Management                 July 2014

Abstract

   Service Function Chaining (SFC) provides a flexible and agile
   approach to service innovation, but whether an SFC path is
   constructed as expected and whether the chain is functioning
   correctly still need to be verified.
   This document discusses fault management requirements in SFC and
   provides a fault management solution for service function chaining.

Table of Contents

   1. Introduction
      1.1. Conventions used in this document
      1.2. Terminology
      1.3. SFC OAM Requirements
   2. Packet Format
   3. Theory of Operation
      3.1. Continuity Check and Connectivity Verification of SFC
         3.1.1. MEP sending an SFC CC-CV packet
         3.1.2. MEP terminating an SFC CC-CV packet
      3.2. SFC Route Tracing
         3.2.1. MEP sending an SFC Trace Request
         3.2.2. SFE/SFF processing an SFC Trace Route Request
         3.2.3. Service Function treating an SFC Trace Request
         3.2.4. MEP receiving an SFC Trace Reply
   4. Security Considerations
   5. IANA Considerations
   6. References
      6.1. Normative References
      6.2. Informative References
   7. Acknowledgments

1. Introduction

   This document discusses Operations, Administration and Maintenance
   (OAM), specifically fault management requirements, for Service
   Function Chaining (SFC), and provides a solution that can be used
   to detect data plane failures in SFC paths.

   A prerequisite of SFC OAM is that SFC OAM messages must follow the
   same data path that normal SFC packets traverse.
   SFC OAM request and reply messages are used primarily to validate
   the SFC data plane, and may further be used to verify the SFC data
   plane against the SFC control plane.

1.1. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in [RFC2119].

1.2. Terminology

   Maintenance Entity Group (MEG): The set of one or more maintenance
   entities that maintain and monitor a section or a transport path in
   an OAM domain.

   MEP: MEG End Point, an OAM end point capable of initiating (source
   MEP) and terminating (sink MEP) OAM packets for fault management
   and performance monitoring.

   MIP: MEG Intermediate Point, an OAM intermediate point that
   terminates and processes OAM packets sent to this particular MIP
   and that may generate OAM packets in reaction to received OAM
   packets.

   Service Function (SF): a logical entity which can provide one or
   more service processing functions for packets/frames, such as
   firewall, DPI (Deep Packet Inspection), LI (Lawful Intercept), and
   so on. Usually these processing functions are computation
   intensive. This entity may also provide packet/frame
   encapsulation/decapsulation capability.

   Service Forwarding Entity (SFE): a logical entity which forwards
   packets/frames to one or more SFs in the same service chain.
   Optionally, it provides mapping, insertion and removal of header(s)
   in packets/frames. Note that the service forwarding path may not be
   the shortest path to the destination.

   Service Function Forwarder (SFF): A service function forwarder is
   responsible for delivering traffic received from the SFC network
   forwarder to one or more connected service functions via
   information carried in the SFC encapsulation.

   Service Chaining Header: a header in front of a packet, added by an
   SFE/SFF.
   An SFE/SFF uses the service chaining header information to forward
   a service chaining packet.

   Service Chaining Packet: an original packet to which a service
   chaining header has been added.

1.3. SFC OAM Requirements

   The following SFC OAM requirements MUST be supported:

   (R1) SFC OAM MUST allow for continuity check between SFEs/SFFs.

   (R2) SFC OAM MUST allow for connectivity verification between
        SFEs/SFFs.

   (R3) SFC OAM MUST support trace routing in a service function path.

   (R4) SFC OAM MUST support connectivity verification between SFs in
        an SFC.

   (R5) SFC OAM MUST support performance measurements in SFs and
        SFEs/SFFs.

   (R6) SFC OAM MUST support monitoring of unidirectional and
        bidirectional SFC paths.

   (R7) SFC OAM MUST support fate sharing of SFC OAM packets and SFC
        service packets on the same SFC path (congruent path).

   Since a control plane is not a prerequisite for SFC, we cannot rely
   on control plane hello sessions. Furthermore, OAM packets need to
   be transported on the same data path as the SFC packets, so that
   any data plane failure can be identified. Therefore, there is a
   need for an OAM tool that enables users to detect failures in the
   SFC data plane, and for a mechanism to isolate and identify faults.

   This document discusses the fault management problem in SFC. The
   basic idea is to verify that packets in a particular Service
   Function Chain actually pass through the SFEs/SFFs and SFs along
   the respective SFC path.

   It is proposed that this test be carried out by sending an OAM
   message (called an "SFC trace request message") across an SFC path.
   The SFC trace request message carries the identifier of the SFC
   whose path is being verified. This SFC OAM request message is
   forwarded on the SFC path just like any SFC data packet belonging
   to that Service Function Chain.
   The OAM message is processed by each SFE/SFF along the SFC, and the
   SFE/SFF will respond with an SFC trace reply message, carrying
   information such as the previous SF identifier and its position in
   the SFC.

2. Packet Format

   OAM messages are encapsulated in an SFC packet in the following
   format (they should not be combined with any SFC data traffic in
   the same SFC packet):

   +------------+-------------+
   | SFC Header | OAM message |
   +------------+-------------+

   Where the SFC header is formatted as in Figure 1:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|O|                  other parameters                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       other parameters                        |
   .                                                               |
   .                                                               |
   .                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                          Figure 1 SFC header

   O: The O flag indicates that an SFC OAM message follows the SFC
   header.

   An SFC OAM message is depicted in Figure 2.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Version    | Message Type  |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Originator Handle                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             TLVs                              |
   .                                                               .
   .                                                               .
   .                                                               .
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                        Figure 2 SFC OAM message

   o Version: the version of the SFC OAM message. This field is 8 bits
     long, and the current version is set to 0x01.

   o Message Type: indicates the type of the SFC OAM message. The SFC
     OAM message has the following types:

        Value    Meaning
        -----    -------
            1    continuity check message
            2    trace request message
            3    trace reply message

   o Originator Handle: The Originator Handle is filled in by the
     original sender of the packet.
   o Sequence Number: The Sequence Number is assigned by the sender of
     the SFC request message and can be used to match a reply message
     to its request.

   o Sending Timestamp: the time of day (in seconds and microseconds,
     according to the sender's clock) when the SFC OAM request is
     sent. The Receiving Timestamp in an SFC OAM reply message is the
     time of day (according to the receiver's clock) at which the
     corresponding request was received.

   o TLVs (Type-Length-Value) have the following format:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Type              |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             Value                             |
   .                                                               .
   .                                                               .
   .                                                               .
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                         Figure 3 SFC OAM TLVs

   Types of SFC OAM TLVs will be defined in the next revision. Length
   is the length of the Value field in octets. The Value field is
   variable, depending on its Type, and is zero padded to align to a
   4-octet boundary.

3. Theory of Operation

   In order to describe SFC OAM in an abstract way, we reuse some
   nomenclature from the MPLS WG.

   SFC OAM operates in the context of Maintenance Entities (MEs) that
   define a relationship between two points of a service function path
   to which maintenance and monitoring operations apply. The two
   points that define a maintenance entity are called Maintenance
   Entity Group End Points (MEPs).
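
   As a non-normative illustration of the message and TLV formats in
   Section 2 (Figures 2 and 3), the following Python sketch packs and
   parses an SFC OAM message. Only the 8-bit Version and the message
   type values come from the text. The 8-bit Message Type, 16-bit
   Reserved, 32-bit Originator Handle and Sequence Number, and 16-bit
   TLV Type and Length widths are assumptions, as are all function
   names and the TLV type value used in the example, since this
   revision does not define any TLV types.

```python
import struct

OAM_VERSION = 0x01            # current version per Section 2
MSG_CONTINUITY_CHECK = 1
MSG_TRACE_REQUEST = 2
MSG_TRACE_REPLY = 3

# Assumed layout: Version(8) | Message Type(8) | Reserved(16),
# Originator Handle(32), Sequence Number(32), network byte order.
_HEADER = struct.Struct("!BBHII")
# Assumed layout: Type(16) | Length(16).
_TLV_HEADER = struct.Struct("!HH")


def pack_tlv(tlv_type, value):
    """Encode one TLV; Length counts Value octets before padding."""
    padding = b"\x00" * (-len(value) % 4)   # align to 4 octets
    return _TLV_HEADER.pack(tlv_type, len(value)) + value + padding


def pack_oam_message(msg_type, originator_handle, sequence_number, tlvs=()):
    """Encode the fixed fields followed by zero or more (type, value) TLVs."""
    header = _HEADER.pack(OAM_VERSION, msg_type, 0,
                          originator_handle, sequence_number)
    return header + b"".join(pack_tlv(t, v) for t, v in tlvs)


def unpack_oam_message(data):
    """Decode a message produced by pack_oam_message()."""
    version, msg_type, _reserved, handle, seq = _HEADER.unpack_from(data, 0)
    offset = _HEADER.size
    tlvs = []
    while offset < len(data):
        tlv_type, length = _TLV_HEADER.unpack_from(data, offset)
        offset += _TLV_HEADER.size
        value = data[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, value))
        offset += length + (-length % 4)     # skip the zero padding
    return {
        "version": version,
        "message_type": msg_type,
        "originator_handle": handle,
        "sequence_number": seq,
        "tlvs": tlvs,
    }
```

   Under these assumptions, a trace request carrying one 5-octet TLV
   value occupies 12 octets of fixed fields, 4 octets of TLV header,
   and 8 octets of padded value.
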
   An abstract reference model for an ME is illustrated in Figure 4
   below:

   +-+    +-+    +-+    +-+
   |A|----|B|----|C|----|D|
   +-+    +-+    +-+    +-+

                   Figure 4 SFC OAM Reference Model

   In Figure 4, node A can be a classifier or an entry SFE/SFF, node D
   can be an exit SFE/SFF, and nodes B and C can be any SFE/SFF or SF
   on the SFC path (with the restriction that no two SFs can be
   directly connected in the SFC forwarding layer).

   In general, MEG End Points (MEPs) are the source and sink points of
   a MEG for SFC OAM.

3.1. Continuity Check and Connectivity Verification of SFC

   Proactive Continuity Check (CC) can be used to detect a loss of
   continuity defect between two MEPs in a MEG.

   Proactive Connectivity Verification (CV) can be used to detect an
   unexpected connectivity defect between two MEGs or unexpected
   connectivity within the MEG with an unexpected MEP.

   BFD can also be used as a tool for proactive CC and CV in SFC, in
   which case BFD Control packets must be sent along the same path as
   the monitored SFC path.

3.1.1. MEP sending an SFC CC-CV packet

   A source MEP can proactively send CC-CV packets periodically to its
   sink peer MEP.

   An SFC CC-CV packet is an SFC CC-CV message encapsulated with an
   SFC header. The SFC header is set as described in
   [I-D.niu-sfc-mechanism] and its O flag MUST be set to 1. The SFC
   OAM message is set as follows:

   - The Message Type MUST be set to 1.

   - The Originator Handle is set by the original sender, and MUST be
     set to the sender's identifier.

   - The Sequence Number is set to a random value.

3.1.2. MEP terminating an SFC CC-CV packet

   A sink MEP detects a loss of continuity defect when it fails to
   receive proactive CC-CV OAM packets from the source MEP for a
   consecutive period of time.

   When a CC-CV packet is received by a sink MEP, it is parsed. If a
   mis-connectivity defect is detected, a warning should be raised and
   the fault management system should be notified of the detected
   defect.

3.2.
SFC Route Tracing

   According to the SFC architecture described in Figure 2 of
   [I-D.jiang-sfc-arch] and Figure 2 of [I-D.quinn-sfc-arch], SFC can
   be categorized into two abstraction layers: the service function
   layer and the SFC forwarding layer. In the service function layer,
   a service function chain is actually a service function graph, in
   which each service function is connected to the next one in
   sequence. In the SFC forwarding layer, service functions are
   further attached to SFE/SFF nodes and thus form a more detailed
   forwarding graph. As defects can be located on either service
   functions or SFE/SFF nodes, it is critical to trace the route
   through both service functions and SFE/SFF nodes in order to detect
   and isolate any defects in an SFC.

   To trace the route of a service function chain, different layers of
   the chain can be monitored:

   o Service function layer: only SF identifiers can be set as the
     destination MEP in the trace route request and response messages.
     The trace routing operation collects the identifiers of all SFs
     along an SFC path. By comparing this SF list with the
     pre-configured service function graph, an operator can determine
     whether there is any fault in the SF connectivity and, if so,
     locate the defect on an SF.

   o SFC forwarding layer: both SF identifiers and SFE/SFF identifiers
     can be set as the destination MEP in the trace route request and
     response messages. The trace routing operation collects the
     identifiers of all SFs and SFEs/SFFs along an SFC path. By
     comparing this SF and SFE/SFF list with the pre-configured SFC
     forwarding graph, an operator can determine whether there is any
     fault in the forwarding layer and locate the defect on an SFE/SFF
     or an SF.
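
   The comparison step described in both cases above can be sketched
   as follows. This is only an illustrative Python example, not part
   of the specification: the function name locate_fault, the use of
   plain strings as identifiers, and the representation of a path as
   an ordered list are all assumptions made for the sketch.

```python
def locate_fault(expected_path, observed_path):
    """Compare the pre-configured path with the traced identifiers.

    Returns None when both lists match. Otherwise returns a tuple
    (index, expected_id, observed_id) describing the first position
    where they diverge; a missing entry is reported as None.
    """
    for index, expected_id in enumerate(expected_path):
        if index >= len(observed_path):
            # The trace ended early: nothing was collected at this hop.
            return (index, expected_id, None)
        if observed_path[index] != expected_id:
            # A different node answered than the one configured.
            return (index, expected_id, observed_path[index])
    if len(observed_path) > len(expected_path):
        # The trace collected more hops than the configured path has.
        index = len(expected_path)
        return (index, None, observed_path[index])
    return None
```

   For a service-function-layer trace the lists would hold SF
   identifiers only; for an SFC-forwarding-layer trace they would hold
   the interleaved SF and SFE/SFF identifiers.
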
   Furthermore, two different mechanisms may be used to trace the
   route of a service function chain:

   o TTL mechanism

     Similar to IP traceroute, the detection node launches a number of
     trace request messages in sequence to detect a fault on a
     specific path; the TTL of the request messages is set
     successively to 1, 2, and so on. The trace route request passes
     the SFs along the service function graph, and each SF decreases
     the TTL value by 1. A trace route reply message is generated and
     sent back to the launcher when the resulting TTL is equal to
     zero. In this way, the launcher of the trace routing can obtain
     the list of SFs that the trace route request messages pass by
     parsing all the trace route reply messages, and can isolate the
     fault location if there is one.

   o Record route mechanism

     The detection node launches a single trace route request message,
     and this message is transported over the specific SFC path. When
     the trace route request message is received by an SF in the SFC
     path, the SF adds its SF identifier to the end of an SF list
     carried in the message. Moreover, a trace route reply message
     should be generated and sent back to the launcher, and the new
     record route SF list MUST be copied to the trace route reply
     message. In this way, the launcher of the trace routing can
     obtain the list of SFs that the trace route request message
     passes by parsing all the trace route reply messages, and can
     isolate the fault location if there is one.

3.2.1. MEP sending an SFC Trace Request

   In general, MEG End Points (MEPs) are the source and sink points of
   a MEG for SFC OAM.

   An MEP initiates a trace route request packet to detect and track
   any fault in a Service Function Chain.

   An SFC trace route request packet is an SFC trace route request
   message encapsulated with an SFC header. The SFC header is set as
   described in [I-D.niu-sfc-mechanism] and the O flag MUST be set to
   1.
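
   The two trace mechanisms described earlier in Section 3.2 (TTL and
   record route) can be illustrated with a small simulation. The
   Python sketch below is an assumption-laden toy model, not a
   protocol implementation: the path is a list of SF identifiers, a
   fault is modeled as a hop that silently drops packets (the
   hypothetical parameter broken_at), and the function names and the
   max_ttl limit are invented for the example.

```python
def ttl_trace(path, broken_at=None, max_ttl=16):
    """TTL mechanism: send requests with TTL 1, 2, ... and collect
    the identifier of the SF at which the TTL reaches zero."""
    discovered = []
    for ttl in range(1, max_ttl + 1):
        remaining = ttl
        replier = None
        for index, sf in enumerate(path):
            if index == broken_at:
                break                  # request lost at the faulty hop
            remaining -= 1             # each SF decreases the TTL by 1
            if remaining == 0:
                replier = sf           # this SF sends the trace reply
                break
        if replier is None:
            break                      # no reply: stop probing
        discovered.append(replier)
    return discovered


def record_route_trace(path, broken_at=None):
    """Record route mechanism: one request; every SF appends its
    identifier and replies with a copy of the list so far."""
    record = []
    replies = []
    for index, sf in enumerate(path):
        if index == broken_at:
            break                      # request lost at the faulty hop
        record.append(sf)
        replies.append(list(record))   # reply carries a copy of the list
    return replies
```

   In both models the last identifier reported to the launcher is the
   last SF reached, so a fault lies just beyond it. The TTL variant
   sends one request per hop, whereas the record route variant sends a
   single request but requires the message to carry a growing list.
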
   The SFC OAM message is further set as follows:

   - The Message Type MUST be set to 2.

   - The Originator Handle MUST be set to the sender's identifier.

   - The Receiver's Handle can be set to the exit SFE/SFF's
     identifier.

3.2.2. SFE/SFF processing an SFC Trace Route Request

   When an SFE/SFF receives a trace route request packet with the O
   flag set in the SFC header, it first adds its identifier to the end
   of the record route list in the trace request. It then performs the
   service forwarding function and sends the new trace route request
   packet to the next SF or the next SFE/SFF. Furthermore, the SFE/SFF
   sends a trace reply packet back to the source MEP with a copy of
   the new record route SF list.

3.2.3. Service Function treating an SFC Trace Request

   An SF can only be configured as an MIP in an MEG of SFC.

   When an SF (being an MIP) receives a trace request packet with the
   O flag set in the SFC header from an SFE/SFF, it simply sends the
   packet back to the SFE/SFF transparently.

3.2.4. MEP receiving an SFC Trace Reply

   An MEP should only process an SFC trace reply packet that is a
   response to an SFC trace request it has sent. Thus, upon receipt of
   an SFC trace reply packet, an MEP should try to match the trace
   reply packet with a trace request that it has previously sent, by
   checking the corresponding path identifier and Sequence Number in
   the SFC OAM packets. If no match is found, the MEP MUST drop the
   trace reply packet silently.

   Since each SFE/SFF in the SFC path sends a trace reply packet when
   the trace request packet passes it, a source MEP will receive a
   sequence of trace reply packets from the SFEs/SFFs (other than the
   MEP itself) along the SFC path. Thus, the source MEP can obtain the
   full service topology and SFC path if there is no defect in the SFC
   data plane, and can detect and locate data plane defects if there
   are any.

4.
Security Considerations

   Security considerations will be addressed in a future revision.

5. IANA Considerations

   IANA considerations will be addressed in a future revision.

6. References

6.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

6.2. Informative References

   [sfc-ps]  Quinn, P. and T. Nadeau, "Service Function Chaining
             Problem Statement", Work in Progress, April 2014.

   [I-D.jiang-sfc-arch]
             Jiang, Y. and H. Li, "An Architecture of Service Function
             Chaining", Work in Progress, February 2014.

   [I-D.niu-sfc-mechanism]
             Niu, L., Li, H., and Y. Jiang, "A Service Function
             Chaining Header and its Mechanism", Work in Progress,
             March 2014.

   [I-D.quinn-sfc-arch]
             Quinn, P. and J. Halpern, "Service Function Chaining
             (SFC) Architecture", Work in Progress, May 2014.

7. Acknowledgments

   TBD

Authors' Addresses

   Yuanlong Jiang
   Huawei Technologies Co., Ltd.
   Bantian, Longgang district
   Shenzhen 518129, China
   Email: jiangyuanlong@huawei.com

   Weiping Xu
   Huawei Technologies Co., Ltd.
   Bantian, Longgang district
   Shenzhen 518129, China
   Email: xuweiping@huawei.com

   Zhen Cao
   China Mobile
   Xuanwumenxi Ave, Xuanwu District
   Beijing 100053, China
   Email: caozhen@chinamobile.com