                                                          Neil Harrison 
   Internet Draft                                          Peter Willis 
   Document: draft-harrison-mpls-oam-00.txt             British Telecom 
   Expires: August 2001                                                 
                                                         Shahram Davari 
                                                             PMC-Sierra 
                                                                        
                                                         Ben Mack-Crane 
                                                                Tellabs 
                                                                        
                                                           Hiroshi Ohta 
                                                                    NTT 
                                                                        
                                                          February 2001 
    
    
                    OAM Functionality for MPLS Networks 
    
    
Status of this Memo 
    
   This document is an Internet-Draft and is in full conformance 
   with all provisions of Section 10 of RFC2026. 
    
    
   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups.  Note that      
   other groups may also distribute working documents as Internet-
   Drafts. 
    
   Internet-Drafts are draft documents valid for a maximum of six 
   months and may be updated, replaced, or obsoleted by other documents 
   at any time.  It is inappropriate to use Internet-Drafts as 
   reference material or to cite them other than as "work in progress." 
    
   The list of current Internet-Drafts can be accessed at 
        http://www.ietf.org/ietf/1id-abstracts.txt 
   The list of Internet-Draft Shadow Directories can be accessed at 
        http://www.ietf.org/shadow.html. 
    
    
Copyright Notice 
    
   Copyright(C) The Internet Society (2001). All Rights Reserved. 
    
    
Abstract 
    
   This Internet draft provides requirements and mechanisms for OAM 
   (Operation and Maintenance) for the user-plane in MPLS networks. A 
   connectivity verification "CV" OAM packet is defined, which is 
   transmitted periodically from LSP source to LSP sink. The CV flow 
   could be used to detect defects related to misrouting of LSPs as 
   well as link and nodal failure, and if required to trigger 
   protection switching to the protection path. 
     
   Harrison et.al        Expires August 2001                   Page 1 
                 OAM Functionality for MPLS Networks    February 2001 
    
    
   A forward defect indicator "FDI" and a backward defect indicator 
   "BDI" are defined, which carry the defect type and location to the 
   near end and far end respectively. At every LSP terminating node, 
   the FDI is mapped from server layer to client layer. By doing so FDI 
   could suppress the alarm storm, and let the appropriate layer take 
   control of protection switching. BDI is used by LSP source to start 
   or stop the QoS aggregation, depending on whether the LSP is in 
   available or unavailable state. The criteria for entry and exit to 
   the available and unavailable states are also defined in this 
   document. 
 
 
Table of Contents 
    
   1.    Introduction..................................................3 
   2.    Definitions...................................................4 
   3.    Symbols and Abbreviations.....................................5 
   4.    Requirements for MPLS OAM.....................................5 
   5.    Principles of OAM Function....................................6 
   5.1   Client/Server Recursion-Layering..............................6 
   5.2   OAM Functionality and Layer Independence......................7 
   5.3   Defects.......................................................7 
   5.4   Availability..................................................7 
   5.5   Decoupling of User behavior from Connectivity Assessment......8 
   5.6   Forward and Backward Defect Indicators........................8 
   5.7   Connectivity Verification.....................................9 
   5.8   Customers Should not be Used as Defect Detectors.............10 
   5.9   The Reliability of OAM Functionality Under Fault Conditions..10 
   6.    Mechanisms of MPLS OAM.......................................10 
   6.1   Special MPLS Label Values....................................10 
   6.2   Handling of Errored OAM Packets..............................10 
   6.3   Label Stack Overhead Encoding Rules for OAM Packets..........11 
   6.3.1 For CV OAM Packets...........................................11 
   6.3.2 For P OAM Packets............................................12 
   6.3.3 For FDI and BDI OAM Packets..................................12 
   6.3.4 MPLS OAM Function Types for the OAM Alert Label..............13 
   6.4   MPLS OAM Packets.............................................14 
   6.4.1 Connectivity Verification (CV) Packets.......................15 
   6.4.2 Performance "P" Packets......................................16 
   6.4.3 Forward Defect Indicator "FDI" Packets.......................16 
   6.4.4 Backward Defect Indicator "BDI"..............................17 
   6.5   Defect Types and their Entry/Exit Criteria...................18 
   6.5.1 Defect Type Codepoints.......................................18 
   6.5.2 dLOCV Entry Criteria.........................................20 
   6.5.3 dTTSI Entry Criteria.........................................21 
   6.5.4 dLoop Entry Criteria.........................................21 
   6.5.5 dLOCV, dTTSI and dLoop exit criteria.........................22 
   6.6   Available and unavailable state processing...................23 
   6.6.1 Short Break definition.......................................23 
   6.6.2 Available/Unavailable State Definition.......................24 
   6.6.3 Near-end and Far-end Measurements of Availability............24 
   6.6.4 Near-End State Processing Flow-chart.........................25 
   6.6.5 Far-End State Processing Flow-chart..........................27 
   6.6.6 A pictorial view of near-end and far-end state processing....28 
   7.    Security Considerations......................................29 
   8.    References...................................................29 
   9.    Author's Addresses...........................................29 
    
    
   Conventions used in this document 
    
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in 
   this document are to be interpreted as described in RFC-2119 [1]. 
    
    
1. Introduction 
    
   This Internet draft provides requirements and mechanisms for OAM 
   (Operation and Maintenance) for the user-plane in MPLS networks. It 
   is recognized that OAM functionality is important in public networks 
   for ease of network operation, for verifying network performance and 
   to reduce operational costs. OAM functionality is especially 
   important for networks that are required to deliver (and hence be 
   measurable against) QoS (Quality of Service) and availability 
   performance parameters/objectives. 
    
   A connectivity verification "CV" OAM packet is defined in this 
   document, which is transmitted periodically from LSP source to LSP 
   sink. The CV flow could be used to detect defects related to 
   misrouting of LSPs as well as link and nodal failure, and if 
   required to trigger protection switching to the protection path. A 
   forward defect indicator "FDI" and a backward defect indicator 
   "BDI" are defined, which carry the defect type and location to the 
   near end and far end respectively. At every LSP terminating node, 
   the FDI is mapped from server layer to client layer. By doing so FDI 
   could suppress the alarm storm, and let the appropriate layer take 
   control of protection switching. BDI is used by LSP source to start 
   or stop the QoS aggregation, depending on whether the LSP is in 
   available or unavailable state. The criteria for entry and exit to 
   the available and unavailable states are also defined in this 
   document. 
    
   The OAM functionality defined herein is limited to point-point LSP 
   tunnels. OAM functionality for multipoint-point and point-multipoint 
   LSP tunnels is FFS. 
    
     
    
    
2. Definitions 
    
   This document introduces some new terminology, which is required to 
   discuss the functional network components associated with OAM. 
    
    
       Functional Architecture                    Meaning 
       Term                         
       ------------------                   ------------------ 
        
       Client/server               A term referring to the transparent 
       (relationship between       transport of a client (ie higher) 
       layer networks)             layer link connection by a server 
                                   (ie lower) layer network trail. 
                                    
       Link connection             A partition of a layer N trail that 
                                   exists between two logically 
                                   adjacent switching points within the 
                                   layer N network. 
                                    
       LSP Tunnel                  An LSP Tunnel is an LSP with well-
                                   defined source (ingress point) and 
                                   sink (egress point) 
                                    
       Subnetwork                  A subnetwork is a contiguous 
                                   topological region of a network 
                                   delimited by its set of peripheral 
                                   access points, and is characterized 
                                   by the possible routing across the 
                                   subnetwork between those access 
                                   points.  A network is the largest 
                                   subnetwork and a node is the 
                                   smallest subnetwork (at least in 
                                   practical physical terms, though 
                                   there are smaller sub-networks 
                                   within nodes). 
                                    
       Trail                       A generic transport entity at layer 
                                   N which is composed of a client 
                                   payload (which can be a packet from 
                                   a client at higher layer N-1) with 
                                   specific overhead added at layer N 
                                   to ensure the forwarding integrity 
                                   of the server transport entity at 
                                   layer N. 
                                    
       Trail termination point     A source or sink point of a trail at 
                                   layer N, at which the trail overhead 
                                   is added or removed respectively.  A 
                                   trail termination point must have a 
                                   unique means of identification 
                                   within the layer network. 
                                    
     
    
    
    
3. Symbols and Abbreviations 
    
   This list is not exhaustive of all the abbreviations used in this 
   draft.  In particular, those in common usage within the MPLS 
   community (like 'MPLS' itself) have been excluded. 
    
    
       Abbreviation               Meaning 
    ---------------    ---------------------------- 
       AIS             Alarm Indication Signal 
        
       BDI             Backward Defect Indication 
                        
       CV Packet       Connectivity Verification Packet 
                        
       FDI             Forward Defect Indication 
                        
       FFS             For Further Study 
                        
       OAM             Operations and Maintenance 
                        
       P Packets       Performance Packets 
                        
       QoS             Quality of Service 
                        
       SLA             Service Level Agreement 
                        
       TTSI            Trail Termination Source Identifier 
                        
    
    
4. Requirements for MPLS OAM 
    
   MPLS layer OAM functionality is not a substitute for physical or 
   server layer OAM (e.g., SDH/SONET) or client layer OAM (e.g., IP). 
   MPLS LSPs create layer networks in their own right, and will have 
   defects that are only relevant to the MPLS LSP layer networks.   
    
   OAM functionality is useful because: 
    
   1)   It allows the Operator to verify whether Quality of Service 
        guarantees given in SLAs (Service Level Agreements) are in fact 
        being met by the connection. 
   2)   It allows the Operator to reduce the network's operating costs 
        by allowing more efficient detection and handling of defects. 
        Long-term statistics show that the costs of operating a public 
        network are higher than the initial installation costs. 
   3)   It gives support for improved accounting/billing procedures. 
   4)   It helps provide security for customer traffic by the detection 
        of traffic mis-connections (which may otherwise be 
        undetectable). 
    
     
    
   The following functions are required: 
    
   1)   Connectivity Verification of LSPs to confirm that defects do 
        not exist on the target LSPs. 
   2)   Fast and efficient defect detection, notification and 
        localization. 
   3)   Measurement of availability performance. 
    
   The necessity of additional functions is for further study. In 
   particular, the need for in-service measurement of LSP QoS 
   performance (measurement of packet losses, spurious packets, errored 
   packets, delay and delay variation) is for further study. Note that 
   an LSP needs to be in the available state for QoS assessment to be 
   valid. 
    
   Defects include the following cases: 
    
   1)   Simple loss of LSP connectivity (due to a server layer failure 
        or a failure within the MPLS layer network); 
   2)   Swapped LSP trails; 
   3)   Unintended LSP mismerging (of 2 or more LSP trails); 
   4)   Unintended replication of LSP packets (of the same LSP trail 
        due, for example, to routing loops). 
    
    
5. Principles of OAM Function 
    
   The following principles can, for the most part, be applied to any 
   layer networks, ie not just MPLS. This document defines 
   specific embodiments of these principles, as functional OAM 
   entities, for MPLS layer networks. Although it is recommended that 
   all the OAM functional entities are deployed network-wide, operators 
   are free to choose if they wish to apply all or only some of these 
   OAM functional entities (e.g. CV flows but not P flows), and whether 
   deployment is network-wide or limited in scope to LSPs of certain 
   types, e.g. apply only to important LSPs such as those supporting 
   VPNs. In cases of limited OAM functional entity deployment or scope, 
   then operators should be aware that there could be deficiencies in 
   their ability to detect/handle certain defect cases. 
    
    
5.1 Client/Server Recursion-Layering 
    
   A very important functional architecture feature of layer networks 
   is client/server recursion (also known as layering). That is, a 
   client layer link connection (ie a partition of a longer client 
   layer trail between two logically adjacent client layer nodes) is 
   created by a server layer trail. This is the basis of client layer 
   topology construction. This recursion principle extends between 
   various client/server layer relationships and ultimately 'to the 
   duct'. Note also that client layer link connections can be multiple 
   in number, ie a single server layer trail entity can support a 
   multiple number of client layer link connections. 
    
     
    
   The key points to note here are: 
    
   (1)  The client and server layer trail termination points will 
        generally not be congruent.  And since the trail termination 
        points are associated with the addressable access points of a 
        layer network, it follows that the addressing of the two layers 
        will also generally not be congruent. 
   (2)  The 'duct' (or more precisely the environment of physical 
        occupancy and connectivity) is the lowest layer network. The 
        degree of connectivity in this layer effectively defines the 
        degree of independent connectivity in all client layers. This 
        could be put another way, by saying that the availability 
        performance of any client layer network design is determined 
        by (and inherited from) the physical infrastructure. This means 
        that if one cannot state which link connections have a common 
        lower server layer trail, then one cannot say anything with 
        certainty about the resilience design of a client layer 
        network. 
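
   As an illustration of point (2), the following Python sketch groups 
   client layer link connections by the server layer trail that 
   carries them, and reports every trail shared by more than one link 
   connection. The inventory, the link and trail names, and the 
   function name are hypothetical; none of them is defined by this 
   document. 

```python
from collections import defaultdict


def shared_server_trails(link_to_trail):
    """Return server trails that carry more than one client link connection.

    link_to_trail maps a client layer link connection name to the name of
    the server layer trail that supports it (hypothetical inventory data).
    """
    groups = defaultdict(list)
    for link, trail in link_to_trail.items():
        groups[trail].append(link)
    # Keep only the trails that are a shared risk for several client links.
    return {trail: sorted(links)
            for trail, links in groups.items() if len(links) > 1}


# Hypothetical inventory: the working and protection links were meant to be
# diverse, but both ride the same server layer trail.
inventory = {
    "A-B working": "trail-1",
    "A-B protection": "trail-1",
    "A-C": "trail-2",
}
print(shared_server_trails(inventory))
```

   If the mapping from link connection to server layer trail is not 
   known, no such check is possible, which is the limitation that 
   point (2) describes. 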
    
    
5.2 OAM Functionality and Layer Independence 
     
   The OAM functionality of a layer network must not be dependent on 
   any specific server or client layer technology. This is critical to 
   ensure that layer networks can evolve (or new/old layer networks be 
   added/removed) without impacting other layer networks. 
    
   The control-plane of a given layer network must also have its own 
   OAM. 
    
   [Note - Control-plane OAM is outside the scope of this draft.] 
    
    
5.3 Defects 
    
   All the major defect conditions must be identified with in-service 
   measurable entry and exit criteria, and all consequent actions must 
   be specified.  The entry and exit criteria of various defects should 
   be temporally harmonized as far as possible to simplify trail 
   defect-state processing.  Attention should be paid to relating the 
   defect entry/exit criteria to 'short-breaks', which are generally 
   accepted by many operators as 3-9s periods of gross signal 
   disturbance from which the network may self-recover. An event that 
   lasts for >=10s crosses the normally accepted threshold for entering 
   the unavailable state (see also Section 5.4). 
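
   The two thresholds mentioned above can be sketched in Python as 
   follows. This is only an illustration: the function name, the 
   constant names and the label "transient" for disturbances shorter 
   than 3s are assumptions, and the normative entry/exit criteria are 
   those given later in this document. 

```python
SHORT_BREAK_MIN_S = 3    # onset of a short-break (3-9s range in the text)
UNAVAILABLE_MIN_S = 10   # normally accepted threshold for unavailability


def classify_disturbance(duration_s):
    """Classify a period of gross signal disturbance by its duration."""
    if duration_s >= UNAVAILABLE_MIN_S:
        return "unavailable"
    if duration_s >= SHORT_BREAK_MIN_S:
        return "short-break"
    # Label chosen for this sketch; the text does not name this case.
    return "transient"


for seconds in (1, 3, 9, 10):
    print(seconds, classify_disturbance(seconds))
```

   The sketch treats any duration from 3s up to (but not including) 
   10s as a short-break, which is one reading of the "3-9s" range when 
   durations are not whole seconds. 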
    
    
5.4 Availability 
    
   The most important performance metric of a trail (or a subnetwork 
   partition thereof) is availability.  This means that the entry and 
   exit criteria for the available state must be defined. It is also 
   important to understand how unavailable/available state transitions 
   relate to the stopping/starting of the aggregation of available 
   state QoS metrics; noting that from pragmatic considerations this 
   may be effectively applied at an earlier point to preserve the 
   integrity of the available state metrics, e.g. after 3s say, which 
   marks the onset of (at least) a short-break, and which from 
   operational experience is a good practical rule-of-thumb for setting 
   a point beyond which a network is unlikely to self-recover. 
    
    
5.5 Decoupling of User behavior from Connectivity Assessment 
    
   User traffic behavior must not be a factor in connectivity status 
   assessment. In practical terms, this means decoupling user traffic 
   behavior from all defects and (the dependent) available state 
   entry/exit criteria. 
    
    
5.6 Forward and Backward Defect Indicators 
    
   The node in the layer network that first detects a defect (sourced 
   from within that layer), should apply a well-known 'Forward Defect 
   Indication' (FDI) signal in the downstream direction. In the 
   majority of current transport network technologies such a signal has 
   been termed AIS (Alarm Indication Signal). At the trail termination 
   point where the appropriate FDI signal is generated: 
    
   (1)  There should be a complementary Backward Defect Indication 
        (BDI) signal (which is removed at the upstream trail 
        termination point) and 
   (2)  There must be a mapping of the FDI signal from the server layer 
        to the appropriate FDI signal of the client layer(s) as part of 
        the server->client adaptation process. 
    
   The primary purpose of the FDI signal is to suppress client layer 
   alarms (which would otherwise create an 'alarm storm' in places 
   which could be geographically and organizationally far removed from 
   the originating defect source location). 
    
   Three secondary purposes of FDI (and in some cases BDI) are: 
    
   (1)  To allow correct processing of available state performance 
        metrics. 
   (2)  To inform applications that the connection is no longer 
        functioning correctly and to take appropriate action, e.g. 
        perhaps invoke a 're-connect' action, or in the case of voice 
        perhaps mute the speech path. 
   (3)  To inform client layer trails (e.g. nested LSPs in the case of 
        MPLS) that a defect has occurred in a lower server layer trail, 
        and hence to provide some indication that protection-switching 
        in the affected client layer trails could be postponed to give 
        the server layer trail an opportunity to effect protection 
        switching. 
    
   FDI/BDI signals should also provide information on the defect 
   location and type. Such information is very useful to the lead 
   operator in a co-operating domain scenario, and can also 
   differentiate failures that are internal or external to public and 
   private domains. 
    
   Note that, if being used, the BDI signal must be generated (in the 
   backward direction) in response to detecting a defect at a trail 
   sink termination point (in the forward direction) and not from some 
   intermediate point, such as where the defect might be actually 
   located. The reasons for this are that: 
    
   (1)  In the case of bi-directional trails and unidirectional 
        defects, each trail direction might not be congruently routed. 
   (2)  In the case of unidirectional trails the BDI signal may be 
        provided out-of-band, e.g. perhaps via a control-plane or 
        management-plane mechanism. [Note: The exact means for 
        providing the BDI functionality in this case is FFS.] 
    
   The above requirements mean that the FDI/BDI architecture is valid 
   for all routing cases. 
    
    
5.7 Connectivity Verification 
    
   An essential characteristic of the trails in a layer network is that 
   their trail termination points must have a unique identifier (at 
   least within that layer network). However, on link connections 
   between nodes within the layer network, relative identifiers are 
   commonly used for traffic forwarding. These relative identifiers 
   only have to be unique per interface, e.g. the VPI/VCI of ATM, the 
   DLCI of FR, the 'label' of MPLS. 
    
   When relative identifiers are used for traffic forwarding there is a 
   possibility of trail misconnectivity due to defects.  These cover a 
   variety of connectivity failure modes, including: 
    
   1)   Simple loss of continuity (due to a server layer failure or a 
        failure within the layer network considered); 
   2)   Swapped connections; 
   3)   Unintended mismerging (of 2 or more trails); 
   4)   Unintended replication (of the same trail due, for example, to 
        routing loops). 
    
   Although some of these defects may be rare in practice, unless 
   detected/corrected their consequences can be very severe for an 
   operator; ranging from simple availability/QoS SLA violations 
   through to more serious security, censorship and mis-billing 
   implications. 
    
   It is therefore required that a unique trail source identifier be 
   periodically transmitted from the trail source to the trail sink to 
   detect these types of defect. 
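
   The following Python sketch shows how a trail sink might use the 
   received source identifiers to distinguish the failure modes listed 
   above. It assumes, purely for illustration, that the sink collects 
   the identifiers seen during one observation interval in which 
   exactly one packet is expected; the function name, the interval 
   model and the result strings are hypothetical, and the normative 
   defect criteria are defined in Section 6.5. 

```python
def assess_connectivity(expected_id, received_ids):
    """Classify one observation interval at the trail sink.

    expected_id  -- identifier of the trail source the sink expects
    received_ids -- list of source identifiers seen in the interval
    """
    expected = [i for i in received_ids if i == expected_id]
    unexpected = [i for i in received_ids if i != expected_id]
    if not received_ids:
        return "loss of continuity"
    if unexpected and not expected:
        return "swapped connection"
    if unexpected and expected:
        return "mismerge"
    if len(expected) > 1:
        # More copies than the one expected in this interval.
        return "possible replication"
    return "ok"


print(assess_connectivity("src-A", ["src-A", "src-B"]))
```

   Because the check relies only on the periodic identifier flow, it 
   does not depend on the presence or behavior of user traffic. 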
    
    
     
    
5.8 Customers Should not be Used as Defect Detectors 
    
   The OAM tools provided should ensure (as far as reasonably 
   practicable) that customers do not have to act as failure 
   detectors for the operator. 
    
    
5.9 The Reliability of OAM Functionality Under Fault Conditions 
    
   Under fault conditions a layer network cannot, by definition, be 
   expected to behave in a predictable manner. Therefore care should be 
   exercised when specifying and using OAM functions that require a 
   layer network to function in a reliable and predictable manner for 
   fault diagnosis. 
    
    
6. Mechanisms of MPLS OAM 
    
    
6.1 Special MPLS Label Values 
    
   The label structure defined in [1] indicates a single label field of 
   20 bits.  Label field values 0-3 have already been reserved for 
   special functions. A special label, the 'OAM Alert Label', is 
   defined as follows: 
    
                        Table 1: OAM Alert Label 
    
        Label value                           
         (Decimal)                        Meaning 
        ------------              ----------------------- 
             4           OAM Alert Label.  This indicates that the 
                         first octet following the OAM Alert Label 
                         in the OAM payload (ie octet 5) is an OAM 
                         Function Type field whose value defines 
                         the type of defect handling OAM function 
                         (ie CV, P, FDI or BDI), which follows in 
                         the payload area. 
    
         [Note: this value is yet to be officially assigned by IANA] 
    
    
   All OAM packets must have a minimum payload length of 40 octets to 
   facilitate ease of processing.  This is achieved by padding with all 
   0s when necessary. All padding bits are reserved for future operator 
   defined usage.   
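
   The padding rule can be sketched in Python as shown below. The 
   function and constant names are hypothetical, and the Function Type 
   value used in the example is a placeholder rather than an assigned 
   codepoint. The sketch only shows padding up to the minimum length; 
   where the padding sits relative to the other payload fields is 
   determined by the packet formats in Section 6.4. 

```python
MIN_OAM_PAYLOAD_OCTETS = 40


def pad_oam_payload(payload: bytes) -> bytes:
    """Pad an OAM payload with all-zero octets up to the 40-octet minimum.

    Payloads that are already 40 octets or longer are returned unchanged.
    """
    return payload.ljust(MIN_OAM_PAYLOAD_OCTETS, b"\x00")


# Placeholder Function Type octet (not an assigned value) with no other fields.
padded = pad_oam_payload(bytes([0x01]))
print(len(padded))
```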
    
    
6.2 Handling of Errored OAM Packets 
    
   Each OAM packet uses a BIP16 (in the last two octets of the OAM 
   payload area) to detect errors.  The BIP16 is computed over all the 
   fields of the OAM payload, including the initial octet, which 
   specifies the Function Type and the BIP16 bit positions (which are 
   all pre-set to zero for initial calculation purposes). 
    
   BIP16 processing must be performed on all OAM packets prior to being 
   able to reliably pass their payload for further processing.  Any OAM 
   packets that show a BIP16 violation upon reception processing should 
   be discarded. 
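
   A minimal Python sketch of this processing follows. It assumes that 
   the BIP16 is even bit-interleaved parity taken over the payload as 
   a sequence of 16-bit words in network byte order, and that the 
   payload has an even number of octets; this section does not spell 
   out either detail, so both are assumptions of the sketch, as are 
   the function names. 

```python
def bip16(payload: bytes) -> int:
    """XOR all 16-bit words of the payload (even parity per bit position)."""
    if len(payload) % 2:
        raise ValueError("payload must be an even number of octets")
    parity = 0
    for i in range(0, len(payload), 2):
        parity ^= int.from_bytes(payload[i:i + 2], "big")
    return parity


def seal(body: bytes) -> bytes:
    """Append a BIP16 computed with the BIP16 positions pre-set to zero."""
    return body + bip16(body + b"\x00\x00").to_bytes(2, "big")


def verify(payload: bytes) -> bool:
    """Recompute the BIP16 with its own field zeroed and compare."""
    body, received = payload[:-2], int.from_bytes(payload[-2:], "big")
    return bip16(body + b"\x00\x00") == received


packet = seal(bytes([0x01]) + bytes(37))          # 38 octets + 2-octet BIP16
corrupt = bytes([packet[0] ^ 0x80]) + packet[1:]  # flip one bit
print(verify(packet), verify(corrupt))
```

   A receiver following the rule above would discard the second 
   packet, because its recomputed parity no longer matches the value 
   carried in the last two octets. 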
    
   In the case of the CV packet flow, persistent BIP16 violations will 
   cause a Loss of Connectivity Verification; this defect is defined 
   later, but for now we can note that it would occur after nominally 
   3s.  This behavior is consistent with the nature of the defect.   
   However, it is recommended that at a local equipment level some 
   notification is given to the Network Management System to indicate 
   that BIP16 discards are occurring. 
    
   In the case of the other OAM packet types, ie the FDI, BDI and P 
   packets (these are defined later), it is again recommended that at a 
   local equipment level some indication is given to the Network 
   Management System that BIP16 discards are occurring.  The threshold 
   to be used for recording/reporting such BIP16 discard activity for 
   these OAM packets should be programmable, and is outside the scope 
   of this document. 
    
    
6.3 Label Stack Overhead Encoding Rules for OAM Packets 
    
    
6.3.1   For CV OAM Packets 
    
   CV OAM packets are differentiated from normal user-plane traffic by 
   an increase of one in the label stack depth at a given LSP level at 
   which they are inserted. Therefore, they maintain this label stack 
   difference of one (from normal user-plane traffic) as they traverse 
   any lower layer server LSPs. 
    
   The OAM Alert Labeled header is added before (ie below) the normal 
   user-plane forwarding labeled header at the LSP trail source point.  
   The S bit is set only in the OAM Alert Label. 
    
   The CV OAM packet can be used on both E-LSPs and L-LSPs. However, 
   the coding of the EXP field is different in the two cases. 
   In the case of L-LSPs, the coding of the EXP field should be set to 
   all 0s in both the OAM Alert Labeled header and the preceding normal 
   user-plane forwarding header.  This is to ensure the CV OAM packets 
   have a Per Hop Behavior (PHB) that ensures the lowest drop 
   probability [2]. 
    
   In the case of E-LSPs, the coding of the EXP field should be set to 
   all 0s in the OAM Alert Labeled header and to whatever is the 
   'minimum loss-probability PHB' in the preceding normal user-plane 
   forwarding header for that E-LSP.  This is again to ensure the CV 
   OAM packets have a PHB that ensures the lowest drop probability 
   [2]. 
     
    
    
   The TTL field should be set to 1 in the OAM Alert Labeled header.  
   The reasons for this are: 
    
   -    CV OAM packets should never travel beyond the LSP trail 
        termination sink point at the LSP level they were originally 
        generated (noting that they are not examined by intermediate 
        label-swapping LSRs, and are only observed at LSP sink points), 
        and 
   -    The TTL of the immediately prior normal user-plane forwarding 
        header is used to mitigate damage from looping packets. 
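
   The encoding rules of this section can be sketched in Python as 
   follows, using the usual 32-bit label stack entry layout (20-bit 
   label, 3-bit EXP, S bit, 8-bit TTL). The function names and the 
   example forwarding label, EXP and TTL values are hypothetical. For 
   an L-LSP the caller would pass an EXP of 0 for the forwarding 
   header; for an E-LSP it would pass whatever EXP value selects the 
   minimum loss-probability PHB on that E-LSP. 

```python
import struct

OAM_ALERT_LABEL = 4  # value proposed in Table 1, not yet assigned by IANA


def label_stack_entry(label, exp, s, ttl):
    """Pack one label stack entry: label(20) | EXP(3) | S(1) | TTL(8)."""
    if not (0 <= label < 1 << 20 and 0 <= exp < 8
            and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    return struct.pack("!I", (label << 12) | (exp << 9) | (s << 8) | ttl)


def cv_label_stack(forwarding_label, forwarding_exp, forwarding_ttl):
    """Forwarding header first, then the OAM Alert Labeled header below it."""
    forwarding = label_stack_entry(forwarding_label, forwarding_exp,
                                   0, forwarding_ttl)   # S bit clear
    oam_alert = label_stack_entry(OAM_ALERT_LABEL, 0, 1, 1)  # EXP 0, S 1, TTL 1
    return forwarding + oam_alert


# Hypothetical L-LSP example: forwarding label 100, EXP 0, TTL 64.
stack = cv_label_stack(100, 0, 64)
print(stack.hex())
```

   The OAM payload (Function Type octet first) would follow these two 
   entries. 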
    
    
6.3.2   For P OAM Packets 
    
   The label stack overhead encoding rules of performance P OAM packets 
   are FFS. 
    
    
6.3.3   For FDI and BDI OAM Packets 
    
   FDI and BDI OAM packets are invoked, on a nominal 1 per second 
   basis, when defects are detected. The FDI packet traces forward and 
   upward through any nested LSP stack. The BDI packet is sent 
   backwards towards its peer-level LSP trail termination sink point in 
   the reverse direction (assuming a bi-directional in-band LSP exists) 
   for each LSP at and above the level of the defect. 
    
   The OAM Alert labeled header is inserted before (i.e. below) a
   normal user-plane forwarding labeled header, and a label stack of 2
   is only ever required for either the FDI or BDI packet at their
   origin.  Note that in the case of FDI, it is assumed that the
   server->client LSP adaptation mappings that were in existence prior
   to the failure are recursively used to ensure correct FDI
   forwarding.  It is therefore important that the LSP sink point
   remembers any server->client LSP label mappings that were in
   existence prior to the failure.  Although the exact means for
   achieving this are outside the scope of this document, some
   examples of how these server->client layer label mappings could be
   configured are as follows:
    
   o    Manually, e.g. via the NMS;
   o    Automatically on LSP set-up via extensions to LDP/RSVP
        signaling;
   o    By an automatic 'learning process', i.e. if, during the
        establishment of the client LSPs, the signaling is tunneled
        through the server layer, then the server trail terminating
        node could keep the information about the established LSPs in
        memory as they occur.
    
   When server->client layer LSP relationships are changed (e.g. 
   existing client layer LSP removed, or new client LSP added say), 
   then it is important that the server->client label mappings are also 
   updated to reflect the new relationships. 
    
     
   The S bit is set only in the OAM Alert Labeled header. The FDI OAM 
   packet is recursively mapped upwards, through a client/server 
   adaptation process at LSP trail termination sink points, into any 
   further affected higher client layer LSPs.  When this arrives at the 
   top LSP it needs to be mapped into an equivalent FDI for whatever 
   client layer is then being carried.  In the case of IP (or indeed 
   any other client layer), this is outside the scope of this document. 
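
   The recursive upward mapping of FDI, driven by the remembered
   server->client LSP mappings, can be sketched as below.  This is a
   minimal illustration, not a definitive procedure: the dictionary
   clients_of and the LSP names are hypothetical stand-ins for
   whatever mapping store a sink point keeps.

```python
def affected_client_lsps(failed_lsp, clients_of):
    """Return the client LSPs that receive FDI, lowest level first.

    clients_of maps a server LSP to the client LSPs nested in it, as
    remembered from before the failure.
    """
    targets = []
    seen = {failed_lsp}
    pending = list(clients_of.get(failed_lsp, []))
    while pending:
        lsp = pending.pop(0)
        if lsp in seen:
            continue
        seen.add(lsp)
        targets.append(lsp)
        # FDI is regenerated at this LSP's sink into its own clients.
        pending.extend(clients_of.get(lsp, []))
    return targets
```

   For example, with clients_of = {"L1": ["L2a", "L2b"], "L2a":
   ["L3"]}, a defect in "L1" yields FDI in "L2a", "L2b" and then "L3".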
    
   Note that higher level LSPs will also see failures (as a result of 
   corruption of their own CV flow) but they will also see an incoming 
   FDI OAM packet flow from the lowest level LSP where the failure 
   originates.  This dynamic behavior allows for correct identification 
   of the true source of the defect and is explained in more detail 
   later.  But for now it is sufficient to note that the incoming FDI 
   is needed to: 
    
   o    Suppress unnecessary alarms in the affected higher layer LSPs.
   o    Give an indication to affected higher-level LSPs that they may
        need to hold-off protection switching as the defect is at a
        lower level LSP.
   o    Allow the appropriate BDI coding at the affected higher
        layer.
    
   It is assumed that when a BDI OAM packet is returned in-band it 
   follows a bi-directional LSP and, like the CV and P OAM packets, 
   that it should never travel beyond the LSP trail termination sink 
   point (of the return LSP). 
    
   The coding of the EXP field associated with the OAM Alert Labeled 
   header and the preceding normal user-plane forwarding labeled header 
   at the LSP level at which the FDI or BDI is inserted is the same as 
   that previously described for the CV OAM packet. 
    
   The TTL field should be set to 1 in the OAM Alert Labeled packet 
   header. The reasons for this are: 
    
   o    The FDI OAM packet is recursively regenerated at each LSP trail
        termination sink point into all affected client layer LSPs (if
        any); so the TTL field is recursively regenerated with a value
        of 1;
   o    The BDI OAM packet should never travel beyond the LSP trail
        termination sink point of the return LSP at the LSP level that
        it was originally generated;
   o    The TTL of the immediately prior normal user-plane forwarding
        header is used to mitigate against damage from looping packets.
    
    
6.3.4   MPLS OAM Function Types for the OAM Alert Label 
    
   The first octet of the OAM packet payload specifies the OAM Function 
   Type as follows: 
    
                        Table 2: OAM Function Types 
    
     
       OAM Function Type    First octet of OAM packet payload 
       codepoint (Hex)      Function Type Purpose 
       -----------------    ---------------------------------- 
              00            Reserved 
                             
              01            CV (Connectivity Verification).  Used 
                            to detect/diagnose all types of LSP 
                            connectivity defect (sourced either 
                            from below or within the MPLS 
                            network).  This will be the main in-
                            service OAM defect detection tool. 
                             
              02            P (Performance).  Used to measure 
                            user-plane loss of packets and their 
                            aggregate octets. 
                             
              03            FDI (Forward Defect Indicator).  This 
                            is generated by an MPLS node detecting 
                            any defect (defined later) and 
                            inserted into affected client layers.  
                            Its primary purpose is to suppress 
                            alarms being raised within affected 
                            higher level client LSPs and (in turn) 
                            their client layers.  It includes 
                            fields to indicate the nature of the 
                            defect and its location. 
                             
              04            BDI (Backward Defect Indicator).  This 
                            is generated at a return LSP trail 
                            termination source point in response 
                            to a defect being detected at an LSP
                            trail termination sink point in the
                            other direction.  The defect type and
                            location codepoints of the
                            complementary FDI are mapped into
                            similar fields of the BDI.  The BDI 
                            may be realized either in the user-
                            plane if bi-directional LSPs are being 
                            used (the case considered in this 
                            document) or out-of-band (e.g. via 
                            management-plane function) in the case 
                            of uni-directional LSPs.  The latter 
                            scenario is outside the scope of this 
                            document. 
                             
    
   All other OAM Function Type codepoints are reserved for possible 
   future standardization. 
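
   A receiver could dispatch on the first payload octet roughly as
   sketched below.  The codepoints are those of Table 2; the class
   and function names are illustrative only.

```python
from enum import IntEnum


class OamFunctionType(IntEnum):
    CV = 0x01   # Connectivity Verification
    P = 0x02    # Performance
    FDI = 0x03  # Forward Defect Indicator
    BDI = 0x04  # Backward Defect Indicator


def function_type(payload: bytes):
    """Return the OamFunctionType of an OAM payload, or None if the
    first octet is a reserved codepoint (00 or any unassigned value)."""
    if not payload:
        raise ValueError("empty OAM payload")
    try:
        return OamFunctionType(payload[0])
    except ValueError:
        return None
```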
    
    
6.4 MPLS OAM Packets 
    
    
     
6.4.1   Connectivity Verification (CV) Packets 
    
    0                   1                   2                   3 
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   | Func Type (1) |                 (must be 0)                   | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                                                               | 
   +                                                               + 
   |                       Ingress Router ID                       | 
   +                                                               + 
   |                                                               | 
   +                                                               + 
   |                                                               | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                           LSP ID                              | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                                                               | 
   \\                   Reserved (0) 14 bytes                     \\ 
   |                                                               | 
   +                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                               |         BIP 16                | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
    
                    Figure 1: CV Payload Structure 
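
   The layout in Figure 1 can be packed and parsed as sketched below.
   The sketch assumes that the 14 reserved octets include the two
   octets immediately preceding the BIP 16 field, giving a 40-octet
   payload.  The BIP 16 value is taken as an input because its
   computation is not defined in this section; the function names are
   hypothetical.

```python
import struct

CV_FUNC_TYPE = 0x01
CV_PAYLOAD_LEN = 40  # assumption: 4 + 16 + 4 + 14 reserved + 2 (BIP 16)


def pack_cv(router_id: bytes, lsp_id: int, bip16: int = 0) -> bytes:
    """Build a CV payload from a 16-octet Router ID and 4-octet LSP ID."""
    if len(router_id) != 16:
        raise ValueError("Ingress Router ID must be 16 octets")
    header = struct.pack("!B3x", CV_FUNC_TYPE)      # type + 3 zero octets
    ttsi = router_id + struct.pack("!I", lsp_id)    # Router ID + LSP ID
    return header + ttsi + bytes(14) + struct.pack("!H", bip16)


def unpack_cv(payload: bytes):
    """Return (router_id, lsp_id, bip16) from a CV payload."""
    if len(payload) != CV_PAYLOAD_LEN or payload[0] != CV_FUNC_TYPE:
        raise ValueError("not a CV payload")
    router_id = payload[4:20]
    (lsp_id,) = struct.unpack("!I", payload[20:24])
    (bip16,) = struct.unpack("!H", payload[38:40])
    return router_id, lsp_id, bip16
```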
    
   The intention is that the CV OAM packet is transmitted from the LSP 
   trail termination source point at a nominal rate of 1 CV per second.  
   It is important that the rate of CV OAM packet generation is 
   constant so that simple and deterministic defect processing can be 
   carried out at the LSP trail termination sink point. 
    
   CV OAM packets within a given LSP are not synchronous to any other 
   CV OAM packets in any other LSP (this includes all nested LSPs, and 
   CV OAM packets from the remote end of an LSP at level N but in the 
   other direction when bi-directional LSPs at level N are being used). 
    
   The structure of the LSP Trail Termination Source Identifier (TTSI) 
   is defined by using a 16 octet Router ID IPv6 address plus a 4 octet 
   LSP Tunnel ID [3].  Note that the first 2 octets of the LSP Tunnel 
   ID are currently padded with all 0s to allow for any future increase 
   in the Tunnel ID field. 
    
   For nodes that do not support IPv6 addressing, an IPv4 address can 
   be used for the Router ID using the format described in RFC1884 [4].
   That is:
    
    0                   1                   2                   3 
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                                                               | 
   +                                                               + 
   |                            (0)                                | 
   +                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                               |               (FF)            | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                       IPv4 Address                            | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
    
                   Figure 2: IPv6 Compatible IPv4 Address
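
   Building the 20-octet TTSI from an address and a tunnel ID might
   look like the sketch below.  It assumes that the "(FF)" field in
   Figure 2 is 16 bits of all ones and that the Tunnel ID occupies the
   last 2 octets of the 4-octet LSP ID field; the function names are
   illustrative only.

```python
import ipaddress
import struct


def router_id_bytes(address: str) -> bytes:
    """Return the 16-octet Router ID for an IPv6 or IPv4 address."""
    ip = ipaddress.ip_address(address)
    if ip.version == 6:
        return ip.packed
    # IPv4: 10 zero octets, 2 octets of all ones, then the IPv4 address.
    return bytes(10) + b"\xff\xff" + ip.packed


def make_ttsi(address: str, tunnel_id: int) -> bytes:
    """Return Router ID (16 octets) + LSP ID (2 zero octets + Tunnel ID)."""
    if not 0 <= tunnel_id <= 0xFFFF:
        raise ValueError("Tunnel ID must fit in 2 octets")
    return router_id_bytes(address) + struct.pack("!HH", 0, tunnel_id)
```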
    
   On LSP establishment the LSP trail termination sink point should be 
   configured with the expected TTSI (Ingress router ID + LSP ID).  
   Ideally this should be done automatically via LSP signaling at LSP 
   set-up time (e.g. via a CR-LDP or RSVP control-plane mechanism), but 
   it could also be configured manually.  The mechanism for achieving
   this configuration is outside the scope of this document.
    
    
6.4.2   Performance "P" Packets
    
   The structure of the P OAM packet is FFS. 
    
    
6.4.3   Forward Defect Indicator "FDI" Packets
    
    
    0                   1                   2                   3 
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   | Func Type (3) | (must be 0)   |        Defect Type            | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                      Defect Location                          | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                                                               | 
   \\                 Reserved (0) 30 bytes                       \\ 
   |                                                               | 
   +                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                               |         BIP 16                | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
    
                        Figure 3: FDI Payload Structure 
    
   The FDI is sent downstream from the first node detecting the defect.  
   In the case of MPLS server layer failures (i.e. in a lower layer 
   technology such as SDH) this would be the first MPLS node downstream 
   of the server layer failure (as a consequence of the appropriate 
   client/server adaptation of the server FDI signal). In the case of 
   MPLS layer failures (i.e. failures within the MPLS fabric) this
   would be the first LSP trail termination sink point at the same LSP
   level as the failure. 
    
   The primary function of the FDI is to stop downstream client layer 
   alarm storms and hence correctly focus the attention of Operational 
   personnel.  However, FDI can also have an important role in: 
    
   o    Facilitating correctly targeted nested LSP protection schemes,
        i.e. one would want a lower level (server) LSP to protection
        switch before a higher level (client) LSP if the fault was
        sourced from within the lower level LSP, and

   o    Identifying availability/short-break events and hence
        suspending up-state QoS metric aggregation.
    
   The format of the Defect Location field and its handling at
   inter-domain NNI boundaries are FFS.

   The Defect Type field is set at 2 octets here. This is currently
   considered sufficient, but it should be confirmed once all the
   Defect Types have been identified and fully specified. A candidate
   set of Defect Types and their codepoints is given later.
    
   The handling of the Defect Type field at inter domain NNI boundaries 
   is FFS. However, 2 octets have been reserved for this function. 
    
   When an FDI is to be passed from a server layer LSP to its client
   layer LSP(s) (i.e. at the client/server adaptation function
   following the server layer LSP trail termination sink point), the
   Defect Location and Defect Type fields should be copied from the
   server layer LSP FDI into the client layer LSP(s) FDI.
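
   The copying of the Defect Type and Defect Location fields from a
   server layer FDI into a client layer FDI can be sketched as below.
   The sketch assumes a 40-octet payload (the 30 reserved octets of
   Figure 3 being taken to include the two octets preceding BIP 16)
   and treats the Defect Location as an opaque 32-bit value, since its
   format is FFS.  The BIP 16 value is supplied by the caller; the
   function names are hypothetical.

```python
import struct

FDI, BDI = 0x03, 0x04
PAYLOAD_LEN = 40  # assumption: 8 + 30 reserved + 2 (BIP 16)


def pack_defect_indicator(func_type: int, defect_type: int,
                          defect_location: int, bip16: int = 0) -> bytes:
    """Build an FDI or BDI payload (same layout, different Func Type)."""
    if func_type not in (FDI, BDI):
        raise ValueError("Func Type must be FDI (3) or BDI (4)")
    head = struct.pack("!BxHI", func_type, defect_type, defect_location)
    return head + bytes(30) + struct.pack("!H", bip16)


def client_fdi_from_server(server_fdi: bytes, bip16: int = 0) -> bytes:
    """Regenerate an FDI for a client LSP, copying DT and DL unchanged."""
    if len(server_fdi) != PAYLOAD_LEN:
        raise ValueError("unexpected FDI payload length")
    func_type, defect_type, defect_location = struct.unpack(
        "!BxHI", server_fdi[:8])
    if func_type != FDI:
        raise ValueError("not an FDI payload")
    return pack_defect_indicator(FDI, defect_type, defect_location, bip16)
```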
    
   The mapping of MPLS layer sourced FDI from the highest-level LSP 
   into its client layer (e.g. IP) is outside the scope of this 
   document. 
    
    
6.4.4   Backward Defect Indicator "BDI" Packets
    
    0                   1                   2                   3 
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   | Func Type (4) |   (must be 0) |        Defect Type            | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                      Defect Location                          | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                                                               | 
   \\                  Reserved(0) 30 bytes                       \\ 
   |                                                               | 
   +                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                               |         BIP 16                | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 

                        Figure 4: BDI Payload Structure 
     
    
   For the case of bi-directional LSPs, the BDI is sent from the LSP 
   trail source point of the return LSP as a mirror of the appropriate 
   (see Note) FDI at the LSP trail sink point of the other direction.   
    
   The Defect Location and Defect Type fields are a direct mapping of
   those set in the appropriate (see Note) FDI and have identical
   formats as described previously for the FDI OAM packet.
    
   Note - The word 'appropriate' here signifies that any incoming FDI 
   (i.e. from a lower layer) takes precedence over any FDI that would 
   have been generated at the layer being considered due to detecting 
   defects at this layer (where these defects are only consequential as 
   a result of a lower layer defect). 
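
   The precedence rule in the Note can be sketched as follows.  Each
   FDI is represented here as a hypothetical (defect_type,
   defect_location) pair, or None when absent; this representation and
   the function names are illustrative assumptions.

```python
BDI_FUNC_TYPE = 0x04


def appropriate_fdi(incoming_fdi, local_fdi):
    """An incoming FDI from a lower layer takes precedence over any FDI
    that would have been generated locally at this layer."""
    return incoming_fdi if incoming_fdi is not None else local_fdi


def bdi_fields(incoming_fdi, local_fdi):
    """Return the BDI fields mirrored from the appropriate FDI, or None
    if there is no FDI to mirror."""
    chosen = appropriate_fdi(incoming_fdi, local_fdi)
    if chosen is None:
        return None
    defect_type, defect_location = chosen
    return {
        "func_type": BDI_FUNC_TYPE,
        "defect_type": defect_type,          # copied unchanged
        "defect_location": defect_location,  # copied unchanged
    }
```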
    
   The BDI does not propagate beyond its return LSP trail termination
   sink point, and it is discarded at that point after any processing
   based on its observation is carried out, e.g. for single-ended
   short-break and/or availability measurements.
    
    
6.5 Defect Types and their Entry/Exit Criteria 
    
6.5.1   Defect Type Codepoints 
    
   The following coding structure is proposed for the various defect 
   types so far identified: 
    
                        Table 3: Defect Types 
    
              DT code in FDI/BDI               
              OAM packets (Hex)     
              Note: first octet     
              indicates layer and   
    Defect    second octet          
     Type     indicates defect                Meaning 
    -------   --------------------    ------------------------ 
    dServer          01 01         Any server layer defect 
                                   arising below the MPLS layer 
                                   network.  It is not suggested 
                                   that these are individually 
                                   identified and defined for 
                                   each type of server layer, 
                                   since this function is only 
                                   appropriate to the server 
                                   layer itself.  Hence, we only 
                                   need an indication that it is 
                                   the server layer and not the 
                                   MPLS layer. 
                                       
    dLOCV            02 01         Simple Loss of Connectivity 
                                   Verification due to missing 
                                   CV OAM packets with expected 
                                   TTSI.  Note that if the cause
                                   of dLOCV is the server layer
                                   (i.e. there is also an incoming
                                   FDI signal from the server 
                                   layer) then the DT codepoint 
                                   01 01_H is used.  The dLOCV 
                                   codepoint 02 01_H is used
                                   only for MPLS layer simple
                                   connectivity failures.
                                    
    dTTSI            02 02         Trail Termination Source 
                                   Identifier Mismatch due to an 
                                   unexpected TTSI observed in 
                                   the incoming CV OAM packets.  
                                   This detects swapped 
                                   connections and unintended 
                                   mismerging failures, which 
                                   can be differentiated by 
                                   noting whether an expected 
                                   TTSI is also missing or 
                                   present respectively.  Note 
                                   that in the case of the 
                                   former (i.e. swapped
                                   connections), the dTTSI 
                                   defect condition takes 
                                   priority over the dLOCV 
                                   defect condition, which is 
                                   also present. 
                                    
    dLoop            02 03         This detects an unintended 
                                   replication Looping defect 
                                   from observation of an 
                                   increased rate of expected CV 
                                   OAM packets above the nominal 
                                   1/sec. (Note this defect is 
                                   added for completeness, but 
                                   it is expected to be rare) 
                                    
    dUnknown         02 FF         Unknown defect detected in 
                                   the MPLS layer.  This is 
                                   expected to be used for MPLS 
                                   nodal failures, which are 
                                   detected within the node 
                                   (probably by proprietary 
                                   means) and affect user-plane 
                                   traffic. 
                                       
    None             00 00         Reserved 
                                    
    None             FF FF         Reserved 
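
   The two-octet coding of Table 3 can be decoded as sketched below.
   The reading of the first octet as 01 = server layer and 02 = MPLS
   layer is inferred from the table entries rather than stated
   explicitly; the names are illustrative only.

```python
from enum import IntEnum


class DefectType(IntEnum):
    D_SERVER = 0x0101
    D_LOCV = 0x0201
    D_TTSI = 0x0202
    D_LOOP = 0x0203
    D_UNKNOWN = 0x02FF


# Inferred from Table 3: first octet identifies the layer.
LAYER_NAMES = {0x01: "server", 0x02: "MPLS"}


def describe(codepoint: int):
    """Split a DT codepoint into (layer name, defect octet, defect name).

    Unassigned or reserved codepoints yield None for the unknown parts.
    """
    layer_octet, defect_octet = codepoint >> 8, codepoint & 0xFF
    try:
        name = DefectType(codepoint).name
    except ValueError:
        name = None
    return LAYER_NAMES.get(layer_octet), defect_octet, name
```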
    
    
   There are 3 MPLS layer user-plane defects, i.e. dLOCV, dTTSI and
   dLoop, which we now define in more detail.
    
     
    
6.5.2   dLOCV Entry Criteria 
    
   Entry to the dLOCV condition, and hence entry to the LSP Trail Sink 
   Near-End Defect State, occurs when there are no expected CV OAM 
   packets observed in any period of 3 consecutive seconds. 
    
   In terms of consequent actions: 
    
   o    If there is an incoming FDI signal from a server layer below
        the MPLS network, then this is mapped to the DT codepoint 01 
        01_H in the FDI OAM packets sent forwards and the BDI OAM 
        packets sent backwards.  The local DL codepoint is also 
        inserted in these FDI and BDI OAM packets. There are no alarms 
        associated with the MPLS layer itself but only the server 
        layer, which sourced the FDI signal. 
    
   Else: 
    
   o    If there is an incoming FDI signal from a lower level LSP
        within the MPLS network, then that FDI signal's DL/DT 
        codepoints are mapped into the FDI sent to any further client 
        layers (i.e. suppresses generation of FDI DL/DT codepoints from 
        this point) and the BDI OAM packet sent backwards.  There are 
        no alarms generated regarding this LSP (the alarm will be 
        associated with the lowest layer LSP within which the defect 
        originated). 
    
   Else: 
    
   o    If there is no FDI signal incoming from the server layer or a
        lower level LSP AND there are no CV OAM packets observed with 
        an unexpected TTSI which give rise to the dTTSI defect, then 
        the DT codepoint 02 01_H is inserted in the FDI OAM packets 
        sent downstream and the BDI OAM packets sent upstream.  The 
        local DL codepoint is also inserted in these FDI and BDI OAM 
        packets. A local alarm is raised relevant to this defect 
        condition. 
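
   The If/Else chain of consequent actions above can be sketched as
   follows.  The function and field names are hypothetical; a
   lower-level LSP FDI is represented as a (DT, DL) pair or None.

```python
from typing import NamedTuple, Optional, Tuple

DT_SERVER = 0x0101
DT_LOCV = 0x0201


class Actions(NamedTuple):
    defect_type: int         # DT inserted in outgoing FDI/BDI
    defect_location: int     # DL inserted in outgoing FDI/BDI
    raise_local_alarm: bool


def dlocv_actions(local_dl: int,
                  server_fdi_present: bool,
                  lower_lsp_fdi: Optional[Tuple[int, int]],
                  dttsi_present: bool) -> Optional[Actions]:
    """Select the consequent actions once the dLOCV condition is entered."""
    if server_fdi_present:
        # Defect originates below MPLS: DT 01 01, local DL, no MPLS alarm.
        return Actions(DT_SERVER, local_dl, False)
    if lower_lsp_fdi is not None:
        # Pass on the lower level LSP's DT/DL unchanged; no alarm here.
        dt, dl = lower_lsp_fdi
        return Actions(dt, dl, False)
    if not dttsi_present:
        # Genuine MPLS layer dLOCV at this level: DT 02 01, local alarm.
        return Actions(DT_LOCV, local_dl, True)
    # Unexpected TTSIs are present: the dTTSI criteria apply instead.
    return None
```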
    
   Note: 
    
        (i)    Since OAM packet flows are not synchronized in LSPs at 
               different hierarchical levels (i.e. when LSPs are nested),
               there is a possibility that a client layer LSP detects a 
               defect before its server layer LSP. This error could be 
               up to 1s due to CV packet arrival time differences plus 
               some additional uncertainty due to network delay 
               effects. This could result in an error of judgment as to 
               the type of defect that is present and hence which 
               consequent actions are appropriate; especially whether 
               the raising of a local alarm is appropriate and the 
               correct setting of the DL and DT codepoints in FDI/BDI 
               OAM packets.  To mitigate this effect, it is recommended 
               that the raising of an alarm is deferred for at least 2
               seconds after a defect state is detected (the exact
               value is FFS). This will also allow the network to 
               settle into a stable state as regards defect detection 
               behavior. 
         
        (ii)   The starting/stopping of aggregation of any LSP user-
               plane packet/octet loss metrics (e.g. if using the P OAM 
               packet say) is dependent on whether the LSP is in the 
               available or unavailable state. 
    
    
6.5.3   dTTSI Entry Criteria
    
   Entry to the dTTSI condition, and hence entry to the LSP Trail Sink 
   Near-End Defect State, occurs when there are >= 2 CV OAM packets 
   observed in any period of 3 consecutive seconds each with an 
   unexpected TTSI. Any expected CV OAM packets or any incoming FDI 
   signals (from either the server layer or a lower level LSP) are 
   ignored, and it should be noted that the dTTSI defect overrides the 
   dLOCV defect if both are present (as would be the case, for example, 
   with swapped LSPs). The DT codepoint 02 02_H is inserted in the FDI 
   OAM packets sent forwards and the BDI OAM packets sent backwards.  
   The local DL codepoint is also inserted in these FDI and BDI OAM 
   packets. A local alarm is raised relevant to this defect condition
   and the unexpected TTSI is captured locally (it may optionally also
   be sent to the NMS, e.g. as an exception report). The downstream
   traffic must also be suppressed.
    
   Note: 
    
   (i)  Since OAM packet flows are not synchronized in LSPs at 
        different hierarchical levels (i.e. when LSPs are nested), there
        is a possibility that a client layer LSP detects a defect 
        before its server layer LSP.  This error could be up to 1s due 
        to CV packet arrival time differences plus some additional 
        uncertainty due to network delay effects. This could result in 
        an error of judgment as to the type of defect that is present 
        and hence which consequent actions are appropriate; especially 
        whether the raising of a local alarm is appropriate and the 
        correct setting of the DL and DT codepoints in FDI/BDI OAM 
        packets.  To mitigate this effect, it is recommended that the 
        raising of an alarm is deferred for at least 2 seconds after a 
        defect state is detected (the exact value is FFS).  This will 
        also allow the network to settle into a stable state as regards 
        defect detection behavior. 
    
   (ii) The starting/stopping of aggregation of any LSP user-plane 
        packet/octet loss metrics (e.g. if using the P OAM packet say) 
        is dependent on whether the LSP is in the available or 
        unavailable state. 
    
    
6.5.4   dLoop Entry Criteria 
    
     
   Entry to the dLoop condition, and hence entry to the LSP Trail Sink 
   Near-End Defect State, occurs when there are >= 5 CV OAM packets 
   observed in any period of 3 consecutive seconds each with an 
   expected TTSI.  The DT codepoint 02 03_H is inserted in the FDI OAM 
   packets sent forwards and the BDI OAM packets sent backwards.  The 
   local DL codepoint is also inserted in these FDI and BDI OAM 
   packets. A local alarm is raised relevant to this defect condition. 
    
   Note: 
    
   (i)  Since OAM packet flows are not synchronized in LSPs at 
        different hierarchical levels (i.e. when LSPs are nested), there
        is a possibility that a client layer LSP detects a defect 
        before its server layer LSP. This error could be up to 1s due 
        to CV packet arrival time differences plus some additional 
        uncertainty due to network delay effects. This could result in 
        an error of judgment as to the type of defect that is present 
        and hence which consequent actions are appropriate; especially 
        whether the raising of a local alarm is appropriate and the 
        correct setting of the DL and DT codepoints in FDI/BDI OAM 
        packets.  To mitigate this effect, it is recommended that the 
        raising of an alarm is deferred for at least 2 seconds after a 
        defect state is detected (the exact value is FFS). This will 
        also allow the network to settle into a stable state as regards 
        defect detection behavior. 
    
   (ii) The starting/stopping of aggregation of any LSP user-plane 
        packet/octet loss metrics (e.g. if using the P OAM packet say) 
        is dependent on whether the LSP is in the available or 
        unavailable state. 
    
    
6.5.5   dLOCV, dTTSI and dLoop exit criteria 
    
   Exit of the dLOCV, dTTSI or dLoop condition, and hence exit of the 
   LSP Trail Sink Near-End Defect State, occurs when there are: 
    
   o    >= 2 but <= 4 CV OAM packets observed each with an expected
        TTSI, AND
   o    No CV OAM packets observed with an unexpected TTSI in any
        period of 3 consecutive seconds.
    
   Note that the numbers of CV OAM packets observed with an expected
   TTSI given above are suggested values.  Whether these values are
   appropriate requires further study.
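
   The entry and exit thresholds of sections 6.5.2 to 6.5.5 can be
   summarized in the sketch below, where both counts are taken over
   the same period of 3 consecutive seconds.  The text states that
   dTTSI overrides dLOCV; testing dTTSI before dLoop as well is an
   assumption of this sketch, as are the function names.

```python
def defect_entered(expected: int, unexpected: int):
    """Return the defect whose entry criteria are met, or None.

    expected   - CV OAM packets seen with the expected TTSI
    unexpected - CV OAM packets seen with an unexpected TTSI
    """
    if unexpected >= 2:
        return "dTTSI"   # overrides dLOCV if both are present
    if expected == 0:
        return "dLOCV"
    if expected >= 5:
        return "dLoop"
    return None


def exit_criteria_met(expected: int, unexpected: int) -> bool:
    """True when the common dLOCV/dTTSI/dLoop exit criteria hold."""
    return 2 <= expected <= 4 and unexpected == 0
```

   Note that some combinations (for example 1 expected and 1
   unexpected packet) satisfy neither the entry nor the exit criteria,
   in which case the current state is simply retained.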
    
   All the consequent actions invoked when entering the LSP Trail Sink 
   Near-End Defect State (i.e. sending of FDI and BDI OAM packets, the 
   raising of local alarms and the suppression of traffic in the dTTSI 
   case only) are stopped when we exit the LSP Trail Sink Near-End 
   Defect State. 
    
   Note - The starting/stopping of aggregation of any LSP user-plane
   packet/octet loss metrics (e.g. if using the P OAM packet say) is
   dependent on whether the LSP is in the available or unavailable
   state.
    
    
6.6 Available and unavailable state processing 
    
   The main purpose of defining harmonized defect entry/exit criteria 
   as noted above is in order to significantly simplify: 
    
   o    Near-end/far-end LSP Trail Sink Defect State processing;
   o    Near-end/far-end LSP Available State processing (which will
        shortly be discussed);
   o    The decision point at which any LSP user-plane traffic QoS
        metrics (if being collected) are stopped/started with respect
        to aggregation into long-term registers.
    
   In all sections where the evaluation of events is described, the 
   measurement technique is based on a sliding-window with a 1 second 
   granularity of advance.  Note that the datum for the commencement of
   the sliding window is an arbitrary point in time decided by each
   node independently and is not synchronized to OAM packet arrival
   events on any LSPs.  This is deemed acceptable to allow simpler 
   nodal processing. 
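
   One possible realization of such a sliding window is sketched
   below: one counter per second, with the window total taken over the
   most recent 3 one-second buckets (including the bucket currently
   being filled, which is an assumption of this sketch).  The class
   name and the tick mechanism are illustrative only; the caller is
   expected to invoke advance() once per second from its own timer.

```python
from collections import deque


class SlidingWindow:
    """Count events over the last `seconds` one-second buckets."""

    def __init__(self, seconds: int = 3):
        self._buckets = deque([0] * seconds, maxlen=seconds)

    def record(self, count: int = 1) -> None:
        """Record packet arrivals in the current one-second bucket."""
        self._buckets[-1] += count

    def advance(self) -> None:
        """Advance the window by 1 second; the oldest bucket drops out."""
        self._buckets.append(0)

    def total(self) -> int:
        """Number of events currently inside the window."""
        return sum(self._buckets)
```

   A sink point would keep one such window for expected-TTSI CV
   packets and another for unexpected-TTSI CV packets.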
    
   It should be noted that this document uses the traditional
   functional dependency relationship between QoS and availability.  
   That is: 
    
   o    QoS is a unidirectional metric, i.e. if QoS metrics are being
        measured then each direction is measured independently.
   o    Availability is a bi-directional metric in the case of
        bi-directional LSPs, in the sense that if either direction
        enters the unavailable state (defined later) then both
        directions are deemed to be unavailable.  In the case of
        unidirectional LSPs, availability can only have unidirectional
        significance.
   o    QoS measurements must be suspended (as regards aggregation into
        long-term available state registers) if an LSP enters the
        unavailable state; noting that this means the QoS measurements
        of both directions from the definition of the availability
        metric above in the case of bi-directional LSPs.
    
   However, it should also be noted that (for both pragmatic reasons 
   and to preserve their statistical significance) QoS metric 
   aggregation is actually suspended after detecting a short-break 
   event. 
    
    
6.6.1   Short Break definition 
    
   We first define a short-break event.  This is defined as a period
   where the entry to and exit from any of the previously defined
   defect conditions both occur within 9s, i.e. the LSP Trail Sink
   Near-End Defect State lasts for <= 9s.  The start of the short-break
   occurs at the beginning of the defect entry criteria and the end of
   the short-break occurs at the beginning of the defect exit criteria.
   Clearly this has a minimum period of 3s.  Short-breaks are only
   defined to exist when the LSP is in the Available State.
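
   As an illustration only, the short-break rule above can be sketched
   in Python.  The function name, the string labels and the use of
   whole seconds (matching the 1 second sliding-window granularity) are
   assumptions of this sketch and are not defined by this document; the
   10s boundary is taken from section 6.6.2.

```python
SHORT_BREAK_MAX_S = 9  # a defect state lasting <= 9s counts as a short-break


def classify_defect_period(defect_state_s: int, lsp_available: bool = True) -> str:
    """Classify one completed Near-End Defect State period (hypothetical helper).

    defect_state_s is the duration of the defect state in whole seconds.
    """
    if defect_state_s < 0:
        raise ValueError("duration must be non-negative")
    if not lsp_available:
        # Short-breaks are only defined while the LSP is in the Available State.
        return "already-unavailable"
    if defect_state_s <= SHORT_BREAK_MAX_S:
        return "short-break"
    # 10 or more consecutive defect seconds: Unavailable State (section 6.6.2).
    return "unavailable"
```

   For example, under these assumptions a 9s defect period is reported
   as a short-break, whereas a 10s period is not.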
    
   Note - Short-breaks are more common than many people realize (in one
   operator's network a study of SES (Severely Errored Second) events 
   showed that about 50% of these would have been classified as short-
   breaks).  They can cause severe disruption to some applications and 
   are therefore an important performance metric (perhaps second in 
   importance after availability).  Since they exist at the physical 
   layers they will exist (by inheritance) in client layers, such as 
   MPLS and IP.  An important property of the short-break, which we 
   will exploit, is that it yields a pragmatic harmonized threshold for 
   defect evaluation (across all defect types as noted previously) and 
   the stopping/starting of QoS metric aggregation into long-term up-
   state performance registers. 
    
    
6.6.2   Available/Unavailable State Definition 
    
   If the LSP Trail Sink Near-End Defect State exceeds 10 consecutive 
   seconds in duration then the LSP enters the Unavailable State. The 
   start point of the Unavailable State is deemed to be at the 
   beginning of these 10 consecutive seconds. We therefore no longer 
   have a short-break (and the event should not be registered as such). 
    
   An LSP re-enters the Available State after it has exited the LSP
   Trail Sink Near-End Defect State and there has been an aggregate
   period of 10 consecutive seconds in which there have been:

   o    >=9 and <=11 CV OAM packets each with an expected TTSI, AND
   o    No CV OAM packets with an unexpected TTSI.
    
   Note that the numbers of CV OAM packets with an expected TTSI given
   above are suggested values.  Whether these values are appropriate
   requires further study.
    
   The start point of the Available State is deemed to be at the 
   beginning of these 10 consecutive seconds. 
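
   The Available State entry check above can be sketched as a sliding
   window that advances once per second.  This is a minimal Python
   sketch, not a definitive implementation: the class and method names
   are hypothetical, and it assumes the caller supplies per-second
   counts of CV OAM packets with expected and unexpected TTSI values.

```python
from collections import deque


class CvWindow:
    """Per-second CV OAM packet counts over the most recent 10 seconds."""

    def __init__(self, length_s: int = 10):
        # deque(maxlen=...) silently drops the oldest second on append.
        self.expected = deque(maxlen=length_s)
        self.unexpected = deque(maxlen=length_s)

    def advance(self, expected_count: int, unexpected_count: int) -> None:
        """Record the counts for the one-second interval that just ended."""
        self.expected.append(expected_count)
        self.unexpected.append(unexpected_count)

    def meets_available_entry(self) -> bool:
        """True if the last 10s hold 9..11 expected and no unexpected CVs."""
        if len(self.expected) < self.expected.maxlen:
            return False  # fewer than 10 seconds observed so far
        return 9 <= sum(self.expected) <= 11 and sum(self.unexpected) == 0
```

   The thresholds 9 and 11 are the suggested values given above and
   would change if further study shows other values to be appropriate.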
    

6.6.3   Near-end and Far-end Measurements of Availability 
    
   All of the above discussion is strictly only relevant to the near-
   end processing when the LSP trail termination sink point is in the 
   LSP Trail Sink Near-End Defect State as discussed previously.  We 
   can also measure the far-end availability behavior (useful when only 
   a single end is accessible for measurement) by using the BDI signal 
   (when bi-directional LSPs are being used) since this is a reflected 
   upstream mirror of the duration over which FDI is sent downstream. 
    
   We therefore define the LSP Trail Sink Far-End Defect State to be 
   the period over which BDI OAM packets are observed subject to the 
   following entry and exit criteria: 
     
   o    Entry to the LSP Trail Sink Far-End Defect State occurs on the
        first BDI OAM packet observed.
   o    Exit from the LSP Trail Sink Far-End Defect State occurs after a
        period of 3 consecutive seconds in which no BDI OAM packets
        have been received.
    
   Note that this 3s processing delay on exit is to cater for cases in 
   which perhaps a single BDI is lost (say due to congestion or 
   errors).  Its effect must be catered for in the far-end processing 
   state machine as discussed later. 
    
   Since we have fixed the temporal duration of the far-end state to be 
   directly related to the near-end state (albeit with a +3s exit 
   checking period) we can therefore measure both short-breaks and 
   unavailability of both directions from a single end (on the 
   assumption that bi-directional LSPs are being used). 
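
   The far-end entry and exit criteria can be sketched as follows.
   This Python fragment is illustrative only; the class name and the
   per-second tick interface (one call per second, with a flag saying
   whether any BDI OAM packet arrived in that second) are assumptions
   of the sketch.

```python
class FarEndDefectDetector:
    """Tracks the LSP Trail Sink Far-End Defect State from BDI arrivals."""

    EXIT_CLEAR_SECONDS = 3  # consecutive BDI-free seconds needed to exit

    def __init__(self):
        self.in_defect = False
        self._clear_seconds = 0

    def tick(self, bdi_seen: bool) -> bool:
        """Process one second and return True while in the defect state."""
        if bdi_seen:
            # Entry on the first BDI; any later BDI restarts the exit check.
            self.in_defect = True
            self._clear_seconds = 0
        elif self.in_defect:
            self._clear_seconds += 1
            if self._clear_seconds >= self.EXIT_CLEAR_SECONDS:
                self.in_defect = False
                self._clear_seconds = 0
        return self.in_defect
```

   With this structure a single lost BDI OAM packet does not end the
   defect state, which is the purpose of the 3s exit delay.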
    
    
6.6.4   Near-End State Processing Flow-chart 
    
   The following figure summarizes many of the key points regarding the 
   near-end state-processing algorithm for a given LSP. 
    
                Figure 5: LSP Near-End State Processing Flow Chart 
    
   1.   Assume we start in the available state in the box marked
        'Start'.  All timers (shown later) can conceptually be assumed
        reset at this point.  If any QoS metrics are being collected
        (e.g. packet/octet loss measurements from the P OAM packet)
        then their aggregation is assumed to be active at this time.
   2.   The first decision box is 'dLOCV, dTTSI or dLoop?'.  These
        defects were defined previously.  If none of these defects is
        present we keep checking for this condition and stay in the
        available state.  However, if one of these defects is present
        we enter the Trail Sink Near-End Defect State.
   3.   The consequent actions now required depend on the nature of the
        defect observed, and whether there is any incoming FDI from a
        lower layer, and should follow the rules given previously.  But
        note that any QoS metrics that are being collected are
        suppressed from aggregation into the long-term registers
        against available time.  The registers are effectively
        backdated 3s to allow for the defect detection time (at this
        stage we cannot judge whether the event will be a Short-Break,
        in which case the LSP remains in the Available State, or
        whether the LSP will enter the Unavailable State).
   4.   We now start timer T1.  This timer is used to determine the
        duration of the Trail Sink Near-End Defect State, and if this
        persists for a sufficient time (i.e. a further 10s) then this
        timer is used to branch the flow-chart into the Unavailable
        State processing region.
   5.   Below (timer) T1, we loop round the decision boxes 'T1<10s?'
        and 'End dLOCV, dTTSI or dLoop?'.  We can exit this loop if the
        defect state ends (in accordance with criteria given
        previously) before T1 reaches 10s.  Since we are still in the
        available state, we restart any QoS metric aggregation into the
        long-term registers (noting the last 3s must be accounted for),
        we stop FDI/BDI OAM packet generation and capture the
        short-break event in the local registers.  Additionally, if the
        event was due to a dTTSI, then we should also capture the TTSI
        of the offending LSP and cease the suppression of traffic.  The
        timestamp of the event should be related to the onset of the
        defect that caused it.  If however T1 reaches 10s we enter the
        Unavailable State.  Note that it is not possible to enter the
        Unavailable State unless the Trail Sink Near-End Defect State
        has persisted for at least 10s in the Available State.
   6.   We now record a date/time-stamped Unavailable State entry event 
        in the local registers together with information on the nature 
        of the defect that caused it.  Note that the date/timestamp
        must be backdated 13s.  Optionally, we may also send an 
        exception report to the NMS with the Unavailable State entry 
        date/timestamp noted above, together with any other relevant 
        information about the defect which caused it, e.g. in the case 
        of dTTSI this should include the TTSI of the offending LSP.  We 
        now stop timer T1 and start timer T2, whose purpose is to 
        record the duration of the Unavailable State. Note that when we 
        enter the Unavailable State we also remain in the Trail Sink 
        Near-End Defect State. 
   7.   We now run round a decision box 'End dLOCV, dTTSI or dLoop?'
        (just below the point where we started timer T2), which
        checks for the end of the defect state.  When the defect ends
        (in accordance with the criteria given previously) we stop
        FDI/BDI OAM packet generation and exit the Trail Sink Near-End
        Defect State.  Any QoS metric aggregation is still inhibited.
   8.   We now run round the decision loop comprising the two boxes
        '>=9 but <= 11 expected CV OAM packets in last 10s AND no
        unexpected CV OAM packets?' and 'dLOCV, dTTSI or dLoop?'.  If a
        further defect occurs before we meet the exit criteria of the 
        former decision box, we re-enter the Trail Sink Near-End Defect 
        State and hence restart the generation of FDI/BDI OAM packets 
        (with DL/DT codepoints and other consequent actions relevant to 
        the specific defect observed). Any QoS metric aggregation 
        continues to be inhibited.  In this case we are back at point 7 
        above in the state processing and recommence checking for the 
        end of the defect. Note that timer T2 continues to run. 
   9.   To get out of the Unavailable State we must first have exited 
        the Trail Sink Near-End Defect State as noted in 7 above, and 
        then met the criteria of the decision box '>=9 but <= 11
        expected CV OAM packets in last 10s AND no unexpected CV OAM
        packets?' as noted in 8 above.  Note that the 'last 10s'
        referred to here includes the 3s interval required to check for 
        the end of the Trail Sink Near-End Defect State as noted above 
        in item 7. 
   10.  We now stop timer T2 and record the duration of the
        unavailability event in the local registers.  We recommence any
        QoS metric aggregation into the local registers and cease all
        consequent actions associated with the Unavailable State.  Note
        that T2 will record an Unavailable State duration that is 3s
        less than the true duration of the unavailability event.  Note
        also that the last 10s belong to the Available State and so any
        QoS metric aggregation will need to take these 10s into
        account.  Optionally, we may also send an exception report to
        the NMS with the Unavailable State exit date/timestamp suitably
        corrected as noted above.
   11.  This now takes us back to our starting point in the Available 
        State. 
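
   The state transitions in steps 1 to 11 can be sketched as a small
   state machine.  The following Python is a simplified sketch under
   stated assumptions: the names are hypothetical, tick() is called
   once per second, 'defect' is True while the defect entry criteria
   hold and the exit criteria have not yet been met, and 'clean_10s' is
   True when the CV OAM packet criteria of step 9 are met.  Consequent
   actions (FDI/BDI generation, alarms, traffic suppression and QoS
   register correction) are deliberately omitted.

```python
from enum import Enum, auto

T1_LIMIT_S = 10        # step 5: T1 reaching 10s means Unavailable State
T2_CORRECTION_S = 3    # step 10: T2 reads 3s less than the true duration


class State(Enum):
    AVAILABLE = auto()
    DEFECT = auto()          # Near-End Defect State, still available (T1 runs)
    UNAVAIL_DEFECT = auto()  # Unavailable and in the defect state (T2 runs)
    UNAVAIL_CLEAR = auto()   # Unavailable, defect exited (T2 still runs)


class NearEndFsm:
    def __init__(self):
        self.state = State.AVAILABLE
        self.t1 = 0
        self.t2 = 0
        self.events = []  # (event name, seconds) tuples

    def tick(self, defect: bool, clean_10s: bool = False) -> State:
        """Advance the state machine by one second."""
        if self.state is State.AVAILABLE:
            if defect:                                   # steps 2 to 4
                self.state, self.t1 = State.DEFECT, 0
        elif self.state is State.DEFECT:
            if not defect:                               # step 5: short-break
                self.events.append(("short-break", self.t1))
                self.state = State.AVAILABLE
            else:
                self.t1 += 1
                if self.t1 >= T1_LIMIT_S:                # step 6
                    self.events.append(("unavailable-entry", self.t1))
                    self.state, self.t2 = State.UNAVAIL_DEFECT, 0
        elif self.state is State.UNAVAIL_DEFECT:
            self.t2 += 1
            if not defect:                               # step 7
                self.state = State.UNAVAIL_CLEAR
        else:  # State.UNAVAIL_CLEAR
            self.t2 += 1
            if defect:                                   # step 8
                self.state = State.UNAVAIL_DEFECT
            elif clean_10s:                              # steps 9 and 10
                self.events.append(
                    ("unavailable-exit", self.t2 + T2_CORRECTION_S))
                self.state = State.AVAILABLE
        return self.state
```

   The value stored with the exit event is the T2 reading plus the 3s
   correction noted in step 10; a real implementation would also apply
   the 13s backdating of the entry timestamp described in step 6.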
    
    
6.6.5   Far-End State Processing Flow-chart 
    
   The following figure summarizes many of the key points regarding the 
   far-end state-processing algorithm for a given LSP. 
    
                Figure 6: LSP Far-End State Processing Flow Chart 
    
   1.   Assume we start in the available state at the box marked
        'Start'.  All timers shown later in the flow chart can
        conceptually be assumed to be reset at this point.  If there is 
        any backward QoS aggregation activated on the return direction 
        LSP then this will be via a separate P OAM packet flow on the 
        return LSP. 
   2.   The first decision box is 'BDI OAM packet?'.  If the answer is
        'No', then we keep looping this check condition and stay in the 
        Available State.  If the answer is 'Yes', then this implies 
        that the near-end processing at the other end of the (outgoing) 
        LSP has entered the Trail Sink Near-End Defect State.  Note 
        that this also implies that the defect has already existed for 
        3s at the other end of this LSP. 
   3.   We then enter the Trail Sink Far-End Defect State and inhibit
        any backward QoS metric aggregation.  The QoS registers will
        need to be corrected for the previous 3s, which should not be
        aggregated into the long-term Available State counts.
   4.   We now start timer T3, and run round the loop composed of the
        decision boxes 'T3 <13s?' and '3s BDI-Free?'.  T3 is used to
        check the duration of the Trail Sink Far-End Defect State.  If
        T3 does not reach 13s and we get 3s that are BDI-Free, then
        we re-start any backward packet level metric aggregation.  Note
        that the last 6s must be accounted for in any backward QoS
        metric aggregation registers.  This arises since it takes the
        near-end processing 3s to declare the end of the defect at the
        other end of the (outgoing) LSP, and a further 3s to declare
        the end of the Trail Sink Far-End Defect State at this end of
        the (return) LSP, and all this time should count towards the
        Available State at this end of the LSP to ensure correct QoS
        metric aggregation.  A Short-Break date/time-stamped event
        should also be recorded in the local registers together with
        the DL/DT information of the defect as given in the BDI OAM
        packet.  This Short-Break event must be date/time-stamped
        relative to 3s before the time at which the first BDI OAM
        packet was observed.  This now takes us back to the initial
        start position.  If however T3 reaches 13s we enter the far-end
        Unavailable State.  Note that it is not possible to enter the
        Unavailable State unless the Trail Sink Far-End Defect State
        has effectively persisted for at least 13s (which means that at
        the other end of the (outgoing) LSP the Trail Sink Near-End
        Defect State has persisted for at least 10s) in available time.
   5.   Optionally, we may now send a date/time-stamped unavailability 
        entry exception report to the NMS, which includes the relevant 
        BDI OAM packet DL/DT information.  Note that the date/timestamp 
        of any such exception report should be backdated by 16s (i.e. 3s
        prior to the first BDI OAM packet being observed for this 
        event) to align the far-end processing with that of the near-
        end processing at the other end.  We now stop timer T3 and 
        start a timer T4, whose purpose is to record the duration of 
        this unavailability event.  Note that when we enter the 
        Unavailable State we also remain in the Trail Sink Far-End 
        Defect State. 
   6.   We now run round a loop that checks for 3s which are BDI-Free.
        This is used to take us out of the Trail Sink Far-End Defect
        State.  Note that this check is not strictly necessary: it
        could have been omitted, leaving only the following check for
        a continuous (i.e. overall) 10s of BDI-Free behavior.  However,
        it has been shown like this to harmonize the 'look' of the
        near-end and far-end Trail Sink Defect State processing.
   7.   If we get 3s which are BDI-Free then we exit the Trail Sink 
        Far-End Defect State and run a loop which checks if we have had 
        an overall continuous period of 10s which are BDI-Free.  If any 
        further BDI OAM packets appear within this overall 10s checking 
        period then we re-enter the Trail Sink Far-End Defect State and 
        need to repeat the process from step 6 above.  If, however, no 
        further BDI OAM packets appear within the 10s checking period 
        we exit the far-end Unavailable State. 
   8.   We stop timer T4 and record the duration of the unavailability
        event.  T4 will record a time that is 3s less than the true
        duration of the unavailability event.  A date/time-stamped
        unavailability exit event, backdated 13s, together with the
        unavailability duration should now be recorded in the local
        registers.  Optionally, this information may also be sent to
        the NMS as an exception report.
   9.   Any backward QoS metric aggregation can now be restarted, 
        noting that the last 13s belong to available time and so the 
        aggregate registers should be corrected accordingly.
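
   The far-end timestamp corrections in steps 5 and 8 can be checked
   with a short worked example.  This Python sketch is illustrative
   only: the function name and its two arguments (the instant at which
   T3 reaches 13s, and the instant at which the 10s BDI-free check
   passes) are assumptions of the sketch.

```python
from datetime import datetime, timedelta

ENTRY_BACKDATE = timedelta(seconds=16)  # step 5: backdate entry by 16s
EXIT_BACKDATE = timedelta(seconds=13)   # step 8: backdate exit by 13s
T4_CORRECTION = timedelta(seconds=3)    # step 8: T4 reads 3s too short


def far_end_unavailability_record(entry_detected: datetime,
                                  exit_detected: datetime):
    """Return (start, end, duration) of a far-end unavailability event.

    entry_detected: when T3 reached 13s and T4 was started.
    exit_detected:  when 10s of BDI-free behavior was confirmed.
    """
    t4 = exit_detected - entry_detected      # what timer T4 measures
    start = entry_detected - ENTRY_BACKDATE  # 3s before the first BDI
    end = exit_detected - EXIT_BACKDATE
    duration = t4 + T4_CORRECTION
    return start, end, duration
```

   Note that the corrected duration equals the difference between the
   two backdated timestamps, i.e. the 16s, 13s and 3s corrections are
   mutually consistent.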
    
6.6.6   A pictorial view of near-end and far-end state processing  
    
   The following figure is given to help clarify the temporal 
   relationships between the near-end and far-end state processing 
   given in the previous flow-charts for a short-break event and an
   unavailability event. 
    
   Figure 7:  Near-End and Far-End Temporal Processing of a Short-Break 
   and Unavailability event 
    
    
     
7. Security Considerations 
    
   The OAM functions described in this document enhance the security
   of MPLS networks by detecting mis-connections, thereby preventing
   customers' traffic from being exposed to other customers.

   The MPLS OAM functions as defined in this document do not raise any
   new security issues for MPLS networks.
    
    
8. References 
    
    
   [1]  Rosen E, et al, RFC 3032, "MPLS label stack encoding". 
    
   [2]  Le Faucheur et al, "MPLS support of Differentiated Services", 
   draft-ietf-mpls-ext-08.txt, work in progress. 
    
   [3]  Awduche et al, "RSVP-TE: Extensions to RSVP for LSP Tunnels", 
   draft-ietf-mpls-rsvp-lsp-tunnel-05.txt, work in progress. 
    
   [4]  Hinden and Deering, RFC 1884, "IP Version 6 Addressing 
   Architecture". 
    
9. Authors' Addresses
    
   Neil Harrison 
   British Telecom              Phone: 44-1604-845933 
   Heath Bank                   Email: neil.2.Harrison@bt.com 
   Iugby Road, Harleston 
   South Hampton, UK 
    
   Peter Willis 
   British Telecom              Phone: 44-1473-645178 
   BT, PP RSB10/PP3 B81         Email: peter.j.willis@bt.com 
   Adastrial Park 
   Martlesham, Ipswich, UK  
    
   Shahram Davari 
   PMC-Sierra 
   411 Legget Drive             Phone: 1-613-271-4018 
   Kanata, ON, Canada           Email: Shahram_Davari@pmc-sierra.com 
    
   Ben Mack-Crane 
   Tellabs 
   4951 Indiana Ave             Phone: 1-630-512-7255 
   Lisle, IL, USA               Email: ben.mack-crane@tellabs.com 
    
   Hiroshi Ohta 
   NTT 
   Y-709A, 1-1 Hikarino'ka      Phone: 81-468-59-8840
   Yokosuka-Shi                 Email: ohta.hiroshi@nslab.ntt.co.jp 
   Kanagawa, Japan 
     