

CCAMP Working Group                                              Yi Lin 
Internet Draft                                      Huawei Technologies 
Intended status: Standards Track                       November 3, 2019 
Expires: May 2020 
                                   
 
 
                                      
           RSVP-TE Extensions in Support of Proactive Protection 
             draft-lin-ccamp-gmpls-proactive-protection-00.txt 


Status of this Memo 

   This Internet-Draft is submitted in full conformance with the 
   provisions of BCP 78 and BCP 79.  

   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups.  Note that 
   other groups may also distribute working documents as Internet-
   Drafts. 

   Internet-Drafts are draft documents valid for a maximum of six 
   months and may be updated, replaced, or obsoleted by other documents 
   at any time.  It is inappropriate to use Internet-Drafts as 
   reference material or to cite them other than as "work in progress." 

   The list of current Internet-Drafts can be accessed at 
   http://www.ietf.org/ietf/1id-abstracts.txt 

   The list of Internet-Draft Shadow Directories can be accessed at 
   http://www.ietf.org/shadow.html 

   This Internet-Draft will expire on May 3, 2020. 

Copyright Notice 

   Copyright (c) 2019 IETF Trust and the persons identified as the 
   document authors. All rights reserved. 

   This document is subject to BCP 78 and the IETF Trust's Legal 
   Provisions Relating to IETF Documents 
   (http://trustee.ietf.org/license-info) in effect on the date of 
   publication of this document. Please review these documents 
   carefully, as they describe your rights and restrictions with 
   respect to this document. Code Components extracted from this 
   document must include Simplified BSD License text as described in 

 
 
 
Yi Lin                  Expires May 3, 2020                   [Page 1] 

Internet-Draft          GMPLS Proactive Protection       November 2019 
    

   Section 4.e of the Trust Legal Provisions and are provided without 
   warranty as described in the Simplified BSD License. 

Abstract 

   This document describes protocol-specific procedures and extensions
   for Generalized Multi-Protocol Label Switching (GMPLS) Resource
   ReSerVation Protocol - Traffic Engineering (RSVP-TE) signaling to
   support Label Switched Path (LSP) Proactive Protection, which
   creates the protection LSP after a failure is predicted and before
   it becomes a real failure.

Table of Contents 

   1. Introduction
   2. Conventions used in this document
   3. Overview of Predicted Failure and Related Recovery Methods
      3.1. Predicted Failure
      3.2. Proactive Protection
   4. Modified PROTECTION Object Format
   5. Extension to ERROR_SPEC Object
      5.1. New Error Code / Sub-code
      5.2. New TLV in ERROR_SPEC Object
   6. End-to-end Proactive Protection
      6.1. Creation of the Protected LSP
      6.2. Notification of Predicted Failure Event
      6.3. Tearing Down of the Protection LSP
   7. Proactive Segment Protection
      7.1. Creation of the Protected LSP
      7.2. Notification of Predicted Failure Event
      7.3. Tearing Down of the Segment Recovery LSP
      7.4. Priority and Resource Pre-emption
   8. Consideration of Backward Compatibility
   9. Security Considerations
   10. IANA Considerations
   11. References
      11.1. Normative References
      11.2. Informative References
   12. Authors' Addresses
    
    

1. Introduction 

   [RFC4872] and [RFC4873] describe protocol-specific procedures and 
   extensions for GMPLS RSVP-TE signaling to support end-to-end LSP
   recovery (including protection and restoration) and segment LSP 
   recovery, respectively. 

   A traditional protection solution (e.g., 1+1 or 1:1 protection)
   provides very fast protection switching after a failure happens,
   but consumes twice the network resources during the whole lifetime
   of the LSP. On the other hand, a traditional restoration solution
   uses network resources much more efficiently, but recovery of the
   LSP is much slower, due to the additional signaling time needed to
   create the restoration LSP.

   In order to reduce the resources used for recovery while keeping
   very fast protection switching, one approach is to use failure
   prediction technologies and to create 1+1 or 1:1 protection only
   when a potential failure is predicted. This approach is referred to
   as "Proactive Protection" in this document.

   This document extends the RSVP-TE protocol to support the control of
   Proactive Protection.

2. Conventions used in this document 

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 
   "OPTIONAL" in this document are to be interpreted as described in 
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all 
   capitals, as shown here. 

3. Overview of Predicted Failure and Related Recovery Methods 

3.1. Predicted Failure 

   In most cases, there will be some indications before a physical
   failure happens in a network. Examples include abnormal fluctuation
   of the noise of a lightpath, a rising BER (Bit Error Rate) before
   error correction, and a rising transponder temperature.

   Therefore, by monitoring certain physical parameters and analyzing
   their trend using, for example, Machine Learning (ML) or other
   technologies, a node may be able to predict whether a failure will
   happen in an upcoming period of time.

   Note that a predicted failure is different from a Signal Degrade in 
   that:  

   -  When Signal Degrade happens to a connection, the connection is 
      still available but the quality of the signal carried by this
      connection has declined and is lower than the predetermined 
      threshold. For example, the BER of a connection rises and is out 
      of tolerance. 

   -  When a predicted failure of a connection is inferred, neither
      failure nor degradation has happened yet, but there is a trend
      indicating that a failure will probably happen after a period of
      time, which will cause Signal Fail or Signal Degrade.

   The methods to predict failures are outside the scope of this 
   document. 

3.2. Proactive Protection 

   "Proactive Protection" refers to an LSP protection approach that
   creates the protection LSP after a failure is predicted and before
   it becomes a real failure. Both end-to-end protection (defined in
   [RFC4872]) and segment protection (defined in [RFC4873]) are
   applicable to Proactive Protection.

   The main procedure of Proactive Protection is shown in Figure 1: 

    
         |-> Predicted failure notification received 
         |   |-> Proactive Protection path created 
         |   |               |-> Real failure happens 
         |   |               | |-> Protection switch finished 
         |   |               | | 
         |   |               | |     Protection path deleted <-| 
         |   |               | |     if no failure happened    | 
         |   |               | |                               | 
         |   |        t3     | |                          t6   | 
      ---+---+--------+======x=+==========================+----+---> t 
         t1  t2       |     t4 t5                         |    t7 
                      |                                   | 
                      |<--Predicted failure time period-->| 
    
                Figure 1: Overview of Proactive Protection 

   -  t1: The protection source node of an LSP is notified that a 
      failure will probably happen during t3~t6, so it starts to create 
      1+1 or 1:1 protection of the connection. Here the protection 
      source node can be the source node of the LSP (for end-to-end 
      protection case), or a branch node located between the source node 
      and the predicted failure point of the LSP (for segment protection 
      case).

   -  t2: The 1+1 or 1:1 protecting path is created between the
      protection source node and the protection destination node. Here 
      the protection destination node can be the destination node of the 
      LSP (for end-to-end protection case), or a merge node located 
      between the predicted failure point and the destination node of 
      the LSP (for segment protection case). 

   -  t4: If a real failure happens as predicted, the 1+1 or 1:1
      protection switch is triggered.

   -  t5: The protection switch is finished and the service on the
      connection is recovered.

   -  t7: If the predicted failure did not in fact happen, and no
      further predicted failure notification is received, the
      protection source node MAY tear down the protecting path after
      t6, in order to save network resources.
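
   As an informal illustration, the timeline of Figure 1 can be read
   as a small state machine at the protection source node. The state
   and event names below are hypothetical labels chosen for this
   sketch; they are not defined by this document or by RSVP-TE.

```python
from enum import Enum, auto


class State(Enum):
    UNPROTECTED = auto()   # before t1: only the working LSP exists
    CREATING = auto()      # t1..t2: protection path is being signaled
    PROTECTED = auto()     # from t2: protection path is in place
    SWITCHED = auto()      # t5: traffic recovered on the protection path


class Event(Enum):
    PREDICTED_FAILURE = auto()   # t1: predicted failure notification
    PATH_CREATED = auto()        # t2: protection path established
    REAL_FAILURE = auto()        # t4: the predicted failure really happens
    WAIT_EXPIRED = auto()        # t7: no failure happened, wait is over


TRANSITIONS = {
    (State.UNPROTECTED, Event.PREDICTED_FAILURE): State.CREATING,
    (State.CREATING, Event.PATH_CREATED): State.PROTECTED,
    (State.PROTECTED, Event.PREDICTED_FAILURE): State.PROTECTED,
    (State.PROTECTED, Event.REAL_FAILURE): State.SWITCHED,
    (State.PROTECTED, Event.WAIT_EXPIRED): State.UNPROTECTED,
}


def step(state, event):
    """Return the next state; events that do not apply leave it unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.UNPROTECTED):
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = step(state, event)
    return state
```

   In this sketch a real failure that arrives while no protection path
   exists is simply ignored, because handling it is a matter for the
   recovery procedures of [RFC4872] and [RFC4873], not for Proactive
   Protection.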

4. Modified PROTECTION Object Format 

   This document modifies the PROTECTION object (C-Type=2) by adding 
   two new bits T and A in reserved fields, as shown in Figure 2 below: 

    0                   1                   2                   3 
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |            Length             | Class-Num(37) |  C-Type (2)   | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |S|P|N|O|T|  Res.   | LSP Flags |     Reserved      | Link Flags| 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |I|R|A|  Reserved   | Seg.Flags |           Reserved            | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 

            Figure 2: The modified PROTECTION object (C-Type=2) 

   -  T (Triggered End-to-end Proactive Protection): 1 bit. When set
      (1), it indicates that end-to-end Proactive Protection is
      required.

     Note that if the T bit is set (1), the LSP Flags SHOULD be one of:
        0x04    1:N Protection with Extra-Traffic 
        0x08    1+1 Unidirectional Protection 
        0x10    1+1 Bidirectional Protection 

   -  A (proActive Segment Protection): 1 bit. When set (1), it
      indicates that Proactive Segment Protection is required.

     Note that if the A bit is set (1), the Seg. Flags SHOULD be one of:
        0x04    1:N Protection with Extra-Traffic
        0x08    1+1 Unidirectional Protection
        0x10    1+1 Bidirectional Protection

   See [RFC4872] and [RFC4873] for the definition of other fields. 
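
   The bit layout of Figure 2 can be sketched in Python as follows.
   This is a minimal illustration that assumes the T and A bit
   positions exactly as drawn in Figure 2 (bit 0 being the most
   significant bit of each 32-bit word); the function names are
   hypothetical and only the fields relevant to this document are
   decoded.

```python
import struct

CLASS_NUM_PROTECTION = 37
C_TYPE = 2
OBJECT_LENGTH = 12  # 4-byte object header + two 32-bit words

# LSP Flags / Seg. Flags values listed in this section for T or A set.
ALLOWED_PROACTIVE_FLAGS = {0x04, 0x08, 0x10}


def pack_protection(s=0, p=0, n=0, o=0, t=0, lsp_flags=0, link_flags=0,
                    i=0, r=0, a=0, seg_flags=0):
    """Encode the object; bit 0 of each word is the most significant bit."""
    word1 = ((s << 31) | (p << 30) | (n << 29) | (o << 28) | (t << 27)
             | ((lsp_flags & 0x3F) << 16) | (link_flags & 0x3F))
    word2 = (i << 31) | (r << 30) | (a << 29) | ((seg_flags & 0x3F) << 16)
    return struct.pack("!HBBII", OBJECT_LENGTH, CLASS_NUM_PROTECTION,
                       C_TYPE, word1, word2)


def unpack_protection(data):
    """Decode the fields relevant to Proactive Protection."""
    length, class_num, c_type, word1, word2 = struct.unpack("!HBBII", data)
    if (length, class_num, c_type) != (OBJECT_LENGTH,
                                       CLASS_NUM_PROTECTION, C_TYPE):
        raise ValueError("not a PROTECTION object with C-Type 2")
    return {
        "t": (word1 >> 27) & 1,
        "lsp_flags": (word1 >> 16) & 0x3F,
        "a": (word2 >> 29) & 1,
        "seg_flags": (word2 >> 16) & 0x3F,
    }
```

   For example, pack_protection(t=1, lsp_flags=0x08) describes an LSP
   that requests end-to-end Proactive Protection with 1+1
   Unidirectional Protection.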

5. Extension to ERROR_SPEC Object 

5.1. New Error Code / Sub-code 

   A new Error Sub-code under Error Code "25 - Notify Error" is defined 
   in this document, which is used to notify the event of a predicted 
   failure: 

   Error Code = 25: "Notify Error" (see [RFC3209]) 

   Error Sub-code = TBA: "Notify Error/LSP Local Predicted Failure" 

5.2. New TLV in ERROR_SPEC Object 

   When a failure is predicted, the time by which the failure is
   expected to happen may also be predicted. This time information
   tells the source node how long it should wait for the predicted
   failure to become a real failure, and helps it decide when it is
   safe to tear down the protection LSP if the predicted failure did
   not happen.

   A new TLV in the IPv4/IPv6 IF_ID ERROR_SPEC object is defined in
   this document, which is used to indicate the time before which the
   predicted failure will probably become a real failure. The format
   of this new TLV is shown in Figure 3 below:

    0                   1                   2                   3 
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |          Type = TBA           |          Length = 8           | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                              Time                             | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 

             Figure 3: New TLV (type=TBA) in ERROR_SPEC Object 

   -  Type: TBA 

   -  Length: 8

   -  Time: A relative time measured in seconds, which indicates within
      how many seconds (from the current time) the predicted failure
      will probably become a real failure.
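
   A minimal encoding sketch of the TLV in Figure 3 follows. Because
   the Type value is still TBA, it is passed in by the caller rather
   than fixed in the code. Treating Time as an unsigned 32-bit integer
   is an assumption of this sketch; the figure only shows that the
   field is 32 bits wide.

```python
import struct

TLV_LENGTH = 8  # per Figure 3: 4-byte TLV header + 4-byte Time field


def pack_time_tlv(tlv_type, seconds):
    """Encode the TLV; tlv_type is supplied by the caller (value is TBA)."""
    if not 0 <= seconds <= 0xFFFFFFFF:
        raise ValueError("Time must fit in 32 bits")
    return struct.pack("!HHI", tlv_type, TLV_LENGTH, seconds)


def unpack_time_tlv(data, expected_type):
    """Return the Time value in seconds, or raise on a malformed TLV."""
    tlv_type, length, seconds = struct.unpack("!HHI", data)
    if tlv_type != expected_type or length != TLV_LENGTH:
        raise ValueError("unexpected TLV type or length")
    return seconds
```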

6. End-to-end Proactive Protection 

6.1. Creation of the Protected LSP 

   To create an LSP with recovery type of "End-to-end Proactive 
   Protection", the source node of the LSP generates a Path message 
   with a PROTECTION object included. The T bit in the PROTECTION 
   object MUST be set to 1 (End-to-end Proactive Protection), so that 
   all other nodes along the LSP can start the failure prediction 
   function on related links/nodes. 

   Note that the N bit in the PROTECTION object is used to indicate
   whether the control plane message exchange is only used for
   notification or for protection-switching purposes after a real
   failure happens, see [RFC4872]. In other words, the N bit has
   nothing to do with the notification of a predicted failure before a
   real failure happens.

   To allow the notification of predicted failure event to the source 
   node by the Notify message, the NOTIFY REQUEST object MUST also be 
   included in the Path message (see [RFC3473]), where the "Notify Node 
   Address" SHOULD be the address of the source node of the LSP. 
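
   The requirements of this section can be summarized as a simple
   check on an outgoing Path message. The PathMessage class below is a
   hypothetical, heavily simplified stand-in for a real Path message
   and exists only to make the sketch self-contained.

```python
from dataclasses import dataclass
from typing import List, Optional

ALLOWED_LSP_FLAGS = {0x04, 0x08, 0x10}  # values listed in Section 4


@dataclass
class PathMessage:
    """Hypothetical, simplified view of a Path message."""
    source: str
    t_bit: bool = False
    lsp_flags: int = 0
    notify_node_address: Optional[str] = None  # None: no NOTIFY REQUEST


def check_e2e_proactive_path(msg: PathMessage) -> List[str]:
    """Return a list of findings; an empty list means no issue was found."""
    findings = []
    if not msg.t_bit:
        findings.append("T bit MUST be set")
    if msg.lsp_flags not in ALLOWED_LSP_FLAGS:
        findings.append("LSP Flags SHOULD be 0x04, 0x08 or 0x10")
    if msg.notify_node_address is None:
        findings.append("NOTIFY REQUEST object MUST be included")
    elif msg.notify_node_address != msg.source:
        findings.append("Notify Node Address SHOULD be the source node")
    return findings
```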

6.2. Notification of Predicted Failure Event 

   When an intermediate node on an LSP infers that a failure will
   happen and will affect the LSP, it sends a Notify message to the
   source node of the LSP to report the predicted failure event. The
   new error code/sub-code "Notify Error/LSP Local Predicted Failure"
   is used in the ERROR_SPEC object or IF_ID_ERROR_SPEC object in the
   Notify message.

   The Notify message MAY also include a TLV (type = TBA) in the IPv4
   or IPv6 IF_ID_ERROR_SPEC object, to indicate the time before which
   the predicted failure will probably become a real failure.

   On receiving the Notify message with error code/sub-code "Notify 
   Error/LSP Local Predicted Failure", the source node of the LSP 
   SHOULD trigger the procedure to create the protection LSP, according 
   to the protection type indicated in the "LSP Flags" field of the 
   PROTECTION object in the Path message for the protected LSP. The 
   procedures of creating the protection LSP and the protection 
   switching after real failure happens are described in [RFC4872].

6.3. Tearing Down of the Protection LSP 

   After the protection LSP is created, the source node MAY start a
   timer T_wait and wait for the predicted failure to become a real
   failure. If no real failure happens and no further notification of
   a predicted failure is received before T_wait expires, the source
   node MAY trigger the procedure to tear down the protection LSP,
   according to local policy. See [RFC4872] for the process of tearing
   down a protection LSP.

   Implementations SHOULD allow this policy to be configured to provide 
   a default across all LSPs on a node, but SHOULD also allow it to be 
   configured per LSP. 

   Note that T_wait MUST be longer than the time indicated in the TLV
   (type=TBA) in the ERROR_SPEC object in the Notify message, if the
   TLV exists.

   Note also that the value of T_wait is a local matter of the source 
   node, and is outside the scope of this document.  
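
   One possible way to handle T_wait at the source node is sketched
   below. The class name, the default wait value and the safety margin
   are hypothetical local-policy choices. Restarting the timer on each
   further notification is one reading of "no further notification is
   received", not a requirement of this document. Time is passed in
   explicitly so that the sketch does not depend on a real clock.

```python
from typing import Optional


class ProtectionTimer:
    """Tracks when a proactively created protection LSP may be torn down."""

    def __init__(self, default_wait: float, margin: float = 1.0):
        if margin <= 0:
            raise ValueError("margin must be positive")
        self.default_wait = default_wait  # local policy (hypothetical)
        self.margin = margin              # keeps T_wait above the TLV time
        self.deadline: Optional[float] = None
        self.real_failure = False

    def on_predicted_failure(self, now: float,
                             tlv_time: Optional[int] = None) -> None:
        """(Re)start T_wait when a predicted failure is notified."""
        wait = self.default_wait
        if tlv_time is not None:
            # T_wait MUST be longer than the time carried in the TLV.
            wait = max(wait, tlv_time + self.margin)
        self.deadline = now + wait

    def on_real_failure(self) -> None:
        """A real failure occurred; the protection LSP is now in use."""
        self.real_failure = True

    def may_tear_down(self, now: float) -> bool:
        """True once T_wait has expired without a real failure."""
        return (self.deadline is not None and not self.real_failure
                and now >= self.deadline)
```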

7. Proactive Segment Protection 

7.1. Creation of the Protected LSP 

   To create an LSP with recovery type of "Proactive Segment 
   Protection", the source node of the LSP generates a Path message, 
   where: 

   -  A PROTECTION object is included, where the A bit MUST be set to 1 
      (Proactive Segment Protection), so that all nodes along the 
      protected LSP can start the failure prediction function on related 
      links/nodes if supported. The "Seg. Flags" are used to indicate 
      the protection type of the Proactive Segment Protection. 

   -  One or more SERO objects MAY be included (i.e., explicit
      Proactive Segment Protection), indicating the branch node and the
      merge node of each segment recovery LSP. If no SERO object is
      included, it indicates that the dynamic Proactive Segment
      Protection method is used.

   -  A NOTIFY REQUEST object is included, where the "Notify Node
      Address" SHOULD be the address of the source node of the LSP.

   For explicit Proactive Segment Protection, when a branch node
   receives a Path message with the A bit set to 1 in the PROTECTION
   object, the branch node follows [RFC4873] to process the Path
   message, except that the Path message for the recovery LSP is not
   generated or sent at this stage. Also, one more NOTIFY REQUEST
   object SHOULD be added to the Path message of the protected LSP,
   carrying the address of this branch node.

   For dynamic Proactive Segment Protection, when an intermediate node
   receives a Path message with the A bit set to 1 in the PROTECTION
   object, the node determines whether it is able to act as a branch
   node, as described in Section 6.2 of [RFC4873]. If so, it follows
   the same procedure as a branch node in the case of explicit
   Proactive Segment Protection, as described above. If not, the node
   only follows the standard procedure to create the protected LSP.
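
   The branch node behavior described in this section can be sketched
   as follows. The data classes are hypothetical simplifications: a
   SERO is reduced to a (branch, merge) pair, and the local decision
   of whether a node can act as a dynamic branch node is passed in as
   a flag, since that decision is governed by [RFC4873].

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PathMessage:
    """Hypothetical, simplified Path message of the protected LSP."""
    a_bit: bool
    seros: List[Tuple[str, str]] = field(default_factory=list)
    notify_requests: List[str] = field(default_factory=list)


@dataclass
class BranchState:
    """What a branch node remembers until a predicted failure is notified."""
    merge_node: Optional[str]          # None when not known from a SERO
    recovery_path_sent: bool = False   # stays False at this stage


def process_path(node: str, msg: PathMessage,
                 can_be_branch: bool = False) -> Optional[BranchState]:
    """Return deferred branch state, or None if the node is not a branch."""
    if not msg.a_bit:
        return None
    merge = next((m for b, m in msg.seros if b == node), None)
    explicit_branch = merge is not None
    dynamic_branch = not msg.seros and can_be_branch
    if not (explicit_branch or dynamic_branch):
        return None  # ordinary transit node: standard processing only
    # Ask downstream nodes to notify this branch node as well.
    msg.notify_requests.append(node)
    # The Path message for the recovery LSP is deliberately not sent yet.
    return BranchState(merge_node=merge)
```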

7.2. Notification of Predicted Failure Event 

   When an intermediate node between a pair of branch and merge nodes
   on an LSP infers that a failure will happen and will affect the LSP,
   it sends a Notify message to the nearest branch node in the upstream
   direction of the LSP to report the predicted failure event. The
   error code/sub-code "Notify Error/LSP Local Predicted Failure" is
   used in the ERROR_SPEC object or IF_ID_ERROR_SPEC object in the
   Notify message.

   As with End-to-end Proactive Protection, the time before which the
   predicted failure may occur MAY also be included in the Notify
   message.

   On receiving the Notify message with error code/sub-code "Notify
   Error/LSP Local Predicted Failure", the branch node on the protected
   LSP SHOULD generate a new Path message, and send this new Path
   message along the recovery LSP between the branch and the merge
   nodes. The procedures for generating the new Path message and
   creating the recovery LSP are the same as described in [RFC4873],
   except that the A bit in the PROTECTION object of this new Path
   message MUST be set to 1.
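
   The choice of the "nearest branch node in the upstream direction"
   can be illustrated with a short search over the hops of the
   protected LSP. This is only a sketch with a hypothetical function
   name; a real node would normally learn the address to notify from
   the NOTIFY REQUEST objects carried in the Path message rather than
   from a full hop list.

```python
from typing import List, Optional, Set


def nearest_upstream_branch(lsp_nodes: List[str], branch_nodes: Set[str],
                            reporting_node: str) -> Optional[str]:
    """Return the closest branch node upstream of reporting_node, if any.

    lsp_nodes lists the protected LSP hop by hop from source to destination.
    """
    position = lsp_nodes.index(reporting_node)
    for node in reversed(lsp_nodes[:position]):
        if node in branch_nodes:
            return node
    return None
```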

7.3. Tearing Down of the Segment Recovery LSP 

   After the segment recovery LSP is created, the branch node MAY start
   a timer T_wait and wait for the predicted failure to become a real
   failure. If no real failure happens and no further notification of
   a predicted failure is received before T_wait expires, the branch
   node MAY trigger the procedure to tear down the segment recovery
   LSP, according to local policy. See [RFC4873] for the process of
   tearing down a segment recovery LSP.

   Implementations SHOULD allow this policy to be configured to provide 
   a default across all LSPs on a node, but SHOULD also allow it to be 
   configured per LSP. 

   Note that T_wait MUST be longer than the time indicated in the TLV
   (type=TBA) in the ERROR_SPEC object in the Notify message, if the
   TLV exists.

   Note also that the value of T_wait is a local matter of the branch 
   node, and is outside the scope of this document. 

7.4. Priority and Resource Pre-emption 

   It is possible that, after the recovery LSP is created and before
   the predicted failure becomes a real failure, another real failure
   happens on the LSP outside the protected segment. In this case, the
   source node (or an intermediate node upstream of the real failure)
   may start a restoration procedure to recover the LSP. For the same
   protected LSP, since recovering from a real failure always has
   higher priority than protecting against a predicted failure that
   has not yet happened, the restoration LSP can pre-empt the
   resources of the segment recovery LSP.

   As shown in Figure 4, assume that node B (branch node) was notified
   of a predicted failure event between N-4 and M (merge node), and has
   created the segment recovery LSP along B, N-1, N-2, N-3 and M. If
   another failure between S (source node) and B happens before the
   predicted failure becomes a real failure, node S will try to create
   the restoration LSP. Since resources are limited, the restoration
   LSP can pre-empt the resources of the segment recovery LSP between
   N-1 and N-3.

   The nodes along the segment recovery LSP have enough information to
   determine whether pre-emption is allowed. This is because these
   nodes know that:

   -  The current segment recovery LSP is used for Proactive Segment 
      Protection through the A bit in the PROTECTION object; 

   -  The segment recovery LSP and the restoration LSP are protecting 
      the same LSP through the association relationship.

                      |<------ Pre-emption ------>| 
                      |                           | 
     *************************************************************** 
     *+---+         +---+         +---+         +---+         +---+* 
     *|   +---------+N-1+---------+N-2+---------+N-3+---------+   |* 
     *+-+-+         +-+-+         +---+         +-+-+         +-+-+* 
     *  |             |###########################|             |  * 
     *  |             |#                         #|             |  * 
     *  |             |#                         #|             |  * 
     *+-+-+         +-+-+         +---+         +-+-+         +-+-+* 
   ***| S +----X----+ B +---------+N-4+----?----+ M +---------+ D |*** 
      +---+         +---+         +---+         +---+         +---+ 
   =================================================================== 
    
     S: Source node     D: Destination node 
     B: Branch node     M: Merge node 
     X: Real failure    ?: Predicted failure (has not happened yet)
    
     =====: Protected LSP 
     #####: Segment Recovery LSP 
     *****: Restoration LSP 

             Figure 4: Resource pre-emption by restoration LSP 
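
   The pre-emption decision described in this section can be sketched
   as a predicate evaluated by a node on the segment recovery LSP. The
   LspInfo fields are hypothetical; in particular, "carrying_traffic"
   is an assumption of this sketch, used to model that the predicted
   failure has not yet become a real failure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LspInfo:
    """Hypothetical per-LSP state kept by a node on the recovery segment."""
    association_id: int             # ties a recovery LSP to its protected LSP
    proactive_segment: bool         # A bit set in the PROTECTION object
    is_restoration: bool = False    # LSP created to restore after a failure
    carrying_traffic: bool = False  # True once a real failure caused a switch


def may_preempt(holder: LspInfo, requester: LspInfo) -> bool:
    """Decide whether requester may take the resources held by holder."""
    return (holder.proactive_segment
            and not holder.carrying_traffic
            and requester.is_restoration
            and holder.association_id == requester.association_id)
```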

8. Consideration of Backward Compatibility 

   TBD. 

   [Editor's note]: will add some description about interwork with 
   legacy nodes which do not support the function of failure prediction 
   and reporting. 

9. Security Considerations 

   TBD. 

10. IANA Considerations 

   IANA assigns values to RSVP protocol parameters. Within the current 
   document, a new Error code/sub-code value is defined: 

   Error Code = 25: "Notify Error" (see [RFC3209])

      o  "Notify Error/LSP Local Predicted Failure"   (TBA)

11. References 

11.1. Normative References 

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 
             Requirement Levels", BCP 14, RFC 2119, DOI 
             10.17487/RFC2119, March 1997. 

   [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., 
             and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP 
             Tunnels", RFC 3209, December 2001. 

   [RFC3473] Berger, L., Ed., "Generalized Multi-Protocol Label 
             Switching (GMPLS) Signaling Resource ReserVation Protocol- 
             Traffic Engineering (RSVP-TE) Extensions", RFC 3473, 
             January 2003. 

   [RFC4872] Lang, J., Ed., Rekhter, Y., Ed., and D. Papadimitriou, 
             Ed., "RSVP-TE Extensions in Support of End-to-End 
             Generalized Multi-Protocol Label Switching (GMPLS) 
             Recovery", RFC 4872, May 2007. 

   [RFC4873] Berger, L., Bryskin, I., Papadimitriou, D., and A. Farrel, 
             "GMPLS Segment Recovery", RFC 4873, May 2007. 

   [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 
             2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 
             May 2017. 

11.2. Informative References 

   [RFC4426] Lang, J., Ed., Rajagopalan, B., Ed., and D. Papadimitriou, 
             Ed., "Generalized Multi-Protocol Label Switching (GMPLS) 
             Recovery Functional Specification," RFC 4426, March 2006. 

12. Authors' Addresses 

   Yi Lin 
   Huawei Technologies 
   F3 R&D Center, Huawei Industrial Base, 
   Bantian, Longgang District, 
   Shenzhen 518129 P.R.China 
   Email: yi.lin@huawei.com