CCAMP Working Group            Richard Rabbat (Fujitsu Labs of America) 
Internet Draft                           Vishal Sharma (Metanoia, Inc.) 
Expires: August 2004                          Zafar Ali (Cisco Systems) 
                                                                        
                                                          February 2004 
    
   Expedited Flooding for Restoration in Shared-Mesh Transport Networks 
                  draft-rabbat-expedited-flooding-01.txt 
     
Status of this Memo  
    
   This document is an Internet-Draft and is in full conformance with 
   all provisions of Section 10 of RFC2026 [1]. 
    
   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups.  Note that 
   other groups may also distribute working documents as Internet-
   Drafts.  
    
   Internet-Drafts are draft documents valid for a maximum of six months 
   and may be updated, replaced, or obsoleted by other documents at any 
   time. It is inappropriate to use Internet-Drafts as reference 
   material or to cite them other than as "work in progress."  
    
   The list of current Internet-Drafts can be accessed at  
        http://www.ietf.org/ietf/1id-abstracts.txt  
   The list of Internet-Draft Shadow Directories can be accessed at  
        http://www.ietf.org/shadow.html. 
    
    
Abstract  
    
   Optical transport networks require fast restoration mechanisms with 
   tight time bounds in order to recover from resource failures in a 
   timely manner.  These failures may include fiber cuts, transponder 
   failures, and node failures.  Time-bounded recovery is challenging in 
   mesh-based transport networks with shared protection.  This draft 
   discusses some currently available mechanisms and their limitations, 
   and explains the need for an expedited flooding mechanism to 
   accomplish this objective.  The draft also highlights some 
   challenges, mitigating factors, and possible solutions in the 
   implementation of an expedited flooding protocol. 
    
    
Rabbat, R., et al       Expires - August 2004                [Page 1] 
 
         draft-rabbat-expedited-flooding-01.txt          February 2004 
 
 
Table of Contents 
    
   1. Introduction
   2. Terminology
   3. Restoration in Packet Versus Shared Mesh Transport Networks
   4. Signaling versus Flooding Notification for Transport Networks
   4.1 Signaling-Based Notification
   4.2 Flooding-Based Notification
   4.3 Comparison Between Signaling- and Flooding-Based Notification
   4.4 Alternative Notification Method
   5. Expedited Flooding for Notification
   5.1 Operation upon Fault Repair
   5.2 Impact of Expedited Flooding on Network Operation
   5.3 Graceful Degradation
   5.3.1 Loss of Notification Messages
   5.3.2 Multiple Fiber Cuts
   6. Conclusion
   7. Intellectual Property Considerations
   8. References
   9. Authors' Addresses
   10. Full Copyright Statement
    
    
1. Introduction 
    
   With networks evolving towards a packet data layer operating on an 
   optical transport layer controlled by an IP-based control plane, the 
   nature and type of restoration needed at each layer is changing as 
   well.  Time-constrained recovery and protocol scalability are crucial 
   in order to reduce service interruption and eliminate duplication of 
   fault notification.  We illustrate how recovery strategies at the 
   data and transport layers have important differences based on their 
   goals.  A key mechanism during recovery is fault notification, which 
   conveys information from the node detecting the fault to the nodes 
   responsible for activating the protection path.  We consider 
   solutions at the transport layer and present a comparative analysis 
   of a signaling-based approach to notification and flooding-based 
   notification. 
    
   Section 3 describes differences between restoration mechanisms in 
   packet data networks versus restoration in shared-mesh based optical 
   transport networks.  Section 4 discusses fault notification using 
   signaling-based and flooding-based approaches, and studies the 
   scalability and impact on recovery time of both solutions.  
   Section 5 describes the qualities of expedited flooding 
   for notification and highlights its advantages as compared to the 
   flooding mechanisms of unmodified or native link-state protocols.  It 
   also describes how expedited flooding can be used in the case of 
   multiple concurrent fiber cuts and how it supports graceful 
   degradation.  
   Section 6 concludes this draft. 
    
    
2. Terminology 
    
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 
   document are to be interpreted as described in RFC 2119 [2]. 
    
    
3. Restoration in Packet Versus Shared Mesh Transport Networks 
    
   In this section, we point out certain key differences between 
   restoration in packet and transport networks. 
    
   In MPLS packet networks, one can pre-signal and pre-configure a 
   backup LSP for a working LSP.  This is because selecting a label for 
   a backup LSP at a node is sufficient to be able to switch traffic for 
   that LSP when that traffic arrives.  If resources (buffers and 
   bandwidth) are required for the backup LSP, they can also be reserved 
   in advance (during the LSP signaling phase).  As long as there is no 
   failure on the working LSP, these same resources can still be used by 
   low-priority or extra-traffic LSPs. 
     
   This holds true even for shared mesh restoration in MPLS networks.  
   In this case, multiple labels are assigned, one for each of the 
   backup LSPs transiting a node (corresponding to link- or node-
   disjoint working LSPs that they protect) and using the shared backup 
   path.  But in this case, only one set of resources (buffers, 
   bandwidth) needs to be reserved. 
     
   When we consider transport networks, the situation is different.  
   Now, a backup LSP can be pre-signaled but not pre-configured (unless 
   simple 1+1 protection is desired).  This is because, once an LSP in a 
   transport network is established (that is, it is cross-connected), 
   the full bandwidth of the LSP is automatically consumed, irrespective 
   of whether traffic actually flows on that LSP.  
     
   For this reason, when implementing shared restoration schemes in 
   transport networks (or allowing extra-traffic between endpoints other 
   than the source-destination of a backup LSP) a backup LSP cannot be 
   cross-connected until after the specific failure for which this LSP 
   was pre-signaled has occurred. 
    
   Thus, for transport networks an additional step of reconfiguration is 
   required at all the nodes that lie along the path of a backup LSP 
   corresponding to a working LSP.  
    
 
   This difference is illustrated in Figure 1 and Figure 2 below. Figure 
   1(a) shows a shared mesh restoration scenario in a packet-based MPLS 
   network.  There are two working LSPs W1 and W2, with a single shared 
   backup LSP P1/P2.  The label assignments have been made as shown (L1 
   and L2 for W1 at node B, L1' and L2' for P1 at node F, and L3' and 
   L4' for P2 at node F, and so on).  When a fault affecting W1 occurs 
   (Figure 1(b)), node A upon learning of the failure immediately 
   performs a protection switch and begins transmitting working traffic 
   from W1 down the backup LSP P1 with the label L1'.  Node F now simply 
   label switches the traffic arriving on link A-F with label L1' by 
   placing it on the outgoing link F-C with label L2'.  Node F may 
   squelch or drop the low-priority traffic (or any extra-traffic; LSPs 
   E1 and E2 respectively) that was being carried when the backup LSP 
   was not active, by simply purging it from its outgoing queues.  
    
    
      #############    #############     .............    ############# 
      #              -             #     .              -             # 
      #  +----------|A|----------+ #     .  +----------|A|----------+ # 
      #  |        *  - # E1      | #     .  |        #  - .         | # 
      #  |   P1/P2*  | #         | #     .  |        #  | .         | # 
      #  |        *  | #         | #     .  |        #  | .Dropped  | # 
   L1 #  |        *  | #L5       | # L3  .  |     P1 #  | .         | # 
      #  |     L1'*  | ########> | #     .  |        #  | ........> | # 
      #  -        *  -   L6      - #     .  -        #  -           - # 
      # |B|-------*-|F|---------|D|#     . |B|-------#-|F|---------|D|# 
      #  -        *  -  L7       - #     .  -        #  -           - # 
   L2 #  |        *  | ######### | #     .  |        #  | ......... | # 
      #  |     L2'*  | #     E2  | # L4  .  |        #  | .         | # 
      #  |        *  | #L8       | #     .  \/       #  | .Dropped  | # 
      #  |        *  | #         | #     .  /\       #  | .         | # 
      #  |        V  - V         | #     .  |        V  - V         | # 
      #  +----------|C|----------+ #     .  +----------|C|----------+ # 
      #              -             #     .              -             # 
      #############>   <############     .............>   <############ 
           W1               W2                W1               W2 
                    (a)                                (b) 
    
     Figure 1. Restoration scenario in a simple MPLS-based packet 
        network 
    
    
   By contrast, Figure 2 illustrates the same situation for an optical 
   transport network, where the LSPs in question are lambda LSPs.  For 
   ease of exposition, we assume a single lambda per link.  Figure 2(a) 
   shows the same network and LSPs as in the previous case.  As shown in 
   Figure 2(b), here the intermediate node F, upon learning of a fault 
   along LSP W1, has to first drop any extra-traffic (or low priority) 
   LSPs that were using the bandwidth (lambda) reserved for the backup 
   LSP P1/P2.  It then reconfigures its cross-connect matrix to connect 
   the incoming lambda on link A-F to the outgoing lambda on link F-C 
   (that is, it changes its configuration from A-F -> F-D and D-F -> F-C 
   to A-F -> F-C).   
    
      #############    #############     .............    ############# 
      #              -             #     .              -             # 
      #  +----------|A|----------+ #     .  +----------|A|----------+ # 
      #  |        *  - # E1      | #     .  |        *  - .         | # 
      #  |   P1/P2*  | #         | #     .  |        *  | .         | # 
      #  |        *  | #         | #     .  |        *  | .1. Drop  | # 
      #  |        *  | #         | #     .  |     P1 *  | .         | # 
      #  |        *  | ########> | #     .  |        *  | ........> | # 
      #  -        *  -           - #     .  -        *  -           - # 
      # |B|-------*-|F|---------|D|#     . |B|-------*-|F|---------|D|# 
      #  -        *  -           - #     .  -        *  - 2. Recfg. - # 
      #  |        *  | ######### | #     .  |        *  | ......... | # 
      #  |        *  | #     E2  | #     .  |        *  | .         | # 
      #  |        *  | #         | #     .  \/       *  | .         | # 
      #  |        *  | #         | #     .  /\       *  | .         | # 
      #  |        V  - V         | #     .  |        V  - V         | # 
      #  +----------|C|----------+ #     .  +----------|C|----------+ # 
      #              -             #     .              -             # 
      #############>   <############     .............>   <############ 
           W1               W2                W1               W2 
                    (a)                                (b) 
    
     .............    ############# 
     .              -             # 
     .  +----------|A|----------+ # 
     .  |        #  - 3. Switch | # 
     .  |        #  |           | # 
     .  |        #  |           | # 
     .  |     P1 #  |           | # 
     .  |        #  |           | # 
     .  -        #  -           - # 
     . |B|-------#-|F|---------|D|# 
     .  -        #  -           - # 
     .  |        #  |           | # 
     .  |        #  |           | # 
     .  \/       #  |           | # 
     .  /\       #  |           | # 
     .  |        V  -           | # 
     .  +----------|C|----------+ # 
     .              -             # 
     .............>   <############     (c) 
          W1               W2 
    
 
     Figure 2. Restoration scenario in an optical transport network, for 
        a situation identical to that depicted in Figure 1. 
    
   As shown in Figure 2(c), it is only then that node F is able to carry 
   the working traffic from W1 on P1.  Thus, there needs to be a way to 
   inform all of the intermediate nodes (here F) of the failure along 
   the working path, so that each node can appropriately reset its 
   cross-connects.  This is not required in packet networks as 
   illustrated in Figure 1. 
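   The contrast between the two cases can be sketched in a few lines of
   Python.  This is only an illustrative model of node F under the
   assumptions of Figures 1 and 2; the function names and the dictionary
   representation of the label table and cross-connect matrix are
   hypothetical and are not part of any protocol.

```python
# Hypothetical model of node F; all names here are illustrative only.

def packet_node_forward(label_table, in_link, in_label):
    """MPLS case: the backup label entry is pre-configured, so
    forwarding after the fault is a plain table lookup."""
    return label_table[(in_link, in_label)]


def transport_node_restore(cross_connects, extra_traffic, backup):
    """Transport case: drop extra traffic first, then reconfigure
    the cross-connect matrix for the backup LSP."""
    in_link, out_link = backup
    for src, dst in extra_traffic:
        # Step 1: drop extra-traffic LSPs using the backup lambda.
        if cross_connects.get(src) == dst:
            del cross_connects[src]
    # Step 2: reconfigure the matrix so the backup LSP is connected.
    cross_connects[in_link] = out_link
    return cross_connects


label_table = {("A-F", "L1'"): ("F-C", "L2'")}
print(packet_node_forward(label_table, "A-F", "L1'"))

matrix = {"A-F": "F-D", "D-F": "F-C"}
extra = [("A-F", "F-D"), ("D-F", "F-C")]
print(transport_node_restore(matrix, extra, ("A-F", "F-C")))
```

   In this sketch the packet node needs no fault notification to forward
   correctly, whereas the transport node cannot act until it is told
   which backup LSP to connect.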
    
    
4. Signaling versus Flooding Notification for Transport Networks 
    
   In transport networks, after the fault localization step, there are 
   several options that can be used for transmitting information about 
   the fault.  A possible choice is to make use of control plane 
   signaling (sending RSVP-TE Notify messages as per [3]).  This is the 
   approach used in [4] where each LSP end-node sends a Notify message 
   to its corresponding end-node and receives an ACK back.  Another 
   option is to use flooding, where the detecting node floods the 
   network with information about the fault.  We describe both 
   mechanisms, and compare and contrast them in terms of speed and 
   complexity. 
    
4.1 Signaling-Based Notification  
    
   Control plane signaling can be used to notify nodes of a failure and 
   recover from that failure [4].  In the case of signaling, the process 
   of recovery from a link failure is briefly as follows.  
    
   A node that detects a failure notifies the LSP sources through the 
   following steps: 
   - Detect all LSPs that are affected by a link failure 
   - Send a failure indication message to the source of each identified 
     LSP 
   - Intermediate nodes that receive the message forward it on to 
     the LSP source node 
    
   When each LSP source node receives the failure indication message, it 
   performs the following:   
   - It sends a failure acknowledgement message to the detecting node.  
     Intermediate nodes that receive that message forward it on to 
     the originating node. 
   - It sends an end-to-end switchover request message to the LSP 
     destination node along the protection path, with information about 
     the LSP that is to be recovered 
    
   The LSP destination node sends an end-to-end switchover response 
   message back to the LSP source node along the protection path. 
 
   Upon receipt of the response message, the LSP source node starts 
   sending data on the protection path. 
    
   In the case of simple 1:1 protection, the amount of messaging in the 
   above scheme can be kept small.  This is not the case for shared-mesh 
   restoration, however.  As an example, consider a network with 100 
   nodes and 200 fiber links, with an average path length of 10 hops; a 
   cut of a fiber that carries eighty wavelengths will lead to the 
   generation of the following number of messages: 
   - 80 messages (end-to-end failure indication) sent by the detecting 
     node to the LSP source nodes 
   - 80 messages (end-to-end failure acknowledgement) sent by LSP 
     source nodes for each of the previous messages 
   - 80 messages (end-to-end switchover request) from each LSP source 
     node to each LSP destination node 
   - 80 messages (end-to-end switchover response) from the LSP 
     destination node to the LSP source node 
    
   Failure indication and acknowledgement messages travel an average of 
   five hops, while end-to-end switchover requests and responses travel 
   an average of ten hops.  This simple scenario generates 80*5 + 80*5 + 
   80*10 + 80*10 = 2400 message hops.  This calculation does not take 
   into account any acknowledgements needed to ensure reliable 
   transmission.  In general, for mesh networks with shared restoration, 
   the number of messages needed to recover from failures can be very 
   large and may lead to notification storms [5]. 
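   The arithmetic above can be reproduced with a short Python sketch.
   The function name and parameters are illustrative assumptions; the
   values are those of the example (80 affected LSPs, 5 hops for
   indication and acknowledgement, 10 hops for switchover request and
   response).

```python
def signaling_message_hops(lsps, notify_hops, switchover_hops):
    """Total message hops for per-LSP signaling, ignoring the
    acknowledgements needed for reliable transmission."""
    indication = lsps * notify_hops        # detecting node -> LSP source
    acknowledgement = lsps * notify_hops   # LSP source -> detecting node
    request = lsps * switchover_hops       # LSP source -> LSP destination
    response = lsps * switchover_hops      # LSP destination -> LSP source
    return indication + acknowledgement + request + response


print(signaling_message_hops(80, 5, 10))  # 2400
```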
    
   In addressing this point, reference [4] recommends making use of link 
   bundling [6] to decrease the messaging need.  In that respect, when 
   "the working and protection links are mapped to component links, and 
   the labels are the same on the working and protection links, it may 
   be possible to change the component links without needing to re-
   signal each individual LSP" [4].  This condition may only be 
   applicable in a few select cases. 
    
   Another issue is the length of time it takes to finish the process.  
   Notification time is crucial to the recovery process; thus 
   lengthening that time is detrimental to speedy recovery. 
    
4.2 Flooding-Based Notification 
    
   An alternative approach to address the issue of messaging is to use 
   flooding.  Instead of sending per-LSP notification and initiating 
   per-LSP recovery at each LSP source node, the node that detects a 
   failure (e.g. transponder failure or fiber cut) notifies all nodes of 
   the network.  Nodes that are concerned with the recovery take the 
   actions required of them while others forward the messages on with no 
   extra action but knowledge about the resource failure in order to 
   maintain an accurate picture of resource availability. 
    
   One such implementation of flooding is by using a link state routing 
   protocol such as Open Shortest Path First (OSPF).  The usual link-
   state protocol floods advertisements periodically.  In fact, OSPF 
   requires that Link State Advertisements (LSAs) be refreshed every 
   1800 seconds [7] and otherwise be expired in 3600 seconds.  Flooding 
   frequency is crucial to the stability of the network, since 
   increasing it may lead to excessive messaging and a larger number of 
   retransmissions and ACKs.  In the case of recovery from link failure 
   in data networks, this may not be a problem and using OSPF-based 
   flooding could be a good solution that decreases the amount of 
   messaging relative to signaling. 
    
   A flooding method for transport networks, however, needs to add 
   another dimension to the flooding efficiency: the speed of 
   notification.  Thus, a solution that applies to transport networks 
   needs to be developed.   
    
   As discussed in Section 3, intermediate nodes in transport networks 
   must learn of a fault before they can reconfigure, so flooding needs 
   to occur much faster than OSPF hold-off timers allow.  In addition, 
   OSPF flooding is heavy and carries a variety of maintenance 
   information.  Relying on it would mean using a protocol that is 
   engineered by design to be slow and reliable in order to deliver 
   time-critical fault information. 
    
   Time-constrained flooding, which we call expedited flooding, has the 
   ability to deliver a lightweight solution to fault notification.  
   Expedited flooding is used to notify nodes of a fault.  When a fault 
   is corrected, the repair notification can be sent at a slower pace.  
   This is done in order to minimize the possibility of fluttering, 
   which in itself may lead 
   to network meltdowns [8].  The advantage of expedited flooding is the 
   ability to meet requirements for time constraints. 
    
   In any case (whether using OSPF-based flooding or a new expedited 
   flooding mechanism), the number of messages needed versus signaling 
   is substantially decreased.  For the example cited in Section 4.1, 
   the number of messages is the number of fibers (200). This results in 
   a reduction in messaging to less than a 10th of the messaging used in 
   signaling. 
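   As a rough check of this ratio, the following Python sketch compares
   the two counts for the example network.  It assumes, as the text
   does, that flooding costs one message per fiber link; the helper
   name is hypothetical.

```python
def flooding_messages(num_links):
    # Assumption taken from the example: one notification per fiber link.
    return num_links


# Signaling count for 80 LSPs, as in the example of Section 4.1.
signaling_hops = 80 * 5 + 80 * 5 + 80 * 10 + 80 * 10
flooded = flooding_messages(200)

# Flooded messages, signaling message hops, and their ratio.
print(flooded, signaling_hops, flooded / signaling_hops)
```

   The ratio is 200/2400, or one twelfth, which is consistent with the
   "less than a 10th" figure quoted above.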
    
4.3 Comparison Between Signaling- and Flooding-Based Notification 
    
   In this section, we present a theoretical comparative analysis of the 
   messaging needs of signaling and flooding.  We compare two metrics: 
   keeping to time bounds and the number of messages generated in the 
   worst-case scenario. 
    
 
   We consider a control plane network graph G = (N, A), where N is the 
   set of nodes and A is the set of control channel links; n = |N| and m 
   = |A|.  We also consider the set B of data-plane links.  We consider 
   a mesh DWDM (Dense Wavelength Division Multiplexing) network with L 
   wavelengths per link, and look at unidirectional paths for 
   simplicity.  The worst-case scenario for signaling occurs when the 
   protection LSPs do not share an ingress or egress.  The worst-case 
   scenario for flooding occurs when the control plane connectedness is 
   very sparse. 
    
   In comparing messaging needs, we consider the scenarios of signaling-
   based notification in Figure 3 and flooding-based notification of 
   Figure 4. 
    
   In Figure 3, B detects a link fault between B and C and sends a 
   Notify message towards LSP ingress S (arrows 1 and 2).  S sends an 
   acknowledgement back (arrows 3' and 4') and starts an RSVP-TE 
   handshake process with the destination, the LSP egress T (arrows 3 
   through 16).  The notation n' indicates that the message is sent 
   asynchronously at step n.  Ingress S sends messages 3 and 3' 
   independently of each other. 
    
   In Figure 4, flooding can only follow the same path as signaling.  In 
   that case, B sends a notification message to the network that reaches 
   S after steps 1 and 2.  The notification message is forwarded on to T 
   in steps 4 through 9.  Acknowledgements (n') are sent 
   asynchronously between every node pair.  After T has received the 
   message, no further action is required, though the notification 
   message is forwarded to the remaining nodes.  Node S knows at what 
   time to start sending data on the activated protection LSP and does 
   so. 
    
   This example shows that the theoretical messaging steps needed in the 
   case of flooding are smaller than those of signaling in all cases, 
   and thus lead to a shortened recovery time.   If we assume that a 
   control channel associated with the protection LSP has a path length 
   of len(cc), and that the length of the path that the Notify message 
   travels is len(notification), the maximum number of messages (with 
   no consideration to the acknowledgements) for signaling and flooding 
   respectively to finish the notification steps are: 
    
     o Maximum messages(signaling) = len(notification) + 2 * len(cc) 
      
     o Maximum messages(flooding) = len(notification) + len(cc) 
    
   In the case of flooding, the number of messages is on average 
   smaller, while it is fixed in the case of signaling. 
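   The two bounds can be checked against Figures 3 and 4 with a small
   Python sketch.  The function names are illustrative; the hop counts
   are read off the figures, where the Notify message travels 2 hops
   (B to A to S) and the control channel along the protection path
   (S, E, F, G, H, I, J, T) has 7 hops.

```python
def max_messages_signaling(len_notification, len_cc):
    # Notification to the ingress, then a request and a response
    # along the control channel of the protection path.
    return len_notification + 2 * len_cc


def max_messages_flooding(len_notification, len_cc):
    # Notification to the ingress, then one pass along the
    # control channel of the protection path.
    return len_notification + len_cc


print(max_messages_signaling(2, 7))  # 16
print(max_messages_flooding(2, 7))   # 9
```

   These values match the highest arrow numbers in the two figures (16
   for signaling and 9 for flooding).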
    
    
            3' ->    4' -> 
               <- 2     <- 1    
         -----    -----    -----     -----    -----    -----  
         |   |    |   |    |   |     |   |    |   |    |   |  
         | S |----| A |----| B |--/--| C |----| D |----| T |  
       3 |   |    |   |    |   |     |   |    |   |    |   |    10 
         -----    -----    -----     -----    -----    -----  
    ^  |   |                                             |   ^   | 
    |  v   |                                             |   |   v 
         -----    -----    -----     -----    -----    ----- 
    16   |   |    |   |    |   |     |   |    |   |    |   | 9 
         | E |----| F |----| G |-----| H |----| I |----| J | 
         |   |    |   |    |   |     |   |    |   |    |   | 
         -----    -----    -----     -----    -----    ----- 
             4 ->     5 ->     6 ->      7 ->     8 ->  
               <- 15    <- 14    <- 13     <- 12    <- 11 
    
     Figure 3. Signaling-Based Fault Notification Messages 
      
      
            3' ->    2' ->             12' ->   11' -> 
               <- 2     <- 1               <- 11    <- 10 
         -----    -----    -----     -----    -----    -----  
         |   |    |   |    |   |     |   |    |   |    |   |  
         | S |----| A |----| B |--/--| C |----| D |----| T |  
       3 |   |    |   |    |   |     |   |    |   |    |   |    10' 
         -----    -----    -----     -----    -----    -----  
   ^   |   |                                             |   ^   | 
   |   v   |                                             |   |   V 
         -----    -----    -----     -----    -----    ----- 
   4'    |   |    |   |    |   |     |   |    |   |    |   | 9   
         | E |----| F |----| G |-----| H |----| I |----| J | 
         |   |    |   |    |   |     |   |    |   |    |   | 
         -----    -----    -----     -----    -----    ----- 
             4 ->     5 ->     6 ->      7 ->     8 ->  
               <- 5'    <- 6'    <- 7'     <- 8'    <- 9' 
    
     Figure 4. Worst-Case Scenario for Flooding-Based Fault Notification 
        Messages 
    
    
   We now consider the ability to keep to strict time bounds using 
   signaling and flooding, respectively, via the scenarios shown in 
   Figure 5 and Figure 6.  In these scenarios, the working LSPs are (C, 
   D, E, F), (B, C, D, E, F) and (A, B, C, D, E, F).  The network uses 
   path protection, so the protection LSPs of the aforementioned working 
   LSPs are (C, G, H, I, F), (B, G, H, I, F) and (A, G, H, I, F) 
   respectively.  In both figures, for simplicity, we only show the 
   forward messages, i.e. the notification and the reservation messages, 
   and not the acknowledgement or the RSVP-TE Path messages.   
    
    
               <- 3    <- 2     <- 1    
                       <- 2*    <- 1* 
                                <- 1"  
         -----    -----    -----    -----     -----    -----  
         |   |----|   |----|   |----|   |---/-|   |----|   |  
         | A |    | B |----| C |----| D |--/--| E |----| F |  
         |   |    |   |    |   |----|   |-/---|   |----|   |  
         -----    -----    -----    -----     -----    -----  
        4  |        | 3*     | 2"                       |||  ^  ^  ^ 
        |  |        | |      | |                        |||  |  |  | 
        v  |        | v      | v    -----     -----    ----- 5" 6* 7 
           |        |        -------|   |-----|   |----|   |         
           |        ----------------| G |-----| H |----| I | 
           -------------------------|   |-----|   |----|   | 
                                    -----     -----    ----- 
                                        3" ->    4" ->  
                                        4* ->    5* ->    
                                        5  ->    6  -> 
    
     Figure 5. Loose Time Bounds in Signaling-Based Fault Notification 
      
    
   In Figure 5, the cut of fiber (D, E) results in the loss of three 
   LSPs and subsequent notification of nodes responsible for activating 
   the backup LSPs for each of these working LSPs.  Message sets i, i* 
   and i" each recover one of the LSPs. 
    
   In this scenario, it is possible for messages 2", 3* and 4 to arrive 
   at the control plane of G at the same time, and their arrival order 
   determines the order in which the LSPs are recovered.  Therefore, a 
   situation arises in which the subsequent messages 3", 4* and 5, then 
   4", 5* and 6, etc. arrive simultaneously at the control planes of 
   the subsequent nodes and so have to be buffered at each of them.  
   This leads to scalability problems when the number of LSPs that need 
   to be recovered grows. 
    
   In general, to accommodate the queuing of signaling messages at the 
   nodes, it would be necessary to take into consideration the maximum 
   number of LSPs that may fail at the same time, in our case, L LSPs 
   (the number of wavelengths).  Subsequently, at each node, one would 
   have to account for the maximum queuing delays experienced by the 
   signaling messages.  This increases the recovery time substantially.  
   When equipment is upgraded or more wavelengths are added, the earlier 
   calculations to account for buffering delay will not hold.  
   Therefore, the network either has to allow a longer delay or a 
   possible reconfiguration of all nodes, or is not upgradeable.  To 
   satisfy a notification time bound, the queuing delay calculations 
   would need to assume an in-band control channel and priority queuing 
   for fault notification. 
    
     o Max queuing delay(signaling) = (L-1) * [ len(notification) + 2 * 
        len(cc) ] 
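   The bound can be evaluated with a short Python sketch to show how it
   grows with the number of wavelengths.  The function name is
   illustrative, the path lengths (2 and 7 hops) are hypothetical
   example values, and the result is expressed in units of one
   per-message queuing slot, which is an assumption of this sketch.

```python
def max_queuing_delay_signaling(wavelengths, len_notification, len_cc):
    """Worst-case queuing delay: each message may wait behind the
    messages of the other (L - 1) failed LSPs at every hop."""
    return (wavelengths - 1) * (len_notification + 2 * len_cc)


# Doubling the wavelengths per link roughly doubles the bound.
print(max_queuing_delay_signaling(40, 2, 7))  # 624
print(max_queuing_delay_signaling(80, 2, 7))  # 1264
```

   This illustrates why delay budgets computed for one wavelength count
   no longer hold after an upgrade.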
    
    
               <- 3     <- 2     <- 1               <- 9 
         -----    -----    -----    -----     -----    -----  
         |   |----|   |----|   |----|   |---/-|   |----|   |  
         | A |    | B |----| C |----| D |--/--| E |----| F |  
         |   |    |   |    |   |----|   |-/---|   |----|   |  
         -----    -----    -----    -----     -----    -----  
        4  |        | 5      | 6                        |||  ^  
        |  |        | |      | |                        |||  |  
        v  |        | v      | v    -----     -----    ----- 8  
           |        |        -------|   |-----|   |----|   |         
           |        ----------------| G |-----| H |----| I | 
           -------------------------|   |-----|   |----|   | 
                                    -----     -----    ----- 
                                         7 ->     8 ->  
    
     Figure 6. Strict Time Bounds for Flooding-Based Fault Notification 
    
    
   With flooding, the messaging is much simpler.  Considering the 
   scenario of Figure 6, only one message per control channel is 
   exchanged.  In the worst-case scenario where G receives messages from 
   nodes A, B and C simultaneously, after it has processed one message, 
   messages from other nodes can be safely discarded after minimal 
   processing. Thus, the messages experience no queuing delays in the 
   case of single faults: 
    
     o Max queuing delay(flooding) = 0 
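
   As a rough illustration of the two bounds above, the following
   Python sketch evaluates both formulas.  It assumes that
   len(notification) and len(cc) are durations expressed in the same
   time unit.  The function names and the numeric values (40 and 80
   wavelengths, 0.1 ms, 0.5 ms) are hypothetical and are chosen only to
   show how the signaling bound grows with L.

```python
def max_queuing_delay_signaling(num_lsps, len_notification, len_cc):
    """Worst-case queuing delay for per-LSP signaling.

    Implements (L-1) * [len(notification) + 2 * len(cc)].
    Both length terms are treated as durations in the same unit
    (an assumption made for this sketch).
    """
    if num_lsps < 1:
        raise ValueError("num_lsps must be at least 1")
    return (num_lsps - 1) * (len_notification + 2 * len_cc)


def max_queuing_delay_flooding():
    """Worst-case queuing delay for flooding under a single fault."""
    return 0


# Hypothetical values, in milliseconds, for illustration only.
for wavelengths in (40, 80):
    delay = max_queuing_delay_signaling(wavelengths, 0.1, 0.5)
    print(f"L={wavelengths}: signaling {delay:.1f} ms, "
          f"flooding {max_queuing_delay_flooding()} ms")
```

   Doubling the number of wavelengths roughly doubles the signaling
   bound, while the flooding bound for a single fault is unchanged.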
    
4.4 Alternative Notification Method 
    
   Ideally, when a fault affecting multiple LSPs occurs, the 
   notification should only target the nodes that are involved in the 
   recovery procedure, including the ingress and egress nodes of the
   different LSPs and the nodes on the recovery LSPs.  This would ensure 
   that nodes that are not affected by the failure do not have to 
   perform any processing.  This mechanism could be implemented through 
   multicast addressing.  However, keeping multicast trees up to date as
   the topology changes, and storing the data needed to maintain them,
   adds undue complexity and makes multicast notification impractical.
    
    
5. Expedited Flooding for Notification 
    
   Given the reasoning presented in Section 4, a rapid, lightweight
   flooding mechanism (or what we call an "expedited flooding"
   mechanism) is a promising candidate for achieving time-bounded
   notification in transport networks.
    
   Such a scheme would allow each node to rapidly forward notification 
   messages (after performing some minimal operations on them), and 
   perform any required processing and attendant reconfiguration in the 
   background.  In this manner, the fault notification propagates 
   rapidly through the network, eventually reaching the edge nodes (or 
   nodes responsible for restoration action), while at the same time 
   allowing the intermediate nodes along the path of the backup LSP to 
   reconfigure themselves. 
    
   Operationally, the scheme would work in the following manner.  A node 
   that detects a fault sends a notification packet to all its 
   neighbors, containing the identity of the link, node, or interface at 
   fault.  Each node upon receiving a notification packet on an incoming 
   link performs a local check to ensure that the same fault has not 
   been reported earlier.  
   - If it has, the node does not need to take any further action on 
     this packet and can discard it. 
   - If it has not, the node immediately broadcasts the incoming packet 
     on all its remaining outgoing interfaces (after possibly updating 
     a TTL field).  
    
   At the same time, the node examines its routing and TE databases to 
   ascertain whether it is on the backup path of (a) working LSP(s) 
   affected by the fault reported by the notification packet. If it is 
   not, the node has no further work to do.  If it is, the node takes 
   the appropriate action (such as dropping extra traffic or low 
   priority traffic and reconfiguring its cross-connect) to be able to 
   forward the traffic arriving on the backup LSP corresponding to the 
   affected working LSP(s). 
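
   The per-node behavior described above can be sketched in Python as
   follows.  This is a minimal illustration, not a protocol definition:
   the Notification and Node types, the interface names, and the
   representation of the TE database as a set of fault identifiers are
   all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Notification:
    fault_id: str   # identity of the failed link, node, or interface
    ttl: int        # remaining hops (the TTL update is optional in the text)


@dataclass
class Node:
    name: str
    interfaces: list                                   # attached interface names
    backup_faults: set = field(default_factory=set)    # faults whose backup LSPs cross this node
    seen: set = field(default_factory=set)             # fault ids already reported
    reconfigured: list = field(default_factory=list)   # faults acted upon locally

    def receive(self, msg, incoming):
        """Return the (interface, message) pairs to forward."""
        if msg.fault_id in self.seen:
            return []          # duplicate: discard after minimal processing
        self.seen.add(msg.fault_id)

        # Forward first, so the notification keeps moving.
        out = []
        if msg.ttl > 1:
            fwd = Notification(msg.fault_id, msg.ttl - 1)
            out = [(ifc, fwd) for ifc in self.interfaces if ifc != incoming]

        # Local processing; a real node would do this in the background.
        if msg.fault_id in self.backup_faults:
            self.reconfigured.append(msg.fault_id)
        return out


# Node G from Figure 6, with hypothetical interface and fault names.
g = Node("G", ["to_A", "to_B", "to_C", "to_H"], backup_faults={"link D-E"})
print(g.receive(Notification("link D-E", ttl=8), incoming="to_C"))
print(g.receive(Notification("link D-E", ttl=8), incoming="to_B"))  # duplicate
```

   The first call forwards the notification on every interface except
   the one it arrived on; the second is recognized as a duplicate and
   produces no further messages.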
    
   The networking community has shown considerable interest in fast
   restoration and recovery.  Proposals include flooding over one of
   many parallel links between neighbors [9], [10], processing Hello
   messages at higher priority within the network and at the node [11],
   and flooding over spanning trees [12].
    
   Expedited flooding can also take advantage of features like flooding 
   over one of many parallel links between neighbors [9], [10]. 
   Specifically, if two systems are connected by multiple parallel 
   point-to-point links, flooding can be done over only one such link.  
   If the link designated for flooding goes down and at least one other 
   parallel link is still up, a different parallel link is designated 
   for flooding.  
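
   A minimal Python sketch of this designation logic follows.  The
   class name and the selection rule (lowest link id that is up) are
   assumptions made for illustration; the text above does not prescribe
   how the replacement link is chosen.

```python
class ParallelLinkGroup:
    """Parallel point-to-point links to one neighbor; one carries flooding."""

    def __init__(self, link_ids):
        self.up = {link_id: True for link_id in link_ids}
        self.designated = self._pick()

    def _pick(self):
        # Assumed selection rule: lowest link id that is currently up.
        candidates = sorted(l for l, is_up in self.up.items() if is_up)
        return candidates[0] if candidates else None

    def set_state(self, link_id, is_up):
        """Record a link state change and return the flooding link."""
        self.up[link_id] = is_up
        # Re-designate only if the current flooding link is unusable.
        if self.designated is None or not self.up[self.designated]:
            self.designated = self._pick()
        return self.designated


group = ParallelLinkGroup([1, 2, 3])
print(group.designated)            # flooding link before any failure
print(group.set_state(1, False))   # flooding link after link 1 goes down
```

   Note that in this sketch a repaired link does not reclaim the
   flooding role; the designation changes only when the current
   flooding link fails.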
    
5.1 Operation upon Fault Repair 
    
   Once a previously detected fault is corrected or repaired, it is 
   important to notify the network nodes of this event.  The same 
   process that notifies the nodes of a fault event should also notify 
   them of the fault recovery event.  This ensures consistency in 
   network state.  Since information about the recovery of a resource is 
   not time critical, in this case, the detecting node can hold off 
   sending a fault recovered flooding message for some appropriate 
   amount of time.   
    
   For example, if one were using a link-state routing protocol for 
   expedited flooding, the fault recovered message would be sent to the 
   network using regular protocol flooding without bypassing its hold-
   off mechanisms.   
    
   This allows the detecting node to dependably track the state (up or
   down) of a resource.  In the event that the detecting node observes
   the resource to be oscillating between the "up" and "down" states, it
   would know that flapping or a misconfiguration may exist in the
   network and could suppress the expedited flooding mechanism.  It
   could then either invoke a remedial action at other layers or raise
   an alarm.  The specific action that nodes take upon receiving a
   "fault recovered" message is based on policy.
    
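   One way a detecting node might combine the hold-off and the flap
   check is sketched below in Python.  The class, the action names, and
   the three parameters (hold_off, flap_window, flap_limit) are
   hypothetical; suitable values and the resulting actions are a matter
   of policy.

```python
class ResourceTracker:
    """Tracks up/down transitions of one resource at the detecting node."""

    def __init__(self, hold_off, flap_window, flap_limit):
        self.hold_off = hold_off        # seconds to wait before "fault recovered"
        self.flap_window = flap_window  # seconds over which transitions are counted
        self.flap_limit = flap_limit    # transitions in the window that count as flapping
        self.transitions = []           # timestamps of recent state changes
        self.state = "up"
        self.since = 0.0
        self.recovery_pending = False

    def observe(self, state, now):
        """Record the observed state ("up" or "down") at time `now`."""
        if state != self.state:
            self.state, self.since = state, now
            self.transitions.append(now)
            self.recovery_pending = (state == "up")

    def action(self, now):
        """Decide what the detecting node should do at time `now`."""
        self.transitions = [t for t in self.transitions
                            if now - t <= self.flap_window]
        if len(self.transitions) >= self.flap_limit:
            return "suppress-and-alarm"      # resource is oscillating
        if self.state == "down":
            return "notify-fault"            # expedited, no hold-off
        if self.recovery_pending and now - self.since >= self.hold_off:
            self.recovery_pending = False
            return "send-fault-recovered"    # regular flooding, after hold-off
        return "wait"


tracker = ResourceTracker(hold_off=10, flap_window=60, flap_limit=4)
tracker.observe("down", now=0)
print(tracker.action(now=0))
tracker.observe("up", now=5)
print(tracker.action(now=6))
print(tracker.action(now=15))
```

   The sketch does not suppress repeated "notify-fault" results while a
   resource stays down; a real implementation would send the expedited
   notification once per fault.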
5.2 Impact of Expedited Flooding on Network Operation 
    
   It is important to realize that the time-bounded recovery application 
   requires only a light-weight flooding scheme.  Specifically, normal 
   flooding for link state advertisements needs to guarantee convergence 
   of the link-state routing protocol.  It is therefore vital that link
   state flooding deliver every link state PDU (an LSA in OSPF, or an
   LSP in IS-IS) originated after the initial topology database
   synchronization between neighbors to all routers within the flooding
   scope (an area or the whole AS, depending on the protocol and the
   type of the link state PDU).  The expedited flooding mechanisms
   discussed here, on the other hand, are one-shot notifications that
   are expedited only when a link failure is detected.
    
   Because expedited fault notification messages are sent only when a
   failure occurs in the network, the impact of the expedited flooding
   discussed here on the operation of normal link state flooding is
   minimal.  For the same reason, the average message rate for expedited
   flooding is negligible.  Hence, the expedited flooding messages
   discussed here can bypass the hold-off mechanisms typically used for
   dampening in routing protocols.
    
   The processing overhead for a node to find whether or not it has 
   heard about the fault reported by a received notification message is 
   also quite small. (A node can determine whether or not it is on the 
   backup path of a working LSP(s) affected by the fault reported by a 
   notification packet by simply looking at the local information on a 
   line card.)  
    
5.3 Graceful Degradation 
    
   It is important, from a provider's perspective, to ensure that the
   network operates stably, and that in the case of unanticipated 
   failures (multiple, near-simultaneous fiber cuts, for example) 
   network performance degrades gracefully, while maintaining stability. 
   In this section, we outline two key issues and their solutions: the 
   loss of notification messages and multiple simultaneous fiber cuts. 
    
5.3.1 Loss of Notification Messages 
    
   A notification message could be lost for a variety of reasons,
   including lack of buffer space in the control plane, packet errors,
   software bugs, and general misconfiguration.  If a notification
   message or its acknowledgement is lost, the sending control plane
   does not receive an acknowledgement within the expected period of
   time.  It therefore waits for a set interval before retransmitting
   the message; this interval helps keep the network stable.  After a
   number of unsuccessful retries, the control channel would be
   considered DOWN and no further retransmission would be attempted.
   In that event, the node that could not successfully send the
   notification message would raise an alarm and try to notify the LSP
   end nodes to stop transmitting data.
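
   The retransmission behavior can be sketched in Python as follows.
   The function and its parameters are hypothetical: send and
   wait_for_ack stand in for whatever transport the control channel
   uses, and the retry count and interval are left to the operator.

```python
def send_with_retries(send, wait_for_ack, max_retries, retry_interval):
    """Send a notification, retransmitting until acknowledged.

    send() transmits the message once.  wait_for_ack(timeout) returns
    True if an acknowledgement arrived within `timeout` seconds.  Both
    are caller-supplied callables (assumptions of this sketch).
    Returns (status, attempts) where status is "ACKED" or "DOWN".
    """
    attempts = 0
    while attempts <= max_retries:
        send()
        attempts += 1
        if wait_for_ack(retry_interval):
            return "ACKED", attempts
    # Retries exhausted: the control channel is considered DOWN and no
    # further retransmission is tried.  The caller would then raise an
    # alarm and try to notify the LSP end nodes.
    return "DOWN", attempts


# Simulated channel: the first two transmissions go unacknowledged.
acks = iter([False, False, True])
sent = []
print(send_with_retries(lambda: sent.append("notify"),
                        lambda timeout: next(acks),
                        max_retries=3, retry_interval=0.05))
```

   Because the acknowledgement wait doubles as the retransmission
   interval, the worst-case time before the channel is declared DOWN in
   this sketch is (max_retries + 1) * retry_interval.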
    
   By implementing diversity in the network, the operator offers 
   mitigation strategies against such errors and achieves fast 
   notification and recovery.  A network operator can use the expected 
   probability of error in notification messages when calculating 
   expected recovery times to be able to degrade gracefully. 
    

5.3.2 Multiple Fiber Cuts 
    
   When multiple fiber cuts occur almost simultaneously and the recovery 
   LSPs share a protection resource, notification of that event occurs 
   while the network has not had enough time to compute updated backup 
   LSPs. 
    
   This occurrence is considered to be extremely unlikely: a US-based
   carrier mentioned that simultaneous multiple fiber cuts account for
   less than 1% of the total number of fiber faults.  The probability
   of the cuts affecting the same protection resource is also small
   (generally 1 to 10%), making this a very low probability event
   (under these figures, fewer than about 0.1% of fiber faults).
    
   The time-bounded nature of the notification allows nodes to
   intelligently deduce the occurrence of multiple faults and,
   depending on the timing, enables them either to activate the
   correct backup LSP(s) or to block the activation of all backup LSPs,
   thus degrading gracefully.

   To solve the multiple fiber cut problem, a multi-pronged tie-breaking
   strategy such as the following can be adopted:
    
   a. LSP priority: A node considers the priority of the LSPs that need
      to be recovered and will bump the LSPs with the lower priority.
      This resolves the contention when the LSPs have different
      priorities.

   b. In the event the LSPs have the same priority, the node will
      select the LSP that originates at the edge node with the smallest
      or largest node id.

   c. If a node receives subsequent fault notifications after the
      activation of a given backup LSP, and the tie-breaking rules
      above dictate a different decision, the node will disable/tear
      down all backup LSPs and inform the LSPs' respective source
      nodes via a signaling message.
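
   The three rules can be sketched in Python as follows.  The BackupLsp
   type, the convention that a lower number means a higher priority,
   and the reading of "originates at an edge" as the ingress node id
   are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupLsp:
    name: str
    priority: int         # assumed convention: lower number = higher priority
    ingress_node_id: int  # id of the edge node where the LSP originates


def select_lsp(candidates, prefer_smallest_id=True):
    """Rules (a) and (b): pick the LSP that keeps the shared resource."""
    def key(lsp):
        node_key = lsp.ingress_node_id
        if not prefer_smallest_id:
            node_key = -node_key
        return (lsp.priority, node_key)
    return min(candidates, key=key)


def on_notification(active, contenders):
    """Rule (c): re-run the tie-break when another fault is reported.

    `active` is the backup LSP already activated (or None), and
    `contenders` are all backup LSPs now competing for the resource.
    Returns (decision, lsps).
    """
    winner = select_lsp(contenders)
    if active is None:
        return "activate", [winner]
    if winner == active:
        return "keep", [active]
    # The earlier decision no longer holds: tear down all backup LSPs
    # so that their source nodes can be informed via signaling.
    return "teardown-all", list(contenders)


lsp_a = BackupLsp("a", priority=1, ingress_node_id=7)
lsp_b = BackupLsp("b", priority=1, ingress_node_id=3)
print(on_notification(None, [lsp_a]))
print(on_notification(lsp_a, [lsp_a, lsp_b]))
```

   In the usage example, LSP "a" is activated after the first fault.
   When the second fault brings in LSP "b", which wins the tie-break on
   node id, the earlier activation is no longer consistent with the
   rules and all backup LSPs are torn down.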
    
    
6. Conclusion 
    
   This draft discussed the issue of time-constrained recovery in 
   optical transport networks.  We highlighted important issues to 
   demonstrate the appropriateness of using a flooding-based approach to 
   notification, including scalability concerns and time-bounded 
   recovery.  We observed that traditional flooding mechanisms, such as 
   OSPF flooding, if used unmodified would not be appropriate for such 
   critical failure notification.  Therefore, we highlighted the need 
   for a fast flooding mechanism and outlined how its operation would 
   have minimal impact on the network.  
 
7. Intellectual Property Considerations 
    
   This section is taken from Section 10.4 of RFC2026 [1]. 
    
   The IETF takes no position regarding the validity or scope of any 
   intellectual property or other rights that might be claimed to 
   pertain to the implementation or use of the technology described in 
   this document or the extent to which any license under such rights 
   might or might not be available; neither does it represent that it 
   has made any effort to identify any such rights. Information on the 
   IETF's procedures with respect to rights in standards-track and 
   standards-related documentation can be found in BCP-11. Copies of 
   claims of rights made available for publication and any assurances of 
   licenses to be made available, or the result of an attempt made to 
   obtain a general license or permission for the use of such 
   proprietary rights by implementors or users of this specification can 
   be obtained from the IETF Secretariat. 
    
   The IETF invites any interested party to bring to its attention any 
   copyrights, patents or patent applications, or other proprietary 
   rights, which may cover technology that may be required to practice 
   this standard. Please address the information to the IETF Executive 
   Director. 


8. References
                     
   [1]  Bradner, S., "The Internet Standards Process -- Revision 3", BCP 
        9, IETF RFC 2026, October 1996. 
    
   [2]  Bradner, S., "Key words for use in RFCs to Indicate Requirement 
        Levels," BCP 14, IETF RFC 2119, March 1997. 
    
   [3]  Berger, L. (Editor) et al., "Generalized MPLS Signaling - RSVP-
        TE Extensions," IETF RFC 3473, January 2003. 
    
   [4]  Lang, J., et al (eds.), "RSVP-TE Extensions in support of End-
        to-End GMPLS-based Recovery," work in progress, September 2003. 
    
   [5]  Papadimitriou, D., Mannie, E. (eds.), "Analysis of Generalized 
        Multi-Protocol Label Switching (GMPLS)-based Recovery Mechanisms 
        (including Protection and Restoration)," work in progress, 
        September 2003. 
    
   [6]  Kompella, K., Rekhter, Y. and Berger, L., "Link Bundling in MPLS
        Traffic Engineering," work in progress, July 2002.
    
   [7]  Moy, J., "OSPF Version 2," IETF RFC 2328, April 1998. 
    
   [8]  Yu, J., "Scalable Routing Design Principles," IETF RFC 2791, 
        July 2000. 
    
   [9]  Zinin, A. and Shand, M., "Flooding Optimizations in Link-State
        Routing Protocols," work in progress.

   [10] Moy, J., "Flooding over Parallel Point-to-Point Links," work in
        progress.

   [11] Maunder, A. S. and Choudhury, G., "Explicit Marking and
        Prioritized Treatment of Specific IGP Packets for Faster IGP
        Convergence and Improved Network Scalability and Stability,"
        work in progress.

   [12] Choudhury, G. L. and Manral, V., "LSA Flooding Optimization
        Algorithms and Their Simulation Study,"
        draft-choudhury-manral-flooding-simulation-00.txt.


9. Authors' Addresses  
    
   Richard Rabbat                       
   Fujitsu Labs of America, Inc.          
   1240 E. Arques Ave, MS 345           
   Sunnyvale, CA 94085                  
   United States of America                
   Phone: +1-408-530-4537               
   Email: rabbat@fla.fujitsu.com        
    
   Vishal Sharma 
   Metanoia, Inc. 
   1600 Villa Street, Unit 352 
   Mtn. View, CA 94041 
   United States of America 
   Phone: +1-650-386-6723 
   Email: v.sharma@ieee.org 
    
   Zafar Ali 
   Cisco Systems Inc.  
   100 South Main St. #200   
   Ann Arbor, MI 48104   
   United States of America    
   Phone: +1-734-276-2459  
   Email: zali@cisco.com 


10. Full Copyright Statement 
    
   "Copyright (C) The Internet Society (2003). All Rights Reserved. 
   This document and translations of it may be copied and furnished to  
   others, and derivative works that comment on or otherwise explain it 
   or assist in its implementation may be prepared, copied, published 
   and distributed, in whole or in part, without restriction of any 
   kind, provided that the above copyright notice and this paragraph are 
   included on all such copies and derivative works. However, this 
   document itself may not be modified in any way, such as by removing 
   the copyright notice or references to the Internet Society or other 
   Internet organizations, except as needed for the purpose of 
   developing Internet standards in which case the procedures for 
   copyrights defined in the Internet Standards process must be 
   followed, or as required to translate it into languages other than 
   English. 
    
   The limited permissions granted above are perpetual and will not be 
   revoked by the Internet Society or its successors or assigns. 
    
   This document and the information contained herein is provided on an 
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING 
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING 
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION 
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF 
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE." 
    
      
Rabbat, R., et al        Expires - April 2004                [Page 20]