Routing area                                                    S. Hegde
Internet-Draft                                    Juniper Networks, Inc.
Intended status: Standards Track                               P. Sarkar
Expires: January 4, 2018                                      Individual
                                                            July 3, 2017


                   Micro-loop avoidance using SPRING
         draft-hegde-rtgwg-microloop-avoidance-using-spring-03

Abstract

   When there is a change in network topology either due to a link going
   down or due to a new link addition, all the nodes in the network need
   to get the complete view of the network and re-compute the routes.
   There will generally be a small time window when the forwarding state
   of each of the nodes is not synchronized.  This can result in
   transient loops in the network, leading to dropped traffic due to
   over-subscription of links.  Micro-looping is generally more harmful
   than simply dropping traffic on failed links, because it can cause
   control traffic to be dropped on an otherwise healthy link involved
   in the micro-loop.  This can lead to cascading adjacency failures or
   network meltdown.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 4, 2018.





Hegde & Sarkar           Expires January 4, 2018                [Page 1]

Internet-Draft      Microloop avoidance using SPRING           July 2017


Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Procedures for Micro-loop prevention  . . . . . . . . . . . .   3
   3.  Detailed Solution based on SPRING . . . . . . . . . . . . . .   5
     3.1.  Link-down event . . . . . . . . . . . . . . . . . . . . .   6
     3.2.  Link-up event . . . . . . . . . . . . . . . . . . . . . .  11
     3.3.  Computation of nearest PLR  . . . . . . . . . . . . . . .  12
       3.3.1.  Link down event . . . . . . . . . . . . . . . . . . .  12
       3.3.2.  Node down event . . . . . . . . . . . . . . . . . . .  12
     3.4.  Handling multiple network events  . . . . . . . . . . . .  13
       3.4.1.  Handling SRLG failures  . . . . . . . . . . . . . . .  13
     3.5.  Handling ECMP . . . . . . . . . . . . . . . . . . . . . .  15
     3.6.  Recognizing same network event  . . . . . . . . . . . . .  15
     3.7.  Partial deployment Considerations . . . . . . . . . . . .  15
   4.  Protocol Procedures . . . . . . . . . . . . . . . . . . . . .  17
     4.1.  OSPF  . . . . . . . . . . . . . . . . . . . . . . . . . .  17
     4.2.  ISIS  . . . . . . . . . . . . . . . . . . . . . . . . . .  17
     4.3.  Elements of procedure . . . . . . . . . . . . . . . . . .  18
   5.  Security Considerations . . . . . . . . . . . . . . . . . . .  18
   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  19
   7.  Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .  19
   8.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  19
     8.1.  Normative References  . . . . . . . . . . . . . . . . . .  19
     8.2.  Informative References  . . . . . . . . . . . . . . . . .  19
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  21

1.  Introduction

   Micro-loops are transient loops that occur during the period of time
   when some nodes have become aware of a topology change and have
   changed their forwarding tables in response, but slower routers have
   not yet modified their forwarding tables.  This document provides
   mechanisms to prevent micro-loops in the network in the event of a
   link up/down or metric change.  The micro-loop prevention mechanism
   uses the basic principles of near-side tunnelling as described in
   Section 6.2 of [RFC5715].

   Micro-loops can be formed involving the Points of Local Repair (PLRs)
   or nodes that are not directly connected to the link/node going down.
   The nodes that are not directly connected to the node/link going
   down/up are referred to as remote nodes.  The micro-loop prevention
   mechanism described in this document prevents possible micro-loops
   involving the remote nodes.  A new sub-TLV is defined in the ISIS
   router capability TLV [RFC4971] and the OSPF router capability TLV
   [RFC4970] for discovering support of this feature.  The details are
   described in Section 4.  The operational procedures for micro-loop
   prevention are described in Section 3.

2.  Procedures for Micro-loop prevention


      +----+ 10 +----+ 10 +----+  10   +----+ 10 +----+
      | S1 |----| R1 |----| S  |-------| E  |----| D1 |
      +----+    +----+    +----+       +----+    +----+
          \                  \          /
           \ 10               \ 100    / 60
            \                  \      /
             \   +----+         +----+
              +--| R2 |---------| R3 |
                 +----+    30   +----+
                  /
                 / 10
             +----+
             | S2 |
             +----+


                         Figure 1: Sample Network

   The topology shown in Figure 1 illustrates a sample network topology
   where micro-loops can occur.  The symmetric link metrics are shown in
   the diagram above.  In normal operation, traffic from S1 to D1 takes
   the path S1->R1->S->E->D1 and traffic from S2 to D1 takes the path
   S2->R2->S1->R1->S->E->D1.  When the S->E link goes down, traffic can
   loop between S1 and R2 when the FIB on S1 reflects the shortest path
   to D1 after the failure and the FIB on R2 reflects the shortest path
   to D1 before the failure.  The mechanisms described in
   [I-D.ietf-rtgwg-uloop-delay] do not address micro-loops involving
   nodes that are not directly attached to the link that has just gone
   down or come up.  For example, when the S->E link goes down, S and E
   are the Points of Local Repair (PLRs), and micro-loops formed between
   S1 and R2 are not handled.

   The basic principle of the solution is to send the traffic on
   tunnelled paths for a certain time period until all the nodes in the
   network process the event and update their forwarding plane.  When
   the link S->E goes down, all the nodes in the network tunnel the
   traffic to the nearest PLR.  The PLR S needs to maintain the backup
   path created using FRR ([RFC5286]) or other mechanisms until all
   other nodes in the network converge.  The PLR S forwards the traffic
   to the affected destinations via the back-up path until the
   convergence procedure is complete.  This document assumes 100% backup
   coverage for the destinations via various FRR mechanisms.  This
   document describes the procedures corresponding to the traffic flow
   from sources (S nodes) to the destination nodes (D nodes).  The
   procedures equally apply to the D nodes being source and S nodes
   being destination.

   As soon as a node learns of the topology change, it modifies its FIB
   to use loop-free tunnelled paths for the affected traffic, and it
   starts a "convergence delay timer".  When the "convergence delay
   timer" expires, the node modifies its FIB to use the SPF path based
   on the changed topology.  The use of tunnelled paths during the
   convergence period ensures that (barring other topology changes) all
   traffic affected by the topology change travels on a loop-free path.

   After all the nodes in the network converge to the actual SPF path,
   the PLR converges to the SPF path and updates its FIB.  This
   micro-loop prevention mechanism delays the time it takes for routing
   to converge to the optimal paths in the new topology by a factor of
   3, but the convergence time is deterministic and micro-loops are
   completely avoided.

   In principle, near-side tunnelling could be accomplished using labels
   distributed via LDP.  However, since the application requires that
   any given router have the potential to create a tunnel to nearly
   every other router in the IGP domain, a large number of targeted LDP
   sessions would be needed to learn the FEC-label bindings distributed
   by the PLRs.  SPRING [I-D.ietf-spring-segment-routing] provides a
   more efficient method for distributing shortest path labels for this
   application, since any router can compute the locally significant
   FEC-label bindings for any other router without the need for targeted
   LDP sessions.

   [RFC5715] describes other mechanisms for micro-loop prevention.
   Near-side tunnelling is better suited for deployment as it does not
   need additional computation or additional state maintenance in the
   network nodes.  Far-side tunnelling has the disadvantage that it
   requires the use of not-via addresses [RFC6981], which require
   additional address configuration on each node.  Per-destination
   non-micro-looping path computation is another approach to prevent
   micro-loops, but it is computationally intensive.

3.  Detailed Solution based on SPRING


           +----+
           | R4 | SRGB:1000-2000
           +----+ SID:9
            / \
        5  /   \ 5
          /     \         SRGB:1000-2000
   SID:1 /       \ SID:2   SID:3       SID:4     SID:5
      +----+ 10 +----+ 10 +----+  10   +----+ 10 +----+
      | S1 |----| R1 |----| S  |-------| E  |----| D1 |
      +----+    +----+    +----+       +----+    +----+
          \                  \          /
        10 \                  \ 100    / 60
            \  SRGB:1000-2000  \      /
             \   +----+         +----+
              +--| R2 |---------| R3 |SID:7
           SID:6 +----+    30   +----+SRGB:1000-2000
                  /
                 / 10
             +----+
             | S2 |SID:8
             +----+SRGB:1000-2000


                        Figure 2: Sample SR Network

   The above sample topology is provided with basic SPRING
   configurations of the SRGB and the SID index corresponding to each
   node.  Each node has an SRGB of 1000-2000 configured.  The same SRGB
   is used on all nodes to simplify the example; the procedures are
   equally applicable when different SRGBs are configured on different
   nodes.  Each node is provisioned with a MAX_CONVERGENCE_DELAY value
   that corresponds to its RIB-to-FIB convergence time.  Support for the
   micro-loop prevention feature and the MAX_CONVERGENCE_DELAY value are
   flooded across the IGP domain (ISIS level/OSPF area).  Each node in
   the IGP domain sets its MAX_CONVERGENCE_DELAY to the maximum of the
   values received in the domain.
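
   As an illustration only, the label arithmetic and the domain-wide
   delay selection described above can be sketched as follows.  The
   NodeAdvert type, its field names, the millisecond unit and the delay
   values are assumptions made for this sketch; they are not defined by
   this document.  The sketch assumes the usual SPRING convention that
   the label for a node-SID is the SRGB base of the receiving node plus
   the SID index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeAdvert:
    """What one node floods in this sketch (field names are hypothetical)."""
    name: str
    srgb_base: int                  # first label of the SRGB, e.g. 1000
    srgb_size: int                  # number of labels in the SRGB
    sid_index: int                  # node-SID index, e.g. 5 for D1
    max_convergence_delay_ms: int   # locally provisioned value (assumed unit)


def node_label(next_hop: NodeAdvert, target: NodeAdvert) -> int:
    """Label for target's node-SID as expected by next_hop: base + index."""
    if not 0 <= target.sid_index < next_hop.srgb_size:
        raise ValueError("SID index falls outside the next hop's SRGB")
    return next_hop.srgb_base + target.sid_index


def domain_max_convergence_delay(adverts: list[NodeAdvert]) -> int:
    """Every node adopts the largest MAX_CONVERGENCE_DELAY seen in the domain."""
    return max(a.max_convergence_delay_ms for a in adverts)


# Nodes from Figure 2; the delay values are invented for illustration.
S = NodeAdvert("S", 1000, 1001, 3, 400)
R1 = NodeAdvert("R1", 1000, 1001, 2, 250)
D1 = NodeAdvert("D1", 1000, 1001, 5, 600)

label_for_d1 = node_label(R1, D1)   # 1000 + 5
label_for_s = node_label(R1, S)     # 1000 + 3
domain_delay = domain_max_convergence_delay([S, R1, D1])
```

   With the SRGB of 1000-2000 used in Figure 2, this yields label 1005
   for D1 and label 1003 for S, which are the label values used in the
   examples of Section 3.1.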







3.1.  Link-down event

   When the S->E link goes down, all the nodes in the network receive
   the event via IGP database flooding.  Each node supporting the micro-
   loop prevention mechanism specified in this document SHOULD perform
   the steps below.

   1.  The PLRs (S and E) perform FRR local repair for destinations
       affected by the failure of the link.  Each computing node
       identifies the destinations affected by the topology change.  In
       the example above, the destination D1 is affected by the S->E
       link-down event for nodes S1, R1, R2, and R4.  For S2, although
       the path to D1 changes, there is no change in the immediate
       next-hop and hence it is not necessary for S2 to perform any
       specific actions to prevent micro-loops.

   2.  For each affected destination, identify the nearest PLR
       advertising the change.  The link-down event is advertised by
       both S and E.  S is the nearest PLR for the nodes S1, R1, R2, and
       R4.

   3.  Let the S->E link-down event occur at time T0.

   4.  Start a timer T1 = max (all MAX_CONVERGENCE_DELAY) at all
       non-PLR nodes with affected destinations.

   5.  Start a timer T2 = 2 * T1 at the PLR.

   6.  For IP routes, modify the FIB for the affected destinations so
       that the nearest PLR's node-SID is pushed on the packet's label
       stack.  For MPLS ingress and transit routes, modify the FIB for
       the affected destinations with a two-label stack, the inner label
       corresponding to the destination and the outer label
       corresponding to the nearest PLR.

   7.  In the case of ECMP paths to the nearest PLR, all the tunnelled
       paths are used.  In the example, S1 has ECMP paths to the
       destination D1 and both paths are impacted.  Both paths are
       modified to carry a two-label stack containing the label of the
       nearest PLR on top and the destination label at the bottom.

   8.  After the expiry of timer T1, all the non-PLR nodes modify their
       FIBs to use the shortest path as computed by the IGP, and they no
       longer push the node-SID of the nearest PLR on the packets.

   9.  After the expiry of T2, the PLR converges and updates the FIB to
       represent the shortest path.
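
   The timer-driven behaviour in the steps above can be sketched as
   below.  This is a simplified illustration, not a normative
   procedure: the function names are hypothetical, time is measured
   from T0 in arbitrary units, and flooding and processing delays are
   ignored (in practice each node starts its timer when it learns of
   the event).

```python
def non_plr_label_stack(elapsed: float, t1: float, affected: bool,
                        dest_label: int, nearest_plr_label: int) -> list[int]:
    """Label stack (top label first) pushed by a non-PLR node.

    While T1 is running, an affected node tunnels to the nearest PLR by
    placing the PLR's node-SID label on top of the destination label.
    After T1 expires, or if the node is unaffected, only the destination
    label is pushed and forwarding follows the IGP shortest path.
    """
    if affected and 0 <= elapsed < t1:
        return [nearest_plr_label, dest_label]
    return [dest_label]


def plr_uses_backup_path(elapsed: float, t1: float) -> bool:
    """The PLR keeps forwarding on its backup path until T2 = 2 * T1."""
    t2 = 2 * t1
    return 0 <= elapsed < t2


T1 = 1.0  # hypothetical value of max(MAX_CONVERGENCE_DELAY)

r2_during_t1 = non_plr_label_stack(0.5, T1, True, 1005, 1003)
r2_after_t1 = non_plr_label_stack(1.5, T1, True, 1005, 1003)
s2_during_t1 = non_plr_label_stack(0.5, T1, False, 1005, 1003)
plr_between_t1_t2 = plr_uses_backup_path(1.5, T1)
plr_after_t2 = plr_uses_backup_path(2.5, T1)
```

   Here R2 pushes 1003 on top of 1005 while T1 is running and only 1005
   afterwards, S2 pushes only 1005 throughout, and the PLR remains on
   its backup path until T2 expires.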




   The ingress MPLS routes at various nodes for destination D1 during
   the specified time intervals are shown below.

   +======+=============+=================+=============+==============+
   | Node | Before T0   | T0-T1           | T1-T2       | After T2     |
   +======+=============+=================+=============+==============+
   | S1   | Push 1005,  | Push 1005,      | Push 1005,  | Push 1005,   |
   |      | Fwd to R1   | 1003(top), Fwd  | Fwd to R2   | Fwd to R2    |
   |      |             | to R1           |             |              |
   |      +-------------+-----------------+-------------+--------------+
   |      | Push 1005,  | Push 1005,      |             |              |
   |      | Fwd to R4   | 1003(top), Fwd  |             |              |
   |      |             | to R4           |             |              |
   +======+=============+=================+=============+==============+
   | S2   | Push 1005,  | Push 1005, Fwd  | Push 1005,  | Push 1005,   |
   |      | Fwd to R2   | to R2           | Fwd to R2   | Fwd to R2    |
   +======+=============+=================+=============+==============+
   | R1   | Push 1005,  | Push 1005, Fwd  | Push 1005,  | Push 1005,   |
   |      | Fwd to S    | to S            | Fwd to R4   | Fwd to R4    |
   |      +-------------+-----------------+-------------+--------------+
   |      |             |                 | Push 1005,  | Push 1005,   |
   |      |             |                 | Fwd to S1   | Fwd to S1    |
   +======+=============+=================+=============+==============+
   | R2   | Push 1005,  | Push 1005,      | Push 1005,  | Push 1005,   |
   |      | Fwd to S1   | 1003(top), Fwd  | Fwd to R3   | Fwd to R3    |
   |      |             | to S1           |             |              |
   +======+=============+=================+=============+==============+
   | R3   | Push 1005,  | Push 1005,      | Push 1005,  | Push 1005,   |
   |      | Fwd to E    | 1003(top), Fwd  | Fwd to E    | Fwd to E     |
   |      |             | to E            |             |              |
   +======+=============+=================+=============+==============+
   | R4   | Push 1005,  | Push 1005,      | Push 1005,  | Push 1005,   |
   |      | Fwd to R1   | 1003(top), Fwd  | Fwd to S1   | Fwd to S1    |
   |      |             | to R1           |             |              |
   +======+=============+=================+=============+==============+
   | S    | Push 1005,  | Push 1005, Fwd  | Push 1005,  | Push 1005,   |
   |      | Fwd to E    | to R3 *         | Fwd to R3 * | Fwd to R1    |
   |      +-------------+-----------------+-------------+--------------+
   |      | Push 1005,  |                 |             | Push 1005,   |
   |      | Fwd to R3 * |                 |             | Fwd to R3 *  |
   +======+=============+=================+=============+==============+
   | E    | Pop, Fwd to | Pop, Fwd to D1  | Pop, Fwd to | Pop, Fwd to  |
   |      | D1          |                 | D1          | D1           |
   +======+=============+=================+=============+==============+

                     * - Indicates backup path.


                     Figure 3: Sample MPLS ingress RIB





   The corresponding MPLS transit routes at various nodes during the
   specified time intervals are shown below.


   +======+==========+==========+==============+===========+===========+
   | Node | Incoming | Before   | T0-T1        | T1-T2     | After T2  |
   |      | Label    | T0       |              |           |           |
   +======+==========+==========+==============+===========+===========+
   | S1   | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | 1003(top),   | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to   | Fwd to R1    | to R2     | to R2     |
   |      |          | R1       |              |           |           |
   |      |          +----------+--------------+-----------+-----------+
   |      |          | Push     | Push 1005,   |           |           |
   |      |          | 1005,    | 1003(top),   |           |           |
   |      |          | Fwd to   | Fwd to R4    |           |           |
   |      |          | R4       |              |           |           |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | Push     | Push 1003,   | Push      | Push      |
   |      |          | 1003,    | Fwd to R1    | 1003, Fwd | 1003, Fwd |
   |      |          | Fwd to   |              | to R1     | to R1     |
   |      |          | R1       |              |           |           |
   +======+==========+==========+==============+===========+===========+
   | S2   | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | Fwd to R2    | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to   |              | to R2     | to R2     |
   |      |          | R2       |              |           |           |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | Push     | Push 1003,   | Push      | Push      |
   |      |          | 1003,    | Fwd to R2    | 1003, Fwd | 1003, Fwd |
   |      |          | Fwd to   |              | to R2     | to R2     |
   |      |          | R2       |              |           |           |
   +======+==========+==========+==============+===========+===========+
   | R1   | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | Fwd to S     | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to S |              | to R4     | to R4     |
   |      |          +----------+--------------+-----------+-----------+
   |      |          |          |              | Push      | Push      |
   |      |          |          |              | 1005, Fwd | 1005, Fwd |
   |      |          |          |              | to S1     | to S1     |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | Push     | Push 1003,   | Push      | Push      |
   |      |          | 1003,    | Fwd to S     | 1003, Fwd | 1003, Fwd |
   |      |          | Fwd to S |              | to S      | to S      |
   +======+==========+==========+==============+===========+===========+
   | R2   | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | 1003(top),   | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to   | Fwd to S1    | to R3     | to R3     |
   |      |          | S1       |              |           |           |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | Push     | Push 1003,   | Push      | Push      |
   |      |          | 1003,    | Fwd to S1    | 1003, Fwd | 1003, Fwd |
   |      |          | Fwd to   |              | to S1     | to S1     |
   |      |          | S1       |              |           |           |
   +======+==========+==========+==============+===========+===========+
   | R3   | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | 1003(top),   | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to E | Fwd to E     | to E      | to E      |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | Push     | Push 1003,   | Push      | Push      |
   |      |          | 1003,    | Fwd to R2    | 1003, Fwd | 1003, Fwd |
   |      |          | Fwd to   |              | to R2     | to R2     |
   |      |          | R2       |              |           |           |
   +======+==========+==========+==============+===========+===========+
   | R4   | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | 1003(top),   | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to   | Fwd to R1    | to S1     | to S1     |
   |      |          | R1       |              |           |           |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | Push     | Push 1003,   | Push      | Push      |
   |      |          | 1003,    | Fwd to R1    | 1003, Fwd | 1003, Fwd |
   |      |          | Fwd to   |              | to R1     | to R1     |
   |      |          | R1       |              |           |           |
   +======+==========+==========+==============+===========+===========+
   | S    | 1005     | Push     | Push 1005,   | Push      | Push      |
   |      |          | 1005,    | Fwd to R3 *  | 1005, Fwd | 1005, Fwd |
   |      |          | Fwd to E |              | to R3 *   | to R1     |
   |      |          +----------+--------------+-----------+-----------+
   |      |          | Push     |              |           | Push      |
   |      |          | 1005,    |              |           | 1005, Fwd |
   |      |          | Fwd to   |              |           | to R3 *   |
   |      |          | R3 *     |              |           |           |
   |      +----------+----------+--------------+-----------+-----------+
   |      | 1003     | --       | --           | --        | --        |
   +======+==========+==========+==============+===========+===========+
   | E    | 1005     | Pop, Fwd | Pop, Fwd to  | Pop, Fwd  | Pop, Fwd  |
   |      |          | to D1    | D1           | to D1     | to D1     |
   +======+==========+==========+==============+===========+===========+

        *     - Indicates backup path.


                     Figure 4: Sample MPLS transit RIB






3.2.  Link-up event

   When a new link is added to the network, the PLR needs to update its
   FIB before it announces the change.  The PLR first converges and
   updates the FIB as per the topology that includes the new link, and
   then announces the new link to the rest of the network.  The other
   network nodes SHOULD follow the same procedure as described in
   Section 3.1.  They SHOULD update their FIB to tunnel the traffic to
   the closest node corresponding to the change.  After
   MAX_CONVERGENCE_DELAY, the nodes SHOULD update the FIB with the
   shortest-path next-hops.


                          SRGB:1000-2000
       SID:1     SID:2     SID:3       SID:4     SID:5
      +----+ 10 +----+ 10 +----+  10   +----+ 10 +----+
      | S1 |----| R1 |----| S  |---X---| E  |----| D1 |
      +----+    +----+    +----+       +----+    +----+
          \                  \          /
        10 \                  \ 10     / 100
            \  SRGB:1000-2000  \      /
             \   +----+         +----+
              +--| R2 |---------| R3 |SID:7
           SID:6 +----+    10   +----+SRGB:1000-2000
                  /
                 / 10
             +----+
             | S2 |SID:8
             +----+SRGB:1000-2000


                        Figure 5: Sample SR Network

   In the figure above, when the S->E link is added (or restored):

   1.  PLR S processes the event and programs the FIB with the new path
       for the affected destinations.

   2.  The PLR delays flooding the event for the MAX_CONVERGENCE_DELAY
       interval.  This step prevents a possible local micro-loop between
       S and R3.

   3.  Once the PLR floods the event, non-PLR nodes in the network
       identify the destinations affected by the database change.  This
       is done by SPF computation and examining the next-hop change.
       The destination D1 is affected by the S->E link-up event for
       nodes S1, R1, R2, and R3.

   4.  For each affected destination, identify the nearest PLR
       advertising the change.  The link-up event is advertised by both
       S and E.  S is the nearest PLR for the nodes S1, R1, R2, and R3.
       When there are ECMP paths to the destination and a new ECMP path
       is added, the new ECMP path follows the micro-loop prevention
       mechanisms and tunnels the traffic towards the nearest PLR.

   5.  Start a timer T3 = max (all MAX_CONVERGENCE_DELAY) at all
       non-PLR nodes.

   6.  For IP routes, update the FIB for the affected destinations so
       that the nearest PLR's node-SID is pushed on the packet's label
       stack.  For MPLS ingress and transit routes, update the path with
       a two-label stack, the inner label corresponding to the
       destination and the outer label corresponding to the nearest PLR.
       This step prevents the possible remote micro-loop between S1 and
       R2.

   7.  After the expiry of timer T3, all the non-PLR nodes perform
       global convergence and update the FIB to represent the shortest
       path.

   Other management events, such as metric changes, are handled in a
   similar way: a metric increase is handled like the link-down case and
   a metric decrease is handled like the link-up case.
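
   The ordering of actions for a link-up event can be sketched as a
   simple timeline.  The function below is illustrative only: its name
   and parameters are hypothetical, FIB programming at the PLR is
   treated as instantaneous, and a single flooding latency is assumed
   for all non-PLR nodes.

```python
def link_up_schedule(t_up: float, max_delay: float,
                     flood_latency: float = 0.0) -> list[tuple[float, str, str]]:
    """Return (time, actor, action) tuples for one link-up event.

    max_delay stands for max(MAX_CONVERGENCE_DELAY) in the domain and is
    used both as the PLR's flooding delay and as timer T3.
    """
    flood_time = t_up + max_delay           # PLR holds back the advertisement
    learn_time = flood_time + flood_latency  # non-PLR nodes receive the event
    return [
        (t_up, "PLR", "program FIB with new paths"),
        (flood_time, "PLR", "flood link-up advertisement"),
        (learn_time, "non-PLR", "tunnel to nearest PLR and start T3"),
        (learn_time + max_delay, "non-PLR", "install shortest-path next-hops"),
    ]


# Hypothetical delay of 500 time units, link comes up at time 0.
schedule = link_up_schedule(0, 500)
schedule_times = [entry[0] for entry in schedule]
```

   The sketch makes the ordering explicit: the PLR converges first,
   remote nodes tunnel towards it for one MAX_CONVERGENCE_DELAY
   interval after learning of the change, and only then install their
   shortest-path next-hops.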

3.3.  Computation of nearest PLR

   When a network event is received by a node via the IGP database
   change notification, a node has to compute the nearest PLR
   corresponding to that advertisement.  The first database change
   advertisement may be received from any of the PLRs, nearest or
   farthest.

3.3.1.  Link down event

   When a link goes down, the IGPs generate a fresh LSP/Router LSA with
   the affected link removed.  The computing node has to identify the
   missing link by walking over the LSP/LSA and comparing the contents
   with an older version.  Once the affected link is identified, the
   cost to reach both ends of the link should be examined.  The nearest
   PLR is chosen based on the cost to reach the ends.
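
   A minimal sketch of this computation is given below, using the
   topology of Figure 2.  The helper names are hypothetical.  The
   sketch assumes that costs are computed by a plain Dijkstra SPF over
   the database with the failed link already removed, and that a cost
   tie is resolved in favour of the first end of the link; this
   document does not specify either choice.

```python
import heapq


def build_graph(links):
    """Build a symmetric adjacency map from (node_a, node_b, metric) tuples."""
    graph = {}
    for a, b, metric in links:
        graph.setdefault(a, {})[b] = metric
        graph.setdefault(b, {})[a] = metric
    return graph


def removed_links(advertiser, old_neighbors, new_neighbors):
    """Links present in the older LSP/LSA but missing from the fresh one."""
    return [(advertiser, n) for n in sorted(old_neighbors - new_neighbors)]


def spf_costs(graph, source):
    """Dijkstra shortest-path costs from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def nearest_plr(graph, computing_node, link):
    """Pick the link end that is cheaper to reach (ties: first end wins)."""
    dist = spf_costs(graph, computing_node)
    return min(link, key=lambda n: dist.get(n, float("inf")))


# Figure 2 topology after the S-E link has been removed from the database.
post_failure = build_graph([
    ("S1", "R1", 10), ("R1", "S", 10), ("E", "D1", 10),
    ("S1", "R2", 10), ("R2", "R3", 30), ("S", "R3", 100),
    ("E", "R3", 60), ("R2", "S2", 10), ("R4", "S1", 5), ("R4", "R1", 5),
])

failed = removed_links("S", {"R1", "E", "R3"}, {"R1", "R3"})
plr_for_r2 = nearest_plr(post_failure, "R2", failed[0])
plr_for_d1 = nearest_plr(post_failure, "D1", failed[0])
```

   In this example R2 reaches S at cost 30 and E at cost 90, so S is
   chosen as the nearest PLR for R2, while D1 selects E.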

3.3.2.  Node down event

   When a node goes down, it is identified by the neighbouring nodes via
   a link-down event.  The neighbouring routers generate a fresh LSP/
   Router LSA with the affected link removed.  The computing node has to
   identify the missing link by walking over the LSP/LSA and comparing
   the contents with an older version.  Once the affected link is
   identified, the cost to reach both ends of the link should be
   examined.  The nearest PLR is chosen based on the cost to reach the
   ends.

   When an advertisement from the farthest node is received before the
   one from the nearest node, it is possible that the node that went
   down is chosen as the nearest PLR, as the node that went down might
   still be lingering in the database.  In such cases, node protection
   mechanisms for the failed node at the previous hop should prevent
   traffic loss.  The details of such mechanisms are outside the scope
   of this document.

3.4.  Handling multiple network events

   It is important to categorize the received events as belonging to one
   network event or multiple network events.  The link-down/link-up
   event is advertised by both ends of the link.  The node-down/node-up
   event is advertised by all the neighbouring nodes.  When an event is
   received, the computing node should analyse the changes in the
   database advertisements and compare them with the previous database.
   The micro-loop prevention procedures SHOULD be started when the first
   notification is received.  The node SHOULD record the event for which
   micro-loop prevention procedures are being performed.  If more
   database changes are received during this time, each change should be
   mapped to the already on-going micro-loop prevention procedures.  If
   the event is the same, the micro-loop prevention procedures MUST
   continue; otherwise, the micro-loop prevention procedures SHOULD be
   aborted.

   Section 6.2 of [RFC5715] describes mechanisms to handle SRLG
   failures.  If the received failure advertisement is for a link that
   is part of an SRLG advertised in the IGP TE advertisement, the links
   on the path sharing the same SRLG are identified and the tunnel is
   built with a multi-label stack corresponding to the nearest PLR of
   each SRLG member.

   When a failure advertisement is received and the failure does not
   belong to the same SRLG as the already on-going micro-loop
   prevention, the micro-loop prevention procedures MUST be aborted and
   the normal convergence procedures SHOULD be followed.
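
   The event-matching rules of this section can be sketched as a small
   state holder.  The class and method names are hypothetical, links
   are identified by their unordered pair of end nodes, and the SRLG
   membership table is assumed to have been learned from the IGP TE
   advertisements; none of these representation choices are mandated
   by this document.

```python
def link(a: str, b: str) -> frozenset:
    """Undirected link identifier, so S->E and E->S compare equal."""
    return frozenset((a, b))


class MicroloopGuard:
    """Tracks the single network event being protected (illustrative)."""

    def __init__(self, srlg_of=None):
        # Maps a link identifier to the set of SRLG values it belongs to.
        self.srlg_of = srlg_of or {}
        self.ongoing = None

    def _shares_srlg(self, a: frozenset, b: frozenset) -> bool:
        return bool(self.srlg_of.get(a, set()) & self.srlg_of.get(b, set()))

    def on_change(self, changed: frozenset) -> str:
        """Classify a database change as 'start', 'continue' or 'abort'."""
        if self.ongoing is None:
            self.ongoing = changed      # first notification: record the event
            return "start"
        if changed == self.ongoing or self._shares_srlg(changed, self.ongoing):
            return "continue"           # same event, or same SRLG
        self.ongoing = None             # unrelated event: fall back
        return "abort"


# Both ends advertise the same link failure, then an unrelated link fails.
guard = MicroloopGuard()
first = guard.on_change(link("S", "E"))
second = guard.on_change(link("E", "S"))
third = guard.on_change(link("R2", "R3"))

# Two links that share SRLG 5 are treated as one event.
srlg_guard = MicroloopGuard({link("S", "E"): {5}, link("S1", "R1"): {5}})
srlg_first = srlg_guard.on_change(link("S", "E"))
srlg_second = srlg_guard.on_change(link("S1", "R1"))
```

   The advertisements from S and from E map to the same link identifier
   and therefore continue the on-going procedure, whereas the unrelated
   R2-R3 change aborts it.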

3.4.1.  Handling SRLG failures

   Consider the sample network shown in Figure 6, with S->E and S1->R1
   belonging to the same SRLG.  The symmetric link metrics are shown
   in the figure and the SRGB is 1000-2000 on all nodes.  When the S->E
   link goes down, all the links belonging to the same SRLG are
   considered to be down and the route is modified to carry multiple
   node-sids along the path.




Hegde & Sarkar           Expires January 4, 2018               [Page 13]

Internet-Draft      Microloop avoidance using SPRING           July 2017


                SRGB:1000-2000
       SID:1     SID:2     SID:3       SID:4     SID:5
      +----+ 10    +----+ 10 +----+  10   +----+ 10 +----+
      | S1 |-------| R1 |----| S  |-------| E  |----| D1 |
      +----+ SRLG=5+----+    +----+ SRLG=5+----+    +----+
          \                    \          /
        10 \                    \ 10     / 100
            \  SRGB:1000-2000    \      /
             \   +----+         +----+
              +--| R2 |---------| R3 |SID:7
           SID:6 +----+    10   +----+SRGB:1000-2000
                  /
                 / 10
             +----+
             | S2 |SID:8
             +----+SRGB:1000-2000

                 Figure 6: Sample Network with SRLG links

   1.  When the S->E link goes down, S and E generate the link-down
       event, update their Router-LSA/LSP and flood the updated
       information across the IGP domain.

   2.  The nodes in the IGP domain process the link-down event for
       affected destinations.  If there are any other links with the
       same SRLG on the path to the destination, the nearest PLRs for
       those links are identified.  In this example topology, S1->R1
       and S->E belong to the same SRLG.  For destination D1, R2
       identifies two PLRs, S1 and S, for the S->E link-down event.

   3.  The nodes build the tunnelled path with multiple labels, one for
       each of the identified links.  For example, R2 builds a stack
       containing the node-sids of S1 and S.  The tunnelled path at R2
       looks as shown in Figure 7 below.

       +------+--------------------+---------------------------------+
       | Node | Destination Prefix | Label Operation                 |
       +------+--------------------+---------------------------------+
       | R2   | D1                 | Push 1005, 1003, 1001(top),     |
       |      |                    | Fwd to S1                       |
       +------+--------------------+---------------------------------+

          Figure 7: Sample ingress RIB for SRLG failure handling

   4.  The procedures described in Section 3.1 for the link-down event
       are followed to achieve micro-loop free convergence.
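
   The label stack in Figure 7 can be reproduced with a short sketch.
   It assumes that each node-sid label is the SRGB base plus the SID
   index shown in Figure 6, which is consistent with the values in
   Figure 7; the function names are illustrative only.

```python
from typing import Dict, List

SRGB_BASE = 1000  # Figure 6: SRGB 1000-2000 on all nodes

# Node-SID indexes as shown in Figure 6.
SID_INDEX: Dict[str, int] = {
    "S1": 1, "R1": 2, "S": 3, "E": 4, "D1": 5, "R2": 6, "R3": 7, "S2": 8,
}


def node_label(node: str) -> int:
    """Assumed mapping: label = SRGB base + SID index."""
    return SRGB_BASE + SID_INDEX[node]


def srlg_label_stack(plrs_in_path_order: List[str],
                     destination: str) -> List[int]:
    """Return the stack top-first: each nearest PLR in the order it is
    traversed, followed by the destination at the bottom."""
    labels = [node_label(n) for n in plrs_in_path_order]
    return labels + [node_label(destination)]


# R2 identified the PLRs S1 and S for destination D1 (step 2 above).
stack = srlg_label_stack(["S1", "S"], "D1")
print(stack)  # top of stack first: [1001, 1003, 1005]
```

   Read from bottom to top, this is the "Push 1005, 1003, 1001(top)"
   operation listed for R2 in Figure 7.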





3.5.  Handling ECMP

   When a network event is received, if the change causes only one of
   the ECMP paths to change, then the micro-loop prevention mechanisms
   described in Sections 3.1 and 3.2 are applied to the changed path
   only.  As described in Sections 3.1 and 3.2, if there is an ECMP
   path to the nearest PLR, then all ECMP paths are used to tunnel the
   traffic during convergence.
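
   One way to find all ECMP first hops towards the nearest PLR is a
   shortest-path computation that keeps every equal-cost first hop.
   The sketch below assumes positive link metrics; the four-node
   topology and the function name ecmp_first_hops are hypothetical and
   are not taken from this document.

```python
import heapq
from typing import Dict, Set, Tuple

Graph = Dict[str, Dict[str, int]]


def ecmp_first_hops(graph: Graph, src: str, dst: str) -> Tuple[int, Set[str]]:
    """Return (cost, first-hop neighbours) over all equal-cost shortest
    paths from src to dst.  Assumes positive metrics and a reachable dst."""
    dist: Dict[str, int] = {src: 0}
    hops: Dict[str, Set[str]] = {src: set()}
    heap = [(0, src)]
    done: Set[str] = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph[u].items():
            nd = d + w
            # A neighbour of src is its own first hop; otherwise inherit.
            via = {v} if u == src else hops[u]
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                hops[v] = set(via)
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:
                hops[v] |= via       # equal-cost path: merge first hops
    return dist[dst], hops[dst]


# Hypothetical topology with two equal-cost paths from A to the PLR.
TOPOLOGY: Graph = {
    "A": {"B": 10, "C": 10},
    "B": {"A": 10, "PLR": 10},
    "C": {"A": 10, "PLR": 10},
    "PLR": {"B": 10, "C": 10},
}
cost, first_hops = ecmp_first_hops(TOPOLOGY, "A", "PLR")
print(cost, sorted(first_hops))  # 20 ['B', 'C']
```

   Both first hops would then be installed for the tunnel to the
   nearest PLR during convergence.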

3.6.  Recognizing same network event

   When a link goes down, both ends of the link report the event by
   updating their LSP/LSA and flooding it across the IGP domain.  It is
   possible that the same network event, reported by two nodes, is
   perceived as two different network events by the nodes in the IGP
   domain.  The nodes processing the network events SHOULD evaluate
   whether the multiple received events correspond to a single event by
   comparing both ends of the reported link and also by looking at the
   previous event for which micro-loop prevention is being performed.
   If the event is the same, then micro-loop prevention procedures MUST
   be allowed to continue and MUST NOT be aborted.

   Node-down or new-node-addition events are reported by all the
   adjacent nodes, each removing a link or adding a new link.  In
   addition, a node-up event also comprises a new LSA advertisement.
   The criterion to recognize whether the event is the same is to look
   at both ends of the changed link.  If one end of the changed link
   maps to previously reported events and the other end of the link
   (the advertising router) changes for each successive event, then the
   event SHOULD be recognized as a new node addition or a node
   deletion.  Micro-loop prevention procedures MUST be allowed to
   continue and MUST NOT be aborted.
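
   The recognition criteria above can be illustrated with a small
   classifier.  This is a sketch under simplifying assumptions: each
   report is reduced to a hypothetical (advertising router, remote end)
   pair, and the new LSA originated by a node that comes up is not
   modelled.

```python
from typing import List, Optional, Tuple

Report = Tuple[str, str]  # (advertising router, remote end of changed link)


def classify(reports: List[Report]) -> Optional[str]:
    """Return "link" or "node" when all reports describe one network
    event, or None when they describe unrelated events."""
    links = {frozenset(r) for r in reports}
    if len(links) == 1:
        return "link"        # one link, reported by one or both ends
    remotes = {remote for _, remote in reports}
    advertisers = [adv for adv, _ in reports]
    # Node event: one end stays the same while the advertising router
    # changes for each successive report.
    if len(remotes) == 1 and len(set(advertisers)) == len(advertisers):
        return "node"
    return None


print(classify([("S", "E"), ("E", "S")]))               # link
print(classify([("R1", "S"), ("R3", "S"), ("E", "S")]))  # node
print(classify([("S", "E"), ("R2", "R3")]))             # None
```

   A result of "link" or "node" means the procedures continue; None
   corresponds to multiple network events.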

3.7.  Partial deployment Considerations

   The micro-loop prevention mechanisms described in this document are
   very effective and safe when all the nodes in the network support
   this feature and apply it when a network event happens.  However, in
   some topologies, when not all the nodes support the micro-loop
   prevention mechanism, the duration of the loop can increase because
   only some nodes apply the procedures described in this document and
   others do not.

   For example, consider the sample topology described in the figure
   below.

                             +-----+
                             |  S3 |
                             +-----+
                               /
                              /
      +----+ 10 +----+ 10 +----+  10   +----+ 10 +----+
      | S1 |----| R1 |----| S  |-------| E  |----| D1 |
      +----+    +----+    +----+       +----+    +----+
          \                  \          /
           \ 10               \ 100    / 60
            \                  \      /
             \   +----+         +----+
              +--| R2 |---------| R3 |
                 +----+    30   +----+
                  /
                 / 10
             +----+
             | S2 |
             +----+


             Figure 8: Sample Network with partial deployment

   In this topology, S1, S2, and S3 are traffic sources and D1 is the
   destination.  For each of the sources, Figure 9 shows the path before
   the failure (the original path) and the path after the failure (the
   post-convergence path).

   +-----+------+-------------------------+-----------------------------+
   | Src | Dest | Original Path           | Post-Convergence Path       |
   +-----+------+-------------------------+-----------------------------+
   | S1  | D1   | S1->R1->S->E->D1        | S1->R2->R3->E->D1           |
   +-----+------+-------------------------+-----------------------------+
   | S2  | D1   | S2->R2->S1->R1->S->E->D1| S2->R2->R3->E->D1           |
   +-----+------+-------------------------+-----------------------------+
   | S3  | D1   | S3->S->E->D1            | S3->S->R1->S1->R2->R3->E->D1|
   +-----+------+-------------------------+-----------------------------+

   Figure 9: Traffic flow in normal operation and post convergence path
                            with S->E link down
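
   The paths in Figure 9 can be checked with a plain shortest-path
   computation over the metrics of Figure 8.  The sketch below is only
   a verification aid.  Figure 8 gives no metric for the S3-S link, so
   a metric of 10 is assumed here; the function names are illustrative.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, int]]


def build(links: List[Tuple[str, str, int]]) -> Graph:
    """Build a graph with symmetric link metrics."""
    g: Graph = {}
    for a, b, cost in links:
        g.setdefault(a, {})[b] = cost
        g.setdefault(b, {})[a] = cost
    return g


def shortest_path(g: Graph, src: str, dst: str) -> List[str]:
    """Dijkstra; returns one shortest path (dst must be reachable)."""
    dist = {src: 0}
    prev: Dict[str, str] = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in g[u].items():
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


LINKS = [
    ("S1", "R1", 10), ("R1", "S", 10), ("S", "E", 10), ("E", "D1", 10),
    ("S1", "R2", 10), ("R2", "R3", 30), ("S", "R3", 100), ("R3", "E", 60),
    ("R2", "S2", 10),
    ("S3", "S", 10),  # ASSUMPTION: metric not shown in Figure 8
]

before = build(LINKS)
after = build([l for l in LINKS if {l[0], l[1]} != {"S", "E"}])  # S-E down

for src in ("S1", "S2", "S3"):
    print(src,
          "->".join(shortest_path(before, src, "D1")),
          "->".join(shortest_path(after, src, "D1")))
```

   With these metrics every shortest path is unique, and the printed
   paths match the three rows of Figure 9.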

   In the above topology, if the PLR S does not support the micro-loop
   prevention mechanism but all other nodes support and apply this
   mechanism, then the duration of traffic looping may be longer than
   when the micro-loop prevention mechanisms are not applied at all.
   To mitigate this issue, protocol extensions to negotiate the support
   of this feature in the IGP domain are needed.  Section 4 describes
   the protocol mechanisms to advertise the support of this feature in
   OSPF and ISIS.

   However, in certain deployments and topologies, it may be safe to
   apply the micro-loop prevention procedures even when not all the
   nodes in the network support this feature, especially in topologies
   where the post-convergence path from the PLR does not traverse the
   nodes in the P space of the PLR with respect to the node or link
   being protected.

4.  Protocol Procedures

4.1.  OSPF

   [RFC4970] defines the Router Information (RI) LSA, which may be used
   to advertise properties of the originating router.  The payload of
   the RI LSA consists of one or more nested Type/Length/Value (TLV)
   triplets.  This document defines a new TLV, the Micro-loop prevention
   support TLV, which has the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              Type             |             Length            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


             Figure 10: OSPF micro-loop prevention support TLV

   Type : TBA, Suggested value 15

   Length: 0

   The MAX_CONVERGENCE_DELAY described in this document is advertised
   using the Controlled Convergence TLV as described in
   [I-D.ietf-ospf-mrt].
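
   As an illustration of the encoding, the sketch below packs and
   detects the OSPF Micro-loop prevention support TLV.  The type value
   15 is only the suggested value (the code point is TBA), and the
   function names are hypothetical.  RI LSA TLVs use 16-bit Type and
   Length fields with values padded to a 4-octet boundary [RFC4970].

```python
import struct

MICROLOOP_TLV_TYPE = 15  # suggested value only; actual code point is TBA


def encode_ospf_microloop_tlv(tlv_type: int = MICROLOOP_TLV_TYPE) -> bytes:
    """16-bit Type, 16-bit Length (0), no value."""
    return struct.pack("!HH", tlv_type, 0)


def has_microloop_tlv(ri_payload: bytes,
                      tlv_type: int = MICROLOOP_TLV_TYPE) -> bool:
    """Walk the TLVs in an RI LSA payload and look for the support TLV."""
    offset = 0
    while offset + 4 <= len(ri_payload):
        t, length = struct.unpack_from("!HH", ri_payload, offset)
        if t == tlv_type:
            return True
        # Values are padded to 4 octets; padding is not counted in Length.
        offset += 4 + ((length + 3) & ~3)
    return False


# Payload: Router Informational Capabilities TLV (type 1, 4 octets)
# followed by the Micro-loop prevention support TLV.
payload = struct.pack("!HHI", 1, 4, 0) + encode_ospf_microloop_tlv()
print(encode_ospf_microloop_tlv().hex(), has_microloop_tlv(payload))
```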

4.2.  ISIS

   [RFC4971] defines the Router Capability TLV, which may be used to
   advertise properties of the originating router.  This document
   defines a new sub-TLV, the Micro-loop prevention support sub-TLV,
   which has the following format:

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           Figure 11: ISIS micro-loop prevention support sub-TLV

   The Router Capability TLV specifies flags that control its
   advertisement.  The Micro-loop prevention support sub-TLV MUST be
   propagated throughout the level and SHOULD NOT be advertised across
   level boundaries.  Therefore, the Router Capability TLV distribution
   flags SHOULD be set accordingly, i.e., the S flag in the Router
   Capability TLV [RFC4971] MUST be unset.

   Type : TBA, Suggested value 5

   Length: 0

   The MAX_CONVERGENCE_DELAY described in this document is advertised
   using the Controlled Convergence TLV as described in
   [I-D.ietf-isis-mrt].
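
   The following sketch encodes a Router Capability TLV that carries
   only the Micro-loop prevention support sub-TLV, with the S flag
   unset.  The sub-TLV type 5 is only the suggested value (TBA), the
   function name is hypothetical, and the Router ID is a documentation
   address.  The TLV layout (TLV 242, 4-octet Router ID, 1-octet flags,
   S flag as bit 0x01) follows [RFC4971].

```python
import socket

ROUTER_CAPABILITY_TLV = 242  # [RFC4971]
S_FLAG = 0x01                # flooding-scope flag in the flags octet
MICROLOOP_SUBTLV_TYPE = 5    # suggested value only; actual code point is TBA


def encode_router_capability(router_id: str, flags: int = 0) -> bytes:
    """Encode TLV 242 with the Micro-loop prevention support sub-TLV."""
    if flags & S_FLAG:
        # The sub-TLV is not to be leaked across level boundaries.
        raise ValueError("S flag must be unset for this sub-TLV")
    sub_tlv = bytes([MICROLOOP_SUBTLV_TYPE, 0])  # 1-octet Type, Length 0
    value = socket.inet_aton(router_id) + bytes([flags]) + sub_tlv
    return bytes([ROUTER_CAPABILITY_TLV, len(value)]) + value


tlv = encode_router_capability("192.0.2.1")
print(tlv.hex())
```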

4.3.  Elements of procedure

   The micro-loop prevention support sub-TLV MUST be advertised only
   when the feature is enabled.  When all the nodes in the IGP domain
   advertise this sub-TLV, a node supporting this feature MUST perform
   the micro-loop prevention procedures as described in this document.
   The micro-loop prevention mechanisms are applied within the OSPF area
   or ISIS level.

   When there are one or more nodes in the IGP domain that do not
   support this feature, a node MAY perform micro-loop prevention
   procedures.  The near-side tunnelling mechanism ensures that when a
   group of nodes supports this feature, traffic sourced from this set
   of nodes does not suffer micro-loops.  A manageability interface
   SHOULD be provided to support micro-loop prevention in case of
   partial feature deployment.
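
   The decision described in this section can be sketched as a simple
   predicate evaluated per OSPF area or ISIS level.  The function name
   and the partial_deployment_enabled knob are hypothetical; the knob
   stands in for the manageability interface mentioned above.

```python
from typing import Set


def should_apply_microloop_prevention(
    nodes_in_scope: Set[str],
    advertising_nodes: Set[str],
    partial_deployment_enabled: bool = False,
) -> bool:
    """Decide whether a supporting node runs the procedures.

    nodes_in_scope:     all nodes in the OSPF area or ISIS level
    advertising_nodes:  nodes advertising the support TLV/sub-TLV
    """
    if nodes_in_scope <= advertising_nodes:
        return True                    # full deployment: MUST perform
    return partial_deployment_enabled  # partial: MAY, operator decides


area = {"S", "E", "R1", "R2", "R3"}
print(should_apply_microloop_prevention(area, area))
print(should_apply_microloop_prevention(area, area - {"S"}))
print(should_apply_microloop_prevention(area, area - {"S"}, True))
```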

5.  Security Considerations

   This document does not introduce any further security issues other
   than those discussed in [RFC2328], [RFC5340], [ISO10589] and
   [RFC1195].

6.  IANA Considerations

   This specification updates one OSPF registry: OSPF Router Information
   (RI) TLVs Registry

   i) TBD - Micro-loop prevention support TLV

   This specification updates one ISIS registry: ISIS Router capability
   TLVs (TLV 242) Registry

   i) TBD - Micro-loop prevention support sub-TLV

7.  Acknowledgments

   Thanks to Chris Bowers, Hannes Gredler, Eric Rosen and Stephane
   Litkowski for valuable inputs.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <http://www.rfc-editor.org/info/rfc2119>.

   [RFC4970]  Lindem, A., Ed., Shen, N., Vasseur, JP., Aggarwal, R., and
              S. Shaffer, "Extensions to OSPF for Advertising Optional
              Router Capabilities", RFC 4970, DOI 10.17487/RFC4970, July
              2007, <http://www.rfc-editor.org/info/rfc4970>.

   [RFC4971]  Vasseur, JP., Ed., Shen, N., Ed., and R. Aggarwal, Ed.,
              "Intermediate System to Intermediate System (IS-IS)
              Extensions for Advertising Router Information", RFC 4971,
              DOI 10.17487/RFC4971, July 2007,
              <http://www.rfc-editor.org/info/rfc4971>.

8.2.  Informative References

   [I-D.ietf-isis-mrt]
              Li, Z., Wu, N., Zhao, Q., Atlas, A., Bowers, C., and J.
              Tantsura, "Intermediate System to Intermediate System (IS-
              IS) Extensions for Maximally Redundant Trees (MRT)",
              draft-ietf-isis-mrt-03 (work in progress), June 2017.

   [I-D.ietf-ospf-mrt]
              Atlas, A., Hegde, S., Bowers, C., Tantsura, J., and Z. Li,
              "OSPF Extensions to Support Maximally Redundant Trees",
              draft-ietf-ospf-mrt-03 (work in progress), June 2017.

   [I-D.ietf-rtgwg-uloop-delay]
              Litkowski, S., Decraene, B., Filsfils, C., and P.
              Francois, "Micro-loop prevention by introducing a local
              convergence delay", draft-ietf-rtgwg-uloop-delay-05 (work
              in progress), June 2017.

   [I-D.ietf-spring-segment-routing]
              Filsfils, C., Previdi, S., Decraene, B., Litkowski, S.,
              and R. Shakir, "Segment Routing Architecture", draft-ietf-
              spring-segment-routing-12 (work in progress), June 2017.

   [ISO10589]
              "Intermediate system to Intermediate system intra-domain
              routeing information exchange protocol for use in
              conjunction with the protocol for providing the
              connectionless-mode Network Service (ISO 8473), ISO/IEC
              10589:2002, Second Edition.", Nov 2002.

   [RFC1195]  Callon, R., "Use of OSI IS-IS for routing in TCP/IP and
              dual environments", RFC 1195, DOI 10.17487/RFC1195,
              December 1990, <http://www.rfc-editor.org/info/rfc1195>.

   [RFC2328]  Moy, J., "OSPF Version 2", STD 54, RFC 2328,
              DOI 10.17487/RFC2328, April 1998,
              <http://www.rfc-editor.org/info/rfc2328>.

   [RFC5286]  Atlas, A., Ed. and A. Zinin, Ed., "Basic Specification for
              IP Fast Reroute: Loop-Free Alternates", RFC 5286,
              DOI 10.17487/RFC5286, September 2008,
              <http://www.rfc-editor.org/info/rfc5286>.

   [RFC5340]  Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF
              for IPv6", RFC 5340, DOI 10.17487/RFC5340, July 2008,
              <http://www.rfc-editor.org/info/rfc5340>.

   [RFC5715]  Shand, M. and S. Bryant, "A Framework for Loop-Free
              Convergence", RFC 5715, DOI 10.17487/RFC5715, January
              2010, <http://www.rfc-editor.org/info/rfc5715>.

   [RFC6981]  Bryant, S., Previdi, S., and M. Shand, "A Framework for IP
              and MPLS Fast Reroute Using Not-Via Addresses", RFC 6981,
              DOI 10.17487/RFC6981, August 2013,
              <http://www.rfc-editor.org/info/rfc6981>.

Authors' Addresses

   Shraddha Hegde
   Juniper Networks, Inc.
   Exora Business Park
   Bangalore, KA  560037
   India

   Email: shraddha@juniper.net


   Pushpasis Sarkar
   Individual

   Email: pushpasis.ietf@gmail.com