Internet DRAFT - draft-ayandeh-mpls-dynamics

draft-ayandeh-mpls-dynamics





Submitted to MPLS Working Group                          Siamack Ayandeh
INTERNET DRAFT                                                    Nortel
<draft-ayandeh-mpls-dynamics-00.txt> 
                                                               Yanhe Fan
                                                                  Nortel

                                                              March 1998
                                                  Expires September 1998



                          MPLS Routing Dynamics


Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-drafts as reference
   material or to cite them other than as "work in progress."

   To learn the current status of any Internet-Draft, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).


Abstract

   This Internet draft outlines how transients in protocols such as
   OSPF [7] (intra-domain routing) and MPLS LDP [2] can lead to periods
   of time with loops and other undesirable routing phenomenon. It
   defines a set of metrics to quantify this dynamic behavior.
   Quantification of these metrics, using a simulation model, for
   various local vs. egress label allocation schemes is used as the
   basis of a comparative analysis. This includes quantifying any
   incremental impact that MPLS may have once it is introduced into an
   autonomous IP routing area. The resulting insights would help
   develop a better understanding of routing dynamics. This includes
   identifying key nodal and network variables which impact routing
   dynamics hence guiding MPLS feature selection and IP network design. 



Ayandeh and Fan                                                 [Page 1]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998




Table of Content

   1. Overview of the Issues of Network Routing Dynamics             2
   2. What Impacts Network Routing Dynamics                          3
   3. Metrics to Quantify Network Routing Dynamics                   4
   4. A Description of the Model and Configurations                  5
   5. Label Allocation Schemes (Local vs. Egress Control)            9
   6. Routing Dynamics: Results and Discussion                      10 
   7. Conclusion                                                    16 
   Appendix A   The Sensitive Analysis of the Time of Label
                Allocating (in Allocating Task)                     16
   References                                                       17


1. Overview of the Issue of Network Routing Dynamics

   The issue of network routing dynamics and stability has received
   attention in the literature recently where measurements of routing
   traffic for BGP backbones are used to quantify the stability of the
   routing information. Inter-domain routing information exchange often
   suffer from rapid fluctuation in network reachability information.  
   This phenomenon also referred to as "route flaps" can contribute to
   poor end to end performance of networks while degrading overall
   efficiency.

   The focus of this investigation however is intra-domain routing, in
   particular OSPF, and how transients in such protocols can lead to
   periods of time with loops and other undesirable exchanged routing
   phenomenon. As such a cause and effect investigation technique is
   deployed where addition/deletion of networks or link up/down events
   trigger intra-domain routing protocol information exchange. Using a
   simulation model the behavior of these protocols is tracked and
   observed resulting in insights and metrics which can be used to
   quantify and develop a better understanding of routing dynamics.

   The simulation model mimics the behavior of both layer-3 routing
   protocols such as OSPF, as well as MPLS. As a whole the tool allows
   an investigation of the network behavior, with and without MPLS,
   generating metrics which quantify the following:

   - Impact of introducing protocols such as MPLS/LDP on network
     dynamics
   - Comparison of various MPLS proposals for network control and
     dealing with loops
   - Identifying key nodal and network variables which impact routing
     dynamics hence providing insights in MPLS feature selection and
     network design.

   Given the many technical proposals which have already been put
   forward or are in process of submission to IETF MPLS WG, the


Ayandeh and Fan                                                 [Page 2]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998


   simulation capability allows for a first order assessment of various
   proposals. This provides the first step in assessment and in not
   intended to replace prototyping or implementation trials.


2. What Impacts Network Routing Dynamics

   Network routing dynamics entails transient behavior which results
   in undesirable circumstances such as routing loops, interface speed
   mismatch, and black holes.  A number of network and nodal features
   impact the dynamic behavior of network routing and are briefly
   discussed below:

   a. The nature of the protocol in question and the control approach
   adapted.  For example label distribution may be achieved through
   schemes referred to as "Local Controls" vs. "Egress Control".
   Different protocols exhibit different dynamic behavior.

   b. Duration of Shortest Path (SP) table computations which is a
   function of network size and CPU MIPS available for this purpose.

   c. Propagation time of events which is a function of speed of event
   detection, link speed, and CPU MIPS, as well as network topology.

   d. Volume of users traffic which may compete with communication of
   routing information. This occurs to a lessor extent on the links,
   through priority queuing, and can occur to a greater extent when
   competing for a shared CPU resource.

   e. Rate of BGP route injections which is an indicator of instability
   of inter-domain routing in a large network. Intranets may be immune
   to such effects.

   The impact of features listed above are captured through a series of
   events which act as stimuli for the network under simulation. These
   events may also reflect a change in network and nodal configuration.
   The list of events are:

   1. Addition or deletion of networks
   2. Link failure
   3. Change in the capacity of Label Switch Routers (LSRs) and the
      network links
   4. Influence of the nodal architecture of the LSR
   5. External route updates (not currently in the model) need
      hierarchical label allocation
   6. Change in topology under investigation i.e. a linear topology
      vs. a mesh

   These events result in routing events which trigger route
   calculation and label allocation.


Ayandeh and Fan                                                 [Page 3]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998


3. Metrics to Quantify Network Routing Dynamics

   Parameters which defined the metric for routing dynamics and
   stability are listed and discussed below. We try to capture the
   significance of each parameter with long term intention of
   developing metrics which would readily quantify network intra-domain
   routing dynamics. Note that the value of these parameters is
   impacted by the list of features "a" to "e" of section 2.


3.1. Routing Table Update Time (RTUT)

   The time interval between occurrence of an event, which results in
   routing table updates, and end of the routing table update for a
   given LSR. Such an update may be incremental or result of a shortest
   path calculation:

   - Incremental update result from BGP injections.
   - Shortest path calculation updates the routing table to reflect say
     a change in AS network topology.

   Significance: This parameter impacts all the other parameters listed
   below. As such it is a base metric.

3.2. Routing Table Convergence Time (RTCT)

   An event may result in a number of LSRs needing a routing table
   update. The RTCT is the longest of such routing table update times.

   Note that RTCT is not expected to be affected by the LDP for both
   local control and egress control provided the following guidelines
   are followed:
   - LSAs are transmitted at highest link priority;
   - LDP traffic is light and forwarded with priority;
   - Route checking functions and LDP protocols require small amount
     of processing compared to SP route calculations.

   Significance:  This is a base metric and minimizing RTCT should be
   a network design goal.

3.3. Label Allocation Time (LAT)

   Labels may be allocated to a new route or may be re-allocated as a
   result of a change to a route.  Once an event results in a new
   route, LAT is defined as the interval between such an event and an
   LSR being capable of doing L2 forwarding (having both incoming and
   outgoing labels for local control, or receiving Acknowledgment
   message(s) from upstream neighbor(s) for egress control) for the new
   or the changed route. This interval accounts for shortest path
   calculations, local label manipulations if any, and the time it


Ayandeh and Fan                                                 [Page 4]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998


   takes to communicate this information to other LSRs. Note that the
   scope of such an advertisement may be the LSR's neighbors as with
   local allocation or  more extensive as with egress controls. LAT is
   measured with respect to individual LSRs.

   Note that propagation time of events and label distribution impact
   the label allocation time with a linear multi-hop topology having a
   greater impact on egress controls vs. local controls.

   Significance: LAT allows for a comparison of different label
   distribution schemes such as local and egress controls. It is a
   measure of a network's response to any route change and the speed
   with which it becomes capable of doing label based forwarding.

3.4. Label Allocation Convergence Time (LACT)

   An event may result in a number of LSRs needing to allocate labels.
   The LACT is the longest of such label allocation times through out
   the network under investigation.

3.5. Loop Intensity (LI)

   Routing protocols such as OSPF may create transient routing loops,
   since each node learns the current link status of the network at a
   different time. The impact of loops may be packets which do not
   reach their destination. If prolonged, loops may lead to congestion
   on certain links. Beside L3 loops i.e. resulting from layer-3
   forwarding, there may be L2 loops (which result from forwarding
   based on labels). The latter occurs when using label allocation
   methods which use local control.  

   Let :
   A: be the number of paths with loops; 
   N: be the total number of paths considered (see explanation below)
   L: be the sum of all loop durations.

   LI is then defined as:

        LI = (A*L)/(N*RTCT).

   In this draft, when L3 loops are considered, N is 225 i.e. all
   possible source and destination pairs of 15 LSRs. For L2 loops, N 
   equals to 14 (the number of paths to a single destination N6).

   Significance: LI is a measure of the severity and potential impact
   of loops on payload and network traffic.


4. A Description of the Model and Configurations

4.1. The Simulation Environment


Ayandeh and Fan                                                 [Page 5]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998


   The results in this draft are based on a simulation model of OSPF
   with only one single area. No BGP injections are currently assumed
   in the model. LDP [4] for local control and ARIS [6] protocol for
   the egress control are modeled. The simulation tool used is an
   extended library of C++ classes and functions which have been
   developed for the purpose of discrete event simulations and
   statistical data collection.

4.2 A Single Autonomous System

   Figure 1 shows the single autonomous system which was simulated.
   There are fifteen LSRs (LSR1 - LSR15) and fifteen networks (N1 -
   N15). All the LSRs and networks are connected via SONET links. OSPF
   is used as the IGP and the routing domain has only one area.
              


                      +-----+        +-----+  +-----+
                   N6 |LSR6 |    N10 |LSR10|  |LSR9 | N9
                      +-----+        +-----+  +-----+
                         |               \      /
                         |                \    / 
                      +-----+            +-----+
                   N7 |LSR7 |============|LSR8 | N8
                      +-----+            +-----+
                        //                  \\
                N5     //                    \\   N11
              +-----+ //                      \\+-----+    +-----+
              |LSR5 |//                        \|LSR11|----|LSR12| N12
      N1      +-----+                           +-----+    +-----+
    +-----+     ||                                 ||         |
    |LSR1 |\    ||                                 ||         |
    +-----+ \ +-----+                              ||      +-----+
             \|LSR4 |                              ||      |LSR13| N13
      N2     /+-----+                              ||      +-----+
    +-----+ /   ||                                 ||
    |LSR2 |/    ||                            N15  ||
    +-----+   +-----+                           +-----+    +-----+
           N3 |LSR3 |===========================|LSR15|----|LSR14| N14
              +-----+                           +-----+    +-----+


                    Figure 1: A single Autonomous System


Ayandeh and Fan                                                 [Page 6]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998





4.3. Nodal Model Reflecting a Simple Generic LSR Architecture

    Figure 2 shows the nodal model of LSR from label allocation's
    perspective. 



                  +-----------------------------+
                  |      Label Allocating       |
                  |                             |
                  +-----------------------------+
                  |      Route Computing        |
                  |                             |
                  +--------------+--------------+
                  |  RouteCheck  |  Flooding    |
                  |              |              |
                  +-----------------------------+
                  |        Forwarding           |
                  |                             |
                  +-----------------------------+

                       Figure 2: Nodal Model


    We define the following simulation modules to capture each of the
    components of the nodal model in figure 2:

    a. Forwarding Module: copying packets from input to output
       buffers or internal buffers for route/allocation handling

    b. RouteCheck Module: upon receiving a link state advertisement,
       it checks against link state database record to determine which
       is more recent

    c. Flooding Module: copying packets on all interfaces to neighbors

    d. Computing Module: triggered by the new advertisement,
       calculating the routing table by either Dijkstra (IGP updates)
       or incremental (EGP) algorithm
 
    e. Allocating Module: allocating a label for the stream and
       propagates the binding information to neighbor(s)

    All these modules execute independently i.e. there is no processor
    competition among them. The only exceptions are Computing and
    Allocation Tasks. Both Tasks share the same processor and Computing
    Task has the priority. Figure 3 is the block diagram of an LSR.


Ayandeh and Fan                                                 [Page 7]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998





                                             neighbor(s)  
     
                                                / \ 
                                                 |  binding/requiring
                         binding            +----------+
                  +-----------------------> |Allocating|
                  |                         +----------+
                  |                           / \    |
                  |                     route  |     |
                  |                     change |     | 
                  |                            |     | binding
                  |  +----------+ new   +---------+  |
                  |  |RouteCheck|-----> |Computing|  |
                  |  +----------+\      +---------+  |
                  |    / \    |   \                  |
                  | LSAs|     |    \ new             |
    binding       |     |     |     \               \ / 
               +----------+   |      \          +--------+      
    LSAs ----> |Forwarding|   |       --------> |Flooding|--> all
               +----------+   |                 +--------+    neighbors
    user traffic    |         |
                   \ /       \ /
              user traffic   old LSAs


                    Figure 3  Model of an LSR
     

4.4. Work Times Used Within a Node

    Forwarding Task
       User packet arrival interval has the Weibull distribution:
            W(4.0, mean/24.0);
       User packet length has Weibull distribution: W(1.1, 1.143)
    RouteCheck Task
       Checking Time is 10.0 us
    Flooding Task
       Transmission time: packet length/link bandwidth
    Computing Task
       Computing time (100 mips): 57.5 us
    Allocation Task
       local/remote without attribute checking: 1.0 us; 
       local/remote with attribute checking: 1.5 us; others: 0.2 us

4.5. Nodal and Link Capacities Used in The Network Model

    Default configuration for each LSR are


Ayandeh and Fan                                                 [Page 8]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998


       a) CUP power available for Computing and Allocating is 
          100 mips
       b) Computing has priority to use the CPU power
       c) Maximum forwarding capacity is 836,000 pps

    The regular links are OC3 links and the bold links are 2xOC3
    capacity.

5. Label Allocation Schemes (Local vs. Egress Controls)

   A fundamental concept in MPLS is the association of labels and
   network layer routing. The difference between the various
   contributions, lie in what the label assignment trigger is, at what
   degree of granularity the labels are assigned, and how the
   label/stream binding information is distributed.

   In this draft, we only consider the topology-driven allocation for
   fine granularity (i.e. per layer-3 route) and explicit label
   distribution. There are two types of topology-driven allocation:
   local control and egress control.

   Label allocation methods we investigate in this draft include:

   Local Control(and its three flavors): [3,4]

       a) downstream label allocation (DS)
       The label allocation is done by the downstream LSR, i.e. the 
       incoming labels are allocated locally and the outgoing labels
       are allocated remotely(by downstream LSR).

       b) upstream label allocation (US)
       The label allocation is done by the upstream LSR, i.e. the 
       incoming labels are allocated remotely (by the upstream LSR)
       and outgoing labels are allocated locally.

       c) downstream label allocation on demand (DSD)
       The label allocation is done by the downstream LSR upon 
       receiving the request from the upstream LSR, i.e. the incoming
       labels are allocated locally after receiving the request from
       upstream LSR and the outgoing labels are allocated remotely(by
       the downstream LSR).

   Egress Control [2,5,6]

       a) basic egress label allocation (BEC)
       Only the egress node may initiate the transmission of label
       allocation information. Non-egress nodes wait until they get a 
       label from downstream LSR, allocate a label and pass a 
       corresponding label to upstream LSR(s).



Ayandeh and Fan                                                 [Page 9]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998


       b) egress label allocation with diffusion (ECD)
       Works in the same way as basic egress label allocation except
       that it uses "diffusion computation" to prune the upstream tree
       before replacing the old LSP with a new LSP. This is done by 
       removing all LSRs from the tree that would otherwise be on a 
       looping path.
 
      
6. Routing Dynamics: Results and Discussion

   We use a what-if analysis approach. We create an event, observe the
   consequences, and measure the metrics defined in section 3. There
   are two type of events:
      a) Event 1(E1): A new network attaches to LSR2.
      b) Event 2(E2): The Link between LSR5 and LSR7 goes down. This
         link failure will affect multiple routes and we only consider
         the label allocation for the route of N6 at this time.

   Event 1 is the case that an LSP is initially set up and Event 2 is
   the case that the LSP in question is changed. The events are
   repeated a number of times in order to cope with the randomness of
   user traffic. Unless stated explicitly, the values we present are
   the average value over the runs. Each LSR is fed with user traffic.
   The LSR Utilization(LU) is the percentage of LSRs' maximum
   forwarding capacity which is used by user traffic. By default, all
   LSRs have the same LU. The time unit used in this draft is
   microseconds.
   
6.1. Label Allocation Convergence

6.1.1. Label Allocation Convergence Time (LACT)

   The Table 1 and Table 2 give the comparison of LACT among the five
   label allocation methods.

           Table 1  LACT Comparison (E1)
          -----------------------------------
           LU |  DS | US  | DSD | BEC | ECD 
          ----+-----+-----+-----+-----+------
           10 | 157 | 162 | 167 | 164 | 164
           20 | 169 | 175 | 183 | 179 | 179
           30 | 187 | 197 | 208 | 202 | 202
           40 | 230 | 239 | 258 | 249 | 249
           50 | 267 | 289 | 322 | 310 | 310
           60 | 371 | 415 | 467 | 442 | 442
           70 | 550 | 627 | 719 | 670 | 670
           80 | 918 | 1050| 1276| 1139| 1139
          -----------------------------------

       


Ayandeh and Fan                                                [Page 10]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998


           Table 2  LACT Comparison (E2)
          -----------------------------------
           LU | DS  | US  | DSD | BEC | ECD
          ----+-----+-----+-----+-----+------
           10 | 113 | 108 | 117 | 127 | 147
           20 | 120 | 115 | 128 | 138 | 165
           30 | 132 | 127 | 142 | 157 | 202
           40 | 154 | 146 | 170 | 188 | 250
           50 | 186 | 177 | 212 | 244 | 343
           60 | 250 | 229 | 293 | 350 | 541
           70 | 375 | 349 | 462 | 555 | 935
           80 | 604 | 550 | 760 | 907 | 1633
          -----------------------------------

   For the initial setup of an LSP (E1), the LACT differences among
   the five allocation methods are small, especially for the light to
   medium LSR utilization. The downstream allocation has the best
   performance and downstream on demand allocation is the worst in term
   of the LACT. The two egress control allocation methods have the same
   performance since no diffusion computation is required.

   For the LSP change (E2), the egress control allocation has much
   longer LACT than local control(see Table 2). Comparing with DS
   (LU=80%), BEC and ECD will take 50% and 170% more time to converge.
   This means that local control's period of instability is shorter than
   egress control in the face of transients. Egress controls on the
   other hand avoid transient loops. Both schemes however are equally
   exposed to other forms of loops, which occur outside of the transient
   interval, i.e. the ones that are caused by other factors such as say
   software bugs. This latter category however are likely to occur with
   a much lower frequency.  

   For the local control, the DS and US have the roughly same
   performance and DSD has the worst LACT. For the egress control, BCE
   and ECD have the same LACT if there is no diffusion computation
   involved, otherwise the difference may be noticeable.

6.1.2. Label Allocation Time (LAT)

   Tables 3 and 4 show the comparison of label allocation time
   (LAT) among the five allocation methods at 60% LSR utilization.



Ayandeh and Fan                                                [Page 11]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998


          Table 3 LAT Comparison (E1)
        --------------------------------------
          LSR  | DS  | US  | DSD | BEC | ECD
        -------+-----+-----+-----+-----+------
         LSR01 | 189 | 183 | 241 | 193 | 193
         LSR02 | 97  | 169 | 97  | 174 | 174
         LSR03 | 186 | 242 | 229 | 257 | 257
         LSR04 | 150 | 241 | 193 | 256 | 256
         LSR05 | 189 | 249 | 238 | 267 | 267
         LSR06 | 271 | 263 | 322 | 286 | 286
         LSR07 | 229 | 302 | 278 | 327 | 327
         LSR08 | 260 | 334 | 300 | 359 | 359
         LSR09 | 299 | 293 | 358 | 320 | 320
         LSR10 | 292 | 283 | 337 | 312 | 312
         LSR11 | 266 | 308 | 306 | 334 | 334
         LSR12 | 290 | 348 | 332 | 375 | 375
         LSR13 | 327 | 322 | 373 | 348 | 348
         LSR14 | 268 | 262 | 317 | 278 | 278
         LSR15 | 228 | 310 | 269 | 330 | 330
        --------------------------------------

          Table 4 LAT Comparison (E2)
        --------------------------------------
          LSR  | DS  | US  | DSD | BEC | ECD
        -------+-----+-----+-----+-----+------
         LSR01 | 182 | 177 | 233 | 287 |  -
         LSR02 | 187 | 180 | 234 | 289 |  -
         LSR03 | 208 | 177 | 230 | 276 | 407
         LSR04 | 193 | 224 | 207 | 347 | 541
         LSR05 | 166 | 97  | 173 | 285 | 497
        --------------------------------------

   For event 1, all LSRs will allocate a label for the route of the new
   attached network. The following is the label allocation LSR sequence,

      DS  : LSR2, 4, 3, 5, 1, 15, 7, 8, 11, 14, 6, 12, 10, 9 and 13
      US  : LSR2, 1, 4, 3, 5, 14, 6, 10, 9, 7, 11, 15, 13, 8 and 12
      DSD : LSR2, 4, 3, 5, 1, 15, 7, 8, 11, 14, 6, 12, 10, 9 and 13
      BEC : LSR2, 1, 4, 3, 5, 14, 6, 10, 9, 7, 15, 11, 13, 8 and 12
      ECD : LSR2, 1, 4, 3, 5, 14, 6, 10, 9, 7, 15, 11, 13, 8 and 12

   For event 2, only LSR1 - LSR5's label for N6 will be affected by
   either re-allocation or re-sending/checking binding information. The
   LSR sequence is as follows,

      DS  : LSR5, 1, 2, 4 and 3
      US  : LSR5, 3, 1, 2 and 4
      DSD : LSR5, 4, 3, 1 and 2
      BEC : LSR3, 5, 1, 2 and 4
      ECD : LSR3, 5 and 4


Ayandeh and Fan                                                [Page 12]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998



   From the LSR sequence we may see the differences among those
   allocation methods and get the sense of why one takes longer than
   the other.

6.2. Impact on the Routing Convergence

   Table 5 shows the impact on routing (OSPF) and the RTCT among the
   five allocation methods generically referred to as MPLS. Differences
   between label allocation schemes are negligible (within 0.17 %) 
   since the work times assumed are similar and SP calculation has 
   priority over label processing.

           Table 5  RTCT Comparison (E1&E2)
          --------------------------------
              |     E1      |     E2
          ------------------+-------------
           LU | OSPF | MPLS | OSPF | MPLS    
          ----+------+------+------+------
           10 | 155  | 155  | 126  | 126
           20 | 166  | 166  | 136  | 136
           30 | 184  | 184  | 148  | 149 
           40 | 224  | 225  | 181  | 181
           50 | 259  | 259  | 214  | 215 
           60 | 360  | 361  | 289  | 291
           70 | 539  | 539  | 429  | 431
           80 | 895  | 895  | 734  | 735 
          --------------------------------

   The impact of running explicit label distribution protocols is very
   small on the routing convergence and the five methods have almost
   the same performance and impact.

   These simulation results very much depend on the assumption about
   work-time value of Allocating Task. In order to assess the
   sensitivity of the results with respect to this assumption, we
   enlarge these value by a factor of 10 and 25 times. This analysis
   shows that the simulation results are not sensitive to those values
   until label allocation will take roughly the same or more time than
   routing table computation. This is only likely to occur for a large
   number of route updates. For details see Appendix A.

6.3. Loops

   In order to capture the multiple hop loops, we add an asymmetric
   link to the network between LSR5 and LSR15. The bandwidth from LSR15
   to LSR5 is 3xOC3 and the bandwidth from LSR5 to LSR15 is half an OC3.

   We consider Event 2 (E2 where link between LSR5 and LSR7 goes down)
   with no repetition. Our simulation results show that the same link


Ayandeh and Fan                                                [Page 13]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998


   failure event does not always create the L3 transient multiple-hop
   loop and the L3 transient loop does not always result in L2 loop.
   The occurrence of transient loops (L3, L2) very much depends on the
   sequence of (events leading to) routing table updates and label
   allocation.

   The following is the case that we capture both L3 and L2 multiple-hop
   transient loops. The LU is 80%. The next hops to N6 before and after
   the link failure are

            Table 6   Next Hop to N6
          ----------------------------
            LSR No. | Before | After    
          ----------+--------+--------
               5    |  LSR7  |  LSR4
               4    |  LSR5  |  LSR3
               3    |  LSR15 |  LSR15  
               15   |  LSR5  |  LSR11  
          ----------------------------

6.3.1. L3 Loop

   There is one four-hop transient loop. Taking the link failure as
   start of time, this loop starts at 113 and ends at 354, with the
   duration of 241 us.


          +--> LSR5 --> LSR4 --> LSR3 --> LSR15 --+
          |                                       |
          +---------------------------------------+

   The loop intensity index (LI) is 0.019 ((14*241)/(225*773)).

6.3.2. L2 Loop
  
   a) Local Control
   There is one multiple L2 transient loop when using downstream
   allocation (DS)


          +--> LSR5 --> LSR4 --> LSR3 --> LSR15 --+
          |                                       |
          +---------------------------------------+


   This loop is from 212 to 354 and its duration is 142 us. The L2
   loop is completely covered by the L3 loop.

   The LI is 0.092 = ((7*142)/(14*773)).


Ayandeh and Fan                                                [Page 14]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998



   Note: Since the set of paths in L3 and L2 looping consideration is
   different, currently the LIs in two cases are not comparable. 

   Table 7 below shows the routing table update time and the label
   allocation times which lead to the loops described above.

           Table 7  RTUT & LAT
          ----------------------
           LSR No.| RTUT | LAT      
          --------+-------------
              5   | 100  | 212
              4   | 113  | 212 
              3   | 208  | 209  
              15  | 354  | 355  
          ----------------------

   b) Egress Control
   Egress control label allocation can avoid L2 loop setup. With the
   same L3 transient loop, egress control allocation allows L2
   loop-free LSP setup. This is because egress control uses the LSR
   path vector, similar in function to the BGP AS_PATH attribute. 

6.3.3. Discussion

   The L3 loop is the same for both local control and egress control.
   Local control may create the L2 loop. This is because an LSR can
   assign a label whenever it wants. This independence results in a
   short LACT (355 us). The shorter LACT is, the more stable the
   networks is.

   Egress control can avoid the L2 loops. The reason is that when
   routes are changing, the node R detects a change in the next hop
   for a given stream, it asks its new next hop for a label and the
   associated LSR ID list.  It then makes sure that R's ID is not in
   the LSR ID list received from the new next hop. R either grows a new
   switched path upstream tree to replace the old LSP (BEC) or starts
   a "Diffusion Computation" to prune the tree upstream of R by
   removing all LSRs on that tree which would be on a looping path if R
   was to switch-over to the new path (ECD). Then R can replace the old
   LSP with the new LSP without causing any loops.

   Egress control introduces the dependence among the LSRs. This
   dependence results in a longer LACT, 872 for BEC and 1586 for ECD.
   LACT increases by 517 (146%, more) and 1231 (347% more), compared
   with DS. 

   The benefits of using ECD is avoiding the unnecessary unsplicing/
   resplicing (LSR1, 2, 3, 14 in this case). The disadvantage of ECD
   is the longer LACT. Comparing with BEC, ECD increases LACT by 714


Ayandeh and Fan                                                [Page 15]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998


   (82% more). The comparison between BEC and ECD is shown in Table 8.

           Table 8  LAT
          ----------------------
           LSR No.| BEC | ECD
          --------+-----+-------
           LSR01  | 855 |  -
           LSR02  | 822 |  -
           LSR03  | 872 | 1004
           LSR04  | 858 | 1586 
           LSR05  | 862 | 1288
           LSR14  | 734 |  -
           LSR15  | 822 | 1117
          ----------------------


7. Conclusion

   Label allocation convergence time (LACT) is the important metric to
   indicate how fast the whole network is fully capable of doing label
   forwarding again after a route change. To design an LDP with short
   LACT is one major goal of network design. Simulations show that when
   topology changes, local control has much shorter LACT than egress
   control. The period of layer-2 instability is therefore shorter for
   local controls. However this period may be accompanied by layer-2
   loops.

   The topology change may cause L3 transient loops and the L3 loops
   may result the L2 loops. Our simulation shows that the L2 loops are
   overlapped by the L3 loops completely and that their duration is
   shorter than the latter. The egress control can avoid the L2 loops
   and local control can not. However egress controls suffer from a
   longer period when forwarding based on labels may not be available.
   This period overlaps with the interval when layer-3 loops are
   present (i.e. layer-3 forwarding may not help). More simulations are
   currently underway to quantify the benefits of each scheme using a
   single metric.

   Egress control with diffusion can avoid the unnecessary unsplicing/
   resplicing when using basic egress control, but again with a longer
   LACT.  

   Route Computation should have the priority over the label allocation,
   since LIB is built on top of FIB. Our simulation results show that
   for single route updates, running LDP has a minor impact on routing
   convergence. The five label allocation methods have almost the same
   and minor impact on routing convergence(see Appendix A).


   Appendix A  The sensitive analysis of the time of label allocating
               (in Allocating Task)

Ayandeh and Fan                                                [Page 16]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998



   The assumption we make on Allocating Task is as following
       handling local assignment is 1.0 us, remote assignment 1.0 us,
       local/remote assignment with LSR ID checking 1.5 us and all
       others 0.2 us
   To make thing worse, we enlarge those value by 10 and 25 times.
   There are three sets of values
       set 1:  1.0/1.0/1.5/0.2
       set 2:  10.0/10.0/15.0/2.0
       set 3:  25.0/25.0/37.5/5.0
   Using set 2, the time for allocation is still smaller than computing
   time(57.5 us), and the time for allocation is almost the same at
   some LSRs and lager in most LSRs than computing time for set 3. The 
   LU is Low(20), Median(50) and High(80)
  
 
      Table 9  LACT (E1)
     ----------------------------------------------
      Set No. |       DS        |       BEC     
              +-----+-----+-----+-----+-----+------
              | Low | Med | High| Low | Med | High  
     ---------+------+-----+------+-----+------+---
      No. 1   | 169 | 267 | 918 | 179 | 310 | 1139
      No. 2   | 188 | 289 | 939 | 225 | 368 | 1212
      No. 3   | 218 | 325 | 974 | 343 | 483 | 1324
    -----------------------------------------------


      Table 10  RTCT (E1)
     ----------------------------------------------
      Set No. |       DS        |       BEC     
              +-----+-----+-----+-----+-----+------
              | Low | Med | High| Low | Med | High  
     ---------+-----+-----+-----+-----+-----+------
      No. 1   | 166 | 259 | 895 | 166 | 259 | 895
      No. 2   | 166 | 259 | 895 | 166 | 259 | 895
      No. 3   | 166 | 260 | 898 | 166 | 259 | 895
    -----------------------------------------------


   
   
References

[1] "A Framework for Multiprotocol Label Switching", R. Callon, P. 
Doolan, etc., work in progress, draft-mpls-framework-01.txt, July
1997
[2] "A Proposed Architecture for MPLS", E. Rosen, A. Viswanathan,
R. Callon, work in progress, draft-ietf-mpls-arch-00.txt, August 1997


Ayandeh and Fan                                                [Page 17]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998


[3] "Cisco Systems' Tag Switching Architecture Overview", Y. Rekhter,
B. Davie etc., RFC 2105, February 1997
[4] "Tag Distribution Protocol", P. Doolan, B. Davie etc., work in
progress, draft-doolan-tdp-spec-01.txt, May 1997
[5] "ARIS: Aggregate Route-Based IP Switching", A. Viswanathan etc.,
work in progress, draft-viswanathan-aris-overview-00.txt, March, 1997
[6] "ARIS Specification", N. Feldman and A. Viswanathan, work in
progress, draft-feldman-aris-spec-00.txt, March 1997
[7] "OSPF version 2", J. Moy, RFC 1583, March 1994

  
Authors' Address

     Siamack Ayandeh
     Northern Telecom
     P.O Box 3511, Station C
     Ottawa, Ontario
     Canada
     (613) 763-3645
     ayandeh@nortel.ca

     Yanhe Fan
     Northern Telecom
     P.O Box 3511, Station C
     Ottawa, Ontario
     Canada
     (613) 765-2315
     yanhe@nortel.ca




Ayandeh and Fan                                                [Page 18]

Internet Draft        draft-ayandeh-mpls-dynamics-00.txt      March 1998