Network Working Group                                    Wang Yahong
Internet-Draft                                           Datang Telecom
Expiration Date: June 2000                               Dec 1999

                  
                    A probe method in MPLS loop avoidance

                     <draft-wang-mpls-loop-probe-00.txt>

Status of this Memo

    This document is an Internet-Draft and is in full conformance with
    all provisions of Section 10 of RFC2026.

    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF), its areas, and its working groups.  Note that
    other groups may also distribute working documents as Internet-
    Drafts.

    Internet-Drafts are draft documents valid for a maximum of six
    months and may be updated, replaced, or obsoleted by other documents
    at any time.  It is inappropriate to use Internet-Drafts as
    reference material or to cite them other than as "work in progress."

    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/ietf/1id-abstracts.txt

    The list of Internet-Draft Shadow Directories can be accessed at
    http://www.ietf.org/shadow.html.

    To view the list Internet-Draft Shadow Directories, see
    http://www.ietf.org/shadow.html.

Abstract

This paper introduces a very simple mechanism, based on "probe" 
message, which can be used for loop avoidance in the case that a LSP 
is changed. This probe mechanism is based on the idea that in the 
situation of one LSR has a new next hop and changes the LSP to new next
hop, if this change leads to a loop, the only reason for the loop 
formation is that the LSR changes its next hop. So that we use a very 
simple length-fixed probe message, no state is needed to keep in the 
tandem LSRs, and the message primitives are very limited and simple. 
Also this method is flexible and robust.


Wang Yahong             Expires    June 2000                  [Page 1]

Internet Draft        draft-wang-mpls-loop-probe-00.txt      June 1999


1. Why we need a simple loop avoidance mechanism

It is clear that some mechanisms should be provided in MPLS to avoid
the negative effect of LSP Loop formation[1-3], especially for the LSRs
which can not process the TTL. There are several solutions to loop 
problem in MPLS,such as loop prevention,loop detection,loop mitigation. 
So a loop avoidance mechanism should be simple and effective, can it 
have a better competitive ability. 

Also in MPLS, the loop is not frequently formed, it is a little 
probability event, the time is urgent especially in case of one LSP is
changed, so if a loop avoidance mechanism is too complex, the price is
also very high. 

The current two mechanisms, diffusion(path-vector) algorithm and 
colored thread algorithm[3-4], are complex, diffusion algorithm need a 
length-variant path vector message, while "colored thread" is too 
complex in algorithm self(a complex state machine), it also needs the
LSRs involved to keep some states, store some thread messages, and do
five primitive threads action.

Complex means less stable, more difficult to understand and 
implement, and may delay the time to detect a loop. So we need a simple
and clear loop avoidance method, in this paper, we propose a very simple 
loopavoidance method in the case that a LSP is changed(not the case a 
LSP is built up),called "probe" method.
 
As the draft is in it's preliminary form, so some items are not
explained clearly,the explanations and background information can be
found in[1][2][3].

2. Basic idea

The "probe" method is based on the watching result in the process of 
a loop formation in the case that a LSR changes its next hop and so 
changes a LSP. 

In the situation of one LSR has a new next hop and changes the LSP to 
new next hop, if this change leads to a loop, the only reason for the 
loop formation is the LSR changes its next hop.

This result is clear and straight forward. In normal situation, before 
one LSR changes its next hop, there is no loop. So if the LSR changes 
the next hop and leads to a loop formation, there is only one action, 
the LSR changes its next hop. And there is at least one action for LSP 
changes from a loopless state to a loop state. So that the only reason 
for the loop formation is the LSR changes its next hop.

We can get some further reductions from this result. If one LSR have a
new next hop and change the LSP for certain FEC[1] to new next hop, 
there only two situation, create a loop or not. And in the situation of
a loop formation the LSR must be in the loop. So if the LSR sends a 

Wang Yahong             Expires    June 2000                  [Page 2]

Internet Draft        draft-wang-mpls-loop-probe-00.txt      June 1999


message(probe) downstream, the message can only reach the egeress LSR
(loopless) or return to itself(loop), no other situation
(if the message has enough time to live). 

3. The Probe method

The probe method is used for loop avoidance in the case that a LSP is
changed because one or more LSR changes its next hop.

The basic mechanism of "probe" method is that when one LSR, referred 
as "Source LSR", has a new next hop and changes the LSP to new next hop, 
the Source LSR sends a probe message download with its LSR id(such as 
the IP address), so it can judge if there is loop for this next hop 
change based on the probe message. There are two alternatives for the 
detailed action. 

3.1 Optimistic method

To a LSP, once a LSR in the LSP has a new next hop to egress LSR in the
LSP, so it changes it's next hop to the new LSR, now it can not decide 
if this change leads to a loop. So it sends a Probe Message(carries the 
LSR id and the flag to indicate it is a probe message) downstream to the
new next hop LSR. To the Probe Message, we called the LSR who originally
sends it is "Source LSR", the egress LSR for the LSP is also called as 
Egress LSR, the other LSR in the LSP is Tandem LSR.

Once the Source LSR sends the Probe, it start a Timer(Max hops * time 
spent in each hop) .If Source LSR receive the Probe Message before the 
Timer is time out, it can decide the new hop leads to a loop, if no, it
think the new LSP is loop free. The duration of the Timer is determined, 
during the time, it can do nothing, or it can send the data along the 
new LSP,but cache the data sent before the timer is time out, if loop is
detected, it can send these data to a new alternative LSP.

The Tandem LSR receive the Probe Message, it just send it downstream, 
do nothing else, when the probe message reaches the Egress LSR, it just
is discard.

3.2 Pessimistic method 

To a LSP, once a LSR in the LSP has a new next hop to egress LSR in the 
LSP, so it changes it's next hop to the new LSR, now it can not decide 
if  this change leads to a loop. So it sends a Probe Message(carries the
LSR id and the flag to indicate it is a probe message) downstream to the
new next hop LSR.

Once the Source LSR sends the Probe, it start a Timer(Max hops * time 
spent in each hop) . So during the time-out period, it waits for the 
ACK message from the Egress LSR, if the timer is triggered but the 

Wang Yahong             Expires    June 2000                  [Page 2]

Internet Draft        draft-wang-mpls-loop-probe-00.txt      June 1999


Source LSR have not receive the ACK from the Egress LSR, so it can 
decide there is a loop. Else if the Source LSR receives the ACK Message 
from the Egress LSR before the timer is time out, it can judge there is 
no loop.

The Tandem LSR receives the Probe Message, it just sends it downstream, 
do nothing else, when it receives the ACK message from the Egress LSR, 
it just sends it upstream, do nothing else. When the probe message 
reaches the Egress LSR,the Egress LSR responses to send back a probe 
acknowledgement message, for short as ACK. 

The duration of the Timer is determined, during the time, it can do 
nothing, or it can send the data along the new LSP, but cache the data 
sent before the timer is time out, if loop is detected, it can send 
these data to a new alternative LSP.

Though the two methods are very similar, we give them separate names. 
Because for the Optimistic method, we suppose that the initial state is
loopless, and we send a probe message to affirm there is a loop. For 
the Pessimistic method, we suppose that the initial state is loop, and 
we send a probe message to affirm there is a loopless.
 
Because the two method are based on a timer(for Pessimistic method, 
maybe we can use a timer-less method, but if the time is too long to 
receive a ACK message, the L3 transient loop maybe disappear[5]). So 
they both can be affected by the bottleneck node(LSR) that where 
processing load is terribly high, then a Probe message will be delayed 
too much. So that for Optimistic method in some cases, loop formation 
will be omitted, for Pessimistic method, in some cases, some loopless 
situation maybe mistakenly judged to be loop. we can combine the two 
methods to get a better result. And we can adopt some means, such as 
sends a probe message several copies at a time, or use a local 
acknowledgement method to accelerate the process of probe message. 
We need a simulation and quantitative analysis and comparison to decide 
the details and do some selection. 

We use an example from [4] to explain this probe methode, in the figure 
below, thers is a break between LSR R5 and R6, so that R5 change its 
next hop to R1, so that there is loop R5-->R1--->R3--->R5. When R5 
change the next hop from R6 to R1, it sends a Probe meassage downstream
to R1,the probe passes form R1---> R3, then back to R5, The Source LSR 
can detect this loop in a very short time.


       R1------>R3------>R5------>R6------->R8
               ^        ^                  ^
               |        |                  |
               R2       R4                 R7


Wang Yahong             Expires    June 2000                  [Page 3]

Internet Draft        draft-wang-mpls-loop-probe-00.txt      June 1999


4. Discussion

4.1 How and when a MPLS loop can be formed?

Though the reasons for a loop formation are not too clear by now, but 
there are at least three explicit reasons: Transient L3 route loop(no 
TTL ability in some LSRs), route mis-configuration, a path psss through 
a MPLS domain and other non-MPLS domain. 

Loop maybe formed by two actions. One is that  the LSP is bulit up, 
whether in a ordered or independent, or in downstream or 
downstream-on-demand[2]. the other is that one (or more )LSR changes 
the next hop for the same FEC and so changes the LSP.  

4.2 Loop prevention and Loop detection

For Probe Method, if during the timer-out period, the Source LSR does 
not send any data on the new LSP, it is loop prevention, if the Source 
LSR does send data on the new LSP, it is loop detection.

4.3 TTL in the probe message

Generally, to avoid a Probe Message forms a loop, or is sent endless, 
a TTL field is needed. If we only use the Optimistic method, in the 
situation of only one Source LSR, TTL is no needed, because the Probe 
Message is sent downstream, so one result is that it reach the egress 
LSR(no loop), the second result is it return the Source LSR, no other 
results. But in the case of two or more Source LSR in one LSP, the case 
is complex,further study is need.And in the pessimistic method, the ACK 
message may need a TTL to avoid the Tandem LSR to keep some states.  

4.4 Send downstream

Because the LSP is lable-merged or VC-merged in some LSRs, so we think 
the best probe method is to send the probe downstream, not upstream. So 
that the process is simple and the number of LSRs involved is less. 

4.5 Value of the Loop Timer

It is very difficult for determine the time-out value of timer, though 
the value can be equal to Max hops *time spent in each hop(Maximum or 
average ). A feasible value need to determined by some simulation and 
affected by current network size and related protocol.

4.6  Probe method used in the LSP setup case

Because in the case of LSP setup, because the LSR that initiate the LSP 
may not in the Loop if a loop is formed. But I think the idea behind the 
probe method may do some help. And in the case of LSP setup, time is not 
too urgent, so we can use a simple but maybe stupid method, such as the


Wang Yahong             Expires    June 2000                  [Page 4]

Internet Draft        draft-wang-mpls-loop-probe-00.txt      June 1999


Confirm Information form an Egress LSR to Ingress LSR to determine there 
is no loop in this LSP.

It is clear the probe method is very simple and also in its preliminary 
stage. Some questions should be for further discussion and simulation. 

5. Conclusion

The probe method is very simple method for loop avoidance in MPLS, its 
probe message only carries the LSR id and the flag to indicate it is a 
probe message, message length is fixed and short. The tandem LSRs just 
forward the probe message, the action is very simple, the tandem LSRs 
may need not know the message is for Loop probe. No state is needed to 
keep and no received message is needed to stored except the Souce LSR 
(Pessimistic method) who sends this probe and the Egress LSR(Optimistic 
method).Also it is very flexible and robust.

6. Author's Address
Wang Yahong
Beijing R&D Center
Datang Telcom 
40 Rd. XueYuan, 100083
Beijing, P.R.China.
wangyh@catt.ac.cn 

7. References 

[1] R. Callon, et al, " A Framework for Multiprotocol Label Switching", 
    September 1999, <draft-ietf-mpls-framework-05.txt>
[2] Eric C. Rosen, et al, " Multiprotocol Label Switching Architecture", 
     August 1999, draft-ietf-mpls-arch-06.txt 
[3] Y. Obha, et al, "MPLS Loop Prevention Mechanism", October 1999, 
    <draft-ietf-mpls-loop-prevention-02.txt>
[4] Y. Obha, "Issues on loop prevention in MPLS Networks", pp64-68, 
    IEEE Com Mag, Dec,1999
[5] S.Ayandeh, Y. Fan, " MPLS Routing Dynamics"(Expired)


Wang Yahong             Expires    June 2000                  [Page 5]