Traffic Engineering Working Group                       Vanessa Springer
Internet Draft                                         Craig Pierantozzi
Expiration Date: February 2001                                 Jim Boyle


                                                             August 2000


                   Level3 MPLS Protocol Architecture


                   draft-springer-te-level3bcp-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


Springer, et al.                                                [Page 1]

Internet Draft     draft-springer-te-level3bcp-00.txt        August 2000


Abstract

   This paper discusses Traffic Engineering with Multi-Protocol Label
   Switching (MPLS) in the Level3 network. A brief overview of traffic
   engineering is given, followed by the constraints affecting Level3's
   design. The approach Level3 will use, an LDP edge with an RSVP-TE
   core, is presented. Several other architectures were considered
   before deciding upon a design. These are discussed, along with the
   reasons they were ultimately rejected.


Table of Contents

    1      Specification of Requirements
    2      Introduction
    3      Design Constraints
    4      MPLS Architecture Chosen
    5      Other Architectures
    6      Systems Issues
    7      Conclusion
    8      Author Information


1. Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.


2. Introduction

   The task of mapping traffic flows onto an existing topology so as to
   optimize resource utilization and network performance is called
   Traffic Engineering [TE].  Optimization refers to finding the minimum
   or maximum of a given function.  An ideal function would allow for
   minimizing the maximum utilization in a network.  However, this would
   require computing paths for demands offline and configuring these
   down into the network elements. This also would not allow proper
   response to different failure scenarios.  TE can also be used more
   loosely to refer to networks that associate bandwidth with point to
   point macro-flows in order to try to avoid over-utilization of
   resources.


   Without TE, traffic follows the shortest path calculated by the IGP.
   Doing so conserves network resources, but the shortest path is not
   always the optimal one. Shortest paths may overlap, creating
   congestion, while other paths are under-utilized. One solution to
   this problem is IGP metric manipulation. However, this can create
   unforeseen congestion in other parts of the network.

   MPLS provides a method of traffic engineering. With MPLS, traffic
   engineering attempts to control traffic on the network using
   Constrained Shortest Path First (CSPF). CSPF calculates the shortest
   path that satisfies a set of constraints, such as available
   bandwidth. This gives the operator control over the path the LSP
   will take. The resulting path may not always be the shortest path,
   but it will utilize paths that are less congested.
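
   The CSPF idea described above can be sketched in a few lines. This
   is a minimal illustration and not any vendor's implementation: it
   assumes the only constraint is available bandwidth, and the node
   names, metrics and bandwidth figures are hypothetical.

```python
import heapq


def cspf(links, src, dst, bandwidth):
    """Shortest path from src to dst using only links that can carry `bandwidth`.

    links: iterable of (node_a, node_b, igp_metric, available_bandwidth),
    treated as bidirectional.  Returns (cost, [nodes]) or None if no
    path satisfies the constraint.
    """
    # Constraint step: prune links lacking the requested bandwidth.
    graph = {}
    for a, b, metric, available in links:
        if available >= bandwidth:
            graph.setdefault(a, []).append((b, metric))
            graph.setdefault(b, []).append((a, metric))

    # Shortest-path step: ordinary Dijkstra on what remains.
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, metric in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + metric, neighbor, path + [neighbor]))
    return None


# Hypothetical topology: (node, node, IGP metric, available Mb/s).
LINKS = [
    ("DEN", "CHI", 10, 200),
    ("CHI", "NYC", 10, 200),
    ("DEN", "DAL", 15, 2000),
    ("DAL", "NYC", 15, 2000),
]

print(cspf(LINKS, "DEN", "NYC", 100))  # fits on the IGP shortest path
print(cspf(LINKS, "DEN", "NYC", 500))  # pruned onto the longer path
```

   A small demand stays on the IGP shortest path via CHI, while a
   demand larger than that path can carry is placed on the longer path
   via DAL.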

   This memo focuses entirely on the use of Traffic Engineering in
   Level3's network. Specifically, it describes how Level3 will utilize
   MPLS to enhance network performance.


3. Design Constraints

   Level3 moved away from ATM because growth necessitated a new WAN
   switch. When considering alternative protocol architectures, a few
   constraints were taken into account.  These can be summarized as
   follows:


           o Cost effectiveness and switching hardware roadmap
           o Multiple Routing Domains
           o Avoid data plane hierarchy within the gateway
           o Edge to Edge MPLS (for Customer VPNs)
           o Fast Reroute


   The chosen technology should have broad vendor support and a clear
   path to 10x and 100x OC192 chassis offerings.

   Level3 supported multiple services over a single ATM WAN.  This
   proved very cost effective while using leased capacity.  It is
   expected to remain cost effective for the near to mid term over owned
   facilities.  Supporting multiple services over a common core requires
   the ability to separate service domains in terms of routing
   information, queuing on multi-service links, and forwarding
   isolation.  It was desired to have an architecture that did not
   necessitate purchase of additional routing equipment to consolidate
   traffic from within a service, as often is the case with routers
   meshed over ATM architectures.


   Support of edge to edge MPLS is required to allow for customer VPN
   service.  This made for some interesting protocol scenarios when
   considering a network with on the order of 30 gateways, each with up
   to 8 edge routers.

   The last requirement is fast rerouting capability. This is somewhat
   related to having a multi-service core, but it is most strongly
   driven by carrying a voice service over the core.  Besides obvious
   call quality issues, where too much drop-out (on the order of 2-4
   seconds) can lead a caller to believe that the call has been
   dropped, several of the signalling entities in the network are not
   very tolerant of network unavailability (on the order of 5-10
   seconds).  Level3's goal in this area was 45 ms; however, 2 seconds
   was set as an upper limit.


4. MPLS Architecture Chosen

   Based on capacity, cost, and the future product mix, the decision
   was made to use Packet over SONET (POS). MPLS enhances POS networks
   by enabling the support of multiple services, customer VPNs, traffic
   engineering, and fast reroute, all of which are architectural design
   requirements. MPLS/POS does this in a manner that is more scalable
   and manageable than ATM.

   At this time, Level3's architecture consists of an IP edge with an
   RSVP-TE core.  The RSVP-TE backbone is fully meshed; in the US it
   currently consists of 20 routers and will grow to around 30 by year
   end 2000.  Plans are to roll out LDP as it becomes available in
   production software, or in single-vendor mode for limited VPN
   offerings. Layer 2 tunneling [L2TUN] across the core is also used,
   in a limited fashion, to carry internal network traffic as well as
   to tunnel multicast IP across the non-multicast enabled core.

   This architecture was chosen based on its simple, scalable design,
   given the constraints. This model also supports MPLS edge to edge,
   while RSVP delivers the bandwidth management and reconvergence speeds
   that are required.

   Besides requirements somewhat unique to Level3, such as fast
   reroute, the decision to add the complexity of RSVP-TE came along
   with other, more traditional advantages.  The Level3 data network
   currently utilizes significant leased facilities.  These can include
   parallel paths provided by alternate carriers.  Loss of 1/3 or 1/2
   of the capacity on a given span doesn't necessarily change the
   overall nodal path that traffic would take via SPF.  CSPF allows
   only the traffic that can fit in the remaining capacity to remain
   along the SPF path, while other demands are routed along longer,
   less utilized paths.  This is especially useful when the overall
   topology becomes more meshed and multiple alternate paths must be
   utilized effectively.
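
   The effect of a partial capacity loss can be illustrated with a
   small worked example. The sketch below assumes a span made of three
   parallel 622 Mb/s circuits and four LSP demands. All names and
   figures are hypothetical, and demands are simply admitted in the
   order listed.

```python
def split_demands(demands, remaining_capacity):
    """Keep demands on the shortest path while they fit; reroute the rest.

    demands: list of (name, bandwidth) in signalling order.
    Returns (on_shortest_path, rerouted).
    """
    on_spf, rerouted = [], []
    free = remaining_capacity
    for name, bandwidth in demands:
        if bandwidth <= free:
            free -= bandwidth          # reserve bandwidth on the shortest path
            on_spf.append(name)
        else:
            rerouted.append(name)      # CSPF would look for a longer path
    return on_spf, rerouted


# Hypothetical demands in Mb/s, totalling 1500 Mb/s.
DEMANDS = [("lsp-a", 500), ("lsp-b", 400), ("lsp-c", 300), ("lsp-d", 300)]

print(split_demands(DEMANDS, 3 * 622))  # all circuits up: everything fits
print(split_demands(DEMANDS, 2 * 622))  # one circuit lost: lsp-d must move
```

   With plain SPF, all four demands would stay on the span after the
   loss of one circuit and congest it. With bandwidth-aware placement,
   only the demand that no longer fits is moved.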

   LDP is a method by which routers inform others of the label to use to
   forward traffic through them. It is useful in architectures, such as
   our own, that require efficient hop-by-hop routed tunnels, such as
   VPNs, and tunneling between BGP edge routers [LDPA]. LDP allows for
   flexibility as to when it advertises label bindings, and offers
   different strategies for retaining learned labels and for label
   distribution. Level3 intends to use LDP in downstream unsolicited
   mode, with ordered control and liberal label retention.
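
   The three LDP choices named above can be illustrated with a toy
   simulation for a single FEC. This is a simplified sketch and not a
   protocol implementation: the four-router topology, the next-hop
   table and the label values are all hypothetical.

```python
from collections import deque


def distribute_labels(neighbors, next_hop, egress):
    """Simulate LDP label distribution for one FEC.

    neighbors: {lsr: [adjacent LSRs]}
    next_hop:  {lsr: IGP next hop toward the FEC}, egress omitted
    Returns (local, received): the label each LSR advertised, and every
    binding each LSR retained, keyed by the advertising peer.
    """
    local = {}                                  # lsr -> label it advertises
    received = {lsr: {} for lsr in neighbors}   # lsr -> {peer: label}
    next_label = 100                            # arbitrary starting value

    local[egress] = next_label                  # ordered control starts at the egress
    pending = deque([egress])
    while pending:
        lsr = pending.popleft()
        for peer in neighbors[lsr]:
            # Downstream unsolicited: advertise to every peer without a request.
            # Liberal retention: the peer keeps the binding even when
            # lsr is not its next hop for the FEC.
            received[peer][lsr] = local[lsr]
            # Ordered control: a peer binds its own label only after
            # its next hop has advertised one.
            if peer not in local and next_hop.get(peer) == lsr:
                next_label += 1
                local[peer] = next_label
                pending.append(peer)
    return local, received


# Hypothetical square topology; D is the egress for the FEC.
NEIGHBORS = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
NEXT_HOP = {"A": "B", "B": "D", "C": "D"}

local, received = distribute_labels(NEIGHBORS, NEXT_HOP, "D")
print(received["A"])  # A holds labels from both B and C
```

   Router A forwards via B but also retains the label learned from C.
   If its next hop later changes to C, a usable label is already on
   hand, at the cost of storing bindings that are not currently in use.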

   With LDP, adding a router to one gateway requires minimal
   configuration in other gateways. The only configuration needed
   would be in the gateway to which the router was added. Because LSRs
   can discover their LDP peers, configuration of LDP peers would also
   be minimal. As LDP is hop-by-hop, only adjacent peers must be
   configured, which is usually the case for IP configuration anyway.
   The only network wide change would be in the case of additions to the
   top-level IBGP mesh, which can also be minimized by the use of BGP
   route reflection.

   A multiservice network requires the ability to separate the different
   services.  In particular it is important to isolate the traffic in
   terms of routing domains, queuing and forwarding.  The services
   supported by the core network include the Internet platform, voice
   and internal networks.  In order to protect the non-Internet
   services from potential global routing instability, it was decided to
   avoid running BGP in the core of the network to minimize the
   potential CPU impact of such instability.  Queuing is done via MPLS
   COS bits, which are marked at imposition to note which service the
   traffic belongs to.  Firewalls and constrained control information
   ensure that traffic cannot be forwarded from one service to another.
   LDP again allows BGP information to be kept out of the core, while
   maintaining an edge to edge forwarding LSP between BGP ingress and
   egress points in the Internet platform.  Removal of BGP entries from
   the core will also speed updates of forwarding state on topology
   changes.
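
   Marking the COS bits at imposition can be sketched as follows. The
   32-bit layout used here (20-bit label, 3-bit COS/experimental field,
   1-bit bottom-of-stack flag, 8-bit TTL) follows the MPLS label stack
   encoding specification, which this memo does not cite. The
   service-to-COS mapping is purely hypothetical, since the memo gives
   no values.

```python
import struct

# Hypothetical service-to-COS mapping; not taken from the memo.
SERVICE_COS = {"voice": 5, "internal": 3, "internet": 0}


def encode_entry(label, service, bottom_of_stack, ttl=255):
    """Pack one 32-bit MPLS label stack entry, marking COS by service."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= ttl < 2 ** 8:
        raise ValueError("TTL must fit in 8 bits")
    cos = SERVICE_COS[service]
    word = (label << 12) | (cos << 9) | (int(bottom_of_stack) << 8) | ttl
    return struct.pack("!I", word)   # network byte order


def decode_cos(entry):
    """Recover the 3-bit COS field that a core router would queue on."""
    (word,) = struct.unpack("!I", entry)
    return (word >> 9) & 0x7


entry = encode_entry(16, "voice", bottom_of_stack=True, ttl=64)
print(entry.hex())        # 00010b40
print(decode_cos(entry))  # 5
```

   A core router only needs the COS field of the top label to select a
   queue, so it can separate services without inspecting the IP header.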

   It should be noted that LDP must be run over, or in parallel with,
   the RSVP-TE LSPs.  This allows a WAN router at the egress of an
   RSVP-TE LSP to split out traffic from that LSP to multiple adjacent
   MPLS routers within the site and region it serves.  It also requires
   two levels of label stacking to carry Internet traffic across the
   core, or potentially three for VPN traffic.  This is fairly easy to
   configure, as the directed LDP sessions form a mesh of a limited set
   of routers and have parity with the RSVP-TE LSP mesh.
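
   The resulting label stack depths can be shown with a short sketch.
   The function name and label values are hypothetical. The ordering
   assumes the RSVP-TE tunnel label is outermost, the LDP label sits
   beneath it, and a VPN label (if any) is innermost.

```python
def label_stack(service, te_label, ldp_label, vpn_label=None):
    """Return the label stack for a packet crossing the core, outermost first."""
    stack = [("RSVP-TE", te_label), ("LDP", ldp_label)]
    if service == "vpn":
        if vpn_label is None:
            raise ValueError("VPN traffic needs a VPN label")
        stack.append(("VPN", vpn_label))
    return stack


internet = label_stack("internet", te_label=3001, ldp_label=101)
vpn = label_stack("vpn", te_label=3001, ldp_label=101, vpn_label=77)
print(len(internet), internet)  # two levels
print(len(vpn), vpn)            # three levels
```

   Core routers act only on the outermost RSVP-TE label. The WAN
   router at the LSP egress uses the LDP label to pick the adjacent
   MPLS router.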

   Presignalled backup LSPs are used to speed convergence in the
   traffic engineered core. Each TE-LSP will consist of a primary
   working LSP and a protect LSP. The working LSP will usually run
   along the shortest path, while the protect LSP attempts to avoid the
   path in use by the working LSP. If the working LSP fails, traffic
   will switch to the protect LSP.  When the primary path is
   re-established, traffic will be placed back on it. Notification of
   LSP failure is unfettered by LSA generation holddowns.  The
   presignalled protect LSP also avoids having to establish a new LSP
   upon failure based on an out of date topology database.  It is noted
   that while BGP and IP routing are full in the core (due to the
   present lack of LDP availability), the use of RSVP LSPs localizes
   and thus minimizes the black holed traffic that can arise when OSPF
   pulls traffic through a router before it has complete BGP
   information.
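
   The working/protect behaviour described above amounts to a small
   revertive state machine. The sketch below is illustrative only: the
   class, method and node names are hypothetical, and path diversity is
   checked by comparing links, which is an assumption about how
   "avoiding the working path" would be measured.

```python
class ProtectedLsp:
    """A TE-LSP with a presignalled protect path and revertive switching."""

    def __init__(self, working, protect):
        self.paths = {"working": working, "protect": protect}
        self.working_up = True

    @property
    def active(self):
        return "working" if self.working_up else "protect"

    def on_working_failure(self):
        # The protect LSP is already signalled, so no path computation
        # or new signalling is needed at failure time.
        self.working_up = False

    def on_working_restored(self):
        # Revertive behaviour: traffic is placed back on the primary.
        self.working_up = True

    def forwarding_path(self):
        return self.paths[self.active]


def shared_links(path_a, path_b):
    """Links (as unordered node pairs) used by both paths."""
    def links(path):
        return {frozenset(pair) for pair in zip(path, path[1:])}
    return links(path_a) & links(path_b)


lsp = ProtectedLsp(["DEN", "CHI", "NYC"], ["DEN", "DAL", "NYC"])
print(shared_links(lsp.paths["working"], lsp.paths["protect"]))  # set()
lsp.on_working_failure()
print(lsp.active, lsp.forwarding_path())
lsp.on_working_restored()
print(lsp.active, lsp.forwarding_path())
```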


5. Other Architectures

   Other alternatives were considered before making a final protocol
   architecture decision.

   One option was IP over ATM. Historically, ATM provided higher
   bandwidth, higher performance and cost effective interfaces.  It also
   decoupled topology from IP forwarding and thus allowed for the
   development of traffic engineering on IP networks. However, because
   IP and ATM are two different technologies, managing the combined
   network becomes somewhat complex. Another limitation of this model
   is the OC-12 edge: the fastest ATM router interfaces commonly
   available run at OC-12 speeds. Considering the migration to OC-48
   and OC-192 WAN speeds, this limitation becomes increasingly
   important.  Besides the complexity of mesh management, POS appeared
   to have a stronger and more competitive future.

   Our current design of an IP edge with RSVP in the core will not be
   adequate, because it lacks support for edge to edge MPLS. Given this
   requirement, LDP is still necessary.  Another common approach is to
   run RSVP within a region and deaggregate the IP traffic to an
   inter-region router, which is connected to an inter-region RSVP mesh
   [FRO].  This also does not allow for edge to edge MPLS.

   RSVP edge to edge was not used based on scalability issues on the
   edge and in the core.  Taking an example of 30 gateways, each with 8
   edge devices, adding a router would mean configuration of 240
   devices.  Each would also have to have 239 RSVP sessions configured
   and maintained.  In the core, on the order of 25,000 RSVP sessions
   would have to be manageable in transit on some network elements.
   This did not appear scalable from a protocol or manageability
   perspective.
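
   The edge figures above follow from simple full-mesh arithmetic,
   worked through below. The core comparison assumes one WAN router
   per gateway, which the memo does not state. The figure of 25,000
   transit sessions is the authors' estimate and is not derived here.

```python
def full_mesh(routers):
    """Return (sessions per router, total unidirectional LSPs) for a full mesh."""
    per_router = routers - 1
    return per_router, routers * per_router


gateways = 30
edge_per_gateway = 8
edge_routers = gateways * edge_per_gateway

print(edge_routers)             # 240 edge devices
print(full_mesh(edge_routers))  # (239, 57360) for RSVP edge to edge
print(full_mesh(gateways))      # (29, 870) for an RSVP-TE core mesh only
```

   Keeping RSVP-TE in the core reduces the mesh from tens of thousands
   of LSPs to under a thousand, which is the scaling argument made
   above.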


   Another approach would involve a loosely routed RSVP edge which gets
   further encapsulated into an RSVP core [HRSVP]. This reduces the core
   scalability issues mentioned above, but retains the edge
   manageability and scalability concerns.  At the time of evaluation
   (mid 1999), ways of doing RSVP in RSVP for LSP encapsulation were
   only just beginning to be considered.

   CR-LDP was not seriously considered as it too was just being
   developed within standards and implementations.  Also, it was not
   planned for support by key vendors.  As RSVP was already widely
   deployed in at least 2 major ISPs, it was expected to be the more
   battle hardened implementation, too.


6. Systems Issues

   As of June 2000, Level3's WAN configuration allows the RSVP-TE core
   to be configured in a non-constrained manner.  Growth trends are
   expected to make this no longer true by October.  Systems are being
   developed to monitor LSP utilization and feed configurations back
   into the network.  These systems are likely to resemble systems in
   place at other large ISPs. They are not trivial, and lend credence
   to the view that TE is not worth pursuing unless absolutely
   necessary.  Besides programming complexity, reconfiguring a network
   on potentially a weekly basis could also introduce service affecting
   outages.  It is recommended that these types of reconfigurations
   only be undertaken when they are needed frequently enough to become
   operationally routine.  That said, once such a system is in place,
   the need for metric and other routing firedrills, or tactical use of
   limited numbers of TE-LSPs, should hopefully become less common, or
   perhaps even something for historical perspective.
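
   One piece of such a system, turning measured LSP utilization into a
   bandwidth value to configure, might look like the sketch below. This
   is not a description of Level3's system. The percentile, headroom
   and rounding step are hypothetical policy choices, and the samples
   are invented.

```python
import math


def recommend_bandwidth(samples_mbps, percentile=90, headroom=1.2, step=10):
    """Suggest a reservation from utilization samples.

    Takes the nearest-rank percentile of the samples, adds headroom,
    and rounds up to a multiple of `step` Mb/s.
    """
    if not samples_mbps:
        raise ValueError("no utilization samples")
    ordered = sorted(samples_mbps)
    # Nearest-rank percentile, using integer arithmetic for the rank.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    busy_level = ordered[rank - 1]
    return math.ceil(busy_level * headroom / step) * step


# Hypothetical samples for one LSP, including one short spike.
SAMPLES = [100, 120, 110, 400, 130, 125, 115, 105, 135, 140]
print(recommend_bandwidth(SAMPLES))  # 170
```

   Using a percentile rather than the maximum keeps a single spike
   from inflating the reservation. Whether that trade-off is acceptable
   is a policy decision for the operator.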

   In response to warnings against extensive use of MPLS TE, it can be
   noted that relying on tactical techniques to avoid congestion can
   also be potentially service impacting. Such tactical maneuvers
   involve complex analysis that could perhaps be done better,
   routinely, by computers.  Tactical fixes are also frequently left in
   place and forgotten, and sometimes are remembered only when a network
   is not behaving as expected.


7. Conclusion

   Level3 has developed a unique MPLS protocol architecture that
   provides a scalable edge to edge MPLS platform, which allows for
   both customer and core-service VPNs.  RSVP-TE is used in the core to
   provide traditional traffic engineering advantages such as avoiding
   congestion during failure scenarios (and even during normal
   operation with overdue capacity augments).  It also provides for fast
   network reconvergence.  IP is currently used, and LDP will be used as
   available, to bring in traffic from different services within the
   gateway as well as from non traffic-engineered regions.  The RSVP+LDP
   architecture can be portrayed as complex in comparison to an IP only
   or IP + RSVP architecture. However, it can also be represented as
   more cost effective and easier to maintain than multiple service
   specific networks.


References

   [TE] D. Awduche, A. Chiu, A. Elwalid, I. Widjaja, X. Xiao, "A
   Framework for Internet Traffic Engineering", draft-ietf-tewg-
   framework-02.txt, work in progress, May 2000.

   [L2TUN] L. Martini, et al., "Layer 2 Tunneling using MPLS", draft-
   martini-l2circuit-trans-mpls-02.txt, work in progress, June 2000.

   [LDPA] B. Thomas, E. Gray, "LDP Applicability", draft-ietf-mpls-ldp-
   applic-02.txt, work in progress, August 2000.

   [FRO] "Traffic Engineering with MPLS in the Internet", IEEE Network
   Magazine, March/April 2000.

   [HRSVP] "LSP Hierarchy with MPLS TE", draft-ietf-mpls-lsp-hierarchy-
   00.txt, work in progress, July 2000.


8. Author Information


   Vanessa Springer
   Level 3 Communications, LLC.
   1025 Eldorado Blvd.
   Broomfield, CO 80021
   e-mail: vanessa.springer@level3.com


   Craig Pierantozzi
   Level 3 Communications, LLC.
   1025 Eldorado Blvd.
   Broomfield, CO 80021
   e-mail: tozz@level3.net


   Jim Boyle
   Level 3 Communications, LLC.
   1025 Eldorado Blvd.
   Broomfield, CO 80021
   e-mail: jboyle@level3.net

