Internet Engineering Task Force
INTERNET-DRAFT
TE Working Group
                                          Daniel O. Awduche
January 2000                              UUNET (MCI Worldcom)

                                          Angela Chiu
                                          AT&T

                                          Anwar Elwalid
                                          Lucent Technologies

                                          Indra Widjaja
                                          Fujitsu Network Communications

                                          Xipeng Xiao
                                          Global Crossing


              A Framework for Internet Traffic Engineering

                    draft-ietf-tewg-framework-00.txt


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   To view the list of Internet-Draft Shadow Directories, see
   http://www.ietf.org/shadow.html.


Awduche/Chiu/Elwalid/Widjaja/Xiao                               [Page 1]


Internet Draft      draft-ietf-tewg-framework-00.txt   Expires July 2000


Abstract

   This memo describes a framework for Traffic Engineering (TE) in the
   Internet.  The framework is intended to promote better understanding
   of the issues surrounding traffic engineering in IP networks, and to
   provide a common basis for the development of traffic engineering
   capabilities for the Internet.  The framework explores the
   principles, architectures, and methodologies for performance
   evaluation and performance optimization of operational IP networks.
   The optimization goals of traffic engineering seek to enhance the
   performance of IP traffic while utilizing network resources
   economically, efficiently, and reliably. The framework includes a set
   of generic requirements, recommendations, and options for Internet
   traffic engineering.  The framework can serve as a guide to
   implementors of online and off-line Internet traffic engineering
   mechanisms, tools, and support systems. The framework can also help
   service providers in devising traffic engineering solutions for their
   networks.


Table of Contents

    1.0 Introduction
       1.1 What is Internet Traffic Engineering?
       1.2 Scope
       1.3 Terminology
    2.0 Background
       2.1 Context of Internet Traffic Engineering
       2.2 Network Context
       2.3 Problem Context
          2.3.1 Congestion and its Ramifications
       2.4 Solution Context
          2.4.1 Combating the Congestion Problem
       2.5 Implementation and Operational Context
    3.0 Traffic Engineering Process Model
       3.1 Components of the Traffic Engineering Process Model
       3.2 Measurement
       3.3 Modeling and Analysis
       3.4 Optimization
    4.0 Historical Review and Recent Developments
       4.1 Traffic Engineering in Classical Telephone Networks
       4.2 Evolution of Traffic Engineering in Packet Networks
          4.2.1 Adaptive Routing in ARPANET
          4.2.2 Dynamic Routing in the Internet
          4.2.3 ToS Routing
          4.2.4 Equal Cost MultiPath
       4.3 Overlay Model
       4.4 Constraint-Based Routing
       4.5 Overview of Recent IETF Projects Related to Traffic
           Engineering
          4.5.1 Integrated Services
          4.5.2 RSVP
          4.5.3 Differentiated Services
          4.5.4 MPLS
          4.5.5 IP Performance Metrics
          4.5.6 Flow Measurement
          4.5.7 Endpoint Congestion Management
       4.6 Overview of ITU Activities Related to Traffic Engineering
    5.0 Taxonomy of Traffic Engineering Systems
       5.1 Time-Dependent Versus State-Dependent
       5.2 Offline Versus Online
       5.3 Centralized Versus Distributed
       5.4 Local Versus Global
       5.5 Prescriptive Versus Descriptive
       5.6 Open-Loop Versus Closed-Loop
    6.0 Requirements for Internet Traffic Engineering
       6.1 Generic Requirements
       6.2 Routing Requirements
       6.3 Traffic Mapping Requirements
       6.4 Measurement Requirements
       6.5 Network Survivability
          6.5.1 Survivability in MPLS Based Networks
       6.6 Content Distribution (Webserver) Requirements
       6.7 Off-line Traffic Engineering Support Systems
       6.8 Traffic Engineering in Diffserv Environments
    7.0 Multicast Considerations
    8.0 Inter-Domain Considerations
    9.0 Conclusion
    10.0 Security Considerations
    11.0 Acknowledgments
    12.0 References
    13.0 Authors' Addresses


1.0 Introduction


   This memo describes a framework for Internet traffic engineering.
   The intent is to articulate the general issues, principles and
   requirements for Internet traffic engineering; and where appropriate
   to provide recommendations, guidelines, and options for the
   development of online and off-line Internet traffic engineering
   capabilities and support systems.

   The framework can assist vendors of networking hardware and software
   in developing mechanisms and support systems for the Internet
   environment that support the traffic engineering function. The
   framework can also help service providers in devising and
   implementing traffic engineering solutions for their networks.

   The framework provides a terminology for describing and understanding
   common Internet traffic engineering concepts.  The framework also
   provides a taxonomy of known traffic engineering styles.  In this
   context, a traffic engineering style abstracts important aspects from
   a traffic engineering methodology. Traffic engineering styles can be
   viewed in different ways depending upon the specific context in which
   they are used and the specific purpose which they serve. The
   combination of styles and views results in a natural taxonomy of
   traffic engineering systems.

   Although Internet traffic engineering is most effective when applied
   end-to-end, the initial focus of this framework document is intra-
   domain traffic engineering (that is, traffic engineering within a
   given autonomous system). However, in consideration of the fact that
   a preponderance of Internet traffic tends to be inter-domain (that
   is, it originates in one autonomous system and terminates in
   another), this document provides an overview of some of the aspects
   that pertain to inter-domain traffic engineering.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

   This draft is preliminary and will be reviewed and revised over time.


1.1 What is Internet Traffic Engineering?


   Internet traffic engineering is defined as that aspect of Internet
   network engineering that deals with the issue of performance
   evaluation and performance optimization of operational IP networks.
   Traffic Engineering encompasses the application of technology and
   scientific principles to the measurement, characterization, modeling,
   and control of Internet traffic [AWD1, AWD2].

   A major objective of Internet traffic engineering is to enhance the
   performance of an operational network at both the traffic and
   resource levels. This is accomplished by addressing traffic oriented
   performance requirements, while utilizing network resources
   efficiently, reliably, and economically. Traffic oriented performance
   measures include delay, delay variation, packet loss, and goodput.

   It is worthwhile to emphasize that an important objective of Internet
   traffic engineering is to facilitate reliable network operations
   [AWD1]. Reliable network operations can be facilitated by providing
   mechanisms that enhance network integrity and by embracing policies
   that emphasize network survivability, so that the vulnerability of
   the network to service outages arising from errors, faults, and
   failures that occur within the infrastructure can be minimized.

   It is also important to be cognizant of the fact that ultimately,
   what really matters is the performance of the network as seen by end
   users of network services. This crucial aspect should be kept in view
   when developing traffic engineering mechanisms and policies. The
   characteristics that are visible to end users are the emergent
   properties of the network. Emergent properties are the
   characteristics of the network viewed as a whole.

   A significant, but subtle, practical advantage of applying traffic
   engineering concepts to operational networks is that it helps to
   identify and structure goals and priorities in terms of enhancing the
   quality of service delivered to end-users of network services, and in
   terms of measuring and analyzing the achievement of these goals.

   The optimization aspects of traffic engineering can be achieved
   through capacity management and traffic management. As used in this
   document, capacity management includes capacity planning, routing
   control, and resource management. Network resources of particular
   interest include link bandwidth, buffer space, and computational
   resources. Likewise, as used in this document, traffic management
   includes traffic conditioning, scheduling, and other functions that
   regulate traffic flow through the network or that arbitrate access to
   network resources between different packets.

   The optimization objectives of Internet traffic engineering should be
   viewed as a continual and iterative process of network performance
   improvement, rather than as a one time goal. Traffic engineering also
   demands continual development of new technologies and new
   methodologies for network performance enhancement.  The optimization
   objectives of Internet traffic engineering may change over time as
   new requirements are imposed, or as new technologies emerge, or as
   new insights are brought to bear on the underlying problems.
   Moreover, different networks may have different optimization
   objectives, depending upon their business models, capabilities, and
   operating constraints. Regardless of the specific optimization goals
   that prevail in any particular environment, for practical purposes,
   the optimization aspects of traffic engineering are ultimately
   concerned with network control.

   Thus, the optimization aspects of traffic engineering can be viewed
   from a control perspective. The control dimension of Internet traffic
   engineering can be pro-active and/or reactive. In the reactive case,
   the control system responds to events that have already transpired in
   the network. In the pro-active case, the control system takes
   preventive action to obviate predicted unfavorable future network
   states, or takes perfective action to induce a more desirable state
   in the future. The control dimension of Internet traffic engineering
   responds at multiple levels of temporal resolution to network events.
   Some aspects of capacity management such as capacity planning
   functions respond at a very coarse temporal level, ranging from days
   to possibly years. The routing control functions operate at
   intermediate levels of temporal resolution, ranging from milliseconds
   to days.  Finally, the packet level processing functions (e.g. rate
   shaping, queue management, and scheduling) operate at very fine
   levels of temporal resolution, responding to the real-time
   statistical characteristics of traffic, ranging from picoseconds to
   milliseconds. The subsystems of Internet traffic engineering control
   include: capacity augmentation, routing control, traffic control, and
   resource control (including control of service policies at network
   elements).  Inputs into the control system include network state
   variables, policy variables, and decision variables.

   For practical purposes, traffic engineering concepts and mechanisms
   must be sufficiently specific and well defined to address known
   requirements, but at the same time must be flexible and extensible to
   accommodate unforeseen future demands.

   A major challenge in Internet traffic engineering is the realization
   of automated control capabilities that adapt quickly and at
   reasonable cost to significant changes in network state, while
   maintaining stability.


1.2 Scope


   The scope of this document is intra-domain traffic engineering; that
   is, traffic engineering within a given autonomous system in the
   Internet. The framework will discuss concepts pertaining to intra-
   domain traffic control, including such issues as routing control,
   micro and macro resource allocation, and the control coordination
   problems that arise consequently.

   This document will describe and characterize techniques already in
   use or in advanced development for Internet traffic engineering,
   indicate how they fit together, and identify scenarios in which they
   are useful.

   Although the emphasis is on intra-domain traffic engineering,
   Section 8.0 provides an overview of the high level considerations
   pertaining to inter-domain traffic engineering.
   Inter-domain Internet traffic engineering is crucial to the
   performance enhancement of the global Internet infrastructure.

   Whenever possible, relevant requirements from existing IETF documents
   and other sources will be incorporated by reference.


1.3 Terminology


   This subsection provides terminology which is useful for Internet
   traffic engineering. The definitions presented apply to this
   framework document. These terms may have other meanings elsewhere.

     - Baseline analysis:
          A study conducted to serve as a baseline for comparison to the
          actual behavior of the network.

     - Busy hour:
          A one hour period within a specified interval of time
          (typically 24 hours) in which the traffic load in a
          network or subnetwork is greatest.

     - Congestion:
          A state of a network resource in which the traffic incident
          on the resource exceeds its output capacity over an interval
          of time.

     - Congestion avoidance:
          An approach to congestion management that attempts to obviate
          the occurrence of congestion.

     - Congestion control:
          An approach to congestion management that attempts to remedy
          congestion problems that have already occurred.

     - Constraint-based routing:
          A class of routing protocols that take specified traffic
          attributes, network constraints, and policy constraints into
          account in making routing decisions. Constraint-based routing
          is applicable to traffic aggregates as well as flows. It is a
          generalization of QoS routing.

     - Demand side congestion management:
          A congestion management scheme that addresses congestion
          problems by regulating or conditioning offered load.

     - Effective bandwidth:
          The minimum amount of bandwidth that can be assigned to a flow
          or traffic aggregate in order to deliver 'acceptable service
          quality' to the flow or traffic aggregate.

     - Egress traffic:
          Traffic exiting a network or network element.

     - Ingress traffic:
          Traffic entering a network or network element.

     - Inter-domain traffic:
          Traffic that originates in one autonomous system and
          terminates in another.

     - Loss network:
          A network that does not provide adequate buffering for
          traffic, so that traffic entering a busy resource within
          the network will be dropped rather than queued.

     - Network Survivability:
          The capability to provide a prescribed level of QoS for
          existing services after a given number of failures occur
          within the network.

     - Off-line traffic engineering:
          A traffic engineering system that exists outside of the
          network.

     - Online traffic engineering:
          A traffic engineering system that exists within the network,
          typically implemented on or as adjuncts to operational network
          elements.

     - Performance measures:
          Metrics that provide quantitative or qualitative measures of
          the performance of systems or subsystems of interest.

     - Performance management:
          A systematic approach to improving effectiveness in the
          accomplishment of specific networking goals related to
          performance improvement.

     - Provisioning:
          The process of assigning or configuring network resources to
          meet certain requests.

     - QoS routing:
          Class of routing systems that selects paths to be used by a
          flow based on the QoS requirements of the flow.

     - Service Level Agreement:
          A contract between a provider and a customer that guarantees
          specific levels of performance and reliability at a certain
          cost.

     - Stability:
          An operational state in which a network does not oscillate
          in a disruptive manner from one mode to another mode.

     - Supply side congestion management:
          A congestion management scheme that provisions additional
          network resources to address existing and/or anticipated
          congestion problems.

     - Transit traffic:
          Traffic whose origin and destination are both outside of
          the network under consideration.

     - Traffic characteristic:
          A description of the temporal behavior or a description of the
          attributes of a given traffic flow or traffic aggregate.

     - Traffic engineering system:
          A collection of objects, mechanisms, and protocols that are
          used conjunctively to accomplish traffic engineering
          objectives.

     - Traffic flow:
          A stream of packets between two end-points that can be
          characterized in a certain way. A micro-flow has a more
          specific definition: A micro-flow is a stream of packets with
          a bounded inter-arrival time and with the same source and
          destination addresses, source and destination ports, and
          protocol ID.

     - Traffic intensity:
          A measure of traffic loading with respect to a resource
          capacity over a specified period of time. In classical
          telephony systems, traffic intensity is measured in units of
          Erlang.

     - Traffic matrix:
          A representation of the traffic demand between a set of origin
          and destination abstract nodes. An abstract node can consist
          of one or more network elements.

     - Traffic monitoring:
          The process of observing traffic characteristics at a given
          point in a network and collecting the traffic information for
          analysis and further action.

     - Traffic trunk:
          An aggregation of traffic flows belonging to the same class
          which are forwarded through a common path. A traffic trunk
          may be characterized by an ingress and egress node, and a
          set of attributes which determine its behavioral
          characteristics and requirements from the network.
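
   As a minimal sketch of two of the terms above, the following Python
   fragment picks the busy hour from 24 hourly load samples and computes
   traffic intensity in Erlangs as the arrival rate multiplied by the
   mean holding time. The function names and sample values are
   hypothetical and are not part of this framework.

```python
from typing import Sequence


def busy_hour(hourly_load: Sequence[float]) -> int:
    """Return the index of the hour with the greatest traffic load."""
    if not hourly_load:
        raise ValueError("hourly_load must not be empty")
    return max(range(len(hourly_load)), key=lambda h: hourly_load[h])


def traffic_intensity_erlangs(arrivals_per_hour: float,
                              mean_holding_time_hours: float) -> float:
    """Offered load in Erlangs: arrival rate times mean holding time."""
    return arrivals_per_hour * mean_holding_time_hours


# Hypothetical 24-hour load profile, one sample per hour (arbitrary units).
load = [10, 8, 6, 5, 5, 7, 12, 20, 35, 42, 48, 50,
        47, 49, 55, 53, 46, 40, 38, 36, 30, 22, 16, 12]
```

   With the sample profile, busy_hour(load) selects hour 14, the hour
   with the greatest load.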


2.0 Background


   The Internet has quickly evolved into a very critical communications
   infrastructure, supporting significant economic, educational, and
   social activities. At the same time, the delivery of Internet
   communications services has become a very competitive endeavor.
   Consequently, optimizing the performance of large scale IP networks,
   especially public Internet backbones, has become an important
   problem.  Network performance requirements are multidimensional,
   complex, and sometimes contradictory; thereby making the traffic
   engineering problem very challenging.

   The network must convey IP packets from ingress nodes to egress nodes
   efficiently, expeditiously, reliably, and economically. Furthermore,
   in a multiclass service environment (e.g. Diffserv capable networks),
   the resource sharing parameters of the network must be appropriately
   determined and configured according to prevailing policies and
   service models to resolve resource contention issues arising from
   mutual interference between packets traversing through the network.
   Moreover, in multi-class environments, consideration must be given to
   resolving competition for network resources between traffic streams
   belonging to the same service class (intra-class contention
   resolution) and between traffic streams belonging to different
   classes (inter-class contention resolution).


2.1 Context of Internet Traffic Engineering


   The context of Internet traffic engineering pertains to the scenarios
   in which the problems that traffic engineering attempts to solve
   manifest. A traffic engineering methodology establishes appropriate
   rules to solve traffic performance problems that occur in a specific
   context. The context of Internet traffic engineering includes:

    (1) A network context which defines the situations in which the
        TE problems occur. The network context encompasses network
        structure, network policies, network characteristics, network
        constraints, network quality attributes, network optimization
        criteria, etc.

    (2) A problem context which defines the general and concrete
        issues that TE addresses. The problem context encompasses
        identification, abstraction of relevant features,
        representation, formulation, requirements and desirable
        features of solutions, etc.

    (3) A solution context which suggests how to solve the TE
        problems. The solution context encompasses analysis, evaluation
        of alternatives, prescription, and resolution.

    (4) An implementation and operational context where the solutions
        are methodologically instantiated. The implementation and
        operational context encompasses planning, organization, and
        execution.

   In the following subsections, we elaborate on the context of Internet
   traffic engineering.


2.2 Network Context


   IP networks range in size from small clusters of routers situated
   within a given location, to thousands of interconnected routers and
   switches distributed all over the world.

   At the most basic level of abstraction, an IP network can be
   represented as: (1) a constrained system consisting of a set of
   interconnected resources which provide transport services for IP
   traffic, (2) a demand system representing the offered load to be
   transported through the network, and (3) a response system consisting
   of network processes, protocols, and related mechanisms which
   facilitate the movement of traffic through the network [see also
   AWD2].
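
   The three-part abstraction above can be sketched as a small Python
   model. This is an illustration under stated assumptions, not a
   definition from this framework: the class and field names are
   hypothetical, the demand system is a mapping from (origin,
   destination) pairs to offered load, and the response system is
   reduced to one fixed path per pair.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Link = Tuple[str, str]   # directed link (from_node, to_node)
Pair = Tuple[str, str]   # (origin, destination)


@dataclass
class Network:
    capacity: Dict[Link, float]     # constrained system: link capacities
    demand: Dict[Pair, float]       # demand system: offered load per pair
    paths: Dict[Pair, List[Link]]   # response system: path used per pair

    def link_loads(self) -> Dict[Link, float]:
        """Sum the offered load that the paths place on each link."""
        loads = {link: 0.0 for link in self.capacity}
        for pair, volume in self.demand.items():
            for link in self.paths[pair]:
                loads[link] += volume
        return loads

    def utilization(self) -> Dict[Link, float]:
        """Load divided by capacity for every link."""
        loads = self.link_loads()
        return {link: loads[link] / self.capacity[link]
                for link in self.capacity}


# Hypothetical three-node example (units are arbitrary).
net = Network(
    capacity={("A", "B"): 100.0, ("B", "C"): 100.0, ("A", "C"): 50.0},
    demand={("A", "C"): 60.0, ("A", "B"): 30.0},
    paths={("A", "C"): [("A", "B"), ("B", "C")],
           ("A", "B"): [("A", "B")]},
)
```

   With these hypothetical values, link A-B carries 90 units against a
   capacity of 100, so its utilization is 0.9, while the direct link
   A-C is unused.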

   The network elements and resources may have specific characteristics
   which restrict the way in which they handle the demand. Additionally,
   network resources may be equipped with traffic control mechanisms
   which allow the way in which they handle the demand to be regulated.
   Traffic control mechanisms may also be used to control various packet
   processing activities within the resource, or to arbitrate contention
   for access to the resource by different packets, or to regulate
   traffic behavior through the resource. A configuration management and
   provisioning system may allow the settings of the traffic control
   mechanisms to be manipulated by external or internal entities in
   order to constrain or to exercise control over the way in which the
   network element responds to internal and external stimuli.

   The details of how the network provides transport services for
   packets are specified in the policies of the network administrators
   and are installed through network configuration management and
   provisioning systems.  Generally, the types of services provided by
   the network also depend upon the technology and characteristics of
   the network elements, the prevailing policies, as well as the ability
   of the network administrators to translate policies into network
   configurations.

   There are three significant characteristics of contemporary Internet
   networks: (1) they provide real-time services, (2) they have become
   mission critical, and (3) their operating environments are very
   dynamic. The dynamic characteristics of IP networks can be attributed
   in part to fluctuations in demand, to the interaction between various
   network protocols and processes, to the rapid evolution of the
   infrastructure which demands constant insertion of new technologies
   and new network elements, and to transient and persistent impairments
   which occur within the system.

   The most significant function performed by an IP network is the
   routing of packets from source nodes to destination nodes. Not
   surprisingly, one of the most significant functions performed by
   Internet traffic engineering is the control and optimization of the
   routing function, so as to steer packets in the most effective way
   through the network. As packets are conveyed through the network,
   they contend for the use of network resources. If the arrival rate of
   packets exceeds the output capacity of a network resource over an
   interval of time, the resource is said to be congested, and some of
   the arriving packets may be dropped as a result. Congestion also
   increases transit delays and delay variation, and reduces the
   predictability of network service delivery. Thus, congestion is a
   highly undesirable phenomenon. Combating congestion at reasonable
   cost is a major objective of Internet traffic engineering.
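
   The congestion condition described above can be sketched as a simple
   check. The fragment below is illustrative only: the function name,
   the fixed measurement interval, and the sample values are
   assumptions, and a resource is flagged whenever its mean arrival
   rate over an interval exceeds its output capacity.

```python
from typing import List, Sequence


def congested_intervals(arrived_bits: Sequence[float],
                        capacity_bps: float,
                        interval_s: float) -> List[int]:
    """Return indices of intervals whose mean arrival rate exceeds capacity."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return [i for i, bits in enumerate(arrived_bits)
            if bits / interval_s > capacity_bps]


# Hypothetical samples: bits arriving in consecutive 1-second intervals
# at a resource with an output capacity of 1000 bits per second.
samples = [800.0, 950.0, 1200.0, 1500.0, 700.0]
```

   For these samples, congested_intervals(samples, 1000.0, 1.0) flags
   intervals 2 and 3.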

   A basic economic premise for packet switched networks in general and
   the Internet in particular is the efficient sharing of network
   resources by multiple traffic streams. One of the fundamental
   challenges in operating a network, especially large scale public IP
   networks, is the need to increase the efficiency of resource
   utilization while minimizing the possibility of congestion.

   Increasingly, the Internet will have to function in the presence of
   different classes of traffic, especially with the advent of
   differentiated services. In practice, a particular set of packets may
   have specific delivery requirements which may be specified explicitly
   or implicitly. Two of the most important traffic delivery
   requirements are (1) capacity constraints which can be expressed
   statistically as peak rates, mean rates, burst sizes, or as some
   deterministic notion of effective bandwidth, and (2) QoS constraints
   which can be expressed in terms of integrity constraints (e.g. packet
   loss) and temporal constraints (e.g. timing restrictions for the
   delivery of each packet and for the delivery of consecutive packets
   belonging to the same traffic stream). Packets may
   also be grouped into classes, in such a way that each class may have
   a common set of behavioral characteristics and/or a common set of
   delivery requirements.
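
   One common way to express a mean rate together with a burst size is
   a token bucket. This memo does not prescribe that mechanism; the
   sketch below is offered only to make the capacity constraints above
   concrete. The function name, parameter names, and sample values are
   hypothetical.

```python
from typing import Iterable, List, Tuple


def token_bucket_conformance(packets: Iterable[Tuple[float, float]],
                             rate_bps: float,
                             burst_bits: float) -> List[bool]:
    """Mark each (arrival_time_s, size_bits) packet as conforming or not.

    Packets must be supplied in non-decreasing order of arrival time.
    """
    tokens = burst_bits   # the bucket starts full
    last_time = None
    verdicts = []
    for arrival_s, size_bits in packets:
        if last_time is not None:
            # Refill for the elapsed time, capped at the bucket depth.
            tokens = min(burst_bits,
                         tokens + (arrival_s - last_time) * rate_bps)
        last_time = arrival_s
        if size_bits <= tokens:
            tokens -= size_bits       # conforming: consume tokens
            verdicts.append(True)
        else:
            verdicts.append(False)    # non-conforming: tokens untouched
    return verdicts


# Hypothetical arrivals: three 1000-bit packets at 0.0 s, 0.1 s, 1.0 s.
arrivals = [(0.0, 1000.0), (0.1, 1000.0), (1.0, 1000.0)]
```

   With a mean rate of 1000 bits per second and a burst size of 1500
   bits, the first and third packets conform and the second does not,
   because the bucket has not refilled sufficiently after 0.1 seconds.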


2.3 Problem Context


   There are a number of fundamental problems associated with the
   operation of a network described by the simple model of the previous
   subsection. The present subsection reviews the problem context with
   regard to the traffic engineering function.

   One problem concerns how to identify, abstract, represent, and
   measure the features of the network that are relevant for
   traffic engineering.

   One particularly important class of problems concerns how to
   explicitly formulate the problems that TE attempts to solve, how to
   identify the requirements on the solution space, how to specify the
   desirable features of good solutions, and how to measure and
   characterize the effectiveness of the solutions.

   Another problem concerns how to measure and estimate relevant network
   state parameters. Effective traffic engineering relies on a good
   estimate of the offered traffic load as well as a view of the
   underlying topology and associated resource constraints.  A network-
   wide view of the topology is also a must for off-line planning.

   Still another problem concerns how to characterize the state of the
   network and how to evaluate its performance under a variety of
   scenarios. There are two aspects to the performance analysis problem.
   One aspect relates to the evaluation of the system level performance
   of the network. The other aspect relates to the evaluation of the
   resource level performance, which restricts attention to the
   performance evaluation of individual network resources. In this memo,
   we shall refer to the system level characteristics of the network as
   the "macro-states" and the resource level characteristics as the
   "micro-states." Likewise, we shall refer to the traffic engineering
   schemes that deal with network performance optimization at the
   systems level as macro-TE and the schemes that optimize at the
   individual resource level as micro-TE. In general, depending upon the
   particular performance measures of interest, the system level
   performance can be derived from the resource level performance
   results using appropriate rules of composition.
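
   As one hedged illustration of rules of composition, the sketch below
   derives path level figures (a step toward system level figures) from
   per-resource figures. It relies on assumptions that this memo does
   not state: delays are additive along a path, losses at successive
   resources are independent, and the available bandwidth of a path is
   that of its most constrained resource. All names and values are
   hypothetical.

```python
from dataclasses import dataclass
from math import prod
from typing import Sequence


@dataclass
class ResourceState:
    delay_ms: float        # mean delay at this resource
    loss_prob: float       # packet loss probability at this resource
    available_bps: float   # available bandwidth at this resource


def path_delay_ms(hops: Sequence[ResourceState]) -> float:
    """Additive rule: per-resource delays sum along the path."""
    return sum(h.delay_ms for h in hops)


def path_loss_prob(hops: Sequence[ResourceState]) -> float:
    """Multiplicative rule, assuming independent losses per resource."""
    return 1.0 - prod(1.0 - h.loss_prob for h in hops)


def path_available_bps(hops: Sequence[ResourceState]) -> float:
    """Bottleneck rule: the most constrained resource limits the path."""
    return min(h.available_bps for h in hops)


# Hypothetical two-hop path.
hops = [ResourceState(2.0, 0.01, 100e6),
        ResourceState(5.0, 0.02, 40e6)]
```

   For this two-hop path the composed delay is 7 ms, the bottleneck
   bandwidth is 40 Mb/s, and the composed loss probability is
   1 - (0.99 x 0.98), or about 0.0298.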

   Yet another fundamental problem concerns how to optimize the
   performance of the network. Performance optimization may entail some
   degree of resource management control, routing control, and/or
   capacity augmentation.

   As noted previously, congestion is an undesirable phenomenon in
   operational networks. Therefore, we devote the next subsection to the
   issue of congestion and its ramifications within the problem context
   of Internet traffic engineering.


2.3.1 Congestion and its Ramifications


   Congestion is one of the most significant problems in an operational
   IP context. A network element is said to be congested if it
   experiences sustained overload over an interval of time. Almost
   invariably, congestion results in degradation of service quality to
   end users. Congestion control schemes can include demand side
   policies and supply side policies. Demand side policies may restrict
   access to congested resources and/or dynamically regulate the demand
   to alleviate the overload situation. Supply side policies may re-
   allocate network resources by redistributing traffic over the
   infrastructure and/or expand or augment network capacity. In this
   memo, the emphasis is mainly on congestion management schemes that
   fall within the scope of the network, rather than congestion
   management systems that depend on sensitivity and adaptivity from
   end-systems. That is, the focus of this memo with respect to
   congestion management is on those solutions that can be provided by
   control entities operating on the network or by the actions of
   network administrators.


2.4 Solution Context


   The solution context for Internet traffic engineering involves
   analysis, evaluation of alternatives, and choice between alternative
   courses of action.  Generally the solution context is predicated on
   making reasonable inferences about the current or future state of the
   network and possibly making appropriate decisions that may involve a
   preference between alternative sets of action. More specifically, the
   solution context demands good estimates of traffic workload,
   characterization of network state, and possibly a set of control
   actions. Control actions may involve manipulation of parameters
   associated with the routing function, control over tactical capacity
   acquisition, and control over the traffic management functions.

   The following is a subset of the instruments that may be applicable
   to the solution context of Internet TE.

   (1) A set of policies, objectives, and requirements (which may be
       context dependent) for network performance evaluation and
       performance optimization.

   (2) A collection of online and possibly off-line tools and mechanisms
       for the measurement, characterization, modeling, and control of
       Internet traffic, for control over the placement and allocation
       of network resources, and for control over the mapping or
       distribution of traffic onto the infrastructure.

   (3) A set of constraints on the operating environment, the network
       protocols, and the traffic engineering system itself.

   (4) A set of administrative control parameters which may be
       manipulated through a Configuration Management (CM) system. The
       CM system itself may include a configuration control subsystem,
       a configuration repository, a configuration accounting
       subsystem, and a configuration auditing subsystem.

   Derivation of traffic characteristics through measurement and/or
   estimation is very useful within the realm of the solution space for
   traffic engineering. Traffic estimates can be derived from customer
   subscription information, from traffic projections, from traffic
   models, or from actual empirical measurements. In order to measure
   and derive traffic matrices at various levels of detail, the
   measurement may be performed at the flow level or at the traffic
   aggregate level.  Measurements at the flow level or on small traffic
   aggregates may be performed at edge nodes, where traffic enters and
   leaves the network [FGLR].
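
   The derivation of a traffic matrix from flow level measurements at
   edge nodes can be sketched as follows. This is an illustrative
   example only: the record layout (ingress node, egress node, byte
   count) and the function name build_traffic_matrix are assumptions
   made here, not a format prescribed by this memo or by [FGLR].

```python
from collections import defaultdict


def build_traffic_matrix(flow_records):
    """Sum measured bytes per (ingress, egress) edge-node pair."""
    matrix = defaultdict(int)
    for ingress, egress, byte_count in flow_records:
        matrix[(ingress, egress)] += byte_count
    return dict(matrix)


# Hypothetical flow records: (ingress node, egress node, bytes observed).
flows = [
    ("A", "B", 1200),
    ("A", "C", 800),
    ("A", "B", 300),
    ("C", "A", 500),
]
tm = build_traffic_matrix(flows)
```

   Coarser matrices (for example, per traffic aggregate) could be
   obtained in the same way by mapping each flow to its aggregate before
   summing.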

   In order to conduct performance studies and planning on existing or
   future networks, a routing analysis may be performed to determine the
   path(s) which the routing protocols will choose for each traffic
   demand, and the utilization of network resources as traffic is routed
   through the network. The routing analysis needs to capture the
   selection of paths through the network, the assignment of traffic
   across multiple feasible routes, and the multiplexing of IP traffic
   over traffic trunks (if such constructs exist) and over the
   underlying network infrastructure. A topology model for the network
   may be extracted from network architecture documents, or from network
   designs, or from information contained in router configuration files,
   or from routing databases, or from routing tables. Topology
   information may also be derived from servers that monitor network
   state and from servers that perform provisioning functions.
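
   A very small routing analysis of the kind described above might look
   like the following sketch. It assumes that each demand follows a
   single lowest-metric path (no equal-cost multipath and no traffic
   trunks), which is a simplification of real IGP behavior. The topology,
   metrics, capacities, and demands are invented for illustration, and
   the function names are hypothetical.

```python
import heapq


def shortest_path(graph, src, dst):
    """Dijkstra over graph[u][v] = metric; returns one lowest-cost node list."""
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, metric in graph[u].items():
            nd = d + metric
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        raise ValueError(f"no route from {src} to {dst}")
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def link_utilization(graph, capacity, demands):
    """Route each demand on its shortest path; return load/capacity per link."""
    load = {link: 0.0 for link in capacity}
    for (src, dst), volume in demands.items():
        path = shortest_path(graph, src, dst)
        for u, v in zip(path, path[1:]):
            load[(u, v)] += volume
    return {link: load[link] / capacity[link] for link in capacity}


# Hypothetical topology: IGP metrics, capacities (Mb/s), and demands (Mb/s).
graph = {"A": {"B": 1, "C": 5}, "B": {"C": 1}, "C": {}}
capacity = {("A", "B"): 100.0, ("A", "C"): 100.0, ("B", "C"): 100.0}
demands = {("A", "C"): 60.0, ("B", "C"): 30.0}
util = link_utilization(graph, capacity, demands)
```

   In this example the direct A-C link carries no traffic because its
   metric is higher than that of the path through B, while the B-C link
   carries both demands.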

   Routing in operational IP networks can be administratively controlled
   at various levels of abstraction, e.g., by manipulating BGP
   attributes; by manipulating IGP metrics; or by manipulating traffic
   engineering parameters, resource parameters, and policy constraints
   for path oriented technologies such as MPLS. Within the context of
   MPLS, the path of an explicit LSP can be computed and established in
   various ways, e.g., (1) manually, (2) automatically online using
   constraint-based routing processes implemented on label switching
   routers, or (3) automatically off-line using constraint-based
   traffic engineering support systems.


2.4.1 Combating the Congestion Problem


   Minimizing congestion is a significant aspect of traffic engineering.
   This subsection gives an overview of the general approaches that have
   been used or proposed to combat congestion problems.

   Congestion management policies can be categorized based upon the
   following criteria (see [YaRe95] for a more detailed taxonomy of
   congestion control schemes): (1) response time scale, which can be
   characterized as long, medium, or short; (2) reactive versus
   preventive, which relates to congestion control and congestion
   avoidance; and (3) supply side versus demand side congestion
   management schemes. These aspects are elaborated upon in the
   following paragraphs.

   (1) Response time scale

   - Long (weeks to months): Capacity planning works over a relatively
   long time scale to expand network capacity based on estimates or
   forecasts of future traffic demand and traffic distribution. Since
   router and link provisioning take time and are generally expensive,
   these upgrades are typically carried out on a time scale of weeks to
   months, or even years.

   - Medium (minutes to days): Several control policies fall within the
   medium time scale category. Examples include: 1) adjusting IGP and/or
   BGP parameters to route traffic away from or towards certain segments
   of the network; 2) setting up and/or adjusting some Explicitly-Routed
   Label Switched Paths (ER-LSPs) to route some traffic trunks away from
   possibly congested resources or towards possibly more favorable
   routes; 3) reconfiguring the logical topology of the network to make
   it correlate more closely with the traffic distribution using, for
   example, some underlying path-oriented technology such as MPLS LSPs,
   ATM PVCs, or optical channel trails (see e.g. [AWD6]).  Many of these
   adaptive medium time scale response schemes rely on a measurement
   system that monitors changes in traffic distribution, traffic shifts,
   and network resource utilization and subsequently provides feedback
   to the online and/or off-line traffic engineering mechanisms and
   tools which employ this feedback information to trigger certain
   control actions to occur within the network. The traffic engineering
   mechanisms and tools can be implemented in a distributed or
   centralized fashion, and may have a hierarchical or flat structure.
   The comparative merits and demerits of distributed and centralized
   control structures for networks are well known. A centralized scheme
   may have global visibility into the network state and may produce
   potentially more optimal solutions. However, centralized schemes are
   prone to single points of failure and may not scale as well as
   distributed schemes. Moreover, the information utilized by a
   centralized scheme may be stale and may not reflect the actual state
   of the network. It is not an objective of this memo to make a
   recommendation between distributed and centralized schemes. This is a
   choice that network administrators must make based on their specific
   needs.

   - Short (picoseconds to minutes): This category includes packet level
   processing functions and events in the order of several round trip
   times. It includes router mechanisms such as passive or active buffer
   management which is used to control congestion and/or signal
   congestion to end systems so that they can slow down. One of the most
   popular active management schemes, especially for TCP traffic, is
   Random Early Detection (RED) [FlJa93], which supports congestion
   avoidance by controlling the average queue size. During congestion
   (but before the queue is filled), the RED scheme chooses arriving
   packets to be "marked" according to a probabilistic algorithm which
   takes into account the average queue size. For a router that does not
   utilize explicit congestion notification (ECN) (see e.g., [Floy94]),
   the marked packets can simply be dropped to signal the inception of
   congestion to end systems; otherwise, if the router supports ECN,
   then it can set the ECN field in the packet header. Several
   variations of RED have been proposed for use in multiclass
   environments with different drop precedence levels [RFC-2597], e.g.,
   RED with In and Out (RIO) and Weighted RED. It is generally agreed
   that RED provides congestion avoidance performance which is not worse
   than traditional Tail-Drop (TD) (i.e., dropping arriving packets only
   when the queue is full). Importantly, however, RED reduces the
   possibility of global synchronization and improves fairness among
   different TCP sessions. Nevertheless, RED by itself cannot prevent
   congestion and unfairness caused by unresponsive sources, e.g., UDP
   connections, or some misbehaving greedy connections. Other schemes
   have been proposed to improve the performance and fairness in the
   presence of unresponsive connections.  Some of these schemes have
   been proposed as theoretical frameworks and are not typically
   available in existing products. Two such schemes are Longest Queue
   Drop (LQD) and Dynamic Soft Partitioning with Random Drop (RND)
   [SLDC98].

   (2) Reactive versus preventive

   - Reactive (recovery): reactive policies are those that react to
   existing congestion in order to alleviate it. All the policies
   described in the long and medium time scales above can be categorized
   as being reactive especially if the policies are based on monitoring
   and identifying existing congestion problems and initiating relevant
   actions to ease the situation.

   - Preventive (predictive/avoidance): preventive policies are those
   that take proactive action to prevent congestion based on estimates
   or predictions of potential congestion problems in the future. Some
   of the policies described in the long and medium time scales fall
   under this category. They do not necessarily respond immediately to
   existing congestion problems. Instead they may take into account
   forecasts of future traffic demand and distribution, and may take or
   prescribe actions in order to prevent potential congestion problems
   in the future. The schemes described in short time scale, e.g., RED
   and its variations, ECN, LQD, and RND, are also used for congestion
   avoidance since dropping or marking packets as an early congestion
   notification before queues actually overflow would trigger
   corresponding TCP sources to slow down.

   (3) Supply side versus demand side


   - Supply side: supply side policies are those that seek to increase
   the effective capacity available to traffic in order to control or
   obviate congestion. One way to accomplish this is to minimize
   congestion by having a relatively balanced network. For example,
   capacity planning should aim to provide a physical topology and
   associated link bandwidths that match estimated traffic workload and
   traffic distribution based on forecasting, subject to budgetary and
   other constraints.  However, if actual traffic distribution does not
   match the topology derived from capacity planning (due, for example,
   to forecasting errors or facility constraints), then the traffic can
   be mapped onto the existing topology using routing control mechanisms
   or by modifying the logical topology using path oriented technologies
   (e.g., MPLS, ATM, optical channel trails), or by using some other
   load redistribution mechanisms.

   - Demand side: demand side policies are those that seek to control or
   regulate the offered traffic. For example, some of the short time
   scale mechanisms described earlier (such as RED and its variations,
   ECN, LQD, and RND) as well as policing and rate shaping mechanisms
   attempt to regulate the offered load in various ways. Tariffs may
   also be applied as a demand side instrument. However, to date,
   tariffs have not been used as a means of demand side congestion
   management within the Internet.

   In summary, a variety of mechanisms can be brought to bear to address
   congestion problems in IP networks. These mechanisms may operate at
   multiple time-scales.
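
   The RED behavior summarized under the short time scale above (marking
   arriving packets with a probability that depends on the average queue
   size) can be sketched as follows. This is a simplified illustration
   rather than the full algorithm of [FlJa93]: it keeps an exponentially
   weighted moving average of the queue length and applies a linear
   marking probability between two thresholds, but omits refinements
   such as spacing marks evenly and handling idle periods. The class
   name, parameter names, and sample values are assumptions chosen for
   this example. Whether a marked packet is dropped or has its ECN field
   set is left to the caller.

```python
import random


class RedQueue:
    """Simplified RED marker: EWMA of queue length, linear marking probability."""

    def __init__(self, min_th, max_th, max_p, weight, rng=None):
        self.min_th = min_th    # below this average, never mark
        self.max_th = max_th    # at or above this average, always mark
        self.max_p = max_p      # marking probability as the average nears max_th
        self.weight = weight    # EWMA weight given to the newest sample
        self.avg = 0.0          # current average queue size
        self.rng = rng or random.Random()

    def mark_probability(self):
        """Marking probability implied by the current average queue size."""
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        span = self.max_th - self.min_th
        return self.max_p * (self.avg - self.min_th) / span

    def on_arrival(self, queue_len):
        """Update the average for an arriving packet; True means mark it."""
        self.avg = (1.0 - self.weight) * self.avg + self.weight * queue_len
        return self.rng.random() < self.mark_probability()


# Hypothetical parameters; a large weight is used only to keep the trace short.
red = RedQueue(min_th=5, max_th=15, max_p=0.1, weight=0.5, rng=random.Random(0))
marks = [red.on_arrival(q) for q in (4, 30, 30, 30)]
```

   In this trace the first packet arrives while the average is below the
   minimum threshold and is never marked, whereas the later packets
   arrive after the average has exceeded the maximum threshold and are
   always marked.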


2.5 Implementation and Operational Context


   The operational context of Internet traffic engineering is
   characterized by constant changes which occur at multiple levels of
   abstraction.  The implementation context demands effective planning,
   organization, and execution. The planning aspects may involve
   determining prior sets of actions in order to achieve desired
   objectives. Organizing involves arranging and assigning
   responsibility to the various components of the traffic engineering
   system and coordinating their activities in order to accomplish the
   desired TE objectives. Execution involves measuring and applying
   corrective or perfective actions to attain and maintain desired TE
   goals.


3.0 Traffic Engineering Process Model(s)


   This section describes a process model that captures the high level
   practical aspects of Internet traffic engineering in an operational
   context. The process model is described in terms of a sequence of
   actions that a traffic engineer, or more generally that a traffic
   engineering system, goes through in order to optimize the performance
   of an operational network (see also [AWD1, AWD2]). Although the
   details regarding how traffic engineering is carried out may differ
   from network to network, the process model described here represents
   broad activities which are common to most traffic engineering
   methodologies. This process model may be enacted explicitly or
   implicitly; by an automaton and/or by a human.

   The first phase of the TE process model is to define relevant control
   policies that govern the operation of the network. These policies may
   depend on the prevailing business model, the network cost structure,
   the operating constraints, and one or more optimization criteria, as
   well as other factors.

   The second phase of the process model is a feedback process which
   involves acquiring measurement data from the operational network. If
   empirical data is not readily available from the network, then
   synthetic workloads may be used instead, which reflect either the
   prevailing or the expected workload of the network. Synthetic
   workloads may be derived by estimation or by extrapolation using
   prior empirical data, or by using mathematical models of traffic
   characteristics, or by using some other means.

   The third phase of the process model is to analyze the network state
   and to characterize traffic workload. In general, performance
   analysis may be proactive and/or reactive. Proactive performance
   analysis identifies potential problems that do not yet exist, but that
   may manifest at some point in the future. Reactive performance
   analysis, on the other hand, identifies existing problems, determines
   their cause through a process of diagnosis, and if necessary
   evaluates alternative approaches to remedy the problem. A number of
   quantitative and qualitative techniques may be used in the analysis
   process, including modeling based analysis and simulation. The
   analysis phase of the process model may involve the following
   actions: (1) investigate the concentration and distribution of
   traffic across the network or relevant subsets of the network, (2)
   identify the characteristics of the offered traffic workload, (3)
   identify existing or potential bottlenecks, and (4) identify network
   pathologies such as ineffective placement of links, single points of
   failure, etc. Network pathologies may result from a number of
   factors such as inferior network architecture, inferior network
   design, configuration problems, and others.  A traffic matrix may be
   constructed as part of the analysis process. Network analysis may
   also be descriptive or prescriptive.

   The fourth phase of the TE process model is concerned with the
   performance optimization of the network. The performance optimization
   phase generally involves a decision process which selects and
   implements a particular set of actions from a choice between
   alternatives.  Optimization actions may include use of appropriate
   techniques to control the offered traffic, or to control the
   distribution of traffic across the network.  Optimization actions may
   also involve increasing link capacity, deploying additional hardware
   such as routers and switches, adjusting parameters associated with
   routing such as IGP metrics and BGP attributes in a systematic way,
   and adjusting traffic management parameters. Network performance
   optimization may also involve starting a network planning process to
   improve the network architecture, network design, network capacity,
   network technology, and the configuration of network elements in
   order to accommodate current and future growth.


3.1 Components of the Traffic Engineering Process Model


   As evidenced by the discussion in the previous subsection, the key
   components of the TE process model include a measurement subsystem, a
   modeling and analysis subsystem, and an optimization subsystem. The
   following subsections elaborate on these components as they apply to
   the TE process model.


3.2 Measurement


   Measurement is crucial to the traffic engineering function. The
   operational state of a network can only be conclusively determined
   through measurement. Measurement is also critical to the optimization
   function because it provides feedback data which is used by TE
   control subsystems to adaptively optimize network performance in
   response to events and stimuli that originate within and outside the
   network. Measurement is also needed to ascertain the quality of
   network services and to evaluate the effectiveness of TE policies.
   Experience suggests that measurement is most effective when it is
   applied systematically.

   In developing a measurement system to support the TE function in IP
   networks, the following questions should be considered very
   carefully:  Why is measurement needed in this particular context?
   What parameters are to be measured?  How should the measurement be
   accomplished?  Where should the measurement be performed? When should
   the measurement be performed?  How frequently should the monitored
   variables be measured? What level of measurement accuracy and
   reliability is desirable? What level of measurement accuracy and
   reliability is realistically attainable? To what extent can the
   measurement system permissibly interfere with the monitored network
   components and variables? What is the acceptable cost of measurement?
   To a large degree, the answers to the above questions will determine
   the measurement tools and measurement methodologies that are
   appropriate in any given TE context.

   It is also worthwhile to point out that there is a distinction
   between measurement and evaluation. Measurement provides raw data
   concerning state parameters and variables of monitored elements.
   Evaluation utilizes the raw data to make inferences regarding the
   monitored system.

   Measurement in support of the TE function can occur at different
   levels of abstraction. For example, measurement can be used to derive
   packet level characteristics, flow level characteristics, user or
   customer level characteristics, traffic aggregate characteristics,
   component level characteristics, network wide characteristics, etc.


3.3 Modeling and Analysis


   Modeling and analysis are important aspects of Internet traffic
   engineering. Modeling involves constructing an abstract or physical
   representation which depicts relevant traffic and network attributes
   and characteristics.

   Accurate source models for traffic are particularly useful for
   analysis.  A major research topic in Internet traffic engineering is
   the development of traffic source models that are consistent with
   empirical data obtained from operational networks. Such models should
   also be tractable and amenable to analysis. The topic of source
   models for IP traffic is a research topic and is therefore outside
   the scope of this document; nonetheless its importance cannot be
   over-emphasized.

   A network model is an abstract representation of the network which
   captures relevant network features, attributes, and characteristics,
   such as link and nodal attributes and constraints.  A network model
   may facilitate analysis and/or simulation which can be used to
   predict network performance under various conditions, and also to
   guide network expansion plans.

   Network simulation tools are extremely useful for traffic
   engineering. A good network simulator can be used to mimic and
   visualize network characteristics in various ways under various
   conditions.  For example, a network simulator might be used to depict
   congested resources and hot spots, and to provide hints regarding
   possible solutions to network performance problems. A good simulator
   may also be used to validate the effectiveness of planned solutions
   to network issues without the need to tamper with the operational
   network, or to commence an expensive network upgrade which may not
   achieve the desired objectives. Furthermore, during the process of
   network planning, a network simulator may reveal pathologies such as
   single points of failure which may require additional redundancy, and
   potential bottlenecks and hot spots which may require additional
   capacity. Routing simulators are especially useful. A routing
   simulator may identify planned links which may not actually be used
   to route traffic by the existing routing protocols. Simulators can
   also be used to conduct scenario based and perturbation based
   analysis, as well as sensitivity studies.  Simulation results can be
   used to initiate appropriate actions in various ways. For example, an
   important application of network simulation tools is to investigate
   and identify how best to evolve and grow the network in order to
   accommodate projected future demands.
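
   As a small illustration of how a planning tool might reveal single
   points of failure, the sketch below removes each node of a topology
   in turn and checks whether the remaining nodes stay connected. This
   brute-force check is chosen for clarity and is only one of several
   possible techniques. The topology and the function names are
   hypothetical.

```python
from collections import deque


def is_connected(adj, removed=None):
    """Breadth-first search over an undirected adjacency dict, skipping `removed`."""
    nodes = [n for n in adj if n != removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v != removed and v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(nodes)


def single_points_of_failure(adj):
    """Return the nodes whose individual failure would partition the network."""
    return sorted(n for n in adj if not is_connected(adj, removed=n))


# Hypothetical topology: a triangle A-B-C with a stub node D attached to C.
topology = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
spof = single_points_of_failure(topology)
```

   Here node C is reported because D can only be reached through it; a
   planner might respond by adding a second link to D.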


3.4 Optimization


   Network performance optimization involves resolving network issues
   into concepts that enable a solution, identifying a solution, and
   implementing the solution. Network performance optimization can be
   corrective or perfective. In corrective optimization, the goal is to
   remedy a problem that has occurred or that is incipient. In
   perfective optimization, the goal is to improve network performance
   even when explicit problems do not exist and are not anticipated.

   Network performance optimization is a continual process, as noted
   previously.  Performance optimization iterations may consist of
   real-time optimization sub-processes and non-real-time network
   planning sub-processes.  The difference between real-time
   optimization and network planning is largely in the relative time-
   scale at which they operate and in the granularity of actions.  One
   of the objectives of a real-time optimization sub-process is to
   control the mapping and distribution of traffic over the existing
   network infrastructure to avoid and/or relieve congestion, to assure
   satisfactory service delivery, and to optimize resource utilization.
   Real-time optimization is needed because, no matter how well a
   network is designed, random incidents such as fiber cuts or shifts in
   traffic demand will occur.  When they occur, they can cause
   congestion and other problems to manifest in an operational network.
   Real-time optimization must solve such problems in small to medium
   time-scales ranging from micro-seconds to minutes or hours. Examples
   of real-time optimization include queue management, IGP/BGP metric
   tuning, and using technologies such as MPLS explicit LSPs to change
   the paths of some traffic trunks [XIAO].

   One of the functions of the network planning sub-process is to
   initiate actions to evolve the architecture, technology, topology,
   and capacity of a network in a systematic way. When there is a
   problem in the network, real-time optimization should provide an
   immediate fix. Because of the need to respond promptly, the real-time
   solution may not be the best possible solution.  Network planning may
   subsequently be needed to refine the solution and improve the
   situation.  Network planning is also needed to expand the network to
   support traffic growth and changes in traffic distribution over time.
   As noted previously, the outcome of network planning might be a
   change in the topology and/or capacity of the network.

   It can be seen that network planning and real-time performance
   optimization are mutually complementary activities. A well-planned
   and designed network makes real-time optimization easier, while a
   systematic approach to real-time network performance optimization
   allows network planning to focus on long term issues rather than
   tactical considerations. Systematic real-time network performance
   optimization also provides valuable inputs and insights towards
   network planning.

   Stability is a major consideration in real-time network performance
   optimization. This aspect will be reiterated repeatedly throughout
   this memo.


4.0 Historical Review and Recent Developments


   This section presents a brief review of various traffic engineering
   approaches that have been proposed and implemented in
   telecommunications and computer networks. The discussion is not meant
   to be exhaustive, but is primarily intended to illuminate pre-
   existing perspectives and prior art concerning traffic engineering in
   the Internet as well as in legacy telecommunications networks.


4.1 Traffic Engineering in Classical Telephone Networks


   It is useful to begin with a review of traffic engineering in
   telephone networks which often relates to the means by which user
   traffic is steered from the source to the destination.  This
   subsection presents a brief overview of this topic. The book by G.
   Ash [ASH2] contains a detailed description of the various routing
   strategies that have been applied in telephone networks.

   The early telephone network relied on static hierarchical routing,
   whereby routing patterns remained fixed independent of the state of
   the network or time of day. The hierarchy was intended to accommodate
   overflow traffic, improve network reliability via alternate routes,
   and prevent call looping by using strict hierarchical rules.  The
   network was typically over-provisioned since a given fixed route had
   to be dimensioned so that it could carry user traffic during a busy
   hour of any busy day.  Hierarchical routing in the telephony network
   was found to be too rigid with the advent of digital switches and
   stored program control which were able to manage more complicated
   traffic engineering rules.

   Dynamic routing was introduced to alleviate the routing inflexibility
   in the static hierarchical routing so that the network would operate
   more efficiently, thereby resulting in significant economic gains
   [HuSS87].  Dynamic routing typically reduces the overall loss
   probability by 10 to 20 percent as compared to static hierarchical
   routing.  Dynamic routing can also improve network resilience by
   recalculating routes on a per-call basis and periodically updating
   routes.

   There are three main types of dynamic routing in the telephone
   network: time-dependent routing, state-dependent routing (SDR), and
   event dependent routing (EDR).

   In time-dependent routing, regular variations in traffic loads due to
   time of day and seasonality are exploited in pre-planned routing
   tables.  In state-dependent routing, routing tables are updated
   online according to the current state of the network (e.g., traffic
   demand, utilization, etc.).  In event dependent routing, routing
   changes are triggered by events, such as call setups encountering
   congested or blocked links, whereupon new paths are searched out
   using learning models.  EDR methods are real-time adaptive but,
   unlike SDR, do not require global state information.
   Examples of EDR schemes include the dynamic alternate routing (DAR)
   from BT, the state-and-time dependent routing (STR) from NTT, and the
   success-to-the-top (STT) routing from AT&T.

   Dynamic non-hierarchical routing (DNHR) is an example of dynamic
   routing that was introduced in the AT&T toll network in the 1980s to
   respond to time-dependent information such as regular load variations
   as a function of time.  Time-dependent information in terms of load
   may be divided into three time scales: hourly, weekly, and yearly.
   Correspondingly, three algorithms are defined to pre-plan the routing
   tables.  The network design algorithm operates over a year-long
   interval, while the demand servicing algorithm operates weekly to
   fine tune link sizes and routing tables to correct yearly forecast
   errors. At the smallest time scale, the routing algorithm is used
   to make limited adjustments based on daily traffic variations.
   Network design and demand servicing are computed using off-line
   calculations.  Typically, the calculations require extensive search
   on possible routes.  On the other hand, routing may need online
   calculations to handle crankback.  DNHR adopts a "two-link" approach
   whereby a path can consist of two links at most.  The routing
   algorithm presents an ordered list of route choices between an
   originating switch and a terminating switch.  If a call overflows, a
   via switch (a tandem exchange between the originating switch and the
   terminating switch) would send a crankback signal to the originating
   switch, which would then select the next route, and so on, until no
   alternative routes remain, in which case the call is blocked.
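
   As an illustration only, the route selection described above can be
   sketched as follows.  The function and variable names, and the
   representation of a link as a count of free trunks, are assumptions
   made for this example and are not taken from any DNHR specification.

```python
from typing import Dict, FrozenSet, List, Optional

Link = FrozenSet[str]


def link(a: str, b: str) -> Link:
    """Undirected trunk group between two switches (hypothetical model)."""
    return frozenset((a, b))


def route_call(routes: List[List[str]],
               free_trunks: Dict[Link, int]) -> Optional[List[str]]:
    """Try each route in order; return the route used, or None if blocked.

    `routes` is the ordered list of route choices between the originating
    switch and the terminating switch. Each route lists the switches it
    visits and may contain at most two links (that is, one via switch).
    """
    for route in routes:
        if not 2 <= len(route) <= 3:
            raise ValueError("a route must consist of one or two links")
        hops = [link(a, b) for a, b in zip(route, route[1:])]
        if all(free_trunks.get(hop, 0) > 0 for hop in hops):
            for hop in hops:
                free_trunks[hop] -= 1  # seize one trunk on each link
            return route
        # The call overflows on this route. If the busy link is the
        # second one, this models the via switch cranking the call back
        # to the originating switch, which then tries the next route.
    return None  # no alternative routes remain: the call is blocked
```

   In this sketch the direct route is simply the first entry in the
   ordered list, and the call is blocked only after every entry has
   been tried.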


4.2 Evolution of Traffic Engineering in Packet Networks


   This subsection reviews related prior work that was intended to
   improve the performance of data networks.  Indeed, optimization of
   the performance of data networks started in the early days of the
   ARPANET. Other early commercial networks such as SNA also recognized
   the importance of performance optimization and service
   differentiation.

   In terms of traffic management, the Internet has been a best effort
   service environment until recently. In particular, very limited
   traffic management capabilities existed in IP networks to provide
   differentiated queue management and scheduling services to packets
   belonging to different classes.

   In the following subsections, we review the evolution of practical
   implementations of traffic engineering mechanisms in IP networks and
   their predecessors.


4.2.1 Adaptive Routing in ARPANET


   The early ARPANET recognized the importance of adaptive routing where
   routing decisions were based on the current state of the network
   [McQ80].  In the early minimum delay routing approaches, each packet
   was forwarded to its destination along a path for which the total
   estimated transit time is the smallest.  Each node maintained a table
   of network delays, which represented the estimated delay that a
   packet can expect to experience along a given path toward its
   destination. The minimum delay table was periodically transmitted by
   a node to its neighbors. The shortest path in terms of hop count was
   also propagated to give the connectivity information.
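
   The table update implied by this description can be sketched as
   follows.  This is a minimal distance-vector style illustration, not
   the actual ARPANET code; the function name, the data layout, and the
   assumption that each neighbor advertises a delay of zero to itself
   are all hypothetical.

```python
from typing import Dict, Optional, Tuple

DelayTable = Dict[str, float]  # destination -> estimated delay


def update_delay_table(
    node: str,
    link_delay: Dict[str, float],
    neighbor_tables: Dict[str, DelayTable],
) -> Dict[str, Tuple[float, Optional[str]]]:
    """Recompute (minimum estimated delay, next hop) per destination.

    `link_delay` maps each neighbor to the measured delay of the link
    toward it. `neighbor_tables` holds the minimum delay table most
    recently received from each neighbor (assumed to include an entry
    of 0.0 for the neighbor itself).
    """
    table: Dict[str, Tuple[float, Optional[str]]] = {node: (0.0, None)}
    for neighbor, delay_to_neighbor in link_delay.items():
        advertised = neighbor_tables.get(neighbor, {})
        for dest, delay_from_neighbor in advertised.items():
            if dest == node:
                continue
            total = delay_to_neighbor + delay_from_neighbor
            # Keep the neighbor that offers the smallest total delay.
            if dest not in table or total < table[dest][0]:
                table[dest] = (total, neighbor)
    return table
```

   Because `link_delay` changes with load in such a scheme, the chosen
   next hops can change from one update to the next.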

   A drawback of this approach is that dynamic link metrics tend to
   create "traffic magnets" whereby congestion will be shifted from one
   location of a network to another location; essentially creating
   oscillation and instability.


4.2.2 Dynamic Routing in the Internet


   The Internet, which evolved from the ARPANET, adopted dynamic routing
   algorithms with distributed control to determine the paths that
   packets should take en route to their destinations.  The routing
   algorithms themselves are adaptations of shortest path algorithms
   where costs are based on link metrics. In principle, the link metric
   can be based on static or dynamic quantities.  In the static case,
   the link metric may be assigned administratively according to some
   local criteria.  In the dynamic case, the link metric may be a
   function of some congestion measure such as delay or packet loss.

   It was recognized early that static link metric assignment was
   inadequate because it can easily lead to unfavorable scenarios
   whereby some links become congested while some others remain lightly
   loaded.  One of the many reasons for the inadequacy of static link
   metrics is that link metric assignment was often done without
   considering the traffic matrix in the network.  Moreover, the routing
   protocols did not take traffic attributes and capacity constraints
   into account in making routing decisions. The practical implication
   is that traffic concentration is localized in subsets of the network
   infrastructure, potentially causing congestion.  Even if link metrics
   are assigned in accordance with the traffic matrix, unbalanced loads
   in the network can still occur due to a number of reasons, such as:
    - Some resources might not be deployed in optimal locations from a
      routing perspective.
    - Forecasting errors in traffic volume and/or traffic distribution.
    - Dynamics in traffic matrix due to the temporal nature of traffic
      patterns, BGP policy change from peers, etc.

   The inadequacy of the legacy Internet interior gateway routing system
   is one of the factors motivating the interest in path oriented
   technologies with explicit routing and constraint-based routing
   capability, such as MPLS.


4.2.3 ToS Routing


   In ToS-based routing, different routes to the same destination may be
   selected depending on the Type-of-Service (ToS) field of an IP packet
   [RFC-1349].  ToS classes include, for example, low delay and high
   throughput.  Each link is associated with multiple link costs, where
   each link cost is used to compute routes for a particular ToS.  A
   separate shortest path tree is computed for each ToS. Since the
   shortest path algorithm has to be run for each ToS, the computation
   may be quite expensive with this approach.  Classical ToS-based
   routing has become outdated because the ToS field in the IP header
   has been replaced by the Diffserv field.  A more serious technical
   issue with classical ToS-based routing is that it is difficult to
   perform effective traffic engineering because each class still
   relies exclusively on shortest path routing.
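
   The per-ToS computation can be illustrated with the following
   sketch, which runs Dijkstra's algorithm once for each ToS value.
   The graph layout and the ToS names "delay" and "throughput" are
   assumptions made for the example.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# node -> list of (neighbor, {tos_name: link_cost}); hypothetical layout
Links = Dict[str, List[Tuple[str, Dict[str, int]]]]


def shortest_path_tree(links: Links, source: str,
                       tos: str) -> Dict[str, Optional[str]]:
    """Return the parent of each reachable node in the tree for one ToS."""
    dist: Dict[str, int] = {source: 0}
    parent: Dict[str, Optional[str]] = {source: None}
    heap: List[Tuple[int, str]] = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, costs in links.get(u, []):
            nd = d + costs[tos]  # use the link cost for this ToS only
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return parent


def all_tos_trees(links: Links, source: str,
                  tos_values: List[str]) -> Dict[str, Dict[str, Optional[str]]]:
    """One full shortest path computation per ToS value."""
    return {tos: shortest_path_tree(links, source, tos) for tos in tos_values}
```

   The loop in `all_tos_trees` makes the cost noted above visible: the
   amount of computation grows with the number of ToS values.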


4.2.4 Equal Cost MultiPath


   Equal Cost MultiPath (ECMP) is another technique that attempts to
   address the deficiency in Shortest Path First (SPF) interior gateway
   routing systems [RFC-2178].  In an SPF algorithm, if two or more
   paths to a given destination have the same cost, the algorithm will
   choose one of them.  In ECMP, the algorithm is modified slightly so
   that if two or more equal-cost shortest paths exist between two
   nodes, the traffic between the nodes is distributed among them.
   Traffic distribution across the equal-cost paths is usually done in
   one of two ways: 1) packet-based in a round-robin fashion,
   or 2) flow-based using hashing on source and destination IP
   addresses.  Approach 1) can easily cause out-of-order packets while
   approach 2) is dependent on the number and distribution of flows.
   Flow-based load sharing may be unpredictable in an enterprise network
   where the number of flows is relatively small and heterogeneous
   (i.e., hashing may not be uniform), but is generally effective in
   core public networks where the number of flows is very large.

   Because link costs are static and bandwidth constraints are not taken
   into account, ECMP attempts to distribute the traffic as equally as
   possible among the equal-cost paths independent of the congestion
   status of each path.  As a result, given two equal-cost paths, it is
   possible that one of the paths will be more congested than the other.
   Another drawback of ECMP is that load sharing cannot be done on
   multiple paths which have non-identical costs.
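
   The two distribution methods can be contrasted with a short sketch.
   The use of CRC-32 as the hash function and the function names are
   assumptions made for illustration; actual routers use
   implementation-specific hash functions.

```python
import itertools
import zlib
from typing import Iterator, List


def packet_based_choice(paths: List[str]) -> Iterator[str]:
    """Round-robin: successive packets cycle through the paths."""
    return itertools.cycle(paths)


def flow_based_choice(src_ip: str, dst_ip: str, paths: List[str]) -> str:
    """Hash the address pair so that one flow always uses one path."""
    key = f"{src_ip}|{dst_ip}".encode("ascii")
    return paths[zlib.crc32(key) % len(paths)]
```

   With round-robin, consecutive packets of one flow take different
   paths and may arrive out of order.  With hashing, packet order
   within a flow is preserved, but the load on each path depends on
   how many flows hash to it and how large those flows are.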


4.3 Overlay Model


   In the overlay model, a virtual-circuit network, such as ATM or frame
   relay, provides virtual-circuit connectivity between routers that are
   located at the edges of the virtual-circuit network.  In this model,
   two routers that are connected through a virtual circuit see a direct
   adjacency between themselves independent of the physical route taken
   by the virtual circuit through the ATM or frame relay network.  Thus,
   the overlay model essentially decouples the logical topology that
   routers see from the physical topology that the ATM or frame relay
   network manages.  The overlay model enables the network operator to
   perform traffic engineering by re-configuring the virtual circuits so
   that a virtual circuit on a congested physical path can be re-routed
   to a less congested one.

   The overlay model requires the management of two separate networks
   (e.g., IP and ATM) which results in increased operational complexity
   and cost.  In the fully-meshed overlay model, each router would peer
   to every other router in the network. Some of the issues with the
   overlay model are discussed in [AWD2].


4.4 Constraint-Based Routing


   Constraint-based routing pertains to a class of routing systems that
   compute routes through a network subject to satisfaction of a set of
   constraints and requirements. The constraints may be imposed by the
   network and/or by administrative policies. Constraints may include
   bandwidth, delay, and policy instruments such as resource class
   attributes. The concept of constraint-based routing in IP networks
   was first defined in [AWD1] within the context of MPLS traffic
   engineering requirements. Unlike QoS routing, which generally deals
   with routing traffic flows in order to meet prescribed QoS
   requirements, constraint-based routing is applicable to traffic
   aggregates as well as flows and may also take policy restrictions
   into account.
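
   One common way to realize such a computation, shown here only as a
   sketch, is to prune the links that violate the constraints and then
   run a shortest path algorithm on what remains.  The link tuple
   layout, the resource class names, and the function name are
   assumptions made for the example; [AWD1] does not prescribe this
   particular algorithm.

```python
import heapq
from typing import Dict, FrozenSet, List, Optional, Tuple

# (from, to, cost, available bandwidth, resource class); hypothetical
LinkSpec = Tuple[str, str, int, float, str]


def constrained_path(
    links: List[LinkSpec],
    src: str,
    dst: str,
    bandwidth: float,
    excluded_classes: FrozenSet[str] = frozenset(),
) -> Optional[List[str]]:
    """Least-cost path using only links that satisfy the constraints."""
    # Step 1: prune links that fail the bandwidth or policy constraint.
    adj: Dict[str, List[Tuple[str, int]]] = {}
    for u, v, cost, avail, rclass in links:
        if avail >= bandwidth and rclass not in excluded_classes:
            adj.setdefault(u, []).append((v, cost))

    # Step 2: ordinary Dijkstra on the pruned topology.
    dist: Dict[str, int] = {src: 0}
    prev: Dict[str, str] = {}
    heap: List[Tuple[int, str]] = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in adj.get(u, []):
            nd = d + cost
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    if dst not in dist:
        return None  # no path satisfies the constraints
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]
```

   Note that the result may differ from the unconstrained shortest
   path, which is the property that makes the approach useful for
   traffic engineering.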


4.5 Overview of Recent IETF Projects Related to Traffic Engineering


   This subsection reviews a number of recent IETF activities that are
   pertinent to Internet traffic engineering.


4.5.1 Integrated Services


   The IETF developed the integrated services model which requires
   resources, such as bandwidth and buffers, to be reserved a priori for
   a given traffic flow to ensure that the quality of service requested
   by the traffic flow is satisfied. The integrated services model
   requires additional components beyond those used in the best-effort
   model such as packet classifiers, packet schedulers, and admission
   control.  A packet classifier is used to identify flows that are to
   receive a certain level of service. A packet scheduler handles the
   service of different packet flows to ensure that QoS commitments are
   met.  Admission control is used to determine whether a router has the
   necessary resources to accept a new flow.

   Two services have been defined: guaranteed service [RFC-2212] and
   controlled-load service [RFC-2211]. The guaranteed service can be
   used for applications that require real-time delivery. For this type
   of application, data that is delivered to the application after a
   certain time is generally considered worthless. Thus guaranteed
   service has been designed to provide a firm bound on the end-to-end
   packet delay for a flow.

   The controlled-load service can be used for adaptive applications
   that can tolerate some delay but that are sensitive to traffic
   overload conditions. Applications of this type typically function
   satisfactorily when the network is lightly loaded but degrade
   significantly when the network is heavily loaded. Thus, controlled-
   load service has been designed to provide approximately the same
   service as best-effort service in a lightly loaded network regardless
   of actual network conditions. Controlled-load service is described
   qualitatively in that no target values of delay or loss are
   specified.

   The main issue with the integrated services model has been
   scalability, especially in large public IP networks which may
   potentially have millions of concurrent micro-flows.


4.5.2 RSVP


   RSVP, a soft state signaling protocol, was originally invented as a
   signaling protocol for applications to reserve network resources
   [RFC-2205]. Under RSVP, the sender sends a PATH Message to the
   receiver, specifying the characteristics of the traffic.  Every
   intermediate router along the path forwards the PATH Message to the
   next hop determined by the routing protocol. Upon receiving a PATH
   Message, the receiver responds with a RESV Message to request
   resources for the flow. The RESV Message travels toward the sender
   along the reverse of the path taken by the PATH Message.  Every
   intermediate router along the path can accept or reject the request
   of the RESV Message.  If the request is rejected, the router sends an
   error message to the receiver, and the signaling process terminates.
   If the request is accepted, link bandwidth and buffer space are
   allocated for the flow and the related flow state information is
   installed in the router.
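
   The hop-by-hop handling of the RESV Message can be sketched as
   follows.  This is a simplified model under stated assumptions: each
   router is represented only by a single bandwidth figure, the
   function name is hypothetical, and the refresh and timeout of soft
   state (including removal of state already installed when a later
   hop rejects the request) is not modeled.

```python
from typing import Dict, List, Tuple


def process_resv(
    path_routers: List[str],
    available_bw: Dict[str, float],
    requested_bw: float,
) -> Tuple[bool, List[str]]:
    """Walk the RESV Message from the receiver back toward the sender.

    `path_routers` lists the routers in the order the PATH Message
    visited them. Returns (accepted, routers that installed state).
    """
    installed: List[str] = []
    for router in reversed(path_routers):  # opposite direction to PATH
        if available_bw[router] < requested_bw:
            # Rejected: an error is returned to the receiver and the
            # signaling process stops at this router.
            return False, installed
        available_bw[router] -= requested_bw  # allocate resources
        installed.append(router)              # install flow state
    return True, installed
```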

   One of the issues with the original RSVP specification was
   scalability, because reservations were required for micro-flows, so
   that the amount of state maintained in the network increases linearly
   with the number of micro-flows. Recently, however, RSVP has been
   modified and extended in several ways to overcome the scaling
   problems and to enable it to become a versatile signaling protocol
   for IP networks. For example, RSVP has been extended to reserve
   resources for aggregations of flows, to set up MPLS explicit label
   switched paths, and to perform other signaling functions within the
   Internet.


4.5.3 Differentiated Services


   The essence of the Differentiated Services (Diffserv) effort within
   the IETF is to allow traffic to be categorized and divided into
   classes, and subsequently to allow each class to be treated
   differently, especially during times when there is a shortage of
   resources such as link bandwidth and buffer space [RFC-2475].

   Diffserv defines the Differentiated Services field (DS field,
   formerly known as TOS octet) and uses it to indicate the forwarding
   treatment a packet should receive [RFC-2474]. Diffserv also
   standardizes a number of Per-Hop Behavior (PHB) groups. Using
   different classification, policing, shaping and scheduling rules,
   several classes of services can be defined.
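
   As a small illustration, the codepoint can be extracted from the DS
   field as shown below.  The sketch relies on the layout given in
   [RFC-2474], in which the six most significant bits of the octet
   carry the codepoint and the remaining two bits are currently
   unused; the function names are hypothetical.

```python
def dscp_from_ds_field(ds_field: int) -> int:
    """Return the 6-bit codepoint carried in the DS field octet."""
    if not 0 <= ds_field <= 0xFF:
        raise ValueError("the DS field is a single octet")
    return ds_field >> 2


def ds_field_from_dscp(dscp: int, low_bits: int = 0) -> int:
    """Build the DS field octet from a codepoint and the two low bits."""
    if not 0 <= dscp <= 0x3F or not 0 <= low_bits <= 0x3:
        raise ValueError("codepoint is 6 bits; low bits are 2 bits")
    return (dscp << 2) | low_bits
```

   For example, a DS field value of 0xB8 yields codepoint 46 (binary
   101110).  A router would then use the codepoint to select the
   forwarding treatment for the packet.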

   In order for a customer to receive Differentiated Services from its
   Internet Service Provider (ISP), it may be necessary for the customer
   to have a Service Level Agreement (SLA) with the ISP. An SLA may
   explicitly or implicitly specify a Traffic Conditioning Agreement
   (TCA) which defines classifier rules as well as metering, marking,
   discarding, and shaping rules.

   At the ingress to a Diffserv network, packets are classified,
   policed, and possibly shaped. When a packet traverses the boundary
   between different Diffserv domains, the DS field of the packet may be
   re-marked according to existing agreements between the domains.

   In Differentiated Services, there are only a finite and limited
   number of service classes that can be indicated by the DS field. The
   main advantage of the Diffserv approach is scalability: Since
   resources are allocated on a per-class basis, the amount of state
   information is proportional to the number of classes rather than to
   the number of application flows.

   It should be evident from the above discussion that the Diffserv
   model essentially deals with traffic management issues on a per hop
   basis. Thus, the Diffserv control model consists of a collection of
   micro-TE control mechanisms. Other traffic engineering capabilities
   such as capacity management, including routing control, are also
   required in Diffserv networks in order to deliver acceptable service
   quality.


4.5.4 MPLS


   MPLS is an advanced forwarding scheme which also includes extensions
   to conventional IP control plane protocols. MPLS extends the Internet
   routing model, and enhances packet forwarding and path control
   [RoVC].

   Each MPLS packet has a fixed-length label affixed to the header. In a
   non-ATM/FR environment, the header contains a 20-bit label, a 3-bit
   Experimental field (formerly known as Class-of-Service or CoS field),
   a 1-bit label stack indicator and an 8-bit TTL field. In an ATM (FR)
   environment, the header contains only a label encoded in the VCI/VPI
   (DLCI) field.  An MPLS capable router, termed Label Switching Router
   (LSR), examines the label and possibly the experimental field in
   forwarding a packet.
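
   The 32-bit header used in a non-ATM/FR environment can be packed
   and unpacked as in the following sketch.  The fields are placed in
   the order listed above, with the label in the most significant
   bits and the header in network byte order; the function names are
   hypothetical.

```python
from typing import Tuple


def pack_mpls_header(label: int, exp: int, bottom: int, ttl: int) -> bytes:
    """Encode label (20 bits), Experimental (3), stack bit (1), TTL (8)."""
    if not (0 <= label < 1 << 20 and 0 <= exp < 8
            and bottom in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field value out of range")
    word = (label << 12) | (exp << 9) | (bottom << 8) | ttl
    return word.to_bytes(4, "big")


def unpack_mpls_header(header: bytes) -> Tuple[int, int, int, int]:
    """Decode a 4-octet header into (label, exp, bottom, ttl)."""
    if len(header) != 4:
        raise ValueError("the header is exactly four octets")
    word = int.from_bytes(header, "big")
    return word >> 12, (word >> 9) & 0x7, (word >> 8) & 0x1, word & 0xFF
```

   For example, label 16 with the stack indicator set and a TTL of 64
   encodes to the octets 00 01 01 40 (hexadecimal).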

   At the ingress LSRs of an MPLS-capable domain, IP packets are
   classified into forwarding equivalence classes (FECs) and routed
   based on a variety of factors, including, for example, a combination
   of the information carried in the IP header of the packets and the
   local routing information maintained by the LSRs. An MPLS header is
   then prepended to each packet according to its forwarding
   equivalence class. Within an MPLS-capable domain, an LSR uses the
   label prepended to a packet as the index into a local next hop
   label forwarding entry (NHLFE). The packet is then processed as
   specified in the NHLFE. The incoming label may be replaced by an
   outgoing label and the packet may be switched to the next LSR. This
   label-switching process is very similar to the label (VCI/VPI)
   swapping process in ATM networks.  Before a packet leaves an MPLS
   domain, its MPLS header is removed. The path that a FEC traverses
   between an ingress LSR and an egress LSR is called a Label Switched
   Path (LSP). The path of an explicit LSP is defined at
   the originating (ingress) node of the LSP. MPLS can use a signaling
   protocol such as RSVP or LDP to set up LSPs.
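
   The label swapping process can be sketched as a table lookup at
   each LSR.  The NHLFE is reduced here to a next hop and an outgoing
   label, where an outgoing label of None stands for removal of the
   MPLS header; this representation and all names are assumptions
   made for the example.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class NHLFE(NamedTuple):
    next_hop: str
    out_label: Optional[int]  # None: remove the MPLS header


# LSR name -> {incoming label -> NHLFE}; hypothetical layout
LabelTables = Dict[str, Dict[int, NHLFE]]


def follow_lsp(
    first_lsr: str, first_label: int, tables: LabelTables
) -> Tuple[List[Tuple[str, int, Optional[int]]], str]:
    """Trace a labeled packet; return the swaps made and where it exits.

    Each element of the returned list is (LSR, incoming label,
    outgoing label). The sketch assumes the tables contain no loops.
    """
    swaps: List[Tuple[str, int, Optional[int]]] = []
    lsr: str = first_lsr
    label: Optional[int] = first_label
    while label is not None:
        entry = tables[lsr][label]  # the label indexes the local table
        swaps.append((lsr, label, entry.out_label))
        lsr, label = entry.next_hop, entry.out_label
    return swaps, lsr
```

   No IP header lookup takes place inside the loop, which reflects the
   fact that interior LSRs forward on the label alone.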

   MPLS is a very powerful technology for Internet traffic engineering
   because it supports explicit LSPs which allow constraint-based
   routing to be implemented efficiently in IP networks.


4.5.5 IP Performance Metrics


   The IPPM WG has been developing a set of standard metrics that can be
   applied to the quality, performance, and reliability of Internet
   services by network operators, end users, or independent testing
   groups [RFC2330], so that users and service providers have an
   accurate common understanding of the performance and reliability of
   the Internet component 'clouds' that they use/provide.  Examples of
   performance metrics include one-way packet loss [RFC2680], one-way
   delay [RFC2679], and connectivity measures between two nodes
   [RFC2678]. Other metrics include second-order measures of packet loss
   and delay.

   Performance metrics are useful for specifying Service Level
   Agreements (SLAs), which are sets of service level objectives
   negotiated between users and service providers, where each objective
   is a combination of one or more performance metrics subject to
   constraints.


4.5.6 Flow Measurement


   A flow measurement system enables network traffic flows to be
   measured and analyzed at the flow level for a variety of purposes.
   The RTFM WG has produced an architecture document that defines a
   method to specify traffic flows, and a number of components (meters,
   meter readers, manager) to measure traffic flows [RFC-2722].  A meter
   observes packets passing through a measurement point, classifies them
   into certain groups, accumulates certain usage data such as the
   number of packets and bytes for each group, and stores the usage data
   in a flow table.  For this purpose, a group may represent a user
   application, a host, a network, a group of networks, any combination
   of the above, etc.  A meter reader gathers usage data from various
   meters so that it can be made available for analysis.  A manager is
   responsible for configuring and controlling meters and meter readers.
   The instructions received by a meter from a manager include flow
   specification, meter control parameters, and sampling techniques.
   The instructions received by a meter reader from a manager include
   the address of the meter whose data is to be collected, the frequency
   of data collection, and the types of flows to be collected.
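
   The division of labor between meters and a meter reader can be
   illustrated with the sketch below.  The packet representation, the
   class and function names, and the choice of classification rule
   are assumptions made for the example and are not taken from
   [RFC-2722].

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, Tuple

Packet = Tuple[str, str, int]  # (source, destination, length in bytes)
Usage = Dict[str, int]


def _new_usage() -> Usage:
    return {"packets": 0, "bytes": 0}


class Meter:
    """Classifies observed packets into groups and accumulates usage."""

    def __init__(self, classify: Callable[[Packet], Hashable]) -> None:
        self.classify = classify  # rule supplied by the manager
        self.flow_table: Dict[Hashable, Usage] = defaultdict(_new_usage)

    def observe(self, packet: Packet) -> None:
        usage = self.flow_table[self.classify(packet)]
        usage["packets"] += 1
        usage["bytes"] += packet[2]


def read_meters(meters: Iterable[Meter]) -> Dict[Hashable, Usage]:
    """Meter reader: gather and merge usage data from several meters."""
    totals: Dict[Hashable, Usage] = defaultdict(_new_usage)
    for meter in meters:
        for group, usage in meter.flow_table.items():
            totals[group]["packets"] += usage["packets"]
            totals[group]["bytes"] += usage["bytes"]
    return dict(totals)
```

   Here the classification rule plays the role of the flow
   specification: passing a rule that returns the source host groups
   traffic by host, while a rule that returns an address prefix would
   group it by network.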


4.5.7 Endpoint Congestion Management


   The work in endpoint congestion management is intended to catalog a
   set of congestion control mechanisms that transport protocols can
   use, and to develop a unified congestion control mechanism across a
   subset of an endpoint's active unicast connections called a
   congestion group.  A congestion manager continuously monitors the
   state of the path for each congestion group under its control, and
   uses that information to instruct a scheduler on how to partition
   bandwidth among the connections of that congestion group.


4.6 Overview of ITU Activities Related to Traffic Engineering


   This section provides an overview of prior work within the ITU-T
   pertaining to traffic engineering in traditional telecommunications
   networks.

   ITU-T Recommendations E.600 [itu-e600], E.701 [itu-e701], and E.801
   [itu-e801] address traffic engineering issues in traditional
   telecommunications networks. Recommendation E.600 provides a
   vocabulary for describing traffic engineering concepts, while E.701
   defines reference connections, Grade of Service (GoS), and traffic
   parameters for ISDN.  Recommendation E.701 uses the concept of a
   reference connection to identify representative cases of different
   types of connections without describing the specifics of their actual
   realizations by different physical means. As defined in
   Recommendation E.600, "a connection is an association of resources
   providing means for communication between two or more devices in, or
   attached to, a telecommunication network."  Also, E.600 defines "a
   resource as any set of physically or conceptually identifiable
   entities within a telecommunication network, the use of which can be
   unambiguously determined" [itu-e600].  There can be different types
   of connections as the number and types of resources in a connection
   may vary.

   Typically, different network segments are involved in the path of a
   connection.  For example, a connection may be local, national, or
   international.  The purposes of reference connections are to clarify
   and specify traffic performance issues at various interfaces between
   different network domains.  Each domain may consist of one or more
   service provider networks.

   Reference connections provide a basis to define grade of service
   (GoS) parameters related to traffic engineering within the ITU-T
   framework.  As defined in E.600, "GoS refers to a number of traffic
   engineering variables which are used to provide a measure of the
   adequacy of a group of resources under specified conditions."  These
   GoS variables may be probability of loss, dial tone delay, etc.  They
   are essential for network internal design and operation, as well as
   component performance specification.

   In the ITU framework, GoS is different from quality of service (QoS).
   QoS is the performance perceivable by a user of a telecommunication
   service and expresses the user's degree of satisfaction with the
   service.  GoS is a set of network oriented measures which
   characterize the adequacy of a group of resources under specified
   conditions. On the other hand, QoS parameters focus on performance
   aspects which are observable at the service access points and network
   interfaces, rather than their causes within the network.  For a
   network to be effective in serving its users, the values of both GoS
   and QoS parameters must be related, with GoS parameters typically
   making a major contribution to the QoS.

   To assist the network provider in the goal of improving efficiency
   and effectiveness of the network, E.600 stipulates that a set of GoS
   parameters must be selected and defined on an end-to-end basis for
   each major service category provided by a network.  Based on a
   selected set of reference connections, suitable target values are
   then assigned to the selected GoS parameters, under normal and high
   load conditions.  These end-to-end GoS target values are then
   apportioned to individual resource components of the reference
   connections for dimensioning purposes.


5.0 Taxonomy of Traffic Engineering Systems


   This section presents a short taxonomy of traffic engineering
   systems. A taxonomy of traffic engineering systems can be constructed
   based on traffic engineering styles and traffic engineering views.
   Such a classification system is shown below:

    - Time-dependent vs State-dependent
    - Offline vs Online
    - Centralized vs Distributed
    - Local vs Global Information
    - Prescriptive vs Descriptive
    - Open Loop vs Closed Loop

   In the following subsections, these classification systems are
   described in greater detail.


5.1 Time-Dependent Versus State-Dependent


   TE methodologies can be classified into two basic types:  time-
   dependent or state-dependent.  In this framework, all TE schemes are
   considered to be dynamic.  Static TE implies that no traffic
   engineering methodology or algorithm is being applied.

   In time-dependent TE, historical information based on seasonal
   variations in traffic is used to pre-program routing plans.
   Additionally, customer subscription or traffic projection may be
   used.  Pre-programmed routing plans typically change on a relatively
   long time scale (e.g., diurnal). Time-dependent algorithms make no
   attempt to adapt to random variations in traffic or changing network
   conditions. An example of a time-dependent algorithm is a global
   centralized optimizer whose inputs are a traffic matrix and
   multiclass QoS requirements, as described in [MR99].

   State-dependent or adaptive TE adapts the routing plans for packets
   based on the current state of the network. The current state of the
   network gives additional information on variations in actual traffic
   (i.e., perturbations from regular variations) that could not be
   predicted by using historical information.  An example of state-
   dependent TE that operates on a relatively long time scale is
   constraint-based routing, and an example that operates on a
   relatively short time scale is a load-balancing algorithm described
   in [OMP] and [MATE].

   The state of the network can be based on various parameters such as
   utilization, packet delay, packet loss, etc. These parameters in turn
   can be obtained in several ways. For example, each router may flood
   these parameters periodically or by means of some kind of trigger to
   other routers.  An alternative approach is for a router that wants
   to perform adaptive TE to send probe packets along a path to gather
   the state of that path.  Yet another approach is to have a
   management system gather MIB information from the interfaces.
   Because of the dynamic nature of network conditions, expeditious and
   accurate gathering of state information is typically critical to
   adaptive TE.  State-dependent algorithms may be applied
   to increase network efficiency and resilience. While time-dependent
   algorithms are more suitable for predictable traffic variations,
   state-dependent algorithms are more suitable for adapting to the
   prevailing state of the network.


5.2 Offline Versus Online


   Traffic engineering requires the computation of routing plans.  The
   computation itself may be done offline or online.  For scenarios
   where the routing plans do not need to be executed in real time, the
   computation can be done offline.  As an example, routing plans
   computed from forecast information may be computed offline.
   Typically, offline computation is also used to perform extensive
   search on multi-dimensional space.

   Online computation is required when the routing plans need to adapt
   to changing network conditions as in state-dependent algorithms.
   Unlike offline computation which can be computationally demanding,
   online computation is geared toward simple calculations to fine-tune
   the allocations of resources such as load balancing.


5.3 Centralized Versus Distributed


   With centralized control, there is a central authority which
   determines routing plans on behalf of each router.  The central
   authority collects the network-state information from all routers,
   and returns the routing information to the routers periodically.  The
   routing update cycle is a critical parameter which directly impacts
   the performance of the network being controlled.  Centralized control
   may need high processing power and high bandwidth control channels.

   With distributed control, route selection is determined by each
   router autonomously based on the state of the network.  The network
   state may be obtained by the router using some probing method, or
   distributed by other routers on a periodic basis.


5.4 Local Versus Global


   TE algorithms may require local or global network-state information.
   Note that the scope of the network-state information does not
   necessarily refer to the scope of the optimization. In other words,
   it is possible for a TE algorithm to perform global optimization
   based on local state information. Similarly, a TE algorithm may
   arrive at a locally optimal solution even if it relies on global
   state information.

   Global information pertains to the state of the entire domain that is
   being traffic engineered. Examples include traffic matrix, or loading
   information on each link.  Global state information is typically
   required with centralized control.  In some cases, distributed-
   controlled TEs may also need global information.

   Local information pertains to the state of a portion of the domain.
   Examples include the bandwidth and packet loss rate of a particular
   path.  Local state information may be sufficient for distributed-
   controlled TEs.


5.5 Prescriptive Versus Descriptive


   Prescriptive traffic engineering evaluates alternatives and
   recommends a course of action. Prescriptive traffic engineering can
   be further categorized as either corrective or perfective. Corrective
   TE prescribes a course of action to address an existing or predicted
   anomaly. Perfective TE prescribes a course of action to evolve and
   improve network performance even when no anomalies are evident.

   Descriptive traffic engineering characterizes the state of the
   network and assesses the impact of various policies without
   recommending any particular course of action.


5.6 Open-Loop Versus Closed-Loop


   Open-loop control is where the control action does not use any
   feedback information from the current network state. The control
   action may, however, use its own local information for accounting
   purposes.

   Closed-loop control is where the control action utilizes feedback
   information from the network state. The feedback information may be
   in the form of historical information or current measurements.


6.0 Requirements for Internet Traffic Engineering


   This section describes some high-level requirements and
   recommendations for traffic engineering in the Internet. Because this
   is a framework document, these requirements are presented in very
   general terms. Additional documents to follow may elaborate on
   specific aspects of these requirements in greater detail.

   [NOTE: THIS SECTION IS AN INITIAL VERSION OF THE HIGH LEVEL TE
   REQUIREMENTS. IT WILL BE REVISED OVER TIME TO EXTEND AND REFINE IT.]


6.1 Generic Requirements


   Usability:  In general, it is desirable to have a TE system that can
   be readily deployed in an existing network. It is also desirable to
   have a TE system that is easy to operate and maintain.


   Automation:  Whenever feasible, a TE system should automate the
   traffic engineering functions to minimize operator intervention in
   the control of operational networks.

   Scalability:  Contemporary public networks are growing very fast with
   respect to network size and traffic volume.  Therefore, a TE system
   SHOULD be scalable to remain applicable as the network evolves. In
   particular, a TE system SHOULD remain functional as the network
   expands with regard to the number of routers and links, and with
   respect to the traffic volume.  A TE system SHOULD have a scalable
   architecture, SHOULD NOT adversely impair other functions and
   processes in a network element, and SHOULD NOT consume excessive
   network resources when collecting and distributing state information
   or when exerting control.

   Stability:  Stability is a very important consideration in traffic
   engineering systems that respond to changes in the state of the
   network.  State-dependent traffic engineering methodologies typically
   mandate a tradeoff between responsiveness and stability.  It is
   strongly RECOMMENDED that, when tradeoffs are warranted between
   responsiveness and stability, the tradeoff be made in favor of
   stability (especially in public IP backbone networks).
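
   The tradeoff between responsiveness and stability can be illustrated
   with a minimal Python sketch. It assumes a single trunk that may use
   either of two paths and a hypothetical hysteresis margin; neither the
   function name nor the margin is defined by this framework.

```python
def count_reroutes(samples, margin):
    """Count how often a trunk switches between two candidate paths.

    samples: list of (util_path0, util_path1) utilization measurements.
    margin:  hypothetical hysteresis threshold (an assumption of this
             sketch); the trunk moves only when the other path is less
             utilized by more than this amount.
    """
    current = 0          # index of the path currently carrying the trunk
    reroutes = 0
    for utils in samples:
        other = 1 - current
        if utils[current] - utils[other] > margin:
            current = other
            reroutes += 1
    return reroutes


# Two paths whose measured utilization fluctuates slightly, followed by
# one sample in which path 0 is clearly overloaded.
samples = [(0.60, 0.55), (0.55, 0.60), (0.62, 0.57), (0.56, 0.61),
           (0.90, 0.40)]

print(count_reroutes(samples, margin=0.0))   # reacts to every fluctuation
print(count_reroutes(samples, margin=0.2))   # reacts only to the large shift
```

   With no margin the trunk oscillates on every sample; with a margin it
   moves only once, at the cost of reacting later to small imbalances.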

   Flexibility:  A TE system SHOULD be flexible to allow for changes in
   optimization policy. In particular, a TE system SHOULD provide
   sufficient configuration options so that a network administrator can
   tailor the TE system to a particular environment.  It may also be
   desirable to have both online and offline TE subsystems which can be
   independently enabled and disabled.  In multiclass networks, TE
   systems SHOULD also have options that support class based performance
   optimization.

   Observability:  As part of the TE system, mechanisms SHOULD exist to
   collect statistics from the network and to analyze them to determine
   how well the network is functioning.  Derived statistics such as
   traffic matrices, link utilization, latency, packet loss, and other
   performance measures of interest which are derived from network
   measurements can be used as indicators of prevailing network
   conditions.  Other examples of status information which should be
   observed include existing functional routes and, in the context of
   MPLS, existing LSP routes.

   Simplicity:  Generally, a TE system should be as simple as possible
   consistent with the intended applications. More importantly, the TE
   system should be relatively easy to use (i.e., clean, convenient, and
   intuitive user interfaces).  Simplicity in user interface does not
   necessarily imply that the TE system will use naive algorithms. Even
   when complex algorithms and internal structures are used, such
   complexities should be hidden as much as possible from the network
   administrator through the user interface.

   Congestion management: A TE system SHOULD map the traffic onto the
   network to minimize congestion.  If the total traffic load cannot be
   accommodated, then a TE system may rely on short time scale
   congestion control mechanisms to mitigate congestion.  A TE system
   SHOULD be compatible with and complement existing congestion control
   mechanisms.  It is generally desirable to minimize the maximum
   resource utilization per service in an operational network.  The use
   of trunk reservation techniques may also be useful in some
   situations.
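
   As a rough illustration of the "minimize the maximum resource
   utilization" objective, the following Python sketch evaluates two
   candidate traffic mappings. The topology, trunk names, and the
   function max_link_utilization are assumptions made for this example.

```python
def max_link_utilization(capacity, routes, demand):
    """Return the highest load/capacity ratio over all links.

    capacity: {link: capacity}
    routes:   {trunk: [link, ...]} (the mapping being evaluated)
    demand:   {trunk: offered load}, in the same units as capacity
    """
    load = {link: 0.0 for link in capacity}
    for trunk, path in routes.items():
        for link in path:
            load[link] += demand[trunk]
    return max(load[link] / capacity[link] for link in capacity)


capacity = {("A", "C"): 100, ("A", "B"): 100, ("B", "C"): 100}
demand = {"t1": 60, "t2": 60}

# Both trunks placed on the direct link A-C.
shortest_only = {"t1": [("A", "C")], "t2": [("A", "C")]}
# Trunk t2 moved to the longer path A-B-C.
spread = {"t1": [("A", "C")], "t2": [("A", "B"), ("B", "C")]}

print(max_link_utilization(capacity, shortest_only, demand))
print(max_link_utilization(capacity, spread, demand))
```

   A TE system pursuing this objective would prefer the second mapping,
   since its most heavily loaded link stays below capacity.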

   Survivability: It is critical for an operational network to recover
   promptly from network failures and to maintain the required QoS for
   existing services.  Survivability generally mandates introducing
   redundancy into the architecture, design, and operation of networks.
   There is a tradeoff between the level of survivability that can be
   attained and the cost required to attain it. The time required to
   restore a network service from a failure depends on several factors,
   including the particular context in which the failure occurred, the
   architecture and design of the network, the characteristics of the
   network elements and network protocols, and the applications and
   services that were impacted by the failure. The extent and impact of
   service disruptions due to a network failure or outage can vary
   depending on the length of the outage, the part of the network where
   the failure occurred, the type and criticality of the network
   resources that were impaired by the failure, and the types of
   services that were impacted by the failure (e.g., voice quality
   degradation may be tolerable for an inexpensive VoIP service, but
   not for a toll-quality VoIP service). Survivability can be
   addressed at the device level by developing network elements that are
   more reliable, and at the network level by incorporating redundancy
   into the architecture, design, and operation of networks.  It is
   recommended that a philosophy of robustness and survivability
   be adopted in the architecture, design, and operation of IP networks
   (especially public IP networks) and network elements. At the same
   time, because different contexts may demand different levels of
   survivability, the mechanisms developed to support network
   survivability should be flexible so that they can be tailored to
   different needs.


6.2 Routing Requirements


   [NOTE: THIS SECTION IS STILL WORK IN PROGRESS]

   Routing control is one of the most significant aspects of Internet
   traffic engineering.  Traditional IGPs which are based on shortest
   path algorithms have limited control capabilities for traffic
   engineering.  These limitations include:

   1. The well-known issues with shortest path protocols. Since IGPs
   always use the shortest paths to forward traffic, load sharing cannot
   be done among paths of different costs.  Using shortest paths to
   forward traffic conserves network resources, but it may cause the
   following problems: 1) If traffic from a source to a destination
   exceeds the capacity of the shortest path, the shortest path will
   become congested while a longer path between these two nodes is
   under-utilized; 2) the shortest paths from different sources can
   overlap at some links. If the total traffic from different sources
   exceeds the capacity of any of these links, congestion will occur.
   Such problems occur because traffic demand changes over time but
   network topology cannot be changed as rapidly, causing the network
   architecture to become suboptimal over time.

   2. Equal-Cost Multi-Path (ECMP) supports sharing of traffic among
   equal cost paths between two nodes. However, ECMP attempts to divide
   the traffic as equally as possible among the equal cost shortest
   paths. Generally, ECMP does not support configurable load splitting
   ratios among equal cost paths.  The result is that in the aggregate,
   one of the paths may carry significantly more traffic than other
   paths because it may also carry traffic from other sources.

   3. Modifying IGP metrics to control traffic distribution tends to
   have network-wide effects. Consequently, undesirable and
   unanticipated traffic shifts can be triggered.

   Because of these limitations, new capabilities are needed to control
   the routing function in IP networks.  Some of these capabilities are
   described below.

   Constraint-based routing is highly desirable in IP networks,
   especially public IP backbones with complex topologies [AWD1].
   Constraint-based routing computes routes that fulfil some
   requirements subject to constraints.  Constraints may include
   bandwidth, hop count, delay, and administrative policy instruments
   such as resource class attributes [AWD1, RFC-2386].  This makes it
   possible to select routes that satisfy a given set of requirements
   subject to network and administrative policy constraints. Routes
   computed through constraint-based routing are not necessarily the
   shortest paths. Constraint-based routing works best with path
   oriented technologies that support explicit routing, such as MPLS.
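
   One common way to realize constraint-based route computation is to
   prune the links that violate the constraints and then run a shortest
   path computation on what remains. The Python sketch below follows
   that approach; the link tuple layout, the function name cspf, and the
   sample topology are assumptions made for illustration and are not
   defined by this framework.

```python
import heapq


def cspf(links, src, dst, bandwidth, excluded_classes=()):
    """Return (cost, path) of the cheapest feasible route, or None.

    links: iterable of (from_node, to_node, cost, reservable_bw,
           resource_class) tuples describing unidirectional links.
    """
    # Step 1: prune links that cannot satisfy the constraints.
    adjacency = {}
    for u, v, cost, reservable_bw, resource_class in links:
        if reservable_bw < bandwidth or resource_class in excluded_classes:
            continue
        adjacency.setdefault(u, []).append((v, cost))

    # Step 2: ordinary Dijkstra on the pruned topology.
    heap = [(0, src, [src])]
    visited = set()
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == dst:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, cost in adjacency.get(node, []):
            if nxt not in visited:
                heapq.heappush(heap, (dist + cost, nxt, path + [nxt]))
    return None


links = [
    ("A", "B", 1, 50, "gold"),
    ("B", "D", 1, 50, "gold"),
    ("A", "C", 2, 200, "gold"),
    ("C", "D", 2, 200, "silver"),
]

print(cspf(links, "A", "D", bandwidth=10))
print(cspf(links, "A", "D", bandwidth=100))
print(cspf(links, "A", "D", bandwidth=100, excluded_classes={"silver"}))
```

   The second request cannot use the shortest path because its links
   lack reservable bandwidth, so a longer route is returned; the third
   request has no feasible route at all.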

   Constraint-based routing can also be used as a means to redistribute
   traffic onto the infrastructure, even for best effort traffic.  For
   example, if the bandwidth constraints of the paths and the reservable
   bandwidth of the links are set properly, the congestion caused by
   uneven traffic distribution as described above can be avoided.  The
   performance and resource efficiency of the network are thus improved.

   In order to compute routes subject to constraints, a number of
   enhancements are needed to conventional link state IGPs such as OSPF
   and IS-IS. The basic extensions required are outlined in [Li-IGP].
   Specializations of these requirements to OSPF were described in
   [KATZ] and to IS-IS in [SMIT].  Essentially, these enhancements
   require the propagation of additional information in link state
   advertisements. Specifically, in addition to normal link-state
   information, an enhanced IGP is required to propagate additional
   topology state information that is needed for constraint-based
   routing. The additional topology state information includes link
   attributes such as: 1) reservable bandwidth, and 2) the link
   resource class attribute, which is an administratively specified
   property of the link. The resource class attribute concept was
   defined in [AWD1].  The additional topology state information is
   carried in new TLVs or sub-TLVs in IS-IS, or in the Opaque LSA in
   OSPF [SMIT, KATZ].

   An enhanced link-state IGP may flood information more frequently than
   a normal IGP. This is because even without changes in topology,
   changes in reservable bandwidth or link affinity can trigger the
   enhanced IGP to initiate flooding.  In order to avoid consuming
   excessive link bandwidth and computational resources, a tradeoff is
   typically required between the timeliness of the information flooded
   and the flooding frequency.
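
   One possible way to manage this tradeoff is to re-advertise a link
   only when its reservable bandwidth has changed significantly since
   the last advertisement. The Python sketch below assumes a threshold
   expressed as a fraction of link capacity; the threshold value and the
   function name are hypothetical and are not taken from [SMIT] or
   [KATZ].

```python
def advertisements(bw_samples, link_capacity, threshold=0.1):
    """Return the reservable-bandwidth values that would be flooded.

    bw_samples:    successive reservable bandwidth values for one link.
    link_capacity: capacity of the link, in the same units.
    threshold:     hypothetical fraction of capacity by which the value
                   must change before a new advertisement is triggered.
    """
    advertised = [bw_samples[0]]          # initial advertisement
    for bw in bw_samples[1:]:
        if abs(bw - advertised[-1]) >= threshold * link_capacity:
            advertised.append(bw)
    return advertised


samples = [100, 98, 95, 80, 78, 60, 61]
print(advertisements(samples, link_capacity=100))
```

   A larger threshold reduces flooding but lets the advertised value
   drift further from the actual state of the link.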

   In a TE system, it is also desirable for the routing subsystem to
   make the load splitting ratio among multiple paths (with equal or
   different cost) configurable.  This capability gives network
   administrators more flexibility in controlling traffic distribution,
   and can be very useful for avoiding/relieving congestion in some
   situations. Examples can be found in [XIAO].
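
   A configurable split can be approximated by hashing each flow onto
   the paths in proportion to administratively assigned integer weights,
   which keeps the packets of a flow on a single path. The Python sketch
   below is only an illustration; the path names, the weights, and the
   pick_path function are assumptions and do not reproduce the examples
   in [XIAO].

```python
import hashlib


def pick_path(flow_id, paths):
    """Map a flow onto one path in proportion to the configured weights.

    flow_id: string identifying the flow (e.g., derived from addresses
             and ports).
    paths:   list of (path_name, integer_weight) pairs.
    """
    total = sum(weight for _, weight in paths)
    digest = hashlib.sha256(flow_id.encode()).digest()
    point = int.from_bytes(digest[:8], "big") % total
    for name, weight in paths:
        if point < weight:
            return name
        point -= weight


# A 3:1 split between two paths of possibly different cost.
paths = [("primary", 3), ("secondary", 1)]
picks = [pick_path(f"flow-{i}", paths) for i in range(10000)]
share = picks.count("primary") / len(picks)
print(round(share, 2))
```

   Because the split is applied per flow, the realized ratio approaches
   the configured one only when many flows of comparable size are
   present.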

   Another desirable feature of the routing system is the capability to
   control the routes of subsets of traffic without affecting the routes
   of other traffic, provided that sufficient resources exist for this
   purpose. This capability allows more refined control over the
   distribution of traffic across the network.  For example, the
   capability to move traffic from a source to a destination away from
   its original path to another path without affecting the paths of
   other traffic allows traffic to be moved from resource-poor network
   segments to resource-rich segments. Path oriented technologies such
   as MPLS support this capability naturally.

   If the network supports multiple classes of service, the routing
   subsystem SHOULD have the capability to select different paths for
   different classes of traffic.


6.3 Traffic Mapping Requirements


   Traffic mapping pertains to the assignment of the traffic to the
   network topology to meet certain requirements and optimize resource
   usage.  Traffic mapping can be performed by time-dependent or state-
   dependent mechanisms, as described in Section 5.1.  A TE system
   SHOULD support both time-dependent and state-dependent mechanisms.

   For the time-dependent mechanism:
       - a TE system SHOULD maintain traffic matrices.
       - a TE system SHOULD have an algorithm that generates a mapping
         plan for each traffic trunk.
       - In certain environments (e.g., MPLS) a TE system SHOULD be
         able to control the path from any source to any destination;
         e.g., with explicit routing.
       - a TE system SHOULD be able to set up multiple paths to forward
         traffic from any source to any destination, and distribute the
         traffic among them based on a configurable traffic split.
       - a TE system SHOULD provide a graceful migration from one
         mapping plan to another as the traffic matrix changes to
         minimize service disruption.

   For the state-dependent mechanism:
       - a TE system SHOULD be able to gather and maintain link state
         information, for example, by using enhanced OSPF or IS-IS.
       - for a given demand request, QoS requirements, and other
         constraints, a TE system SHOULD be able to compute and set up a
         path, for example, by using constraint-based routing.
       - a TE system SHOULD be able to perform load balancing among
         multiple paths. Load balancing SHOULD NOT compromise the
         stability of the network.

   In general, a TE system SHOULD support modification of IGP link
   metrics to induce changes in the traffic mapping patterns.


6.4 Measurement Requirements


   The importance of measurement in traffic engineering has been stated
   previously.  In order to support the traffic engineering function,
   mechanisms SHOULD be provided to measure and collect statistics from
   the network.  Additional capabilities may be provided to help in the
   analysis of the statistics.  The actions of these mechanisms SHOULD
   NOT adversely affect the accuracy and integrity of the statistics
   collected. The mechanisms for statistical data acquisition SHOULD
   also be able to scale as the network evolves.

   Traffic statistics may be classified according to time scales, which
   may be long-term or short-term.  Long-term traffic statistics are
   very useful for traffic engineering. Long-term time scale traffic
   statistics MAY capture or reflect seasonality in network workload
   (e.g., hourly, daily, and weekly variations in traffic profiles).
   For a network that supports multiple classes of service, aspects of
   the monitored traffic statistics MAY also reflect class of service
   characteristics.  Analysis of the long-term traffic statistics MAY
   yield secondary statistics such as busy hour characteristics, traffic
   growth patterns, persistent congestion and hot-spot problems within
   the network, and imbalances in link utilization caused by routing
   anomalies.
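
   As an example of deriving a secondary statistic, the Python sketch
   below computes a busy hour from long-term rate samples. The sample
   format and the busy_hour function are assumptions made for this
   illustration; real measurement systems may define the busy hour
   differently.

```python
from collections import defaultdict


def busy_hour(samples):
    """Return (hour_of_day, mean_rate) for the hour with the highest
    mean rate.

    samples: iterable of (hour_of_day, rate) pairs collected over many
             days, with hour_of_day in the range 0..23.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for hour, rate in samples:
        totals[hour] += rate
        counts[hour] += 1
    means = {hour: totals[hour] / counts[hour] for hour in totals}
    hour = max(means, key=means.get)
    return hour, means[hour]


# Two days of measurements at three hours of the day (arbitrary units).
samples = [(9, 40.0), (14, 80.0), (20, 90.0),
           (9, 60.0), (14, 70.0), (20, 110.0)]
print(busy_hour(samples))
```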

   There SHOULD also be a mechanism for constructing traffic matrices
   for both long-term and short-term traffic statistics. In multiservice
   IP networks, the traffic matrices MAY also be constructed for
   different service classes.  Each element of a traffic matrix
   represents a statistic of traffic flow between a pair of abstract
   nodes.  An abstract node may represent a router, a collection of
   routers, or a site in a VPN.
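
   The aggregation of measurements into a traffic matrix over abstract
   nodes can be sketched as follows in Python. The flow record format,
   the router-to-node mapping, and the function name are assumptions
   made for this example.

```python
from collections import defaultdict


def build_traffic_matrix(flow_records, node_of):
    """Aggregate flow records into a traffic matrix.

    flow_records: iterable of (ingress_router, egress_router, volume).
    node_of:      {router: abstract node}, e.g., a PoP or a VPN site.
    Returns {(source_node, destination_node): total volume}.
    """
    matrix = defaultdict(float)
    for ingress, egress, volume in flow_records:
        matrix[(node_of[ingress], node_of[egress])] += volume
    return dict(matrix)


# Routers r1 and r2 are treated as one abstract node.
node_of = {"r1": "PoP-East", "r2": "PoP-East", "r3": "PoP-West"}
flow_records = [("r1", "r3", 10.0), ("r2", "r3", 5.0), ("r3", "r1", 2.0)]

matrix = build_traffic_matrix(flow_records, node_of)
print(matrix)
```

   Separate matrices per service class could be obtained by adding the
   class to the record and to the dictionary key.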

   At the short-term time scale, traffic statistics SHOULD provide
   reasonable and reliable indicators of the current state of the
   network.  In particular, some traffic statistics SHOULD reflect link
   utilization, and link and path congestion status. Examples of
   congestion indicators include excessive packet delay, packet loss,
   and high resource utilization.  Examples of mechanisms for
   distributing such information include SNMP, probing techniques,
   FTP, and IGP link state advertisements.


6.5 Network Survivability


   Network survivability refers to the capability of the network to
   maintain service continuity in the presence of failures within the
   network. This can be accomplished by promptly recovering from network
   failures and maintaining the required QoS for existing services after
   recovery. Survivability has become an issue of great concern to the
   Internet community with the increasing demands to carry mission
   critical traffic, real-time traffic, and other high priority traffic
   over the Internet. As network technologies continue to improve,
   failure protection and restoration capabilities have become
   available from multiple layers. At the bottom of the layered stack,
   optical networks are now capable of providing dynamic ring and mesh
   restoration functionality as well as traditional protection
   functionality. For instance, the SONET/SDH layer provides
   survivability capability with Automatic Protection Switching (APS),
   as well as self-healing ring and mesh architectures. Similar
   functionality is provided by layer 2 technologies such as ATM
   (generally with slower mean restoration times). At the IP layer,
   rerouting is used to restore service continuity following link and
   node outages. Rerouting at the IP layer occurs after a period of
   routing convergence, which may require seconds to minutes to
   complete. In order to support advanced survivability requirements,
   path-oriented technologies such as MPLS can be used to enhance the
   survivability of IP networks in a potentially cost effective manner.
   The advantages of path-oriented technologies such as MPLS for IP
   restoration become even more evident when class based protection and
   restoration capabilities are required.

   Recently, a common suite of control plane protocols has been proposed
   for both MPLS and optical transport networks under the acronym
   Multiprotocol Lambda Switching [AWD5]. This new paradigm of
   Multiprotocol Lambda Switching will support even more sophisticated
   mesh restoration capabilities at the optical layer for the emerging
   IP over WDM network architectures.

   Another important aspect regarding multi-layer survivability is that
   various technologies at different layers provide protection and
   restoration capabilities at different temporal granularities (i.e.,
   in terms of time scales) and at different bandwidth granularities
   (from packet-level to wavelength level). Protection and restoration
   capabilities can also be aware or unaware of different service
   classes.

   As noted previously, the impact of service outages varies
   significantly for different service classes depending on the
   effective duration of the outage.  The duration of an outage can vary
   from milliseconds (with minor service impact), to seconds (with
   possible call drops for IP telephony and session time-outs), to
   minutes and hours (with potentially considerable social and business
   impact).

   Generally, it is a challenging task to coordinate different protection
   and restoration capabilities across multiple layers in a cohesive
   manner so as to ensure that network survivability is maintained at
   reasonable cost. Protection and restoration coordination across
   layers may not always be feasible, because, for example, networks at
   different layers might belong to different administrative domains.

   In the following paragraphs, some of the general requirements for
   protection and restoration coordination are highlighted.

   - Protection/restoration capabilities from different layers SHOULD be
   coordinated whenever feasible and appropriate in order to provide
   network survivability in a flexible and cost effective manner. One
   way to achieve the coordination is to minimize function duplication
   across layers. Escalation of alarms and other fault indicators from
   lower layers to higher layers may also be performed in a coordinated
   manner. A temporal order of restoration trigger timing at different
   layers is another way to coordinate multi-layer
   protection/restoration.

   - Spare capacity at higher layers is often regarded as working
   traffic at lower layers. Placing protection/restoration functions in
   many layers may increase redundancy and robustness, but it SHOULD NOT
   result in significant and avoidable inefficiencies in network
   resource utilization.

   - It is generally desirable to have a protection/restoration scheme
   that is bandwidth efficient.

   - Failure notification throughout the network SHOULD be timely and
   reliable.

   - Alarms and other fault monitoring and reporting capabilities SHOULD
   be provided at appropriate layers.
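
   The temporal ordering of restoration triggers mentioned above can be
   pictured as a set of per-layer hold-off timers, where a higher layer
   acts only if the fault is still present when its timer expires. The
   Python sketch below is an illustration under that assumption; the
   layer names and timer values are hypothetical and are not
   requirements of this framework.

```python
def responding_layers(hold_off_ms, outage_ms):
    """Return the layers that start restoration for a given outage.

    hold_off_ms: {layer name: hold-off timer in milliseconds}; a layer
                 acts only if the fault persists beyond its timer.
    outage_ms:   how long the fault persists before it is cleared by
                 some layer or by repair.
    """
    ordered = sorted(hold_off_ms.items(), key=lambda item: item[1])
    return [name for name, timer in ordered if timer < outage_ms]


# Hypothetical timer values chosen only to illustrate the ordering.
hold_off_ms = {"optical": 0, "MPLS": 100, "IP rerouting": 1000}

print(responding_layers(hold_off_ms, outage_ms=50))
print(responding_layers(hold_off_ms, outage_ms=500))
```

   When the lowest layer clears the fault quickly, the higher layers
   never react, which avoids duplicated restoration actions.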


6.5.1 Survivability in MPLS Based Networks


   MPLS is an important emerging technology that enhances IP networks in
   terms of features and services. Because MPLS is path-oriented it can
   potentially provide faster and more predictable protection and
   restoration capabilities than conventional IP systems. This
   subsection provides an outline of some of the basic features and
   requirements of MPLS networks regarding protection and restoration. A
   number of Internet drafts also discuss protection and restoration
   issues in MPLS networks (see e.g., [ACJ99], [MSOH99], and [Shew99]).


   Protection types for MPLS networks can be categorized into link
   protection, node protection, path protection, and segment protection,
   as discussed below.

   - Link Protection: The goal of link protection is to protect an LSP
   from a given link failure. Under link protection, the path of the
   protect or backup LSP (also called secondary LSP) is disjoint from
   the path of the working or operational LSP at the particular link
   over which protection is required. When the protected link fails,
   traffic on the working LSP is switched over to the protect LSP at the
   head-end of the failed link. This is a local repair method which can
   be potentially fast. It might be more appropriate in situations where
   some network elements along a given path are less reliable than
   others.

   - Node Protection: The goal of LSP node protection is to protect an
   LSP from a given node failure. Under node protection, the path of
   the protect LSP is disjoint from the path of the working LSP at the
   particular node that is to be protected. The secondary path is also
   disjoint from the primary path at all links associated with the node
   to be protected. When the node fails, traffic on the working LSP is
   switched over to the protect LSP at the upstream LSR that directly
   connects to the failed node.

   - Path Protection: The goal of LSP path protection is to protect an
   LSP from failure at any point along its routed path. Under path
   protection, the path of the protect LSP is completely disjoint from
   the path of the working LSP. The advantage of path protection is that
   the protect LSP protects the working LSP from all possible link and
   node failures along the path, except for failures that might occur at
   the ingress and egress LSRs. Additionally, since the path selection
   is end-to-end, path protection might be more efficient in terms of
   resource usage than link or node protection.  However, in general,
   path protection may be slower than link and node protection.

   - Segment Protection: In some cases, an MPLS domain may be
   partitioned into multiple protection domains whereby a failure in a
   protection domain is rectified within that domain.  In cases where an
   LSP traverses multiple protection domains, a protection mechanism
   within a domain only needs to protect the segment of the LSP that
   lies within the domain. Segment protection will generally be faster
   than path protection because recovery generally occurs closer to the
   fault.
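
   The disjointness conditions behind link, node, and path protection
   can be checked mechanically. The Python sketch below represents each
   LSP as a list of node names; the helper names and the sample paths
   are assumptions made for illustration.

```python
def path_links(path):
    """Return the set of undirected links traversed by a list of nodes."""
    return {frozenset(hop) for hop in zip(path, path[1:])}


def avoids_link(protect, link):
    """Link protection: the protect path does not use the given link."""
    return frozenset(link) not in path_links(protect)


def avoids_node(protect, node):
    """Node protection: the protect path avoids the node, and therefore
    every link attached to it."""
    return node not in protect


def fully_disjoint(working, protect):
    """Path protection: same end points, no shared links, and no shared
    intermediate nodes."""
    same_ends = (working[0], working[-1]) == (protect[0], protect[-1])
    shared_links = path_links(working) & path_links(protect)
    shared_nodes = set(working[1:-1]) & set(protect[1:-1])
    return same_ends and not shared_links and not shared_nodes


working = ["A", "B", "C", "D"]
link_detour = ["A", "B", "E", "C", "D"]   # bypasses link B-C only
node_detour = ["A", "B", "E", "D"]        # bypasses node C
end_to_end = ["A", "F", "G", "D"]         # shares only ingress and egress

print(avoids_link(link_detour, ("B", "C")))
print(avoids_node(node_detour, "C"))
print(fully_disjoint(working, end_to_end))
```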


   Protection option: Another issue to consider is the concept of
   protection options. It can be described in general using the notation
   m:n protection where m is the number of protect LSPs used to protect
   n working LSPs. In the following, some feasible protection options
   are described.

   - 1:1: one working LSP is protected/restored by one protect LSP;

   - n:1: one working LSP is protected/restored by n protect LSPs,
   perhaps with a configurable load splitting ratio. In situations where
   more than one protect LSP is used, it may be desirable to share the
   traffic across the protect LSPs when the working LSP fails so as to
   satisfy the bandwidth requirement of the traffic trunk associated
   with the working LSP, especially when it may not be feasible to find
   one path that can satisfy the bandwidth requirement of the primary
   LSP;

   - 1:n: one protection LSP is used to protect/restore n working LSPs;

   - 1+1: traffic is sent concurrently on both the working LSP and the
   protect LSP. In this case, the egress LSR selects one of the two LSPs
   based on some local traffic integrity decision process.  This
   option would probably not be used pervasively in IP networks due to
   its inefficiency in terms of resource utilization.
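
   For the n:1 option, one simple way to choose the load splitting ratio
   is to divide the traffic in proportion to the bandwidth available on
   each protect LSP. The Python sketch below illustrates that choice;
   the proportional rule and the function name are assumptions of this
   example, not a method prescribed by this framework.

```python
def split_ratios(required_bw, protect_bw):
    """Return {protect LSP: share of traffic}, or None if the protect
    LSPs together cannot carry the required bandwidth.

    required_bw: bandwidth requirement of the working LSP.
    protect_bw:  {protect LSP: available bandwidth}.
    """
    total = sum(protect_bw.values())
    if total < required_bw:
        return None
    return {name: bw / total for name, bw in protect_bw.items()}


# Neither protect LSP alone can carry 100 units, but together they can.
print(split_ratios(100, {"p1": 60, "p2": 90}))
print(split_ratios(200, {"p1": 60, "p2": 90}))
```

   Because the shares are proportional to available bandwidth, no
   protect LSP is assigned more than it can carry whenever the total is
   sufficient.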


   Resilience Attributes:

   - Basic attribute: reroute using IGP or protection LSP(s) when a
   segment of the working path fails, or no rerouting at all.

   -  Extended attributes:

   1. Protection LSP establishment attribute: the protection LSP is i)
   pre-established, or ii) established-on-demand after receiving failure
   notification. A pre-established protection LSP can be faster, while
   an established-on-demand one can potentially find a more optimal
   path with more efficient resource usage.

   2. Constraint attribute under failure condition: the protection LSP
   requires certain constraint(s) to be satisfied, which can be the same
   as or less stringent than the ones under normal conditions (e.g.,
   the bandwidth requirement), or it may use a 0-bandwidth requirement
   under any failure condition.

   3. Protection LSP resource reservation attribute: resource allocation
   of a pre-established protection LSP is i) pre-reserved, or ii)
   reserved-on-demand after receiving failure notification.

   A pre-established and pre-reserved protection LSP can guarantee that
   the QoS of existing services is maintained upon failure, while a
   pre-established and reserved-on-demand one, or an
   established-on-demand one, may not be able to. In addition, it is
   the fastest among the three. It can switch packets onto the
   protection LSP once the ingress LSR receives the failure notification
   message, without experiencing any delay for resource availability
   checking and protection LSP establishment. However, a pre-established
   protection LSP may not be able to adapt to changes in the network
   that occur after its establishment, even if a better path becomes
   available as a result. In addition, the bandwidth reserved on the
   protection LSP is subtracted from the available bandwidth pool on all
   associated links and is hence not available for admitting new LSPs in
   the future. On the other hand, it differs from SONET protection in
   that the reserved bandwidth does not sit idle; instead, it can be
   used by any traffic present on those links.  Comparing a
   pre-established protection LSP whose resources are reserved on demand
   with an established-on-demand one, the former is potentially faster
   since it only needs to check whether the requested bandwidth is
   available on the pre-established path, without waiting for the path
   to be set up. However, if the requested bandwidth is not available on
   the pre-established path, it may choose to use an
   established-on-demand one as a second option.

   Failure Notification:

   Failure notification SHOULD be reliable and fast enough, i.e., at
   least as fast as IGP notification (which is performed through LSA
   flooding), if not faster.


6.6 Content Distribution (Webserver) Requirements


   The Internet is dominated by client-server interactions, especially
   Web traffic. The location of major information servers has a
   significant impact on the traffic patterns within the Internet, and
   on the perception of service quality by end users.

   A number of dynamic load balancing techniques have been devised to
   improve the performance of replicated Web servers. The impact of
   these techniques is that the traffic becomes more dynamic in the
   Internet, because Web servers can be dynamically picked based on the
   locations of the clients, and the relative performance of different
   networks or different parts of a network.  This process can be called
   Traffic Directing (TD).  It is similar to Traffic Engineering but is
   at the application layer.

   Scheduling systems in TD that allocate servers in replicated,
   geographically dispersed information distribution systems may require
   performance parameters of the network to make effective decisions.
   It is desirable that the TE system provide such information.  The
   exact parameters needed are to be defined. When there is congestion
   in the network, the TD and TE systems SHOULD act in a coordinated
   manner. This topic is for further study.

   Because TD can introduce more traffic dynamics into a network,
   network planning SHOULD take this into consideration.  It can be
   desirable to reserve a certain amount of extra capacity for the links
   to accommodate this additional traffic fluctuation.


6.7 Offline Traffic Engineering Support Systems


   If optimal link efficiency is desired, an offline and centralized
   traffic engineering support system MAY be provided as an integral
   part of an overall TE system.  An offline and centralized traffic
   engineering support system can be used to compute the paths for the
   traffic trunks. By taking all the trunk requirements, link attributes
   and network topology information into consideration, an offline TE
   support system can typically find a better trunk placement than an
   online TE system, where every router in the network finds paths
   originated from it in a distributed manner based on its own
   information.  An offline TE support system may compute paths for
   trunks periodically, e.g., daily, for the purpose of re-optimization.
   The computed paths can then be downloaded into the routers. An online
   TE support system is still needed, so that routers can adapt to
   changes promptly.
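
   The following Python sketch illustrates one way an offline,
   centralized system could place trunks using a global view of the
   topology.  It is a greedy heuristic (largest trunk first, shortest
   path over links with enough residual bandwidth), chosen for brevity;
   it is not claimed to be optimal or to be the method intended by this
   document.  The topology, costs, capacities, and trunk names are
   hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Link = Tuple[str, str]  # directed link (from_node, to_node)


def cspf(cost: Dict[Link, float], residual: Dict[Link, float],
         src: str, dst: str, demand: float) -> Optional[List[str]]:
    """Dijkstra restricted to links that can still carry the demand."""
    dist = {src: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, src)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == dst:
            break
        for (a, b), c in cost.items():
            if a != node or residual[(a, b)] < demand:
                continue  # not an outgoing link, or not enough bandwidth
            if d + c < dist.get(b, float("inf")):
                dist[b] = d + c
                prev[b] = a
                heapq.heappush(heap, (d + c, b))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def place_trunks(cost: Dict[Link, float], capacity: Dict[Link, float],
                 trunks: Dict[str, Tuple[str, str, float]]
                 ) -> Dict[str, Optional[List[str]]]:
    """Place trunks in decreasing bandwidth order, reserving as we go."""
    residual = dict(capacity)
    placement: Dict[str, Optional[List[str]]] = {}
    for name, (src, dst, bw) in sorted(trunks.items(),
                                       key=lambda kv: -kv[1][2]):
        path = cspf(cost, residual, src, dst, bw)
        placement[name] = path
        if path is not None:
            for hop in zip(path, path[1:]):
                residual[hop] -= bw
    return placement


cost = {("A", "B"): 1, ("B", "C"): 1, ("A", "D"): 2, ("D", "C"): 2}
capacity = {link: 10.0 for link in cost}
trunks = {"t1": ("A", "C", 8.0), "t2": ("A", "C", 6.0), "t3": ("A", "C", 5.0)}
print(place_trunks(cost, capacity, trunks))
# {'t1': ['A', 'B', 'C'], 't2': ['A', 'D', 'C'], 't3': None}
```

   Trunk t3 cannot be placed because neither path has 5 units of
   residual bandwidth left.  An offline system with the full set of
   requirements can detect such cases before paths are downloaded into
   the routers.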


6.8 Traffic Engineering in Diffserv Environments


   [NOTE: THIS SECTION IS WORK IN PROGRESS AND WILL BE UPDATED IN THE
   NEXT VERSION OF THE DRAFT]

   Traffic engineering will be very important in Diffserv environments.
   This section describes the traffic engineering features and
   requirements that are specifically pertinent to Differentiated
   Services (Diffserv) capable IP networks.


7.0 Multicast Considerations


   For further study.


8.0 Inter-Domain Considerations


   Inter-domain traffic engineering is concerned with the performance
   optimization for traffic that originates in one administrative domain
   and terminates in a different one.

   Traffic exchange between autonomous systems occurs through exterior
   gateway protocols. Currently, BGP-4 [RFC-1771] is the de facto EGP
   standard.

   Traditionally, in the public Internet, BGP based policies are used to
   control import and export policies for inter-domain traffic.  BGP
   policies are also used to determine exit and entrance points to and
   from peer networks.

   Inter-domain TE is inherently more difficult than intra-domain TE.
   The reasons for this are both technical and administrative.
   Technically, the current version of BGP does not propagate topology
   and link state information across domain boundaries.
   Administratively, there are differences in operating costs and
   network capacities between domains, and what may be considered a good
   solution in one domain may not necessarily be a good one in another
   domain. Moreover, it would generally be considered imprudent for one
   domain to permit another domain to influence the routing and control
   of traffic in its network.

   When Diffserv becomes widely deployed, inter-domain TE will become
   even more important, but more challenging to address.

   MPLS TE-tunnels (explicit LSPs) add a degree of flexibility in the
   selection of exit points for inter-domain routing.  The concepts of
   relative and absolute metrics are defined in [SHEN]. If the BGP
   attributes are defined such that the BGP decision process depends on
   IGP metrics to select exit points for inter-domain traffic, then some
   inter-domain traffic destined to a given peer network can be made to
   prefer a given exit point by establishing a TE-tunnel from the
   router making the selection to the peering point and assigning the
   TE-tunnel a metric which is smaller than the IGP cost to all other
   peering points. If a peer accepts and processes MEDs, then a similar
   MPLS TE-tunnel based scheme can be applied to cause a certain
   entrance point to be preferred by setting the MED to the IGP cost,
   as modified by the tunnel metric.
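
   The effect of the tunnel metric can be sketched in Python as
   follows.  The sketch models only the step of the BGP decision
   process that compares IGP costs, and assumes all earlier attributes
   are equal.  The tunnel metric is treated as an absolute value that
   replaces the IGP cost.  The peering point names and cost values are
   hypothetical.

```python
from typing import Dict, Optional


def effective_costs(igp_cost: Dict[str, int],
                    tunnel_metric: Optional[Dict[str, int]] = None
                    ) -> Dict[str, int]:
    """IGP cost to each peering point, replaced by a tunnel metric if set."""
    overrides = tunnel_metric or {}
    return {point: overrides.get(point, cost)
            for point, cost in igp_cost.items()}


def preferred_point(costs: Dict[str, int]) -> str:
    """Lowest cost wins; ties are broken by name for determinism."""
    return min(sorted(costs), key=lambda point: costs[point])


igp_cost = {"peer-nyc": 30, "peer-chi": 20, "peer-sjc": 45}

# Without a tunnel, the closest peering point by IGP cost is chosen.
print(preferred_point(effective_costs(igp_cost)))           # peer-chi

# A TE-tunnel to peer-sjc with a metric below every other IGP cost
# makes peer-sjc the preferred exit point.
tunnels = {"peer-sjc": 10}
print(preferred_point(effective_costs(igp_cost, tunnels)))  # peer-sjc

# If the peer honors MEDs, advertising these costs as MED values makes
# the peer prefer the same point as its entrance into this domain.
meds = effective_costs(igp_cost, tunnels)
print(meds["peer-sjc"])                                     # 10
```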

   Similar to intra-domain TE, inter-domain TE is best accomplished when
   a traffic matrix for inter-domain traffic can be derived.
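
   As a rough illustration, an inter-domain traffic matrix could be
   built by aggregating measured flow volumes per pair of autonomous
   systems.  The record layout (source AS, destination AS, octets) is
   an assumption made for this Python sketch, and the AS numbers are
   taken from the range reserved for documentation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

FlowRecord = Tuple[int, int, int]  # (source AS, destination AS, octets)


def build_matrix(flows: Iterable[FlowRecord]) -> Dict[Tuple[int, int], int]:
    """Sum octets per (source AS, destination AS); skip intra-AS flows."""
    matrix: Dict[Tuple[int, int], int] = defaultdict(int)
    for src_as, dst_as, octets in flows:
        if src_as != dst_as:
            matrix[(src_as, dst_as)] += octets
    return dict(matrix)


flows = [
    (64500, 64501, 1200),
    (64500, 64501, 800),
    (64501, 64500, 500),
    (64500, 64500, 999),  # intra-domain, ignored here
]
print(build_matrix(flows))
# {(64500, 64501): 2000, (64501, 64500): 500}
```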

   Generally, redistribution of inter-domain traffic requires
   coordination between peering partners. Any export policy in one
   domain that results in load redistribution across peering points can
   significantly affect the traffic distribution inside the domain of
   the peering partner. This, in turn, will affect the intra-domain TE
   due to changes in the intra-domain traffic matrix. Therefore, it is
   critical for peering partners to negotiate and coordinate with each
   other before attempting any policy changes that may result in
   significant shifts in inter-domain traffic. In practice, this
   coordination can be quite challenging for technical and non-technical
   reasons.

   It is a matter of speculation as to whether MPLS, or similar
   technologies, can be extended to allow selection of constrained-paths
   across domain boundaries.


9.0 Conclusion


   This document described a framework for traffic engineering in the
   Internet.  It presented an overview of some of the basic issues
   surrounding traffic engineering in IP networks. The context of TE was
   described, and a TE process model and a taxonomy of TE styles were
   presented.  A brief historical review of pertinent developments
   related to traffic engineering was provided. Finally, the document
   specified a set of generic requirements, recommendations, and options
   for Internet traffic engineering.


10.0 Security Considerations


   This document does not introduce new security issues.


11.0 Acknowledgments


   The authors would like to thank Jim Boyle for inputs on the
   requirements section, Francois Le Faucheur for inputs on class-type,
   and Gerald Ash for inputs on routing in telephone networks.  The
   subsection describing an "Overview of ITU Activities Related to
   Traffic Engineering" was adapted from a contribution by Waisum Lai.


12.0 References


   [ACJ99] L. Anderson, B. Cain, and B. Jamoussi, "Requirement Framework
   for Fast Re-route with MPLS", Work in progress, October 1999.

   [ASH1] J. Ash, M. Girish, E. Gray, B. Jamoussi, G. Wright,
   "Applicability Statement for CR-LDP," Work in Progress, 1999.

   [ASH2] J. Ash, "Dynamic Routing in Telecommunications Networks,"
   McGraw Hill, 1998.

   [AWD1] D. Awduche, J. Malcolm, J. Agogbua, M. O'Dell, J. McManus,
   "Requirements for Traffic Engineering over MPLS," RFC 2702, September
   1999.

   [AWD2] D. Awduche, "MPLS and Traffic Engineering in IP Networks,"
   IEEE Communications Magazine, December 1999.

   [AWD3] D. Awduche, L. Berger, D. Gan, T. Li, G. Swallow, and V.
   Srinivasan, "Extensions to RSVP for LSP Tunnels," Work in Progress,
   1999.

   [AWD4] D. Awduche, A. Hannan, X. Xiao, "Applicability Statement for
   Extensions to RSVP for LSP-Tunnels," Work in Progress, 1999.

   [AWD5] D. Awduche et al, "An Approach to Optimal Peering Between
   Autonomous Systems in the Internet," International Conference on
   Computer Communications and Networks (ICCCN'98), October 1998.

   [AWD6] D. Awduche, Y. Rekhter, J. Drake, R. Coltun, "Multiprotocol
   Lambda Switching: Combining MPLS Traffic Engineering Control with
   Optical Crossconnects," Work in Progress, 1999.

   [CAL] R. Callon, P. Doolan, N. Feldman, A. Fredette, G. Swallow, A.
   Viswanathan, "A Framework for Multiprotocol Label Switching," Work in
   Progress, 1999.

   [FGLR] A. Feldmann, A. Greenberg, C. Lund, N. Reingold, and J.
   Rexford, "NetScope: Traffic Engineering for IP Networks," to appear
   in IEEE Network Magazine, 2000.

   [FlJa93] S. Floyd and V. Jacobson, "Random Early Detection Gateways
   for Congestion Avoidance", IEEE/ACM Transactions on Networking, Vol.
   1, No. 4, August 1993, p. 397-413.

   [Floy94] S. Floyd, "TCP and Explicit Congestion Notification", ACM
   Computer Communication Review, V. 24, No. 5, October 1994, p. 10-23.

   [HuSS87] B.R. Hurley, C.J.R. Seidl and W.F. Sewel, "A Survey of
   Dynamic Routing Methods for Circuit-Switched Traffic", IEEE
   Communications Magazine, Sep 1987.

   [itu-e600] ITU-T Recommendation E.600, "Terms and Definitions of
   Traffic Engineering", March 1993.

   [itu-e701] ITU-T Recommendation E.701 "Reference Connections for
   Traffic Engineering", October 1993.

   [JAM] B. Jamoussi, "Constraint-Based LSP Setup using LDP," Work in
   Progress, 1999.

   [Li-IGP] T. Li, G. Swallow, and D. Awduche, "IGP Requirements for
   Traffic Engineering with MPLS," Work in Progress, 1999.

   [LNO96] T. Lakshman, A. Neidhardt, and T. Ott, "The Drop from Front
   Strategy in TCP over ATM and its Interworking with other Control
   Features", Proc. INFOCOM'96, p. 1242-1250.

   [MATE] I. Widjaja and A. Elwalid, "MATE: MPLS Adaptive Traffic
   Engineering," Work in Progress, 1999.

   [McQ80] J.M. McQuillan, I. Richer, and E.C. Rosen, "The New Routing
   Algorithm for the ARPANET", IEEE. Trans. on Communications, vol. 28,
   no. 5, pp. 711-719, May 1980.

   [MR99] D. Mitra and K.G. Ramakrishnan, "A Case Study of Multiservice,
   Multipriority Traffic Engineering Design for Data Networks," Proc.
   Globecom'99, Dec 1999.

   [MSOH99] S. Makam, V. Sharma, K. Owens, C. Huang,
   "Protection/Restoration of MPLS Networks", Work in Progress, October,
   1999.

   [OMP] C. Villamizar, "MPLS Optimized OMP", Work in Progress, 1999.

   [RFC-1349] P. Almquist, "Type of Service in the Internet Protocol
   Suite", RFC 1349, Jul 1992.

   [RFC-1458] R. Braudes, S. Zabele, "Requirements for Multicast
   Protocols," RFC 1458, May 1993.

   [RFC-1771] Y. Rekhter and T. Li, "A Border Gateway Protocol 4
   (BGP-4)," RFC 1771, March 1995.

   [RFC-1812] F. Baker (Editor), "Requirements for IP Version 4
   Routers," RFC 1812, June 1995.

   [RFC-1997] R. Chandra, P. Traina, and T. Li, "BGP Communities
   Attribute," RFC 1997, August 1996.

   [RFC-1998] E. Chen and T. Bates, "An Application of the BGP Community
   Attribute in Multi-home Routing," RFC 1998, August 1996.

   [RFC-2178] J. Moy, "OSPF Version 2", RFC 2178, July 1997.

   [RFC-2205] R. Braden, et. al., "Resource Reservation Protocol (RSVP)
   - Version 1 Functional Specification", RFC 2205, September 1997.

   [RFC-2211] J. Wroclawski, "Specification of the Controlled-Load
   Network Element Service", RFC 2211, Sep 1997.

   [RFC-2212] S. Shenker, C. Partridge, R. Guerin, "Specification of
   Guaranteed Quality of Service," RFC 2212, September 1997.

   [RFC-2215] S. Shenker, and J. Wroclawski, "General Characterization
   Parameters for Integrated Service Network Elements", RFC 2215,
   September 1997.

   [RFC-2216] S. Shenker,  and J. Wroclawski, "Network Element Service
   Specification Template", RFC 2216, September 1997.

   [RFC-2330] V. Paxson et al., "Framework for IP Performance Metrics",
   RFC 2330, May 1998.

   [RFC-2386] E. Crawley, R. Nair, B. Rajagopalan, and H. Sandick, "A
   Framework for QoS-based Routing in the Internet", RFC 2386, Aug.
   1998.

   [RFC-2475] S. Blake et al., "An Architecture for Differentiated
   Services", RFC 2475, Dec 1998.

   [RFC-2597] J. Heinanen, F. Baker, W. Weiss, and J. Wroclawski,
   "Assured Forwarding PHB Group", RFC 2597, June 1999.

   [RFC-2678] J. Mahdavi and V. Paxson, "IPPM Metrics for Measuring
   Connectivity", RFC 2678, Sep 1999.

   [RFC-2679] G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way Delay
   Metric for IPPM", RFC 2679, Sep 1999.

   [RFC-2680] G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way
   Packet Loss Metric for IPPM", RFC 2680, Sep 1999.

   [RFC-2722] N. Brownlee, C. Mills, and G. Ruth, "Traffic Flow
   Measurement: Architecture", RFC 2722, Oct 1999.

   [RoVC] E. Rosen, A. Viswanathan, R. Callon, "Multiprotocol Label
   Switching Architecture," Work in Progress, 1999.

   [Shew99] S. Shew, "Fast Restoration of MPLS Label Switched Paths",
   draft-shew-lsp-restoration-00.txt, October 1999.

   [SLDC98] B. Suter, T. Lakshman, D. Stiliadis, and A. Choudhury,
   "Design Considerations for Supporting TCP with Per-flow Queueing",
   Proc. INFOCOM'98, 1998, p. 299-306.

   [XIAO] X. Xiao, A. Hannan, B. Bailey, L. Ni, "Traffic Engineering
   with MPLS in the Internet", IEEE Network magazine, March 2000.

   [YaRe95] C. Yang and A. Reddy, "A Taxonomy for Congestion Control
   Algorithms in Packet Switching Networks", IEEE Network Magazine,
   1995, p. 34-45.

   [SMIT] H. Smit and T. Li, "IS-IS extensions for Traffic
   Engineering," Internet Draft, Work in Progress, 1999.

   [KATZ] D. Katz, D. Yeung, "Traffic Engineering Extensions to
   OSPF," Internet Draft, Work in Progress, 1999.

   [SHEN] N. Shen and H. Smit, "Calculating IGP routes over Traffic
   Engineering tunnels," Internet Draft, Work in Progress, 1999.


13.0 Authors' Addresses:


      Daniel O. Awduche
      UUNET (MCI Worldcom)
      22001 Loudoun County Parkway
      Ashburn, VA 20147
      Phone: 703-886-5277
      Email: awduche@uu.net

      Angela Chiu
      AT&T Labs
      Room C4-3A22
      200 Laurel Ave.
      Middletown, NJ 07748
      Phone: (732) 420-2290
      Email: alchiu@att.com

      Anwar Elwalid
      Lucent Technologies
      Murray Hill, NJ 07974, USA
      Phone: 908 582-7589
      Email: anwar@lucent.com

      Indra Widjaja
      Fujitsu Network Communications
      Two Blue Hill Plaza
      Pearl River, NY 10965, USA
      Phone: 914-731-2244
      Email: indra.widjaja@fnc.fujitsu.com

      Xipeng Xiao
      Global Crossing
      141 Caspian Court,
      Sunnyvale, CA 94089
      Email: xipeng@globalcenter.net
      Voice: +1 408-543-4801

