Internet Engineering Task Force                     C-Y Lee
INTERNET DRAFT                                      L. Andersson
Expires April 2000                                  Nortel Networks
                                                    Ken Carlberg
                                                    SAIC
                                                    Bora Akyol
                                                    Pluris
                                                    October 1999

                Engineering Paths for Multicast Traffic
                   <draft-leecy-multicast-te-01.txt>

Status of this memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

  
   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


Abstract

   This document describes a solution for engineering paths for IP
   multicast traffic in a network by directing the control messages
   that set up multicast trees onto engineered paths.  This gives the
   network operator control over the topology of multicast trees.

   This proposal partitions the multicast traffic engineering problem
   so that multicast routing protocols do not have to be modified to
   allocate resources for multicast traffic, and resource allocation
   protocols such as RSVP or CR-LDP do not have to set up forwarding
   states (in this case labels) the way multicast routing protocols do.

   Resources are allocated on the same trip on which paths are selected
   and set up.  This prevents data from being forwarded on branches of
   the tree where resources have not yet been allocated.  An important
   aspect of this proposal is that it enables multicast paths

Expires April 2000                                              [Page 1]

Internet Draft  Engineering Paths for Multicast Traffic       April 2000

   to be engineered in an aggregatable manner, allowing this solution to
   scale in the backbone.

1.0 Overview

   In general, traffic is engineered to traverse certain paths so as to
   utilize resources in a network in a more optimal manner, while at the
   same time improving the level of service that can be offered.

   In conventional IP routing, traffic may be engineered to use a path
   by configuring preferred links towards a destination with a lower
   metric. This method only allows traffic to be engineered based on the
   destination address.  Since the forwarding is based on the
   destination address only, traffic cannot be engineered based on other
   attributes (which may be useful for traffic engineering purposes) of
   the packet such as the source address of a packet or the requested
   service level.  In contrast, MPLS abstracts the forwarding paradigm
   and allows traffic to be forwarded based on attributes (known as
   forwarding equivalence class (FEC) in MPLS) in addition to the
   destination address. This provides a versatile and convenient syntax
   for traffic engineering purposes.

   This document describes a way to provide a basic traffic engineering
   mechanism for multicast. Traffic Engineering (TE) functionalities (in
   the MPLS entity) are used to decide where to forward the join control
   messages of multicast protocols, based on different traffic
   engineering requirements and to allocate resources.  (Note, however,
   that multicast data packets are forwarded based on Layer 3 (L3)
   address information and are not label switched.)

   Using this basic multicast traffic engineering mechanism, ISPs can
   define particular FECs for their network, specify the resources
   required to receive traffic from a certain root prefix, decrease
   fanout at a node by limiting the number of paths towards the node
   (prefix), allow only certain paths to carry multicast traffic,
   experiment with heuristics to better engineer multicast trees, or
   use a function to dynamically compute suitable paths based on
   current or predicted network resources.  All these additional
   network or content provider specific functions to engineer traffic
   can be developed independently of the basic multicast traffic
   engineering scheme.

2.0 Motivation

   The fundamental problem with doing multicast Traffic Engineering (TE)
   is the difficulty in doing it in a scalable manner. Multicast routes
   are very difficult (and some claim impossible) to aggregate. One can
   associate a label with a unicast route (prefix) and packets sent to
   that destination can be aggregated by associating them with the
   label.

   Since multicast routes are not aggregatable in general, associating a
   label with a multicast route implies per flow/group resource
   allocation. In essence, this kind of association will result in RSVP
   (or ATM) style resource allocation and is more applicable to per flow
   QoS than traffic engineering.

   In contrast, the approach taken in this proposal decouples traffic
   engineering from multicast route setup, thereby allowing the
   resources and paths for multicast data delivery to be allocated
   independently.  This implies that resources and paths can be
   aggregated and engineered, and that traffic can be statistically
   multiplexed, enabling network operators to provide differentiated
   services for multicast traffic in a scalable manner.

3.0 Scope

   This draft describes mechanisms that are applicable to multicast
   routing protocols such as PIM-SM, CBT, BGMP, Express or Simple
   Multicast, which are called 'control driven' in this draft.  The
   mechanisms to handle 'data driven' or flood-and-prune protocols
   (e.g., DVMRP and PIM-DM) are for further study (FFS).  This proposal
   assumes that a multicast group/tree has a common service level
   requirement.  It is envisaged that heterogeneous receiver
   requirements can be met by layered encoding of data in different
   multicast groups, or by other variations of layered encoding.

   It should be noted that the MPLS concepts of interest here are the
   FEC, the Explicit Route Object (ERO), resource allocation and path
   selection.  Although the proposed scheme does not use label
   switching, the solution is described in MPLS terms since the concepts
   of interest here have already been defined in MPLS.

4.0 Approach

   A control driven multicast routing protocol sends a 'join' message to
   graft a node onto a multicast distribution tree, creating multicast
   routes in the process.  Since join messages are forwarded based on
   unicast routes, if the conventional routing table is used, the
   multicast routes that are set up will follow conventional routes.  To
   constrain multicast paths, the join message can instead be sent via
   paths that are computed or statically configured.

   This draft describes a scheme where multicast routing control
   messages (including join messages) are forwarded by the TE entity in
   a router along the constraint path.

   To allow a router to process control messages, the control messages
   should contain the router alert option.  The control message is
   identified at the egress router by its FEC.  Based on the FEC, the
   MPLS entity can derive the path the control message should take and
   allocate resources as specified.  A multicast routing protocol would
   set up the forwarding state on the port/interface where the join is
   received.  To enable the establishment of multicast forwarding state
   based on constraint (unicast) routes, multicast routing protocols
   which perform the Reverse Path Forwarding (RPF) check must either
   turn off this check or be able to obtain the 'constraint' RPF via a
   Constraint Based Routing (CBR) API.  To prevent redundant data and
   loops, a loop avoidance scheme based on the concepts described in
   [MPLS-LOOP-AVOID] or [SM] can be used in the routing protocol.  If
   there is a loop, the routing protocol should not create forwarding
   state for the group on the port where the join is received.
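
   The two RPF options described above can be pictured with the
   following Python sketch.  It is illustrative only: the function name
   rpf_accept, the two lookup tables and the fallback to the unicast
   RPF interface when no constraint route is known are assumptions of
   this sketch, not part of the proposal.

```python
from typing import Dict, Optional


def rpf_accept(
    root: str,
    arrival_iface: str,
    unicast_rpf: Dict[str, str],
    constraint_rpf: Optional[Dict[str, str]] = None,
    check_enabled: bool = True,
) -> bool:
    """Decide whether data from `root` may be accepted on `arrival_iface`.

    unicast_rpf and constraint_rpf map a root address to the expected
    upstream interface; both tables are hypothetical inputs.
    """
    if not check_enabled:
        # First option in the text: the RPF check is turned off.
        return True
    expected = None
    if constraint_rpf is not None:
        # Second option: use the 'constraint' RPF obtained via a CBR API.
        expected = constraint_rpf.get(root)
    if expected is None:
        # Assumption: fall back to the conventional unicast RPF interface.
        expected = unicast_rpf.get(root)
    return expected == arrival_iface
```

   With a constraint table present, data arriving on the engineered
   interface passes the check even though conventional routing points
   elsewhere.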

   There are other alternatives for sending the join on the engineered
   path, such as a) extending CR-LDP/TE-RSVP to send and merge joins
   for the multicast tree associated with a label, or b) changing the
   multicast routing protocol to send the join along the explicit
   route.  These require either multicast routing protocol
   functionalities to be present in MPLS or MPLS functionalities to be
   incorporated into multicast routing protocols.  This proposal uses
   MPLS (label and explicit route object) to cause engineered paths to
   be selected, but forwards data using multicast routing.  It does not
   require MPLS and multicast routing protocols to be merged, an
   exercise which tends to result in redundant or reinvented
   functionalities at L2/L3 and to increase the complexity of multicast
   traffic engineering, while not providing any means of aggregating
   multicast traffic engineering.

   The alternative approaches listed above require traffic to be
   engineered for each group/tree, since multicast labels/routes are
   most likely not aggregatable.  Each group must be assigned a
   different label as well.  In contrast, this proposal allows a network
   provider to aggregate the engineered path towards a root or root
   prefix (since resource allocation and path selection can be
   independent of the setup of forwarding states/routes).  The root
   prefix could be a subnet or domain.  Multicast traffic in the
   backbone network can then be provisioned in a more scalable manner
   and statistically multiplexed on the (aggregated) engineered paths.

5.0 Procedure

5.1 At the Egress Router

   At any egress router (a router where multicast data exits the
   network), the IP fields of interest in the control message (referred
   to as the FEC here, for lack of a better term) and the associated
   path selection mechanisms are defined in a Traffic Configuration
   table.  These FECs correlate to the control messages of routing
   protocols (e.g., destination = root prefix/target-node address,
   ToS = codepoint).  Note that the message carrying this information
   traverses the network from egress to ingress.  The path selection
   mechanism can be based on a static table, a Constraint Based Routing
   (CBR) table, or a path selection algorithm (dynamic).  The resources
   required for the FEC can be statically configured at the egress
   router or obtained via other means as described in [MCPROV].
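
   As an illustration only, the Traffic Configuration table can be
   pictured as a list of entries keyed by FEC.  The Python sketch below
   assumes an FEC made of a root prefix and an optional ToS codepoint.
   The names TrafficConfigEntry and match_fec, the field layout, the
   prefix lengths and the longest-prefix-match rule are all assumptions
   of this sketch and are not defined by this draft.

```python
import ipaddress
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TrafficConfigEntry:
    """One hypothetical row of the Traffic Configuration table."""
    root_prefix: str                  # FEC field: root prefix/target node
    tos: Optional[int]                # FEC field: ToS codepoint, None = any
    path_selection: str               # "static", "cbr" or "dynamic"
    explicit_route: Optional[List[str]] = None    # used when "static"
    resources: Optional[Dict[str, int]] = None    # e.g. {"bandwidth": 512}


def match_fec(table: List[TrafficConfigEntry], target_node: str,
              tos: Optional[int] = None) -> Optional[TrafficConfigEntry]:
    """Return the most specific entry covering the control message.

    Longest-prefix match is an assumption of this sketch; on a tie the
    earlier entry wins.  None means no FEC is defined for the message.
    """
    addr = ipaddress.ip_address(target_node)
    best, best_len = None, -1
    for entry in table:
        net = ipaddress.ip_network(entry.root_prefix)
        if addr not in net:
            continue
        if entry.tos is not None and entry.tos != tos:
            continue
        if net.prefixlen > best_len:
            best, best_len = entry, net.prefixlen
    return best
```

   A control message that matches no entry is simply not diverted to
   the MCTE entity and follows conventional routing.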

   Figure 1 shows the passage of control messages in an egress router
   (dotted lines) and the interface between the various entities in the
   router (+++ lines).

   When a join message arrives at the egress router, the packet is
   processed by the appropriate multicast routing protocol to set up
   multicast forwarding states.  If forwarding states already exist,
   the join message is discarded; otherwise, the multicast routing
   protocol calls an API provided by the Multicast Traffic Engineering
   (MCTE) entity to get the next hop to the root.

   The form of the API is represented in terms of the following:

     get_MCTE_next_hop(Target-Node, Group);

   Target-Node is a mandatory value. The value of Target-Node is in the
   form of an IP address.  Group is not required for (a)-(c) and
   optional for (d) below. The return value is the next hop to the
   Target-Node.

   The MCTE entity:

   a) obtains the route from conventional routing if no path or path
   selection mechanism is specified in the Traffic Configuration table,
   or

   b) obtains the manually configured explicit route in the Traffic
   Configuration table, or

   c) obtains the explicit routes via a CBR process (refer to [MPLS-TE]
   and [ISIS-TE]/[OSPF-TE] for details), or

   d) invokes the path selection algorithm specified in the Traffic
   Configuration table.

   (Note: the routes in (a)-(c) are based on the network topology,
   whereas (d) may take into account the tree topology in the
   computation of routes.)
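
   The four cases (a)-(d) could be dispatched roughly as in the Python
   sketch below.  This is not a definition of get_MCTE_next_hop: the
   dictionary-based table entry, the three callables standing in for
   conventional routing, the CBR process and the path selection
   algorithm, and the returned (next hop, route) pair are assumptions
   made for illustration.

```python
from typing import Callable, List, Optional, Tuple

Route = List[str]


def get_mcte_next_hop(
    target_node: str,
    group: Optional[str] = None,
    *,
    entry: Optional[dict],
    conventional_next_hop: Callable[[str], str],
    cbr_route: Callable[[str], Route],
    path_algorithm: Callable[[str, Optional[str]], Route],
) -> Tuple[str, Route]:
    """Return (next hop, route to store) for a join towards target_node.

    `entry` is the matching Traffic Configuration entry, or None when no
    path or path selection mechanism is configured.
    """
    mechanism = entry.get("mechanism") if entry else None
    if mechanism is None:
        # (a) fall back to conventional routing; no explicit route stored
        return conventional_next_hop(target_node), []
    if mechanism == "static":
        route = list(entry["explicit_route"])          # (b)
    elif mechanism == "cbr":
        route = cbr_route(target_node)                 # (c)
    elif mechanism == "dynamic":
        route = path_algorithm(target_node, group)     # (d) may use Group
    else:
        raise ValueError(f"unknown mechanism: {mechanism!r}")
    if not route:
        raise LookupError(f"no route towards {target_node}")
    return route[0], route
```

   Only case (d) passes the Group on, matching the statement that Group
   is not required for (a)-(c).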

   The MCTE entity stores the route(s) obtained or computed for this
   FEC, and uses these routes later when it prepends an MCTE header to
   the control message.

   The form of the API provided by the path selection algorithm in (d)
   above is represented in terms of the following:

     get_MCTE_route(Target-Node, Group, Type-of-Metric)

   Target-Node is a mandatory value, and the rest are optional in their
   usage or applicability.  The value of Target-Node is in the form of
   an IP address. The return value is a list of explicit route(s).
   (Note: currently, the above API assumes IPv4.  A different API will
   be used for IPv6)

   The other parameters of the API are optional.  The Group represents
   an added level of granularity on which network administrators can
   base their traffic engineering decisions (e.g., this allows per
   group/flow traffic engineering).  (Note: currently, port values are
   not included due to the common practice of correlating a session to
   a group address.)

   Finally, the Type-of-Metric value correlates to different types of
   metrics used to distinguish one path from another.  The default value
   is (1), which correlates to hop count.  Other defined values consist
   of:  (2) bandwidth, (4) delay, and (8) fan-out.  In cases where the
   underlying algorithm (of get_MCTE_route) does not support metrics
   other than hop count, this field is ignored. The Type-of-Metric is
   specified with the path selection algorithm in the Traffic
   Configuration table.
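
   A minimal Python sketch of a path selection algorithm behind
   get_MCTE_route follows.  The Type-of-Metric values (1, 2, 4, 8) are
   taken from the text above.  Everything else is an assumption made
   for illustration: the candidates table and its fields, the
   supports_metrics flag, treating the values as mutually exclusive,
   and ranking by fewest hops, most bandwidth, least delay or least
   fan-out.

```python
from enum import IntEnum
from typing import Dict, List, Optional


class TypeOfMetric(IntEnum):
    HOP_COUNT = 1      # default
    BANDWIDTH = 2
    DELAY = 4
    FAN_OUT = 8


def get_mcte_route(
    target_node: str,
    group: Optional[str] = None,
    type_of_metric: int = TypeOfMetric.HOP_COUNT,
    *,
    candidates: Dict[str, List[dict]],
    supports_metrics: bool = True,
) -> List[List[str]]:
    """Return explicit routes towards target_node, best first.

    `candidates` is a hypothetical table of known paths per target node.
    `group` is accepted for API compatibility but unused in this sketch.
    """
    if supports_metrics:
        metric = TypeOfMetric(type_of_metric)
    else:
        # Algorithms that only know hop count ignore the field.
        metric = TypeOfMetric.HOP_COUNT
    sort_key = {
        TypeOfMetric.HOP_COUNT: lambda c: len(c["route"]),
        TypeOfMetric.BANDWIDTH: lambda c: -c["bandwidth"],
        TypeOfMetric.DELAY: lambda c: c["delay"],
        TypeOfMetric.FAN_OUT: lambda c: c["fan_out"],
    }[metric]
    ranked = sorted(candidates.get(target_node, []), key=sort_key)
    return [c["route"] for c in ranked]
```

   An empty list is returned when no path towards the Target-Node is
   known.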

                      ------------
                      | MCTE API |
                      ------------
                            +
                            +
                -------------------------
                |   Multicast Routing   |
                -------------------------
                     ^            |
                     |            |
                     |            v
   ------------      |       ------------       -----------------
   |IP|Ctl Msg| ---->|       |   MCTE   | ----> |IP|MCTE|Ctl Msg|
   ------------              ------------       -----------------
                                  +
                                  +
                                  +
                          -----------------
                          | FEC, Path and |
                          | Resource      |
                          | Specification |
                          -----------------

                  Fig. 1 At the egress (wrt data flow) router

   After the multicast forwarding states are set up, the control message
   is forwarded towards the root.  If the control message matches a
   defined FEC, it is diverted to the MCTE entity.  How the outgoing
   control message is diverted to the MCTE entity is implementation
   dependent.  The MCTE entity calls an API provided by the Multicast
   Routing Protocol (MRP) to find out whether the control message is a
   path setup (join) message, a path teardown (leave) message or some
   other maintenance message.  If it is a path setup message, the
   resources specified in the Traffic Configuration table are
   allocated; if it is a path teardown message, the resources are
   deallocated.  If it is a maintenance control message, it is
   forwarded as is, without any MCTE header, and will be forwarded by
   the multicast routing protocol in intermediate routers as per normal.

   If it is either a path setup or path teardown message, the MCTE
   entity prepends an MCTE header containing the FEC, the explicit
   routes (provided by the path selection mechanism), the resources
   required (e.g., Traffic Parameter, service level) and the protocol id
   of the control message.  The IP protocol id is set to IPPROTO_MCTE.

   The MCTE header is placed between the IP header and the control
   message.  Resources as specified in the Traffic Configuration table
   are allocated/deallocated before the MCTE message is forwarded to the
   next hop returned by the path selection mechanism specified.  To
   allow other routers to process this MCTE message (which includes the
   control message), the packet will be labeled as Router Alert.
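
   The egress processing of an outgoing control message can be
   sketched in Python as follows.  The draft does not define a wire
   format for the MCTE header or a number for IPPROTO_MCTE, so the
   classes MCTEHeader, Packet and ResourcePool, the string placeholder
   for IPPROTO_MCTE and the single bandwidth resource are hypothetical
   and serve only to show the order of operations.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Tuple

# Placeholder only: the draft names IPPROTO_MCTE but assigns no number.
IPPROTO_MCTE = "IPPROTO_MCTE"


class MsgKind(Enum):
    SETUP = "join"
    TEARDOWN = "leave"
    MAINTENANCE = "maintenance"


@dataclass
class MCTEHeader:
    fec: str
    explicit_route: List[str]
    resources: Dict[str, int]
    inner_protocol: str          # protocol id of the wrapped control message


@dataclass
class Packet:
    ip_protocol: str
    router_alert: bool
    payload: bytes
    mcte: Optional[MCTEHeader] = None


@dataclass
class ResourcePool:
    available: int               # hypothetical unit, e.g. kbit/s

    def allocate(self, amount: int) -> None:
        if amount > self.available:
            raise RuntimeError("insufficient resources")
        self.available -= amount

    def release(self, amount: int) -> None:
        self.available += amount


def egress_forward(pkt: Packet, kind: MsgKind, fec: str, route: List[str],
                   bandwidth: int, pool: ResourcePool
                   ) -> Tuple[Packet, Optional[str]]:
    """Return (packet to send, MCTE next hop or None)."""
    if kind is MsgKind.MAINTENANCE:
        # Forwarded as is; the routing protocol picks the next hop.
        return pkt, None
    # Resources are allocated or deallocated before the message leaves.
    if kind is MsgKind.SETUP:
        pool.allocate(bandwidth)
    else:
        pool.release(bandwidth)
    header = MCTEHeader(fec, list(route), {"bandwidth": bandwidth},
                        pkt.ip_protocol)
    # The MCTE header sits between the IP header and the control message.
    out = Packet(IPPROTO_MCTE, True, pkt.payload, header)
    return out, route[0]
```

   Keeping the original protocol id inside the header is what lets the
   next router hand the control message to the right routing protocol.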

5.2 At the Intermediate Routers

   Figure 2 shows the passage of control messages in an intermediate
   router (dotted lines) and the interface between the various entities
   in the router (+++ lines).

                              ------------
                              | MCTE API |
                              ------------
                                    +
                                    +
                        -------------------------
                        |   Multicast Routing   |
                        -------------------------
                             ^              |
               ------------  |              |  ------------
               |IP|Ctl Msg|  |              |  |IP|Ctl Msg|
               ------------  |              |  ------------
                             |              v
   -----------------     ----------    ----------     -----------------
   |IP|MCTE|Ctl Msg| --> |  MCTE  |    |  MCTE  | --> |IP|MCTE|Ctl Msg|
   -----------------     | Entity |    | Entity |     -----------------
                         ----------    ----------
                                +        +
                                +        +
                                +        +
                            ------------------
                            |   MCTE State   |
                            ------------------

                  Fig. 2 At an intermediate router

   When the next hop (or other intermediate nodes) receives the packet
   with Router Alert, it will be taken out of the forwarding path and
   directed to the MCTE entity since the IP protocol id is IPPROTO_MCTE.

   The MCTE entity allocates/deallocates the resources requested by the
   MCTE message and creates a transient state for the MCTE message,
   called the MCTE state for short.  The appropriate multicast routing
   protocol (MRP), depending on the value of the protocol id in the MCTE
   message, is then invoked.  The exact mechanisms used in the router to
   accomplish this are implementation dependent.

   The MRP creates the forwarding state for the group and forwards the
   join message towards the root.  As in the egress router, the next hop
   towards the root is obtained from an MCTE API.  Since the FEC for
   this control message matches the MCTE state created earlier, the join
   message is diverted to the MCTE entity.  The MCTE entity places the
   corresponding MCTE header on the control message and forwards the
   message to the next hop.  The transient MCTE state is removed at this
   point.
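
   The lifetime of the transient MCTE state at an intermediate router
   can be sketched in Python as below.  The class IntermediateMCTE, the
   dictionary form of the MCTE header, the single bandwidth resource
   and the rule that a router finds its own name in the explicit route
   list to learn the following hop are assumptions of this sketch; the
   draft leaves these details to the implementation.

```python
from typing import Dict, List, Optional


class IntermediateMCTE:
    """Hypothetical MCTE entity of one intermediate router."""

    def __init__(self, name: str, capacity: int) -> None:
        self.name = name
        self.available = capacity
        self.state: Dict[str, dict] = {}     # transient MCTE state by FEC

    def receive(self, header: dict, is_setup: bool) -> str:
        """Process an incoming MCTE message; return the MRP to invoke."""
        amount = header["resources"]["bandwidth"]
        if is_setup:
            if amount > self.available:
                raise RuntimeError("insufficient resources")
            self.available -= amount
        else:
            self.available += amount
        self.state[header["fec"]] = header
        return header["inner_protocol"]

    def next_hop(self, fec: str) -> Optional[str]:
        """MCTE API used by the MRP: the hop after this router, if any."""
        route: List[str] = self.state[fec]["explicit_route"]
        idx = route.index(self.name)
        return route[idx + 1] if idx + 1 < len(route) else None

    def forward(self, fec: str, ctl_msg: bytes) -> dict:
        """Re-attach the MCTE header and drop the transient state."""
        hop = self.next_hop(fec)
        header = self.state.pop(fec)
        return {"next_hop": hop, "header": header, "payload": ctl_msg}
```

   Because the state is removed in forward(), nothing about the FEC
   remains configured on the intermediate router once the join has
   moved on.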

   Note that the FEC is only configured at the egress router (with
   respect to multicast data); intermediate routers are informed of the
   FEC information by previous hops.  Similarly, the explicit or
   constraint route is only configured or computed at the egress router;
   the next hop and other intermediate nodes learn of the explicit
   routes via the explicit route list propagated from the egress router.

5.3 Loops

   If the MPLS control message specifies looping explicit routes:

   * if the tree is uni-directional, only the join message will loop.
   Data will not loop since data flows in only one direction, from the
   root to the members.

   * if the tree is bi-directional, the join message will loop, but
   because permanent states would not be established in this case, data
   will not be forwarded on the looping path.

   However, if there is a change in the next hop towards the root at a
   node where there is already an existing forwarding state, then
   multicast routing protocols which use bi-directional trees or a
   hybrid of uni-directional and bi-directional branches could invoke a
   loop avoidance procedure.  One way to avoid loops in this case (using
   a splice message) is described in [SM] and [MPLS-LOOP-AVOID].
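
   A looping explicit route can be recognised before it is used by
   checking whether any node appears in it twice.  The Python helper
   below is a hypothetical pre-check written for illustration; it is
   not the loop avoidance procedure of [SM] or [MPLS-LOOP-AVOID], and
   the name first_repeated_node does not come from this draft.

```python
from typing import List, Optional


def first_repeated_node(explicit_route: List[str]) -> Optional[str]:
    """Return the first node visited twice, or None if the route is loop free."""
    seen = set()
    for node in explicit_route:
        if node in seen:
            return node
        seen.add(node)
    return None
```

   An egress router could run such a check on a configured or computed
   route and refuse to place a looping route in the MCTE header.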

6.0 Path Selection

   This proposal allows different path selection algorithms to be used,
   depending on the FEC and path selection mechanism association.  Paths
   can be configured, computed, discovered or obtained through other
   means.

   A path selection mechanism will return the constraint routes given,
   for example, the group address, the root of the multicast tree and
   possibly other criteria.  How the paths are selected is independent
   of this proposal, but a generic interface (API) between path
   selection algorithms and this multicast traffic engineering scheme is
   required and is specified in Section 5.1.

7.0 Applications

   This section lists some possible applications of this proposal.

   a) A network operator may define an explicit route [Rx, Ry, Rz]
   towards a domain with prefix 10.0.0.0 for multicast traffic.  Any
   member joining a group where the root address has the prefix 10.0.0.0
   will have data delivered to it via the explicit route [Rz, Ry, Rx]
   (data is in the reverse direction of the join control message).

   This explicit route may be a Loose Source Route, or a route
   calculated by an algorithm, e.g., an Interior Gateway Protocol (IGP)
   which can provide constraint based routes.

   It is worth noting that the explicit route can be the desired path
   from a root towards a member instead of the reverse path (from member
   towards the root).

   b) Another variation of the above may define an additional field of
   interest in the FEC, the ToS.  This will allow a network operator to
   engineer paths and/or provision resources for traffic requiring
   Expedited Forwarding [EF] or Assured Forwarding [AF].  (Refer to
   [MCPROV].)

   c) To decrease fanout, egress routers (where multicast data traffic
   exits) can obtain the constraint routes towards the root of the tree
   and construct the tree along these paths instead.  These routes can
   be statically configured or provided by an algorithm which takes
   fanout into account in route computation; such an algorithm can be
   developed independently of the basic TE scheme described in this
   proposal.

   d) Load Balancing - a load balancing algorithm can provide an
   alternative path that a control message can take depending on the
   service level requirement of the group and the current utilization of
   the equal cost paths.

   e) Policy routing - Different paths may be defined for different
   groups.
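
   Application (d) above could, for example, be served by a path
   selection function such as the following Python sketch.  The path
   records with capacity and used fields, the function name
   pick_least_loaded and the policy of choosing the feasible path with
   the lowest utilization are assumptions for illustration; the draft
   does not prescribe a load balancing algorithm.

```python
from typing import List, Optional


def pick_least_loaded(paths: List[dict],
                      required_bw: int) -> Optional[List[str]]:
    """Choose among equal cost paths the least utilized one that fits.

    Each path is a dict with 'route', 'capacity' and 'used' (same unit
    as required_bw).  Returns None when no path has enough headroom.
    """
    feasible = [p for p in paths
                if p["capacity"] - p["used"] >= required_bw]
    if not feasible:
        return None
    best = min(feasible, key=lambda p: p["used"] / p["capacity"])
    return list(best["route"])
```

   The returned route would then be carried as the explicit route in
   the MCTE header of the join message.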

8.0 Acknowledgments

   The authors are grateful to Dirk Ooms and Yunzhou Li for reviewing
   this draft and for their helpful suggestions to improve this
   proposal, to Jamal Hadi-Salim for his technical advice and to Jon
   Crowcroft for providing insightful comments.

References

   [ARCH] E. Rosen, A. Viswanathan, R. Callon, "Multiprotocol Label
   Switching Architecture", Work in Progress, July 1998.

   [MPLS-TE] D. Awduche et al., "Requirements for Traffic Engineering
   over MPLS", Internet Draft, draft-ietf-mpls-traffic-eng-00.txt,
   October 1998.

   [CRLDP] L. Andersson, A. Fredette, B. Jamoussi, R. Callon, P. Doolan,
   N. Feldman, E. Gray, J. Halpern, J. Heinanen, T. E. Kilty, A. G.
   Malis, M. Girish, K. Sundell, P. Vaananen, T. Worster, L. Wu, R.
   Dantu, "Constraint-Based LSP Setup using LDP", Work in Progress,
   January, 1999.

   [ISIS-TE] H. Smit, T. Li, "ISIS Extensions for Traffic Engineering",
   draft-ietf-isis-traffic-00.txt, Work in Progress.

   [OSPF-TE] D. Katz, D. Yeung, "Traffic Engineering Extensions to
   OSPF", draft-katz-yeung-ospf-traffic-00.txt.

   [TE-RSVP] D. Awduche, L. Berger, D-H. Gan, T. Li, G. Swallow,
   V. Srinivasan, "Extensions to RSVP for LSP Tunnels", Internet Draft,
   draft-ietf-mpls-rsvp-lsp-tunnel-02.txt, September 1999.

   B. Rajagopalan, R. Nair, "Multicast Routing with Resource
   Reservation", Journal of High Speed Networks 7 (1998) 113-139.

   [CBT] Ballardie, Cain, Zhang, "Core Based Tree Multicast Routing",
   Internet Draft, March 1998.

   [PIM-SM] Estrin, Farinacci, Helmy, Thaler, Deering, Handley,
   Jacobson, Liu, Sharma, Wei, "Protocol Independent Multicast-Sparse
   Mode Specification", RFC 2117, June 1997.

   [BGMP] Thaler, Estrin, Meyers, "Border Gateway Multicast Protocol
   Specification", Internet Draft, March 1998.

   [EXPRESS] H. Holbrook, D. Cheriton, Sigcomm paper.

   [SM] Perlman et al, "Simple Multicast", Internet Draft,
   draft-perlman-simple-multicast-02.txt, March 1999.

   [YAM] K. Carlberg, J. Crowcroft, Hipparch 1998.

   [MPLS-LOOP-AVOID] C-Y Lee, L. Andersson, Y. Ohba, "Avoiding Loops in
   MPLS", Internet Draft, draft-leecy-mpls-loop-avoid-00.txt, June 1999.

   [CLARK] D. Clark and J. Wroclawski, "An Approach to Service
   Allocation in the Internet", Internet Draft

   [DSHEAD]  K. Nichols and S. Blake, "Definition of the
   Differentiated Services Field (DS Byte) in the IPv4 and IPv6
   Headers", Internet Draft, May 1998.

   [AF] J. Heinanen, F. Baker, W. Weiss, J. Wroclawski, "Assured
   Forwarding PHB Group", RFC 2597, June 1999.

   [EF] V. Jacobson, K. Nichols, K. Poduri, "Expedited Forwarding Per
   Hop Behavior", RFC 2598, June 1999.

   [MCPROV] C-Y Lee, "Provisioning Resources for Multicast Traffic in a
   Differentiated Services Network", Internet Draft, October 1999.

Authors' Information

   Cheng-Yin Lee
   Nortel Networks
   PO Box 3511, Station C
   Ottawa, ON K1Y 4H7, Canada
   leecy@nortelnetworks.com

   Loa Andersson
   Nortel Networks Inc
   Kungsgatan 34, PO Box 1788
   111 97 Stockholm
   Sweden
   Phone: +46 8 441 78 34
   Mobile: +46 70 522 78 34
   email: loa_andersson@nortelnetworks.com

   Ken Carlberg
   SAIC
   S 1-2-8
   1710 Goodridge Drive
   McLean, VA.  22102
   carlberg@time.saic.com

   Bora Akyol
   Pluris Terabit Network Systems
   10445 Bandley Drive
   Cupertino, CA 95014
   USA
   Phone: (408) 861-3302
   Fax: (408) 863-0271
   email: akyol@pluris.com
