INTERNET DRAFT                          D. Ooms, R. Hoebeke, P. Cheval
draft-ooms-mpls-multicast-te-00.txt                            Alcatel
                                                                 L. Wu
                                                                 Cisco
                                                        February, 2001
                                                  Expires August, 2001


                  MPLS Multicast Traffic Engineering


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


Abstract

   There are several reasons for operators to construct multicast trees
   by means other than multicast routing protocols. This document lists
   these reasons and describes two ways of building a multicast
   traffic-engineered tree: the root-initiated tree and the
   leaf-initiated tree. Finally, it defines extensions to CR-LDP to
   support MPLS multicast traffic engineering.


Table of Contents

   1. Introduction
   2. Motivation
   3. Description
   4. Root-Initiated Traffic Engineered Tree
   5. Leaf-Initiated Traffic Engineered Tree
   6. Extensions for CR-LDP
      6.1. MPLS Multicast Tree ID
      6.2. TLVs
         6.2.1. Explicit Tree Object (EXPLICIT-TREE TLV)
         6.2.2. Tree-Hop Object (ER-HOP TLV extension)
      6.3. Messages
         6.3.1. Label Request Message
         6.3.2. Join Message
      6.4. Multicast FEC Elements
         6.4.1. (*, G) FEC Element
         6.4.2. (S, G) FEC Element
   7. Security Considerations


1. Introduction

   Multicast routing protocols, e.g.
   PIM-SM [PIM-SM], construct shortest-path trees towards the sender or
   a core node (i.e. shortest path in the direction from the receiver
   to the sender or core). The use of these multicast routing protocols
   enables the trees to be highly dynamic: receivers (and senders) can
   join and leave fairly easily without affecting other branches of the
   distribution tree. This comes at the cost of constructing
   non-optimal trees.


2. Motivation

   This draft defines protocol extensions that allow the creation of
   multicast trees by means other than multicast routing protocols. The
   reasons for enabling this are:

   1. Some network operators don't inject BGP routes into their
      backbone (e.g. they create a full mesh of LSPs). The result is
      that SSM [SSM] cannot be deployed, because the intermediate nodes
      in the network do not know where to send the PIM-SM Join messages
      (if the source is in another AS).

   2. Some applications of multicast do not require very dynamic trees,
      e.g. content distribution to proxy servers or subtrees in
      backbone networks. For these applications it becomes worthwhile
      to create more optimal distribution trees. Multicast trees
      constructed by multicast routing protocols are not optimal
      because:

      a. These trees are only shortest-path trees if the paths are
         symmetric. This is a false assumption in the current Internet.

      b. Shortest-path trees themselves are non-optimal. The left of
         Figure 1 depicts the topology of a fictitious network. It is
         assumed that all link metrics are 1. There is one sender S and
         three receivers Ri. The tree constructed by a multicast
         routing protocol is depicted in the middle of Figure 1; it
         consumes resources on 6 links. An optimal Steiner tree is
         depicted at the right in Figure 1; it consumes resources on
         only 4 links.
         The calculation of Steiner trees is known to be an NP-complete
         problem, but several heuristics exist to approximate this type
         of tree [STEINER].

         R------R1             R------R1             R      R1
        /       |             /                             |
       /        |            /                              |
   S--R------R2          S--R------R2          S--R------R2
       \        |            \                              |
        \       |             \                             |
         R------R3             R------R3             R      R3

       topology          shortest-path tree       Steiner tree
                               cost=6                cost=4

                               Figure 1

   3. In a next step - similar to unicast traffic engineering
      [MPLS-TE] - an MPLS multicast tree (a point-to-multipoint LSP)
      can be built which is automatically computed by a suitable entity
      based on QoS and policy requirements, taking into consideration
      the network state. In this case an IGP is needed. Both OSPF
      [CR-OSPF] and IS-IS [CR-ISIS] can be used for this purpose. The
      LSA information is flooded throughout an AS, and the
      point-to-multipoint tree is calculated based on the pruned CR
      topology.


3. Description

   This document does not specify how the trees are calculated. It
   assumes that there is an entity (sender, core, offline tool, ...)
   that calculates the tree. The tree can be built as a result of a
   multicast SLA between a core network and its access network, or can
   be triggered dynamically by PIM Join/Prune messages from the access
   network. The actual mechanisms, processes, and algorithms used to
   trigger and compute explicitly routed trees are beyond the scope of
   this specification (and are not a subject for standardization).

   Once the tree is calculated, MPLS is used to establish the tree in
   the network. This document specifies the extensions to MPLS
   signaling protocols that allow the establishment of pre-calculated
   trees.

   The mechanism in this draft constructs multicast trees directly on
   L2. Thus the mapping of L3 trees onto L2, as described in [MPLS-MC],
   is not needed here.

   When a multicast packet arrives at the root of an MPLS multicast
   tree, after the classification, an MPLS label is imposed on the
   packet.
   Then, at each subsequent hop, the LSR looks up the incoming label in
   its forwarding table, finds all the downstream routers and the
   corresponding outgoing labels, replicates the packet, and forwards
   one copy to each downstream router with its outgoing label.

   The MPLS tree tunnel concept proposed in this draft applies to a
   core network model. It does not apply to the end-host workstation.


4. Root-Initiated Traffic Engineered Tree

   A root-initiated traffic engineered tree is a tree built from a root
   to all the leaves. This tree can be centrally calculated by an NMS
   station and sent over to the root, or it can be calculated by the
   root of the tree itself.

   The tree can be set up, similar to [CR-LDP], in downstream on demand
   ordered control mode. This is achieved by extending the signaling
   protocol with a new explicit tree object which represents the whole
   multicast tree.

   The root of the tree sends a Label Request message along with the
   explicit tree object. The subsequent LSR looks up its downstream
   routers in the explicit tree object of the Label Request message.
   Then it sends the Label Request to these downstream routers. After
   the router receives the Label Mapping messages from the downstream
   routers, it allocates a label itself, installs the
   point-to-multipoint MPLS forwarding entry into the forwarding table
   and sends a Label Mapping message to its upstream router.

   Given that this style of tree creation must carry all of the
   elements of the entire tree in the initial Label Request, and given
   that it is highly undesirable to fragment such requests, this style
   of tree building is primarily applicable to trees with a small
   number of leaves. If a root-driven tree creation is desired for
   large trees, a mechanism will be needed by which the tree can be
   established in several separate requests.

   This tree can be torn down by Label Release messages sent from the
   root to all the leaves.
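   The setup procedure of this section can be illustrated with a small
   simulation. The following is only a sketch under stated assumptions,
   not part of the specification: the explicit tree is the example tree
   of Figure 3 in the depth-first order of Section 6.2.1, the Sub-Tree
   Size is assumed to count the Tree-Hops below a hop (excluding the
   hop itself), all function names are hypothetical, and a single
   shared label counter stands in for the per-LSR label spaces.

```python
from itertools import count

# Figure 3 in depth-first order as (hop, sub_tree_size) pairs. Assumption:
# sub_tree_size counts the Tree-Hops below the hop, excluding the hop itself.
EXPLICIT_TREE = [("A", 6), ("B", 0), ("C", 3), ("E", 0), ("F", 0), ("G", 0), ("D", 0)]


def child_subtrees(tree):
    """Split the hops below tree[0] into one depth-first list per child."""
    subtrees, i = [], 1
    while i < len(tree):
        size = tree[i][1]
        subtrees.append(tree[i:i + 1 + size])
        i += 1 + size
    return subtrees


def label_request(tree, forwarding, labels, is_root=False):
    """Handle a Label Request at tree[0]; return the label mapped upstream."""
    node = tree[0][0]
    # Ordered control: collect a Label Mapping from every downstream router
    # before allocating the local label.
    out_labels = {
        sub[0][0]: label_request(sub, forwarding, labels)
        for sub in child_subtrees(tree)
    }
    in_label = None if is_root else next(labels)  # the root has no upstream
    forwarding[node] = (in_label, out_labels)
    return in_label


def setup_tree(tree):
    """Simulate the whole setup; returns {node: (in_label, {child: label})}."""
    forwarding = {}
    labels = count(16)  # simplification: one shared counter for all LSRs
    label_request(tree, forwarding, labels, is_root=True)
    return forwarding
```

   Calling setup_tree(EXPLICIT_TREE) yields one point-to-multipoint
   entry per node; each LSR only needs the slice of the explicit tree
   that describes its own subtree, which is what it would forward in
   the Label Request to each downstream router.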
   When a node receives a Label Release message, it removes the MPLS
   forwarding entry from the forwarding table, and sends a Label
   Release message to every downstream router.


5. Leaf-Initiated Traffic Engineered Tree

   A leaf-initiated traffic engineered tree is a tree built from the
   leaves to the root. One needs to distinguish two cases:

   1. The tree can be centrally calculated by an NMS station and the
      reverse path of the root to each leaf is sent to the leaf. Note
      that in this case the pruning of one leaf and the subsequent new
      tree calculation by the NMS station can affect other branches
      than the one of the removed leaf.

   2. The reverse path from the leaf to the root can also be calculated
      by the leaf of the tree, based on e.g. CR-IGP information.

   Each leaf node sends a Join message (note that this is not a
   multicast routing Join message, but an extension to an MPLS
   signaling protocol) with the explicit reverse path and an MPLS label
   towards the root. At the subsequent upstream router, the Join
   messages of the same tree are merged, a label is allocated, the
   point-to-multipoint forwarding entry is installed into the
   forwarding table, and a Join with the newly allocated MPLS label and
   the explicit reverse path object is sent to the upstream router.

   When a leaf node wants to leave the group, it sends a Label Withdraw
   to its upstream router. When all the downstream neighbors of a
   router have left the group, it should send a Label Withdraw to its
   upstream neighbor.

   When a Join reaches an on-tree router, the router processes the
   Join, modifies the forwarding entry with the label assigned by the
   newly joined downstream router and finishes the join procedure.


6. Extensions for CR-LDP

6.1. MPLS Multicast Tree ID

   An MPLS multicast tree id is used to uniquely identify a multicast
   tree. The LSPID field can be used to represent the MPLS multicast
   tree id value.
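   Because the tree id identifies the tree, an LSR can key its merge
   state on it. The following is a minimal sketch of the Join and Label
   Withdraw handling described in Section 5, under assumptions that are
   not part of this specification: the class Lsr, its method names and
   the dictionary layout are hypothetical, and message transport is
   left out.

```python
from itertools import count


class Lsr:
    """Hypothetical LSR holding point-to-multipoint state per tree id."""

    def __init__(self, name, first_label=16):
        self.name = name
        self._labels = count(first_label)
        # tree_id -> {"in_label": int, "out": {downstream neighbor: label}}
        self.entries = {}

    def on_join(self, tree_id, downstream, label):
        """Merge a Join; return the label to send upstream, or None."""
        entry = self.entries.get(tree_id)
        if entry is not None:
            # Already on the tree: add the new branch, the Join stops here.
            entry["out"][downstream] = label
            return None
        in_label = next(self._labels)
        self.entries[tree_id] = {"in_label": in_label, "out": {downstream: label}}
        return in_label  # carried in the Join sent to the upstream router

    def on_withdraw(self, tree_id, downstream):
        """Remove a branch; return True if a Label Withdraw goes upstream."""
        entry = self.entries[tree_id]
        entry["out"].pop(downstream)
        if entry["out"]:
            return False  # other downstream neighbors are still joined
        del self.entries[tree_id]
        return True
```

   In this sketch the first Join for a tree allocates a label and is
   propagated upstream, later Joins for the same tree id only extend
   the existing forwarding entry, and the last Label Withdraw removes
   the entry and is propagated upstream.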
   The semantics of the LSPID are specified in [CR-LDP].

6.2. TLVs

6.2.1. Explicit Tree Object (EXPLICIT-TREE TLV)

   The EXPLICIT-TREE TLV is an object that specifies the tree to be
   established (Figure 2). It is composed of one or more ER-HOP TLVs,
   which represent Tree-Hop objects.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|0|        Type = TBA         |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         ER-HOP TLV 1                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         ER-HOP TLV 2                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                         ............                          ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         ER-HOP TLV n                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Figure 2 EXPLICIT-TREE TLV

   The U bit and F bit are defined in [LDP] and are both cleared.

   Type:
      A fourteen-bit field carrying the value of the EXPLICIT-TREE TLV
      Type = TBA.

   Length:
      Specifies the length of the value field in bytes.

   ER-HOP TLVs:
      One or more ER-HOP TLVs as defined in Section 6.2.2 (extension to
      the ER-HOP of [CR-LDP]). Each ER-HOP TLV represents a node (a
      Tree-Hop) in the tree, and the TLVs appear in depth-first order
      in the message. Figure 3 gives an example:

                    A
                    |
            +-------+----------+
            |       |          |
            B       C          D
                    |
               +----+------+
               |    |      |
               E    F      G

                               Figure 3

   This tree can be written as {A,{B,C,{E,F,G},D}}, and its Tree-Hops
   are ordered in the EXPLICIT-TREE TLV as (A, B, C, E, F, G, D).

6.2.2. Tree-Hop Object (ER-HOP TLV extension)

   The Tree-Hop is an object that represents a node that is part of the
   explicit tree (Figure 4). Instead of using a new TLV, the existing
   ER-HOP TLVs (ER-HOP 1 to 4) can be re-used with a new field in part
   of the reserved space. The extended ER-HOP-1 TLV [CR-LDP] is
   depicted in Figure 4:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|0|        Type=0x0801        |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |L|        Sub-Tree Size        |   Reserved    |    PreLen     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         IPv4 address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                               Figure 4

   The U bit and F bit are defined in [LDP] and are both cleared.

   Type:
      A fourteen-bit field carrying the value of the ER-HOP-1 TLV type
      (0x0801).

   Length:
      Specifies the length of the value field in bytes (8).

   Sub-Tree Size:
      This field contains the number of Tree-Hop objects under this
      subtree.

   L bit:
      The L bit in the Tree-Hop is a one-bit attribute. If the L bit is
      set, then the value of the attribute is "loose." Otherwise, the
      value of the attribute is "strict." For brevity, we say that if
      the value of the Tree-Hop attribute is loose then it is a "loose
      Tree-Hop." Otherwise, it is a "strict Tree-Hop." Further, we say
      that the abstract node of a strict or loose Tree-Hop is a strict
      or a loose node, respectively. Loose and strict nodes are always
      interpreted relative to their prior abstract nodes.

      The path between a strict node and its prior node MUST include
      only network nodes from the strict node and its prior abstract
      node.

      The path between a loose node and its prior node MAY include
      other network nodes, which are not part of the strict node or its
      prior abstract node.

   The other ER-HOP TLVs can be extended in a similar way.

6.3. Messages

6.3.1. Label Request Message

   For root-initiated explicit trees, the Label Request described in
   [CR-LDP] is changed to carry an EXPLICIT-TREE TLV instead of an ER
   TLV. So the format of the Label Request message is as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|   Label Request (0x0401)    |        Message Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Message ID                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            FEC TLV                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 LSPID TLV (CR-LDP, mandatory)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             EXPLICIT-TREE TLV (CR-LDP, mandatory)             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Traffic TLV (CR-LDP, optional)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Pinning TLV (CR-LDP, optional)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Resource Class TLV (CR-LDP, optional)             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              Pre-emption TLV (CR-LDP, optional)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                               Figure 5

6.3.2. Join Message

   The Join message is a new CR-LDP message that allows leaf-initiated
   tree construction. It is sent from downstream routers towards the
   root. It is used in downstream unsolicited ordered control mode. It
   contains the explicit path from the leaf towards the root. The
   format of the Join message is as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|         Join (TBA)          |        Message Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Message ID                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            FEC TLV                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Label TLV                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 LSPID TLV (CR-LDP, mandatory)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  ER TLV (CR-LDP, mandatory)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Traffic TLV (CR-LDP, optional)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Resource Class TLV (CR-LDP, optional)             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              Pre-emption TLV (CR-LDP, optional)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                               Figure 6

6.4. Multicast FEC Elements

   If the above trees are constructed to carry IP multicast traffic,
   additional FEC elements need to be defined.

6.4.1. (*, G) FEC Element

   The encoding for a (*, G) FEC is:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | FEC El. Type  |        Address Family         |    Length     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Multicast Group Address                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                               Figure 7

   FEC Element Type:
      TBA for (*, G).

   Address Family:
      Two octet quantity containing a value from ADDRESS FAMILY NUMBERS
      in [rfc1700] that encodes the address family of the Multicast
      Group Address field.

   Length:
      Length of the Multicast Group Address in octets.

   Multicast Group Address:
      The multicast group address G encoded according to the Address
      Family field.

6.4.2. (S, G) FEC Element

   The encoding for a (S, G) FEC is:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | FEC El. Type  |        Address Family         |    Length     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Source Address                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Multicast Group Address                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                               Figure 8

   FEC Element Type:
      TBA for (S, G).

   Address Family:
      Two octet quantity containing a value from ADDRESS FAMILY NUMBERS
      in [rfc1700] that encodes the address family of the Source
      Address and Multicast Group Address fields.

   Length:
      Sum of the lengths of the Source and Multicast Group addresses in
      octets.

   Source Address:
      The source address S encoded according to the Address Family
      field.

   Multicast Group Address:
      The multicast group address G encoded according to the Address
      Family field.


7. Security Considerations

   Security considerations will be addressed in a future revision of
   this document.


References

   [CR-ISIS] "IS-IS extensions for Traffic Engineering", Tony Li, Henk
             Smit, work in progress, Internet Draft, September 2000.

   [CR-LDP]  "Constraint-Based LSP Set up Using LDP", Bilel Jamoussi,
             et al., work in progress, Internet Draft, July 2000.

   [CR-OSPF] "OSPF Extensions for Traffic Engineering", Derek M. Yeung,
             work in progress, Internet Draft, September 2000.

   [LDP]     "LDP Specification", L. Andersson, et al., RFC 3036.

   [MPLS-MC] "Framework for IP Multicast in MPLS", D. Ooms, et al.,
             work in progress, Internet Draft, January 2001.

   [MPLS-TE] "Requirements for Traffic Engineering Over MPLS", Daniel O.
             Awduche, et al., RFC 2702.

   [PIM-SM]  "Protocol Independent Multicast-Sparse Mode (PIM-SM)", B.
             Fenner, et al., work in progress, Internet Draft, July
             2000.

   [SSM]     "Source-Specific Multicast", H. Holbrook, B. Cain, work in
             progress, Internet Draft, November 2000.

   [STEINER] "The Steiner Tree Problem (Annals of Discrete Mathematics
             53)", Frank K. Hwang, Dana S. Richards, Pawel Winter, ISBN
             0-444-89098-X, Elsevier Science.


Authors' Addresses

   Dirk Ooms
   Alcatel
   Fr. Wellesplein 1, 2018 Antwerpen, Belgium.
   Phone : 32 3 2404732
   E-mail: Dirk.Ooms@alcatel.be

   Rudy Hoebeke
   Alcatel
   Fr. Wellesplein 1, 2018 Antwerpen, Belgium.
   Phone : 32 3 2408439
   E-mail: Rudy.Hoebeke@alcatel.be

   Pierrick Cheval
   Alcatel
   5 rue Noel Pons
   92734 Nanterre Cedex
   France
   Phone : 01 4652 4027
   Fax   : 01 4652 4795
   E-mail: Pierrick.Cheval@space.alcatel.fr

   Liwen Wu
   Cisco Systems
   250 Apollo Drive, Chelmsford, MA 01824, USA
   Phone : +1 (978) 244-3087
   E-mail: liwwu@cisco.com