MPLS Working Group Seisho Yasukawa (NTT) - Editor Internet Draft Alan Kullberg (Motorola) - Editor Expiration Date: March 2004 October 2003 Extended RSVP-TE for Point-to-Multipoint LSP Tunnels Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Yasukawa, et. al. [Page 1] Internet Draft draft-yasukawa-mpls-rsvp-p2mp-03.txt October 2003 Contributors Contributors are listed by alphabetical order. Dean Cheng Cisco Systems Inc. 170 W Tasman Dr. San Jose, CA 95134 Phone 408 527 0677 Email: dcheng@cisco.com Markus Jork Avici Systems 101 Billerica Avenue N. Billerica, MA 01862 Phone: +1 978 964 2142 EMail: mjork@avici.com Hisashi Kojima NTT Network Service Systems Laboratories, NTT Corporation 9-11, Midori-Cho 3-Chome Musashino-Shi, Tokyo 180-8585 Japan Phone: +81 422 59 6070 EMail: kojima.hisashi@lab.ntt.co.jp Dimitri Papadimitriou Alcatel Francis Wellesplein 1, B-2018 Antwerpen, Belgium Phone: +32 3 240-8491 Email: Dimitri.Papadimitriou@alcatel.be Andrew G. 
Malis Tellabs 2730 Orchard Parkway San Jose, CA 95134 Phone: +1 408 383 7223 Email: Andy.Malis@tellabs.com Koji Sugisono NTT Network Service Systems Laboratories, NTT Corporation 9-11, Midori-Cho 3-Chome Musashino-Shi, Tokyo 180-8585 Japan Phone: +81 422 59 2605 EMail: sugisono.koji@lab.ntt.co.jp Yasukawa, et. al. [Page 2] Internet Draft draft-yasukawa-mpls-rsvp-p2mp-03.txt October 2003 Masanori Uga NTT Network Service Systems Laboratories, NTT Corporation 9-11, Midori-Cho 3-Chome Musashino-Shi, Tokyo 180-8585 Japan Phone: +81 422 59 4804 EMail: uga.masanori@lab.ntt.co.jp JP Vasseur Cisco Systems, Inc. 300 Beaver Brook Road Boxborough , MA - 01719 USA Email: jpv@cisco.com Yasukawa, et. al. [Page 3] Internet Draft draft-yasukawa-mpls-rsvp-p2mp-03.txt October 2003 0. Summary for Sub-IP Area (This section to be removed before publication.) 0.1. Summary This document specifies extensions and mechanisms to RSVP-TE in support of MPLS point-to-multipoint LSPs. 0.2. Where does it fit in the Picture of the Sub-IP Work This work fits squarely in the MPLS box. 0.3. Why is it Targeted at this WG This draft is targeted at the MPLS WG, because this draft specifies the extensions to RSVP-TE signalling protocol in support of MPLS point-to-multipoint LSPs creation/deletion/modification signalling. 0.4. Justification In this draft the protocol extensions will be defined allowing the creation and deletion of point-to-multipoint trees by other means than pure multicast routing protocols. The reasons for enabling this are: 1. Some network operators don't inject BGP routes into their backbone (e.g. they create a full mesh of LSPs). The result is that SSM can not be deployed, because the intermediate nodes in the network do not know where to send the PIM-SM Join messages to (if the source is in another AS). 2. Some applications of point-to-multipoint do not require very dynamic trees, e.g. content distribution to proxy servers or subtrees in backbone networks. 
For these applications it becomes worthwhile to create more optimal distribution trees. Point-to-multipoint trees constructed by multicast routing protocols are not optimal because:

   a. These trees are shortest-path trees only if the paths are symmetric. This is a false assumption in the current Internet.

   b. Shortest-path trees themselves are non-optimal. On the other hand, the calculation of an optimal Steiner tree is known to be an NP-complete problem requiring the use of heuristics.

3. In a next step, similar to unicast traffic engineering [9], an MPLS point-to-multipoint tree (a point-to-multipoint LSP) can be built that is automatically computed by a suitable entity based on QoS and policy requirements, taking the network state into consideration. In this case an IGP is needed. Both OSPF [14] and IS-IS [21] can be used for this purpose. The LSA information is flooded throughout an AS, and the point-to-multipoint tree is then calculated based on the resulting topology.

0.5. Related I-d's

- draft-poj-optical-multicast-02.txt (expired)
  This i-d addresses the specifics of optical point-to-multipoint connections. While it focuses on non-packet environments only, it is the early precursor of the Tree ERO object and the related mechanisms developed in the present i-d.

- draft-ooms-mpls-multicast-te-01.txt (expired)
  This i-d describes two ways of building a multicast traffic-engineered tree: root-initiated tree and leaf-initiated tree. It also proposed extensions to MPLS (CR-LDP) signalling for setting up and tearing down traffic engineered multicast trees.

- draft-cheng-mpls-rsvp-multicast-er-00.txt (expired)
  This i-d introduces RSVP-TE's explicit route capability in support of multicast applications and, more generally, of point-to-multipoint LSPs. The proposed mechanisms were also intended for point-to-multipoint applications in non-packet LSP capable networks.
First i-d having clarified the processing of the Tree ERO object and initiate corresponding RSVP-TE message processing. - draft-chung-mpls-rsvp-multicasting-00.txt (expired) This i-d addresses extensions to the Resource Reservation Protocol (RSVP) to support MPLS multicast. The concepts developed are in accordance to the traffic engineering (TE) attributes provided in the MPLS-TE specifications. This i-d introduces the concept of Leaf initiated Join/Leave procedures using RSVP-TE. Yasukawa, et. al. [Page 5] Internet Draft draft-yasukawa-mpls-rsvp-p2mp-03.txt October 2003 Table of Contents 1. Introduction ................................................... 7 2. Terminology and conventions .................................... 7 2.1 Terminology ................................................ 7 2.2 Conventions ................................................ 9 3. Applicability .................................................. 9 4. Architecture ...................................................10 4.1 P2MP LSP tunnels ...........................................10 4.2 Calculation of P2MP tree route .............................11 4.3 P2MP LSP establishment, teardown, and modification mechanisms .................................................11 4.4 Basic operation of P2MP LSP tunnels ........................11 4.5 P2MP session ...............................................13 4.5.1 P2MP session object ...................................13 4.5.2 P2MP_LSP_TUNNEL_IPv4 session object ...................13 4.5.3 P2MP_LSP_TUNNEL_IPv6 session object ...................14 4.6 Explicit routing ...........................................14 4.6.1 Tree Explicit Route Object (TERO) .....................14 4.6.2 Tree Record Route Object (TRRO) .......................21 4.6.3 Message Size ..........................................25 5. 
Sender-initiated P2MP LSP establishment ........................25 5.1 Sender-initiated P2MP LSP establishment mechanism ..........25 5.2 Processing of TERO and TRRO for sender-initiated P2MP LSP establishment .....................................26 5.3 Teardown mechanism .........................................28 5.4 Path/Resv error ............................................29 5.5 Message format .............................................29 5.5.1 Path message format ...................................29 5.5.2 Resv message format ...................................30 5.5.3 Other RSVP message formats.............................30 6. Sender-initiated grafting/pruning mechanism ....................31 6.1 Sender-initiated grafting mechanism ........................31 6.2 Graft Error ................................................32 6.3 Sender-initiated pruning mechanism .........................34 7. Error Codes and Error Value Sub-Codes ..........................35 8. Application for Traffic Engineering ............................35 8.1 Rerouting Traffic Engineered P2MP Tunnels ..................35 8.2 Re-establishment of subtree ................................35 9. Differences between RSVP Multicasting and P2MP TE Tunnels ......35 10. Security Considerations........................................36 11. References ....................................................36 12. Author's Addresses ............................................38 Yasukawa, et. al. [Page 6] Internet Draft draft-yasukawa-mpls-rsvp-p2mp-03.txt October 2003 1. Introduction Point-to-multipoint (P2MP) technology will become increasingly important with the dissemination of new, real-time applications, such as content delivery services and video conferences, which require P2MP real-time transmission capability with much more bandwidth and stricter QoS than non-real-time applications. 
This document defines RSVP-TE [1] protocol extensions in order to establish, maintain, and tear down a P2MP label switched path (LSP) [22]. The use of label switching routers (LSRs) with these extensions allows service providers to offer services that utilize point-to-point (P2P) and/or P2MP multiprotocol label switching (MPLS) in the same service network. These RSVP-TE protocol extensions are very flexible and can be used to carry protocols other than IP multicasting, e.g., Ethernet, PPP, and SONET/SDH. No assumption is made about the format of the data to be carried in the signaled LSP.

2. Terminology and conventions

2.1 Terminology

The reader is assumed to be familiar with the terminology in [1], [2], [3] and [4].

P2P:
   Point-to-point

P2MP:
   Point-to-multipoint

P2MP LSP:
   A label switched path that has one unique ingress LSR (also referred to as the root) and more than one egress label switching router.

Sender node:
   Headend of the P2MP LSP. It controls the P2MP LSP layout and places traffic onto it by pushing a label.

Branch LSR:
   A label switching router (LSR) that has more than one downstream LSR. A branch LSR receives a single MPLS frame, replicates it, and sends a copy to each downstream interface.

Graft node:
   An LSR that is already a member of the P2MP tree and is in the process of signaling a new subtree.

Prune node:
   An LSR that is already a member of the P2MP tree and is in the process of tearing down an existing subtree.

Explicit Route Object (ERO):
   An object specifying the explicit path of the message that includes the object.

Tree Explicit Route Object (TERO):
   An extended ERO for describing the P2MP tree topology.

Record Route Object (RRO):
   An object recording the route through which the message that includes the object has passed.

Tree Record Route Object (TRRO):
   An extended RRO for recording the route along which each merged message has traveled.

Subtree:
   A subtree is a portion of a P2MP tree starting at a particular LSR that is a member of the P2MP tree and includes ALL downstream nodes that are also members of the P2MP tree.

2.2 Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119 [5].

3. Applicability

This document defines extensions to RSVP-TE [1], which defines traffic engineering support in MPLS networks using RSVP as defined in [2] as the basis. An RSVP session as defined in [2] can be either a unicast session, where there is a single sender and a single receiver, or a multicast session, where there is a single sender and multiple receivers. In either case, the sender does not explicitly specify how the session is going to be routed in the network. One of the key enhancements made in [1] in contrast to [2] is that the route of a point-to-point RSVP session can be explicitly specified by using the ERO, which consists of a hop-by-hop routing path from the sender to the receiver. This enhancement is intended to meet the traffic engineering requirements to be fulfilled by the established LSP with respect to the capabilities and the resource usage of the MPLS TE capable network.

Note that LSPs as defined in [1] are point-to-point in nature and, in fact, there is text that explicitly states that multicast support is for further study. This document intends to fill this gap by expanding the semantics of the ERO and RRO as defined in [1] for traffic engineering associated with point-to-multipoint LSPs as part of the MPLS technology.
Note that the difference between a point-to-multipoint LSP as defined in this document and a multicast session as defined in [2] is similar to the difference between an MPLS LSP as defined in [1] and a single RSVP session as defined in [2]: the former requires traffic engineering as defined in the scope of MPLS, while the latter does not. In brief, as [1] covers RSVP extensions and mechanisms for constraint-based P2P LSPs, this document covers RSVP extensions and mechanisms for constraint-based P2MP LSPs.

The technology as defined in this document can be used in MPLS networks where point-to-multipoint tunnels are needed, including applications such as news broadcasting.

The Generalized MPLS (GMPLS) architecture [10] and the GMPLS signalling extensions to RSVP [17] have further enhanced the MPLS protocols for use in non-IP networks, along with a set of new code points for RSVP. One key enhancement introduced in [17] is the definition of RSVP mechanisms allowing the separation of the control plane and data plane. This enables RSVP to be used as the signaling protocol for both IP and non-IP networks, with a set of technology-specific extensions and code points defined in dedicated documents such as [18], which specifies these extensions and code points for SONET/SDH. It is the authors' intention that the technology as defined in this document applies to non-IP/MPLS networks, where LSPs can be established using generalized labels accommodating TDM, LSC and FSC interfaces.
In general, all the code points defined in [17] and [18] are applicable to point-to-multipoint LSPs in the relevant sense, except that the ERO and RRO are replaced by the TERO and TRRO with their associated processing rules, as defined in this document.

4. Architecture

4.1 P2MP LSP tunnels

This protocol defines a "P2MP flow" by a label that is assigned to a set of packets at the ingress node of a P2MP LSP. Such a P2MP LSP is referred to as a "P2MP LSP tunnel" because the traffic through it is opaque to intermediate nodes along the LSP.

To enable the identification and association of such P2MP LSP tunnels, new P2MP_LSP_TUNNEL session objects are defined. Each session object carries the sender node address of a P2MP LSP and its tunnel ID. The sender node address is preferable to destination (leaf) node addresses for identifying a P2MP LSP tunnel because frequent topological changes may occur and destination nodes may be added to and removed from the tree over time. To express the leaf nodes and topology of this P2MP LSP tunnel, the TREE_EXPLICIT_ROUTE object and the TREE_RECORD_ROUTE object are also defined. This protocol uses the conventional SENDER_TEMPLATE (or FILTER_SPEC) object to convey the sender node address and LSP ID. Therefore, the P2MP_LSP_TUNNEL session object together with the SENDER_TEMPLATE object uniquely identifies a P2MP LSP tunnel.

It is very useful to associate sets of LSP tunnels which share the same sender node in a P2MP application. This can be useful during rerouting operations or to spread a traffic trunk over multiple P2MP paths [1]. In traffic engineering such sets are called P2MP traffic engineered tunnels (P2MP TE tunnels). As with unicast TE tunnels, these P2MP TE tunnels are uniquely identified by the SESSION objects.

4.2 Calculation of P2MP tree route

The calculation for a P2MP tree requires two major pieces of information.
The first is the routing path from the source node of a P2MP tree to each of the leaf nodes, and the second is the traffic engineering related parameters, such as bandwidth, on each TE link along the routing path. Note that this requirement is exactly the same as for calculating a P2P RSVP LSP per RSVP-TE [1], except that with P2MP there are multiple destination nodes. Both pieces of information required for calculating a P2P LSP are generally obtained by running IP routing protocols with TE extensions, including OSPF-TE [14] and ISIS-TE [21], and this in general also applies to calculating a P2MP LSP. However, other mechanisms and protocols may also be used for this purpose; these are outside the scope of this document.

4.3 P2MP LSP establishment, teardown, and modification mechanisms

This P2MP MPLS protocol supports the sender-initiated P2MP LSP tunnel establishment mechanism. It has i) P2MP LSP tunnel establishment and teardown mechanisms and ii) partial P2MP LSP (subtree LSP) tunnel establishment, teardown, and modification mechanisms.

4.4 Basic operation of P2MP LSP tunnels

This section explains the basic P2MP LSP tunnel establishment mechanism. Figure 1 shows the basic sender-initiated P2MP LSP tunnel establishment mechanism.

To create a P2MP LSP tunnel, the sender node creates a Path message with a TERO in addition to a LABEL_REQUEST object, a SESSION object, and a SENDER_TEMPLATE object. The TERO describes the P2MP tree to be set up. The Path message is forwarded towards its destinations along the P2MP path specified by the TERO. Each intermediate node along the path records the TERO, SESSION object, and SENDER_TEMPLATE object in its path state block. An intermediate branch node copies the Path message to each next hop node specified in the TERO until the destination leaf nodes are reached.

             Path       Path       Path
            ----->     ----->     ----->       .
Ingress LSR <-----     <-----     <-----      .
 (Sender)    Resv       Resv       Resv      .
   +---+      +---+      +---+      +---+
   | S |------| N |------| N |------| N |.....
   +---+      +---+      +---+      +---+
               ^|         ^|
           Resv||Path Resv||Path
               ||         ||
               |V         ||    Path
             +---+        ||   ----->
             | N |.....   ||   <-----   Egress LSR
             +---+        |V    Resv      (Leaf)
               .        +---+     +---+
               .        | N |-----| L |
               .        +---+     +---+
               .          .         .

      Figure 1: Fundamental P2MP LSP establishment mechanism

When the Path message reaches the destination leaf node, the leaf node responds to the LABEL_REQUEST by including a LABEL object in its response Resv message. The Resv message is then sent back upstream toward the sender node, following the path state created by the Path message in reverse order. Each node that receives a Resv message containing a LABEL object uses that label for outgoing traffic associated with this tunnel. If the node is a branch node, it merges all the Resv messages coming from downstream leaf nodes. After merging the downstream Resv messages, it allocates a new label and places it in the corresponding LABEL object of the Resv message, which it sends upstream to the previous hop (PHOP). Finally, the merged Resv message propagates upstream to the sender node. Thus, a P2MP LSP is established. This label assignment is performed in downstream on-demand ordered control mode.

4.5 P2MP session

4.5.1 P2MP session object

A new SESSION object is defined for P2MP LSP tunnels. To identify a P2MP tunnel, P2MP_LSP_TUNNEL_IPv4 and P2MP_LSP_TUNNEL_IPv6 are added to the SESSION object as new C-Types. Each of them carries a tunnel sender address and a Tunnel ID.
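As a minimal sketch, the P2MP_LSP_TUNNEL_IPv4 SESSION body introduced in Section 4.5.1 and laid out in Section 4.5.2 (a 4-byte sender address, 16 zero bits, and a 16-bit Tunnel ID) could be packed and parsed as shown below. The function names are hypothetical and exist only for this illustration. The sketch covers the object body only and deliberately omits the common RSVP object header (Length, Class-Num, C-Type), since no C-Type values are assigned in the text above.

```python
import ipaddress
import struct


def pack_p2mp_session_ipv4(sender: str, tunnel_id: int) -> bytes:
    """Pack the 8-byte P2MP_LSP_TUNNEL_IPv4 SESSION body (hypothetical helper)."""
    if not 0 <= tunnel_id <= 0xFFFF:
        raise ValueError("Tunnel ID must fit in 16 bits")
    address = ipaddress.IPv4Address(sender).packed  # 4 bytes, network order
    # 16 bits that MUST be zero, followed by the 16-bit Tunnel ID.
    return address + struct.pack("!HH", 0, tunnel_id)


def unpack_p2mp_session_ipv4(body: bytes):
    """Return (sender address, Tunnel ID) from an 8-byte SESSION body."""
    if len(body) != 8:
        raise ValueError("P2MP_LSP_TUNNEL_IPv4 body must be 8 bytes")
    _must_be_zero, tunnel_id = struct.unpack("!HH", body[4:])
    return str(ipaddress.IPv4Address(body[:4])), tunnel_id
```

For example, pack_p2mp_session_ipv4("192.0.2.1", 7) yields the bytes C0 00 02 01 00 00 00 07. The IPv6 variant would differ only in carrying a 16-byte sender address.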
4.5.2 P2MP_LSP_TUNNEL_IPv4 session objects

Class = SESSION, C-Type = P2MP_LSP_TUNNEL_IPv4

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   IPv4 tunnel sender address                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          MUST be zero         |           Tunnel ID           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   IPv4 tunnel sender address

      IPv4 address of the tunnel sender

   Tunnel ID

      A 16-bit identifier used in the SESSION that remains constant over the life of the tunnel.

4.5.3 P2MP_LSP_TUNNEL_IPv6 session objects

Class = SESSION, C-Type = P2MP_LSP_TUNNEL_IPv6

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                   IPv6 tunnel sender address                  |
   +                                                               +
   |                           (16 bytes)                          |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          MUST be zero         |           Tunnel ID           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   IPv6 tunnel sender address

      IPv6 address of the tunnel sender

   Tunnel ID

      A 16-bit identifier used in the SESSION that remains constant over the life of the tunnel.

4.6 Explicit routing

4.6.1 Tree Explicit Route Object (TERO)

4.6.1.1 Overview

Explicit routing is the main function of RSVP-TE. RSVP-TE defines an ERO to describe the explicit route of the tunnel. An ERO is defined as a list of subobjects. Each subobject represents information about nodes composing the data path. For P2MP, the path topology is a tree. So the ERO is extended to express a P2MP tree and a new Tree Explicit Route Object (TERO) is defined.

TERO is included in a message concerning path establishment like the Path message. It specifies nodes of a tree in which the message is traversed.
In particular, TERO of a Path message initiated by the sender node contains the whole topology of the P2MP tree.

Before a node sends a TERO in an outgoing Path message, it MUST split the TERO such that only hops describing the subtree towards which the Path is sent are included in the TERO. The result is that downstream nodes do not have a complete picture of the P2MP tree. Instead, each node receives a TERO containing a subtree with the receiving node as the root of that subtree. By not sending the entire TERO downstream, downstream nodes will use less memory for storing the state information for the LSP.

TERO is a list of subobjects. Each subobject shows information about a node on the P2MP tree. To describe the tree topology, each subobject should be sorted to be able to show link connectivity. For this purpose, the subobjects are arranged in depth-first-order and have a number of hops from the sender node. For the P2MP tree shown in figure 2 below, the TERO is encoded as follows:

   T={A(0,4),B(1,4),C(2,4),D(3,4),E(3,4),F(2,4),G(2,4),H(3,4),I(1,4),
      J(1,4),K(2,4),L(2,4)}

               A
               |
          +----+----+
          |    |    |
          B    I    J
          |         |
      +---+---+   +-+-+
      |   |   |   |   |
      C   F   G   K   L
      |       |
    +-+-+     |
    |   |     |
    D   E     H

      Figure 2: TERO corresponding to a P2MP tree

Throughout this document, TERO hops are shown in the following format:

   N(d,s)

   where N == Node identifier
         d == distance
         s == subtree id

For example, C(2,4) means the hop specifies node C, 2 hops from the sender node, with a subtree id of 4.

The subtree id field is used by downstream nodes to determine if the TERO has changed without the need to compare the entire object. When the sender node makes a change in the TERO, then the subtree id field is changed in each TERO hop along the series of hops towards the graft or prune node.
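The depth-first ordering and the splitting rule described in this section can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not part of the protocol definition: a TERO hop is modeled as a hypothetical (node, distance, subtree id) tuple, the tree of Figure 2 is given as an adjacency mapping, and the names FIGURE_2, encode_tero and split_tero are invented for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical model of one TERO hop N(d,s): (node, distance, subtree id).
Hop = Tuple[str, int, int]

# The P2MP tree of Figure 2: each node maps to its ordered downstream nodes.
FIGURE_2: Dict[str, List[str]] = {
    "A": ["B", "I", "J"],
    "B": ["C", "F", "G"],
    "C": ["D", "E"],
    "G": ["H"],
    "J": ["K", "L"],
}


def encode_tero(tree: Dict[str, List[str]], sender: str, subtree_id: int) -> List[Hop]:
    """List the hops in depth-first order with distances from the sender."""
    hops: List[Hop] = []

    def visit(node: str, distance: int) -> None:
        hops.append((node, distance, subtree_id))
        for child in tree.get(node, []):
            visit(child, distance + 1)

    visit(sender, 0)
    return hops


def split_tero(hops: List[Hop], next_hop: str) -> List[Hop]:
    """Keep only the hops of the subtree rooted at next_hop.

    In depth-first order, a subtree is the root hop followed by every
    consecutive hop with a strictly larger distance. Distances are left
    unchanged because they stay relative to the sender node.
    """
    for start, (node, distance, _) in enumerate(hops):
        if node == next_hop:
            end = start + 1
            while end < len(hops) and hops[end][1] > distance:
                end += 1
            return hops[start:end]
    raise KeyError(next_hop)
```

With these definitions, encode_tero(FIGURE_2, "A", 4) reproduces the hop list T given above, and split_tero applied to that list with next hop "B" returns the hops B(1,4) through H(3,4), which is the TERO that node A would place in the Path message sent towards B.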
4.6.1.2 Data Terminating Nodes

In some cases, a P2MP tree may be calculated that contains a branch node that must also behave as a leaf node with respect to the data plane (i.e., in the case of locally attached client nodes) [22]. Allowing a node to be both a branch and a leaf at the same time may enable the calculation of otherwise impossible and/or more optimal trees.

For example, consider the network topology shown in figure 3. Node A is the sender node, and the requested leaf nodes are B, C, E, and F. Without allowing C* and E* to be both a branch and a leaf, no tree would be possible.

   A | B---C*--D--E*--F

      Figure 3: Tree topology including a node which becomes a branch and leaf

In order for a node in the tree to know that it must terminate the data locally, an explicit designator is placed in the TERO hop object. When it is set, a node SHOULD terminate data locally, as well as forward the data to other downstream nodes if any exist.

4.6.1.3 Strict and loose explicit routes

Two kinds of explicit route are defined. In a strict explicit route, the nodes composing the route are specified strictly. A loose explicit route allows an external routing algorithm to decide the route between specified nodes. Unicast routing algorithms such as OSPF[15] and IS-IS[16] are examples of such algorithms. With loose explicit routing, the route from the local node to the specified loose node is computed and selected at each loose link of the tree.

The protocol allows a mixture of strict and loose explicit routes in the same P2MP tree, so the TERO should show where the strict and loose explicit routes are. This information should relate to the nodes described by the subobjects. To satisfy this condition, the input link status is used to specify strict and loose explicit routes in P2MP trees.
In the case of a P2MP tunnel, the attribute of an input link relates to each node uniquely. The L bit in a subobject of the TERO shows the input link status of the node. If the L bit indicates loose, the input link belongs to a loose explicit route. Otherwise, it belongs to a strict explicit route.

This protocol allows the insertion of additional subobjects. The edge node of a loose explicit route may know the topology of the network around that route. In that case, the edge node may calculate the route and insert the calculated route into the TERO of the messages related to the loose explicit route.

Consider the number of hops from the sender node to a node, which is counted based on the TERO. If the intermediate nodes between a leaf node and the sender node belong to a strict explicit route, they are all specified in the TERO and the number of hops can be counted. When loose explicit routes are part of a P2MP tree, however, the nodes on a loose explicit route are not specified, so the sender node cannot count the number of hops within the loose explicit route and therefore cannot calculate the actual number of hops to the tree nodes beyond it. For this reason, the protocol defines the number of hops between the edges of a loose explicit route as 1. With this definition, the TERO can still show the tree topology and node connectivity.

The distance field in TERO hops is always relative to the sender node. This is true in ALL messages in which a TERO can appear.
[Page 17] Internet Draft draft-yasukawa-mpls-rsvp-p2mp-03.txt October 2003 4.6.1.4 Object format Class = TREE_EXPLICIT_ROUTE, C_Type = 1 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | // (Subobjects) // | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Subobject format Type = IPv4_ADDRESS 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |L| Type (1) | Length |T| Distance | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Subtree Id | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv4 address (4 bytes) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Yasukawa, et. al. [Page 18] Internet Draft draft-yasukawa-mpls-rsvp-p2mp-03.txt October 2003 Type = UNNUMBERED_INTERFACE 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |L| Type (2) | Length |T| Distance | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Subtree Id | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Router ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Interface ID (32 bits) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Type = IPv6_ADDRESS 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |L| Type (3) | Length |T| Distance | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Subtree Id | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 address (16 bytes) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 address (continued) | 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 address (continued) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 address (continued) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ L The L bit is an attribute of the subobject. The L bit is set if the route of a path between the node specified by this subobject and its upstream node is not specified (Loose explicit route). If the bit is not set, the route of a path is specified explicitly (Strict explicit route). Type 0x01 IPv4 address 0x02 Unnumbered Interface ID 0x03 IPv6 address Yasukawa, et. al. [Page 19] Internet Draft draft-yasukawa-mpls-rsvp-p2mp-03.txt October 2003 Length The Length contains the total length of the subobject in bytes, including the Type and Length fields. IPv4/6 address An IPv4/6 address of node corresponding to the subobject. In the case that the node has several IP addresses, the address of input interface should be specified. Router ID The Router ID of the node corresponding to the subobject. Unnumbered Interface ID The unnumbered interface ID [11]. Distance The distance from the sender node to the node specified by the subobject. The value of distance between neighboring nodes specified in TERO is 1. T The T bit (terminating node) is set to 1 to indicate that the node in this hop is a data terminating node and has locally attached clients (receivers of the data). This bit MUST be set for any of the leaf nodes listed in the TERO. Reserved MUST be 0 on transmission and ignored on reception. Subtree Id The subtree identifier associated with this TERO hop. Yasukawa, et. al. [Page 20] Internet Draft draft-yasukawa-mpls-rsvp-p2mp-03.txt October 2003 4.6.2 Tree Record Route Object (TRRO) Record Route Object (RRO) is a list of subobjects describing a node. Each subobject is sorted to show the order in which the message went through the nodes. 
In P2MP communication, some messages SHOULD be merged into a new message to integrate the information that each message conveys. In this case, RRO shows all the nodes through which each message traveled. So RRO should be able to represent the tree structure in P2MP. We call the extended RRO a TRRO.

When messages are merged at a branch node, their TRROs also are merged. Each TRRO shows the topology information of the downstream subtree. The merged TRRO is a series of downstream TRROs. The subobject specifying the branch node is inserted at the top of the series. So the merged TRRO shows the subtree whose root is the branch node. In the case of a Resv message, when the message arrives at the sender node of the P2MP tree, TRRO shows the current topology of the P2MP tree.

To represent the tree topology, the subobjects of TRRO are sorted in depth-first-order and they have information about the number of hops from the sender node as the case of TERO.

The topology information may be sent to all nodes composing the P2MP tree by refreshing the Path message.

                          |
   +----------------------|----------------------+
   | P2MP                 |                      |
   | tree T               |                      |
   |                      |                      |
   |       +--------------N--------------+       |
   |       |              |              |       |
   |       |              |              |       |
   | +-----------+  +-----------+  +-----------+ |
   | | Subtree 1 |  | Subtree 2 |  | Subtree n | |
   | +-----------+  +-----------+  +-----------+ |
   +---------------------------------------------+

   T  = {R(0),T1,T2,...,Tn}
   Ti = {TRRO describing subtree i}

      Figure 4 Merged TRRO
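The merge rule of this section (the branch node's own subobject first, followed by the series of downstream TRROs) can be sketched as follows. This is a minimal illustration rather than a definitive implementation: a TRRO subobject is modeled as a hypothetical (node, distance) pair, the name merge_trro is invented for this example, and the order in which the downstream TRROs are concatenated is an assumption, since the text above does not prescribe it.

```python
from typing import List, Tuple

# Hypothetical model of one TRRO subobject: (node, distance from the sender).
Hop = Tuple[str, int]


def merge_trro(branch: str, distance: int, downstream: List[List[Hop]]) -> List[Hop]:
    """Build the merged TRRO sent upstream by a branch node.

    The branch node's own hop goes to the top of the series, followed by
    the TRROs received from each downstream subtree, in the order given.
    """
    merged: List[Hop] = [(branch, distance)]
    for trro in downstream:
        if not trro:
            raise ValueError("a downstream TRRO must contain at least one hop")
        merged.extend(trro)
    return merged
```

As a usage example with the node names of Figure 2, a branch node B at distance 1 that receives the TRROs [C(2), D(3), E(3)], [F(2)] and [G(2), H(3)] from its three subtrees would send upstream the merged list [B(1), C(2), D(3), E(3), F(2), G(2), H(3)], which is again in depth-first order.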

4.6.2.1 Object format

   Class = TREE_RECORD_ROUTE, C_Type = 1

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   //                        (Subobjects)                         //
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.6.2.2 Subobject format

   Type = IPv4_ADDRESS

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type (1) | Length |T| Distance | Flags |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Subtree Id | Reserved |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | IPv4 address (4 bytes) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type = UNNUMBERED_INTERFACE

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type (2) | Length |T| Distance | Flags |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Subtree Id | Reserved |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Router ID |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Interface ID (32 bits) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type = IPv6_ADDRESS

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type (3) | Length |T| Distance | Flags |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Subtree Id | Reserved |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | IPv6 address (16 bytes) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | IPv6 address (continued) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | IPv6 address (continued) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | IPv6 address (continued) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type

      0x01  IPv4 address
      0x02  Unnumbered Interface ID
      0x03  IPv6 address

   Length

      The Length contains the total length of the subobject in bytes,
      including the Type and Length fields.

   T

      The T bit (terminating node) is set to 1 to indicate that the
      node in this hop is a data terminating node and has locally
      attached clients (receivers of the data). This bit MUST be set
      for any of the leaf nodes listed in the TRRO.

   Distance

      The distance from the sender node to the node specified by the
      subobject. The distance between neighboring nodes specified in
      the TRRO is 1.

   Flags

      0x01  Local protection available (from [1])

         Indicates that the link downstream of this node is protected
         via a local repair mechanism. This flag can only be set if
         the Local protection flag was set in the SESSION_ATTRIBUTE
         object of the corresponding Path message.

      0x02  Local protection in use (from [1])

         Indicates that a local repair mechanism is in use to maintain
         this tunnel (usually in the face of an outage of the link it
         was previously routed over).

      0x04  Bandwidth protection (from [20])

         The PLR will set this when the protected LSP has a backup
         path which provides the desired bandwidth, which is that in
         the FAST_REROUTE object or the bandwidth of the protected
         LSP, if no FAST_REROUTE object was included. The PLR may set
         this whenever the desired bandwidth is guaranteed; the PLR
         MUST set this flag when the desired bandwidth is guaranteed
         and the "bandwidth protection desired" flag was set in the
         SESSION_ATTRIBUTE object.

      0x08  Node protection (from [20])

         When set, this indicates that the PLR has a backup path
         providing protection against link and node failure on the
         corresponding path section. In case the PLR could only set up
         a link-protection backup path, the "Local protection
         available" bit will be set but the "Node protection" bit will
         be cleared.

   Subtree Id

      The subtree identifier associated with this TRRO hop.

   Reserved

      MUST be 0 on transmission and ignored on reception.

   IPv4/6 address

      A 32-bit (IPv4) or 128-bit (IPv6) unicast host address.

   Router ID

      The Router ID of the node corresponding to the subobject.

   Unnumbered Interface ID

      The unnumbered interface ID [11].

4.6.3 Message Size

   The introduction of the TERO and TRRO objects increases the
   likelihood that a message carrying either of these objects exceeds
   the link MTU. If a message exceeds the link MTU, then IP
   fragmentation and reassembly MUST be used in sending the message to
   its destination.

5. Sender-initiated P2MP LSP establishment

5.1 Sender-initiated P2MP LSP establishment mechanism

   The sender node initiates a Path message with a TERO describing the
   P2MP tree topology to be established. This Path message is forwarded
   along the path described in the TERO and duplicated by each branch
   node. Each node receiving the Path message records the P2MP tree
   state described in the SESSION object and the SENDER_TEMPLATE
   object.
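
   The duplication performed by a branch node can be pictured as
   splitting the received TERO into one smaller TERO per next hop. The
   sketch below is illustrative only. It assumes that the TERO is
   modelled as a depth-first list of (node, distance) pairs, mirroring
   the Distance field of the TERO subobject; the TeroHop type, the
   split_tero function, and the node names are hypothetical.

```python
from typing import List, NamedTuple


class TeroHop(NamedTuple):
    """Hypothetical stand-in for one TERO subobject."""
    node: str      # placeholder for the address or Router ID
    distance: int  # number of hops from the sender node


def split_tero(tero: List[TeroHop]) -> List[List[TeroHop]]:
    """Split a TERO rooted at this node into one TERO per next hop."""
    if not tero:
        raise ValueError("empty TERO")
    child_distance = tero[0].distance + 1
    subtrees: List[List[TeroHop]] = []
    for hop in tero[1:]:
        if hop.distance == child_distance:
            subtrees.append([hop])    # a new next hop starts a subtree
        elif hop.distance > child_distance and subtrees:
            subtrees[-1].append(hop)  # deeper hop belongs to the last subtree
        else:
            raise ValueError(f"unexpected distance at {hop.node}")
    return subtrees


# Hypothetical tree as seen at node B: B -> {C, D, E}, E -> F.
tero_at_b = [
    TeroHop("B", 1), TeroHop("C", 2), TeroHop("D", 2),
    TeroHop("E", 2), TeroHop("F", 3),
]
for sub in split_tero(tero_at_b):
    print([hop.node for hop in sub])
```

   In this model an empty result means that the TERO holds a single
   hop, so the receiving node has no next hops to which a Path message
   would be duplicated.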

   A leaf node that receives a Path message responds by sending a Resv
   message including a LABEL object and a TRRO. The Resv message is
   sent back upstream towards the sender node on a hop-by-hop basis
   according to the P2MP tree state recorded in each intermediate node.
   The Resv messages from all downstream nodes MUST be merged at the
   branch node. When the sender node receives the Resv messages, the
   establishment of the P2MP LSP is completed.

   Figure 5 shows the sequence of P2MP LSP establishment events. Node
   B, which is the branch node for nodes C, D, and E, merges the Resv
   messages. After this merging, node B initiates a Resv message and
   sends it upstream.

                      +---node--**
                      |   (C)
                      +---------node--**
                      |         (D)
   Content            |                                   (leaf)
   Servers--sender--node----------------node---------------node--client
             (A)    (B)                  (E)                (F)
              |      |      |      |      |                  |
              |Path(TERO)   |      |      |                  |
              |----->|      |      |      |                  |
              |      |Path(TERO)   |      |                  |
              |      |----->|      |      |                  |
              |      |Path(TERO)   |      |                  |
              |      |------------>|      |                  |
              |      |Path(TERO)   |      |                  |
              |      |------------------->|                  |
              |      |      |      |      | Path(TERO)       |
              |      |      |      |      |----------------->|
              |      |      |      |      | Resv(TRRO)       |
              |      |      |      |      |<-----------------|
              |      |Resv(TRRO)   |      |                  |
              |      |<-----|      |      |                  |
              |      |Resv(TRRO)   |      |                  |
              |      |<------------|      |                  |
              |      |Resv(TRRO)   |      |                  |
              |      |<-------------------|                  |
              |Resv(TRRO)   |      |      |                  |
              |<-----|      |      |      |                  |
              |      |      |      |      |                  |

   Figure 5 Sequence of sender-initiated P2MP LSP establishment events

5.2 Processing of TERO and TRRO for sender-initiated P2MP LSP
    establishment

   A node that receives a Path message refers to the SESSION object and
   the SENDER_TEMPLATE object to determine whether the Path message is
   for an existing LSP or a new LSP.

   If the Path message is for a new LSP, then the following procedure
   applies. The node verifies that the first subobject (hop) in the
   TERO is an address or interface belonging to this node.
   If not, then a PathErr message MUST be sent upstream. The error code
   is set to "P2MP error" and the error value sub-code is set to "Bad
   TREE_EXPLICIT_ROUTE object" as defined in section 7. If the TERO
   contains only one hop, then this node is a leaf node and sends a
   Resv message upstream. Otherwise, the TERO is processed to locate
   all hops with a distance that is 1 greater than the distance value
   contained in the first TERO hop. For each such hop, a unicast Path
   message MUST be sent to the node specified by the TERO hop.

   If the Path message is for an existing LSP, then the following
   procedure applies. If the first subobject in the TERO contains the
   same subtree id as in the previously received Path message, then the
   TERO is treated as unchanged and processing ends. Otherwise, the
   TERO is processed as follows. Locate each hop in the TERO with a
   distance that is 1 greater than the distance value contained in the
   first TERO hop and add it to a next hop list. If there are any
   downstream nodes for which this node holds path state that do NOT
   appear in the next hop list, then a PathTear message MUST be sent
   immediately to each such node. Then, for each hop in the next hop
   list, if there is no path state for this next hop, a unicast Path
   message MUST be sent to the node specified by the TERO hop.
   Otherwise (path state is found for the next hop), if the new subtree
   id for this next hop differs from the subtree id previously recorded
   in the path state, a unicast trigger Path message MUST be sent to
   the node specified by the TERO hop.

   Note that the TERO in each outgoing Path message MUST contain just
   the subtree with the next hop node as the root node of that subtree.

   Each node that receives a Resv message containing a LABEL object
   uses that label for outgoing traffic associated with this LSP
   tunnel.
   If the node is not the sender node, it allocates a new label and
   places that label in the corresponding LABEL object of the Resv
   message, which it sends upstream to the PHOP.

   A branch node SHOULD delay sending a trigger Resv message if one has
   been sent upstream recently for the same P2MP LSP. If a node were to
   send a trigger Resv message upstream each time it received a trigger
   Resv message from a downstream peer, then nodes closer to the sender
   node could be flooded with trigger Resv messages. To avoid this
   situation, a limit is placed on the frequency of sending trigger
   Resv messages upstream from a branch node.

   For example, when a branch node receives the first Resv message as a
   result of having sent multiple downstream Path messages, a trigger
   Resv message is sent upstream. Shortly thereafter, a second Resv
   message is received and the node needs to merge the TRROs and send a
   trigger Resv message upstream. Since a trigger Resv message has been
   sent recently, the node SHOULD delay sending another trigger Resv
   message. The recommended amount of time between trigger Resv
   messages is 5 seconds, and the value should also be less than
   rsvpIfRefreshInterval as defined in [19].

   Note that even with this delay, the dataplane can be enabled for
   each outgoing label when the Resv is received, so the P2MP tree can
   be operational before the sender node has received the TRRO showing
   the entire P2MP tree.

   When Resv messages are merged, a new TRRO is made using the TRROs
   included in the received Resv messages. This new TRRO expresses the
   P2MP subtree. The branch node that merges the Resv messages is the
   root node of the P2MP subtree.

   The TRRO in the Resv messages is needed to detect loops in the P2MP
   tree. The node SHOULD record the information carried in the TRRO.

5.3 Teardown mechanism

   A P2MP LSP tunnel is explicitly torn down by a PathTear message.
   The PathTear message must be routed exactly like the corresponding
   Path message. The PathTear message is copied by each branch node and
   forwarded downstream to delete the corresponding path state.

   PathTear messages are initiated not only by the sender node but also
   by a branch node that receives a Path message with a TERO from which
   downstream nodes have been removed. The process of pruning nodes
   from an existing P2MP tree is described in section 6.3.

5.4 Path/Resv error

   During the P2MP LSP establishment described in sections 5.1 and 5.2,
   various errors may occur. The main causes are bandwidth allocation
   failures and unknown next hops specified by the TERO.

   If a node does not support a new object or new C-Type defined by
   this protocol, it sends an error message indicating an "Unknown
   object class" or "Unknown object C-Type" error.

   A node that initiates a ResvErr message SHOULD send ResvErr messages
   to all leaf nodes that are affected by the error.

5.5 Message format

5.5.1 Path message format

   The TERO is added to the Path message. Note that a new C-Type is
   added to the SESSION object.

   ::= [ ] [ [ | ] ... ] [ ] [ ] [ ] [ ] [ ] [ ] [ ... ]

   ::= [ ] [ ] [ ] [ ]

5.5.2 Resv message format

   The TRRO is added to the Resv message. Note that a new C-Type is
   added to the SESSION object.

   ::= [ ] [ [ | ] ... ] [ ] [ ] [ ] [ ] [ ] [ ... ]