L3VPN Working Group Maria Napierala Internet Draft ATT Intended Status: Proposed Standard Expires: July 24, 2011 Eric C. Rosen IJsbrands Wijnands Cisco Systems, Inc. January 24, 2011 A Simple Method for Segmenting Multicast Tunnels for Multicast VPNs draft-rosen-l3vpn-mvpn-segments-00.txt Abstract To provide Multicast VPN (MVPN) Service, Service Providers (SPs) need to instantiate multicast tunnels (known as "P-tunnels") that enable the Provider Edge (PE) routers of a given VPN to transmit multicast packets to each other. Some SPs organize their networks in a hierarchical manner, with the PE routers in "edge areas", and the edge areas connected to each other via a "core area". Although the edge areas and the core area may be in the same Autonomous System, the Border Routers of the core area use BGP to exchange routes to the PE routers. A P-tunnel that connects PE routers in different edge areas can be thought of as having three segments: a segment through one edge area, a segment through the core area, and a segment through the second edge area. It is desirable to preserve the independence of the core area by allowing it to use a different tunneling technology than that used in the edge areas. However, it is not desirable for a Border Router to participate specifically in the MVPN signaling, or even to have any knowledge of which MVPNs are in the edge areas that attach to it. This document specifies a simple method for segmenting MVPN P-tunnels at the BRs, subject to these constraints. Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Napierala, et al. [Page 1] Internet Draft draft-rosen-l3vpn-mvpn-segments-00.txt January 2011 Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Copyright and License Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Napierala, et al. [Page 2] Internet Draft draft-rosen-l3vpn-mvpn-segments-00.txt January 2011 Table of Contents 1 Introduction .......................................... 4 1.1 Specification of requirements ......................... 4 1.2 MVPN through Core and Edge Areas ...................... 4 2 Procedures ............................................ 7 2.1 When the P-tunnel uses mLDP in the Edge Areas ......... 7 2.2 When the P-tunnel uses PIM in the Edge Areas .......... 8 2.2.1 Source-Specific Trees ................................. 8 2.2.2 Shared Trees when the BR is not the RP ................ 8 2.2.3 Shared Trees when each BR is an RP .................... 8 3 Aggregation Strategies ................................ 9 4 Preventing Aggregation ................................ 9 4.1 mLDP P-Tunnels ........................................ 10 4.2 PIM P-Tunnels ......................................... 10 5 IANA Considerations ................................... 11 6 Security Considerations ............................... 11 7 Acknowledgments ....................................... 11 8 Authors' Addresses .................................... 12 9 Normative References .................................. 12 10 Informational References .............................. 13 Napierala, et al. [Page 3] Internet Draft draft-rosen-l3vpn-mvpn-segments-00.txt January 2011 1. Introduction 1.1. Specification of requirements The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. 1.2. MVPN through Core and Edge Areas Consider a Service Provider (SP) network that consists of a number of "edge areas", along with a "core area". Each edge area contains some number of Provider Edge (PE) routers. At the boundary of the "core area" are "Border Routers" (BRs). Each BR has a number of "edge- facing" interfaces that lead into edge areas, and a number of "core- facing" interfaces that lead to the core. Any data packet that needs to travel from one edge area to another must traverse the core (going through at least one BR) to do so. We assume that some set of PEs are offering Multicast Virtual Private Network (MVPN) service according to [MVPN]. This requires the PEs of a given VPN to create one or more multicast tunnels through the SP network ("P-tunnels"). Each such tunnel will have a single "root PE" and a number of "leaf PEs". Any P-tunnel that has a leaf PE that is in a different edge area that the root PE will have to traverse the core area, passing through one or more BRs. The procedures of this document are applicable when the BRs use BGP to distribute to each other the routes to the PE routers. We do not assume that the core area is in a different Autonomous System (AS) than the edge areas; this may or may not be the case. There are a number of different multicast tunneling technologies that the PEs can use for setting up the P-tunnels. In this document, we focus on two such technologies: PIM [PIM-SM] and Multipoint LDP (mLDP) [mLDP]; use of other P-tunnel technology is outside the scope of this document. For a given P-tunnel, we assume that the edge areas are using the same multicast tunneling technology, but that the core area may be using a different multicast tunneling technology than the edge areas use. In fact, the core area can even use unicast replication instead of a true multicast tunneling technology. Furthermore, if the core area does use multicast tunnels, it can aggregate a number of P-tunnels from the edge areas into a single tunnel through the core. In this case, a set of P-tunnels are aggregated upon entry into the core, and deaggregated upon exit from the core. Napierala, et al. [Page 4] Internet Draft draft-rosen-l3vpn-mvpn-segments-00.txt January 2011 For example, consider the following topology: PE1 --- Edge Area 1 --- BR1 ---------- BR3 --- Edge Area 3 --- PE3 | | PE2 -------| | Core Area | |----- BR4 --- Edge Area 4 --- PE4 Suppose there is some MVPN to which PE1, ..., PE4 are attached. Sup- pose that there are two P-tunnels for that MVPN. Tunnel T1 is rooted at PE1 and has PE3 as a leaf. Tunnel T2 is rooted at PE2 and has both PE3 and PE4 as leaf nodes. If no segmentation is done, the creation of tunnel T1 begins at PE3 with PIM or mLDP signaling. This signaling passes along through edge area 3 to BR3, and passes through the core to BR1. "Passing through the core" means passing through each intermediate core router on the path from BR3 to BR1. The signaling then passes through edge area 1 to PE1. If segmentation is done, the creation of tunnel T1 begins in exactly the same way, at PE3, and passes through Edge Area 3 to BR3 in exactly the same manner. However, the signaling through the core is different. No signaling passes through the intermediate nodes in the core area. Rather, BR3 does mLDP signaling over a Targeted LDP Ses- sion to BR1 [TMLDP]. BR3 passes enough information in this Targeted mLDP signaling to enable BR1 to reinitiate the necessary PIM or mLDP signaling in Edge Area 1. Within the core area, the Border Routers decide what sort of tunneling technology to use, and the core tunnel- ing technology is transparent to the PEs and to any other systems in the edge areas. The BRs may, for example, decide to use Unicast Replication. In this case, when BR1 receives, from Edge Area 1, a packet traveling on the Area 1 segment of tunnel T1, it encapsulates it and unicasts it to BR3. The encapsulation carries enough information to inform BR3 that the packet needs to be put on the Area 3 segment of tunnel T1. If BR1 receives, from Edge Area 1, a packet traveling on tunnel the Area 1 segment of T2, BR1 makes two copies of the packet, encapsulates each, and then unicasts one copy to BR3 and one copy to BR4. Based on information carried in the encapsulation, BR3 and BR4 each know that the packet must be forwarded on the segment of T2 in their respective Edge Areas. Napierala, et al. [Page 5] Internet Draft draft-rosen-l3vpn-mvpn-segments-00.txt January 2011 Alternatively, the BRs may decide to aggregate T1 and T2 into a sin- gle core multicast tunnel. Suppose, for example, that there is an RSVP-TE P2MP LSP whose head-end is BR1, and that has BR3 and BR4 as leaf nodes. Then when BR1 receives, from Edge Area 1, a packet trav- eling on tunnel T1, it encapsulates it and sends it through this core tunnel. When BR1 receives, from Edge Area 1, a packet traveling on P-tunnel T2, it again encapsulates it and sends it through this same core tunnel. Both packets will be received by BR3 and BR4. The packet traveling on P-tunnel T2 will be forwarded by BR3 and BR4 on the Edge Area 3 and Edge Area 4 segments of T2 respectively. The packet traveling on P-tunnel T1 will be forwarded by BR3 on the Edge Area 3 segment of T1. However, BR4 will drop that packet, because BR4 there is no Edge Area 4 segment of T1. Naturally, the encapsula- tion used by the head-end of the core tunnel must enable the leaf nodes of the core tunnel to determine whether a given packet is trav- eling on a P-tunnel that passes through that leaf node, and if so, which P-tunnel. It is worth noting that in the following topology: PE1 --- Edge Area 1 --- BR1 --------- BR3 --- Edge Area 3 --- PE3 | | | PE2 -------| | | |--- BR2 --- Core | |---- BR4 --- Edge Area 4 --- PE4 | |---- BR5 --- Edge Area 5 --- PE5 it is possible to combine unicast replication by the PEs with unicast replication by the BRs. For instance, if PE2 has a multicast packet to be sent to PE4 and PE5, it is possible for PE2 to unicast one copy of the packet to BR2 and then for BR2 to unicast a copy to BR4 and a copy to B5. This can be done if the PEs have Targeted LDP sessions to the BRs, and the BRs have Targeted LDP sessions to each other. This is a straightforward generalization of the procedures described in section 2.1 below. Napierala, et al. [Page 6] Internet Draft draft-rosen-l3vpn-mvpn-segments-00.txt January 2011 2. Procedures The Border Routers of the core area must know that they are core area Border Routers. This is known either by provisioning, or by some other method that is outside the scope of this document. It is assumed that the Border Routers have routes to the PE routers, and that the Border Routers use BGP to exchange these routes with each other. 2.1. When the P-tunnel uses mLDP in the Edge Areas The creation of an mLDP P-tunnel begins at a PE router. If a given P-tunnel passes through a particular BR, say BR1, then BR1 will receive an LDP Label Mapping Message on an LDP session over one of its edge-facing interfaces. This message will specify a root node for the P-tunnel. If BR1 looks up the root node, and finds that its route to the root node is a BGP-distributed route whose next hop is another BR, it sets up a Targeted LDP session to the BGP next hop of that route. We refer to this BGP next hop as the "upstream BR". (If such a Targeted LDP session to the upstream BR already exists, a new session is not set up; rather the existing one is used.) BR1 then executes the procedures specified in [TMLDP]. In order to execute the procedures specified in [TMLDP], a downstream BR must know whether the technique for carrying a P-tunnel in the core is unicast replication or whether it is multicast tunneling. This is known either by provisioning, or by some method that is outside the scope of this document. If multicast tunneling is used in the core, the upstream BR must know what kind of tunnel is to be used. If aggregation of multiple P-tunnels into a single core tunnel is used, all the BRs must support MPLS upstream-assigned labels. It is possible that a BR will receive, over an edge-facing interface, an mLDP Label Mapping message containing a "Recursive Forwarding Equivalence Class (FEC)", as described in [LDP-RECURS]. If the root of the recursive FEC is the BR itself, the BR follows the procedures of [LDP-RECURS], to obtain a new root. If its route to the new root is a BGP route whose next hop is another BR, the procedures of [TMLDP] are followed. Napierala, et al. [Page 7] Internet Draft draft-rosen-l3vpn-mvpn-segments-00.txt January 2011 2.2. When the P-tunnel uses PIM in the Edge Areas The creation of a PIM P-tunnel begins when a PE router sends a PIM Join message specifying either (*,G) or (S,G). We first consider the case where the P-tunnel is a source-specific multicast tree. Then we consider two different options that may be used when the P-tunnel is a PIM shared tree. The first option may be used by a BR which is not itself the Rendezvous Point (RP) for the shared tree. The second option is useful if all the BRs function as RPs for the shared trees within their attached edge areas. 2.2.1. Source-Specific Trees In this case, a BR, say BR1, will receive a Join(S,G) over one of its edge-facing interfaces. BR1 looks up its route to S, and finds that the route is a BGP route whose next hop is another BR. BR1 sets up a Targeted mLDP session to the "upstream BR" (or uses an existing Targeted mLDP session to it). BR1 then uses the encoding specified in [LDP-INBAND] to derive an LDP multipath FEC from the (S,G). From this point on, the procedures of [TMLDP] are used, precisely as described in the previous section. 2.2.2. Shared Trees when the BR is not the RP If the PIM P-tunnel is a shared tree (*,G), and if the PEs are configured so that they never switch to source-specific trees for G, then a similar procedure can be used. BR1 looks up its route to the RP [PIM-SM] of the shared tree. If the route is a BGP route whose next hop is another BR, that next hop becomes the "upstream BR" for that P-tunnel. BR1 uses a Targeted mLDP session to the upstream BR, and uses the encoding specified in [LDP-INBAND-SHARED] to derive an LDP multipath FEC from the (*,G). Then the procedures of [TMLDP] are used. 2.2.3. Shared Trees when each BR is an RP If G is a non-SSM PIM group address, and if there are sources for G in a particular edge area, it is possible to configure the BRs of that area to function as RPs for G. However, each such BR then needs to discover all the other BRs that are also functioning as RPs for G. This can be done by having the BRs originate and receive "Source Active BGP A-D routes". The procedures for generating and receiving these routes, and the mLDP procedures for setting up P2MP LSPs based on these routes, are specified in described in [LDP-INBAND-SHARED]. However, the LDP signaling described therein would take place over Napierala, et al. [Page 8] Internet Draft draft-rosen-l3vpn-mvpn-segments-00.txt January 2011 Targeted LDP sessions. Each such Source Active A-D route refers to a particular group G. It is RECOMMENDED that each such Source Active A-D route carry an IPv4 Address specific Route Target or an IPv6 Address specific Route Target (as appropriate)[RFC4360, RFC5701], with the address G in the "global administrator" field. This allows BRs that have no interest in group G to filter out the Source Active A-D routes that are about G. 3. Aggregation Strategies If it is desired to aggregate multiple P-tunnels into a single core area multicast tunnel, the choice of P-tunnels to map into which core area multicast tunnels is made by the upstream Border Router. The procedures for performing the aggregation are described in [TMLDP]. When the P-tunnels are MP-LSPs, It is RECOMMENDED that an upstream Border Router be able to aggregate all P-tunnels whose FECs begin with the same bit string (i.e., aggregate the set of FECs that are identical under a mask, where the mask consists of a sequence of ones followed by a sequence of zeroes). If the PEs have been configured to use MP FECs with type 2 opaque values (as defined in [MLDP-OV]), this technique allows all MP-LSPs of a given MVPN to be aggregated together. Selection of an aggregation strategy is outside the scope of this document. Note that PIM P-Tunnels and mLDP P-Tunnels may be aggregated in the same core tunnel. 4. Preventing Aggregation It may be desirable to prevent certain P-tunnels from being aggregated into a single core multicast tunnel. This is done by configuration at the PEs. The PEs convey this information to the BRs by including certain TLVs in the mLDP or PIM messages used to set up the P-tunnels. Napierala, et al. [Page 9] Internet Draft draft-rosen-l3vpn-mvpn-segments-00.txt January 2011 4.1. mLDP P-Tunnels When a P-tunnel is instantiated as an MP-LSP, and the PEs of that P- tunnel have been configured to disallow aggregation of that P-tunnel, the PEs indicate this fact in their MLDP signaling. When the PEs send a label mapping message that includes the corresponding FEC element, the PEs will also include an LDP MP Status TLV [mLDP] that carries the "Do Not Aggregate" status code (to be assigned by IANA). This TLV MUST be passed along with FEC element by any upstream LSR that sends a label mapping message or label request message containing the FEC element. If an mLDP node receives label mapping messages for a given FEC from more than one downstream neighbor, and some of those messages have the "Do Not Aggregate" status code while others do not, the "Do Not Aggregate" status code MUST be passed upstream. When a BR receives a label mapping message for an MP FEC element and a MP Status TLV containing the "Do Not Aggregate" status code, the BR knows that the MP-LSP corresponding to the FEC element SHOULD NOT be aggregated into a core multicast tunnel. If some PEs are configured to disallow aggregation for a given P- tunnel, but others are not, the results are unpredictable. For a given P-tunnel, the upstream BR MAY make its decision to aggregate or not based on the first mLDP label mapping message it sees for that P- tunnel. 4.2. PIM P-Tunnels When a P-tunnel is instantiated as a PIM multicast tree, the the PEs of that P-tunnel have been configured to disallow aggregation of that P-tunnel, the PEs indicate this fact in their PIM signaling. When the PEs send a PIM Join message for the corresponding (S,G) or (*,G), the PEs will include the "Do Not Aggregate" PIM Join Attribute. This is a PIM Join Attribute as specified in [PIM-JA]. The value of the Attr_Type field of this Join Attribute is to be assigned by IANA. The Length field of this Join Attribute is set to 0, and the F bit is set to 1. This attribute MUST be passed upstream in the PIM Join messages for the given (S,G) or (*,G). If a PIM node receives Join(S,G) or Join(*,G) from more than one downstream neighbor, and some of those Joins have the "Do Not Aggregate" Join Attribute while others do not, the attribute MUST be passed upstream. The conflict resolution procedure in [PIM-JA] is not used. When a BR receives a PIM Join message containing the "Do Not Aggregate" Join Attribute, the BR knows that the corresponding Napierala, et al. [Page 10] Internet Draft draft-rosen-l3vpn-mvpn-segments-00.txt January 2011 multicast distribution tree SHOULD NOT be aggregated into a core multicast tunnel. If some PEs are configured to disallow aggregation for a given P- tunnel, but others are not, the results are unpredictable. For a given P-tunnel, the upstream BR MAY make its decision to aggregate or not based on the first PIM Join it sees for that P-tunnel. If a BR uses the procedures of [LDP-INBAND] to map a PIM tree into a MP-LSP, and if the PIM tree has been set up with the "Do Not Aggregate" Join Attribute, the corresponding LDP messages SHOULD carry the "Do Not Aggregate" status code. If a BR using the procedures of [LDP-INBAND] needs to map an MP-LSP to a PIM tree, and the corresponding LDP messages carry the "Do Not Aggregate" status code, the corresponding PIM messages SHOULD carry the "Do Not Aggregate" PIM Join Attribute. 5. IANA Considerations [mLDP] creates a registry known as "LDP MP Status Value Element Types". This document requests IANA to assign a value from this registry for "Do Not Aggregate". [PIM-JA] creates a registry known as "PIM Join Attributes Types". This document requests IANA to assign a value from this registry for "Do Not Aggregate". 6. Security Considerations This document raises no new security considerations beyond those discussed in [LDP], [LDP-UP], and [RFC5331]. 7. Acknowledgments The authors wish to thank Don Heidrich for his contribution to this work. Napierala, et al. [Page 11] Internet Draft draft-rosen-l3vpn-mvpn-segments-00.txt January 2011 8. Authors' Addresses Maria Napierala AT&T Labs 200 Laurel Avenue, Middletown, NJ 07748 E-mail: mnapierala@att.com Eric C. Rosen Cisco Systems, Inc. 1414 Massachusetts Avenue Boxborough, MA, 01719 E-mail: erosen@cisco.com IJsbrand Wijnands Cisco Systems, Inc. De kleetlaan 6a Diegem 1831 Belgium E-mail: ice@cisco.com 9. Normative References [LDP] Loa Andersson, Ina Minei, Bob Thomas, editors, "LDP Specification", October 2007 [LDP-INBAND] IJsbrand Wijnands, Toerless Eckert, Maria Napierala, Nicolai Leymann, "mLDP based in-band signaling for Point-to- Multipoint and Multipoint-to-Multipoint Label Switched Paths", draft- ietf-mpls-mldp-in-band-signaling-02.txt, July 2010 [LDP-RECURS] IJsbrand Wijnands, Eric Rosen, Maria Napierala, Nicolai Leymann, "Using mLDP through a Backbone where there is no Route to the Root", draft-ietf-mpls-mldp-recurs-fec-00.txt, October 2010 [LDP-UP] Rahul Aggarwal, Jean-Louis Le Roux, "MPLS Upstream Label Assignment for LDP", draft-ietf-mpls-ldp-upstream-09.txt, November 2010 [mLDP] Ina Minei, Kireeti Kompella, IJsbrand Wijnands, Bob Thomas, "Label Distribution Protocol Extensions for Point-to-Multipoint and Multipoint-to-Multipoint Label Switched Paths", draft-ietf-mpls-ldp- p2mp-11.txt, October 2010 Napierala, et al. [Page 12] Internet Draft draft-rosen-l3vpn-mvpn-segments-00.txt January 2011 [MVPN] Eric Rosen, Rahul Aggarwal (editors), "Multicast in MPLS/BGP IP VPNs", draft-ietf-l3vpn-2547bis-mcast-10.txt, January 2010 [PIM-JA] Arjen Boers, IJsbrand Wijnands, Eric Rosen, "The PIM Join Attribute Format", RFC 5384, November 2008 [PIM-SM] "Protocol Independent Multicast - Sparse Mode (PIM-SM)", Fenner, Handley, Holbrook, Kouvelas, August 2006, RFC 4601 [RFC2119] "Key words for use in RFCs to Indicate Requirement Levels.", Bradner, March 1997 [RFC5331] Rahul Aggarwal, Yakov Rekhter, Eric Rosen, "MPLS Upstream Label Assignment and Context-Specific Label Space", RFC 5331, August 2009 [TMLDP] Maria Napierala, Eric Rosen, IJsbrands Wijnands, "Using LDP Multipoint Extensions on Targeted LDP Sessions", draft-napierala- mpls-targeted-mldp-00.txt, December 2010 10. Informational References [LDP-INBAND-SHARED] Yakov Rekhter, Rahul Aggarwal, Nicolai Leymann, "Carrying PIM-SM in ASM mode Trees over P2MP mLDP LSPs", draft- rekhter-pim-sm-over-mldp-02.txt, August 2010 [MLDP-OV] Sandeep Bishnoi, Pranjal Kumar Dutta, IJsbrand Wijnands, "LDP Multipoint Opaque Value Element Types", draft-bishnoi-mpls-mldp- opaque-types-01.txt, October 2009 [RFC4360] Srihari R. Sangli, Dan Tappan, Yakov Rekhter, "BGP Extended Communities Attribute", RFC 4360, February 2006 [RFC5701] Yakov Rekhter, "IPv6 Address Specific BGP Extended Community Attribute", RFC 5701, November 2009 Napierala, et al. [Page 13]