Network Working Group                                        Ali Sajassi
Internet Draft                                             Cisco Systems
                                                             Nabil Bitar
                                                                 Verizon
                                                             Yuji Kamite
                                                      NTT Communications
Expires: May 2006                                          November 2005

            Congruency for VPLS Multicast and Unicast Paths
          draft-sajassi-l2vpn-vpls-multicast-congruency-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

   The current VPLS multicast proposals based on multicast trees have several issues, which are outlined in [BRIDGE-INTEROP]. These issues stem from the divergence of the multicast and unicast paths for a given VPLS instance in the MPLS/IP network. This draft describes a mechanism for building multicast and unicast paths that are congruent in order to address these issues.

Sajassi, Bitar & Kamite                                         [Page 1]
draft-sajassi-l2vpn-vpls-multicast-congruency.txt          November 2005

Table of Contents

   1. Conventions
   2. Terminology
   3. Introduction
   4. Non-congruency Issues
   5. Multicast Tree Types
   6. Congruent Unicast and Multicast Paths
      6.1. Multicast Path Alignment along the Unicast Path
      6.2. Unicast Path Alignment along the Multicast Path
   7. Building Multicast Trees and Unicast Tunnels
      7.1. PIM-based Trees
      7.2. P2MP LSP
      7.3. P2MP TE LSP
   8. Security Considerations
   9. Acknowledgments
   10. Normative References
   11. Informative References
   12. Authors' Addresses
   13. Full Copyright Statement
   14. Intellectual Property Statement

1. Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Terminology

   This document uses terminology described in [L2VPN-FRWK] and [VPLS-LDP].

3. Introduction

   As stated in [VPLS-MCAST-REQ], existing VPLS mechanisms have challenges in handling multicast traffic efficiently. To address this, several VPLS multicast proposals exist today, and they fall into two categories:

   i)  using ingress replication over unicast PWs
   ii) avoiding ingress replication by using a multicast tree

   The former method guarantees congruency between VPLS unicast and multicast data paths because both data types use the same set of PWs (and thus the same paths) in the network, even in the presence of ECMP.
However, this method may not be suitable in applications where the number of PE receivers in a multicast group is large, or for high-bandwidth multicast applications where ingress replication may, for instance, exceed the bandwidth of the core-facing physical port on the PE.

   The latter method handles VPLS multicast data efficiently. It forwards the VPLS multicast traffic in the backbone such that no more than one copy of a packet ever traverses any link. It also maintains shortest paths (least-cost paths) between the receiver PEs and the source PE.

   However, the VPLS multicast and unicast paths can diverge in this method for several reasons. For example, in the case of an IP network, the multicast tree (using PIM procedures) is built from the receiver PEs toward the source, whereas the unicast traffic is routed from the source toward the receiver PE. Even in the absence of ECMP, the unicast and multicast traffic can take different paths because of asymmetrical path cost. In the case of an MPLS network where the multicast tree is built using [P2MP-LDP], the multicast and unicast paths can differ because of asymmetrical path cost or because of ECMP. Even when the path cost is symmetrical, the ECMP selection for building the P2MP LSP tunnel (used by multicast traffic) is performed from the receiver PE toward the source PE, whereas for unicast traffic the ECMP selection is performed in the opposite direction, from the source PE toward the receiver PE.

   The next section reviews the issues associated with non-congruent multicast and unicast paths for VPLS applications, and Section 6 discusses the proposal to achieve congruency between unicast paths and multicast trees in a VPLS network.

4. Non-congruency Issues

   Issues resulting from non-congruent unicast and multicast paths have been described in [BRIDGE-INTEROP] and are reviewed in this section.
Before going over these issues, it should be noted that a VPLS instance as described in [VPLS-LDP] and [VPLS-interop] can provide a LAN emulation service not only among the sites of a given enterprise customer but also among service provider bridged LAN networks (e.g., 802.1ad or 802.1ah networks). The following issues have a more pronounced effect in the latter case than in the former, because a problem in a service provider bridged LAN network can impact many end customers.

   i) Loops and black-holing: If VPLS is used to connect customer or provider bridges, then there are two alternatives for sending BPDU frames over VPLS with a multicast tree: a) over unicast PWs or b) over the multicast tree. Regardless of which alternative is used, a failure along the non-BPDU path (which is different from the BPDU path) will result in black-holing of the traffic on the non-BPDU path, since the failure will not be detected by the CE bridges. Furthermore, a failure along the BPDU path (which is different from the non-BPDU path) will result in a loop within the customer network, because the CE bridge will put its blocked port into the forwarding state and thus create a loop within its bridged network (and through the service provider network).

   ii) Out-of-order delivery: If unknown unicast frames are flooded over the multicast tree, frames may be delivered out of order at the receiving PE. Because the multicast and unicast paths can differ, two consecutive frames at the ingress PE can take two different paths through the VPLS network, and the second frame, which takes the unicast path, may arrive ahead of the first frame, which takes the multicast path.

   iii) CFM procedures: IEEE 802.1ag describes fault management procedures for customer and provider bridged networks.
In a typical bridged network, the unicast and multicast paths are the same, and thus a failure along the data path (either unicast or multicast) for a given service instance will be detected by the CFM Connectivity Check procedures. However, when the unicast and multicast paths are different in the VPLS network, then regardless of which path is chosen by the CFM Connectivity Check procedure, the other path is not covered, and thus a failure on that path cannot be detected by the CFM CC procedure.

5. Multicast Tree Types

   [L3VPN-MCAST] defines the following multicast tree types, which are also leveraged by [VPLS-MCAST]:

   i) Inclusive Tree: A tree that carries all the multicast traffic for a single VPLS instance.

   ii) Aggregate Inclusive Tree: A tree that carries all the multicast traffic for several VPLS instances.

   iii) Selective Tree: A tree that carries one or more multicast streams belonging to the same VPLS instance.

   iv) Aggregate Selective Tree: A tree that carries several multicast streams belonging to several VPLS instances.

   To support VPLS multicast using a multicast tree, the following components are required:

   - IGMP/PIM snooping over Attachment Circuits
   - Discovery and distribution of group membership info
   - Construction of the multicast tree

   IGMP/PIM snooping over Attachment Circuits is only needed for (Aggregate) Selective Trees. Discovery and distribution of group membership info is required for all of the above tree types, and there are several different schemes for performing this task. Once the source and receiver PEs are discovered, the tree can be constructed using a transport-dependent signaling mechanism (e.g., PIM for IP transport, mLDP for MPLS LSP, and RSVP-TE for TE LSP).
This draft concentrates only on the last component, namely constructing a multicast tree that is congruent with the unicast path, and it assumes that group membership information has already been discovered. Furthermore, this draft assumes that a source-specific multicast tree is used for the above tree types; the application of shared trees is for future study.

   The congruency procedures described in this draft are independent of the tree type and are equally applicable to each of the above tree types. As noted in [L3VPN-MCAST], the above tree types are also independent of the transport type, i.e., each of the above tree types can be built using any of the transport tunnel mechanisms (such as [PIM], [P2MP-RSVP-TE], or [P2MP-LDP]). However, the congruency procedures, as will be seen, are dependent on the transport tunnel mechanism.

6. Congruent Unicast and Multicast Paths

   There are two general approaches for obtaining congruent multicast and unicast paths, as follows:

   i)  Align the multicast path along the unicast path
   ii) Align the unicast path along the multicast path

6.1. Multicast Path Alignment along the Unicast Path

   In this approach, the multicast tree is built along the unicast path. Considering that the unicast path is established from the source toward the receiver, there are two different ways of achieving this: a) build a multicast tree from the source toward the receivers, or b) perform a trace route of the unicast path to identify the nodes along the unicast path (even in the presence of ECMP) and then build a receiver-initiated multicast tree from the receiver PEs toward the source using the unicast trace route info.
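
   Method (b) above can be illustrated with a minimal sketch. It assumes that the trace route result for each receiver PE is already available as an ordered hop list from the source PE to that receiver; the function names, the node names, and the input format are hypothetical and are not defined by this draft. Each receiver's join follows the reversed trace, so the resulting multicast branch coincides with the traced unicast path.

```python
def tree_from_traces(traces):
    """Build a receiver-initiated tree from traced unicast paths.

    `traces` maps each receiver PE to the hop list of its unicast path,
    ordered from the source PE to that receiver PE (hypothetical input
    standing in for trace route results).
    Returns a dict mapping each node to its upstream neighbor.
    """
    upstream = {}
    for receiver, hops in traces.items():
        if hops[-1] != receiver:
            raise ValueError("trace must end at the receiver PE")
        # The join travels receiver -> source, i.e. along the reversed trace.
        reversed_hops = hops[::-1]
        for node, parent in zip(reversed_hops, reversed_hops[1:]):
            previous = upstream.setdefault(node, parent)
            if previous != parent:
                # Assumption of this sketch: a node has one upstream neighbor.
                raise ValueError("conflicting upstream neighbor for " + node)
    return upstream


def multicast_path(upstream, receiver, source):
    """Return the tree path from the source PE to a receiver PE."""
    hops = [receiver]
    while hops[-1] != source:
        hops.append(upstream[hops[-1]])
    return hops[::-1]


# Hypothetical traces for two receiver PEs behind the same source PE.
TRACES = {
    "PE2": ["PE1", "P1", "P2", "PE2"],
    "PE3": ["PE1", "P1", "P3", "PE3"],
}
upstream = tree_from_traces(TRACES)
for pe, trace in TRACES.items():
    # The multicast branch to each receiver matches its unicast trace.
    print(pe, multicast_path(upstream, pe, "PE1") == trace)
```

   The sketch rejects traces that would give a node two different upstream neighbors; how a real procedure would resolve that case is outside what this draft specifies.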

   Method (a) would require substantial changes to the [PIM] procedures for building multicast trees. Even if these changes were feasible, it would not render an optimum tree, because the tree branches toward the receivers would originate early from the root (i.e., packet replication would occur sooner and closer to the root than it would if the tree were built from the receivers toward the root). Therefore, method (a) is not considered further, except when RSVP-TE is used for setting up both P2MP and P2P LSPs.

   Method (b) would require (software) changes only to the PIM protocol, in order to incorporate trace route info into the PIM join request such that the join can follow exactly the same route as the unicast route. However, it would affect the reconstruction of the multicast tree upon a network failure because, besides IGP convergence, method (b) would require the unicast paths through the network to be retraced before the multicast tree is reconstructed. Because of the delay associated with retracing the paths and reconstructing the multicast tree, method (b) is for future study.

6.2. Unicast Path Alignment along the Multicast Path

   In this approach, the unicast paths are built along the multicast tree. This can be done easily by using the same multicast-tree-building procedures for setting up the unicast tunnels, e.g., by building a unicast tunnel as a multicast tree with a single root and a single receiver. Since the same procedure is used for building both unicast tunnels and the multicast tree, congruency between the two can be achieved simply by using the same identifier when ECMP paths are encountered, so that the same path toward the source is selected in both cases.

   There are two ways of using such an identifier. First, the multicast tree can be built as before without any changes to its procedures.
However, when building the unicast tunnel using multicast procedures, the multicast-tree identifier is passed in the "join" message. This identifier is then used to select the same path (as the multicast tree) toward the source in case ECMP is encountered along the path.

   The second way of using such an identifier is to have an identifier that is used for ECMP path selection for both the multicast tree and the unicast tunnels. The receiver-initiated "join" message from the egress PE (for both the multicast and unicast trees of a given VPLS instance) contains the same identifier. Given that the same identifier is used, the same path is selected for both multicast and unicast trees when ECMP is encountered. The advantage of this identifier is that it decouples the multicast and unicast trees from one another, such that any number of multicast trees can be aligned with any number of unicast tunnels based on the use of this common identifier. Therefore, the use of such an identifier is recommended, and in the subsequent discussions it is assumed that this method is used unless stated otherwise.

   It should be noted that there is a cost associated with achieving congruency as described in the previous paragraph, and it is related to the number of multicast states that need to be maintained in the P nodes. Since each unicast tunnel is represented by a multicast tree, the number of multicast states in the core increases in proportion to the number of unicast tunnels. However, the increase in the number of states as a result of unicast tunnels does not need to be proportional to the number of VPLS instances or the number of multicast groups; instead, it only needs to be proportional to the number of ECMP paths in order to perform proper load balancing in the core.
For example, if the maximum number of ECMP paths in the core network between any pair of PEs is N, then the maximum number of unicast tunnels needed (between any pair of PEs) will be N, and the VPLS instances will be distributed (i.e., load balanced) among these N tunnels.

   For both of the approaches described in the previous sections, the advantage of achieving such congruency for VPLS multicast and unicast data over an MPLS/IP network is that it puts the LAN emulation service on par with the bridged LAN service and at the same time provides an efficient mechanism for handling multicast data. When the ingress PE is also the root of its multicast tree, one can achieve a shortest-path (or least-cost) bridged LAN service when the VPLS instance is supported this way, because both unicast and multicast data for that VPLS instance are forwarded along the shortest path between the PEs while remaining congruent with each other, which satisfies the requirements of a bridged LAN network.

7. Building Multicast Trees and Unicast Tunnels

   As mentioned previously, the procedures for building congruent unicast and multicast paths are dependent on the transport mechanism. The following sub-sections describe what changes to the existing procedures are required for each transport mechanism.

7.1. PIM-based Trees

   [PIM] uses a receiver-initiated join request for building a tree from the receiver PE(s) toward the source PE. In this scenario, the same procedure is used for constructing both multicast and unicast trees. The join request for both the multicast and unicast trees is modified to include an ECMP identifier. The same identifier is used for both the unicast and multicast trees belonging to the same VPLS instance. This identifier is used to select the same path among multiple equal-cost paths when building the unicast and multicast trees, in order to achieve congruency between them. The encoding of this field will be described in the future.
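
   The effect of a common ECMP identifier can be illustrated with a minimal sketch. The selection rule shown here (a CRC32 hash of the identifier, taken modulo the number of candidates sorted by name), the 32-bit integer identifier, and the topology are all assumptions made for illustration only; this draft does not define the field encoding or the selection function. The point of the sketch is that any deterministic rule keyed only on the identifier makes a unicast join and a multicast join resolve to the same upstream neighbor at every node.

```python
import zlib


def select_upstream(candidates, ecmp_id):
    """Pick one upstream neighbor among equal-cost candidates.

    Hypothetical rule: hash the ECMP identifier and index into the
    candidates sorted by name, so every join carrying the same
    identifier resolves to the same neighbor on this node.
    """
    if not candidates:
        raise ValueError("no upstream candidates")
    ordered = sorted(candidates)
    index = zlib.crc32(ecmp_id.to_bytes(4, "big")) % len(ordered)
    return ordered[index]


def build_path(receiver, source, upstream_map, ecmp_id):
    """Walk a join hop by hop from the receiver PE toward the source PE."""
    path = [receiver]
    node = receiver
    while node != source:
        node = select_upstream(upstream_map[node], ecmp_id)
        path.append(node)
    return path


# Hypothetical core: each node lists its equal-cost neighbors toward PE1.
UPSTREAM = {
    "PE2": ["P1", "P2"],
    "P1": ["P3", "P4"],
    "P2": ["P3", "P4"],
    "P3": ["PE1"],
    "P4": ["PE1"],
}

# Joins for the multicast tree and the unicast tree of one VPLS instance
# carry the same (hypothetical) identifier value.
multicast_join = build_path("PE2", "PE1", UPSTREAM, ecmp_id=7)
unicast_join = build_path("PE2", "PE1", UPSTREAM, ecmp_id=7)
print(multicast_join == unicast_join)
```

   Sorting the candidates before indexing keeps the choice independent of the order in which a node happens to store its neighbors, which is one simple way to keep the two joins aligned.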

7.2. P2MP LSP

   Similar to the [PIM] procedures, this procedure uses a receiver-initiated "join" message for building both a P2MP LSP and a P2P LSP from the receiver PE(s) toward the source PE. Each receiver PE "joins" the P2MP or P2P tree by sending an LDP label mapping message. The label mapping message for both the P2MP and P2P trees is modified to include an ECMP identifier. The same identifier is used for both the P2MP and P2P LSPs of the same VPLS instance. This identifier is used to select the same path among multiple equal-cost paths when building the unicast and multicast trees, in order to achieve congruency between them. The encoding of this field will be described in the future.

7.3. P2MP TE LSP

   Contrary to the previous two procedures, this procedure is initiated from the source PE toward the receiver PEs. Since the network nodes along the path between the source and the receiver PEs in this scheme are identified a priori, both the multicast tree and the unicast tunnels can be established along the same path using existing P2MP and P2P TE LSP procedures.

8. Security Considerations

   Security aspects of this draft will be discussed at a later point.

9. Acknowledgments

   The authors would like to thank Dino Farinacci, John Zwiebel, and Daniel Alvarez for their comments and feedback.

10. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [VPLS-LDP] Lasserre, M., et al., "Virtual Private LAN Services over MPLS", Work in progress.

   [P802.1ag] IEEE Draft P802.1ag/D0.1, "Virtual Bridged Local Area Networks: Connectivity Fault Management", Work in progress, October 2004.

11. Informative References

   [802.1D-REV] IEEE Std. 802.1D-2003, "Media Access Control (MAC) Bridges".

   [802.1Q] IEEE Std. 802.1Q-2003, "Virtual Bridged Local Area Networks".

   [P802.1ad] IEEE Draft P802.1ad/D2.4, "Virtual Bridged Local Area Networks: Provider Bridges", Work in progress, September 2004.

   [BRIDGE-INTEROP] Sajassi, A., et al., "VPLS Interoperability with CE Bridges", draft-sajassi-l2vpn-vpls-bridge-interop-02.txt, Work in progress.

   [IGMP-SNOOP] Christensen, M., et al., "Considerations for IGMP and MLD Snooping Switches", Work in progress, May 2004.

   [P2MP-RSVP-TE] Aggarwal, R., Papadimitriou, D., Yasukawa, S., et al., "Extensions to RSVP-TE for Point to Multipoint TE LSPs", draft-ietf-mpls-rsvp-te-p2mp, Work in progress.

   [P2MP-LDP] Minei, I., Wijnands, I., et al., "Label Distribution Protocol Extensions for Point-to-Multipoint and Multipoint-to-Multipoint Label Switched Paths", draft-minei-wijnands-mpls-ldp-p2mp, Work in progress.

   [PIM] Estrin, D., Farinacci, D., Helmy, A., Thaler, D., Deering, S., Handley, M., and V. Jacobson, "Protocol Independent Multicast-Sparse Mode (PIM-SM): Protocol Specification", RFC 2362, June 1998.

   [VPLS-MCAST-REQ] Kamite, Y., et al., "Requirements for Multicast Support in Virtual Private LAN Services", draft-ietf-l2vpn-vpls-mcast-reqts, Work in progress.

   [L3VPN-MCAST] Rosen, E. and R. Aggarwal, "Multicast in MPLS/BGP IP VPNs", draft-ietf-l3vpn-2547bis-mcast-00, Work in progress.

12. Authors' Addresses

   Ali Sajassi
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, CA 95134
   Email: sajassi@cisco.com

   Nabil Bitar
   Verizon Communications
   40 Sylvan Rd.
   Waltham, MA 02451
   Email: nabil.bitar@verizon.com

   Yuji Kamite
   NTT Communications Corporation
   Tokyo Opera City Tower
   3-20-2 Nishi Shinjuku, Shinjuku-ku
   Tokyo 163-1421, Japan
   Email: y.kamite@ntt.com

13. Full Copyright Statement

   Copyright (C) The Internet Society (2005). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

14. Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.