Network Working Group                                              M. Xu
Internet-Draft                                                    Y. Cui
Intended status: Standards Track                     Tsinghua University
Expires: May 22, 2008                                            C. Metz
                                                     Cisco Systems, Inc.
                                                       November 19, 2007


            PE-based Multicast Framework for IPv6 Transition
                  draft-xu-softwire-4over6multicast-01

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on May 22, 2008.

Copyright Notice

Copyright (C) The IETF Trust (2007).

Xu, et al.                Expires May 22, 2008                  [Page 1]

Internet-Draft         softwire mcast framework            November 2007

Abstract

The Internet sometimes faces the following scenario: a set of customer networks is connected to a transit core that delivers messages among them, and the messages from one customer network to another are tunneled through the transit core. The tunnels are known as "softwires". This scenario is described in [I-D.ietf-softwire-mesh-framework]. The customer networks may also need to run IP multicast applications across the transit core. This memo provides a PE-based multicast framework for IPv6 transition.

Table of Contents

1. Introduction
2. Scenario
3. Schemes for Unicast Core
   3.1. Ingress Replication to All the Other PEs
   3.2. Ingress Replication to Necessary PEs
4. Schemes for Multicast Core
   4.1. RPF-Vector-Based Address Translation
   4.2. PIM-SSM based scheme
   4.3. Static PIM-SSM RPT
   4.4. Single PE-Based Static Tree by PIM-SM
5. Select a Tunneling Technology
6. Security Considerations
7. IANA Considerations
8. Acknowledgments
9. References
   9.1. Normative References
   9.2. Informative References
Authors' Addresses
Intellectual Property and Copyright Statements

1. Introduction

The multicast framework for IPv6 transition can be described as follows: the customer multicast source (or RP) is in one customer network, while its multicast group members may be in the same or different customer networks. When forwarded to other customer networks, multicast traffic must cross the transit core through softwires. In the control plane, the distribution trees must cover both the transit core and the customer networks, so the PEs need to connect the transit-core and customer distribution trees together for each multicast group.

There are two scenarios in softwire multicast, depending on the ISP's support: in one, a multicast control protocol is used to construct distribution trees in the transit core; in the other, the core does not run multicast and the PEs replicate the multicast data instead.

The two scenarios have one thing in common: PIM-SM (or PIM-SSM) or MLD/IGMP is deployed as the multicast control protocol to construct distribution trees in all customer networks.

After giving the detailed scenario in the next section, we introduce several candidate schemes for both the unicast core and the multicast core; some of the schemes will be further developed into more detailed solutions.

2. Scenario

                                                 --------
                                                |Receiver|
                                                 --------
                                                     |
              ._._._._.            ._._._._.         |
             |         |          |         |   -----------
             |  E-IP   |          |  E-IP   |--|RP/Source S|
             | network |          | network |   -----------
              ._._._._.            ._._._._.
                  |                    |
              PE router           upstream PE
              Dual-Stack           Dual-Stack
                  |                    |
                __+____________________+__
               /   :   :          :   :   \
              |    :     :      :     :    |  E-IP Multicast
              |     : I-IP transit core :  |  message should
              |    :     :      :     :    |  get across the
              |   :   :          :   :     |  I-IP transit core
               \_._._._._._._._._._._._._./
                  +                    +
            downstream PE        downstream PE
              Dual-Stack           Dual-Stack
                  |                    |
              ._._._._             ._._._._
  --------   |        |           |        |   --------
 |Receiver|--|  E-IP  |           |  E-IP #|--|Receiver|
  --------   |network |           |network |   --------
              ._._._._             ._._._._

                 Figure 1: Softwire Multicast Scenario

The softwire multicast framework is illustrated in Figure 1. The multicast RP/Source S is in one customer network, while its receivers are in the same or different customer networks. When they are not in the same customer network, they have to communicate with each other through the I-IP transit core.

Some terminology is defined as follows.

E-IP: The network address family of the customer networks.

I-IP: The network address family of the transit core.

PE: Provider edge router, which supports the address families of both I-IP and E-IP.

upstream PE: The PE router located upstream in the multicast data flow. It connects the transit core to the customer network that the RP/source S belongs to.

downstream PE: The PE router located downstream in the multicast data flow. It connects the transit core to a customer network that contains a group member of the RP/source S but not the RP/source S itself.

3. Schemes for Unicast Core

In some scenarios, the I-IP transit core does not run a multicast protocol, so the PEs do not construct multicast distribution trees in the I-IP transit core. Under this condition, the multicast control messages from the customer networks are encapsulated and decapsulated like ordinary packets to cross the core. There are two alternative schemes for this scenario, introduced in turn in this section.

3.1. Ingress Replication to All the Other PEs

This scheme treats the I-IP transit core as a LAN. When a multicast message arrives at a PE from a customer network, the ingress PE encapsulates the E-IP multicast message using I-IP addresses and unicasts it to all the other PEs. After receiving the encapsulated message, each egress PE decapsulates it and forwards the message to the attached customer network if there are receivers.

This scheme is easy to implement and keeps the cost of maintaining multicast state in the PEs low. However, it is inefficient and produces redundant traffic in the backbone, so its scalability is poor.

3.2. Ingress Replication to Necessary PEs

In this scheme, only the PEs that need the multicast messages receive the encapsulated messages. When a multicast data message arrives at PE A from a customer network, it is encapsulated and unicast to the PEs that have previously sent Join messages to PE A.
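
The replication rule just described amounts to a lookup from an E-IP (S, G) pair to the set of PEs that have joined it. The following Python fragment is only a minimal sketch of such a lookup. The class name, method names, and the use of plain address strings are assumptions made for illustration and are not defined by this memo.

```python
from collections import defaultdict


class EncapsulationTable:
    """Hypothetical per-PE table: E-IP (S, G) -> I-IP addresses of joined PEs."""

    def __init__(self):
        self._entries = defaultdict(set)

    def join(self, source, group, pe_addr):
        # Record that the PE at pe_addr has sent a Join for (source, group).
        self._entries[(source, group)].add(pe_addr)

    def prune(self, source, group, pe_addr):
        # Remove the PE; drop the entry when no PE is left for (source, group).
        pes = self._entries.get((source, group))
        if pes is None:
            return
        pes.discard(pe_addr)
        if not pes:
            del self._entries[(source, group)]

    def replication_targets(self, source, group):
        # PEs that should receive a unicast copy of a packet for (source, group).
        return sorted(self._entries.get((source, group), ()))
```

A data packet for an (S, G) pair that no PE has joined yields an empty target list, so nothing is tunneled for it.
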
To achieve this, PE A must construct and maintain a softwire multicast encapsulation table.

This scheme does not tunnel multicast messages to PEs that have no receivers attached. However, every PE must maintain a large number of multicast route entries. A trade-off is to add an aggregation mechanism to the multicast route entries, so that several (S, G) pairs can share the same entry.

4. Schemes for Multicast Core

The schemes in Section 3 do not use any multicast control protocol in the transit core. They fit well where the transit core cannot support a multicast protocol, but they introduce redundant traffic in the backbone. In this section, we consider the scenario where the transit core supports a multicast protocol; there are four alternative schemes.

4.1. RPF-Vector-Based Address Translation

The main idea of address translation is to translate the E-IP addresses in the Join/Prune messages to I-IP addresses, so that E-IP multicast messages can be translated to the corresponding I-IP multicast messages at the ingress PEs and translated back to E-IP multicast messages after arriving at the egress PEs. The translation procedure should follow a predefined rule, so that the ingress PE and egress PE can complete the translation and retranslation correctly without negotiation. For example, if E-IP is IPv4 and I-IP is IPv6, the ingress PE always uses a predefined IPv6 prefix to translate an IPv4 address to an IPv6 address: the predefined IPv6 prefix combined with the IPv4 address makes up the new IPv6 address in the IPv6 transit core. The egress PE can then easily recover the original IPv4 address by removing the predefined IPv6 prefix.

Since the source and group addresses in the I-IP Join/Prune message are translated from E-IP by adding a predefined I-IP prefix, P routers cannot recognize them and therefore cannot route the message to the corresponding egress PEs.
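
The prefix-based translation rule can be illustrated with a short Python sketch. The /96 prefix below is taken from the IPv6 documentation range purely as a placeholder; the actual prefix, its length, and the function names are assumptions and are not specified by this memo. The sketch also does not address how a translated group address would be made a valid IPv6 multicast address.

```python
import ipaddress

# Hypothetical predefined translation prefix (assumption, documentation range).
TRANSLATION_PREFIX = ipaddress.IPv6Network("2001:db8:0:1::/96")


def to_i_ip(e_ip: str) -> ipaddress.IPv6Address:
    """Ingress PE: place the IPv4 address in the low 32 bits of the prefix."""
    v4 = ipaddress.IPv4Address(e_ip)
    return ipaddress.IPv6Address(int(TRANSLATION_PREFIX.network_address) | int(v4))


def to_e_ip(i_ip: str) -> ipaddress.IPv4Address:
    """Egress PE: strip the predefined prefix to recover the IPv4 address."""
    v6 = ipaddress.IPv6Address(i_ip)
    if v6 not in TRANSLATION_PREFIX:
        raise ValueError("address was not produced by the translation prefix")
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)
```

Because both PEs share only the predefined prefix, the round trip needs no per-address state and no negotiation, which is the property the translation rule is meant to provide.
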
We use an RPF Vector in the Join/Prune message to route it in the I-IP transit core. The RPF Vector is an optional extended attribute of PIM that designates the router through which the Join/Prune message must pass. For example, PE A places the I-IP address of PE B in the RPF Vector of the Join/Prune message to help it find a route to PE B in the transit core. The Join/Prune message then builds a multicast tree in the transit core and finally arrives at PE B. When a multicast data packet arrives at PE B, it is translated to an I-IP packet, delivered along the I-IP multicast tree constructed in the core by the earlier Join/Prune message, and arrives at PE A. PE A then translates it back and forwards it to the E-IP network.

The address translation scheme is only available in the case where E-IP is IPv4 and I-IP is IPv6. Because IPv6 addresses are 128 bits long, an IPv4 address can be translated to an IPv6 address algorithmically by making the IPv4 address part of the IPv6 address. PEs can translate the IPv4 S and G into the corresponding IPv4-mapped IPv6 addresses [RFC4291] and later translate them back. The precise circumstances under which these translations are done would be a matter of policy. If E-IP is IPv6 and I-IP is IPv4, however, the translation cannot be achieved easily, and more research is needed for this case. Also, an additional RPF Vector must be applied to help construct the I-IP tree in the transit core.

To sum up, with the address translation method the same multicast message takes on different appearances in networks of different IP address families, and the I-IP multicast tree is part of the E-IP tree while presenting I-IP features.

4.2. PIM-SSM based scheme

In this scheme, we construct an I-IP PIM-SSM tree in the transit core for each E-IP PIM-SSM tree in the customer networks.
When a downstream PE receives a PIM Join/Prune message for (S,G) from a CE, it needs a multicast tunnel in the I-IP transit network. The downstream PE assigns an I-IP multicast address G' for (S,G) according to some rule and signals the corresponding upstream PE (with I-IP address S') in the transit core with a Join message for (S',G') that carries the original multicast address information. After receiving the I-IP Join message, the upstream PE extracts the E-IP (S,G) from it and uses it to send an E-IP Join message into the attached E-IP network. When this process completes, multicast trees have been constructed in both the E-IP and I-IP networks.

For the data plane, when S sends data to multicast group members that are not in the same customer network, the data packets first arrive at the upstream PE. The upstream PE encapsulates the data packets in I-IP multicast packets addressed to G' and delivers them to all the group members (the downstream PEs) through the (S', G') tree. The downstream PEs on the (S', G') tree then decapsulate the I-IP messages they receive and forward them to the corresponding routers, which are leaves of the (S, G) tree in the customer networks.

This scheme can support customer network multicast of either IPv4 or IPv6 over a transit core of the other protocol. However, it only works when the customer network multicast is SSM, since it provides no method for mapping a customer "prune a source off the (*, G) tree" operation into an operation on the (S', G') tree.

In this scheme, since each multicast group in the customer networks has a corresponding I-IP PIM-SSM tree in the transit core, the paths are optimal. On the other hand, the routers on the corresponding I-IP PIM-SSM trees in the transit core need to store state for the multicast groups.
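
The one-tree-per-(S,G) mapping can be sketched as follows. This memo leaves the rule for choosing G' open, so the rule used here (a CRC-32 of S and G placed in the low 32 bits of an assumed IPv6 SSM block) is only one hypothetical choice, picked so that every downstream PE derives the same G' for the same (S,G). The block ff3e::/96, the class name, and the method name are all assumptions, and hash collisions are ignored.

```python
import ipaddress
import zlib

# Assumed I-IP (IPv6) multicast block from which G' is taken (illustrative only).
GROUP_BLOCK = ipaddress.IPv6Network("ff3e::/96")


class SsmTreeMapper:
    """Hypothetical downstream-PE mapping from E-IP (S, G) to I-IP (S', G')."""

    def __init__(self):
        self.trees = {}  # one entry per I-IP tree this PE has joined

    def map_join(self, source, group, upstream_pe):
        s = ipaddress.ip_address(source)
        g = ipaddress.ip_address(group)
        key = (s, g)
        if key not in self.trees:
            # Deterministic rule: hash (S, G) into the low 32 bits of the block.
            low32 = zlib.crc32(s.packed + g.packed)
            g_prime = ipaddress.IPv6Address(
                int(GROUP_BLOCK.network_address) | low32
            )
            # S' is the I-IP address of the upstream PE.
            self.trees[key] = (ipaddress.IPv6Address(upstream_pe), g_prime)
        return self.trees[key]
```

In this sketch, the number of entries in `trees` grows by one for every distinct customer (S,G), which is the state cost of this scheme.
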
The storage needed is proportional to the number of multicast trees passing through each router.

4.3. Static PIM-SSM RPT

Common PIM-SM requires the transit core to maintain one or more source trees specific to each multicast group. Each such tree requires state to be maintained in all the routers on the tree, which may place too much load on the backbone. A trade-off is to construct, for each PE, one source tree whose leaves are all the other PEs, and to carry all the data from that PE through this tree. Some PEs may then receive messages they do not need, but the scheme balances the amount of state against the optimality of multicast routing.

Since softwire unicast uses BGP to auto-discover other members, we can use this membership to construct PIM-SSM trees whose source is a PE and whose leaves are all the other PEs. All data from that PE is then forwarded through this tree in the transit core. Such a tree must be constructed for each PE as source. This can be achieved as follows: as soon as a PE discovers a new PE, it sends a Join message for the PIM-SSM tree whose source is the new PE. The group addresses for these trees are uniquely allocated.

When a router in a customer network wants to join (or prune from) a multicast group whose RP (or source S) is in another customer network, the Join/Prune message must first be delivered to the PE that connects this customer network to the transit core. The PE encapsulates the Join/Prune message in an I-IP unicast message whose I-IP destination address is that of the corresponding upstream PE and delivers it to the upstream PE. When the Join/Prune message arrives at the upstream PE, it is decapsulated and forwarded to the RP (or the source) in the attached customer network. A multicast tree in the customer networks can be constructed in this way.

When the RP (or the source S) distributes data to its multicast group members, the group members in the same customer network as the RP (or the source) receive the data messages directly through PIM. For the group members that are not in the same customer network as the RP (or source), the E-IP data messages are first sent to the upstream PE and encapsulated in I-IP multicast messages whose destination is the group address of the tree rooted at the upstream PE; they are then delivered to all the other PEs. A PE that receives these data messages discards them if there are no receivers in the attached E-IP network; otherwise, it decapsulates the I-IP messages to E-IP ones and forwards them to the corresponding CE routers in the customer network.

In this scheme, regardless of the multicast addresses of the E-IP multicast data messages, all messages that share the same upstream PE are distributed through the same I-IP PIM-SSM tree rooted at that PE, and the encapsulated messages carry the same I-IP multicast address.

4.4. Single PE-Based Static Tree by PIM-SM

Another trade-off scheme is to construct only one bidirectional RPT based on BIDIR-PIM (Bidirectional PIM) in the transit core, whose leaves are all the PEs in the transit core, and to carry data from every PE through this tree. As in Section 4.3, some PEs may receive messages they do not need, and the scheme likewise balances the amount of state against the efficiency of multicast routing.

When a router in a customer network wants to join (or prune from) a multicast group whose RP (or source) is in another customer network, the Join/Prune message must first be delivered to the PE that connects this customer network to the transit core.
The PE then encapsulates the Join/Prune message in an I-IP unicast message and delivers it to the upstream PE. When the Join/Prune message arrives at the upstream PE, it is decapsulated and forwarded to the RP (or the source) in the customer network. A multicast tree in the customer networks can be constructed in this way.

When the RP (or source S) distributes data to its multicast group members in the customer networks, the group members in the same customer network as the RP (or S) receive the data messages directly through PIM. For the group members that are not in the same customer network as the RP (or S), the data messages are first sent to the PE and encapsulated in multicast messages addressed to the I-IP multicast address of the bidirectional RPT, so the data is forwarded along the RPT. When PEs receive these I-IP multicast messages, they decapsulate them and obtain the corresponding E-IP multicast address. If there are no receivers in the attached E-IP network, the messages are discarded; otherwise, the PE forwards the decapsulated E-IP messages to the corresponding CE routers in the customer network.

5. Select a Tunneling Technology

When encapsulating multicast control messages, we follow the tunneling technology selection policy used for softwire unicast. The tunneling technology can be GRE, IP-in-IP, MPLS, etc. Here we use IP-in-IP; other tunneling methods are similar. A detailed solution will be described in the next revision.

6. Security Considerations

The PE routers could maintain secure communications through the use of the Security Architecture for the Internet Protocol, as described in [RFC4301].

7. IANA Considerations

In the scheme of Section 4.1, address translation is applied and should follow a predefined rule. In particular, the format of the IPv6 prefix used for translation should be predefined, so that the ingress PE and egress PE can complete the translation and retranslation correctly. The format of the IPv6 prefix for translation can be unified either within the transit core only or globally. In the latter case, the format should be assigned by IANA.

8. Acknowledgments

Meijia Hou, Yuntao Zhou, and Junfang Han provided useful input into this document.

9. References

9.1. Normative References

[RFC2362]
   Estrin, D., Farinacci, D., Helmy, A., Thaler, D., Deering, S., Handley, M., and V. Jacobson, "Protocol Independent Multicast-Sparse Mode (PIM-SM): Protocol Specification", RFC 2362, June 1998.

[RFC2373]
   Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 2373, July 1998.

[RFC4291]
   Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, February 2006.

[RFC4301]
   Kent, S. and K. Seo, "Security Architecture for the Internet Protocol", RFC 4301, December 2005.

9.2. Informative References

[I-D.ietf-l3vpn-2547bis-mcast]
   Rosen, E. and R. Aggarwal, "Multicast in MPLS/BGP IP VPNs", draft-ietf-l3vpn-2547bis-mcast-05 (work in progress), July 2007.

[I-D.ietf-l3vpn-2547bis-mcast-bgp]
   Aggarwal, R., "BGP Encodings and Procedures for Multicast in MPLS/BGP IP VPNs", draft-ietf-l3vpn-2547bis-mcast-bgp-03 (work in progress), July 2007.

[I-D.ietf-pim-rpf-vector]
   Wijnands, I., "The RPF Vector TLV", draft-ietf-pim-rpf-vector-04 (work in progress), July 2007.

[I-D.ietf-softwire-4over6vpns]
   Shepherd, G., "IPv4 unicast/multicast VPNs over an IPv6 core", draft-ietf-softwire-4over6vpns-00 (work in progress), June 2006.

[I-D.ietf-softwire-mesh-framework]
   Wu, J., "Softwire Mesh Framework", draft-ietf-softwire-mesh-framework-02 (work in progress), July 2007.

Authors' Addresses

Mingwei Xu
Tsinghua University
Department of Computer Science, Tsinghua University
Beijing 100084
P.R.China

Phone: +86-10-6278-5822
Email: xmw@csnet1.cs.tsinghua.edu.cn

Yong Cui
Tsinghua University
Department of Computer Science, Tsinghua University
Beijing 100084
P.R.China

Phone: +86-10-6278-5822
Email: yong@csnet1.cs.tsinghua.edu.cn

Chris Metz
Cisco Systems, Inc.
3700 Cisco Way
San Jose, Ca. 95134
USA

Email: chmetz@cisco.com

Full Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgment

Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).