Network Working Group                                              M. Xu
Internet-Draft                                                    Y. Cui
Intended status: Standards Track                     Tsinghua University
Expires: May 22, 2008                                            C. Metz
                                                     Cisco Systems, Inc.
                                                       November 19, 2007


            PE-based Multicast Framework for IPv6 Transition
                  draft-xu-softwire-4over6multicast-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 22, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).


Xu, et al.                Expires May 22, 2008                  [Page 1]

Internet-Draft          softwire mcast framework           November 2007


Abstract

   In some deployments, a set of customer networks communicate with each
   other through a transit core that delivers messages on their behalf;
   the messages from one customer network to another are tunneled
   through the transit core.  The tunnels are known as "softwires".
   This scenario is described in [I-D.ietf-softwire-mesh-framework].
   The customer networks may also need to run IP multicast applications
   across the transit core.  This memo provides a PE-based multicast
   framework for IPv6 transition.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3

   2.  Scenario . . . . . . . . . . . . . . . . . . . . . . . . . . .  4

   3.  Schemes for Unicast Core . . . . . . . . . . . . . . . . . . .  6
     3.1.  Ingress Replication to All the Other PEs . . . . . . . . .  6
     3.2.  Ingress Replication to Necessary PEs . . . . . . . . . . .  6

   4.  Schemes for Multicast Core . . . . . . . . . . . . . . . . . .  7
     4.1.  RPF-Vector-Based Address Translation . . . . . . . . . . .  7
     4.2.  PIM-SSM based scheme . . . . . . . . . . . . . . . . . . .  8
     4.3.  Static PIM-SSM RPT . . . . . . . . . . . . . . . . . . . .  9
     4.4.  Single PE-Based Static Tree by PIM-SM  . . . . . . . . . . 10

   5.  Select a Tunneling Technology  . . . . . . . . . . . . . . . . 11

   6.  Security Considerations  . . . . . . . . . . . . . . . . . . . 12

   7.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 13

   8.  Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . 14

   9.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 15
     9.1.  Normative References . . . . . . . . . . . . . . . . . . . 15
     9.2.  Informative References . . . . . . . . . . . . . . . . . . 15

   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 16
   Intellectual Property and Copyright Statements . . . . . . . . . . 17


1.  Introduction

   The multicast framework for IPv6 transition can be described as
   follows: the customer multicast source (or RP) is in one customer
   network, while its multicast group members may be in the same or
   different customer networks.  Multicast traffic destined for other
   customer networks must transit the core through softwires.  In the
   control plane, the distribution trees must cover both the transit
   core and the customer networks, so the PEs need to connect the
   transit core and customer distribution trees for each multicast
   group.

   There are two scenarios in softwire multicast, depending on what the
   ISP supports: in one, a multicast control protocol constructs
   distribution trees in the transit core; in the other, the core does
   not run multicast and the PEs replicate the multicast data instead.

   The two scenarios have one thing in common: PIM-SM (SSM) or MLD/IGMP
   is deployed as the multicast control protocol to construct
   distribution trees in all customer networks.

   After giving the detailed scenario in the next section, we introduce
   several candidate schemes for both a unicast core and a multicast
   core.  Some of the schemes will be further developed into more
   detailed solutions.


2.  Scenario


                                               --------
                                              |Receiver|
                                               --------
                                                 |
                         ._._._._.            ._._._._.
                        |         |          |         |   -----------
                        |  E-IP   |          |  E-IP   |--|RP/Source S|
                        | network |          | network |   -----------
                         ._._._._.            ._._._._.
                            |                    |
                        PE router            upstream PE
                        Dual-Stack           Dual-Stack
                            |                    |
                          __+____________________+__
                         /   :   :           :   :  \
                        |    :      :      :     :   |  E-IP Multicast
                        |    : I-IP transit core :   |  message should
                        |    :     :       :     :   |  get across the
                        |    :   :            :  :   | I-IP transit core
                         \_._._._._._._._._._._._._./
                             +                   +
                        downstream PE      downstream PE
                         Dual-Stack          Dual-Stack
                             |                    |
                          ._._._._            ._._._._
             --------    |        |          |        |   --------
            |Receiver|-- |  E-IP  |          |  E-IP  |--|Receiver|
             --------    |network |          |network |   --------
                          ._._._._            ._._._._


   Figure 1: Softwire Multicast Scenario

   The softwire multicast framework is illustrated in Figure 1.
   Multicast RP/Source S is in one customer network, while its receivers
   are in the same or different customer networks.  When they are not in
   the same customer network, they have to communicate with each other
   through the I-IP transit core.  Some terminology is defined as
   follows.

   E-IP: The network address family of the customer networks.

   I-IP: The network address family of the transit core.

   PE: Provider edge router, which supports the address families of both
   I-IP and E-IP.

   upstream PE: The PE router located upstream in the multicast data
   flow, which connects the transit core to the customer network that
   the RP/source S belongs to.

   downstream PE: The PE router located downstream in the multicast data
   flow, which connects the transit core to a customer network that
   contains a group member but not the RP/source.


3.  Schemes for Unicast Core

   In some scenarios, the I-IP transit core does not run a multicast
   protocol, so PEs do not construct multicast distribution trees in the
   I-IP transit core.  Under this condition, the multicast control
   messages from customer networks are encapsulated and decapsulated as
   ordinary unicast packets to get across the core.  There are two
   alternative schemes in this scenario, introduced in turn in this
   section.

3.1.  Ingress Replication to All the Other PEs

   This scheme treats the I-IP transit core as a LAN.  When a multicast
   message arrives at a PE from a customer network, the ingress PE
   encapsulates the E-IP multicast message using I-IP addresses and
   unicasts a copy to every other PE.  After receiving the encapsulated
   message, each egress PE decapsulates it and forwards the message to
   the attached customer network if there are receivers.

   This scheme is easy to implement and keeps the cost of maintaining
   multicast state in the PEs low.  However, it is inefficient and
   produces redundant traffic in the backbone, so its scalability is
   poor.
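
   The replication step can be sketched as follows.  This is only an
   illustration under stated assumptions: the TunnelPacket type, the
   function name, and the use of address strings to identify PEs are
   hypothetical and are not defined by this memo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TunnelPacket:
    outer_src: str  # I-IP address of the ingress PE
    outer_dst: str  # I-IP address of one egress PE
    payload: bytes  # the unmodified E-IP multicast packet


def replicate_to_all(ingress_pe, all_pes, e_ip_packet):
    """Return one unicast tunnel packet for every PE except the ingress PE."""
    return [
        TunnelPacket(ingress_pe, pe, e_ip_packet)
        for pe in all_pes
        if pe != ingress_pe
    ]
```

   With N PEs, every E-IP multicast packet produces N-1 copies in the
   core regardless of where the receivers are, which is the redundancy
   noted above.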

3.2.  Ingress Replication to Necessary PEs

   In this scheme, only the PEs that need the multicast messages receive
   the encapsulated messages.  When a multicast data message arrives at
   PE A from a customer network, it is encapsulated and unicast to the
   PEs that have previously sent Join messages to PE A.  To achieve this,
   PE A must construct and maintain a softwire multicast encapsulation
   table.

   This scheme does not tunnel multicast messages to PEs that have no
   receivers attached.  However, every PE must maintain a large number
   of multicast route entries.  A trade-off is to add an aggregation
   mechanism to the multicast route entries, so that several (S, G)
   pairs can share the same entry.
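
   One possible shape for the softwire multicast encapsulation table is
   sketched below.  The class and method names are hypothetical, the
   table is keyed by the exact (S, G) pair, and the aggregation
   mechanism mentioned above is deliberately left out.

```python
from collections import defaultdict


class EncapsulationTable:
    """Hypothetical table on PE A: (S, G) -> I-IP addresses of joined PEs."""

    def __init__(self):
        self._entries = defaultdict(set)

    def join(self, source, group, pe):
        """Record a Join for (S, G) received from a remote PE."""
        self._entries[(source, group)].add(pe)

    def prune(self, source, group, pe):
        """Remove a PE from (S, G); drop the entry once it is empty."""
        key = (source, group)
        self._entries[key].discard(pe)
        if not self._entries[key]:
            del self._entries[key]

    def egress_pes(self, source, group):
        """PEs that must receive a unicast copy of an (S, G) data packet."""
        return set(self._entries.get((source, group), ()))
```

   A data packet for (S, G) is then encapsulated once for each address
   returned by egress_pes(); an empty result means no copy is sent into
   the core.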


4.  Schemes for Multicast Core

   The schemes in Section 3 do not use any multicast control protocol in
   the transit core.  They fit well where the transit core cannot
   support a multicast protocol, but they introduce redundant traffic in
   the backbone.  In this section, we consider the scenario where the
   transit core supports a multicast protocol; there are four
   alternative schemes.

4.1.  RPF-Vector-Based Address Translation

   The main idea of address translation is to translate the E-IP
   addresses of the Join/Prune messages to I-IP addresses, so that E-IP
   multicast messages can be translated to corresponding I-IP multicast
   messages at ingress PEs and translated back to E-IP multicast
   messages on arrival at egress PEs.  The translation procedure should
   follow a predefined rule, so that the ingress PE and egress PE can
   perform the translation and reverse translation correctly without
   negotiation.  For example, if E-IP is IPv4 and I-IP is IPv6, the
   ingress PE always uses a predefined IPv6 prefix to translate an IPv4
   address to an IPv6 address: the predefined IPv6 prefix combined with
   the IPv4 address makes up the new IPv6 address in the IPv6 transit
   core.  The egress PE can then recover the original IPv4 address by
   simply removing the predefined IPv6 prefix.
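
   The prefix-based translation can be illustrated with a short sketch.
   The two /96 prefixes below are hypothetical placeholders, since this
   memo does not fix their values.  The sketch also assumes separate
   prefixes for S and G, because an IPv6 group address has to fall
   within ff00::/8 to be treated as multicast.

```python
import ipaddress

# Hypothetical /96 prefixes; the actual values are not defined by this memo.
SOURCE_PREFIX = ipaddress.IPv6Network("2001:db8:ffff::/96")
GROUP_PREFIX = ipaddress.IPv6Network("ff3e::/96")


def translate_in(v4, prefix):
    """Ingress direction: place the 32-bit IPv4 address under the prefix."""
    v4_int = int(ipaddress.IPv4Address(v4))
    return ipaddress.IPv6Address(int(prefix.network_address) | v4_int)


def translate_out(v6, prefix):
    """Egress direction: remove the prefix and recover the IPv4 address."""
    if v6 not in prefix:
        raise ValueError("address does not carry the translation prefix")
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)
```

   Because both directions are pure functions of the prefix, the ingress
   and egress PEs need no per-flow negotiation; they only have to agree
   on the prefixes.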

   Since the source and group addresses in the I-IP Join/Prune message
   are translated from E-IP by adding a predefined I-IP prefix, P
   routers cannot use them to route the message toward the
   corresponding PE.  We therefore use an RPF Vector
   [I-D.ietf-pim-rpf-vector] in the Join/Prune message to route it in
   the I-IP transit core.  The RPF Vector is an optional PIM attribute
   that designates a router through which the Join/Prune message must
   pass.  For example, PE A fills in the I-IP address of PE B in the RPF
   Vector of the Join/Prune message to help it find a route to PE B in
   the transit core.  The Join/Prune message then builds a multicast
   tree in the transit core and finally arrives at PE B.
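
   The effect of the RPF Vector on a P router can be sketched as
   follows.  The Join type, the routing table layout, and the function
   name are hypothetical; the sketch only shows that the upstream
   lookup uses the vector address (PE B) instead of the translated
   source address when a vector is present.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Join:
    source: str                       # translated I-IP source address
    group: str                        # translated I-IP group address
    rpf_vector: Optional[str] = None  # I-IP address of PE B, if present


def upstream_next_hop(join, routes):
    """Longest-prefix match on the RPF Vector if present, else on the source.

    `routes` maps ipaddress network objects to next-hop names.
    Returns None when the lookup target is not routable.
    """
    target = ipaddress.ip_address(join.rpf_vector or join.source)
    matches = [net for net in routes if target in net]
    if not matches:
        return None
    return routes[max(matches, key=lambda net: net.prefixlen)]
```

   Without the vector, a lookup on the translated source finds no route
   in the core; with the vector set to PE B, the Join follows the
   ordinary unicast route toward PE B.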

   When a multicast data packet arrives at PE B, it is translated to an
   I-IP packet, delivered along the I-IP multicast tree that the earlier
   Join/Prune message constructed in the core, and arrives at PE A.  PE
   A then translates it back and forwards it to the E-IP network.

   The address translation scheme is only available in the case where
   E-IP is IPv4 and I-IP is IPv6.  Because IPv6 addresses are 128 bits
   long, an IPv4 address can be translated to an IPv6 address
   algorithmically by making the IPv4 address part of the IPv6 address.
   PEs can translate the IPv4 S and G into corresponding IPv4-mapped
   IPv6 addresses [RFC4291], and then translate them back.  The precise
   circumstances under which these translations are done would be a
   matter of policy.  If E-IP is IPv6 and I-IP is IPv4, however, the
   translation cannot be achieved easily, and more research is needed
   for this case.  Also, an additional RPF Vector must be applied to
   help construct the I-IP tree in the transit core.  To sum up, with
   address translation the same multicast message takes on different
   appearances in networks of different IP address families, and the
   I-IP multicast tree is a part of the E-IP tree that presents an I-IP
   appearance.

4.2.  PIM-SSM based scheme

   In this scheme, we construct an I-IP PIM-SSM tree in the transit core
   for each E-IP PIM-SSM tree in the customer networks.

   When a downstream PE receives a PIM Join/Prune message for (S,G) from
   a CE, it needs a multicast tunnel in the I-IP transit network.  The
   downstream PE assigns an I-IP multicast address G' for (S,G)
   according to some rule, and signals the corresponding upstream PE
   (with I-IP address S') in the transit core with a join message
   (S',G') that includes the original multicast address information.

   After receiving the I-IP join message, the upstream PE extracts the
   E-IP addresses (S,G) from the message and uses them to send an E-IP
   join message to the attached E-IP network.  After the whole process,
   multicast trees are constructed in both the E-IP and I-IP networks.
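
   The control-plane exchange above can be sketched as follows.  The
   rule for deriving G' is not specified in this memo; hashing (S,G)
   into a hypothetical IPv6 SSM block is only one assumption, and it
   does not handle hash collisions.  The dictionary layout of the join
   message and all function names are likewise hypothetical.

```python
import hashlib
import ipaddress

# Hypothetical I-IP SSM block, assuming the transit core is IPv6.
SSM_BASE = int(ipaddress.IPv6Address("ff3e::"))


def assign_group(s, g):
    """Hypothetical rule: hash (S, G) into the low 32 bits of the block."""
    digest = hashlib.sha256(f"{s},{g}".encode()).digest()
    low_bits = int.from_bytes(digest[:4], "big")
    return str(ipaddress.IPv6Address(SSM_BASE | low_bits))


def build_i_ip_join(s, g, upstream_pe):
    """Downstream PE: build the (S', G') join carrying the original (S, G)."""
    return {"source": upstream_pe, "group": assign_group(s, g), "e_ip_sg": (s, g)}


def handle_i_ip_join(join, tree_map):
    """Upstream PE: record (S, G) -> (S', G'); return the E-IP join to send."""
    tree_map[join["e_ip_sg"]] = (join["source"], join["group"])
    return join["e_ip_sg"]
```

   A deterministic rule has the useful property that all downstream PEs
   joining the same (S,G) derive the same G' and therefore share one
   I-IP tree rooted at the upstream PE.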

   In the data plane, when S sends data to multicast group members that
   are not in the same customer network, the data packets first arrive
   at the upstream PE.  The upstream PE encapsulates the data packets in
   I-IP multicast packets with the multicast address G', and delivers
   them to all the group members (the downstream PEs) through the tree
   (S', G').  The downstream PEs in the (S', G') tree then decapsulate
   the I-IP messages they receive and forward them to the corresponding
   routers, which are leaves of the (S, G) tree in the customer
   networks.

   This scheme can be used to support customer network multicast of
   either IPv4 or IPv6 over a transit core of the opposite protocol.
   However, it only works when the customer network multicast is SSM,
   since it provides no method for mapping a customer "prune a source
   off the (*, G) tree" operation into an operation on the (S', G')
   tree.

   In this scheme, since each multicast group in the customer networks
   has a corresponding I-IP PIM-SSM tree in the transit core, the
   forwarding paths are optimal.  Meanwhile, the routers on the
   corresponding I-IP PIM-SSM trees in the transit core need to store
   state information for the multicast groups.  The storage needed is
   proportional to the number of multicast trees passing through each
   router.

4.3.  Static PIM-SSM RPT

   Common PIM-SM requires the transit core to maintain one or more
   source trees that are specific to a particular multicast group.  Each
   such tree requires that state be maintained in all the routers on the
   tree.  This may place too much load on the backbone.  A trade-off
   solution is to construct, for each PE, a source tree whose leaves are
   all the other PEs, and to transfer all the data from that PE through
   this tree.  This may cause some PEs to receive messages they do not
   need, but it balances the amount of state against the optimality of
   the multicast routing.

   Since softwire unicast uses BGP to auto-discover other members, we
   can use this membership to construct PIM-SSM trees whose source is a
   PE and whose leaves are all the other PEs.  All data from that PE can
   then be forwarded through this tree in the transit core.  Such a tree
   must be constructed for each PE as source.  This can be achieved as
   follows: as soon as a PE discovers a new PE, it sends a Join message
   for the PIM-SSM tree whose source is the new PE.  The group addresses
   for these trees are uniquely allocated.
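
   The tree construction triggered by PE discovery can be sketched as
   follows.  The class name, the callback name, and the group_for_pe
   mapping (standing in for whatever mechanism allocates the unique
   group addresses) are hypothetical.

```python
class StaticTreeState:
    """Per-PE record of the static (PE, G) trees already joined."""

    def __init__(self, local_pe, group_for_pe):
        self.local_pe = local_pe
        # Hypothetical allocation: I-IP address of a PE -> its unique group.
        self.group_for_pe = group_for_pe
        self.joined = set()

    def on_pe_discovered(self, new_pe):
        """Return the (S', G') join to send, or None if none is needed."""
        if new_pe == self.local_pe or new_pe in self.joined:
            return None
        self.joined.add(new_pe)
        return (new_pe, self.group_for_pe[new_pe])
```

   Each PE runs this logic for every other PE it learns about, so that
   every PE ends up as the source of one tree whose leaves are all the
   other PEs.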

   When a router in a customer network wants to join (or prune from) a
   multicast group whose RP (or source S) is in another customer
   network, the Join/Prune message must first be delivered to the PE
   that connects this customer network to the transit core.  The PE
   then encapsulates the Join/Prune message in an I-IP unicast message
   whose I-IP destination address is that of the corresponding upstream
   PE, and delivers it to the upstream PE.  When the Join/Prune message
   arrives at the upstream PE, it is decapsulated and forwarded to the
   RP (or the source) in the customer network connected to it.  A
   multicast tree in the customer networks is constructed in this way.

   When the RP (or the source S) distributes data to its multicast
   group members, the group members in the same customer network as the
   RP (or the source) receive the data messages directly through the PIM
   protocol.  For the group members that are not in the same customer
   network as the RP (or source), the E-IP data messages are first sent
   to the upstream PE and encapsulated in I-IP multicast messages
   addressed to the multicast address of the tree whose source is the
   upstream PE, and are then delivered to all the other PEs.  A PE that
   receives these data messages discards them if there are no receivers
   in the attached E-IP network; otherwise it decapsulates the I-IP
   messages into E-IP ones and forwards them to the corresponding CE
   routers in the customer networks.

   In this scheme, regardless of the multicast addresses of the E-IP
   multicast data messages, all messages that share the same upstream
   PE are distributed through the same I-IP PIM-SSM tree whose source is
   that upstream PE, and the I-IP multicast addresses of the
   encapsulated multicast data messages are the same.

4.4.  Single PE-Based Static Tree by PIM-SM

   Another trade-off scheme is to construct only one bidirectional RPT
   based on BIDIR-PIM (Bidirectional PIM) in the transit core, whose
   leaves are all the PEs in the transit core, and to transfer data from
   every PE through this tree.  As in Section 4.3, this scheme may cause
   some PEs to receive messages they do not need, and it likewise
   balances the amount of state against the efficiency of the multicast
   routing.
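
   The state trade-off among the schemes of Sections 4.2, 4.3, and this
   section can be made concrete by counting the I-IP trees that the
   core may have to maintain.  The function below is only a rough
   sketch: its name and parameters are hypothetical, and it counts
   trees rather than per-router state.

```python
def core_tree_count(scheme, num_pes, num_customer_trees):
    """Upper bound on the number of I-IP trees in the transit core.

    `num_customer_trees` is the number of customer (S, G) trees that
    cross the core; `scheme` is the section number of the scheme.
    """
    if scheme == "4.2":  # one I-IP PIM-SSM tree per customer tree
        return num_customer_trees
    if scheme == "4.3":  # one static source tree per PE
        return num_pes
    if scheme == "4.4":  # a single bidirectional tree shared by all PEs
        return 1
    raise ValueError(f"unknown scheme: {scheme}")
```

   For example, with 10 PEs and 500 customer trees crossing the core,
   the bounds are 500, 10, and 1 trees respectively; the schemes with
   fewer trees pay for it by delivering some traffic to PEs that do not
   need it.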

   When a router in a customer network wants to join (or prune from) a
   multicast group whose RP (or source) is in another customer network,
   the Join/Prune message must first be delivered to the PE that
   connects this customer network to the transit core.  The PE then
   encapsulates the Join/Prune message in an I-IP unicast message and
   delivers it to the upstream PE.  When the Join/Prune message arrives
   at the upstream PE, it is decapsulated and forwarded to the RP (or
   the source) in the customer network.  A multicast tree in the
   customer networks is constructed in this way.

   When the RP (or source S) distributes data to its multicast group
   members in customer networks, the group members in the same customer
   network as the RP (or S) receive the data messages directly through
   the PIM protocol.  For the group members that are not in the same
   customer network as the RP (or S), the data messages are first sent
   to the PE and encapsulated in multicast messages with the I-IP
   multicast address of the bidirectional RPT, so that the data is
   forwarded along the RPT.  When PEs receive these I-IP multicast
   messages, they decapsulate them and obtain the corresponding E-IP
   multicast address.  If there are no receivers in the attached E-IP
   network, the messages are discarded; otherwise the PE forwards the
   decapsulated E-IP messages to the corresponding CE routers in the
   customer networks.
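
   The egress decision described above can be sketched as follows,
   assuming that E-IP is IPv4 and that the outer I-IP header has
   already been removed.  The function names and the
   receivers_by_group mapping are hypothetical.

```python
import ipaddress


def inner_ipv4_group(inner):
    """Read the destination (group) address from the inner IPv4 header."""
    if len(inner) < 20 or inner[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    return str(ipaddress.IPv4Address(inner[16:20]))


def egress_decision(inner, receivers_by_group):
    """Return the CE routers to forward to; an empty list means discard."""
    group = inner_ipv4_group(inner)
    return sorted(receivers_by_group.get(group, ()))
```

   Since every PE is a leaf of the shared tree, this check is what
   keeps unwanted traffic from leaking into customer networks that have
   no receivers for the group.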


5.  Select a Tunneling Technology

   When encapsulating multicast control messages, we follow the same
   policy for choosing a tunneling technology as softwire unicast.  The
   tunneling technology can be GRE, IP-in-IP, MPLS, etc.  Here we use
   IP-in-IP; other tunneling methods are handled similarly.  A detailed
   solution will be described in the next revision.
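
   As an illustration of IP-in-IP, the sketch below wraps an IPv4
   packet in an IPv6 header, assuming that E-IP is IPv4 and I-IP is
   IPv6.  The function names and the default hop limit are assumptions;
   the sketch sets the traffic class and flow label to zero and ignores
   fragmentation and MTU handling.

```python
import ipaddress
import struct

IPPROTO_IPIP = 4  # IPv6 Next Header value for an encapsulated IPv4 packet


def encapsulate_4in6(inner, src, dst, hop_limit=64):
    """Prepend a 40-byte IPv6 header to an IPv4 packet."""
    first_word = 6 << 28  # version 6, traffic class 0, flow label 0
    header = struct.pack("!IHBB", first_word, len(inner), IPPROTO_IPIP, hop_limit)
    header += ipaddress.IPv6Address(src).packed
    header += ipaddress.IPv6Address(dst).packed
    return header + inner


def decapsulate_4in6(outer):
    """Strip the IPv6 header and return the inner IPv4 packet."""
    if len(outer) < 40 or outer[0] >> 4 != 6 or outer[6] != IPPROTO_IPIP:
        raise ValueError("not an IPv4-in-IPv6 packet")
    (payload_len,) = struct.unpack("!H", outer[4:6])
    return outer[40:40 + payload_len]
```

   For a unicast tunnel the outer destination is the I-IP address of
   the remote PE; for the multicast-core schemes it would instead be
   the I-IP group address of the tree.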


6.  Security Considerations

   The PE routers could maintain secure communications through the use
   of the Security Architecture for the Internet Protocol as described
   in [RFC4301].


7.  IANA Considerations

   In the solution of Section 4.1, address translation is applied, and
   it should follow a predefined rule.  In particular, the format of the
   IPv6 prefix for translation should be predefined, so that the ingress
   PE and egress PE can perform the translation and reverse translation
   correctly.  The format of the IPv6 prefix for translation can be
   unified either within the transit core only or globally.  In the
   latter case, the format should be assigned by IANA.


8.  Acknowledgments

   Meijia Hou, Yuntao Zhou, and Junfang Han provided useful input into
   this document.


9.  References

9.1.  Normative References

   [RFC2362]  Estrin, D., Farinacci, D., Helmy, A., Thaler, D., Deering,
              S., Handley, M., and V. Jacobson, "Protocol Independent
              Multicast-Sparse Mode (PIM-SM): Protocol Specification",
              RFC 2362, June 1998.

   [RFC2373]  Hinden, R. and S. Deering, "IP Version 6 Addressing
              Architecture", RFC 2373, July 1998.

   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
              Architecture", RFC 4291, February 2006.

   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
              Internet Protocol", RFC 4301, December 2005.

9.2.  Informative References

   [I-D.ietf-l3vpn-2547bis-mcast]
              Rosen, E. and R. Aggarwal, "Multicast in MPLS/BGP IP
              VPNs", draft-ietf-l3vpn-2547bis-mcast-05 (work in
              progress), July 2007.

   [I-D.ietf-l3vpn-2547bis-mcast-bgp]
              Aggarwal, R., "BGP Encodings and Procedures for Multicast
              in MPLS/BGP IP VPNs",
              draft-ietf-l3vpn-2547bis-mcast-bgp-03 (work in progress),
              July 2007.

   [I-D.ietf-pim-rpf-vector]
              Wijnands, I., "The RPF Vector TLV",
              draft-ietf-pim-rpf-vector-04 (work in progress),
              July 2007.

   [I-D.ietf-softwire-4over6vpns]
              Shepherd, G., "IPv4 unicast/multicast VPNs over an IPv6
              core", draft-ietf-softwire-4over6vpns-00 (work in
              progress), June 2006.

   [I-D.ietf-softwire-mesh-framework]
              Wu, J., "Softwire Mesh Framework",
              draft-ietf-softwire-mesh-framework-02 (work in progress),
              July 2007.


Authors' Addresses

   Mingwei Xu
   Tsinghua University
   Department of Computer Science, Tsinghua University
   Beijing  100084
   P.R.China

   Phone: +86-10-6278-5822
   Email: xmw@csnet1.cs.tsinghua.edu.cn


   Yong Cui
   Tsinghua University
   Department of Computer Science, Tsinghua University
   Beijing  100084
   P.R.China

   Phone: +86-10-6278-5822
   Email: yong@csnet1.cs.tsinghua.edu.cn


   Chris Metz
   Cisco Systems, Inc.
   3700 Cisco Way
   San Jose, Ca.  95134
   USA

   Email: chmetz@cisco.com


Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).

