Network Working Group                                  IJsbrand Wijnands
Internet Draft                                               Arjen Boers
Expiration Date: December 2004                                Eric Rosen
                                                     Cisco Systems, Inc.

                                                               June 2004


                  The Proxy Field in PIM Join Messages


                    draft-wijnands-pim-proxy-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This document describes an extension to PIM which enables PIM to
   build multicast trees through an MPLS-enabled network, even if that
   network's IGP does not have a route to the source of the tree.


Wijnands, et al.                                                [Page 1]

Internet Draft      draft-wijnands-pim-proxy-00.txt            June 2004


Table of Contents

    1          Introduction
    2          Use of the Proxy Field in Join Messages
    2.1        Proxy and shared tree joins
    2.2        Proxy Hello Option
    2.3        The Vector Proxy
    2.3.1      Inserting a Vector Proxy in a Join
    2.3.2      Processing a Received Vector Proxy
    2.3.3      Vector Proxy and Asserts
    2.4        Other Proxy Types
    2.4.1      Vector plus MDT-SAFI
    2.4.2      Vector Stack
    2.5        Conflicting Proxies
    2.6        Proxy Convergence
    2.7        Multiple Proxies
    3          PIM Join packet format
    3.1        PIM Proxy Hello option
    3.2        Vector Proxy TLV
    3.3        MDT-SAFI Proxy TLV
    3.4        Vector Stack Proxy TLV
    4          Intellectual Property Statement
    5          Acknowledgments
    6          Full Copyright Statement
    7          Normative References
    8          Informational References
    9          Authors' Addresses


1. Introduction

   It is sometimes convenient to distinguish the routers of a particular
   network into two categories: "edge routers" and "core routers".  The
   edge routers attach directly to users or to other networks, but the
   core routers attach only to other routers of the same network.  If
   the network is MPLS-enabled, then any unicast packet which needs to
   travel outside the network can be "tunneled" via MPLS from one edge
   router to another.  To handle a unicast packet which must travel
   outside the network, an edge router needs to know which of the other
   edge routers is the best exit point from the network for that
   packet's destination IP address.  The core routers, however, do not
   need to have any knowledge of routes which lead outside the network;
   as they handle only tunneled packets, they only need to know how to
   reach the edge routers and the other core routers.


   Consider, for example, the case where the network is an Autonomous
   System (AS) and the edge routers are EBGP speakers.  The core routers
   may then be said to constitute a "BGP-free core".  The edge routers
   must distribute BGP routes to each other, but not to the core
   routers.

   As another example, consider the case of an inter-AS MPLS/BGP IP VPN,
   as discussed in section 10 of [RFC2547bis].  Traffic may need to flow
   from a Provider Edge (PE) router in one AS to a PE router in another,
   but the core routers in the first AS are NOT required to have a
   route to the PE in the second AS.

   However, when multicast packets are considered, the strategy of
   keeping the core routers free of "external" routes is more
   problematic.  When using PIM [PIMv2] to create a multicast
   distribution tree for a particular multicast group, one wants the
   core routers to be full participants in the PIM protocol, so that
   multicasting can be done efficiently in the core.  This means that
   the core routers must be able to correctly process PIM Join messages
   for the group, which in turn means that the core routers must be
   able to send the Join messages towards the root of the distribution
   tree.
   If the root of the tree lies outside the network's borders (e.g., is
   in a different AS), and the core routers do not maintain routes to
   external destinations, then the PIM Join messages cannot be
   processed, and the multicast distribution tree cannot be created.

   In order to allow PIM to work properly in an environment where the
   core routers do not maintain external routes, a PIM extension is
   needed.  When an edge router sends a PIM Join message into the core,
   it must include in that message a "Vector" which specifies the IP
   address of the next edge router along the path to the root of the
   multicast distribution tree.  The core routers can then process the
   Join message by sending it towards the specified edge router (i.e.,
   toward the Vector).  In effect, the Vector serves as a proxy, within
   a particular network, for the root of the tree.

   This document defines a new field in the PIM Join message, called the
   "Proxy" field.  A Proxy field can consist of a single Vector (e.g.,
   IPv4 address) or a stack of Vectors (creating a form of source
   route).  It can also consist of a Vector followed by an MDT-SAFI
   address [MDT-SAFI]; this is useful in supporting L3VPN multicast
   [VPN-MCAST].


2. Use of the Proxy Field in Join Messages

   Before we can start forwarding multicast packets, we need to build a
   forwarding tree by sending PIM Joins hop by hop.  Each router in the
   path creates forwarding state and propagates the Join towards the
   root of the forwarding tree.  The building of this tree is
   receiver-driven.  See Figure 1.

               ------------------- BGP -------------------
               |                                         |
    [S]---( Edge 1)--(Core 1)---( Core )--(Core 2)---( Edge 2 )---[R]

                              <--- (S,G) Join

                                   Figure 1.

   In this example, the two edge routers are BGP speakers.  The core
   routers are not BGP speakers and do not have any BGP-distributed
   routes.  The route to S is a BGP-distributed route, and hence is
   known to the edge routers but not to the core routers.

   The Edge 2 router determines the interface leading to S, and sends a
   PIM Join to the upstream router. In this example, though, the
   upstream router is a core router, with no route to S.  Without the
   PIM extensions specified in this document, the core router cannot
   determine where to send the Join, so the tree cannot be constructed.

   To allow the core router to participate in the construction of the
   tree, the Edge 2 router will include a Proxy field in the PIM Join.
   In this example, the Proxy field will contain the IP address of Edge
   1.  Edge 2 then forwards the PIM Join towards Edge 1.  The
   intermediate core routers do their RPF check on the Proxy (the IP
   address of Edge 1) rather than on the Source, which allows the tree
   to be constructed.
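
   As an illustration only (it is not part of the protocol
   specification), the choice of RPF lookup target can be sketched as
   follows.  The function names, the dictionary used as a routing
   table, and all addresses are hypothetical.

```python
import ipaddress
from typing import Dict, Optional


def rpf_target(source: str, vector: Optional[str]) -> str:
    """Return the address on which the RPF lookup is performed."""
    # With a Proxy present, the lookup is done on the Vector, not the Source.
    return vector if vector is not None else source


def lookup_rpf(routes: Dict[str, str], target: str) -> Optional[str]:
    """Longest-prefix match of target in routes {prefix: RPF neighbor}."""
    addr = ipaddress.ip_address(target)
    best = None
    for prefix, neighbor in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, neighbor)
    return best[1] if best else None


# A core router holds IGP routes only; it has no route to the external source.
core_routes = {"10.0.0.1/32": "core1", "10.0.0.2/32": "edge2"}
source, edge1 = "192.0.2.10", "10.0.0.1"

without_proxy = lookup_rpf(core_routes, rpf_target(source, None))   # no route
with_proxy = lookup_rpf(core_routes, rpf_target(source, edge1))     # via core1
```

   In this sketch the lookup on the Source fails at the core router,
   while the lookup on the Vector resolves through the IGP.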


2.1. Proxy and shared tree joins

   In the example above we build a source tree to illustrate the proxy
   behavior.  The Proxy is, however, not restricted to source trees.
   The tree may also be constructed towards a Rendezvous Point (RP) IP
   address.  The RP IP address is used in the same way as the Source in
   the example above.  PIM Proxy procedures defined for sources are
   equally applicable to RPs unless otherwise noted.


2.2. Proxy Hello Option

   A new PIM source type has been defined to include the Proxy field.
   This source type is included in a normal PIM Join. Each router on a
   connected network needs to be able to understand and parse the Join
   message. Therefore we include a new PIM hello option to advertise our
   capability to parse and process the new source type. We can only send
   a PIM Join which includes a Proxy if ALL routers on the network
   support the new option. (Even a router which is not the upstream
   neighbor must be able to parse the packet in order to do Join
   suppression or overriding.) Option value TBD.
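
   The rule above can be sketched as a simple check over the PIM
   neighbors on a link.  This is a minimal illustration; the Neighbor
   class and its proxy_capable flag are hypothetical names, and
   treating a link with no neighbors as "not allowed" is an assumption
   of the sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Neighbor:
    address: str
    proxy_capable: bool  # True if this neighbor's Hello carried the Proxy option


def may_send_proxy(neighbors_on_link: List[Neighbor]) -> bool:
    """A Join with a Proxy may be sent only if ALL neighbors support it."""
    # Assumption: with no neighbors there is nobody to send a Join to.
    return bool(neighbors_on_link) and all(
        n.proxy_capable for n in neighbors_on_link
    )


lan = [Neighbor("10.0.12.1", True), Neighbor("10.0.12.3", False)]
```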


2.3. The Vector Proxy

2.3.1. Inserting a Vector Proxy in a Join

   In the example of Figure 1, when the Edge 2 router looks up the route
   to the source of the multicast distribution tree, it will find a
   BGP-distributed route whose "BGP next-hop" is Edge 1.  Edge 2 then
   looks up the route to Edge 1 to find the interface and PIM adjacency
   that form the next hop towards the source, namely Core 2.

   When Edge 2 sends a PIM Join to Core 2, it includes a Vector Proxy
   specifying the address of Edge 1.  Core 2, and subsequent core
   routers, will forward the Join along the Vector (i.e., towards Edge
   1) instead of trying to forward it towards S.

   Whether a Proxy is actually needed depends on whether the Core
   routers have a route to the source of the multicast tree.  How the
   Edge router knows whether or not this is the case (and thus how the
   Edge router determines whether or not to insert a Proxy field) is
   outside the scope of this document.
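
   The two lookups performed by the edge router can be sketched as
   follows.  The Join tuple, the exact-match dictionaries standing in
   for the BGP and IGP tables, and the use_proxy flag (which stands for
   the out-of-scope decision described above) are all hypothetical.

```python
from typing import Dict, NamedTuple, Optional


class Join(NamedTuple):
    source: str
    group: str
    upstream: str            # PIM neighbor to which the Join is sent
    vector: Optional[str]    # Vector Proxy carried in the Join, if any


def build_join(source: str, group: str,
               bgp_next_hops: Dict[str, str],
               igp_neighbors: Dict[str, str],
               use_proxy: bool = True) -> Join:
    # Step 1: the BGP route to the source gives the exit router (Edge 1).
    exit_router = bgp_next_hops[source]
    # Step 2: the IGP route to the exit router gives the PIM adjacency (Core 2).
    upstream = igp_neighbors[exit_router]
    # Step 3: the exit router becomes the Vector, if a Proxy is to be used.
    return Join(source, group, upstream, exit_router if use_proxy else None)


join = build_join("192.0.2.10", "232.1.1.1",
                  bgp_next_hops={"192.0.2.10": "10.0.0.1"},
                  igp_neighbors={"10.0.0.1": "10.0.12.2"})
```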


2.3.2. Processing a Received Vector Proxy

   When processing a received PIM Join which contains a Vector Proxy, a
   router must first check to see if the Vector IP address is one of its
   own IP addresses.  If so, the Vector Proxy is discarded, and not
   passed further upstream. Otherwise, the Vector Proxy is used to find
   the route to the source, and is passed along when a PIM Join is sent
   upstream.  Note that a router which receives a Vector Proxy must use
   it, even if that router happens to have a route to the source.

   A router which discards a Vector Proxy may of course insert a new
   Vector Proxy.  This would typically happen if a PIM Join needed to
   pass through a sequence of Edge routers, each pair of which is
   separated by a core which does not have external routes.

   In the absence of periodic refreshment, Vectors expire along with the
   corresponding (S,G) state.
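
   A minimal sketch of the receive-side rule follows.  The function
   name and return convention are hypothetical; inserting a new Vector
   after discarding the old one is not shown.

```python
from typing import Optional, Set, Tuple


def process_vector(vector: str, source: str,
                   own_addresses: Set[str]) -> Tuple[str, Optional[str]]:
    """Return (RPF lookup target, Vector to pass upstream or None)."""
    if vector in own_addresses:
        # The Vector names this router: discard it and resolve the source.
        return source, None
    # Otherwise the Vector must be used, even if a route to the source exists.
    return vector, vector


at_core = process_vector("10.0.0.1", "192.0.2.10", {"10.0.0.7"})
at_edge1 = process_vector("10.0.0.1", "192.0.2.10", {"10.0.0.1"})
```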


2.3.3. Vector Proxy and Asserts

   In a PIM Assert message we include the routing protocol's "metric" to
   the source of the tree. This information is used in the selection of
   the assert winner. If a PIM Join is being sent towards a Vector,
   rather than towards the source, the Assert message must have the
   metric to the Vector instead of the metric to the source.  The Assert
   message however does not have a Proxy field and does not mention the
   Vector.

   A router may change its upstream neighbor on a particular multicast
   tree as the result of receiving Assert messages.  However a Vector
   Proxy should not be sent in a PIM Join to an upstream neighbor which
   is chosen as the result of processing the Assert messages.
   Reachability of the Vector is only guaranteed by the router that
   advertises reachability to the Vector in its IGP. If the assert
   winner upstream is not our real preferred next-hop, we can't be sure
   this router knows the path to the Vector.
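
   Both rules of this section can be sketched as follows.  The helper
   names and the metric dictionary are hypothetical, and the sketch
   ignores the metric preference and tie-breaking that a real Assert
   also involves.

```python
from typing import Dict, Optional


def assert_metric(metric_to: Dict[str, int], source: str,
                  vector: Optional[str] = None) -> int:
    """Metric to place in an Assert for a tree rooted at source."""
    # If the Join follows a Vector, advertise the metric to the Vector.
    target = vector if vector is not None else source
    return metric_to[target]


def vector_for_upstream(vector: Optional[str], upstream: str,
                        rpf_neighbor: str) -> Optional[str]:
    """Vector to include in a Join sent to the given upstream neighbor."""
    # An upstream chosen through Assert processing is not the RPF neighbor
    # and may not know the path to the Vector, so the Vector is omitted.
    return vector if upstream == rpf_neighbor else None


metrics = {"10.0.0.1": 20, "192.0.2.10": 100}
```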


2.4. Other Proxy Types

2.4.1. Vector plus MDT-SAFI

   This Proxy type is used in support of the multicast VPN service
   [VPN-MCAST].  Here the source of the multicast distribution tree is
   not an IPv4 address, but an MDT-SAFI [MDT-SAFI] address.  Each edge
   router along the path to the source is expected to have a table of
   BGP-distributed MDT-SAFI addresses, but the core routers are not
   expected to have any MDT-SAFI addresses or to have routes to Edge
   routers that are in other networks.  An Edge router creating a PIM
   Join would insert a "Vector plus MDT-SAFI" Proxy.  The Vector
   identifies the next Edge router on the path to the source, and the
   MDT-SAFI identifies the source of the tree.  When the Join reaches
   the Edge router identified by the Vector, that Edge router uses the
   MDT-SAFI to look up the route to the source in its BGP MDT-SAFI
   table.  When the Join is sent upstream, it continues to carry the
   "Vector plus MDT-SAFI" Proxy, but with a new Vector value identifying
   the next Edge router in the path.

   Eventually, the Join must reach a router that is identified by both
   the Vector part and the MDT-SAFI part of the Proxy.  When this
   happens, the Proxy is discarded and further processing of the Join
   continues.  (Typically this will be at the source of the tree.)
   continues.  (Typically this will be at the source of the tree.)

   Per [MDT-SAFI], the MDT-SAFI address consists of an RD, a multicast
   group address, and the IP address of the source.  In the Proxy field
   we encode only the RD, as the other two components of the MDT-SAFI
   address can be gleaned from other parts of the Join.
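
   The handling of a "Vector plus MDT-SAFI" Proxy can be sketched as
   follows.  This is an illustration under stated assumptions: the
   function name is hypothetical, the BGP MDT-SAFI table is modelled as
   a dictionary keyed by (RD, source, group) that yields the next Edge
   router, and "identified by the MDT-SAFI part" is taken to mean that
   the source address of the tree belongs to the router.

```python
from typing import Dict, Optional, Set, Tuple

MdtSafiKey = Tuple[str, str, str]  # (RD, source, group); hypothetical key


def process_vector_plus_mdt_safi(
        vector: str, rd: str, source: str, group: str,
        own_addresses: Set[str],
        mdt_safi_next_hops: Dict[MdtSafiKey, str]) -> Optional[Tuple[str, str]]:
    """Return the (Vector, RD) Proxy to send upstream, or None if discarded."""
    if vector not in own_addresses:
        # Not named by the Vector (e.g., a core router): pass it unchanged.
        return vector, rd
    if source in own_addresses:
        # Named by both the Vector and the MDT-SAFI address: discard.
        return None
    # Edge router named by the Vector: look up the next Edge router.
    return mdt_safi_next_hops[(rd, source, group)], rd


table = {("65000:1", "192.0.2.10", "239.1.1.1"): "10.1.0.1"}
```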


2.4.2. Vector Stack

   A Vector Stack Proxy is a stack of Vectors used to build a forwarding
   tree that follows a set of routers identified by the Vectors.  The
   Vectors in the stack define the path.  How the Vectors are selected
   is outside the scope of this document.  Using the Vector Stack we can
   build a traffic-engineered path per (S,G).  The rules that apply to a
   single Vector Proxy also apply to the first Vector on the stack.
   However, when the router identified by the first Vector is reached,
   it pops the stack before passing the Proxy upstream.

   We could get the same functionality by including multiple single
   Vectors in the PIM Join; however, we prefer to define a separate TLV
   for this.  Doing so saves the overhead of a TLV type and length for
   each Vector, and it also keeps the Proxy count in the PIM Join
   message small, since each Vector does not have to be counted as a
   separate Proxy.  This way a maximum of 31 Proxies seems sufficient.
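
   The pop operation can be sketched as follows.  The function name and
   return convention are hypothetical, and the sketch assumes that the
   router which pops the stack then uses the new first Vector as its
   own RPF lookup target.

```python
from typing import List, Optional, Set, Tuple


def process_vector_stack(stack: List[str],
                         own_addresses: Set[str]
                         ) -> Tuple[Optional[str], List[str]]:
    """Return (RPF lookup target, stack to pass upstream).

    A target of None means the stack is exhausted and the router
    resolves the source of the tree directly.
    """
    remaining = list(stack)
    if remaining and remaining[0] in own_addresses:
        remaining.pop(0)  # this router is the first Vector: pop it
    if not remaining:
        return None, []
    return remaining[0], remaining


path = ["10.0.0.1", "10.0.0.2"]
```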


2.5. Conflicting Proxies

   It's possible that a router receives conflicting proxy information
   from different downstream routers. See Figure 2.


           ( Edge A1 )          ( Edge B1 )---- [R1]
          /           \        /
         /             \      /
      [S]              ( Core )
        \              /      \
         \            /        \
           ( Edge A2 )          ( Edge B2 )---- [R2]


                           Figure 2

   There are 2 receivers for the same group connected to Edge B1 and B2.
   Suppose that edge router B1 prefers A1 as the exit point and B2
   prefers A2 as exit point to reach the source S. If both Edge B1 and
   B2 send a Join including a Proxy to prefer their exit router in the
   network and they cross the same core router, the core router will get
   conflicting proxy information for the source.  If this happens we use
   the Proxy from the PIM adjacency with the numerically smallest IP
   address.  The Proxies from the other sending routers may be kept, so
   that if the best Proxy gets pruned or expires we are able to
   immediately use the second best Proxy and converge quickly without
   waiting for the next periodic update.
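
   A minimal sketch of this tie-breaking rule for a single (S,G) entry
   follows.  The ProxyTable class and its method names are
   hypothetical.  Addresses are compared numerically rather than as
   strings, so that 10.0.0.9 is smaller than 10.0.0.10.

```python
import ipaddress
from typing import Dict, Optional


class ProxyTable:
    """Proxies learned for one (S,G) entry, keyed by downstream neighbor."""

    def __init__(self) -> None:
        self._by_neighbor: Dict[str, str] = {}

    def learn(self, neighbor: str, vector: str) -> None:
        self._by_neighbor[neighbor] = vector

    def remove(self, neighbor: str) -> None:
        # Called when the neighbor's Join is pruned or expires.
        self._by_neighbor.pop(neighbor, None)

    def active(self) -> Optional[str]:
        """Vector from the neighbor with the numerically smallest address."""
        if not self._by_neighbor:
            return None
        best = min(self._by_neighbor, key=ipaddress.ip_address)
        return self._by_neighbor[best]


table = ProxyTable()
table.learn("10.0.0.10", "A1")   # Join from Edge B1 prefers Edge A1
table.learn("10.0.0.9", "A2")    # Join from Edge B2 prefers Edge A2
```

   Keeping the losing entries in the table is what allows the second
   best Proxy to take over as soon as the best one is removed.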


2.6. Proxy Convergence

   A Proxy is included in a PIM Join message together with the source
   information. If the Proxy for this source is changed, we trigger a
   new PIM Join message to the upstream router.  This causes the new
   Proxy to be propagated. This new Proxy implicitly removes the old
   Proxy upstream. If processing the new Proxy results in a change in
   the distribution tree, a PIM Prune message may be sent.  This PIM
   Prune does not need to carry any Proxy; the sender of the Prune and
   the source and group information are enough to identify the entry.
   The proxy information is removed immediately and, if one is
   available, a new Proxy is chosen from the database.


2.7. Multiple Proxies

   A PIM Join can contain multiple Proxies. The Proxies are encoded as
   TLVs associated with a new PIM source type in the PIM message. When a
   PIM Join with multiple Proxies is received, the first Proxy is
   processed, and the action taken depends upon the Proxy type.  This
   may or may not result in the processing of the next Proxy.  The set
   of Proxies is treated as a stack, much as described in section 2.4.2.
   Proxies not processed are passed upstream unchanged.


3. PIM Join packet format

   There is no space in the default PIM source encoding to include a
   Proxy field. Therefore we introduce a new source encoding type. The
   Proxies are formatted as TLVs. The new encoded source address looks
   like this:

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Addr Family   | Encoding Type | TLV #   |S|W|R|  Mask Len     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Source Address                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type          | Length        | Value
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+.....
   | Type          | Length        | Value
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+.....
           .                    .                     .
           .                    .                     .


   TLV # gives the number of TLVs that are included with this
   source.  With 5 bits we can include a maximum of 31 TLVs.

   Type field of the TLV is 1 byte.

   Length field of the TLV is 1 byte.

   The other fields are the same as described in the [PIMv2] spec.

   The source TLV encoding type: TBD.
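
   The layout above can be sketched as follows for an IPv4 source.
   This is an illustration, not a reference encoder.  Because the
   encoding type is still TBD, the value 0xFF below is only a
   placeholder; the function names are hypothetical.  The address
   family value 1 is the IANA address family number for IPv4.

```python
import socket
import struct
from typing import List, Tuple

ADDR_FAMILY_IPV4 = 1
ENCODING_TYPE_PLACEHOLDER = 0xFF  # placeholder only; the real value is TBD


def pack_encoded_source(source: str, tlvs: List[Tuple[int, bytes]],
                        s: bool = True, w: bool = False, r: bool = False,
                        mask_len: int = 32,
                        encoding_type: int = ENCODING_TYPE_PLACEHOLDER) -> bytes:
    if len(tlvs) > 31:
        raise ValueError("TLV # is a 5-bit field: at most 31 TLVs")
    # Third byte: TLV count in the upper 5 bits, then the S, W and R bits.
    flags = (len(tlvs) << 3) | (int(s) << 2) | (int(w) << 1) | int(r)
    out = struct.pack("!BBBB4s", ADDR_FAMILY_IPV4, encoding_type, flags,
                      mask_len, socket.inet_aton(source))
    for tlv_type, value in tlvs:
        if len(value) > 255:
            raise ValueError("TLV Length is a 1-byte field")
        out += struct.pack("!BB", tlv_type, len(value)) + value
    return out


def unpack_encoded_source(data: bytes) -> Tuple[str, List[Tuple[int, bytes]]]:
    _family, _enc, flags, _mask, addr = struct.unpack_from("!BBBB4s", data)
    tlvs, offset = [], 8
    for _ in range(flags >> 3):
        tlv_type, length = struct.unpack_from("!BB", data, offset)
        value = data[offset + 2:offset + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV")
        tlvs.append((tlv_type, value))
        offset += 2 + length
    return socket.inet_ntoa(addr), tlvs


encoded = pack_encoded_source("192.0.2.10", [(0, socket.inet_aton("10.0.0.1"))])
```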


3.1. PIM Proxy Hello option

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |      OptionType = XX          |      OptionLength = 0         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Option type: TBD.
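
   A minimal sketch of building and detecting this option follows.
   Since the option type is TBD, it is passed in as a parameter, and
   the value 65001 used in the example is purely a placeholder.  The
   function names are hypothetical.  The scan assumes the usual PIM
   Hello layout of a 2-byte type and a 2-byte length followed by the
   option value.

```python
import struct

EXAMPLE_OPTION_TYPE = 65001  # placeholder only; the real value is TBD


def pack_proxy_hello_option(option_type: int) -> bytes:
    """Option with a 16-bit type, a 16-bit length of 0, and no value."""
    return struct.pack("!HH", option_type, 0)


def has_proxy_option(hello_options: bytes, option_type: int) -> bool:
    """Scan the options of a received Hello for the Proxy option."""
    offset = 0
    while offset + 4 <= len(hello_options):
        opt_type, opt_len = struct.unpack_from("!HH", hello_options, offset)
        if opt_type == option_type and opt_len == 0:
            return True
        offset += 4 + opt_len  # skip this option's header and value
    return False


holdtime = struct.pack("!HHH", 1, 2, 105)  # Holdtime option: type 1, length 2
options = holdtime + pack_proxy_hello_option(EXAMPLE_OPTION_TYPE)
```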


3.2. Vector Proxy TLV

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type          | Length        |         IP address
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-.......

   Type
   ----

   The Vector Proxy type is 0.

   Length
   ------

   Length in bytes is 4.

   Value
   -----

   IPv4 address.
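
   As an illustration, the Vector Proxy TLV can be encoded and decoded
   as follows.  The function names are hypothetical.

```python
import socket
import struct

VECTOR_PROXY_TYPE = 0
VECTOR_PROXY_LENGTH = 4  # one IPv4 address


def pack_vector_tlv(vector: str) -> bytes:
    return struct.pack("!BB4s", VECTOR_PROXY_TYPE, VECTOR_PROXY_LENGTH,
                       socket.inet_aton(vector))


def unpack_vector_tlv(data: bytes) -> str:
    if len(data) < 6:
        raise ValueError("truncated Vector Proxy TLV")
    tlv_type, length, addr = struct.unpack_from("!BB4s", data)
    if tlv_type != VECTOR_PROXY_TYPE or length != VECTOR_PROXY_LENGTH:
        raise ValueError("not a Vector Proxy TLV")
    return socket.inet_ntoa(addr)


tlv = pack_vector_tlv("10.0.0.1")
```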


3.3. MDT-SAFI Proxy TLV

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type          | Length        |      IP address
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |      RD
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-.......


   Type
   ----

   The MDT-SAFI Proxy type is 1.

   Length
   ------

   Length in bytes is 24

   Value
   -----

   IPv4 address and RD.


3.4. Vector Stack Proxy TLV

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type          | Length        | Depth         | Reserved      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Vector 1                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Vector n                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   Type
   ----

   The Vector Stack Proxy type is 2.

   Length
   ------

   Length is (2 + Depth * Vector size) bytes

   Value
   -----

   Depth is 1 byte, which allows for up to 255 Vectors.

   Reserved is 1 byte.

   Vector is an IPv4 address, 4 bytes.
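
   The Vector Stack Proxy TLV can be sketched as follows, using the
   Length formula given above.  The function names are hypothetical.
   One consequence visible in the sketch is that, as long as Length is
   a 1-byte field, 2 + Depth * 4 cannot exceed 255, so no more than 63
   IPv4 Vectors fit in a single TLV; the encoder below rejects deeper
   stacks on that basis.

```python
import socket
import struct
from typing import List

VECTOR_STACK_PROXY_TYPE = 2
VECTOR_SIZE = 4  # IPv4 address


def pack_vector_stack_tlv(vectors: List[str]) -> bytes:
    depth = len(vectors)
    length = 2 + depth * VECTOR_SIZE  # Depth + Reserved + the Vectors
    if length > 255:
        raise ValueError("stack does not fit in a 1-byte Length field")
    header = struct.pack("!BBBB", VECTOR_STACK_PROXY_TYPE, length, depth, 0)
    return header + b"".join(socket.inet_aton(v) for v in vectors)


def unpack_vector_stack_tlv(data: bytes) -> List[str]:
    tlv_type, length, depth, _reserved = struct.unpack_from("!BBBB", data)
    if tlv_type != VECTOR_STACK_PROXY_TYPE:
        raise ValueError("not a Vector Stack Proxy TLV")
    if length != 2 + depth * VECTOR_SIZE or len(data) < 2 + length:
        raise ValueError("inconsistent Length or Depth")
    return [socket.inet_ntoa(data[4 + i * VECTOR_SIZE:8 + i * VECTOR_SIZE])
            for i in range(depth)]


stack_tlv = pack_vector_stack_tlv(["10.0.0.1", "10.0.0.2"])
```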


4. Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


5. Acknowledgments

   The authors would like to thank Yakov Rekhter and Dino Farinacci for
   their initial ideas on this topic and Nidhi Bhaskar for her comments
   on the draft.


6. Full Copyright Statement

   Copyright (C) The Internet Society (2004).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78 and
   except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


7. Normative References

   [PIMv2] "Protocol Independent Multicast - Sparse Mode (PIM-SM)",
   Fenner, Handley, Holbrook, Kouvelas, December 2002,
   draft-ietf-pim-sm-v2-new-06.txt


8. Informational References

   [MDT-SAFI] "MDT SAFI", Nalawade and Sreekantiah, February 2004,
   draft-nalawade-idr-mdt-safi-00.txt

   [RFC2547bis] "BGP/MPLS IP VPNs", edited by Rosen and Rekhter,
   September 2003, draft-ietf-l3vpn-rfc2547bis-01.txt

   [VPN-MCAST] "Multicast in BGP/MPLS VPNs", Cai, Rosen, Wijnands,
   May 2004, draft-rosen-vpn-mcast-07.txt


9. Authors' Addresses

   IJsbrand Wijnands
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   E-mail: ice@cisco.com

   Arjen Boers
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   E-mail: aboers@cisco.com

   Eric Rosen
   Cisco Systems, Inc.
   1414 Massachusetts Avenue
   Boxborough, MA, 01719
   E-mail: erosen@cisco.com

