Internet Working Group                                 Ali Sajassi 
   Internet Draft                                         Samer Salam 
   Category: Standards Track                              Keyur Patel 
                                                                Cisco 
                                                                      
                                                                      
                                                                      
   Expires: September 23, 2010                         March 23, 2010 
                                                                         
    
                           Routed VPLS using BGP 
                   draft-sajassi-l2vpn-rvpls-bgp-00.txt 
    
   Status of this Memo 
    
   This Internet-Draft is submitted to IETF in full conformance with 
   the provisions of BCP 78 and BCP 79. 
     
   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups.  Note that 
   other groups may also distribute working documents as Internet-
   Drafts. 
    
   Internet-Drafts are draft documents valid for a maximum of six 
   months and may be updated, replaced, or obsoleted by other documents 
   at any time.  It is inappropriate to use Internet-Drafts as 
   reference material or to cite them other than as "work in progress." 
    
   The list of current Internet-Drafts can be accessed at 
   http://www.ietf.org/ietf/1id-abstracts.txt 
    
   The list of Internet-Draft Shadow Directories can be accessed at 
   http://www.ietf.org/shadow.html 
    
   This Internet-Draft will expire on September 23, 2010.
    
   Copyright Notice 
    
   Copyright (c) 2010 IETF Trust and the persons identified as the  
   document authors. All rights reserved. 
    
   This document is subject to BCP 78 and the IETF Trust's Legal 
   Provisions Relating to IETF Documents 
   (http://trustee.ietf.org/license-info) in effect on the date of 
   publication of this document.  Please review these documents 
   carefully, as they describe your rights and restrictions with 
   respect to this document.  Code Components extracted from this 
   document must include Simplified BSD License text as described in 
   Section 4.e of the Trust Legal Provisions and are provided without 
   warranty as described in the Simplified BSD License. 
    
    
     
   Sajassi, et. al.                                           [Page 1] 
    
    
   draft-sajassi-l2vpn-rvpls-00.txt  March 2010 
    
    
   Abstract 
    
   VPLS, as currently defined, has challenges pertaining to the areas 
   of redundancy and multicast optimization. In particular, multi-
   homing with all-active forwarding cannot be supported, and there is
   no easy way to leverage MP2MP MDTs to optimize the delivery of
   multi-destination frames. This document defines an evolution of the
   current VPLS solution, referred to as Routed VPLS (R-VPLS), to 
   address these shortcomings. 
    
    
   Conventions 
    
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 
   document are to be interpreted as described in RFC 2119.
    
    
   Table of Contents 
    
   1. Introduction.................................................... 3 
   2. Terminology..................................................... 3 
   3. Requirements.................................................... 4 
   3.1. All-Active Multi-homing....................................... 4 
   3.1.1. Flow-based Load Balancing................................... 4 
   3.1.2. Flow-based Multi-pathing.................................... 4 
   3.1.3. Geo-redundant PE Nodes...................................... 5 
   3.1.4. Optimal Traffic Forwarding.................................. 5 
   3.1.5. Flexible Redundancy Grouping Support........................ 5 
   3.1.6. Dual-homed Network.......................................... 6 
   3.2. Multicast Optimization with MP2MP MDT......................... 6 
   4. VPLS Issues..................................................... 6 
   4.1. Forwarding Loops.............................................. 7 
   4.2. Duplicate Frame Delivery...................................... 8 
   4.3. MAC Forwarding Table Instability.............................. 8 
   4.4. Identifying Source PE in MP2MP MDT............................ 8 
   5. Solution Overview: Routed VPLS (R-VPLS)......................... 9 
   6. R-VPLS Components............................................... 9 
   6.1. MAC Learning & Forwarding in Bridge Module................... 10 
   6.2. MAC Address Distribution in BGP.............................. 10 
   6.2.1. R-VPLS NLRI................................................ 11 
   6.2.2. L2VPN-MAC SAFI............................................. 12 
   6.2.3. BGP Route Targets.......................................... 12 
   6.3. Frame Forwarding over MPLS Core.............................. 12 
   6.3.1. Unicast.................................................... 12 
   6.3.2. Multicast/Broadcast........................................ 13
   6.4. Loop Avoidance and Duplicates Prevention..................... 13
   6.4.1. Filtering Based on Multi-homing ID......................... 14 
   6.4.2. Defining a Designated Forwarder............................ 14 
   6.5. LACP State Synchronization................................... 14 
   7. Security Considerations........................................ 15 
   8. IANA Considerations............................................ 15 
   9. Intellectual Property Considerations........................... 15 
   10. Normative References.......................................... 16 
   11. Informative References........................................ 16 
   12. Authors' Addresses............................................ 16 
    
 
 
 
    
   1.        Introduction 
    
   VPLS, as defined in [RFC4664][RFC4761][RFC4762], is a proven and 
   widely deployed technology. However, the existing solution has a 
   number of challenges when it comes to redundancy and multicast 
   optimization.  
    
   In the area of redundancy, current VPLS can only support multi-
   homing with an active/standby resiliency model, for example as
   described in [VPLS-BGP-MH]. Flexible multi-homing with all-active
   ACs cannot be supported without adding considerable complexity to
   the VPLS data-path.
    
   In the area of multicast optimization, [VPLS-MCAST] describes how 
   LSM MDTs can be used in conjunction with VPLS. However, this 
   solution is limited to P2MP MDTs, as there is no easy way to
   leverage MP2MP MDTs with VPLS. The lack of MP2MP support creates
   scalability issues for certain applications. 
    
   This document defines an evolution of the current VPLS solution, to 
   address the aforementioned shortcomings. The proposed solution is 
   referred to as Routed VPLS (R-VPLS). 
  
   Section 2 provides a summary of the terminology used. Section 3 
   discusses the requirements for all-active resiliency and multicast 
   optimization. Section 4 describes the issues associated with the
   current VPLS solution in addressing the requirements. Section 5 
   offers an overview of R-VPLS and then Section 6 goes into the 
   details of its components. 
    
   2.        Terminology 
    
   CE: Customer Edge 
   DHD: Dual-homed Device
   DHN: Dual-homed Network
   LACP: Link Aggregation Control Protocol 
   LSM: Label Switched Multicast 
   MDT: Multicast Delivery Tree 
   MP2MP: Multipoint to Multipoint 
   P2MP: Point to Multipoint 
   P2P: Point to Point 
   PE: Provider Edge 
   PoA: Point of Attachment 
   PW: Pseudowire 
   R-VPLS: Routed VPLS 
    
    
    
   3.        Requirements 
    
   This section describes the requirements for all-active multi-homing 
   and MP2MP MDT support. 
    
   3.1.          All-Active Multi-homing 
 
   3.1.1.            Flow-based Load Balancing 
    
   A customer network or a customer device can be multi-homed to a
   provider network using the IEEE link aggregation standard [802.1AX].
   In [802.1AX], the load-balancing algorithms by which a CE 
   distributes traffic over the Attachment Circuits connecting to the 
   PEs are quite flexible. The only requirement is for the algorithm to 
   ensure in-order frame delivery for a given traffic flow. In typical 
   implementations, these algorithms involve selecting an outbound link 
   within the bundle based on a hash function that identifies a flow 
   based on one or more of the following fields: 
     i) Layer 2: Source MAC Address, Destination MAC Address, VLAN
     ii) Layer 3: Source IP Address, Destination IP Address
     iii) Layer 4: UDP or TCP Source Port, Destination Port
     iv) Combinations of the above.
    
   A key point to note here is that [802.1AX] does not define a 
   standard load-balancing algorithm for Ethernet bundles, and as such 
   different implementations behave differently. As a matter of fact, a 
   bundle operates correctly even in the presence of asymmetric load-
   balancing over the links. This being the case, the first requirement 
   for active/active VPLS dual-homing is the ability to accommodate 
   flexible flow-based load-balancing from the CE node based on L2, L3 
   and/or L4 header fields. 
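
   As an illustration only, the following sketch shows one way a CE
   might map a flow to a bundle member link. The function name
   select_link, its parameters and the use of CRC32 are assumptions
   made for this example; [802.1AX] does not mandate any particular
   hash, and real implementations differ.

```python
import zlib


def select_link(num_links, src_mac, dst_mac, vlan=None,
                src_ip=None, dst_ip=None, src_port=None, dst_port=None):
    """Pick a bundle member link (0..num_links-1) for one frame."""
    # Build the flow key from whichever L2/L3/L4 fields are supplied.
    fields = (src_mac, dst_mac, vlan, src_ip, dst_ip, src_port, dst_port)
    key = "|".join("" if f is None else str(f) for f in fields)
    # CRC32 is only a stand-in hash chosen for this sketch.
    return zlib.crc32(key.encode("ascii")) % num_links
```

   Because the result depends only on the flow fields, all frames of a
   given flow take the same link, which preserves in-order delivery
   for that flow while different flows may land on different PEs.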
    
   3.1.2.            Flow-based Multi-pathing  
    
   [PWE3-FAT-PW] defines a mechanism that allows PE nodes to exploit 
   equal-cost multi-paths (ECMPs) in the MPLS core network by
   identifying traffic flows within a PW, and associating these flows
   with a Flow Label. The flows can be classified based on any 
   arbitrary combination of L2, L3 and/or L4 headers. Any active/active 
   VPLS dual-homing mechanism should seamlessly interoperate and 
   leverage the mechanisms defined in [PWE3-FAT-PW]. 
    
   3.1.3.            Geo-redundant PE Nodes 
    
   The PE nodes offering dual-homed connectivity to a CE or access 
   network may be situated in the same physical location (co-located), 
   or may be spread geographically (e.g. in different COs or POPs). The 
   latter is desirable when offering a geo-redundant solution that 
   ensures business continuity for critical applications in the case of 
   power outages, natural disasters, etc. An active/active VPLS dual-
   homing mechanism should support both co-located as well as geo-
   redundant PE placement. The latter scenario often means that 
   requiring a dedicated link between the PEs, for the operation of the 
   dual-homing mechanism, is not appealing from a cost standpoint.
   Furthermore, the IGP cost from remote PEs to the pair of PEs in the 
   dual-homed setup cannot be assumed to be the same when those latter 
   PEs are geo-redundant. 
    
   3.1.4.            Optimal Traffic Forwarding 
    
   In a typical network, and considering a designated pair of PEs, it 
   is common to find both single-homed as well as dual-homed CEs being 
   connected to those PEs. An active/active VPLS dual-homing solution 
   should support optimal forwarding of unicast traffic for all the 
   following scenarios: 
     i) single-homed CE to single-homed CE
     ii) single-homed CE to dual-homed CE
     iii) dual-homed CE to single-homed CE
     iv) dual-homed CE to dual-homed CE
      
   This is especially important in the case of geo-redundant PEs, where 
   having traffic forwarded from one PE to another within the same 
   redundancy group introduces additional latency, on top of the 
   inefficient use of the PE node's switching capacity. 
    
   3.1.5.            Flexible Redundancy Grouping Support  
    
   In order to simplify service provisioning and activation, the VPLS 
   dual-homing mechanism should allow arbitrary grouping of PE nodes 
   into redundancy groups. This is best explained with an example: 
   consider three PE nodes - PE1, PE2 and PE3. The dual-homing 
   mechanism must allow a given PE, say PE1, to be part of multiple 
   redundancy groups concurrently. For example, there can be a group 
   (PE1, PE2) and another group (PE1, PE3), where CEs could be dual-
   homed to any one of these two groups.  
    
     
   3.1.6.             Dual-homed Network 
    
   Supporting active/active dual-homing of an Ethernet network (a.k.a. 
   Dual-homed Network or DHN) to a pair of VPLS PEs poses a number of 
   challenges.  
    
   First, some resiliency mechanism needs to be in place between the 
   DHN and the PEs offering dual-homing, in order to prevent the 
   formation of L2 forwarding loops. Two options are possible here: 
   either the PEs participate in the control plane protocol of the DHN 
   (e.g. MST or ITU-T G.8032), or some auxiliary mechanism needs to run 
   between the CE nodes and the PEs. The latter must be complemented 
   with an interworking function, at the CE, between the auxiliary 
   mechanism and the DHN's native control protocol. However, unless the 
   PEs participate directly in the control protocol of the DHN, fast 
   control-plane re-convergence and fault recovery cannot be 
   guaranteed. Secondly, all existing Ethernet network resiliency 
   mechanisms operate at best at the granularity of VLANs. Hence, any 
   load-balancing would be limited to L2 flows only. Depending on the 
   applications at hand, this coarse flow granularity may not have 
   enough entropy to provide proper link/node utilization distribution 
   within the provider's network. Thirdly, an open issue remains with 
   the handling of DHN partitioning: the PEs need to reliably detect 
   the situation where the DHN has been segmented and each PE needs to 
   handle inbound/outbound traffic for only those customers (or hosts) 
   connected to the local partition. 
    
   3.2.          Multicast Optimization with MP2MP MDT 
    
   In certain applications, multiple multicast sources may exist for a 
   given VPLS instance, and these sources are dispersed over the 
   various PEs. For these applications, relying on P2MP MDTs for VPLS 
   is neither efficient nor scalable. In the worst case, a selective 
   MDT rooted on every PE may be required, thereby leading to
   multiplicative growth in the amount of state that needs to be
   maintained in the MPLS core: the state required is O(N*V*M), where N 
   is the average number of PEs per VPLS instance, V is the number of 
   VPLS instances in the network and M is the average number of 
   multicast groups per instance. By using MP2MP MDTs, it is possible 
   to scale better by eliminating the number of PEs from the equation. 
   Thus, the scalability of multicast becomes no longer a function of 
   the number of sites. 
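
   The scaling argument above can be made concrete with a small
   calculation. The function names and the input figures below are
   purely illustrative assumptions; only the O(N*V*M) expression and
   the removal of N for MP2MP MDTs come from the text.

```python
def p2mp_state(n_pes, n_instances, n_groups):
    """Worst-case core state with a selective P2MP MDT rooted on every PE."""
    return n_pes * n_instances * n_groups


def mp2mp_state(n_instances, n_groups):
    """Core state when MP2MP MDTs remove the per-PE factor."""
    return n_instances * n_groups


# Illustrative figures: N = 10 PEs per instance, V = 1000 instances,
# M = 20 multicast groups per instance.
p2mp = p2mp_state(10, 1000, 20)    # 200000 entries
mp2mp = mp2mp_state(1000, 20)      # 20000 entries
```

   With these example inputs the MP2MP case needs N times less state,
   and that ratio grows with the number of PEs per instance.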
    
    
   4.        VPLS Issues 
    
   This section describes issues associated with the current VPLS
   solution in meeting the above requirements. The current solution for
   VPLS, as defined in [RFC4761] and [RFC4762], relies on establishing a
   full-mesh of pseudowires among participating PEs, and data-plane
   learning for the purpose of building the MAC forwarding tables. This
   learning is performed on traffic received over both the attachment
   circuits and the pseudowires.

   Supporting an all-active multi-homing solution with current VPLS is
   subject to three fundamental problems: the formation of forwarding
   loops, duplicate delivery of flooded frames and MAC Forwarding Table
   instability. These problems will be described next in the context of
   the example network shown in figure 1 below.
    
    
    
    
    
    
    
    
    
    
                         +--------------+ 
                         |              | 
                         |              |    
       +----+ AC1 +----+ |              | +----+   +----+ 
       | CE1|-----|VPLS| |              | |VPLS|---| CE2| 
       +----+\    | PE1| |   IP/MPLS    | | PE3|   +----+ 
              \   +----+ |   Network    | +----+  
               \         |              | 
             AC2\ +----+ |              |                  
                 \|VPLS| |              |                  
                  | PE2| |              | 
                  +----+ |              | 
                         +--------------+       
                                      
                    Figure 1: VPLS Multi-homed Network 
    
   In the network of Figure 1, it is assumed that CE1 has both 
   attachment circuits AC1 & AC2 active towards PE1 and PE2, 
   respectively. This can be achieved, for example, by running a multi-
   chassis Ethernet link aggregation group from CE1 to the pair of PEs. 
    
   4.1.          Forwarding Loops 
    
   Consider the case where CE1 sends a unicast frame over AC1, destined 
   to CE2. If PE1 doesn't have a forwarding entry in its MAC address 
   table for CE2, it will flood the frame to all other PEs in the VPLS 
   instance (namely PE3 & PE2) using either ingress replication over 
   the full-mesh of pseudowires, or alternatively over an LSM tree 
   [VPLS-MCAST]. When PE2 receives the flooded traffic, and assuming it 
   doesn't know the destination port to CE2, it will flood the traffic 
   over the ACs for the VFI in question, including AC2. Hence, a 
   forwarding loop is created where CE1 receives its own traffic. 
    
    
     
   4.2.          Duplicate Frame Delivery 
    
   Examine the scenario where CE2 sends a multi-destination frame 
   (unknown unicast, broadcast or multicast) to PE3. PE3 will then 
   flood the frame to both PE1 & PE2, using either ingress replication 
   over the pseudowire full-mesh or an LSM tree. Both PE1 and PE2 will 
   receive copies of the frame, and both will forward the traffic on to 
   CE1. The net result is that CE1 receives duplicate frames.
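
   The loop of section 4.1 and the duplication described here can both
   be reproduced with a toy model of the Figure 1 topology. This is a
   sketch under simplifying assumptions: no PE has a forwarding entry
   for the destination, every PE in the instance receives one copy
   over the core, and a PE never sends a frame back out the AC it
   arrived on. The names ACS and flood are hypothetical.

```python
from collections import Counter

# Attachment circuits per PE in Figure 1: CE1 is dual-homed to PE1/PE2.
ACS = {"PE1": ["CE1"], "PE2": ["CE1"], "PE3": ["CE2"]}


def flood(source_ce, ingress_pe, acs=ACS):
    """Return how many copies of a flooded frame each CE receives."""
    copies = Counter()
    for pe, ces in acs.items():
        for ce in ces:
            # The ingress PE does not reflect the frame onto its ingress AC.
            if pe == ingress_pe and ce == source_ce:
                continue
            copies[ce] += 1
    return copies


dup = flood("CE2", "PE3")    # CE1 gets one copy from PE1 and one from PE2
loop = flood("CE1", "PE1")   # CE1 gets its own frame back via PE2
```

   In the first call CE1 is counted twice (duplicate delivery); in the
   second call CE1 appears once even though it is the sender (loop).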
    
   4.3.          MAC Forwarding Table Instability 
    
   Assume that both PE1 and PE2 have learnt that CE2 is reachable via 
   PE3. Now, CE1 starts sending unicast traffic to CE2. Given that CE1 
   has its ACs configured in an Ethernet link aggregation group, it 
   will forward traffic over both ACs using some load-balancing 
   technique as described in section 3.1 above. Both PE1 and PE2 will 
   forward frames from CE1 to PE3. Consequently, PE3 will see the same 
   MAC address for CE1 constantly moving between its pseudowire to PE1 
   and its pseudowire to PE2. The MAC table entry for CE1 will keep 
   flip-flopping indefinitely depending on traffic patterns. This MAC 
   table instability on PE3 may lead to frame mis-ordering for traffic 
   going from CE2 back to CE1. 
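
   The flip-flopping can be sketched by replaying data-plane learning
   at PE3 for frames that arrive alternately over the two pseudowires.
   The function name and the string labels are illustrative
   assumptions, not part of any VPLS specification.

```python
def count_mac_moves(arrivals):
    """Replay (source_mac, pseudowire) arrivals and count table moves."""
    table = {}
    moves = 0
    for mac, pw in arrivals:
        # A move occurs when a known MAC shows up on a different pseudowire.
        if mac in table and table[mac] != pw:
            moves += 1
        table[mac] = pw
    return moves, table


# CE1 load-balances, so PE3 sees its MAC alternate between the two PWs.
arrivals = [("CE1-MAC", "PW-to-PE1"), ("CE1-MAC", "PW-to-PE2")] * 3
moves, table = count_mac_moves(arrivals)   # moves == 5
```

   Six frames cause five moves of the same entry, so the pseudowire
   chosen for return traffic to CE1 depends on which frame arrived
   last.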
    
    
   Shifting focus towards the requirement to support MP2MP MDT, the 
   problem facing VPLS here is performing MAC learning over MP2MP MDT, 
   as discussed next. 
    
   4.4.          Identifying Source PE in MP2MP MDT 
    
   In the solution described in [VPLS-MCAST], a PE must perform MAC 
   learning on traffic received over an LSM MDT. To that end, the 
   receiving PE must be able to identify the source PE transmitting the 
   frame, in order to associate the MAC address with the p2p pseudowire 
   leading back to the source. With P2MP MDT, the MDT label uniquely 
   identifies the source PE. For inclusive trees, the MDT label also 
   identifies the VFI; whereas, for aggregate inclusive trees, a second 
   upstream-assigned label identifies the VFI.  
   However, when it comes to MP2MP MDT, the MDT label identifies the 
   root of the tree (which most likely is not the source PE), and the 
   second label (if present) identifies the VFI. There is no easy 
   solution to this problem since neither upstream nor downstream label 
   assignment can work among the VPLS PEs. 
    
    
   From the above, it should be clear that with the current VPLS 
   solution it is not possible to support all-active multi-homing or 
   MP2MP MDTs. In the sections that follow, we will explore a new 
   solution that meets the requirements identified in section 3 and 
   addresses the problems highlighted in this section. 
    
    
     
   5.        Solution Overview: Routed VPLS (R-VPLS) 
    
   This solution involves augmenting the current VPLS solution with 
   control-plane based MAC learning over the MPLS core. A PE continues 
   to perform data-plane based learning over its ACs, but performs no 
   such learning on traffic received from the MPLS core. MAC addresses 
   learnt by a PE over its ACs are advertised, using BGP, to all other 
   PEs in the same VPLS instance. Remote PEs receiving these BGP NLRIs 
   install forwarding entries, for the associated MAC addresses, in 
   their VFIs pointing to the PE sending the advertisements. 
   Multicast/broadcast traffic can be forwarded over the pseudowire 
   full-mesh per current VPLS, or over an LSM tree leveraging the model 
   described in [VPLS-MCAST]. Forwarding of unknown unicast traffic
   over the MPLS/IP core is optional, and the default mode is not to
   forward it; such traffic is still flooded over the local ACs per
   normal bridging operations.
    
   R-VPLS follows the same reference model for VPLS defined in 
   [RFC4664]. In particular, the PE model defined in Figure 3 of said 
   RFC applies, albeit with modifications to the functionality of the 
   Bridge and the VPLS Forwarder modules. The details of the R-VPLS 
   components are discussed in the next section. 
    
   Auto-discovery in R-VPLS works exactly as before. After the PEs
   belonging to a given VPLS instance discover each other, an inclusive
   MP2MP MDT is set up per [MPLS-MDT]. Optionally, a full-mesh of PWs
   per [RFC4761]/[RFC4762] or a set of P2MP MDTs per [VPLS-MCAST] can
   be set up. The purpose of the MP2MP MDT, the full-mesh of PWs, or
   the set of P2MP MDTs is to transport customer multicast/broadcast
   frames and, optionally, customer unknown unicast frames.
   No MAC address learning is needed for frames received over the full-
   mesh of PWs or the MDT(s). 
    
   The mapping of customer Ethernet frames to a VPLS instance 
   (qualified learning versus unqualified learning) is also performed 
   as before. Furthermore, the MAC learning over Attachment Circuits is 
   done in the data-plane just as with current VPLS solution. The setup 
   of any additional MDT per user multicast group or groups is also 
   performed per [VPLS-MCAST]. 
      
    
   6.        R-VPLS Components 
    
   Figure 2 below shows the model of a PE participating in R-VPLS. The 
   modules in this figure will be used to explain the components of R-
   VPLS. 
    
                           MPLS Core 
         +-------------------------------+ 
         |               +-----------+   |  R-VPLS PE 
         |     +---------|   VPLS    |   |
         |  +----+       | Forwarder |   |
         |  |BGP |       +-----------+   | 
         |  +----+             | LAN Emulation Interface  
         |     |         +-----------+   | 
         |     +---------|  Bridge   |   | 
         |               +-----------+   | 
         +-----------------|---|---|-----+ 
                          AC1 AC2  ACn 
    
                              CEs 
    
                         Figure 2: R-VPLS PE Model 
    
    
   6.1.          MAC Learning & Forwarding in Bridge Module 
    
   The Bridge module within an R-VPLS PE performs basic bridging 
   operations as before and is responsible for: 
    
     i) Learning the source MAC address on all frames received over the
        ACs, and dynamically building the bridge forwarding database.
     ii) Forwarding known unicast frames to local ACs or the LAN
        Emulation interface for remote destinations.
     iii) Flooding unknown unicast frames over the local ACs and
        optionally over the LAN Emulation interface.
     iv) Flooding multicast/broadcast frames to the local ACs and to the
        LAN Emulation interface.
     v) Informing the BGP module of all MAC addresses learnt over the
        local ACs. Also informing the BGP module when a MAC entry ages
        out, or is flushed due to a topology change.
     vi) Enforcing the filtering rules described in section 6.4.
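
   The responsibilities listed above can be sketched as a small class.
   This is a simplified model, not a definitive implementation: the
   names BridgeModule, install_remote and LAN_EMULATION are
   hypothetical, the hand-off to the BGP module is reduced to a list,
   and aging, flushing and the filtering rules of section 6.4 are not
   modeled. How the Bridge module learns that a MAC is remote is also
   an assumption of this sketch.

```python
LAN_EMULATION = "lan-emulation"


def is_multicast(mac):
    """True for multicast/broadcast MACs (group bit of first octet set)."""
    return bool(int(mac.split(":")[0], 16) & 1)


class BridgeModule:
    def __init__(self, acs, flood_unknown_to_core=False):
        self.acs = list(acs)
        self.flood_unknown_to_core = flood_unknown_to_core
        self.fdb = {}          # MAC -> local AC or LAN_EMULATION
        self.advertised = []   # stand-in for notifications to the BGP module

    def install_remote(self, mac):
        # Assumption: the caller marks a MAC as reachable over the core.
        self.fdb[mac] = LAN_EMULATION

    def receive_from_ac(self, ac, src_mac, dst_mac):
        """Return the list of interfaces the frame is sent out of."""
        # (i) and (v): learn the source MAC and report new entries.
        if self.fdb.get(src_mac) != ac:
            self.fdb[src_mac] = ac
            self.advertised.append(src_mac)
        others = [a for a in self.acs if a != ac]
        # (iv): multicast/broadcast goes to local ACs and the core.
        if is_multicast(dst_mac):
            return others + [LAN_EMULATION]
        # (ii): known unicast goes to exactly one interface.
        if dst_mac in self.fdb:
            out = self.fdb[dst_mac]
            return [] if out == ac else [out]
        # (iii): unknown unicast floods locally, and to the core only if enabled.
        return others + ([LAN_EMULATION] if self.flood_unknown_to_core else [])
```

   With the default setting, an unknown unicast frame received on AC1
   of a two-AC bridge is sent out AC2 only, matching the default of
   not flooding unknown unicast over the core.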
    
   6.2.          MAC Address Distribution in BGP 
    
   The BGP module within an R-VPLS PE is responsible for two main 
   functions: 
    
   First, advertising all MAC addresses learnt over the local ACs (by 
   the Bridge module) to all remote PEs participating in the VPLS 
   instance in question. This is done using a new L2VPN NLRI, to be 
   defined. The BGP module should withdraw the advertised NLRIs for MAC 
   addresses as they age out, or when the bridge table is flushed due 
   to a topology change. Since no MAC address learning is performed for 
   traffic received from the MPLS core, these BGP NLRI advertisements 
   are used to build the forwarding entries for remote MAC addresses 
   reachable over the MPLS network. 

   This brings the discussion to the second function of the BGP module,
   namely: programming entries in the MAC forwarding table (in the VPLS
   Forwarder module) using the information in the received BGP NLRIs.
   These entries will be used for forwarding traffic over the MPLS core 
   to remotely reachable MAC addresses. Of course, the BGP module must 
   remove the forwarding entries corresponding to withdrawn NLRIs. Note 
   that these entries are not subject to timed aging (as they follow a 
   control-plane learning paradigm rather than data-plane learning). 
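
   A minimal sketch of the second function follows: remote entries are
   installed on advertisement and removed only on withdrawal, with no
   aging timer. The class and method names are hypothetical, and
   keeping one label per advertising PE for the same MAC is an
   assumption of this sketch rather than a stated requirement.

```python
class RemoteMacTable:
    """Remote MAC entries driven purely by BGP advertise/withdraw events."""

    def __init__(self):
        self.entries = {}   # MAC -> {advertising PE: MPLS label}

    def on_advertise(self, pe, label, mac):
        self.entries.setdefault(mac, {})[pe] = label

    def on_withdraw(self, pe, mac):
        paths = self.entries.get(mac, {})
        paths.pop(pe, None)
        # Drop the MAC once no PE advertises it any more.
        if not paths:
            self.entries.pop(mac, None)

    def lookup(self, mac):
        return dict(self.entries.get(mac, {}))
```

   An entry therefore persists for as long as at least one PE keeps
   advertising the MAC, regardless of how much time has passed since
   the last frame.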
    
   BGP extensions are described below.
    
    
    
    
   6.2.1.            R-VPLS NLRI 
    
   A new BGP NLRI, called R-VPLS NLRI, is defined in this document as
   follows:
    
    
    
           +--------------------------------+ 
           |         RD (8 octets)          | 
           +--------------------------------+ 
           |    MPLS Label (4 octets)       | 
           +--------------------------------+ 
           |    MAC address (6 octets)      | 
           +--------------------------------+ 
    
                       Figure 1: R-VPLS NLRI Format 
    
   RD: Route Distinguisher encoded as described in [RFC4364] 
    
   MPLS Label: This is a downstream-assigned MPLS label that identifies
   the VPLS instance on the downstream PE (this label can be considered
   analogous to the L3VPN label associated with a given VRF).
    
    
   MAC: This is the customer source MAC learned by the PE and being 
   advertised via BGP.  
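
   The 18-octet layout shown above (8 + 4 + 6) can be packed and
   parsed as in the following sketch. Several points are assumptions
   made for illustration: the function names, the absence of any
   length prefix, the use of a type 0 RD per [RFC4364], and the
   treatment of the 4-octet MPLS Label field as an opaque unsigned
   value, since the placement of the label within those octets is not
   specified here.

```python
import struct


def encode_rd_type0(asn, assigned):
    """Type 0 RD: 2-octet type, 2-octet AS number, 4-octet assigned number."""
    return struct.pack("!HHI", 0, asn, assigned)


def encode_rvpls_nlri(rd, label_field, mac):
    """Concatenate RD (8), label field (4) and MAC address (6)."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if len(rd) != 8 or len(mac_bytes) != 6:
        raise ValueError("RD must be 8 octets and MAC must be 6 octets")
    return rd + struct.pack("!I", label_field) + mac_bytes


def decode_rvpls_nlri(data):
    """Split an 18-octet NLRI back into (rd, label_field, mac)."""
    if len(data) != 18:
        raise ValueError("expected 18 octets")
    (label_field,) = struct.unpack("!I", data[8:12])
    mac = ":".join(f"{b:02x}" for b in data[12:18])
    return data[:8], label_field, mac
```

   Encoding followed by decoding returns the original three fields,
   which gives a quick check that the field boundaries agree with the
   figure.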
    
    
   In order for two BGP speakers to exchange R-VPLS NLRI, they must use 
   BGP Capabilities Advertisement to ensure that they both are capable 
   of properly processing such NLRI. This is done as specified in 
   [RFC4760], by using capability code 1 (multiprotocol BGP) with an
   AFI of 25 and the L2VPN-MAC SAFI (section 6.2.2).
    
    
    
     
   6.2.2.            L2VPN-MAC SAFI 
    
   The R-VPLS NLRI is carried in BGP using BGP Multiprotocol Extensions 
   [RFC4760] with an AFI of 25 (L2VPN AFI), and a new SAFI known as BGP 
   L2VPN-MAC SAFI pending IANA assignment.  The NLRI field in the 
   MP_REACH_NLRI/MP_UNREACH_NLRI attribute contains the R-VPLS NLRI 
   encoded as specified above.
    
    
   6.2.3.            BGP Route Targets 
    
    
   Each BGP R-VPLS NLRI will have one or more route-target extended
   communities to associate an R-VPLS NLRI with a given VSI. These
   route-targets control distribution of the R-VPLS NLRI and thereby 
   will control the formation of the overlay topology of the network 
   that constitutes a particular VPN. 
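
   The association between an NLRI and a VSI can be sketched as a set
   intersection between the route targets carried by the NLRI and the
   route targets each VSI imports. The function name, the VSI names
   and the "ASN:number" strings are hypothetical; the notion of a
   per-VSI import list is an assumption borrowed from common
   route-target usage.

```python
def importing_vsis(route_targets, vsi_import_rts):
    """Return the VSIs whose import route targets match the NLRI's."""
    carried = set(route_targets)
    return sorted(vsi for vsi, rts in vsi_import_rts.items() if rts & carried)


vsis = {"VSI-A": {"65000:1"}, "VSI-B": {"65000:2"}}
matched = importing_vsis(["65000:1"], vsis)   # only VSI-A imports it
```

   An NLRI carrying several route targets may be imported into more
   than one VSI, and one carrying none that match is not imported
   anywhere.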
    
 
    
   6.3.          Frame Forwarding over MPLS Core 
    
   The VPLS Forwarder module is responsible for handling frame 
   transmission and reception over the MPLS core. The processing of the 
   frame differs depending on whether the destination is a unicast or 
   multicast/broadcast address. The two cases are discussed next. 
    
   6.3.1.            Unicast 
    
   For known unicast traffic, the VPLS Forwarder sends frames into the 
   MPLS core using the forwarding information received by BGP from 
   remote PEs. The frames are tagged with an LSP tunnel label and a 
   pseudowire label as with current VPLS. The point of variation from 
   current VPLS is in how the pseudowire label is determined and used. 
   In current VPLS, the pseudowire label serves a dual purpose: (1) to
   identify the source PE for data-plane learning, and (2) to identify 
   the VPLS instance (and hence VFI). For R-VPLS, since the MAC 
   learning is done in the control plane, there's no need for the 
   pseudowire label to identify the source PE. Hence, it is possible to 
   simplify the operation by using mp2p pseudowires, where a given PE 
   advertises the same downstream PW label, for a given VPLS instance, 
   to all peer PEs. This PW label can be advertised in the new L2VPN 
   MAC NLRIs. 
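
   The resulting lookup for a known unicast frame can be sketched as
   follows. The function and table names are hypothetical: mac_routes
   stands for the entries learnt via BGP (egress PE and its PW label)
   and tunnel_labels for the LSP tunnel label toward each PE.

```python
def build_label_stack(dst_mac, mac_routes, tunnel_labels):
    """Return [tunnel label, PW label] for a known MAC, else None."""
    route = mac_routes.get(dst_mac)
    if route is None:
        # Unknown unicast: not forwarded over the core in this sketch.
        return None
    egress_pe, pw_label = route
    # Outer label reaches the egress PE; inner label selects the instance.
    return [tunnel_labels[egress_pe], pw_label]


routes = {"00:00:5e:00:53:02": ("PE3", 3001)}
tunnels = {"PE3": 17}
stack = build_label_stack("00:00:5e:00:53:02", routes, tunnels)
```

   Because PE3 advertises the same PW label to every peer, any ingress
   PE would use the same inner label for this instance; only the
   tunnel label differs per ingress PE.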
    
   For unknown unicast traffic, an R-VPLS PE can optionally forward
   these frames over the MPLS core; however, the default is not to
   forward. If these frames are to be forwarded, then the same set of
   options used for forwarding multicast/broadcast frames (as described
   in the next section) is also applicable here.
    
    
    
     
   6.3.2.            Multicast/Broadcast 
    
   For delivery of multi-destination frames (multicast and broadcast),
   R-VPLS provides a number of options:
    
   Option 1: the VPLS Forwarder can perform ingress replication over a 
   full-mesh of p2p pseudowires, per current VPLS.  
    
   Option 2: the VPLS Forwarder can use p2mp MDT per the procedures 
   defined in [VPLS-MCAST]. 
    
   Option 3: the VPLS Forwarder can use mp2mp MDT per the procedures
   described in section 6.4. This option is the default mode.
    
   6.4.          Loop Avoidance and Duplicates Prevention 
    
   In the case where a set of VPLS PEs offer flexible multi-homing for 
   a number of CEs, special considerations are required to prevent the 
   creation of forwarding loops and delivery of duplicate frames when 
   forwarding multi-destination frames. 
    
   Consider the example network shown in Figure 3 below. In this
   network, it is assumed that the ACs from all CEs to their
   corresponding PEs are active and forwarding, i.e., an all-active
   redundancy model.
     
                   +-----+         
    +--------------+     | 
    |  +-----------+ PE1 | 
    |  |      +----+     | 
    |  | CE1 /     +-----+ 
    |  |     \ 
    |  CE2    \    +-----+ 
    |    \     +---+     | 
    |     +--------+     |    MPLS Core 
    |        +-----+ PE2 | 
    |       /      |     | 
    +---- CE3      +-----+ 
            \       
             \     +-----+ 
              +----+     | 
                   | PE3 | 
                   |     | 
                   +-----+ 
    
                 Figure 3: VPLS with Flexible Multi-homing 
    
   Take, for instance, the scenario where CE1 transmits a broadcast
   frame toward PE1. PE1 will attempt to flood the frame over all its
   local ACs and to all remote PEs (PE2 and PE3) in the same VPLS
   instance. The R-VPLS solution ensures that these broadcast frames do
   not loop back to CE1 by way of PE2. The solution also ensures that
   CE2 and CE3 do not receive duplicates of the broadcast, via PE1/PE2
   and PE2/PE3, respectively. This is achieved by enforcing the
   following two behaviors:
    
   6.4.1.            Filtering Based on Multi-homing ID 
    
   Every R-VPLS PE is configured with a Multi-homing ID (MH-ID), as
   defined in [VPLS-BGP-DH], on each AC connecting to a multi-homed
   CE. A PE forwarding a multi-destination frame tags the flooded
   traffic with the MH-ID that identifies the originating AC, so that
   traffic from a multi-homed CE is not forwarded back to that CE upon
   receipt from the MPLS core. This tagging can be achieved by
   embedding a 'source label', set to the MH-ID, as the end-of-stack
   label in the MPLS packets. For traffic received from the MPLS core,
   the source label is matched against the MH-ID of each candidate
   egress AC. If the source label matches the AC's own MH-ID, the
   traffic is filtered on that AC. If there is no match, the traffic
   is allowed to egress that AC, as long as the Designated Forwarder
   rule (section 6.4.2) is honored.
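
   The source-label check can be sketched as follows. The function
   names and the label values are hypothetical, and the three-label
   stack layout (tunnel label, PW label, source label) is an assumption
   made for the example; the text only requires that the source label
   be the end-of-stack label.

```python
def push_labels(tunnel_label, pw_label, source_mh_id):
    """Label stack for a flooded frame, top of stack first (assumed layout)."""
    return [tunnel_label, pw_label, source_mh_id]


def filtered_on_ac(label_stack, ac_mh_id):
    """True if a frame received from the core must not egress this AC."""
    source_label = label_stack[-1]  # end-of-stack label carries the MH-ID
    # An AC without an MH-ID (None, i.e. single-homed) is never filtered
    # by this check.
    return ac_mh_id is not None and source_label == ac_mh_id


# Frame originated on an AC whose (hypothetical) MH-ID is 1001.
stack = push_labels(tunnel_label=3001, pw_label=2001, source_mh_id=1001)
```

   With this stack, an AC configured with MH-ID 1001 filters the frame,
   while an AC with MH-ID 1002 remains a candidate for egress, subject
   to the Designated Forwarder rule.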
    
   6.4.2.            Defining a Designated Forwarder 
    
   A Designated Forwarder (DF) PE is elected for handling all multi-
   destination frames received from the MPLS core towards a given
   multi-homed device (MHD). Only the DF PE is allowed to forward
   traffic received from the MPLS core (over the multipoint LSP or
   full mesh of PWs) towards a given MHD. The DF is elected dynamically
   using the procedures in [VPLS-BGP-DH]. This resolves the issue of
   duplicate frame delivery.
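
   The two rules of sections 6.4.1 and 6.4.2 can be combined into a
   single egress decision, sketched below. The DF election itself is
   not modeled: its outcome is taken as an input flag. The class name,
   the AC names and the MH-ID values are hypothetical, and the
   treatment of an AC without an MH-ID as always eligible is an
   assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttachmentCircuit:
    name: str
    mh_id: Optional[int]  # None for an AC with no MH-ID (assumed single-homed)
    is_df: bool           # outcome of the DF election, taken as given


def egress_acs(source_label, acs):
    """ACs on which a multi-destination frame from the core may egress."""
    allowed = []
    for ac in acs:
        if ac.mh_id is not None and ac.mh_id == source_label:
            continue  # MH-ID filter: never send back to the originating CE
        if ac.mh_id is not None and not ac.is_df:
            continue  # DF rule: only the DF PE forwards toward the MHD
        allowed.append(ac.name)
    return allowed


acs = [
    AttachmentCircuit("to-CE1", 1001, True),
    AttachmentCircuit("to-CE2", 1002, False),
    AttachmentCircuit("to-CE3", 1003, True),
    AttachmentCircuit("to-CE4", None, False),
]
```

   For a frame carrying source label 1001, this PE forwards only on
   "to-CE3" and "to-CE4": the first AC is removed by the MH-ID filter
   and the second by the DF rule.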
    
   6.5.          LACP State Synchronization 
    
   To support CE multi-homing with multi-chassis Ethernet bundles, the
   R-VPLS PEs connected to a given CE should synchronize [802.1AX] LACP
   state amongst each other. This includes at least the following
   LACP-specific configuration parameters:
    
   - System Identifier (MAC Address): uniquely identifies a LACP 
      speaker. 
   - System Priority: determines which LACP speaker's port priorities 
      are used in the Selection logic. 
   - Aggregator Identifier: uniquely identifies a bundle within a LACP 
      speaker. 
   - Aggregator MAC Address: identifies the MAC address of the bundle. 
   - Aggregator Key: used to determine which ports can join an 
      Aggregator. 
   - Port Number: uniquely identifies an interface within a LACP 
      speaker. 
   - Port Key: determines the set of ports that can be bundled.
   - Port Priority: determines a port's precedence level to join a
      bundle in case the number of eligible ports exceeds the maximum
      number of links allowed in a bundle.
    
   The above information must be synchronized between the R-VPLS PEs
   wishing to form a multi-chassis bundle with a given CE, so that the
   PEs appear to that CE as a single LACP peer. This is required for
   initial system bring-up and upon any configuration change.
   Furthermore, the PEs must also synchronize operational (run-time)
   data, in order for the LACP Selection logic state machines to
   execute. This operational data includes the following LACP
   operational parameters, on a per-port basis:
    
   - Partner System Identifier: CE's System MAC address.
   - Partner System Priority: CE's LACP System Priority.
   - Partner Port Number: CE's AC port number.
   - Partner Port Priority: CE's AC Port Priority.
   - Partner Key: CE's key for this AC.
   - Partner State: CE's LACP State for the AC.
   - Actor State: PE's LACP State for the AC.
   - Port State: PE's AC port status.
       
   The above state needs to be communicated between R-VPLS PEs forming
   a multi-chassis bundle during initial LACP bring-up, upon any
   configuration change, and upon the occurrence of a failure.
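
   As an illustration of why the configuration must be synchronized,
   the following sketch checks whether a set of PE ports would be seen
   by the CE as belonging to one LACP peer. It assumes that this
   requires an identical System Identifier, System Priority and Port
   Key on all member ports, and distinct Port Numbers across the PEs.
   The class, its fields and the function name are hypothetical and
   cover only a subset of the parameters listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LacpPortView:
    pe: str               # PE that owns the port
    system_id: str        # System Identifier (MAC address)
    system_priority: int
    port_key: int
    port_number: int


def appears_as_single_peer(ports):
    """True if the CE would see all ports as one LACP speaker (assumed rule)."""
    identities = {(p.system_id, p.system_priority, p.port_key) for p in ports}
    port_numbers = [p.port_number for p in ports]
    # One shared identity, and no port number reused across PEs.
    return len(identities) == 1 and len(set(port_numbers)) == len(port_numbers)


synced = [
    LacpPortView("PE1", "00:00:5e:00:53:aa", 32768, 10, 1),
    LacpPortView("PE2", "00:00:5e:00:53:aa", 32768, 10, 2),
]
unsynced = [
    LacpPortView("PE1", "00:00:5e:00:53:aa", 32768, 10, 1),
    LacpPortView("PE2", "00:00:5e:00:53:bb", 32768, 10, 2),
]
```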
    
   It should be noted that the above configuration and operational
   state is localized in scope and is only relevant to PEs within a
   given Redundancy Group, i.e., the PEs that connect to the same
   multi-homed CE over a given Ethernet bundle. Furthermore, the
   communication of state changes upon failures must occur with
   minimal latency, in order to minimize the switchover time and the
   consequent service disruption. [PWE3-ICCP] defines a mechanism for
   synchronizing LACP state, using LDP, which can be leveraged for
   R-VPLS. The use of BGP for synchronization of LACP state is left for
   further study.
    
   7.        Security Considerations 
    
   There are no additional security aspects beyond those of VPLS/H-VPLS 
   that need to be discussed here.  
    
   8.        IANA Considerations 
    
   This document requires IANA to assign a new SAFI value for L2VPN_MAC 
   SAFI. 
    
    
   9.        Intellectual Property Considerations 
    
   This document is being submitted for use in IETF standards 
   discussions. 
    
     
   10.         Normative References 
    
   [RFC4664] "Framework for Layer 2 Virtual Private Networks (L2VPNs)", 
      RFC4664, September 2006. 
    
   [RFC4761] "Virtual Private LAN Service (VPLS) Using BGP for Auto-
      discovery and Signaling", RFC4761, January 2007.
    
   [RFC4762] "Virtual Private LAN Service (VPLS) Using Label 
      Distribution Protocol (LDP) Signaling", RFC4762, January 2007. 
    
   [802.1AX] IEEE Std. 802.1AX-2008, "IEEE Standard for Local and 
      metropolitan area networks - Link Aggregation", IEEE Computer 
      Society, November, 2008. 
    
   11.         Informative References 
    
   [VPLS-BGP-DH] Kothari et al., "BGP based Multi-homing in Virtual
      Private LAN Service", draft-ietf-l2vpn-vpls-multihoming-00, work
      in progress, November, 2009.
    
   [VPLS-MCAST] Aggarwal et al., "Multicast in VPLS", draft-ietf-l2vpn-
      vpls-mcast-06.txt, work in progress, March, 2010.
    
   [PWE3-ICCP] Martini et al., "Inter-Chassis Communication Protocol
      for L2VPN PE Redundancy", draft-ietf-pwe3-iccp-02.txt, work in
      progress, October, 2009.
    
   [PWE3-FAT-PW] Bryant et al., "Flow Aware Transport of Pseudowires 
      over an MPLS PSN", draft-ietf-pwe3-fat-pw-03.txt, work in 
      progress, January 2010. 
    
    
    
    
   12.         Authors' Addresses 
    
   Ali Sajassi 
   Cisco 
   170 West Tasman Drive 
   San Jose, CA  95134, US 
   Email: sajassi@cisco.com 
    
   Samer Salam 
   Cisco 
   595 Burrard Street, Suite 2123 
   Vancouver, BC V7X 1J1, Canada 
   Email: ssalam@cisco.com 
    
    
     
   Keyur Patel 
   Cisco 
   170 West Tasman Drive 
   San Jose, CA  95134, US 
   Email: keyupate@cisco.com  
    
    
    
     