BESS Workgroup Ali Sajassi Internet Draft Samer Salam Intended Status: Standards Track Cisco Luay Jalil Verizon John Drake Tapraj Singh Juniper Expires: April 19, 2016 October 19, 2015 PBB-EVPN with Anycast IP Tunnels draft-sajassi-bess-pbb-evpn-anycast-ip-tunnels-00 Abstract This document describes how PBB-EVPN can be combined with Anycast IP Tunnels to provide resilient Layer 2 services with simplified operations on Access Nodes (ANs). Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html Copyright and License Notice Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved. Sajassi et al. Expires April 19, 2016 [Page 1] INTERNET DRAFT PBB-EVPN with Anycast IP Tunnels October 19, 2015 This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 4 3 Current Challenges with PBB-EVPN . . . . . . . . . . . . . . . . 4 3.1 All-Active Redundancy Mode . . . . . . . . . . . . . . . . . 4 3.2 Single-Active Redundancy Mode . . . . . . . . . . . . . . . 5 4 Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 5 4.2 Operation . . . . . . . . . . . . . . . . . . . . . . . . . 5 4.2.1 Known Unicast Traffic . . . . . . . . . . . . . . . . . 6 4.2.2 BUM Traffic . . . . . . . . . . . . . . . . . . . . . . 7 4.3 Optimizing PE B-MAC Address Allocation . . . . . . . . . . . 8 4.3.1 Known Unicast Traffic . . . . . . . . . . . . . . . . . 8 4.3.2 BUM Traffic . . . . . . . . . . . . . . . . . . . . . . 8 4.3.2.1 BUM Traffic from Ethernet Segment . . . . . . . . . 8 4.3.2.2 BUM Traffic from PE in Redundancy Group . . . . . . 8 4.3.2.3 BUM Traffic from Remote PE . . . . . . . . . . . . . 8 5 Failure Scenarios . . . . . . . . . . . . . . . . . . . . . . . 9 5.1 Link/Node Failure in Access Network . . . . . . . . . . . . 9 5.2 PE Node Failure . . . . . . . . . . . . . . . . . . . . . . 9 5.3 IP Tunnel Failure . . . . . . . . . . . . . . . . . . . . . 9 5.4 AN Node Failure . . . . . . . . . . . . . . . . . . . . . . 10 6 Security Considerations . . . . . . . . . . . . . . . . . . . . 10 7 IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 10 8 References . . . . . . . . . . . . . . . . . . . . . . . . . . 10 8.1 Normative References . . . . . . . . . . . . . . . . . . . 10 8.2 Informative References . . . . . . . . . . . . . . . . . . 10 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 11 Sajassi et al. Expires April 19, 2016 [Page 2] INTERNET DRAFT PBB-EVPN with Anycast IP Tunnels October 19, 2015 1 Introduction RFC7623 defines PBB-EVPN, a solution that provides scalable MPLS Layer 2 VPN services using multi-protocol BGP combined with Provider Backbone Bridging (PBB) [802.1ah]. In this document, we describe a solution that uses PBB-EVPN to aggregate IP access networks over an MPLS backbone, while offering Layer 2 VPN services end to end. The network topology in question comprises of three domains: the customer network, the IP access network and the MPLS backbone, as shown in the figure below. Customer IP Access MPLS Network Network Backbone +--------------+ +---------+ | | +---------------+ |+---+ | +----+ | | | || | +---+ | | +-+ | | +---+ | -|AN |-|IPc|---| PE |---|P| | | |IPc| | /|+---+ +---+ /+----+ /+-+ | | /+---+ | / |+---+ | / +----+ / +----+ / +---+| +--+ || | +---+/ | |/ +-+ | |/ +---+ |AN || +--+ |CE|--|AN |-|IPc|---| PE |---|P|--| PE |---|IPc|-| |--|CE| +--+ |+---+ +---+ +----+ +-+ +----+ +---+ +---+| +--+ | | | | | | +---------+ | | +---------------+ +--------------+ Figure 1: Target Topology The customer network connects via Customer Edge (CE) devices to the IP Access Network. The IP Access Network includes Access Nodes (ANs) and IP core nodes (IPc). The ANs perform tunneling of Ethernet packets using some IP tunneling mechanism (e.g. GRE or VXLAN). The MPLS Backbone comprises of PBB-EVPN PEs as well as MPLS core nodes (P). The PBB-EVPN PEs terminate the IP tunnels which originate from the ANs in their local IP Access Network. To simplify the operations and reduce the provisioning overhead on the ANs, as well as to provide resiliency, the PEs will use Anycast IP addresses as the tunnel destination, for tunnels originating from the ANs. We will refer to this setup as PBB-EVPN with Anycast IP tunnels. 1.1 Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this Sajassi et al. Expires April 19, 2016 [Page 3] INTERNET DRAFT PBB-EVPN with Anycast IP Tunnels October 19, 2015 document are to be interpreted as described in RFC 2119 [RFC2119]. 2. Requirements The solution MUST support multipoint Layer 2 VPN services end to end (i.e. CE to CE), with the following requirements: - Support for IP as well as non-IP payloads. Hence, the solution should not rely on any ARP or ND snooping mechanism, but rather rely on MAC learning in data-plane. - Use IP as the underlying transport in the access network, with support for IPv6. - Support VLAN multiplexing with Customer VLAN (C-Tag) transparency in the IP tunnels. - Support for VLAN aware service bundling over the IP tunnels on the PBB-EVPN PEs. This means that the PE needs to identify the L2 Bridge Domain based on the combination of the IP tunnel identifier AND the C-Tag (and/or S-tag). - Support for local switching between IP tunnels on the PBB-EVPN PEs. This is due to the assumption that the ANs may not support any local switching/bridging functionality between their access ports. - Support for hierarchical QoS with two levels in the hierarchy: per IP tunnel and per C-Tag (and/or S-tag). - Provide resilient interconnect with protection against: PE node failure, path failures in the IP access network, and IP tunnel failure. - Simplify the provisioning of the ANs by using Anycast IP addresses as the tunnel destination on the PEs. This eliminates the need to explicitly provisioning the ANs with the unicast IP addresses of the redundant PEs. 3 Current Challenges with PBB-EVPN PBB-EVPN currently can operate in two different redundancy modes: All-Active redundancy and Single-Active redundancy. Neither of these modes can satisfy the requirements set forth in section 2 above. We will discuss this in detail in this section. 3.1 All-Active Redundancy Mode Sajassi et al. Expires April 19, 2016 [Page 4] INTERNET DRAFT PBB-EVPN with Anycast IP Tunnels October 19, 2015 The challenge with the All-Active redundancy mode is that the traffic arriving from the MPLS backbone get load-balanced among the PEs in the redundancy group. As such, it is not possible to enforce the QoS policies reliably for traffic in the backbone to access direction. Note that one cannot assume that the traffic will get distributed evenly between the PEs due to the possibility of "elephant" flows (i.e. flows with considerable bandwidth demands), especially when load-balancing algorithms on the PEs do not take bandwidth into account in the hashing decision. 3.2 Single-Active Redundancy Mode The challenge with the Single-Active redundancy mode is that, based on the DF election, only one of the PBB-EVPN PEs will be forwarding traffic from access to the backbone. At the same time, depending on the IGP distance between the Access Node (AN), and the PEs, traffic over the Anycast IP tunnel may be delivered to the non-DF PE, where it is permanently dropped. This is because the DF election procedures, which are triggered by BGP exchanges, are independent of the IGP path calculations. Relaxing the DF filtering rules will result in duplicate packets, and hence is not an option. 4 Solution 4.1 Overview The solution involves defining a new asymmetric redundancy scheme for PBB-EVPN, which behaves similar to All-Active redundancy for traffic destined to the MPLS backbone from the Ethernet Segment and behaves similar to Single-Active redundancy for traffic destined to the Ethernet Segment from the MPLS backbone. Since the mechanism behaves like All-Active redundancy for traffic destined towards the MPLS backbone, the PEs in the redundancy group can all receive traffic from the ANs, irrespective of how the IGP steers the Anycast traffic. The segment split horizon filtering on the PE prevents looping of BUM traffic. Moreover, since the mechanism behaves like Single-Active redundancy for traffic destined towards the Ethernet Segment, the active PE can reliably enforce the QoS policies. 4.2 Operation Each IP tunnel will be treated as a virtual Ethernet Segment (vES) [Virtual-ES] on the PBB-EVPN PEs. However, instead of having a single Sajassi et al. Expires April 19, 2016 [Page 5] INTERNET DRAFT PBB-EVPN with Anycast IP Tunnels October 19, 2015 B-MAC address associated with each vES across all PEs in the RG, each PE will assign a different B-MAC for each vES, because Single-Active redundancy is used for traffic from the MPLS backbone. This ensures that remote PEs learn the C-MAC addresses against the unique B-MAC of the active PE only. Note that the active PE here is actually determined based on where the Anycast traffic lands, according the IGP distance between the ANs and the PEs in the IP access network. Traffic filtering based on DF election applies only to BUM traffic, and only for traffic in the direction towards the Ethernet Segment. +--------------+ +---------+ | | +---------------+ |+---+ | +----+ | | | +--+ || |---------| | | | | |CE|--|AN1|-----\ |PE1 | | | | +--+ |+---+ | \ /+----+ | | | |+---+ | / +----+ +----+ +---+| +--+ || |------/ \| | | | |AN || +--+ |CE|--|AN2|---------|PE2 | |PEr |---------| |--|CE| +--+ |+---+ | +----+ +----+ +---+| +--+ | | | | | | +---------+ | | +---------------+ +--------------+ Figure 2: Example Network For what follows, please refer to the example network in Figure 2 above. 4.2.1 Known Unicast Traffic When an access nodes (e.g. AN1) forwards known unicast traffic over the IP access network, the traffic will be steered towards one of the PEs depending on the IGP distance between the AN and the PEs. In the case where both PEs are equidistant from the AN, i.e. ECMP, the load- balancing hash on the AN and IP core nodes of the access network determines which PE receives the traffic. Assume the traffic is delivered to PE1 in this example. PE1 would then encapsulate the traffic in the PBB header, using its B-MAC address that is associated with the vES corresponding to the IP tunnel on which the traffic was received. The PE then adds the MPLS encapsulation and forwards the packets towards the destination remote PE (PEr in this example). PEr would then learn the C-MAC against the B-MAC address associated with PE1. Hence, for the reverse traffic PEr will forward the traffic destined to that C-MAC to PE1. This follows normal PBB-EVPN operation. Sajassi et al. Expires April 19, 2016 [Page 6] INTERNET DRAFT PBB-EVPN with Anycast IP Tunnels October 19, 2015 The only issue to consider is how would the PEs enforce the QoS policies in case of the presence of multiple equal cost paths between the AN and the PEs. Since the QoS policies need to be enforced per IP tunnel and per C-Tag (and/or S-tag) on the PE, the AN needs to ensure that all traffic of a given tunnel lands on the same PE. This can be done by encoding the tunnel identifier in the entropy field of the packet. For example, in the case of VXLAN, the source VTEP address can be hashed into the source UDP port. Similarly, in the case of GRE, the source tunnel address can be hashed into the GRE key. This guarantees that all traffic associated with a given IP tunnel from an AN is destined to a single PE, even in the presence of ECMP. 4.2.2 BUM Traffic The PEs in a redundancy group perform the standard DF election procedures described in RFC7623 (i.e. with per I-SID load-balancing). As a result, only one of the PEs will be responsible for forwarding BUM traffic received from the MPLS backbone towards the Ethernet Segment (i.e. IP access network). This is per standard PBB-EVPN operation. For BUM traffic from the IP access network destined towards the MPLS backbone, the traffic will be sent as a unicast from the AN to one of the PEs (depending on the IGP distance, or the ECMP hash in case of equidistant PEs). This is because the AN does not support any bridging function. The ingress PE, which receives the BUM traffic, will flood it following the standard procedures of PBB-EVPN. This includes applying any DF filtering on locally attached Ethernet Segments. For example, assume that for a given I-SID (ISID1), PE1 is the DF for the vES associated with the IP tunnel to AN1, whereas PE2 is the DF for the vES associated with the IP tunnel to AN2. For BUM traffic sent by AN1 to PE1, the PE1 will not flood the traffic towards AN2 since this PE is not the DF for that vES. The ingress PE would flood the BUM traffic to other PEs in the redundancy group. In our example, this means that PE1 will flood the BUM traffic to PE2. These PEs need to apply the segment split-horizon filtering function to guarantee that the BUM traffic does not loop back into the originating AN. To achieve this, a new procedure is added to standard PBB-EVPN whereby the PE examines the source B-MAC address on the BUM traffic, and recognizes that this B-MAC is associated with a local vES. As such, the PE does not flood the BUM traffic over that specific vES. This is effectively an extension of the PBB-EVPN split-horizon functionality: in the standard, a single B-MAC is associated with a given ESI, whereas here multiple B-MACs are associated with the same ESI. Note that the PE would forward the BUM traffic on other virtual Ethernet Segments in the same bridge domain, and for which it is the assigned DF. Going back to our Sajassi et al. Expires April 19, 2016 [Page 7] INTERNET DRAFT PBB-EVPN with Anycast IP Tunnels October 19, 2015 example, PE2 in this case would forward the BUM traffic received from PE1 towards AN2 but not towards AN1. 4.3 Optimizing PE B-MAC Address Allocation The mechanisms described thus far require the allocation of a B-MAC address per IP tunnel (i.e. per CE). For high-scale deployments, this allocation scheme defeats the scaling benefits of PBB-EVPN. Hence, a more optimal B-MAC address allocation mechanism is needed. This section describes how this can be achieved. In order to reduce the number of B-MAC addresses required per PE, the approach is to adopt the "local bias" mechanism defined in [Overlay] for PBB-EVPN. This allows only a single B-MAC address to be assigned to each PE for all its vES's (instead of one B-MAC per vES). 4.3.1 Known Unicast Traffic The operation for known unicast traffic is similar to what was described in section 4.2.1, albeit with a single source B-MAC address being used for all traffic arriving at the PE from the virtual Ethernet Segments associated with the IP tunnels. 4.3.2 BUM Traffic The operation for BUM traffic can be broken down to three different scenarios, as discussed next. 4.3.2.1 BUM Traffic from Ethernet Segment For BUM traffic arriving from the (virtual) Ethernet Segment, the PE follows the local bias mechanisms. That is, it floods the traffic over all other local (virtual) Ethernet Segments in the same bridge- domain irrespective of DF election state. The PE also floods the traffic over the MPLS backbone. 4.3.2.2 BUM Traffic from PE in Redundancy Group For BUM traffic arriving from another ingress PE in the same Redundancy Group, the PE inspects the source B-MAC address in the packets to identify the right replication list for the I-SID in question. This replication list would exclude all (virtual) Ethernet Segments that are common with the ingress PE (to prevent loops). The PE would flood the BUM traffic over all (virtual) Ethernet Segments in that specific replication list, subject to the DF filtering constraints - i.e., it is the DF for that segment. 4.3.2.3 BUM Traffic from Remote PE Sajassi et al. Expires April 19, 2016 [Page 8] INTERNET DRAFT PBB-EVPN with Anycast IP Tunnels October 19, 2015 For BUM traffic arriving from a remote ingress PE that has no segments in common with the egress PE, the latter would again examine the source B-MAC address in the packets to identify the replication list for the I-SID in question. This replication list would include all (virtual) Ethernet Segments that are in the associated bridge- domain. The PE would flood the traffic over all those segments while observing the DF filtering rules. 5 Failure Scenarios In this section we will discuss the operation of the solution in case of various failure scenarios. 5.1 Link/Node Failure in Access Network For link or node failures in the access network, which do not cause the ANs to completely lose connectivity to the PEs, the failure will trigger the IGP to recalculate the shortest path. This may cause the Anycast traffic from some ANs to be steered towards new PEs. However, no action will be required in the control plane on the PEs. Remote PEs will automatically update their C-MAC/B-MAC associations through data-plane learning. There are no transient loops, and the convergence time is a function of how quickly the IGP can re- converge. 5.2 PE Node Failure Failure of the PE node is addressed by the fact that the IP tunnels use Anycast addresses. The ANs do not need to explicitly react to the failure in any special way, as the IGP re-convergence takes care of the fail-over procedures. The B-MAC routes advertised by the failed PE are withdrawn by the Route Reflector, which in turn flushes all the C-MACs associated with those B-MACs on the remote PEs. If this happens before any new traffic arrives from the CE via the new PE, temporary flooding would occur as expected. However, as soon as those remote PEs start receiving traffic from the associated C-MACs over the new PE, the remote PEs will update their C-MAC/B-MAC bindings through normal data-plane learning. 5.3 IP Tunnel Failure IP tunnel failure occurs when there is no viable path, within the access network, between the AN and a PE. The PE can detect such condition by using IGP route watch mechanisms. Upon detecting this failure, the PE reacts with two actions: - It withdraws all the Ethernet Segment routes associated with the unreachable AN. Sajassi et al. Expires April 19, 2016 [Page 9] INTERNET DRAFT PBB-EVPN with Anycast IP Tunnels October 19, 2015 - It withdraws all the MAC Advertisement routes associated with the B-MAC(s) of the affected vES's. Similar to the PE node failure scenario above, this may result in temporary flooding from remote PEs until traffic starts flowing in the other direction from the CEs. 5.4 AN Node Failure From the perspective of the PE, this failure is handled the same way as the IP tunnel failure scenario. If the CE connected to the failed AN is single-homed, then no redundancy would be available. 6 Security Considerations There are no special security considerations beyond those of PBB- EVPN. 7 IANA Considerations TBD 8 References 8.1 Normative References [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC7623] Sajassi et al., "Provider Backbone Bridging Combined with Ethernet VPN (PBB-EVPN)", RFC 7623, September 2015. [PBB] IEEE, "IEEE Standard for Local and metropolitan area networks - Media Access Control (MAC) Bridges and Virtual Bridged Local Area Networks", Clauses 25 and 26, IEEE Std 802.1Q, DOI 10.1109/IEEESTD.2011.6009146. [Overlay] Sajassi et al., "A Network Virtualization Overlay Solution using EVPN", draft-ietf-bess-evpn-overlay-01, work in progress, February 2015. 8.2 Informative References Sajassi et al. Expires April 19, 2016 [Page 10] INTERNET DRAFT PBB-EVPN with Anycast IP Tunnels October 19, 2015 Authors' Addresses Ali Sajassi Cisco EMail: sajassi@cisco.com Samer Salam Cisco EMail: ssalam@cisco.com Luay Jalil Verizon EMail: luay.jalil@verizon.com John Drake Juniper EMail: jdrake@juniper.net Tapraj Singh Juniper EMail: tsingh@juniper.net Sajassi et al. Expires April 19, 2016 [Page 11]