Network Working Group K. Vairavakkalai Internet-Draft N. Venkataraman Intended status: Standards Track B. Rajagopalan Expires: June 15, 2021 Juniper Networks, Inc. G. Mishra Verizon Communications Inc. December 12, 2020 BGP Classful Transport Planes draft-kaliraj-idr-bgp-classful-transport-planes-03 Abstract This document specifies a mechanism, referred to as "service mapping", to express association of overlay routes with underlay routes satisfying a certain SLA, using BGP. The document describes a framework for classifying underlay routes into transport classes, and mapping service routes to specific transport class. The transport class maps to a desired SLA. It specifies BGP protocol procedures that enable dissimination of such service mapping information that may span across administrative domains. It makes it possible to advertise multiple tunnels to the same destination. A new BGP transport layer address family (SAFI 76) is defined for this purpose that uses RFC-4364 technology and follows RFC-8277 NLRI encoding. This new address family is called "BGP Classful Transport", aka BGP CT. It carries transport prefixes across tunnel domain boundaries (e.g. in Inter-AS Option-C networks), parallel to BGP LU (SAFI 4) . It dissiminates "Transport class" information for the transport prefixes across the participating domains, which is not possible with BGP LU. This makes the end-to-end network a "Transport Class" aware tunneled network. Requirements Language The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. Vairavakkalai, et al. Expires June 15, 2021 [Page 1] Internet-Draft BGP Classful Transport Planes December 2020 Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on June 15, 2021. Copyright Notice Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Transport Class . . . . . . . . . . . . . . . . . . . . . . . 6 4. "Transport Class" Route Target Extended Community . . . . . . 7 5. Transport RIB . . . . . . . . . . . . . . . . . . . . . . . . 8 6. Transport Routing Instance . . . . . . . . . . . . . . . . . 8 7. Nexthop Resolution Scheme . . . . . . . . . . . . . . . . . . 8 8. BGP Classful Transport Family NLRI . . . . . . . . . . . . . 9 9. Comparison with other families using RFC-8277 encoding . . . 10 10. Protocol Procedures . . . . . . . . . . . . . . . . . . . . . 11 11. OAM considerations . . . . . . . . . . . . . . . . . . . . . 14 12. Illustration of procedures with example topology . . . . . . 15 12.1. Topology . . . . . . . . . . . . . . . . . . . . . . . . 15 12.2. Service Layer route exchange . . . . . . . . . . . . . . 16 Vairavakkalai, et al. Expires June 15, 2021 [Page 2] Internet-Draft BGP Classful Transport Planes December 2020 12.3. Transport Layer route propagation . . . . . . . . . . . 17 12.4. Data plane view . . . . . . . . . . . . . . . . . . . . 19 12.4.1. Steady state . . . . . . . . . . . . . . . . . . . . 19 12.4.2. Absorbing failure of primary path . . . . . . . . . 19 13. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 13.1. New BGP SAFI . . . . . . . . . . . . . . . . . . . . . . 20 13.2. New Format for BGP Extended Community . . . . . . . . . 20 13.2.1. Existing registries to be modified . . . . . . . . . 21 13.2.2. New registries to be created . . . . . . . . . . . . 21 13.3. MPLS OAM code points . . . . . . . . . . . . . . . . . . 22 14. Security Considerations . . . . . . . . . . . . . . . . . . . 23 15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 23 16. References . . . . . . . . . . . . . . . . . . . . . . . . . 23 16.1. Normative References . . . . . . . . . . . . . . . . . . 23 16.2. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 24 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 24 1. Introduction To facilitate service mapping, the tunnels in a network can be grouped by the purpose they serve into a "Transport Class". The tunnels could be created using any signaling protocol, such as LDP, RSVP, BGP LU or SPRING. The tunnels could also use native IP or IPv6, as long as they can carry MPLS payload. Tunnels may exist between different pair of end points. Multiple tunnels may exist between the same pair of end points. Thus, a Transport Class consists of tunnels created by various protocols that satisfy the properties of the class. For example, a "Gold" transport class may consist of tunnels that traverse the shortest path with fast re-route protection, a "Silver" transport class may hold tunnels that traverse shortest paths without protection, a "To NbrAS Foo" transport class may hold tunnels that exit to neighboring AS Foo, and so on. The extensions specified in this document can be used to create a BGP transport tunnel that potentially spans domains, while preserving its Transport Class. Examples of domain are Autonomous System (AS), or IGP area. Within each domain, there is a second level underlay tunnel used by BGP to cross the domain. The second level underlay tunnels could be hetrogeneous: Each domain may use a different type of tunnel (e.g. MPLS, IP, GRE), or use a differnet signaling protocol. A domain boundary is demarcated by a rewrite of BGP nexthop to 'self' while re-advertising tunnel routes in BGP. Examples of domain boundary are inter-AS links and inter-region ABRs. The path uses MPLS label-switching when crossing domain boundary and uses the native intra-AS tunnel of the desired transport class when traversing within a domain. Vairavakkalai, et al. Expires June 15, 2021 [Page 3] Internet-Draft BGP Classful Transport Planes December 2020 Overlay routes carry sufficient indication of the Transport Classes they should be encapsulated over, in form of BGP community called the "Mapping community". Based on the mapping community, "route resolution" procedure on the ingress node selects from the corresponding Transport Class an appropriate tunnel whose destination matches (LPM) the nexthop of the overlay route. If the overlay route is carried in BGP, the protocol nexthop (or, PNH) is generally carried as an attribute of the route. The PNH of the overlay route is also referred to as "service endpoint" (SEP). The service endpoint may exist in the same domain as the service ingress node or lie in a different domain, adjacent or non-adjacent. In the former case, reachability to the SEP is provided by an intra-domain tunneling protocol, and in the latter case, reachability to the SEP is via BGP transport families. In this architecture, the intra-domain transport protocols (e.g. RSVP, SRTE) are also "Transport Class aware", and they publish ingress routes in Transport RIB associated with the Transport Class, at the tunnel ingress node. These routes are then redistributed into BGP CT to be advertised to adjacent domains. It is outside the scope of this document how exactly the transport protocols are made transport-class aware, though configuration on the tunnel ingress node is a simple mechanism to achieve it. This document describes mechanisms to: Model a "Transport Class" as "Transport RIB" on a router, consisting of tunnel ingress routes of a certain class. Enable service routes to resolve over an intended Transport Class by virtue of carrying the appropriate "Mapping community". Which results in using the corresponding Transport RIB for finding nexthop reachability. Advertise tunnel ingress routes in a Transport RIB via BGP without any path hiding, using BGP VPN technology and Add-path. Such that overlay routes in the receiving domains can also resolve over tunnels of associated Transport Class. Provide a way for co-operating domains to reconcile between independently administered extended community namespaces, and interoperate between different transport signaling protocols in each domain. In this document we focus mainly on MPLS LSPs as the intra-domain transport tunnels, but the mechanisms would work in similar manner for non-MPLS transport tunnels too. Vairavakkalai, et al. Expires June 15, 2021 [Page 4] Internet-Draft BGP Classful Transport Planes December 2020 This document assumes MPLS forwarding when crossing domain boundaries, as that is the defacto standard in deployed networks today. But mechanisms specified in this document can also support different forwarding technologies (e.g. SRv6). A future document may describe such adaptations, it is out of scope of this document. 2. Terminology LSP: Label Switched Path. TE : Traffic Engineering. SN : Service Node. BN : Border Node. TN : Transport Node, P-router. BGP-VPN : VPNs built using RFC4364 mechanisms. RT : Route-Target extended community. RD : Route-Distinguisher. PNH : Protocol-Nexthop address carried in a BGP Update message. SEP : Service End point, the PNH of a Service route. LPM : Longest Prefix Match. Service Family : BGP address family used for advertising routes for "data traffic", as opposed to tunnels. Transport Family : BGP address family used for advertising tunnels, which are in turn used by service routes for resolution. Transport Tunnel : A tunnel over which a service may place traffic. These tunnels can be GRE, UDP, LDP, RSVP, or SR-TE. Tunnel Domain : A domain of the network containing SN and BN, under a single administrative control that has a tunnel between SN and BN. An end-to-end tunnel spanning several adjacent tunnel domains can be created by "stitching" them together using labels. Transport Class : A group of transport tunnels offering the same type of service. Vairavakkalai, et al. Expires June 15, 2021 [Page 5] Internet-Draft BGP Classful Transport Planes December 2020 Transport Class RT : A Route-Target extended community used to identify a specific Transport Class. Transport RIB : At the SN and BN, a Transport Class has an associted Transport RIB that holds its tunnel routes. Transport Plane : An end to end plane comprising of transport tunnels belonging to same transport class. Tunnels of same transport class are stitched together by BGP route readvertisements with nexthop- self, to span across domain boundaries using Label-Swap forwarding mechanism similar to Inter-AS option-b. Mapping Community : BGP Community/Extended-community on a service route, that maps it to resolve over a Transport Class. 3. Transport Class A Transport Class is defined as a set of transport tunnels that share certain characteristics useful for underlay selection. On the wire, a transport class is represented as the Transport Class RT, which is a new Route-Target extended community. A Transport Class is configured at SN and BN, along with attributes like RD and Route-Target. Creation of a Transport Class instantiates the associated Transport RIB and a Transport routing instance to contain them all. The operator may configure a SN/BN to classify a tunnel into an appropriate Transport Class, which causes the tunnel's ingress routes to be installed in the corresponding Transport RIB. At a BN, these tunnel routes may then be advertised into BGP CT. Alternatively, a router receiving the transport routes in BGP with appropriate signaling information can associate those ingress routes to the appropriate Transport Class. E.g. for Classful Transport family (SAFI 76) routes, the Transport Class RT indicates the Transport Class. For BGP LU family(SAFI 4) routes, import processing based on Communities or inter-AS source-peer may be used to place the route in the desired Transport Class. When the ingress route is received via SRTE [SRTE], which encodes the Transport Class as an integer 'Color' in the NLRI as "Color:Endpoint", the 'Color' is mapped to a Transport Class during import processing. SRTE ingress route for 'Endpoint' is installed in that transport class. The SRTE route when advertised out to BGP speakers will then be advertised in Classful Transport family with Transport Class RT and a new label. The MPLS swap route thus Vairavakkalai, et al. Expires June 15, 2021 [Page 6] Internet-Draft BGP Classful Transport Planes December 2020 installed for the new label will pop the label and deliver decapsulated traffic into the path determined by SRTE route. 4. "Transport Class" Route Target Extended Community This document defines a new type of Route Target, called "Transport Class" Route Target Extended Community. "Transport Class" Route Target extended community is a transitive extended community EXT-COMM [RFC4360] of extended-type, with a new Format (Type high = 0xa) and SubType as 0x2 (Route Target). This new Route Target Format has the following encoding: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type= 0xa | SubType= 0x02 | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Transport Class ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ "Transport Class" Route Target Extended Community Type: 2 octets Type field contains value 0xa. SubType: 2 octets Subtype field contain 0x2. This indicates 'Route Target'. Transport Class ID: 4 octets The least significant 32-bits of the value field contain the "Transport Class" identifier, which is a 32-bit integer. The remaining 2 octets after SubType field are Reserved, they MUST be set to zero by originator, and ignored, left unaltered by receiver. The "Transport class" Route Target Extended community follows the mechanisms for VPN route import, export as specified in BGP-VPN [RFC4364], and Route Target Contrain mechanisms as specified in VPN- RTC [RFC4684] Vairavakkalai, et al. Expires June 15, 2021 [Page 7] Internet-Draft BGP Classful Transport Planes December 2020 A BGP speaker that implements RT Constraint VPN-RTC [RFC4684] MUST apply the RT Constraint procedures to the "Transport class" Route Target Extended community as-well. The Transport Class Route Target Extended community is carried on Classful Transport family routes, and allows associating them with appropriate Transport RIBs at receiving BGP speakers. Use of the Transport Class Route Target Extended community with a new Type code avoids conflicts with any VPN Route Target assignments already in use for service families. 5. Transport RIB A Transport RIB is a routing-only RIB that is not installed in forwarding path. However, the routes in this RIB are used to resolve reachability of overlay routes' PNH. Transport RIB is created when the Transport Class it represents is configured. Overlay routes that want to use a specific Transport Class confine the scope of nexthop resolution to the set of routes contained in the corresponding Transport RIB. This Transport RIB is the "Routing Table" referred in Section 9.1.2.1 RFC4271 [1] Routes in a Transport RIB are exported out in 'Classful Transport' address family. 6. Transport Routing Instance A BGP VPN routing instance that is a container for the Transport RIB. It imports, and exports routes in this RIB with Transport Class RT. Tunnel destination addresses in this routing instance's context come from the "provider namespace". This is different from user VRFs for e.g., which contain prefixes in "customer namespace" The Transport Routing instance uses the RD and RT configured for the Transport Class. 7. Nexthop Resolution Scheme An implementation may provide an option for the service route to resolve over less preferred Transport Classes, should the resolution over preferred, or "primary" Transport Class fail. To accomplish this, the set of service routes may be associated with a user-configured "resolution scheme", which consists of the primary Transport Class, and optionally, an ordered list of fallback Transport Classes. Vairavakkalai, et al. Expires June 15, 2021 [Page 8] Internet-Draft BGP Classful Transport Planes December 2020 A community called as "Mapping Community" is configured for a "resolution scheme". A Mapping community maps to exactly one resolution scheme. A resolution scheme comprises of one primary transport class and optionally one or more fallback transport classes. A BGP route is associated with a resolution scheme during import processing. The first community on the route that matches a mapping community of a locally configured resolution scheme is considered the effective mapping community for the route. The resolution scheme thus found is used when resolving the route's PNH. If a route contains more than one mapping community, it indicates that the route considers these multiple mapping communities as equivalent. So the first community that maps to a resolution scheme is chosen. A transport route received in BGP Classful Transport family SHOULD use a resolution scheme that contains the primary Transport Class without any fallback to best effort tunnels. The primary Transport Class is identified by the Transport Class RT carried on the route. Thus Transport Class RT serves as the Mapping Community for Classful Transport routes. A service route received in a BGP service family MAY map to a resolution scheme that contains the primary Transport Class identified by the mapping community on the route, and a fallback to best effort tunnels transport class. The primary Transport Class is identified by the Mapping community carried on the route. For e.g. the Extended Color community may serve as the Mapping Community for service routes. Color:0: MAY map to a resolution scheme that has primary transport class , and a fallback to best-effort transport class. 8. BGP Classful Transport Family NLRI The Classful Transport family will use the existing AFI of IPv4 or IPv6, and a new SAFI 76 "Classful Transport" that will apply to both IPv4 and IPv6 AFIs. The "Classful Transport" SAFI NLRI itself is encoded as specified in https://tools.ietf.org/html/rfc8277#section-2 [RFC8277]. When AFI is IPv4 the "Prefix" portion of Classful Transport family NLRI consists of an 8-byte RD followed by an IPv4 prefix. When AFI is IPv6 the "Prefix" consists of an 8-byte RD followed by an IPv6 prefix. Vairavakkalai, et al. Expires June 15, 2021 [Page 9] Internet-Draft BGP Classful Transport Planes December 2020 Attributes on a Classful Transport route include the Transport Class Route-Target extended community, which is used to leak the route into the right Transport RIBs on SNs and BNs in the network. 9. Comparison with other families using RFC-8277 encoding SAFI 128 (Inet-VPN) is a RF8277 encoded family that carries service prefixes in the NLRI, where the prefixes come from the customer namespaces, and are contexualized into separate user virtual service RIBs called VRFs, using RFC4364 procedures. SAFI 4 (BGP LU) is a RFC8277 encoded family that carries transport prefixes in the NLRI, where the prefixes come from the provider namespace. SAFI 76 (Classful Transport) is a RFC8277 encoded family that carries transport prefixes in the NLRI, where the prefixes come from the provider namespace, but are contexualized into separate Transport RIBs, using RFC4364 procedures. It is worth noting that SAFI 128 has been used to carry transport prefixes in "L3VPN Inter-AS Carrier's carrier" scenario, where BGP LU/LDP prefixes in CsC VRF are advertised in SAFI 128 towards the remote-end baby carrier. In this document a new AFI/SAFI is used instead of reusing SAFI 128 to carry these transport routes, because it is operationally advantageous to segregate transport and service prefixes into separate address families, RIBs. E.g. It allows to safely enable "per-prefix" label allocation scheme for Classful Transport prefixes without affecting SAFI 128 service prefixes which may have huge scale. "per prefix" label allocation scheme keeps the routing churn local during topology changes. A new family also facilitates having a different readvertisement path of the transport family routes in a network than the service route readvertisement path. viz. Service routes (Inet-VPN) are exchanged over an EBGP multihop session between Autonomous systems with nexthop unchanged; whereas Classful Transport routes are readvertised over EBGP single hop sessions with "nexthop- self" rewrite over inter-AS links. The Classful Transport family is similar in vein to BGP LU, in that it carries transport prefixes. The only difference is, it also carries in Route Target an indication of which Transport Class the transport prefix belongs to, and uses RD to disambiguate multiple instances of the same transport prefix in a BGP Update. Vairavakkalai, et al. Expires June 15, 2021 [Page 10] Internet-Draft BGP Classful Transport Planes December 2020 10. Protocol Procedures This section summarizes the procedures followed by various nodes speaking Classful Transport family Preparing the network for deploying Classful Transport planes Operator decides on the Transport Classes that exist in the network, and allocates a Route-Target to identify each Transport Class. Operator configures Transport Classes on the SNs and BNs in the network with unique Route-Distinguishers and Route-Targets. Implementations may provide automatic generation and assignment of RD, RT values for a transport routing instance; they MAY also provide a way to manually override the automatic mechanism, in order to deal with any conflicts that may arise with existing RD, RT values in the different network domains participating in a deployment. Origination of Classful Transport route: At the ingress node of the tunnel's home domain, the tunneling protocols install routes in the Transport RIB associated with the Transport Class the tunnel belongs to. The ingress node then advertises this tunnel destination into BGP as a Classful Transport family route with NLRI RD:TunnelEndpoint, attaching a 'Transport Class' Route Target that identifies the Transport Class. This BGP CT route is advertised to EBGP peers and IBGP peers which are RR-clients. This route MUST NOT be advertised to the IBGP peers who are not RR-clients. Alternatively, the egress node of the tunnel i.e. the tunnel endpoint can originate the same BGP Classful Transport route, with NLRI RD:TunnelEndpoint and PNH TunnelEndpoint, which will resolve over the tunnel route at the ingress node. When the tunnel is up, the Classful Transport BGP route will become usable and get re- advertised. Unique RD SHOULD be used by the originator of a Classful Transport route to disambiguate the multiple BGP advertisements for a transport end point. Ingress node receiving Classful Transport route Vairavakkalai, et al. Expires June 15, 2021 [Page 11] Internet-Draft BGP Classful Transport Planes December 2020 On receiving a BGP Classful Transport route with a PNH that is not directly connected, e.g. an IBGP-route, a mapping community on the route (the Transport Class RT) indicates which Transport Class this route maps to. The routes in the associated Transport RIB are used to resolve the received PNH. If there does not exist a route in the Transport RIB matching the PNH, the Classful Transport route is considered unusable, and MUST NOT be re- advertised further. Border node readvertising Classful Transport route with nexthop self: The BN allocates an MPLS label to advertise upstream in Classful Transport NLRI. The BN also installs an MPLS swap-route for that label that swaps the incoming label with a label received from the downstream BGP speaker, or pops the incoming label. And then pushes received traffic to the transport tunnel or direct interface that the Classful Transport route's PNH resolved over. Border node receiving Classful Transport route on EBGP : If the route is received with PNH that is known to be directly connected, e.g. EBGP single-hop peering address, the directly connected interface is checked for MPLS forwarding capability. No other nexthop resolution process is performed, as the inter-AS link can be used for any Transport Class. If the inter-AS links should honor Transport Class, then the BN SHOULD follow procedures of an Ingress node described above, and perform nexthop resolution process. The interface routes SHOULD be installed in the Transport RIB belonging to the associated Transport Class. Avoiding path-hiding through Route Reflectors When multiple BNs exist that advertise a RDn:PEn prefix to RRs, the RRs may hide all but one of the BNs, unless ADDPATH [RFC7911] is used for the Classful Transport family. This is similar to L3VPN option-B scenarios. Hence ADDPATH SHOULD be used for Classful Transport family, to avoid path-hiding through RRs. Ingress node receiving service route with mapping community Service routes received with mapping community resolve using Transport RIBs determined by the resolution scheme. If the resolution process does not find an usable Classful Transport route or tunnel route in any of the Transport RIBs, the service route MUST be considered unusable for forwarding purpose. Vairavakkalai, et al. Expires June 15, 2021 [Page 12] Internet-Draft BGP Classful Transport Planes December 2020 Coordinating between domains using different community namespaces. Domains not agreeing on RT, RD, Mapping-community values because of independently administered community namespaces may deploy mechanisms to map and rewrite the Route-target values on domain boundaries, using per ASBR import policies. This is no different than any other BGP VPN family. Mechanisms employed in inter-AS VPN deployments may be used with the Classful Transport family also. The resolution schemes SHOULD allow association with multiple mapping communities. This helps with renumbering, network mergers, or transitions. Though RD can also be rewritten on domain boundaries, deploying unique RDs is strongly RECOMMENDED, because it helps in trouble shooting by uniquely identifying originator of a route, and avoids path-hiding. This document defines a new format of Route-Target extended- community to carry Transport Class, this avoids collision with regular Route Target namespace used by service routes. Constrained distribution of PNHs to SNs. This section describes how the number of Protocol Nexthops advertised to a SN can be constrained using BGP Classsful Transport and VPN RTC [RFC4684] An egress SN MAY advertise BGP CT route for RD:eSN with two Route Targets: transport-target:0: and a RT carrying :. Where TC is the Transport Class identifier, and eSN is the IP- address used by SN as BGP nexthop in it's service route advertisements. transport-target:0: is the new type of route target (Transport Class RT) defined in this document. It is carried in BGP extended community attribute (BGP attribute code 16). The RT carrying : MAY be an IP-address specific regular RT (BGP attribute code 16), IPv6-address specific RT (BGP attribute code 25), or a Wide-communities based RT (BGP attribute code 34) as described in RTC-Ext [RTC-Ext] An ingress SN MAY import BGP CT routes with Route Target carrying: :. The ingress SN MAY learn the eSN values either by configuration, or it MAY discover them from the BGP nexthop field in the BGP VPN service routes received from eSN. A BGP ingress SN Vairavakkalai, et al. Expires June 15, 2021 [Page 13] Internet-Draft BGP Classful Transport Planes December 2020 receiving a BGP service route with nexthop of eSN SHOULD generate a RTC/Extended-RTC route for Route Target prefix :/ in order to learn BGP CT transport routes to reach eSN. This allows constrained distribution of the transport routes to the PNHs actually required by iSN. A BN in the core of the network SHOULD import BGP CT routes with Transport Class Route Target: 0:TC. Because it is interested in transport routes to all eSN nodes. 11. OAM considerations Standard MPLS OAM procedures specified in [RFC8029] also apply to BGP Classful Transport. The 'Target FEC Stack' sub-TLV for IPv4 Classful Transport has a Sub- Type of [TBD], and a length of 13. The Value field consists of the RD advertised with the Classful Transport prefix, the IPv4 prefix (with trailing 0 bits to make 32 bits in all), and a prefix length, encoded as follows: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Route Distinguisher | | (8 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv4 prefix | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Prefix Length | Must Be Zero | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 1: Classful Transport IPv4 FEC The 'Target FEC Stack' sub-TLV for IPv6 Classful Transport has a Sub- Type of [TBD], and a length of 25. The Value field consists of the RD advertised with the Classful Transport prefix, the IPv6 prefix (with trailing 0 bits to make 128 bits in all), and a prefix length, encoded as follows: Vairavakkalai, et al. Expires June 15, 2021 [Page 14] Internet-Draft BGP Classful Transport Planes December 2020 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Route Distinguisher | | (8 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 prefix | | | | | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Prefix Length | Must Be Zero | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 2: Classful Transport IPv6 FEC 12. Illustration of procedures with example topology 12.1. Topology [RR26] [RR27] [RR16] | | | | | | |+-[ABR23]--+|+--[ASBR21]---[ASBR13]-+|+--[PE11]--+ || ||| ` / ||| | [CE41]--[PE25]--[P28] [P29] `/ [P15] [CE31] | | | /` | | | | | | / ` | | | | | | / ` | | | +--[ABR24]--+ +--[ASBR22]---[ASBR14]-+ +--[PE12]--+ | | | | + + + + CE | region-1 | region-2 | |CE AS4 ...AS2... AS1 AS3 41.41.41.41 ------------ Traffic Direction ----------> 31.31.31.31 This example shows a provider network that comprises of two Autonomous systems, AS1, AS2. They are serving customers AS3, AS4 respectively. Traffic direction being described is CE41 to CE31. CE31 may request a specific SLA, e.g. Gold for this traffic, when traversing these provider networks. AS2 is further divided into two regions. So there are three tunnel domains in provider space. AS1 uses ISIS Flex-Algo intra-domain tunnels, whereas AS2 uses RSVP intra-domain tunnels. Vairavakkalai, et al. Expires June 15, 2021 [Page 15] Internet-Draft BGP Classful Transport Planes December 2020 The network has two Transport classes: Gold with transport-class id 100, Bronze with transport-class id 200. These transport classes are provisioned at the PEs and the Border nodes (ABRs, ASBRs) in the network. Following tunnels exist for Gold transport class. PE25_to_ABR23_gold - RSVP tunnel PE25_to_ABR24_gold - RSVP tunnel ABR23_to_ASBR22_gold - RSVP tunnel ASBR13_to_PE11_gold - ISIS FlexAlgo tunnel ASBR14_to_PE11_gold - ISIS FlexAlgo tunnel Following tunnels exist for Bronze transport class. PE25_to_ABR23_bronze - RSVP tunnel ABR23_to_ASBR21_bronze - RSVP tunnel ABR23_to_ASBR22_bronze - RSVP tunnel ABR24_to_ASBR21_bronze - RSVP tunnel ASBR13_to_PE12_bronze - ISIS FlexAlgo tunnel ASBR14_to_PE11_bronze - ISIS FlexAlgo tunnel These tunnels are either provisioned or auto-discovered to belong to transport class 100 or 200. 12.2. Service Layer route exchange Service nodes PE11, PE12 negotiate service families (SAFI 1, 128) on the BGP session with RR16. Service helpers RR16, RR26 have multihop EBGP session to exchange service routes between the two AS. Similarly PE25 negotiates service families with RR26. Forwarding happens using service routes at service nodes PE25, PE11, PE12 only. Routes received from CEs are not present in any other nodes' FIB in the network. CE31 advertises a route for example prefix 31.31.31.31 with nexthop self to PE11, PE12. CE31 can attach a mapping community Color:0:100 on this route, to indicate its request for Gold SLA. Or, PE11 can Vairavakkalai, et al. Expires June 15, 2021 [Page 16] Internet-Draft BGP Classful Transport Planes December 2020 attach the same using locally configured policies. Let us assume CE31 is getting VPN service from PE25. The 31.31.31.31 route is readvertised in SAFI 128 by PE11 with nexthop self (1.1.1.1) and label V-L1, to RR16 with the mapping community Color:0:100 attached. This SAFI 128 route reaches PE25 via RR16, RR26 with the nexthop unchanged, as PE11 and label V-L1. Now PE25 can resolve the PNH 1.1.1.1 using transport routes received in BGP CT or BGP LU. The IP FIB at PE25 will have a route for 31.31.31.31 with a nexthop thus found, that points to a Gold tunnel in ingress domain. 12.3. Transport Layer route propagation ASBR13 negotiates BGP CT family with transport ASBRs ASBR21, ASBR22. They negotiate BGP CT family with RR27 in region 2. ABR23, ABR24 negotiate BGP CT family with RR27 in region 2 and RR26 in region 1. PE25 receives BGP CT routes from RR26. BGP LU family is also negotiated on these sessions alongside BGP CT family. BGP LU carries "best effort" transport class routes, BGP CT carries gold, bronze transport class routes. ASBR13 is provisioned with transport class 100, RD value 1.1.1.3:10 and a transport route target 0:100. And a Transport class 200 with RD value 1.1.1.3:20, and transport route target 0:200. Similarly, these transoprt classes are also configured on ASBRs, ABRs and PEs, with same transport route target, but unique RDs. Ingress route for ASBR13_to_PE11_gold is advertised by ASBR13 in BGP CT family to ASBRs ASBR21, ASBR22. This route is sent with a NLRI containing RD prefix 1.1.1.3:10:1.1.1.1, Label B-L1 and a route target extended community transport-target:0:100. MPLS swap route is installed at ASBR13 for B-L1 with a nexthop pointing to ASBR13_to_PE11_gold tunnel. Ingress route for ASBR13_to_PE11_bronze is advertised by ASBR13 in BGP CT family to ASBRs ASBR21, ASBR22. This route is sent with a NLRI containing RD prefix 1.1.1.3:20:1.1.1.1, Label B-L2 and a route target extended community transport-target:0:200. MPLS swap route is installed at ASBR13 for label B-L2 with a nexthop pointing to ASBR13_to_PE11_bronze tunnel ASBR21 receives BGP CT route 1.1.1.3:10:1.1.1.1 over the single hop EBGP sesion, and readvertises with nexthop self (loopback adderss 2.2.2.1) to RR27, advertising a new label B-L3. MPLS swap route is installed for label B-L3 at ASBR21 to swap to received label B-L1 and Vairavakkalai, et al. Expires June 15, 2021 [Page 17] Internet-Draft BGP Classful Transport Planes December 2020 forwards to ASBR13. RR27 readvertises this BGP CT route to ABR23, ABR24. ASBR22 receives BGP CT route 1.1.1.3:10:1.1.1.1 over the single hop EBGP sesion, and readvertises with nexthop self (loopback adderss 2.2.2.2) to RR27, advertising a new label B-L4. MPLS swap route is installed for label B-L4 at ASBR21 to swap to received label B-L2 and forwards to ASBR13. RR27 readvertises this BGP CT route to ABR23, ABR24. Addpath is enabled for BGP CT family on the sessions between RR27 and ASBRs, ABRs. Such that routes for 1.1.1.3:10:1.1.1.1 with the nexthops ASBR21 and ASBR22 are reflected to ABR23, ABR24 without any path hiding. Thus giving ABR23 visibiity of both available nexthops for Gold SLA. ABR23 receives the route with nexthop 2.2.2.1, label B-L3 from RR27. The route target "transport-target:0:100" on this route acts as mapping community, and instructs ABR23 to strictly resolve the nexthop using transport class 100 routes only. ABR23 is unable to find a route for 2.2.2.1 with transport class 100. Thus it considers this route unusable and does not propagate it further. This prunes ASBR21 from Gold SLA tunneled path. ABR23 also receives the route with nexthop 2.2.2.2, label B-L4 from RR27. The route target "transport-target:0:100" on this route acts as mapping community, and instructs ABR23 to strictly resolve the nexthop using transport class 100 routes only. ABR23 successfully resolves the nexthop to point to ABR23_to_ASBR22_gold tunnel. ABR23 readvertises this route with nexthop self (loopback address 2.2.2.3) and a new label B-L5 to RR26. Swap route for B-L5 is installed by ABR23 to swap to label B-L4, and forward into ABR23_to_ASBR22_gold tunnel. RR26 reflects the route from ABR23 to PE25. PE25 receives the BGP CT route for prefix 1.1.1.3:10:1.1.1.1 with label B-L5, nexthop 2.2.2.3 and transport-target:0:100 from RR26. And it similarly resolves the nexthop 2.2.2.3 over transport class 100, pushing labels associated with PE25_to_ABR23_gold tunnel. In this manner, BGP CT creates a single end-to-end gold path that traverses over gold underlay LSPs in each domain (ASBR13_to_PE11_gold, ABR23_to_ASBR22_gold and PE25_to_ABR23_gold), satisfying Gold SLA end-to-end. MPLS swap routes are installed at ASBR13, ASBR21 and ABR23, when propagating the PE11 BGP CT route with nexthop self towards PE25. Vairavakkalai, et al. Expires June 15, 2021 [Page 18] Internet-Draft BGP Classful Transport Planes December 2020 When PE25 receives service routes with nexthop 1.1.1.1 and mapping community Color:0:100, it resolves over this BGP CT route 1.1.1.3:10:1.1.1.1. Thus pushing label B-L5, and pushing as top label the labels associated with PE25_to_ABR23_gold tunnel. 12.4. Data plane view 12.4.1. Steady state This section describes how the data plane looks like in steady state. CE41 transmits an IP packet with destination as 31.31.31.31. On receiving this packet PE25 performs a lookup in the IP FIB associated with the CE41 interface. This lookup yeids the service route that pushes the VPN service label V-L1, BGP CT label B-L5, and labels for PE25_to_ABR23_gold tunnel. Thus PE25 encapsulates the IP packet in MPLS packet with label V-L1(innermost), B-L5, and top label as PE25_to_ABR23_gold tunnel. This MPLS packet is thus transmitted to ABR23 using Gold SLA. ABR23 decapsulates the packet received on PE25_to_ABR23_gold tunnel as required, and finds the MPLS packet with label B-L5. It performs lookup for label B-L5 in the global MPLS FIB. This yields the route that swaps label B-L5 with label B-L4, and pushes top label provided by ABR23_to_ASBR22_gold tunnel. Thus ABR23 transmits the MPLS packet with label B-L4 to ASBR22, on a tunnel that satisfies Gold SLA. ASBR22 similarly performs a lookup for label B-L4 in global MPLS FIB, finds the route that swaps label B-L4 with label B-L2, and forwards to ASBR13 over the directly connected MPLS enabled interface. This interface is a common resource not dedicated to any specific transport class. ASBR13 receives the MPLS packet with label B-L2, and performs a lookup in MPLS FIB, finds the route that pops label B-L2, and pushes labels associated with ASBR13_to_PE11_gold tunnel. This transmits the MPLS packet with VPN label V-L1 to PE11, using a tunnel that preserves Gold SLA in AS 1. PE11 receives the MPLS packet with V-L1, and performs VPN forwarding. Thus transmitting the original IP payload from CE41 to CE31. The payload has traversed path satisfying Gold SLA end-to-end. 12.4.2. Absorbing failure of primary path This section describes how the data plane reacts when gold path experiences a failure. Vairavakkalai, et al. Expires June 15, 2021 [Page 19] Internet-Draft BGP Classful Transport Planes December 2020 Let us assume tunnel ABR23_to_ASBR22_gold goes down, such that now end-to-end Gold path does not exist in the network. This makes the BGP CT route for RD prefix 1.1.1.1:10:1.1.1.1 unusable at ABR23. This makes ABR23 send a BGP withdrawal for 1.1.1.1:10:1.1.1.1 to RR26, which then withdraws the prefix from PE25. Withdrawal for 1.1.1.1:10:1.1.1.1 allows PE25 to react to the loss of gold path to 1.1.1.1. Let us assume PE25 is provisioned to use best- effort transport class as the backup path. This withdrawal of BGP CT route allows PE25 to adjust the nexthop of the VPN Service-route to push the labels provided by the BGP LU route. That repairs the traffic to go via best effort path. PE25 can also be provisioned to use Bronze transport class as the backup path. The repair will happen in similar manner in that case as-well. Traffic repair to absorb the failure happens at ingress node PE25, in a service prefix scale independent manner. This is called PIC (Prefix scale Independent Convergence). The repair time will be proportional to time taken for withdrawing the BGP CT route. 13. IANA Considerations This document makes following requests of IANA. 13.1. New BGP SAFI New BGP SAFI code for "Classful Transport". Value 76. This will be used to create new AFI,SAFI pairs for IPv4, IPv6 Classful Transport families. viz: o "Inet, Classful Transport". AFI/SAFI = "1/76" for carrying IPv4 Classful Transport prefixes. o "Inet6, Classful Transport". AFI/SAFI = "2/76" for carrying IPv6 Classful Transport prefixes. 13.2. New Format for BGP Extended Community Please assign a new Format (Type high = 0xa) of extended community EXT-COMM [RFC4360] called "Transport Class" from the following registries: the "BGP Transitive Extended Community Types" registry, and the "BGP Non-Transitive Extended Community Types" registry. Please assign the same low-order six bits for both allocations. Vairavakkalai, et al. Expires June 15, 2021 [Page 20] Internet-Draft BGP Classful Transport Planes December 2020 This document uses this new Format with subtype 0x2 (route target), as a transitive extended community. The Route Target thus formed is called "Transport Class" route target extended community. Taking reference of RFC7153 [RFC7153] , following requests are made: 13.2.1. Existing registries to be modified 13.2.1.1. Registries for the "Type" Field 13.2.1.1.1. Transitive Types This registry contains values of the high-order octet (the "Type" field) of a Transitive Extended Community. Registry Name: BGP Transitive Extended Community Types TYPE VALUE NAME + 0x0a Transitive Transport Class Extended + Community (Sub-Types are defined in the + "Transitive Transport Class Extended + Community Sub-Types" registry) 13.2.1.1.2. Non-Transitive Types This registry contains values of the high-order octet (the "Type" field) of a Non-transitive Extended Community. Registry Name: BGP Non-Transitive Extended Community Types TYPE VALUE NAME + 0x4a Non-Transitive Transport Class Extended + Community (Sub-Types are defined in the + "Non-Transitive Transport Class Extended + Community Sub-Types" registry) 13.2.2. New registries to be created 13.2.2.1. Transitive "Transport Class" Extended Community Sub-Types Registry Vairavakkalai, et al. Expires June 15, 2021 [Page 21] Internet-Draft BGP Classful Transport Planes December 2020 This registry contains values of the second octet (the "Sub-Type" field) of an extended community when the value of the first octet (the "Type" field) is 0x07. Registry Name: Transitive Transport Class Extended Community Sub-Types RANGE REGISTRATION PROCEDURE 0x00-0xBF First Come First Served 0xC0-0xFF IETF Review SUB-TYPE VALUE NAME 0x02 Route Target 13.2.2.2. Non-Transitive "Transport Class" Extended Community Sub-Types Registry This registry contains values of the second octet (the "Sub-Type" field) of an extended community when the value of the first octet (the "Type" field) is 0x47. Registry Name: Non-Transitive Transport Class Extended Community Sub-Types RANGE REGISTRATION PROCEDURE 0x00-0xBF First Come First Served 0xC0-0xFF IETF Review SUB-TYPE VALUE NAME 0x02 Route Target 13.3. MPLS OAM code points The following two code points are sought for Target FEC Stack sub- TLVs: o IPv4 BGP Classful Transport o IPv6 BGP Classful Transport Vairavakkalai, et al. Expires June 15, 2021 [Page 22] Internet-Draft BGP Classful Transport Planes December 2020 14. Security Considerations Mechanisms described in this document carry Transport routes in a new BGP address family. That minimizes possibility of these routes leaking outside the expected domain or mixing with service routes. When redistributing between SAFI 4 and SAFI 76 Classful Transport routes, there is a possibility of SAFI 4 routes mixing with SAFI 1 service routes. To avoid such scenarios, it is RECOMMENDED that implementations support keeping SAFI 4 routes in a separate transport RIB, distinct from service RIB that contain SAFI 1 service routes. 15. Acknowledgements The authors thank Jeff Haas, John Scudder, Navaneetha Krishnan, Ravi M R, Chandrasekar Ramachandran, Shradha Hegde, Richard Roberts, Krzysztof Szarkowicz, John E Drake, Srihari Sangli, Vijay Kestur, Santosh Kolenchery, Robert Raszuk for the valuable discussions. The decision to not reuse SAFI 128 and create a new address-family to carry these transport-routes was based on suggestion made by Richard Roberts and Krzysztof Szarkowicz. 16. References 16.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, . [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, DOI 10.17487/RFC4271, January 2006, . [RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended Communities Attribute", RFC 4360, DOI 10.17487/RFC4360, February 2006, . [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 2006, . Vairavakkalai, et al. Expires June 15, 2021 [Page 23] Internet-Draft BGP Classful Transport Planes December 2020 [RFC4684] Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk, R., Patel, K., and J. Guichard, "Constrained Route Distribution for Border Gateway Protocol/MultiProtocol Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual Private Networks (VPNs)", RFC 4684, DOI 10.17487/RFC4684, November 2006, . [RFC7153] Rosen, E. and Y. Rekhter, "IANA Registries for BGP Extended Communities", RFC 7153, DOI 10.17487/RFC7153, March 2014, . [RFC7911] Walton, D., Retana, A., Chen, E., and J. Scudder, "Advertisement of Multiple Paths in BGP", RFC 7911, DOI 10.17487/RFC7911, July 2016, . [RFC8029] Kompella, K., Swallow, G., Pignataro, C., Ed., Kumar, N., Aldrin, S., and M. Chen, "Detecting Multiprotocol Label Switched (MPLS) Data-Plane Failures", RFC 8029, DOI 10.17487/RFC8029, March 2017, . [RFC8277] Rosen, E., "Using BGP to Bind MPLS Labels to Address Prefixes", RFC 8277, DOI 10.17487/RFC8277, October 2017, . [RTC-Ext] Zhang, Z., Ed., "Route Target Constrain Extension", 07 2020, . [SRTE] Previdi, S., Ed., "Advertising Segment Routing Policies in BGP", 11 2019, . 16.2. URIs [1] https://www.rfc-editor.org/rfc/rfc4271#section-9.1.2.1 Authors' Addresses Kaliraj Vairavakkalai Juniper Networks, Inc. 1133 Innovation Way, Sunnyvale, CA 94089 US Email: kaliraj@juniper.net Vairavakkalai, et al. Expires June 15, 2021 [Page 24] Internet-Draft BGP Classful Transport Planes December 2020 Natrajan Venkataraman Juniper Networks, Inc. 1133 Innovation Way, Sunnyvale, CA 94089 US Email: natv@juniper.net Balaji Rajagopalan Juniper Networks, Inc. Electra, Exora Business Park~Marathahalli - Sarjapur Outer Ring Road, Bangalore, KA 560103 India Email: balajir@juniper.net Gyan Mishra Verizon Communications Inc. 13101 Columbia Pike Silver Spring, MD 20904 USA Email: gyan.s.mishra@verizon.com Vairavakkalai, et al. Expires June 15, 2021 [Page 25]