Internet Draft
draft-xu-bgp-gmpls-01.txt
Expiration Date: January 2002

Yangguang Xu, Lucent
Anindya Basu, Lucent
Yong Xue, UUNet/WorldCom

July 2001

A BGP/GMPLS Solution for Inter-Domain Optical Networking

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [RFC-2026], except that the right to produce derivative works is not granted.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

Current CCAMP work focuses on the intra-domain control plane. This document focuses on the inter-domain control plane. It views the overall network as a collection of administrative domains, partitioned by operators according to administrative, technological or geographical considerations. This document specifies a BGP/GMPLS based inter-domain solution for setting up circuit LSPs (Label Switched Paths) that span multiple administrative domains. A few extensions and modifications to BGP are proposed in order to disseminate appropriate topology information and calculate end-to-end circuit paths. GMPLS signaling is also extended for inter-domain connection operations. The solution specified in this draft is fundamental for inter-domain operations and can be used as a base for various applications, e.g. OVPN. It is also very flexible and scalable.

Y. Xu, et. al. [Page 1] draft-xu-bgp-gmpls-01.txt Jan. 2002

1. Summary for Sub-IP Area

SUMMARY: Please see the abstract above.

RELATED DOCUMENTS: See the Reference Section.

WHERE DOES IT FIT IN THE PICTURE OF THE SUB-IP WORK: This work fits in the Control Plane of CCAMP.

WHY IS IT TARGETED AT THIS WG: This draft specifies a BGP/GMPLS based inter-domain control plane, and both the control plane and GMPLS are in the charter of CCAMP.

JUSTIFICATION: Inter-domain solutions are as important and urgent as intra-domain solutions because (1) the backbone transport network is made up of multiple service providers, (2) Network Elements of different technologies are typically partitioned into different domains, and (3) many service providers are looking for inter-domain solutions. The solution specified in this draft is fundamental for inter-domain operations and can be used as a base for various inter-domain applications. It minimizes changes to current protocols and is also very flexible and scalable. So CCAMP should accept this work.

2. Acronyms

CAG: Client Access Group
CAP: Client Access Point
NE: Network Element
BNE: Border NE
FA: Forwarding Adjacency
PFA: Potential Forwarding Adjacency
NNI: Network Network Interface
UNI: User Network Interface
OXC: Optical Cross Connect
IDP: Initial Domain Part
AS: Autonomous System
LSP: Label Switched Path
SLA: Service Level Agreement

3. Introduction

The GMPLS architecture extends MPLS signaling protocols and other IP control protocols and applies them to non packet-switched networks. In this fashion, it enables functionality that transforms the optical transport network into an automatic switched transport network. Current work in this area has almost exclusively concentrated on LSP setup and management in a single administrative domain (i.e., in the intra-domain context). However, in operational networks, some end-to-end LSPs will span multiple service provider networks, and therefore require inter-domain signaling and information dissemination.
Typically, such LSPs would be required in the optical transport backbone network, where the IXCs and carrier's carriers would provide this functionality as a service to ISPs, CLECs, ILECs, global enterprises and so on.

In this document, we propose a method for setting up and managing inter-domain circuit LSPs. In our proposal, the information dissemination and path computation components are implemented using a few extensions to the BGP protocol, and the signaling component is implemented using GMPLS signaling extensions.

In the rest of the document, we first describe a network model that we shall use as a reference (Section 4), followed by a sample application (Section 5), followed by the description of topology information required for inter-domain circuit LSP setup (Section 6), and a description of the BGP extensions to disseminate this information (Section 7). We then describe the extensions to GMPLS signaling for LSP setup (Section 8) and security considerations (Section 9).

It should be noted that the solution proposed in this draft does not disrupt conventional IP networking or hurt its scalability. The extended BGP and the GMPLS signaling protocol run on circuit switches for setting up circuit-LSPs. They are not used for conventional packet forwarding.

4. Network Model

We consider the overall network as a collection of Autonomous Systems (AS-es). Each AS is managed by a single administrative entity and is subject to local policies. The exact partition of the network is decided by operators according to their administrative, technological and/or geographical considerations. Furthermore, a carrier may divide its network into multiple AS-es.

Each AS may consist of a single type of NE or multiple types of NEs. We recommend partitioning packet switching NEs and circuit switching NEs into different AS-es for scalability. A core circuit switch typically has very limited packet processing power compared to a core IP router.
Millions of IP routes that a core circuit switch does not care about can easily overload it. Further partitioning of the circuit switching network according to the granularity of the NEs' switching fabrics is also possible.

For circuit connection operations, some AS-es act as clients and some as providers. It should be noted that the same network may be a client in one scenario and a provider in another.

-- Clients are networks that request circuit connection services. An example is an IP network that is inter-connected by an optical transport backbone. This IP network could be an ISP, a global enterprise or the data division of a carrier.
-- Providers are networks that provide circuit connection services. They make up the optical transport backbone. A provider AS consists of a collection of circuit switches.

If two IP client networks are connected through a provider SONET network, these two client networks maintain conventional IP adjacency for IP layer operations. Meanwhile, they have adjacencies with the provider SONET network for circuit connection operations.

Below are some terms that are used in this draft.

-- We refer to a network element (NE) at the edge of an AS as a Border NE (BNE).
-- An IP BNE has two types of external neighbors. (1) A packet switching neighbor is its conventional IP router neighbor in another AS. They may be connected through either a dedicated line or an underlying circuit switching network. (2) A circuit switching neighbor is its circuit switch neighbor in another AS. They may be physically adjacent or connected through dedicated lines.
-- The AS BNE that receives a connection request from another AS is called the AS ingress BNE.
-- The AS BNE that sends a connection request to another AS is called the AS egress BNE.
-- A client BNE attaches to a provider BNE through Client Access Points (CAPs). Each CAP represents a physical/logical interface on a client BNE.
-- A group of CAPs that share the same physical and/or logical attributes that are significant to circuit connection operation is referred to as a Client Access Group (CAG). This implies that individual CAPs belonging to the same CAG are treated equivalently in the routing calculation. Indeed, only CAG information is disseminated through the network. The CAP information is locally maintained by provider BNEs. On receipt of a path setup request, an AS ingress BNE chooses one of the CAPs in the requested CAG and connects to it. This decision is made locally at the BNE.

The goal of this BGP/GMPLS proposal is to set up circuit connections between CAPs, subject to various routing constraints. We refer to these connections as circuit-LSPs (Label Switched Paths). A circuit-LSP may span multiple provider AS-es.

5. A Sample Configuration - Optical VPN

We now provide an optical VPN application as an example to illustrate how this BGP/GMPLS solution works.

[Figure 1: Optical VPN Example. The diagram shows client A (locations 1, 2 and 3, routers A1 through A7), client B (routers B1 and B2), Provider X (X1 through X5) and Provider Y (Y1 through Y3), including the chains A1 - A2 - X1 - X2 - Y1 - Y2 - A3 - A4 and B1 - B2 - X3 - X4 - Y3.]

Consider the network shown in Figure 1. In this network,

-- Providers X and Y are two separate AS-es. We refer to the AS for provider X as network X and the AS for provider Y as network Y. X1, X2, X3, X4, X5, Y1, Y2 and Y3 are Optical Cross Connects (OXCs).
-- There are two clients, A and B. A1, A2, ..., A7, B1 and B2 are IP routers. Client A has three locations: 1, 2 and 3.
   * Client A location 1 connects to Provider X at X1 through A2
   * Client A location 2 connects to Provider Y at Y2 through A3
   * Client A location 3 connects to Provider Y at Y2 through A5
   * Client A location 3 also connects to Provider Y at Y3 through A7
-- IP routers A2, B2, A3, A5 and A7 are connected to their providers through multiple optical interfaces (not shown in the figure). For example, A2 is connected to X1 through 4 OC-48c interfaces, each of which represents a CAP. In our example, all the CAPs on a single router (such as A2 or A3) form a CAG. We use the router ID to refer to the CAG that consists of the CAPs on the router. For example, we refer to the CAG on router A2 as CAG A2.
-- A2, A3, A5, A7, X1, X2, X3, X4, Y1, Y2 and Y3 are BNEs. They are also BGP speakers. A2 treats X1 as a BGP circuit switching neighbor, X2 treats Y1 as a BGP circuit switching neighbor, and so on. For the control traffic, circuit switching neighbors have dedicated control channels. A2, A3, A5 and A7 may run conventional IP BGP. They are either FAs or PFAs of each other (Section 6.1). For example,
   * If A2 and A5 are connected through an optical path, they are FAs. Their BGP speakers are packet switching neighbors.
   * If A2 and A3 are not connected through an optical path, they are PFAs.

5.1 Operation Scenario - Time Sequence

1. A2 registers service and local CAPs with X1; A3 registers service and local CAPs with Y2; A5 registers service and local CAPs with Y2; A7 registers service and local CAPs with Y3; B2 registers service and local CAPs with X3.

2. BNEs (X1 and X3) in network X aggregate local CAPs into CAG routes and disseminate them through I-BGP to each other and to other BNEs (X2 and X4). BNEs in network Y aggregate local CAPs into CAG routes and disseminate them through I-BGP to each other and to other BNEs.

3. Y2 and Y3 disseminate A3 information to A5 and A7 respectively because they belong to the same client (Client A).
For the same reason, Y2 disseminates A5's and A7's information to A3.

4. X and Y exchange CAG routes through E-BGP according to business agreements. In this fashion, locations 1, 2 and 3 of Client A learn each other's CAGs. Note that this kind of information dissemination provides a directory service for client networks. The directory advertises (to client networks) which resources are accessible through provider networks.

5. Note that information dissemination is controlled by local business agreements. For example, Client B may know nothing about Client A if the business agreement between Client A and provider X prohibits provider X from disseminating Client A's information to Client B.

6. While disseminating the CAG information, OXCs in networks X and Y maintain different attributes related to CAG routes, including the BGP Next-Hop information. They also calculate the local degree of preference for each CAG route.

7. If an interface on A2 decides to establish an optical path to A7, it (the interface) initiates a GMPLS signaling request with destination CAG = A7.

8. Each AS ingress BGP speaker decides which BGP Next Hop to choose for the connection when it receives the GMPLS connection request. This decision is later fed into the intra-domain routing process for intra-domain circuit-LSP creation.

9. Step 8 is repeated at each AS. Eventually, GMPLS signaling messages propagate all the way to A7 and a connection between A2 and A7 is set up.

Note that in addition to a destination, A2 can also specify other constraints in the path setup request, such as minimum bandwidth available, AS-es that the path must go through, AS-es that the path must avoid, and so on. If possible, the intermediate providers shall try to satisfy these constraints as the request propagates across the network domains.
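Steps 7 through 9 can be pictured with a small simulation. The following is only a sketch under stated assumptions: the route table, the preference values and the names `CAG_ROUTES`, `AS_OF` and `setup_path` are hypothetical and are not defined by this draft. It shows each AS ingress BNE independently picking a BGP Next Hop for destination CAG A7, so that the end-to-end path emerges hop by hop.

```python
# Hypothetical per-AS routing state: for each destination CAG, a list of
# (AS egress BNE, ingress BNE of the next AS or the destination itself,
#  local degree of preference). All values are invented for illustration.
CAG_ROUTES = {
    "X": {"A7": [("X4", "Y3", 200), ("X2", "Y1", 100)]},
    "Y": {"A7": [("Y3", "A7", 100)]},
}

# Which AS each provider ingress BNE belongs to (taken from the Figure 1 example).
AS_OF = {"X1": "X", "Y1": "Y", "Y3": "Y"}


def setup_path(first_ingress, dest_cag):
    """Return the (AS, ingress BNE, egress BNE) triples chosen hop by hop."""
    hops = []
    ingress = first_ingress
    while ingress != dest_cag:
        local_as = AS_OF[ingress]
        # Step 8: the AS ingress BNE picks the candidate with the highest
        # local degree of preference; no other AS takes part in the choice.
        egress, next_ingress, _pref = max(
            CAG_ROUTES[local_as][dest_cag], key=lambda c: c[2]
        )
        hops.append((local_as, ingress, egress))
        # Step 9: the request moves on to the next AS (or the destination).
        ingress = next_ingress
    return hops


if __name__ == "__main__":
    # A2 signals toward CAG A7; X1 is the first AS ingress BNE.
    for hop in setup_path("X1", "A7"):
        print(hop)
```

In this sketch Y3 acts as both ingress and egress BNE for network Y, since it attaches to both X4 and A7 in the example. The intra-domain segment between each ingress and egress BNE would be computed separately by the intra-domain routing process.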
In the next section, we provide a description of the topology information that needs to be maintained by the NEs to provide the kind of functionality described above.

6. Network Topology View for Inter-domain Circuit Connection Operations

In order to request, set up and manage circuit-LSPs that span multiple AS-es, both the client NEs and the provider NEs require access to certain network topology information. However, client and provider NEs usually have different views of the network topology. This is because local policies may prevent a provider from disclosing detailed topology information to its clients and to other providers. We now describe in detail the network topology view in both clients and providers.

6.1. An IP Client Network Topology View

The network topology view at a client consists of the following:

1. Conventional Routes to IP peers

These are standard IP routes to various destinations learnt through conventional IGPs (e.g., IS-IS or OSPF) as well as EGPs (e.g., BGP).

2. Forwarding Adjacencies

Forwarding Adjacencies [LSP-HIER] are IP (i.e. Layer 3) neighbors that are connected by an underlying circuit-LSP through a provider network. These connections are dynamic in nature and can be set up and torn down as required.

Both IP routes and Forwarding Adjacencies are used for packet forwarding. Such information is disseminated using the usual IGP/EGP routing protocols. A provider network treats traffic carrying this kind of information as user traffic. In addition to these two types of information, a client network also maintains information about:

3. Potential FAs

PFAs are remote CAGs. They represent NEs that belong to the same client network (in a remote location) or to a different client network that allows NEs in the local network to connect to it. The NEs represented by a PFA are not connected yet but can be connected to the local NE.
Such a connection is a direct connection at the IP layer with an underlying physical connection that spans multiple AS-es. Information about PFAs can be disseminated through BGP extensions in the provider network or through some other directory server/yellow pages mechanism.

4. Accessible IP Routes through the PFAs

This provides information about all the IP routers reachable through a PFA endpoint. For example, in Figure 1, a route to A1 would be an accessible IP route reachable through PFA A2 from the viewpoint of router A5. Such information can be used by the IP Network Engineering [NE-FRWK] function to determine the source and destination of the optical circuit connections.

Although this document does not address the issue of how this information is disseminated, we note here that various options exist for doing so. These options include dissemination through established client network connections, a dedicated client control plane network, or a provider's control plane network. As this information may be proprietary and large in size, the last choice may require service contracts between a client and a provider.

5. Filtered/Abstracted Topology Information of the Provider Network

This information is leaked from a provider to a client in accordance with their business agreement. It can be used by client networks to specify path constraints within a provider's network. The nature of this information can vary from no information to full link state information about the provider network.

6.2. Provider Network Topology View

In provider networks, both BNEs and non-BNEs have

1. Intra-domain link state information for intra-domain connection operation.

This information can be disseminated through an IGP.

Note that a provider does not need to store information about the topology of other AS-es (learnt through BGP) for setting up circuit-LSPs.
This is because the decision about which BGP Next Hop to choose is made locally at each AS, without any input from the previous AS on the path.

When a circuit-LSP setup request arrives, an AS ingress BNE has to determine the BGP Next Hop (or the AS egress BNE) for the destination CAG. For this purpose, a provider BNE requires the following information:

2. CAG routes from client networks that are directly connected to the provider network.

3. CAG routes from client networks that are directly connected to other provider networks.

These two types of information are disseminated through an EGP. Provider BNEs do not care about routes from other provider networks because end-to-end circuit-LSPs do not terminate there.

7. BGP Extensions

BGP is used to disseminate information necessary for inter-domain circuit connection operations. More specifically, BGP could be used to provide three kinds of functionality: (1) basic routing functions for circuit connection setup, (2) distributed directory services, and (3) dissemination of abstracted/filtered topology information.

1. Basic Routing Functions

The BGP basic routing functions serve two purposes: (a) dissemination of CAG routes, which includes aggregation of CAG routes and distribution of CAG routes subject to AS specific policies, and (b) path selection subject to client specified constraints and/or BGP selection rules.

Note that these functions use aggregated CAG routes. This is in contrast to the distributed directory service function (described below), which does not allow aggregation of CAG routes. Furthermore, the path selection is done on a hop-by-hop basis for the inter-domain case. In other words, the BNEs in each AS on the path independently choose the BGP Next Hop. This is because detailed information about an AS topology is not disseminated to other AS-es. Consequently, the path chosen by such a distributed process may be sub-optimal.

2. Distributed Directory Service

BGP can also be extended to disseminate PFA information. It tells a client NE (and other NEs in the client network) about the individual CAGs they can connect to. In this case, BGP provides functionality akin to that of a distributed directory service, and corresponds to a "push" scenario (vs. the typical "pull" scenario in a conventional directory server/yellow pages) where the directory information is "pushed" to clients.

When BGP distributes CAG information as a directory service, aggregation should not be performed since the clients need to know about each CAG individually. Furthermore, distribution of CAG information may need to be controlled. For example, a client network may only want to disseminate its CAG information to certain client networks. Extended attributes could be used to implement this functionality.

Finally, PFAs can also be used for routing since they are simply non-aggregated CAG routes.

3. Abstracted/Filtered Topology Information

SLAs between a client and a provider may allow the provider to leak certain topology information about the provider network to the client for some applications. For example, the client could specify certain routing constraints for its circuit-LSP setup requests. BGP can also be extended to disseminate this type of information. This extension needs further study and will be covered in later versions of this document.

In the next few subsections, we describe the BGP extensions necessary to disseminate the topology information discussed above.

7.1 Client Access Group Address Family

A transport backbone provider should be able to serve different types of clients, e.g. SONET/SDH, PDH, ATM or IP networks. The requirements are:

1. Each client may have its own preferred addressing mode, e.g. IPv4, IPv6, NSAP etc.

2. For the same addressing scheme, each client may use its own private address space.
This implies that a given address may denote different entities in different client networks.

Therefore, we need a new address family for CAGs to accommodate different network scenarios. Inter-domain routing is only interested in the CAG address, which should be globally unique for the inter-domain case. A CAG can be either numbered or un-numbered [GMPLS-BUND]. Need to explain this. A CAG address is an external address and may be different from the client address that is used for internal routing.

A CAG address can be generated in two different ways:

1. Using a client self-assigned address

This approach makes it easier for clients to map between internal addresses and external addresses. However, it is hard for providers to aggregate client assigned addresses. At the same time, providers may wish to keep client information private. A CAG route using this type of address structure may disclose a client's identity.

2. Using a provider assigned address

This approach gives providers better control, easy address aggregation and client information hiding. However, it is more difficult for clients to map between CAG addresses and internal addresses.

In both approaches,

1. If the CAG address is globally unique, then it is presented as it is.

2. If the CAG address is not globally unique, it should be prefixed with a globally unique network ID.

We now specify a new address family for CAGs. Such an address family enables clients/providers using different address schemes to share a common addressing structure for inter-domain circuit-LSP operations. The address family has an overall length of 22 bytes and consists of two fields as shown below:

   +---------------+------------------------------------------------+
   | Type(2 bytes) |                Value(20 bytes)                 |
   +---------------+------------------------------------------------+

Address Type: 2 bytes

This field indicates the address type for the value field.
   Value   Type
   -----   ----
     1     Numbered Public IPv4
     2     Numbered Public IPv6
     3     Un-numbered Public IPv4
     4     Un-numbered Public IPv6
     5     NSAP
     6     Numbered Private IPv4
     7     Un-numbered Private IPv4

Value: 20 bytes

The value field is formatted and interpreted according to the Address Type field. The 20 bytes are padded with zeros if not fully occupied.

-- If Address Type = Numbered Public IPv4, the value is a public IPv4 address.
-- If Address Type = Numbered Public IPv6, the value is a public IPv6 address.
-- If Address Type = Un-numbered Public IPv4, the value is a public IPv4 address followed by a 4-byte locally unique interface ID.
-- If Address Type = Un-numbered Public IPv6, the value is a public IPv6 address followed by a 4-byte locally unique interface ID.
-- If Address Type = NSAP, the value is a public NSAP address.
-- If Address Type = Numbered Private IPv4, the value has a 4-byte "Organization Distinguisher" (OD) before the private IPv4 address. An OD is a 2- or 4-byte Autonomous System number. These numbers must be assigned by IANA and are publicly known.
-- If Address Type = Un-numbered Private IPv4, the value has a 4-byte "Organization Distinguisher" (OD) before the private IPv4 address. It also has a 4-byte locally unique interface ID. An OD is a 2- or 4-byte Autonomous System number. These numbers must be assigned by IANA and are publicly known.

7.2 Client Access Group NLRI

The MP-REACH-NLRI [BGP-MP] attribute is extended to support the CAG addressing scheme.

1. The AFI field needs a new type for the CAG address family (IANA consideration). On receiving this new type of NLRI, the BGP speaker should understand that the NLRI is not used for conventional packet forwarding. Instead, it should follow the procedures defined in this document for inter-domain circuit-LSP setup operations. A BGP speaker that does not understand the CAG address family shall ignore the attribute.
Whether a BGP speaker can interpret the CAG address family can be negotiated during BGP capabilities negotiation [BGP-CAP].

2. The Network Address of Next Hop field has the same format as the new CAG address family. It is the network address of the egress BNE on the path to the destination system.

3. NLRIs are the CAG addresses or CAG address prefixes that are accessible through the given next hop.

7.3 New Path Attributes and Extended Community Attributes

7.3.1 CAG_TYPE (type code TBD)

This is an optional transitive attribute that defines the CAG type. It is encoded in the format shown below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Encod. Type  |     Resv.     |        Interface Type         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  G-PID Num.   |                    G-PIDs                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                            G-PIDs                             ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Encoding Type: 1 byte

This field is the same as the LSP Encoding Type field in [GMPLS-SIG]. It indicates the encoding type of the CAG. The permitted values and their meanings are:

   Value   Type
   -----   ----
     1     Packet
     2     Ethernet V2/DIX
     3     ANSI PDH
     4     ETSI PDH
     5     SDH ITU-T G.707 1996
     6     SONET ANSI T1.105-1995
     7     Digital Wrapper
     8     Lambda (photonic)
     9     Fiber
    10     Ethernet 802.3
    11     SDH ITU-T G.707 2000
    12     SONET ANSI T1.105-2000

Reserved: 1 byte

This field is reserved. It MUST be set to zero on transmission and MUST be ignored on receipt.

Interface Type: 2 bytes

The Interface Type field further specifies the physical attributes of a CAG. This field is interpreted according to the Encoding Type field. Details of this field are TBD.

G-PID Number: 1 byte

A CAG may support multiple types of payload. Each payload type can be identified by a G-PID as defined in the G-PID field in [GMPLS-SIG].
This field indicates how many types of payload the CAG can support.

G-PID: 1 byte

This field indicates a G-PID that can be supported by this CAG. It uses the values defined for the G-PID field in [GMPLS-SIG].

This attribute is used to control CAG route dissemination. A provider disseminates a remote CAG route to a client only if the client BNE's CAG is compatible with the remote CAG according to their CAG_TYPEs. Similarly, a BNE in a provider network shall disseminate the CAG route to another BNE (the remote BNE) only if the remote BNE can support connections compatible with the CAG_TYPE. However, between provider BNEs, payload types MUST NOT be considered.

7.3.2 NO_AGGREGATION (type code TBD)

This is an optional transitive attribute that indicates a CAG route that should not be aggregated and MUST be disseminated individually. This attribute is defined to enable the distributed directory service.

7.3.3 DISCLOSE_SET (type code TBD)

This is an optional transitive attribute that defines the client networks to which this route should be disseminated. The format of this attribute is shown below:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              Number of Client Networks (2 bytes)              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            First Client Network ID (in TLV format)            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Last Client Network ID (in TLV format)             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

A client network is identified either by its AS number, for networks using IP addresses, or by its initial domain part (IDP), for networks using an NSAP based addressing structure. The Client Network ID field is encoded in TLV format. If the Number of Client Networks field is all "1"s, this route should be "pushed" to all client/provider networks.
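The DISCLOSE_SET layout above can be illustrated with a short encoder/decoder. This is a minimal sketch, not a normative encoding: the draft does not define the TLV layout of a Client Network ID, so the 1-byte type and 1-byte length used here, the type codes `TLV_AS_NUMBER` and `TLV_IDP`, and the function names are all assumptions made for illustration. Only the 2-byte count and the all-"1"s convention come from the text.

```python
import struct

# From the text: a 2-byte count whose all-ones value means "push to everyone".
ALL_NETWORKS = 0xFFFF

# Hypothetical TLV type codes; the draft does not assign any.
TLV_AS_NUMBER = 1
TLV_IDP = 2


def encode_disclose_set(network_ids=None):
    """Encode DISCLOSE_SET; network_ids is a list of (tlv_type, value_bytes).

    Passing None encodes the "all client/provider networks" form.
    """
    if network_ids is None:
        return struct.pack("!H", ALL_NETWORKS)
    body = struct.pack("!H", len(network_ids))
    for tlv_type, value in network_ids:
        # Assumed TLV header: 1-byte type, 1-byte length, then the value.
        body += struct.pack("!BB", tlv_type, len(value)) + value
    return body


def decode_disclose_set(data):
    """Inverse of encode_disclose_set; returns None for the all-ones form."""
    (count,) = struct.unpack_from("!H", data, 0)
    if count == ALL_NETWORKS:
        return None
    offset, ids = 2, []
    for _ in range(count):
        tlv_type, length = struct.unpack_from("!BB", data, offset)
        offset += 2
        ids.append((tlv_type, data[offset:offset + length]))
        offset += length
    return ids


if __name__ == "__main__":
    ids = [(TLV_AS_NUMBER, struct.pack("!H", 65001)), (TLV_IDP, b"\x47\x00\x05")]
    wire = encode_disclose_set(ids)
    print(wire.hex(), decode_disclose_set(wire) == ids)
```

Reserving the all-ones count as a wildcard means a receiver must check for it before using the field as a loop bound, which is why the decoder tests it first.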
7.3.4 EXTENDED-LINK-BANDWIDTH (type code TBD)

This attribute is extended from the Link Bandwidth Community [BGP-EXTCOMM]. It follows the operation rules defined in [BGP-EXTCOMM].

This attribute represents the TE attributes of the link that connects the neighbor AS to the local AS. This attribute can be used to calculate the degree of local preference for the route.

The EXTENDED-LINK-BANDWIDTH attribute consists of four fields, representing the Total Link Bandwidth, the Maximum Reservable Bandwidth, the Minimum Reservable Bandwidth and the Total Unreserved Bandwidth (as in [GMPLS-OSPF] and [GMPLS-ISIS]). Each field is 4 octets in size and represents a number in IEEE floating point format.

7.3.5 CAG_ORIGINATOR (type code TBD)

This attribute is an optional transitive attribute that indicates the network ID of the CAG route originator. Typically, the CAG route originator is the client BNE that injects the route into the provider network. The CAG_ORIGINATOR attribute is encoded as a TLV with the same format as the Client Network ID field in the DISCLOSE_SET attribute.

7.4 CAG Route Dissemination Control at Client/Provider Interface

At this interface, the information flow is asymmetric.

(1) From Client BNE to Provider BNE

A client BNE registers its information with the attached provider BNE through neighbor discovery, client NE registration and service discovery procedures, such as those defined in [LMP]. The route distribution process can be controlled by the client BNE that originates the route, by the provider BNE that is directly attached to the CAP, or by a combination of the two. In each case, the controlling entity associates a set of attributes with the route that determine how and where the route is going to be distributed. When both the client BNE and the provider BNE control the distribution, each of them attaches a separate set of attributes to the route.
A provider MAY ignore any or all of the attributes that have been attached by a client. For example, if a client-attached attribute conflicts with the local policies at a provider, the provider shall ignore the offending attribute.

(2) From Provider BNE to Client BNE

A provider may disseminate certain information about certain CAGs to a client BNE (the distributed directory service). Typically, this is controlled by the policies derived from agreements between a client and its provider. When a provider BNE receives a PFA through BGP (E-BGP or I-BGP), it disseminates it to the client BNE only if:

1. The CAG_TYPE of the route matches the type of connection that can be supported by the client BNE, and

2. The route originator and the client NE belong to the same network according to the client network ID (AS number or IDP), or

3. The client's network ID matches one of the network IDs in the CAG route's DISCLOSE_SET attribute, or the "Number of Client Networks" field in DISCLOSE_SET is all "1"s, or

4. One of the route's Export Targets [BGP-MPLS/VPN] matches one of the Import Targets for the client BNE. Note that Import Targets here refer to a set of attributes that a provider associates with a client BNE. These attributes, in conjunction with the Export Target attributes of a route, help to determine the client BNEs to which a CAG route is distributed.

If a provider BNE decides to disseminate a CAG route to a client BNE, it may choose to truncate the CAG route's AS-PATH attribute if it does not want the client to know this information. However, this MUST NOT be allowed between providers.

7.5 Route Dissemination Control at Provider/Provider Interface

The information exchanged at the provider/provider interface consists of CAG routes. CAG routes without NO_AGGREGATION may be aggregated by a provider. CAG route distribution between providers is mainly controlled by local policies.
For example, a provider may want to enforce certain transit policies. BGP route attributes and community attributes help operators control what to disseminate and where.

When a provider BNE receives a CAG route through BGP (I-BGP or E-BGP), it puts the route into the Adj-RIB-Out for an E-BGP neighbor only if:

1. The E-BGP neighbor can handle connections of the type specified by CAG-TYPE, and

2. The "Number of Client Networks" field in DISCLOSE-SET is all "1"s, or

3. One of the CAG route's Export Targets matches one of the Import Targets for the Adj-RIB-Out for the E-BGP neighbor.

7.6 CAG Route Dissemination via I-BGP

The default I-BGP behavior is that a router does not change the Next Hop attribute of a route learned via an E-BGP session when it advertises the route via I-BGP. For the dissemination of CAG routes, this behavior is changed, because the IGP in circuit networks only deals with circuit connections within the domain and does not need to know any external routes. Therefore, the EGP and the IGP do not need to exchange topology information for circuit-LSP set up operations.

A provider BNE puts its own address as the Next Hop when disseminating a CAG route (received via E-BGP) to other BNEs through I-BGP. This simplifies the protocol by avoiding unnecessary IGP/EGP interaction. The BGP route selection procedure described below chooses the AS ingress and egress points. CSPF and the IGP link state databases are then used to compute the exact path within the domain between the AS ingress and egress points.

When Route Reflectors are used [BGP-REF], a route reflector leaves the Next Hop attribute unchanged when it receives a route from one of its clients via I-BGP or from one of its Route Reflector peers. Client nodes continue to behave in the way described in the previous paragraph.
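The Next Hop handling described in Section 7.6 can be sketched as follows. This is a minimal illustration in Python and is not part of the protocol specification. The names CagRoute, ibgp_next_hop and learned_via, the "ebgp"/"ibgp" labels and the addresses are hypothetical and chosen only for this example. The sketch also assumes the standard BGP rule that a speaker that is not a route reflector does not re-advertise I-BGP-learned routes to other I-BGP peers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CagRoute:
    """Hypothetical, simplified view of a CAG route held by a BGP speaker."""
    cag: str          # CAG identifier (illustrative string form)
    next_hop: str     # Next Hop attribute as received
    learned_via: str  # "ebgp" or "ibgp" (labels assumed for this sketch)


def ibgp_next_hop(route: CagRoute, local_address: str,
                  is_route_reflector: bool) -> Optional[str]:
    """Return the Next Hop to use when advertising `route` over I-BGP.

    Returns None when the route is not re-advertised over I-BGP at all.
    """
    if route.learned_via == "ebgp":
        # A provider BNE puts its own address as the Next Hop for a CAG
        # route received via E-BGP, so the IGP never needs external routes.
        return local_address
    if route.learned_via == "ibgp" and is_route_reflector:
        # A route reflector leaves the Next Hop attribute unchanged.
        return route.next_hop
    # Assumption (standard BGP behavior, not stated in this section): a
    # non-reflector does not pass I-BGP-learned routes to I-BGP peers.
    return None


# Example: a BNE at 192.0.2.2 learned a CAG route from an E-BGP neighbor.
external = CagRoute(cag="cag-b", next_hop="198.51.100.1", learned_via="ebgp")
advertised = ibgp_next_hop(external, "192.0.2.2", is_route_reflector=False)
```

In this example, `advertised` is the BNE's own address, so other BNEs in the AS treat that BNE as the AS egress point and use CSPF over the IGP database to reach it.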
7.7 Circuit Connection Route Selection Process

For inter-domain circuit connections, path selection is typically done hop by hop (here a hop is at AS granularity). This is because a provider's local policies may conflict with an explicitly source routed path from a previous hop, which could be either a client or another provider. Since the next hop selection at every stage is done locally, the end-to-end path that is eventually used may not be optimal.

The inputs to this route selection process are:

1. The set of routes to the destination CAG (or the longest CAG prefix, in case aggregation is allowed) that have been learned and accepted by the local system, and

2. A set of routing constraints from clients or other providers.

The output of the route selection is the selected BGP next hop and some modified constraints (e.g. ER object/TLV).

Path selection involves several steps:

1. Calculation of the degree of preference.

This step should be executed whenever the local BGP speaker receives an UPDATE message from a peer located in a neighboring AS that advertises a new route, a replacement route, or a withdrawn route. It should also be executed whenever a BGP Next Hop's TE information has changed significantly (i.e., the magnitude of the change has exceeded a threshold). The BGP Next Hop's TE information can be obtained from the IGP and the EXTENDED-LINK-BANDWIDTH community attribute.

For each new route, the local BGP speaker shall determine a local degree of preference. The local degree of preference represents a local decision that may require TE, economic, policy and service related analysis and decision-making.

2. Route selection.

This step is invoked on completion of step 1 and on receipt of a circuit-LSP set up request. It is a constraint-based route selection procedure.

The selection process is based on the local degree of preference.
In essence, the route with the highest local degree of preference that satisfies all the constraints in the set up request is chosen. If there are ties in the route selection, the local system can use the tie-breaking rules below (see [BGP-4]):

   a. The route with the highest LOCAL-PREF is selected.

   b. If there is a tie, the route with the shortest AS-PATH is selected.

   c. If there is a tie, and multiple routes were learned from the same neighboring AS, the route with the lowest MULTI-EXIT-DISCRIMINATOR value is selected.

   d. If there is a tie, the route with the minimum IGP cost to the Next Hop is selected.

   e. If there is a tie and all routes were learned via I-BGP, the system goes to step f. Otherwise, if one or more routes were learned via E-BGP, the route learned from the E-BGP neighbor with the lowest BGP ID is selected.

   f. If all routes were learned via I-BGP, the route that was learned from the I-BGP neighbor with the lowest BGP ID is selected.

3. Inter-domain signaling relay.

Details are covered in the inter-domain signaling section.

7.8 Scalability Enhancement

Route reflectors and/or confederations can be used to enhance scalability for I-BGP connections. Furthermore, it may happen that a CAG route is being disseminated to a router that always discards it because of local policies. In such cases, BGP traffic can be considerably reduced by using the cooperative route filtering capability [BGP-ORF] if the BGP speakers support it. In this case, a BGP speaker can distribute a route filter to all its neighbors indicating the subset of routes that it is interested in.

8 Inter-Domain Extension to GMPLS Signaling

Current work on GMPLS signaling has mainly focused on intra-domain signaling [GMPLS-SIG]. This section specifies the signaling required between two different AS-es. The main differences between inter-domain signaling and intra-domain signaling are:

1. The hop granularity for inter-domain signaling is the AS, while the hop granularity for intra-domain signaling is the area or the NE.

2. Inter-domain signaling has to take into account business related issues such as service level agreements, policy and security. In contrast, intra-domain signaling does not consider such issues.

8.1 New Objects/TLVs

In this subsection, we define the objects/TLVs that are specific to inter-domain signaling. Detailed formats of these objects/TLVs are protocol dependent.

8.1.1 Routing Constraint Related Objects/TLVs

1. Inter-Domain Explicit Route

Both RSVP-TE and CR-LDP have defined objects/TLVs for explicit routing. In either case, the ER-HOP sub-object/TLV can be either an IP address or an AS number. In practice, it is desirable to have a separate inter-domain explicit route object/TLV. This ensures that the natural separation of intra-domain and inter-domain routing operations is maintained. The Inter-Domain Explicit Routing Object/TLV contains the following components:

(1) Source CAG (mandatory)

(2) Destination CAG or Network ID (mandatory)

The destination of a circuit-LSP set up request can be either a PFA (a specific CAG) or the Network ID of another client. If a client network wants to connect to another client network and does not care which specific CAG it will be connected to, it can simply specify the client network ID (AS number or IDP) as the destination in its request.

For example, consider a regional ISP that wants an OC-12c connection to an Internet Backbone Network, which has multiple CAGs. The ISP may specify the Network ID of the destination client network in its connection request to the service provider. The request propagates until it reaches the intermediate provider that has multiple choices with respect to the destination CAG.
At this point, the intermediate provider determines the best CAG that the regional ISP should attach to, based on the local policies and the constraints embedded in the request.

(3) Intermediate AS list (optional)

Since inter-domain routing is done hop by hop, the intermediate AS list is typically empty. If a provider wants to specify the AS list, the list MUST be an all-strict AS list; otherwise, it is ignored. An all-strict AS list MUST be followed by subsequent AS-es; otherwise, an error condition should be returned indicating that the ER AS-List cannot be met.

2. AS Record Route

Since inter-domain routing is done hop by hop, it could lead to loops in circuit-LSP set ups. The AS Record Route Object/TLV is used to prevent such routing loops. When an AS ingress BNE selects the AS egress BNE, it ensures that the chosen BNE does not directly connect to an AS that is already listed in the AS Record Route Object/TLV. The ingress BNE can determine the AS that the (potential) egress BNE connects to from the route's AS-PATH attribute.

An AS Record Route Object/TLV consists of a series of variable length sub-objects, each of which is a 2-byte or a 4-byte AS number or IDP.

The AS Record Route Object/TLV is only used by provider networks for circuit-LSP set up operations. Providers may choose not to propagate this information to their clients.

3. Avoid AS List

A client or a provider may want to avoid transiting a specific provider because of business considerations. It can use the Avoid AS List Object/TLV to specify a list of AS-es that should be avoided when a circuit-LSP is being set up.

The Avoid AS List Object/TLV also enables a client to create AS diversified paths, if possible. To accomplish this, the client first sets up a path and then requests a second path, using the first path's AS Record Route information as the Avoid AS List Object/TLV for the second path set up request.

The AS-es specified in the Avoid AS List MUST be avoided during circuit-LSP set up. Otherwise, an error should be returned indicating that the Avoid AS List constraint cannot be met.

4. Avoid Connection ID and Diversity Options

A client may want to specify diversified paths at different granularities. The Avoid Connection ID and the Diversity Options Objects/TLVs can be used together for this purpose. The Avoid Connection ID Object/TLV specifies the connection ID of the circuit-LSP for which a diversified circuit-LSP is being requested. The Diversity Options Object/TLV is the same as the Diversity Options Object defined in [OIF-UNI]. It can specify link level, node level or SRLG level diversification.

For example, suppose a circuit-LSP with ID 19 exists between NEs A and B. If NE A wishes to set up another circuit-LSP that does not share any links with circuit-LSP 19, it uses an Avoid Connection ID Object/TLV with ID 19, and a Diversity Options Object/TLV specifying link level diversification.

Note that in order for this scheme to work, connection IDs must be globally unique and should be maintained by all the NEs on the circuit-LSP path.

8.1.2 Security, Service, Policy and Contract Related Objects/TLVs

These types of objects are negotiated between neighboring networks and only have local significance. It should be noted that client networks that are connected through underlying provider networks should also be treated as neighbors.

When signaling messages cross AS boundaries, they should include some of these objects in accordance with the business agreements between neighbor AS-es. Each AS must validate the information in these objects in order to guarantee the integrity of network operation and to enforce business agreements.

For circuit connections over multiple providers, two sets of such objects are required:

-- The first set is for neighbor to neighbor (client/provider, provider/provider, or provider/client) relationships.
-- The second set is for client to client relationships (i.e., between endpoints). This set should be transparent to providers.

Details of these objects are TBD.

8.2 Connection Operation Procedures

A circuit-LSP in a circuit switched network consists of two logical components:

1. A link connection (as in a Forwarding Adjacency) component that represents the logical connection between the two client networks that correspond to the source and the destination CAGs, and

2. A network connection component that represents the underlying circuit through the intermediate provider networks.

It is possible that the client network to which the destination CAG belongs may not want to accept the connection request from the originator (e.g., due to policy reasons). Hence, it is desirable to verify the link connection request at the destination client NE before initiating the actual network connection creation. This is done using a three-stage process during the inter-domain circuit-LSP set up:

                 |               |               |
   +--+     +--+ | +--+     +--+ | +--+     +--+ | +--+     +--+
   |A1|-///-|A2|-+-|X1|-///-|X2|-+-|Y1|-///-|Y2|-+-|B1|-///-|B2|
   +--+     +--+ | +--+     +--+ | +--+     +--+ | +--+     +--+
                 |               |               |
       AS 1            AS 2            AS 3            AS 4
     Client A       Provider X      Provider Y       Client B

            Figure 2: Inter-domain Signaling Scenario

1. Link Connection Verification Stage

This stage ensures that the destination client NE is willing to set up the connection and has enough capacity for the link connection.

In the scenario in Figure 2, A and B are clients, and X and Y are providers. A2 initiates the request, reserves the local resource and then forwards the request to X1. X1 selects the BGP Next Hop (X2) and forwards the message to X2. At this point, X1 does not invoke intra-domain signaling, but it may consult its IGP link state database to make sure the intra-domain path to X2 is available. Furthermore, X1 also does not reserve any resources during this stage.
Resource reservation is done in the next stage. This procedure is repeated across different providers until the message reaches B1. If B1 is ready for the connection, it sets up the local connection and sends the request back to Y2 in the reverse direction. The connection set up process now goes to the next stage.

2. Network Connection Setup Stage

In this stage, the inter-domain signaling message retraces the path of the original request in the reverse direction. At the boundary of each intermediate provider network, the message triggers intra-domain signaling for sub-network connection set up. This implies that resources are also reserved at each NE during this stage. During this process, an intermediate provider should not change the path selected at the end of the first stage.

3. Acknowledge Stage

By this stage, both the network connection and the link connection have been set up. An acknowledgement is sent from A2 to B1 to confirm the connection set up.

8.3 LSP Protection Switching and Restoration

For an LSP over multiple AS-es, restoration could be done either in an end-to-end (client-to-client) fashion or could be localized to the AS where the failure occurs. The exact mechanism depends on the business agreements between the parties involved. Details and implications for signaling protocols need further study.

9 Security

Security is critical at the points where different administrative domains interact. When a service request crosses business domains, encryption and authentication mechanisms are required at the interfaces. The details of an overall security architecture need to be addressed further in an independent document.

10 Acknowledgements

The authors want to thank Harvey Epstein, Charles Zhou, Raghu Srinivasan, Nabil Bitar and Roshan Rao for their reviews and comments.

11 Authors' Address

Yangguang Xu
21-2A41, 1600 Osgood St.
Lucent Technologies, Inc.
N. Andover, MA 01845
Email: xuyg@lucent.com

Anindya Basu
Lucent Technologies
600 Mountain Avenue, 2C-417
Murray Hill, NJ 07974
Email: anindyabasu@lucent.com

Yong Xue
Global Network Architecture
UUNET/WorldCom
Ashburn, Virginia
Email: yxue@uu.net

References

[BGP-4] Y. Rekhter and T. Li, "A Border Gateway Protocol 4 (BGP-4)", RFC 1771, March 1995.

[BGP-COMM] R. Chandra, P. Traina, and T. Li, "BGP Communities Attribute", RFC 1997, August 1996.

[BGP-MP] Bates, Chandra, Katz, and Rekhter, "Multiprotocol Extensions for BGP-4", RFC 2283, February 1998.

[BGP-EXTCOMM] Ramachandra, Tappan, "BGP Extended Communities Attribute", February 2000, work in progress.

[BGP-ORF] Chen, Rekhter, "Cooperative Route Filtering Capability for BGP-4", March 2000, work in progress.

[GMPLS-AR] Mannie, et al., "GMPLS Architecture", March 2001, work in progress.

[MPLS-VPN] Rosen, Rekhter, et al., "MPLS/BGP VPN", June 2000, work in progress.

[GMPLS-SIG] P. Ashwood, et al., "Generalized MPLS Signaling Functional Spec.", April 2001, work in progress.

[OIF-UNI] Many, OIF2000-125.4, "OIF UNI Functional Spec.", April 2001, work in progress.

[NE-FRMK] Y. Xu, "An Internet Network Engineering Framework", July 2001, work in progress.

[GMPLS-OSPF] K. Kompella, et al., "OSPF Extension for GMPLS", April 2001, work in progress.

[BGP-CONF] Traina, P., "Limited Autonomous System Confederations for BGP", RFC 1965, June 1996.

[BGP-REF] Bates, T. and R. Chandra, "BGP Route Reflection An alternative to full mesh IBGP", RFC 2796, June 1996.