Routing Research Group                                 Frank Kastenholz
Internet Draft                                       Unisphere Networks
Document                                                       May 2002
Category: Informational

                                 ISLAY
              A New Routing and Addressing Architecture

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [1].

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Kastenholz            Informational - Expires November 2002
An Interdomain Routing Architecture                            May 2002

Table of Contents

1 Abstract
2 Conventions used in this document
3 Overview of ISLAY
4 The ISLAY Architecture
   4.1 Elements of the Architecture
      4.1.1 Destination
      4.1.2 Destination Identifier (ID)
         4.1.2.1 Structure
         4.1.2.2 Forwarding Table
         4.1.2.3 Scoping and Uniqueness
         4.1.2.4 Merging
      4.1.3 Aggregates
         4.1.3.1 Aggregate Hierarchy
         4.1.3.2 Adjacencies
         4.1.3.3 Interconnection
         4.1.3.4 Information Hiding
         4.1.3.5 Transit Service
      4.1.4 Aggregate Identifier
      4.1.5 Aggregate Border Router (ABR)
      4.1.6 Administrative Domain (AD)
      4.1.7 Policy
         4.1.7.1 Policy Consistency
         4.1.7.2 Traffic Policies
         4.1.7.3 Routing Data Policies
      4.1.8 Topology Information Base
      4.1.9 Forwarding Table (FT)
      4.1.10 Links
   4.2 Procedures
      4.2.1 ABR Peering
         4.2.1.1 Internal ABR Peering
         4.2.1.2 External ABR Peering
      4.2.2 Topology Discovery
         4.2.2.1 Advertisement Content
         4.2.2.2 Internal Topology and Abstractions
         4.2.2.3 Policy
      4.2.3 Aggregate Content Discovery
         4.2.3.1 Advertising Contents
         4.2.3.2 Learning Internal Contents
         4.2.3.3 Learning External Contents
         4.2.3.4 Transit Content Discovery
         4.2.3.5 Content Reachability
         4.2.3.6 Content Validation
      4.2.4 Creation of the Forwarding Table
      4.2.5 Hierarchical Aggregation
      4.2.6 Multi-Homing
      4.2.7 Mobility
      4.2.8 Connectivity Changes
         4.2.8.1 External Changes
         4.2.8.2 External Visibility of Internal Changes
         4.2.8.3 Aggregate Partitions
      4.2.9 Hiding of Aggregates
      4.2.10 Policies
         4.2.10.1 Routing Data Policies
         4.2.10.2 Traffic Policies
            4.2.10.2.1 Metrics
            4.2.10.2.2 Multi-Path
            4.2.10.2.3 Transit
5 Performance Considerations
   5.1 Reduction In Quantity of Data
   5.2 Convergence
   5.3 Forwarding Table
   5.4 Rope
6 Security Considerations
   6.1 Peering
7 For Further Study
8 IANA Considerations
   8.1 Aggregate Identifiers
   8.2 Addresses
   8.3 Protocol Identifiers
9 MPLS
10 Multicast
11 Requirements Considerations
   11.1 Evaluation against [2]
      11.1.1 Architecture
      11.1.2 Separable Components
      11.1.3 Scalable
      11.1.4 Lots of Interconnectivity
      11.1.5 Random Structure
      11.1.6 Convergence
      11.1.7 Routing System Security
      11.1.8 End Host Security
      11.1.9 Rich Policy
      11.1.10 Incremental Deployment
      11.1.11 Multi-homing
      11.1.12 Multi-path
      11.1.13 Mobility
      11.1.14 Address Portability
      11.1.15 Multi-Protocol
      11.1.16 Abstraction
      11.1.17 Administrative Entities and the EGP/IGP split
      11.1.18 Simplicity
      11.1.19 Media Independence
      11.1.20 Stand-alone
      11.1.21 Safety of Configuration
      11.1.22 Renumbering
      11.1.23 Multi-prefix Subnets
      11.1.24 Cooperative Anarchy
      11.1.25 Network Layer Protocols and Forwarding Model
      11.1.26 Routing Algorithm
      11.1.27 Positive Benefit
   11.2 Evaluation against [3]
      11.2.1 TBD
12 References
13 Acknowledgments
14 Author's Addresses

1 Abstract

This document defines ISLAY, a new architecture for routing and addressing in the Internet. ISLAY is applicable primarily to inter-domain routing since most, if not all, problems with routing and addressing lie in that area. However, ISLAY may also be used for intra-domain and local-area routing.
This note does not specify the protocols or algorithms used to implement ISLAY. It defines the objects comprising the new architecture and describes the relationships between those objects and the operations that those objects perform. Other documents will specify a routing protocol, called BOWMORE, to implement ISLAY.

The main feature of ISLAY is that it separates network topology from forwarding. A separate set of topological objects, with their own name space, is used to describe the network topology. Routing algorithms and protocols use these topological objects to determine the network topology. As a separate function, mappings of destinations (such as IP Address prefixes) to their containing topological objects are disseminated. A router determines the paths to the topological objects using any desired routing algorithm (such as shortest-path), uses the mappings to determine the destinations reached in each topological object, and then installs entries for those destinations in its Forwarding Table (FT).

ISLAY has been developed to meet the requirements outlined in the Routing Research Group's Inter-Domain Routing and Addressing requirements document [2].

2 Conventions used in this document

This note introduces a new architecture, ISLAY, and in the course of doing so, introduces a set of new concepts and objects. In order to avoid confusion, we try to use new terms for these concepts and objects, even when a term currently in use would do or when an object is identical (or nearly so) to something in the current architecture. In addition, we Capitalize the terms for the new objects.

Some of the elements of ISLAY are Links, Aggregates and Destinations. In the text and diagrams, Links are generally named L (such as L12), Aggregates are A (such as A91), Aggregate Border Routers are R (such as R167), and Destinations are D (such as D6).
3 Overview of ISLAY

This section is a brief description of ISLAY. The intent is to convey a general sense of how ISLAY is meant to operate. This should make the detailed technical descriptions in Chapter 4 easier to follow.

The basis of ISLAY is separation of network topology from network protocol addressing and packet forwarding. We do this by introducing a new network object, called an "Aggregate", and a new name space for that object, called "Aggregate Identifiers". Aggregates contain other Aggregates and/or Destinations (such as IP subnetworks), which are identified by their Destination Identifier (such as an IP Address prefix).

MPLS attempts to make a similar separation. However, it does not completely achieve that separation. MPLS introduces a new object used for forwarding (the MPLS Label) and uses the IP routing and addressing system for topology, but IP addressing is still used for packet forwarding (for non-MPLS packets).

Routers exchange topology information and other attributes about Aggregates. That information is used to build their Topology Databases. The Topology Database identifies what Aggregates are known, what Links exist between them, and various attributes and properties of the Aggregates and Links.

Routers use the Topology Database to determine paths to Aggregates.

A new protocol mechanism propagates "Aggregate Content Data": lists of which Destinations (e.g., IPv4 subnets), with their attributes, are contained in which Aggregates. There is no requirement that the Destinations be IPv4 prefixes; they may be any identifiers or addresses that can be used by a router's forwarders, such as IPv6 addresses.

When a router builds its Forwarding Table, it performs the topology calculations on the Aggregate-based Topology Database. The router uses these calculations to select paths to individual Aggregates.
For each destination Aggregate, the router populates the Forwarding Table with the Destination Identifiers (e.g., IP Address prefixes) found in that destination Aggregate, using the next hop to that Aggregate as the next hop for each Destination in the Aggregate.

The process for determining next-hops, paths, and so on, is, of course, influenced by policies that may be manually installed in the router or disseminated through the network.

The Topology Database in a router is limited to Aggregates and inter-Aggregate Links. Since each Aggregate represents many Destinations, the Topology Database's size and complexity are not a function of the number of Destinations (e.g., IP prefixes). There are three benefits to this. First, the resources required by a router to store and process the database are reduced. Second, each topology calculation iteration can generate FT entries for many Destinations. The cost of each topology calculation is reduced, and the number of such calculations required is also reduced. Thus, the load on the router is reduced. Third, the amount of routing protocol traffic required when there is a topology change is a function of the number of Aggregates affected by the change, not the number of IP Prefixes (as it is today). This should significantly reduce routing protocol traffic.

In addition to aggregating destination information, Aggregates can be combined. Instead of having an Aggregate's contents be limited to Destinations, an Aggregate can also contain other Aggregates. That is, instead of Aggregate A1 advertising that it contains Destinations D1, D2, D3... D9, A1 could advertise that it contains Destinations D1, D2, D3, and Aggregate A2. Aggregate A2, in turn, advertises that it contains Destinations D4, D5... D9. While A2's contents are explicitly advertised, the topology within A1 (i.e., how A2 is situated within A1 and how it is connected) is NOT advertised.
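The Forwarding Table construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a specification of ISLAY or BOWMORE: the function names, the dictionary layout of the Topology Database and the Aggregate Content Data, the Aggregate names A0, AX, and AY, and the use of hop count as the only metric are all hypothetical. Policy is ignored entirely.

```python
from collections import deque


def shortest_next_hops(links, source):
    """Breadth-first search over the Aggregate graph.

    Returns {aggregate: first-hop aggregate} for every Aggregate reachable
    from `source`. Hop count is an assumed metric for this sketch.
    """
    next_hop = {}
    queue = deque()
    for neighbor in sorted(links.get(source, ())):
        next_hop[neighbor] = neighbor
        queue.append(neighbor)
    while queue:
        current = queue.popleft()
        for neighbor in sorted(links.get(current, ())):
            if neighbor != source and neighbor not in next_hop:
                next_hop[neighbor] = next_hop[current]
                queue.append(neighbor)
    return next_hop


def expand_contents(aggregate, contents, seen=None):
    """List the Destinations of an Aggregate and of its advertised Children."""
    seen = set() if seen is None else seen
    if aggregate in seen or aggregate not in contents:
        return []
    seen.add(aggregate)
    entry = contents[aggregate]
    found = list(entry["destinations"])
    for child in entry["children"]:
        found.extend(expand_contents(child, contents, seen))
    return found


def build_forwarding_table(links, contents, local):
    """Map each Destination to the next hop toward its containing Aggregate."""
    table = {}
    for aggregate, hop in shortest_next_hops(links, local).items():
        for destination in expand_contents(aggregate, contents):
            # Keep the first (nearest) Aggregate found for a Destination.
            table.setdefault(destination, hop)
    return table


# Hypothetical topology: the local Aggregate A0 reaches A1 through AX.
links = {"A0": {"AX", "AY"}, "AX": {"A0", "A1"}, "AY": {"A0"}, "A1": {"AX"}}
# A1 advertises D1..D3 plus Child Aggregate A2; A2 advertises D4..D9.
contents = {
    "AX": {"destinations": ["DX"], "children": []},
    "AY": {"destinations": ["DY"], "children": []},
    "A1": {"destinations": ["D1", "D2", "D3"], "children": ["A2"]},
    "A2": {"destinations": [f"D{i}" for i in range(4, 10)], "children": []},
}
forwarding_table = build_forwarding_table(links, contents, "A0")
```

Note that A2 never appears in the topology graph. Its Destinations inherit the next hop toward A1, which mirrors the statement that A2's contents are advertised while the topology within A1 is not.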
If reachability from within A1 to A2 is lost, then A1 simply marks its "A1 contains A2" advertisement with a "but it's not reachable" qualifier. This ability to hierarchically structure Aggregates means that the effects of the topological complexity of those sections of the Internet can be limited.

This hierarchy has to end someplace. At any given point, the hierarchy can end in one of two ways:

1. An Aggregate at the bottom of the hierarchy contains exactly one Destination. In this case, the entire topology above that bottom Aggregate uses ISLAY and ISLAY's protocols.

2. An Aggregate at the bottom of the hierarchy contains multiple Destinations. These Destinations are connected to each other via routers. The topology within the Aggregate is invisible to the rest of the network. The routers in the Aggregate must run some kind of routing protocol. They may run legacy protocols (such as RIP, OSPF and IS-IS) or they could run protocols supporting the new architecture.

ISLAY supports multi-homing at all levels. An individual IP network, or prefix, is multi-homed by being advertised as contained in two or more Aggregates. When the prefix's entry in the FT is being built, a router notes that this prefix is contained in two (or more) Aggregates and selects one of the paths/Aggregates for the prefix. The selection is the hard part. There are at least two mechanisms that can be used:

Local Policy
   Some policy in the router is used to select a path. This policy may, for instance, require selecting paths with certain transmission characteristics (e.g., higher-speed paths over lower-speed ones), or paths through certain ISPs.

Destination Preference
   The Destination itself may indicate a preference for which path to take (e.g., a "primary" vs. a "backup" path).

ISLAY also supports network mobility.
If a network moves but does not change Aggregates, the only routing that is affected is the routing within that Aggregate. The information that the Aggregate presents to the rest of the network does not change. If the network changes Aggregates, then the original Aggregate removes the mobile network's prefix from its content advertisements and the new Aggregate adds the network's prefix. This mechanism is identical to the mechanism used when an end site changes service providers. The mobile network (or end site) does NOT need to do any renumbering.

ISLAY is divided into two major components:

Topology Management
   Routers use topology management to acquire the topological information they need in order to choose paths to destinations and install those paths in their forwarding database. The main elements of topology management are the Aggregates and the Links between them. Topology management is more or less identical to what today's routing protocols do.

Destination Management
   Destination Management binds specific network Destinations to topological elements (Aggregates). Destinations are carried in individual packets and are used as the keys in the forwarding lookup. For the Internet today, Destinations are IP subnets, keyed by either an IPv4 or an IPv6 prefix.

MPLS, including the mapping of LSPs to Destinations and the generation of LSPs, is seen as an application layered on top of the routing and addressing system. ISLAY does not directly address MPLS LSP setup.

Building of multicast trees is an application that is layered on top of the core routing and addressing system. ISLAY does not directly address multicast tree setup.

4 The ISLAY Architecture

This chapter defines ISLAY. There are two sections. The first, called "Elements of the Architecture", defines the elements that make up ISLAY.
The second section, called "Procedures", describes the operations that are performed on and with the elements.

4.1 Elements of the Architecture

This section defines the elements that comprise the ISLAY architecture. Each element is defined in a following section.

4.1.1 Destination

A Destination is the place where packets are sent. ISLAY does not place any more specific definition on Destinations.

We use the term Destination rather than IP Subnet because ISLAY is not limited to just IP. The definition of a Destination is purposely kept abstract in order to support multiple protocols. One attribute of a Destination is the protocol used to reach that Destination. For example, IPv4 is the protocol for a Destination that is an IPv4 subnet.

IPv4 and IPv6 subnets are Destinations. Individual end-hosts may also be Destinations. However, this is strongly discouraged since it does not scale.

The primary goal of the routing system is to find paths to Destinations.

Destinations are contained in Aggregates (see section 4.1.3). A Destination must be contained in at least one Aggregate. It may be contained in more than one Aggregate.

Destinations MUST have names. See section 4.1.2.

4.1.2 Destination Identifier (ID)

A Destination Identifier (ID) is the name of a Destination.

4.1.2.1 Structure

ISLAY supports multiple different network layer protocols and forwarding models (e.g., IPv4 and IPv6). Therefore, the Destination Identifier must contain protocol-specific information (e.g., IPv4 prefixes for IPv4 subnets).

The Destination Identifier has a common part, which identifies the protocol used to reach the Destination (IPv4, IPv6, etc.).

A second part of the Destination Identifier is a protocol-specific part. This part contains the protocol-specific information needed to identify and reach the Destination.
For example, a Destination ID for an IPv4 subnet would contain a 32-bit IPv4 address, while a Destination ID for an IPv6 subnet would contain a 128-bit IPv6 address.

4.1.2.2 Forwarding Table

The forwarding table in a router is keyed using the protocol-specific parts of the Destination IDs. Therefore, these protocol-specific parts MUST be transported in all packets to be forwarded. The IPv4 and IPv6 destination addresses in the IPv4 and IPv6 headers satisfy this requirement.

4.1.2.3 Scoping and Uniqueness

The visibility of a Destination (and therefore the Destination ID) may be scoped by policy mechanisms in the protocols. Destination IDs must be unique only within their scope. It is better if Destination IDs are globally unique. That way, if the scoping changes, there will not be any problems.

Author's Note: I would prefer not to have scoping. I believe that it is just so much Architectural Aspartame. However, policies and use of private address space, along with a need to be able to deploy ISLAY, compel me to include the feature.

4.1.2.4 Merging

When the underlying network layer protocol(s) allow it, multiple Destination Identifiers may be merged together into a single Destination Identifier. For example, if Destinations are IPv4 subnets with addresses 10.[0...255]/16, the prefixes may be super-netted into 10/8 so that one Destination ID is used, rather than 256. Mechanisms and details of this are outside of the scope of ISLAY.

4.1.3 Aggregates

Aggregates are the topological elements of the network. An Aggregate is a collection of Destinations and other Aggregates. Aggregates contain at least one Destination or Aggregate. They should contain more, since efficiencies are gained by reducing the network to a relatively small number of Aggregates. The Destinations and Child Aggregates are referred to as Content or Aggregate Content.
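The two-part Destination Identifier of section 4.1.2.1 and the merging example of section 4.1.2.4 can be illustrated with a short sketch. The DestinationId class and the merge_destination_ids function are hypothetical names introduced only for this illustration; ISLAY leaves the merging mechanism out of scope, and this sketch simply relies on the supernetting provided by Python's standard ipaddress module.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class DestinationId:
    protocol: str  # common part: the protocol used to reach the Destination
    prefix: str    # protocol-specific part, here an IP prefix in CIDR form


def merge_destination_ids(ids):
    """Merge Destination IDs per protocol where the prefixes allow it."""
    by_protocol = {}
    for dest in ids:
        network = ipaddress.ip_network(dest.prefix)
        by_protocol.setdefault(dest.protocol, []).append(network)
    merged = []
    for protocol, networks in by_protocol.items():
        # collapse_addresses combines adjacent and overlapping networks.
        for network in ipaddress.collapse_addresses(networks):
            merged.append(DestinationId(protocol, str(network)))
    return merged


# The 256 subnets 10.0/16 through 10.255/16, plus one unrelated IPv6 prefix.
subnets = [DestinationId("IPv4", f"10.{i}.0.0/16") for i in range(256)]
subnets.append(DestinationId("IPv6", "2001:db8::/32"))
merged = merge_destination_ids(subnets)
print(merged)
```

The 256 IPv4 Destination IDs collapse into the single ID 10.0.0.0/8, while the IPv6 ID is left unchanged because merging is only attempted among IDs that share the same common part.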
Aggregates are named with Aggregate Identifiers (see section 4.1.4).

4.1.3.1 Aggregate Hierarchy

Aggregates may be hierarchically structured. An Aggregate is a Parent Aggregate (sometimes shortened to Parent) with respect to the Aggregates and Destinations that it contains. An Aggregate is a Child Aggregate (sometimes shortened to Child) with respect to the Aggregate(s) that contain it. Aggregates that have the same Parent are called Peer Aggregates (or Peers):

   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
   .  +--------------------------+                           .
   .  |                          |                           .
   .  |       Aggregate A1       |  +--------------------+   .
   .  |                          |  |                    |   .
   .  |  +-----------+    D1     |  |    Aggregate A2    |   .
   .  |  | Aggregate |           |  |                    |   .
   .  |  |    A3     |    D2     |  +--------------------+   .
   .  |  | D4 D5 D6  |    D7     |                           .
   .  |  +-----------+           |                           .
   .  |       D3                 |                           .
   .  +--------------------------+                           .
   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

                             Figure 1

In the above diagram, Aggregate A1 contains Aggregate A3. A1 is the Parent and A3 is the Child. Aggregate A2 is a Peer of Aggregate A1. Aggregate A1 also contains Destinations D1, D2, D3, and D7. Aggregate A3 contains Destinations D4, D5, and D6.

An Aggregate may be contained in more than one Aggregate (i.e., it may have more than one Parent).

4.1.3.2 Adjacencies

Aggregates that are Parents, Children, or Peers of a particular Aggregate are considered Adjacent to that Aggregate. Aggregates that are not Adjacent are Distant.
For example, in Figure 2:

   +-----------------------+
   |                       |
   |  +-----------------+  |
   |  |                 |  |
   |  |  +-----------+  |  |   +--------------+   +--------------+
   |  |  | Aggregate |  |  |   |              |   |              |
   |  |  |    A1     |  |  |---| Aggregate A4 |---| Aggregate A5 |
   |  |  +-----------+  |  |   |              |   |              |
   |  |                 |  |   +--------------+   +--------------+
   |  |  Aggregate A2   |  |
   |  |                 |  |
   |  +-----------------+  |
   |                       |
   |     Aggregate A3      |
   |                       |
   +-----------------------+

                             Figure 2

The following table shows which Aggregates in Figure 2 are Adjacent with respect to the other Aggregates and which are Distant. A "D" indicates that the Aggregates of the row and column are Distant; an "A" indicates that they are Adjacent. Note that these relationships are symmetric:

                     Aggregate
   Aggregate    A1   A2   A3   A4   A5
      A1             A    D    D    D
      A2        A         A    D    D
      A3        D    A         A    D
      A4        D    D    A         A
      A5        D    D    D    A

An Aggregate learns about its Adjacent Aggregates directly.

An Aggregate learns about the Distant Aggregates indirectly, via its Peers, Parents, or Children.

4.1.3.3 Interconnection

Aggregates are presumed to be fully internally connected. That is, traffic can get between any two Destinations within the Aggregate without leaving the Aggregate (assuming no network failures).

This means that traffic destined for any Destination within the Aggregate (or its Child Aggregates) can enter the Aggregate anywhere. Policies and metrics may be applied that make some entry points preferable to others, or even disable some entry points for some Destinations.

It is possible for a failure, or set of failures, to partition an Aggregate. We believe this to be a relatively rare case. ISLAY does handle the case, but is not optimized to do so. See section 4.2.8.3 for more information on this.

4.1.3.4 Information Hiding

Generally, the topology within an Aggregate is hidden from external view.
To an Aggregate's Peers, an Aggregate is an opaque collection of the Destinations it contains (and its Children contain).

However, an Aggregate may reveal the presence of some or all of its Child Aggregates, associating the contents of the Child Aggregate with the Child Aggregate. In addition, the Parent may create Virtual Child Aggregates to contain some of the Parent's Content. Outside of the Parent, a Virtual Child Aggregate is indistinguishable from a real one.

Typically, a Child Aggregate is revealed when the Parent wishes to have the Child's contents all treated the same way (e.g., with a common set of policies). The internal topological structure of the Parent (i.e., how the Children are interconnected within the Parent) is not revealed by the Parent.

By hiding information in this manner, we

1. Reduce the amount of topological data that is seen outside of the Parent,

2. Reduce the complexity of the topology graphs contained in routers outside of the Parent,

3. Reduce the computational loads on the routing processors of those routers,

4. Improve the scaling properties of ISLAY.

4.1.3.5 Transit Service

A Parent Aggregate may not automatically assume that a Child provides transit traffic service. A Child may or may not provide transit service. That is a policy matter for the Child Aggregate.

Consider Aggregate A1, shown in Figure 3:

   [Figure 3 diagram not recoverable from the source layout. It shows Aggregate A1 with Child Aggregates A1.1 and A1.2, routers R1 through R8, and links L1 through L4.]

                             Figure 3

If Router R3 failed, then traffic could reach A1.1 if and only if A1.2 was willing to provide transit service.

4.1.4 Aggregate Identifier

Aggregate Identifiers name Aggregates.
Aggregate IDs should be unique through the entire Internet. It is conceivable that an Aggregate is known only within some limited scope, so its name need not be globally unique. However, this "hidden" Aggregate could later become "unhidden". If its name is globally unique from the start, renaming is not necessary at that point. That said, knowledge of Aggregate IDs is limited to routers, so renaming is not a terribly big problem.

Aggregate Identifiers are unstructured values. They simply name an Aggregate. An Aggregate Identifier has no topological significance; it does not define the position of the Aggregate on the graph. That is, nothing can be inferred from an Aggregate's name:

o One cannot tell "where" the Aggregate is in the network,

o One cannot tell what Aggregates and/or Destination IDs are contained in the Aggregate,

o One cannot tell what Aggregate(s) contain the named Aggregate.

That said, structured naming of Aggregates is not prohibited by ISLAY either. There may be places where local optimizations allow Aggregate Identifiers to be hierarchically delegated by Aggregates to Sub-Aggregates. Implementations may make use of these delegations where practical.

ISLAY does not place any direct requirements or restrictions on the form of Aggregate IDs. We believe that the number of Aggregates might grow to be very large, especially if hierarchies are extensively used. Therefore we recommend that the Aggregate ID be a fairly large bit field.

Aggregate Identifiers do not appear in the headers of forwarded IPv4, IPv6, or MPLS traffic. The forwarders in routers do not see Aggregate Identifiers, nor do they use them in making forwarding decisions. Aggregate Identifiers appear only as data in the routing protocols and are to be used only by the routing algorithms. Thus, their form and structure may be optimized for topology calculations.
Aggregate Identifiers are "flat"; that is, there is no structure or topological information encoded within them. A premise of ISLAY is that any one point in the Internet will see at most only a few thousand Aggregates (and therefore Aggregate Identifiers). Performing topology calculations on this number of objects with flat names is well within the capabilities of current processors and routing algorithms. Therefore, there is no need for structure in the Aggregate IDs.

It is not impossible that Aggregate Identifiers will be included in some future forwarding model and protocol headers. We strongly recommend that this not be done. If the Aggregate ID is included in the forwarding model, then it becomes overloaded and "cast in stone", and it is difficult to change later on. In effect, this would put things where they are today.

The Aggregate ID contains a "partition" field (perhaps 8 or 16 bits). This field is used to indicate which partition of the Aggregate is being referred to when the Aggregate is partitioned. Aggregates are always partitioned; nominally they have but one partition, but when faults occur they may have several. Section 4.2.8.3 discusses the use of this field in more detail.

4.1.5 Aggregate Border Router (ABR)

An Aggregate Border Router (ABR) is a router by which an Aggregate interacts with other Aggregates. Aggregates must have at least one Aggregate Border Router; they may have more than one.

Each ABR must be configured with an identifier that is unique within the Aggregate which the ABR represents.

ABRs establish Peering Relationships with each other. These relationships are used to exchange Topology, Content, and Policy information. A single ABR may establish Peering Relationships to multiple ABRs, which may be in more than one other Aggregate. Peering Relationships are manually configured.

Peering Relationships are bi-directional.
That is, if ABR R1 has a Peering Relationship with R2, then R2 also has one with R1. The Aggregate Border Routers filtering incoming routing information according to their local Routing Data Policies. They apply Routing Data Policies to routing data that they transmit. Consider the network topology in Figure 4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A1, D1 . . A2, D2 . . A3, D3 . . . . ======= . . ======= . . ======= . . . . . .|. . . . . .|. . . . . .|. . . . . | | | . . | +----+ | . . +----------| R1 |---------+ . . Aggregate +----+ . . A4 | . . ========D4 . . | . . . . . . . . . . . .|. . . . . . . . . . | . . . . . . . .|. . . . . . +----+ . . +------| R2 | . . | +----+ . . +----+ | . . | R3 | | . . +----+ ======D11 . . | . . | Aggregate . . =====D10 A5 . . . . . . . . . . . . . . Figure 4 Kastenholz Informational - Expires November 2002 15 An Interdomain Routing Architecture May 2002 R1 and R2 are Aggregate Border Routers. R3 is not. R1 represents Aggregates A1, A2, A3, and A4. R2 represents A5. R1 and R2 have established a peering relationship between them. The topology in Figure 4 shows several other important points: 1. A routing protocol must run between R2 and R3 so that traffic can flow to/from D10. This routing protocol can be any existing routing protocol, such as RIP, OSPF, IS-IS or BGP. This routing is not visible to ISLAY. 2. R1 and R2 are not directly connected. There is a subnet, D4, between them. 3. R1 is not contained within A1, A2, or A3, even though it represents those Aggregates. All ABRs of an aggregate establish internal peering relationships with each other. They are established so that the external routing data received by one ABR from a Peer Aggregate can be exported (after application of local policies), to the other Peer aggregates. See Figure 5. Kastenholz Informational - Expires November 2002 16 An Interdomain Routing Architecture May 2002 . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . .................. .................. . . . Aggregate A1 . . Aggregate A2 . . . . +--------+ . . +--------+ . . . . | ABR R1 | . . | ABR R2 | . . . ....+--------+.... ....+--------+.... . . | | | . . | +----------------+ | . . | | | . . . . +--------+. . . . . . .+--------+ +--------+ . . . | ABR R3 |- | ABR R4 |---| ABR R9 | . . . +--------+ \ +--------+ +--------+ . . . | \ | . . . . | -------- | . . . . | Aggregate A3 \ | . . . . +--------+ +--------+ . . . . | ABR R5 |-----------| ABR R6 | . . . . . +--------+. . . . +--------+ . . . . . | . | . . . | . ...+--------+.... . . . ....+--------+.... . . | ABR R8 | . . . . . | ABR R7 | . . . +--------+ . . . . . +--------+ . . . A5 . . . . . Aggregate A4 . . ................. . . . .................. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Figure 5 The routing data that ABR R5 receives from R7 must be forwarded over the internal peering relationships to R3 and R6 (who forwards it to R4). R3 can then forward the data on to R1 and R2. 1. Note that R2 will receive the data from both R3 and R4. A protocol mechanism is required in the protocols to either prevent this from happening (e.g., for R2 to "listen" to either R3 or R4, but not both) or so that receipt of multiple copies of some information is not a problem. 2. There is no direct internal relationship between R5 and R4. R6 must forward the routing information from R5 and R3 to R4 (and vice versa). 
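One way to make the receipt of multiple copies harmless, as point 1 requires, is for each ABR to remember that it has already accepted an advertisement and to discard repeats. The following Python sketch floods one advertisement across the internal peering relationships of Aggregate A3. The peering table is an assumption read from Figure 5 (in particular, that the diagonal line joins R3 and R6), and the function name relay is hypothetical; ISLAY does not specify this mechanism.

```python
from collections import deque

# Internal peering relationships of Aggregate A3. This table is an
# assumption read from Figure 5 (the diagonal is taken to join R3 and R6).
INTERNAL_PEERS = {
    "R3": {"R5", "R6"},
    "R4": {"R6"},
    "R5": {"R3", "R6"},
    "R6": {"R3", "R4", "R5"},
}


def relay(origin, peers=INTERNAL_PEERS):
    """Flood one advertisement from `origin` over the internal peerings.

    Returns (accepted_from, duplicates): which peer each ABR accepted the
    advertisement from, and how many redundant copies were discarded.
    """
    accepted_from = {origin: None}
    duplicates = 0
    queue = deque([origin])
    while queue:
        sender = queue.popleft()
        for receiver in sorted(peers[sender]):
            if receiver == accepted_from[sender]:
                continue  # never echo back to the peer it came from
            if receiver in accepted_from:
                duplicates += 1  # second copy: discard, do not re-forward
                continue
            accepted_from[receiver] = sender
            queue.append(receiver)
    return accepted_from, duplicates


accepted, discarded = relay("R5")
print(accepted, discarded)
```

In this sketch R4 hears the advertisement only through R6, matching point 2, and the redundant copies exchanged between R3 and R6 are dropped rather than forwarded again.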
Aggregate Border Routers are responsible for determining the "best" way to get to various Destinations that are outside of the Aggregate and for injecting that information into the routing protocols operating within the Aggregate: Kastenholz Informational - Expires November 2002 17 An Interdomain Routing Architecture May 2002 +-----------+ | | | \ | | R1----(cost=1)--D1 | / | / | | / | \ | / | R2---(cost=2) | / \| | \ | Aggregate |\ | A1 | -----D2 +-----------+ Figure 6 In Figure 6, D1 is reachable through ABR R1 at a "cost" of 1 and through R2 at a "cost" of 2. These costs are injected into the Interior Routing Protocol. In this way R1 becomes the "preferred" router to get to D1 (everything else being equal). D2 is reachable only via R2, so therefore R2 must be the router used to get to D2. R1 and R2 MUST inject the appropriate information into the Interior Routing Protocols to reflect this. As implied in Figure 4, an Aggregate Border Router may serve more than one Aggregate and these Aggregates MAY be at multiple levels of the hierarchy. . . . . . . . . . . . . . . . A1 . . ............... . . . . . . . \ . | . . . A2 +-----+ . ............| ABR | . | R1 |--- . ............| | . . +-----+ . . / . | . . . A3 . . . ............... . . . . . . . . . . . . . . . Figure 7 In Figure 7, ABR R1 serves all three Aggregates A1, A2, and A3. R1 keeps three sets of state, one for A1, one for A2, and one for A3. R1 should, in effect, act as three independent ABRs, as shown in Figure 8: Kastenholz Informational - Expires November 2002 18 An Interdomain Routing Architecture May 2002 . . . . . . . . . . . . . . . . . A1 . . ............... | . . . . | . . . \+-------+------+ . . A2 | R1.2 | | . .......+-------+ | . | R1.1 |--- . .......+-------+ | . . | R1.3 | | . . /+-------+------+ . . A3 . | . . ............... | . . . . . . . . . . . . . . . . . 
Figure 8 4.1.6 Administrative Domain (AD) An Administrative Domain (AD) is a collection of one or more Aggregates and/or Destinations that are under the control of a single administrative entity. The main characteristic of an Administrative Domain is that it has a single coherent set of Policies that are enforced by administrative fiat. Administrative domains are the areas to which a set of policies apply. An administrative entity may have multiple Administrative Domains. An AD may contain multiple Aggregates and/or Destinations. A Destination or Aggregate, however, is in one and exactly one Administrative Domain. Destinations and Child Aggregates are not necessarily within the Administrative Domain of their containing/Parent Aggregate. Author's Note Is this really needed? 4.1.7 Policy Policies are administrative rules, imposed by an Administrative Domain, that alter the "natural" or "default" behavior of the routing system. Administrative Domains put policies in place for a variety of reasons, such as 1. Business or contractual issues (for instance, certain customers of an ISP might get "gold" service and have their traffic carried over dedicated paths), 2. Financial Considerations (for instance, an ISP might want to have traffic use lower-cost links when possible), Kastenholz Informational - Expires November 2002 19 An Interdomain Routing Architecture May 2002 3. Security concerns (e.g., traffic to certain destinations or from certain sources might be segregated, or even prohibited), 4. Operational considerations (such as keeping traffic off of links that are known to have reliability problems), or 5. Performance considerations (traffic might preferentially use better performing paths, such as those with higher bandwidth or lower latency, or taking a more direct path to the destination). ISLAY cannot say "these are the N policies that are supported" since each Administrative Domain has its own unique needs. 
Instead, we provide general mechanisms and techniques that can be used to meet those needs. Policies can be enforced only within the Administrative Domain that imposes them.

There are two general types of policies:

   o  Traffic policies, which alter the flow of traffic in some way, and

   o  Routing Data policies, which alter how the routing system accepts and processes Topology, Content, and other routing data.

4.1.7.1 Policy Consistency

It is possible that Policies could be created, either by a single Administrative Domain or by multiple independent Administrative Domains, that are inconsistent and result in black holes. Consider the network graph in Figure 9:

                                 . . . . . . . .
                                 .  Aggregate  .
                                 .     A3      .
                                 . . . . . . . .
                                  /           \
                                 /             \
   . . . . . . . .     . . . . . . . .     . . . . . . . . .
   .  Aggregate  .-----.  Aggregate  .     .  Aggregate A5 .
   .     A1      .     .     A2      .     .  (Dest. D1)   .
   . . . . . . . .     . . . . . . . .     . . . . . . . . .
                                 \             /
                                  \           /
                                 . . . . . . . .
                                 .  Aggregate  .
                                 .     A4      .
                                 . . . . . . . .

Figure 9

Traffic is coming from Aggregate A1 and going to Destination D1 in Aggregate A5. Suppose that there are two policies:

   1. The Administrative Domain that contains A3 may have a Policy that it will not forward traffic to D1.

   2. The Administrative Domain that contains A4 may have one that says it will not accept traffic from Aggregate A1.

The net result is that traffic cannot go from any source in A1 to D1.

ISLAY DOES NOT provide any mechanism to prevent inconsistencies such as these from occurring. It cannot: the flexibility that permits such inconsistencies is the same flexibility that the network operations community requires.

4.1.7.2 Traffic Policies

The basic mechanisms of policies involve classifying traffic in some way and then either

   1. Forcing that traffic onto a certain path or set of paths,

   2. Prohibiting that traffic from a certain path or set of paths, or

   3. Biasing that traffic toward or away from a certain path or set of paths.

An important aspect of traffic Policies is the establishment of transit rules. These rules define the traffic that is allowed to cross an Aggregate. Consider the topology in Figure 10:

               ---------------------------
              (         The Rest          )
              (          Of The           )
              (         Internet          )
               ---------------------------
                /                       \
   . . . . . . . . . . .          . . . . . . . . . . .
   . Backbone Provider,.          . Backbone Provider,.
   .   Aggregate A1    .----------.   Aggregate A2    .
   . . . . . . . . . . .          . . . . . . . . . . .
                \                       /
                 \                     /
                 . . . . . . . . . . . .
                 .  Access Provider,   .
                 .    Aggregate A3     .
                 . . . . . . . . . . . .

Figure 10

There are two backbone service providers, represented by Aggregates A1 and A2. Another Aggregate, A3, is an access provider (e.g., it provides dialup service to some city). A3 is multi-homed to A1 and A2 for reliability. If the A1/A2 link fails, the routing protocols could reroute all of the A1/A2 traffic over the A1-A3-A2 path. A3's network is not designed to handle this load. Thus, A3 must have a Transit Policy that prohibits this sort of traffic.

4.1.7.3 Routing Data Policies

These policies alter the information that is received and processed by a router. These may be as simple as filtering out certain information. For example, Destinations that are IPv4 Prefixes may be dropped if the prefix is too long. Another possible policy may be to reject routing information that comes from certain Aggregates.

4.1.8 Topology Information Base

The Topology Information Base (TIB) is the set of data structures within a router that contains the topology information known to the router. The TIB is generated from configuration information in the router and information received via the routing protocols.

4.1.9 Forwarding Table (FT)

The Forwarding Table (FT) is the set of tables that a router uses to actually forward packets.
A router produces its FT from the Topology Information Base and any local forwarding and topology information (such as manually configured static routes). The structure of the FT is optimized for forwarding lookups in a particular router's forwarding software and/or hardware.

4.1.10 Links

Links connect Aggregates together. "Real" Links connect directly adjacent Aggregates. Virtual Links connect Aggregates that are not adjacent. Virtual Links are built out of a tunneling technology such as MPLS. Links carry inter-Aggregate traffic. Links may be tagged with attributes of various flavors.

Links are not peering relationships. A Link may exist between two routers without a Peering Relationship existing between the routers. That is, Links need not terminate at ABRs; they can run between "ordinary" routers. There might be a single peering relationship between ABRs of two Aggregates, but many links between the two Aggregates. Consider two Aggregates, A1 and A2. They may have a single peering relationship between their ABRs and multiple links connecting them:

   . . . . . . . . . . . .           . . . . . . . . . . . .
   .               +-----+           +-----+               .
   .               | ABR |----L1-----| ABR |               .
   .               | R1  |  Peering  | R2  |               .
   .               +-----+           +-----+               .
   .      A1             .           .             A2      .
   .             +-------+           +-------+             .
   .             | Non-  |           | Non-  |             .
   .             | ABR   |----L2-----| ABR   |             .
   .             | R3    |           | R4    |             .
   .             +-------+           +-------+             .
   . . . . . . . . . . . .           . . . . . . . . . . . .

Figure 11

The graph of peering relationships between A1 and A2 looks like:

   +----+     +----+
   | A1 |-----| A2 |
   +----+     +----+

Figure 12

But the graph of links between A1 and A2 is:

   +----+      +----+
   |    |--L1--|    |
   | A1 |      | A2 |
   |    |--L2--|    |
   +----+      +----+

Figure 13

4.2 Procedures

This section describes the key procedures of the ISLAY Architecture. These procedures are described in terms of the elements described in section 4.1, "Elements of the Architecture".
4.2.1 ABR Peering

Aggregate Border Routers establish Peering Relationships with other ABRs. These relationships are used to propagate Topology and Aggregate Content information within Aggregates and between Aggregates.

The network administrators define Peering Relationships. Cryptographically strong identity and authentication information MUST be included in the configuration of the two ABRs and in the protocols used to establish a Peering Session.

Peering Relationships, once defined, remain in existence until they are explicitly removed by administrative action. If a Peering Relationship has been defined between two ABRs, but the link between them is down, or one or both ABRs are down, then the Relationship still exists and is marked as "DOWN".

There are two types of ABR Peering: internal and external. Internal ABR Peering is how the ABRs within an Aggregate interact with each other. External ABR Peering is how ABRs of different Aggregates interact.

4.2.1.1 Internal ABR Peering

Internal ABR Peering is how the ABRs belonging to a single Aggregate interact. Peering routers need not be adjacent (i.e., share a common subnet, link, etc.). The peering relationship is established across the elements contained within the Aggregate (internal subnets, point-to-point links, sub-Aggregates, routers, etc.).

The goal of the Internal Peering relationships is to provide a way for external topology and content information to transit the Aggregate, and for the Aggregate to apply its policies to that information. For example, in Figure 14, internal peering relationships within A2 facilitate the transfer of content and topology information from A1 to A3, across A2.
   +----+   +----+   +----+
   | A1 |---| A2 |---| A3 |
   +----+   +----+   +----+

Figure 14

The Internal Peering Relationships can have any structure and topology, so long as all of the ABRs of the Aggregate are reachable and information can be passed from one ABR to all of the others. The internal Peering relationships of an Aggregate's ABRs must form a connected graph covering all of the Aggregate's ABRs:

   o  The topology can be any combination of point-to-point and point-to-multipoint links and paths.

   o  The internal ABRs do not need to be directly connected to each other (i.e., they do not need to 'share' a network link).

   o  The set of ABRs does not need to be fully interconnected. It is not necessary for one ABR to have internal peering relationships with all other ABRs of the Aggregate.

The internal-peers exchange all routing and addressing information they

   1. Have been configured with,

   2. Have learned from other internal-peers, or

   3. Have learned from their external peers.

In this way, all ABRs of a given Aggregate have the same information.

The ABRs of an Aggregate must

   o  Detect when one of their internal peering relationships fails. They cannot rely on the status of the links since the peering relationship may traverse several internal links and routers. The only reliable way to detect failures is via a keep-alive mechanism (or equivalent).

   o  Spread the status of their Internal Peering Relationships to all the other ABRs of the Aggregate. Accordingly, each ABR must keep track of all of the Internal Peering Relationships within the Aggregate.

   o  Select a partition ID. This number is added to the Aggregate ID in all the topology and content advertisements. One mechanism for doing this is for each ABR to note the identifier associated with each other ABR in the Aggregate. The identifier with the numerically lowest value becomes the "partition ID".
   o  If the ABR whose ID is used as the partition ID is no longer reachable, then the remaining ABRs go through the process again, select a new partition ID, and use that ID in all advertisements.

4.2.1.2 External ABR Peering

External peering relationships are peering relationships that cross Aggregate boundaries. They "connect" one Aggregate to another. For example, in Figure 5, some of the external peering relationships are R5-R7, R6-R8, and R4-R9. External peering relationships are how routing information moves from Aggregate to Aggregate.

External Peering relationships generally are between two ABRs that share a common network link. However, this is not required. The routers of a peering relationship do not need to be topologically adjacent. Other routers, links, Aggregates, and so on may separate them. In that case, a tunnel of some kind must connect the two ABRs of the peering relationship, which makes them virtually adjacent. MPLS is one such tunneling technology.

In an External Peering session, each ABR can be the 'next hop' to a set of destinations from the other ABR's perspective. When the ABRs are not adjacent, the tunnel between them is the "link" that connects them, so the virtual interface to the tunnel is the "next hop".

4.2.2 Topology Discovery

Aggregate Topology Discovery is the process by which the network topology is discovered. The topology consists of Aggregates and Links. The ABRs then use this topology to determine the best path, and corresponding next hop, to use to send traffic to a particular Aggregate, and thereby to the Destinations it contains.

Topology discovery is performed by routing protocols. There are two general classes of routing protocol: Link State and Distance Vector (Path Vector, used in BGP-4, is a variant of Distance Vector; we shall always refer to the general class as Distance Vector). Both classes are consistent with ISLAY.
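As an illustration of how an ABR might turn a discovered topology into best paths and next hops, the following Python sketch runs a shortest-path computation, in the Link State style, over a small graph whose nodes are Aggregates. The four-Aggregate topology, the metrics, and the function name best_next_hops are assumptions made for the example; ISLAY does not mandate this algorithm or any particular metric.

```python
import heapq

# Hypothetical inter-Aggregate topology: LINKS[a][b] is the metric of the
# Link from Aggregate a to Aggregate b. Names and metrics are illustrative.
LINKS = {
    "A1": {"A2": 1, "A3": 1},
    "A2": {"A1": 1, "A4": 1},
    "A3": {"A1": 1, "A4": 3},
    "A4": {"A2": 1, "A3": 3},
}


def best_next_hops(links, source):
    """Return (distance, next_hop) dictionaries as seen from `source`.

    distance[x] is the lowest total metric to Aggregate x; next_hop[x] is
    the neighboring Aggregate that the best path to x leaves through.
    """
    distance = {source: 0}
    next_hop = {}
    heap = [(0, source, source)]
    while heap:
        cost, node, hop = heapq.heappop(heap)
        if cost > distance[node]:
            continue  # stale heap entry; a better path was already found
        if node != source:
            next_hop[node] = hop
        for neighbor, metric in links[node].items():
            new_cost = cost + metric
            if new_cost < distance.get(neighbor, float("inf")):
                distance[neighbor] = new_cost
                # The first hop is fixed when the path leaves the source.
                first = neighbor if node == source else hop
                heapq.heappush(heap, (new_cost, neighbor, first))
    return distance, next_hop


distance, next_hop = best_next_hops(LINKS, "A1")
print(distance)
print(next_hop)
```

Here A1 reaches A4 through A2 at a total metric of 2 rather than through A3 at 4. A Distance Vector protocol would arrive at the same next hops by exchanging distances with neighbors instead of computing over the full graph.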
Currently, the Internet can be represented as a graph, where the nodes represent routers and networks and the arcs are the connections of routers to networks. In this model, the fundamental units of the routing algorithms are routers and IP subnets. ISLAY alters this model. The nodes of the graph are the Aggregates. The individual Aggregate Border Routers are internal to the node and cooperate to give the image that the Aggregate is a single entity. They do not explicitly appear in the graph. The arcs are the Links between Aggregates. Thus, the fundamental unit of routing is the Aggregate. An Aggregate's ABRs cooperate with each other so that the Aggregate they represent appears as a single entity to the Aggregate's Peers and Parent. The entire Aggregate then appears as a single node in the topology graph. An important aspect of this cooperation is that topology information that enters an Aggregate via one peering relationship must transit the Aggregate and be forwarded to all other external peers (modulo the effects of policies local to the Aggregate, such as filtering). Each ABR is responsible for o Receiving topology information from its Peers, o Transmitting topology information to its Peers, o Relaying topology data about distant Aggregates, and o Building inter- and intra-Aggregate topology graphs out of the received information. Aggregates have internal structure. They contain child- Aggregates and Destinations. The topology of the Children and Destinations is kept within the containing aggregate. That is, internal structure is not explicitly exported, though the Aggregate may indicate preferences (e.g., different metrics to different Destinations via different Links to the Aggregate). Figure 15 may pull all of this together: Kastenholz Informational - Expires November 2002 26 An Interdomain Routing Architecture May 2002 . . . . . . . . . . . . . . . Aggregate . . Aggregate . . A2 . . A3 . . . . . . . . . . . . . . . | | . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . A# A# A# A# . . \ \ / / . . ............. ............. ............. . . . . . . . . . . . Aggregate . . Aggregate . . Aggregate . . . . A1.1 .----. A1.2 .---. A1.3 . . . . . . . . . . . ............. ............. ............. . . / / \ \ . . A# A# Aggregate A1 A# A# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . | | . . . . . . . . . . . . . . . Aggregate . . Aggregate . . A4 . . A5 . . . . . . . . . . . . . . . Figure 15 o The internal structure of A1 is kept within A1. That is, A2, A3, A4, and A5 do NOT know of the existence of Aggregates A1.1, A1.2, and A1.3. o The ABRs that comprise A1.2 provide "transit" service for topology advertisements between A1.1 and A1.3. In this way, A1.3 learns of the existence of A1.1 and what A1.1 is connected to, and vice-versa. o The ABRs that comprise A1 provide a similar transit service for the topology advertisements received from A2, A3, A4, and A5. o To A2, A3, A4, and A5, A1 appears as a single point: A2 A3 \ / A1 / \ A4 A5 Figure 16 The following sections address some specific issues of Topology Discovery. Kastenholz Informational - Expires November 2002 27 An Interdomain Routing Architecture May 2002 4.2.2.1 Advertisement Content In the current Internet architecture the topology advertisements carry destination IP prefixes, regardless of the protocol family. The resulting Topology Information Base is built using IP subnets as basic elements. ISLAY builds topology using Aggregates. Thus, the Topology Advertisements carry Aggregate IDs and the Topology Information Base is constructed using Aggregate IDs as basic elements. The routing algorithms determine paths to Aggregates. 4.2.2.2 Internal Topology and Abstractions Normally, an Aggregate's internal topology is completely hidden from the outside. The Aggregate's default topology and data transit policies are that traffic received by the ABR's of an Aggregate on any Link will be forwarded on to that traffic's destination. 
From the outside, the Aggregate is a black box.

Under local configuration and policy control, an Aggregate may wish to change this default behavior. Its topology advertisements are then altered to include abstractions of the Aggregate's internal topology. For example, the Aggregate may wish to advertise different metrics between the various Link pairs:

       \
     L1 \
         A1--L3--
        /
    L2 /

Figure 17

In the topology of Figure 17, the external assumption would be that traffic entering A1 is treated the same, and the internal paths all have the same attributes, regardless of the Links by which the traffic enters and leaves A1. Thus, if the attribute we care about is a metric, A1's topology advertisement would state that the metric between any two Links is X. However, if A1 wishes to show that the L1-L2 path is "better" (say, has a lower metric) than the L3-L2 path, it would include this information in its topology advertisement. The topology advertisement would say:

   1. The default link-to-link metric is (e.g.) 9,

   2. The metric from L1 to L2 is 8.

4.2.2.3 Policy

The distribution of topology information can be controlled. The Aggregate decides how much of its internal topology, and what form of that topology, is distributed to its external peers. The default is that only a highly abstracted version of the topology is exported. The Aggregate may export more explicit information to selected peers. Different information (or abstractions) can be exported to different peers.

4.2.3 Aggregate Content Discovery

Topology discovery (see section 4.2.2, "Topology Discovery") is the first major step in the routing process. The second major step is Content Discovery. Topology discovery is the process by which routers learn where things are; Content Discovery is the process by which the routers learn what is at those locations. Content Discovery is performed by the ABRs.
Each ABR is responsible for: o Learning the destinations contained within the Aggregate the ABR represents, o Exporting the Aggregate's content data to their external peers, o Receiving the content advertisements from their external peers (external contents), o Relaying content information from one ABR to another via the internal peering relationships, o Advertising the external contents into their Aggregate, and o Propagating received external content advertisements across the Aggregate to the Aggregate's other ABRs so that the content advertisement can continue through the Internet The mechanism used is the Content Advertisement. Content Advertisements list the contents of an Aggregate. The contents may be either Destinations or child Aggregates. In addition, Content Advertisements contain 1. An indication of whether a specific content item (Destination or Aggregate) is currently reachable or not. The content item may not be reachable due to routing problems internal to the Aggregate. Note that if a Destination is not reachable then the Destination is NOT removed from the advertisements, as in current routing models. 2. Optionally, a set of attributes for the content. Among these attributes may be metrics showing the relative cost of reaching the content through each Link to the Aggregate. Kastenholz Informational - Expires November 2002 29 An Interdomain Routing Architecture May 2002 This information is included in the advertisement only when Policies require that traffic be directed in certain ways. ABRs are responsible for properly forwarding Content Advertisements. Figure 18 shows the flow of Content Advertisements. ............. . Aggregate . . A3 . . +-------+ . . | R5 | . ..+-------+.. C(A3)| | /|\ C(A5) \|/ | |C(A1,2,3,5) C(A4) ................+--------+.. ............. C(A3) . ----| R4 | . . . C(A2) . / +--------+ . . +--------+ <- +--------+/ | . . | R1 |-------| R2 | | . . +--------+ -> +--------+ | . . Aggregate . C(A1) . \ | . . A1 . . \ | . 
............. .Aggr. +--------+ . . A2 | R3 | . ...........+--------+....... C(A1,2,3)| | /|\ \|/ | | C(A4,5) ....+--------+.... . | R6 | . . +--------+ . . Aggregate | . . A4 | . . +--------+ . . | R7 | . ....+--------+.... C(A1,2,3,4)| | /|\ \|/ | | C(A5) ....+--------+.... . | R8 | . . +--------+ . . Aggregate A5 . ..................

Figure 18

The Content information flows as shown in the figure, where

   o  C(x) means the content information for Aggregate "x".

   o  C(x,y,z) means the content information for Aggregates "x", "y", and "z".

4.2.3.1 Advertising Contents

Regardless of how content information was learned, an ABR advertises the content information it knows about to all of its Peers (internal and external). Which content information is advertised, and to which peers, is under the control of the Routing Information Policies. In the absence of any policy, all information is advertised to all peers.

The information is tagged with the Aggregate containing the content (Sub-Aggregate or Destination). In this way, all ABRs learn which Aggregates contain which Destinations in the Internet.

An ABR may filter or otherwise alter the content advertisements that pass through it. Individual content elements may be modified or filtered, or the entire advertisement may be dropped. The Routing Information Policies of the Aggregate determine what actions, if any, are taken.

4.2.3.2 Learning Internal Contents

An ABR must learn the internal contents of the Aggregate it represents, including Child-Aggregates, in order to properly advertise those contents to the Peer and Parent Aggregates. An ABR learns the internal contents of an Aggregate via three mechanisms:

Manual Configuration
   Administrators manually enter content information into the routers' databases.
Learning from the internal routing protocol The routing protocol operating within the Aggregate provides the ABRs with the content information. This means that the ABR participates in the internal routing protocol. Note that the internal routing protocol may not provide separate connectivity and reachability information. Learning from Child Aggregates Content advertisements from Child Aggregates inform the Parent's ABRs of the contents of the Child Aggregates. The ABRs then take the information and either separately advertise the child Aggregate and its contents, or fold the Child's contents into their own for unified content advertising. Note that while the topology advertisements from the Child Aggregate do not travel beyond the Child, the Destination advertisements will. 4.2.3.3 Learning External Contents An ABR learns of the contents of other Aggregates from two sources. Kastenholz Informational - Expires November 2002 31 An Interdomain Routing Architecture May 2002 First, the ABRs External Peers inform it of the contents of the Aggregate represented by the External Peers. The External Peer also passes on the content data that it learns about Distant Aggregates. Second, the Internal Peers pass on all of the content information they have learned about their External Peers (and the distant aggregates via those peers). For example, as shown in Figure 18, R2 learns about C(A3) via R4. 4.2.3.4 Transit Content Discovery Transit Content Discovery is the process by which one Aggregate's content advertisements cross another Aggregate and then are re-advertised to yet more Aggregates. For example, in Figure 19, the content advertisements from A3 and A4 transit A2. A1 then learns the contents of A3 and A4. The same procedure occurs with regard to A3 learning what is in A1/A4 and A4 learning what is in A1/A3. It is also possible for A2 to "hide" the existence of A3 and adopt A3's content as its own. ............. . Aggregate . . A3 . . +------+ . . | R5 | . ..+------+... 
| .............+------+... ............. . ---| R4 | . . . . / +------+ . . +------+ +------+/ | . . | R1 |---| R2 | | . . +------+ +------+\ | . . Aggregate . . \ | . . A1 . . \ | . ............. .Aggregate \+------+ . . A2 | R3 | . ...........+------+.... | ...+------+... . | R6 | . . +------+ . . Aggregate . . A4 . ..............

Figure 19

Aggregate A2 receives the following Content Advertisements from its peers:

   C(A1): R1->R2
   C(A4): R6->R3
   C(A3): R5->R4

Within Aggregate A2, the information must be forwarded as follows:

   C(A1): R2->R3,R4
   C(A3): R4->R2,R3
   C(A4): R3->R2,R4

Since there is a loop-free set of Peering Relationships that spans all of the ABRs of a single Aggregate, each External Content Advertisement received by an ABR is forwarded to all other ABRs of the Aggregate.

4.2.3.5 Content Reachability

A part of the Content Advertisement is a reachability indicator for each Destination or Aggregate contained in the advertising Aggregate. This is a single bit in the advertisement that indicates whether the Destination or Child Aggregate is currently reachable.

This reachability flag serves two purposes. First, the reachability status of a Destination (or Child Aggregate) can be disseminated as far as possible through the Internet. When the status is "unreachable", traffic going to these destinations can be dealt with (rerouted or dropped) as close to the source as possible, reducing wasted bandwidth.

The second purpose is for multi-homing. If a Destination is multi-homed and reachability via one Aggregate is lost, the network can swiftly reroute the traffic via the other Aggregate(s).

4.2.3.6 Content Validation

It is possible for an Aggregate to advertise that it contains some particular Destination when, in fact, it does not. This can be done either maliciously or accidentally. ISLAY cannot prevent that from happening.
The current routing and addressing models have similar vulnerabilities. Basically, the ISPs all trust each other to provide good, truthful data. If an ISP breaks that trust, the other ISPs ignore it; BGP peering is terminated, etc. This turns out to be a sufficient mechanism.

However, if stronger trust is required, it is possible to develop a cryptographic infrastructure which does not conflict with ISLAY and which provides cryptographically strong and verifiable credentials that content advertisements are correct. Each content "item" would, in effect, be required to cryptographically sign a statement declaring which Aggregate(s) contain the item. This would come at a cost:

   o  Cryptographic software tends to be compute-intensive,

   o  Adding digests, signatures, etc., to the protocols would increase the bandwidth required,

   o  Some kind of keying infrastructure would need to be provided so that any ABR could verify some credentials (and that would lead to dependency loop issues).

   o  A centralized service would be needed from which keys or digital certificates could be obtained and used to validate advertisements.

This service is not critical for the correct operation of ISLAY. That is, if the security system is unavailable (e.g., the key-server crashes) routing can still work. The information is just signed, not encrypted, so even though the signatures cannot be verified, the information can still be used.

4.2.4 Creation of the Forwarding Table

Once they have the relevant Content and Topology information, the ABRs of an Aggregate need to create their Forwarding Tables (FTs). Using the information in the Topology Information Base, the ABR determines the next hop that is appropriate for each Aggregate. An entry is made in the FT for each Destination contained in that Aggregate. The next hop for that Destination is the next hop for the Aggregate.
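The table-building rule just described can be sketched in Python as follows. The dictionaries standing in for the Content and Topology information, the interface names if0 and if1, and the function name build_forwarding_table are all assumptions made for illustration. When a Destination is contained in more than one Aggregate, the sketch keeps the entry for the nearer Aggregate.

```python
def build_forwarding_table(contents, distance, next_hop):
    """Build {destination: (next_hop, distance)} from per-Aggregate data.

    contents[a] lists the Destinations contained in Aggregate a,
    distance[a] is the distance to a, and next_hop[a] is the next hop
    chosen for a from the Topology Information Base.
    """
    table = {}
    for aggregate, destinations in contents.items():
        for destination in destinations:
            entry = table.get(destination)
            # Add the Destination if it is new, or replace the entry if
            # this Aggregate is closer than the one recorded so far.
            if entry is None or distance[aggregate] < entry[1]:
                table[destination] = (next_hop[aggregate], distance[aggregate])
    return table


# Hypothetical inputs: D1 is contained in both A1 and A2, D2 only in A2.
contents = {"A1": ["D1"], "A2": ["D1", "D2"]}
distance = {"A1": 1, "A2": 2}
next_hop = {"A1": "if0", "A2": "if1"}

forwarding_table = build_forwarding_table(contents, distance, next_hop)
print(forwarding_table)
```

With these inputs D1 is forwarded toward A1, the nearer of the two Aggregates that contain it, and D2 is forwarded toward A2.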
For example:

   for (all known Aggregates, A) {
      for (each destination, D, in A) {
         // If the destination is NOT in the FT or
         // A is closer than the distance already
         // associated with D, add D.
         if ((is-FT-entry(D)==FALSE) ||
             ((is-FT-entry(D)==TRUE) &&
              (distance-to-aggregate(A)

[Figure 25: Aggregates A1 and A2 each send a Content Advertisement that contains Destination D1.]

So, when a third Aggregate, A3, receives the advertisements, its topology graph looks like:

   A1(D1)
         \
          A3
         /
   A2(D1)

   (where Ax(Dy) means "Aggregate x contains Destination Dy")

Figure 26

Indicating that D1 is contained in both A1 and A2.

When there is a connectivity problem, for example if D1 is no longer reachable via A1 in Figure 26, A1's content advertisements indicate that D1 is no longer reachable via A1 (though it remains contained in A1). If A3 had selected the path through A1 then A3 would alter its FT to send traffic destined for D1 to A2.

4.2.7 Mobility

ISLAY concerns itself only with mobility of networks. Mobility of hosts is adequately handled by the current Mobile IP protocols. ISLAY does not contain any special mechanisms for host mobility.

ISLAY inherently handles network mobility. At the highest level, a mobile network is considered to be a Destination that moves from one Aggregate to another. Thus, the original Aggregate would stop advertising the Destination and the new one would start. This might happen when, e.g., an end-site changes providers.

When large-scale changes occur (e.g., when ISPs change their connectivity), the contents themselves do not change, but the topology graph does.

Additional mechanisms for forwarding, or redirecting, traffic that is "in flight" when the move occurs are not inconsistent with ISLAY.
Specification and design of these mechanisms is outside of the scope of this note.

We note that host mobility could be handled within ISLAY by making the mobile host a Destination and advertising that host's /32 IPv4 (or /128 IPv6) address as the Destination ID. This mechanism is not recommended since it has many security and scaling issues.

4.2.8 Connectivity Changes

Detecting and reacting to connectivity changes is a major job of a routing system. In ISLAY, connectivity change information is disseminated through the network via the Topology Discovery mechanism.

There are two basic causes of a connectivity change: faults and configuration changes. A fault is where a network element (such as an Aggregate, router, interface, or link) fails for some reason; faults are unintended and presumably will be cleared (i.e., the network will revert to its "pre-fault" condition) after some time. Configuration changes are when network elements are intentionally and permanently added to or removed from the network by the network operators.

ISLAY differentiates between the two conditions. Configuration changes are reflected in Topology or Content advertisements by adding or removing elements from the advertisement. Fault changes are reflected in the advertisements by setting an attribute of the affected object(s) showing their fault status.

In addition to faults vs. configuration changes, there are also internal and external changes. Internal changes are changes that occur within an Aggregate. External changes are changes that occur outside of the Aggregate (e.g., in the Links connecting the Aggregate to the rest of the Network). Note that since Aggregates can be hierarchically structured, an external change to one Aggregate could be an internal change to that Aggregate's parent.

4.2.8.1 External Changes

External changes are changes that happen outside of an Aggregate.
These changes may be Link failures, peering failures, or disruption of ABRs.

4.2.8.2 External Visibility of Internal Changes

The internal topology of an Aggregate is not directly visible outside of the Aggregate. Generally, when there is a change in that topology it is not directly visible outside of the Aggregate. However, the internal topology may be visible in an abstract fashion; the Aggregate may make some aspects of its internal topology visible via, e.g., content attribute tags in the Aggregate's content advertisements.

For example, if the interesting attribute is "hop count" and we assume the topology in Figure 27:

   . . . . . . . . . . . . . . . . .
   . D1--a--b--c--d--e--f--R1-----...
   .      \           /           .
   .       -----------            .
   .         Aggregate A1         .
   . . . . . . . . . . . . . . . . .

Figure 27

A1 would advertise, via R1, that it contains D1 and that the metric to get to D1 is 4 (the path is D1-a-e-f-R1). If the a-e path fails, then the path from R1 to D1 would be D1-a-b-c-d-e-f-R1. This path is 7 hops. Thus, the metric to D1 would change from 4 to 7.

If the owner of A1 does not wish to export even this information, R1 could be configured to advertise a constant metric.

4.2.8.3 Aggregate Partitions

Another way that internal topology becomes externally visible is when an Aggregate partitions. An Aggregate partition is when one or more topology changes occur such that it is no longer possible to get from one part of the Aggregate to another without leaving the Aggregate.

Within an Aggregate we expect there to be fairly rich and redundant connectivity. Failures within an Aggregate should normally not result in a partition. However, it is always possible that a failure or set of failures will cause a partition. This condition is expected to be fairly rare, so the methods for dealing with it need not be optimized.

There are two types of partition.
The first type of partition is one where the partitioned part of the Aggregate does not have connectivity to the rest of the Internet. When this type of partition occurs, the ABRs for the partitioned Aggregate simply stop advertising reachability to the Destinations and Child Aggregates that are no longer reachable. For example, in the topology in Figure 28:

   . . . . . . . . . . .
   .                   .    ---------------
   . D1                .   (               )
   .   \+-------+          (  The Rest of  )
   . D2-|  R1   |----------( the Internet  )
   .   /+-------+          (               )
   . D3     |          .    ---------------
   .       /L1         .
   . D4   /            .
   .   \+----------+   .
   . D5-| Internal |   .
   .    |  Router  |   .
   .   /+----------+   .
   . D6                .
   .                   .
   .   Aggregate A1    .
   . . . . . . . . . . .

Figure 28

Assume that L1 fails. R1's content advertisements change to show that D4, D5, and D6 are no longer reachable. The advertisements still contain D4, D5, and D6 since these Destinations are still a part of A1. They are merely marked as unreachable.

The second type of partition is one where each part of the Aggregate has an ABR and connectivity to the rest of the Internet. Consider the Aggregate in Figure 29:

   . . . . . . . . .
   .               .   ---------------
   . D1            .  (               )
   .   \+-------+     (  The Rest of  )
   . D2-|  R1   |-----( the Internet  )
   .   /+-------+     (               )  . . . . . . .
   . D3     |      .  (               )  .           .
   .        |L1    .  (               )--. Aggregate .
   . D4     |      .  (               )  .    A2     .
   .   \+-------+     (               )  . . . . . . .
   . D5-|  R2   |-----(               )
   .   /+-------+     (               )
   . D6            .  (               )
   .               .   ---------------
   . Aggregate A1  .
   . . . . . . . . .

Figure 29

If L1 fails then the Aggregate partitions into two parts, "A" and "B", and the result looks like:

                . . . . . . . . .
              / .               .   ---------------
             /  . D1            .  (               )
   Partition |  .   \+-------+     (  The Rest of  )
      "A"    |  . D2-|  R1   |-----( the Internet  )
             \  .   /+-------+     (               )  . . . . . . .
              \ . D3     |      .  (               )  .           .
                ===============    (               )--. Aggregate .
              / . D4     |      .  (               )  .    A2     .
             /  .   \+-------+     (               )  . . . . . . .
             |  . D5-|  R2   |-----(               )
   Partition |  .   /+-------+     (               )
      "B"    |  . D6            .  (               )
             |  .               .   ---------------
             \  . Aggregate A1  .
              \ . . . . . . . . .
Figure 30

Obviously, connectivity to the rest of the Internet for each part of A1 should be maintained.

Whether an Aggregate is partitioned or not is visible externally via the "Partition Identifier" field of the Aggregate ID (see section 4.1.4, "Aggregate Identifier"). When all advertisements with the same Aggregate ID have the same Partition Identifier value, the Aggregate is not partitioned. When an external ABR receives advertisements from the same Aggregate, but with different Partition Identifiers, then the Aggregate has partitioned. When a partition occurs, the external ABRs temporarily treat the Aggregate as two separate Aggregates, each with some of the content of the formerly-whole Aggregate.

In the topology of Figure 29 and Figure 30, suppose that ABR R1 has ABR ID value 1 and R2 has 2. Before the partition, A2 would get advertisements from R1 and R2 with Aggregate ID A1.1 (Aggregate A1, Partition ID 1). A2 knows that the Aggregate is "whole". Once A1 is partitioned, A2 would get one advertisement with Aggregate ID A1.1 and a second with ID A1.2. This indicates that the Aggregate has partitioned and A2 can take appropriate measures.

This mechanism requires that an Aggregate's ABRs cooperate with each other in determining the proper Partition ID to use. The mechanisms to do this are a matter of protocol design.

An alternate method would be to "heal" the partition by reconnecting the two parts via an external path, built with a tunnel (such as MPLS).

4.2.9 Hiding of Aggregates

Aggregates are hidden in order to reduce the amount of information carried in the routing protocols and processed by the routing algorithms. When one Aggregate, A1, hides another Aggregate, A2:

   o A1 advertises all of the Destinations in A2 (and the Aggregates contained in A2, recursively...)
   o A1 undertakes to carry any and all traffic to all of A2's destinations.
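The recursive nature of the first rule can be illustrated with a short sketch. This Python is only an illustration under stated assumptions: the function names, the dictionaries used to hold contents and child relationships, and the sample Aggregates are hypothetical and are not defined by ISLAY. The sketch also assumes that the containment hierarchy has no cycles.

```python
def destinations_in(aggregate, children, contents):
    """Collect the Destinations of an Aggregate and of everything it contains."""
    found = set(contents.get(aggregate, ()))
    for child in children.get(aggregate, ()):
        found |= destinations_in(child, children, contents)
    return found


def advertisement_when_hiding(hider, hidden, children, contents):
    """Destinations the hiding Aggregate advertises as its own content.

    The result is the hider's own Destinations plus, recursively, all
    Destinations of each hidden Aggregate.
    """
    advertised = set(contents.get(hider, ()))
    for aggregate in hidden:
        advertised |= destinations_in(aggregate, children, contents)
    return sorted(advertised)


# Illustrative data only: A2 contains a child Aggregate A3.
contents = {"A1": ["D1"], "A2": ["D2", "D3"], "A3": ["D4"]}
children = {"A2": ["A3"]}
hidden_view = advertisement_when_hiding("A1", ["A2"], children, contents)
```

With this sample data, A1 advertises D1 through D4, and neither A2 nor A3 appears in what the rest of the network sees.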
4.2.10 Policies

There are two kinds of policies, routing data policies and traffic policies.

There is no single mechanism that implements policies. Policies are implemented by a variety of mechanisms. These mechanisms are parts of the other components of ISLAY.

4.2.10.1 Routing Data Policies

Routing Data Policies are policies that control the reception, internal distribution, and transmission of routing data by an Aggregate's ABRs. [2] enumerates a set of topics covered by Routing Data Policy functions:

   o Selecting to which others routing information will be transmitted.
   o Specifying the "granularity" and type of transmitted information. The length of IPv4 prefixes is an example of "granularity".
   o Selection and filtering of topology and service information that is transmitted. This gives different 'views' of internal structure and topology to different peers.
   o Selecting the level of security and authenticity for transmitted information.
   o Being able to cause the level of detail that is visible for some portion of the network to reduce the farther you get from that part of the network.
   o Selecting from whom routing information will be accepted. This control should be "provisional" in the sense of "accept routes from "foo" only if there are no others available".
   o Accepting or rejecting routing information based on the path the information traveled (using the current system as an example, this would be filtering routes based on an AS appearing anywhere in the AS path). This control should be "provisional" in the sense of "accept routes that traverse "foo" only if there are no others available".
   o Selecting the desired level of "granularity" for received routing information (this would include, but is not limited to, things similar in nature to the prefix-length filters widely used in the current routing and addressing system).
   o Selecting the level of security and authenticity of received information in order for that information to be accepted.
   o Determining the treatment of received routing information based on attributes supplied with the information.
   o Applying attributes to routing information that is to be transmitted and then determining treatment of information (e.g., sending it "here" but not "there") based on those tags.
   o Selection and filtering of topology and service information that is received.

These mechanisms are all primarily protocol design and implementation issues.

4.2.10.2 Traffic Policies

Traffic policies are policies that affect the flow of traffic into and across an Aggregate.

4.2.10.2.1 Metrics

Metrics may be placed on several different parts of ISLAY. These metrics are used by the topology calculations to select paths. Some of the metrics that may be attached, and some of their uses, are:

   1. Links
      This is the "traditional" place that metrics are placed.

   2. Aggregates (for transit traffic)
      This would be used to bias the flow of transit traffic to or away from a particular Aggregate.

   3. Destinations within an aggregate
      These metrics would be used to bias traffic to certain Destinations towards or away from the Aggregate. This would be very useful when a Destination is multi-homed and wishes to have its traffic reach it via a "primary" service provider.

   4. Destination/Entry-point pairs
      These metrics would bias traffic for certain Destinations in the Aggregate toward or away from certain Entry Points. This might be done to try to get traffic going to the Destination to enter the Aggregate at the Entry Point "nearest" that Destination.

   5. Entry points
      These metrics would bias traffic toward or away from a specific entry point regardless of its destination.

Besides a finite numeric range (e.g., 1 to 255), there must be two "special" values for the metrics:

   1. Infinity
      This value indicates that the thing to which the metric is attached is "unreachable" or can not be transited. This stops traffic from flowing via the affected network element.

   2. Very Large
      This value can be thought of as "infinity-1". It means that the object is reachable (passable) but that it is not to be used unless no other path is available.

4.2.10.2.2 Multi-Path

One important desired policy is to use all available paths to carry traffic to a particular Aggregate. In order to meet this goal:

   1. The routing protocols and algorithms must support the ability to find and use multiple equivalent paths to a destination.
   2. All routers must support some form of equal-cost, multi-path (ECMP) forwarding.

When multiple paths are supported, certain anomalous traffic patterns can arise. Consider the topology in Figure 31:

           A2----L4----A3
          /  \           \
         /    \           \
        L1     \           L6
       /        \           \
      /          \           \
    A1           L3           A6
      \            \         /
       \            \       /
        L2           \     L7
         \            \   /
          \            \ /
           A4----L5----A5

Figure 31

In this topology, A1 may have traffic going to A6. It might send half the traffic via L1 and half via L2. A2 would receive 1/2 of the traffic and split it in half again. Thus 1/4 of the A1-A6 traffic would go via L4/L6 and 1/4 via L3/L7. A5 would see 3/4 of the A1-A6 traffic; 1/4 would come in via L3, 1/2 via L5. Thus, 1/4 of the traffic would arrive at A6 via L6, 3/4 via L7.

Note that there is no way to balance the traffic such that all links are equally loaded.

4.2.10.2.3 Transit

Aggregates may have rules defining what traffic they will let cross their networks.
They may wish to limit traffic entering their networks to

   - Traffic going only to contained Destinations and Child Aggregates,
   - Traffic going to certain, select, other Aggregates, or
   - Traffic going to adjacent Aggregates

If the routing protocols are Distance-Vector, then the Aggregate can enforce these policies simply by not including in the topology and Content advertisements Aggregates and Destinations to which it will not send traffic.

If the protocols are link-state, then a mechanism is required for Aggregates to inform others of the policies. One possible mechanism is to include transit policy information in the topology advertisements. Others may be possible.

5 Performance Considerations

Performance is a critical problem in the current architecture. A major goal of ISLAY is to improve the performance characteristics of the routing algorithms and protocols. There are two paths to improved performance:

   o Elements of ISLAY which, fundamentally, lead to improvements in performance (assuming that they are not misused)
   o Attributes and features of the routing protocols that either directly improve performance or allow implementation strategies that can improve performance. These attributes are not, per se, a part of ISLAY. They are called out here, however, in order to guide the development of the protocols themselves.

5.1 Reduction In Quantity of Data

The propagation of content and connectivity information across a large network, by all of the ABRs, could lead to a large amount of traffic, both in terms of number of packets and number of bytes. It is quite possible that, if the Internet grows "too large", all of the ABRs could spend all of their time doing nothing but receiving and propagating routing information.

ISLAY is designed to provide mechanisms to allow network operators to reduce the amount of data that is propagated to perform routing.
The primary mechanism is the combining of a number of distinct destinations into a single Aggregate and then doing the topological calculations just once, for that Aggregate.

In addition, the network protocols should be designed so that they can rapidly and expeditiously communicate:

   o That no changes have taken place in the network
   o That changes have taken place and what those changes are.

5.2 Convergence

Another aspect is the time it takes for routers to converge after topology changes. We believe that this is addressed by grouping many destination IDs (such as IP Address prefixes) into a single topological entity (the Aggregate). Thus, when a topology change affecting the entity occurs, the router needs to do its topology calculations once and then apply the result to all destinations.

5.3 Forwarding Table

We do not believe that it is critical to optimize the size and 'depth' of the Forwarding Tables in routers. It is quite easy to build large (millions of entries), fast (25-100M lookups/second), forwarding tables. No attempts have been made to optimize in this dimension (for instance, by combining longer IP Address prefixes into shorter ones).

In the Internet, the destinations will be IPv4 and IPv6 prefixes. It is quite possible that, with poor allocation policies, the number of these prefixes will grow to be too large. This can be dealt with only by better allocation policies on the part of IANA and the regional address registries.

The goal of ISLAY is to split topology from IP Addressing, which has been accomplished. Therefore, the routing calculations and topology tables no longer scale as a function of the number of IP Address prefixes in the Internet.

By splitting the topology name space from IP Addressing, ISLAY allows new IP address allocation policies to be placed in effect without requiring changes to the routing system.
These policies could provide FT size reduction, should that become necessary.

5.4 Rope

ISLAY provides a good deal of rope with which network designers and operators may hang themselves. The capabilities provided by ISLAY are diverse and powerful. It is possible for administrators to incorrectly configure their systems. This misconfiguration could limit, or even eliminate, the potential performance gains. Making ISLAY "bulletproof" against configuration errors would limit its use as an inter-domain routing system.

6 Security Considerations

In general, security considerations apply to protocol specifications and this document is not a protocol specification. However, we can identify areas of ISLAY where security will raise its ugly and complicated head, and maybe offer some suggestions for addressing those concerns.

Attacks

Routing systems have shown themselves to be juicy targets. If one can bring down the routing system then one has brought down the network. The routing protocols MUST protect against

   o In-transit modification of data by unauthorized parties (ABRs who are not peers)
   o Injecting data by unauthorized parties (ABRs who are not peers)
   o Deleting data in-transit
   o Replay attacks by bad-guys

Spoofing

It is possible for an Aggregate to advertise that it contains a prefix when, in fact, it does not. We note that this is no different than the current routing architecture. There is no "proof" that a router can advertise a specific prefix in the current routing protocols.

To solve this problem, content advertisements could be cryptographically signed. Each Destination ID in a record could be signed with keys unique to that Destination ID. Of course, this requires a central, trusted, location where keys can be obtained and used to verify the record. This does lead to problems, of course, when an Aggregate is hidden.
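A rough sketch of per-Destination signing follows. It is an illustration only and rests on several assumptions: ISLAY defines no signature format, so the message layout, the function names, and the key directory are all hypothetical. For brevity the sketch uses an HMAC from the Python standard library as a stand-in; a real design would use public-key signatures so that verifiers never hold the signing key. A missing key yields "unverified" rather than a rejection, reflecting the point made in section 4.2.3.6 that the information is signed, not encrypted, and can still be used when the key service is unavailable.

```python
import hashlib
import hmac


def sign_entry(aggregate_id, destination_id, key):
    """Tag the statement 'aggregate_id contains destination_id'.

    HMAC-SHA256 is used here only as a stand-in for a real signature.
    """
    message = f"{aggregate_id}|{destination_id}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def check_entry(aggregate_id, destination_id, tag, key_directory):
    """Return 'valid', 'invalid', or 'unverified' for one advertised entry."""
    key = key_directory.get(destination_id)
    if key is None:
        # No key available: the entry cannot be checked but is still usable.
        return "unverified"
    expected = sign_entry(aggregate_id, destination_id, key)
    return "valid" if hmac.compare_digest(expected, tag) else "invalid"


# Hypothetical key directory holding one key per Destination ID.
keys = {"D1": b"key-for-D1"}
tag = sign_entry("A1", "D1", keys["D1"])

genuine = check_entry("A1", "D1", tag, keys)    # A1 really contains D1
spoofed = check_entry("A2", "D1", tag, keys)    # A2 replays A1's tag
no_key = check_entry("A1", "D9", tag, keys)     # key service has no key for D9
```

Because the Aggregate ID is part of the signed statement, a tag issued for A1 does not validate a claim by A2 that it contains the same Destination.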
ABR Trust

If an ABR allowed any node to connect to it and say "Hi, I'm an ABR", then all kinds of mischief may occur. To get around this, ISLAY requires that ABR peering relationships be manually established. In addition, manual keying can be done, allowing the data exchanged in the relationship to be cryptographically signed. This would ensure that A) the relationship is correct (i.e., the two human/administrative sides of the relationship actually want it) and B) the connection has not been hijacked nor bogus data inserted.

Aggregate Trust

Aggregates have to trust one another. They have to assume that the topology and content information they receive is "correct". Most significantly, there is no way to tell that an Aggregate is truthfully advertising its contents or that an Aggregate has not changed the advertisements of another Aggregate. The only way to do this would be to cryptographically sign all the advertisements. This would require some well-known, central, authority to validate the signatures, which cannot be done.

Confidentiality is generally not believed to be important in routing protocols and architectures. Usually, networks that "have something to hide" don't tell other networks in the first place, so the protocol therefore does not need encryption.

Unfortunately, the one big hole in routing systems is that once A believes that B is a "good guy", A is then completely open to attack. B could do just about anything. And, worse, since C trusts A, B's attack on A could carry through to C (and D and E and...)

6.1 Peering

Aggregate Peering is the main relationship between routers in ISLAY. If this relationship is not secured then the whole architecture is open to attack. Thus, ISLAY explicitly requires that

   1. Peering Relationships be defined by administrators. Automatic discovery is expressly forbidden.
      While this tends to increase the work of the network administrators, it adds an element of positive control,
   2. The identity of the ABRs involved in Peering Relationships be cryptographically authenticated. This is to prevent hijacking of relationships.

This explicit configuration does add to the workload of the network administrators and increases the probability of errors. However, the improved security of the Internet's routing system as a whole is worth the cost.

7 For Further Study

There are several issues that this document does not directly address, yet are ripe for further study to see either how they fit into ISLAY as explained in this document or how the architecture can be extended to support them:

   1. How do MPLS labels fit into ISLAY?
   2. Do Destination Identifiers need to be globally unique?
   3. Building virtual aggregate-to-aggregate links across other aggregates. This may be useful for doing virtual private networks and traffic engineering.
   4. How to fit with MPLS and the other SUB-IP technologies.
   5. QOS
   6. Multicast
   7. VPNs
   8. Anycast
   9. Modifying existing routing protocols to support ISLAY, as opposed to developing new ones.
   10. Reverse-path-checking.
   11. Traffic Engineering
   12. Automatic creation of Aggregates and Hierarchies.
   13. Default routes
   14. ?

8 IANA Considerations

This section covers Architectural issues related to allocation of Aggregate Identifiers and IP Addresses. This should not be considered a "final" IANA Considerations since this is an architecture document, not a protocol one. Protocol specifications will do the "final" IANA Considerations section.

8.1 Aggregate Identifiers

Aggregate Identifiers are global, so a central allocator, such as IANA, must allocate them.

We currently believe that the number of Aggregates will be kept fairly small, certainly less than 100,000, very possibly less than 10,000.
Thus, there is no need to give the identifiers topological significance. Therefore, the Aggregate Identifiers could be allocated sequentially.

IANA may wish to allocate blocks of Aggregate IDs to regional and local registries so as to distribute the workload. IANA may also wish to allocate blocks to some Aggregates, allowing the Aggregates themselves to assign Aggregate IDs to their children.

8.2 Addresses

IP Address allocation can proceed in any way desired by IANA and the various registries. ISLAY places no requirements or restrictions on this. In particular, IP addresses would no longer be used as topologically sensitive identifiers. Addresses can be allocated in ways that maximize allocation efficiency rather than topological efficiency.

8.3 Protocol Identifiers

ISLAY includes some protocol-specific values, such as Destination IDs. These values generally must be tagged to indicate which protocol family they belong to. These tags must be allocated by IANA. A simple, flat, number space is probably adequate.

9 MPLS

We do not see MPLS as a fundamental part of the routing and addressing architecture. There are two aspects of MPLS that are of interest to ISLAY -- MPLS LSP setup and use of MPLS LSPs by the routing system.

First, MPLS LSP setup is performed by a separate application (MPLS Signaling) that is layered on top of the core routing and addressing system. MPLS Signaling would have "read access" to the routing and addressing data in routers and would use that data to set up tunnels.

The second facet is use of MPLS LSPs by the routing system. When LSPs are created, they appear to the routing system as point-to-point links from the LSP-ingress to the LSP-egress. It would appear the same as having a new PPP link installed directly from the ingress router to the egress router.
There probably will be very severe traffic policies applied to the LSP, based on the traffic engineering, etc, requirements of the organization(s) creating the LSP. The LSP may not even be visible beyond a select group of routers. (This is all a local policy decision.)

10 Multicast

As with MPLS, we see multicast as an application layered on top of the basic architecture. The multicast routing protocols would use the topology database generated by the architecture to determine how to build their distribution trees.

   - hierarchy of the aggregates can be the basis for the mcast tree
   - scoping of aggregates (you don't see external topology) limits the work of mcast protocols -- they only have to get mcasts to all 'exits' from the aggregate (either children, parents, or peers), and any directly contained destinations.

11 Requirements Considerations

This section evaluates ISLAY against the requirements given in [2] and [3]. Each of the following sections describes how ISLAY addresses a requirement from [2] or [3].

This section only discusses how ISLAY meets (or doesn't meet) the requirements. The actual protocols require their own review against the requirements.

11.1 Evaluation against [2]

The following subsections evaluate the architecture in relation to the requirements in [2].

11.1.1 Architecture

This requirement states that there must be a "clear, well thought out, architecture". This note is that architecture.

11.1.2 Separable Components

[2] requires that the architecture place different functions into different components. This architecture separates forwarding and topology management. It adds new objects to the network to represent network topology (Aggregates). It has separate namespaces for the topology objects. These features of the architecture meet the requirement for separate components.
11.1.3 Scalable

The architecture meets the scaling requirements in three ways.

   1. First, it creates a separate class of objects, Aggregates, which are used by the topology management system. This decouples the growth of the end-sites of the network from the load placed on the topology calculations.
   2. Second, it allows a hierarchical structure, based on Aggregates, to be developed. This structure would allow considerable information hiding and abstraction. This structure can limit the scope of some topology information, further reducing the load on the topology calculations.
   3. Third, the architecture supports multiple network layer protocols. IPv4, IPv6, and possibly MPLS, can all be routed by a single system. This eliminates the need for separate routing systems for each network layer protocol. Eliminating protocols reduces the overall load on routers.

11.1.4 Lots of Interconnectivity

There are no apparent limitations in the architecture that would preclude supporting networks with high degrees of interconnectivity. Normally this would lead to large and complex tables, but the ability to partition the network into Aggregates and build hierarchies of these Aggregates serves to limit the scope of this complexity. Therefore, the load on any one router should not be excessive.

When there are multiple, valid, paths to a destination the architecture does not limit the use of those paths. The actual routing protocols and implementations may do so, however.

11.1.5 Random Structure

The architecture does not assume or require any particular structure of the Internet.

11.1.6 Convergence

Convergence time is primarily a function of the complexity of the topology database.
The architecture provides several mechanisms that can limit the size and complexity of this database.

The first mechanism is the ability to combine many Destinations (IP Subnets/Prefixes) into a single Aggregate. Thus, the cost of doing the routing calculations grows as a function of the number of Aggregates (which can easily be controlled) rather than as a function of the number of Prefixes.

The second mechanism is the ability to build hierarchies of Aggregates. This mechanism has two roles. It limits the scope of a topology change (limiting the number of routers that have to process that change) and it limits the number of Aggregates seen by a router.

We believe that these two features, WHEN PROPERLY USED, can adequately limit convergence times.

11.1.7 Routing System Security

Section 6, "Security Considerations", discusses the security features of the architecture. We believe that at the architectural level, security features are provided.

11.1.8 End Host Security

The architecture does not require that routers examine encrypted parts of packets. ESP will continue to work.

The architecture separates the forwarding operations and data from the topology system. The format of the IPv6 address, in particular the low-order 64 bits which are sometimes used as a host-ID, can be changed without affecting the routing system. Any enhancements to privacy and security that are believed to accrue from changing the IPv6 address will still be available.

11.1.9 Rich Policy

Section 4.2.10, "Policies", discusses the policy features and capabilities of the architecture.

11.1.10 Incremental Deployment

We believe that the architecture supports incremental deployment. The final answer depends on the protocol specifications. However, one possible scenario might map the current autonomous systems onto Aggregates. Some of these will run the "new" protocols.
The remaining part(s) of the Internet would, in effect, be their own default Aggregate. As the protocols and architecture prove themselves, more aggregates will be added. Eventually, these independent islands will grow and start to "touch". In effect, the "old architecture" will be squeezed out of the Internet.

The protocols, when they are finally developed, MUST describe this process in detail.

11.1.11 Multi-homing

The architecture implicitly provides for multihoming by allowing Destinations to appear on multiple Aggregates. See section 4.2.6, "Multi-Homing".

11.1.12 Multi-path

As described in section 11.1.4, "Lots of Interconnectivity", the architecture places no restrictions on the number of paths that may be used to a given destination. The protocols, algorithms, and implementations may place restrictions that are beyond the control of the architecture.

11.1.13 Mobility

Section 4.2.7, "Mobility", describes how the architecture supports mobility. Additional work can be done to optimize certain facets of mobility, such as re-forwarding "in-flight" traffic. However, these are only optimizations. The basic functions are still there.

The architecture does not impede or inhibit the current Mobile-IP protocols.

11.1.14 Address Portability

The architecture supports address portability by removing IP Prefixes from the topology calculations. The Aggregate Content Binding, a new function, explicitly maps IP Prefixes to the Aggregate they reside in.

11.1.15 Multi-Protocol

The architecture is inherently multi-protocol in that the routing and topology calculations work on protocol-independent objects (Aggregates). Only at a very late stage (building the FT) are protocol-dependent objects (such as IP Prefixes) bound to the topology.

11.1.16 Abstraction

The architecture's Aggregates provide the essential capabilities for this requirement.
Aggregates group together "contained" elements for administrative (among other) purposes. The internal structures of aggregates can be hidden, or only small parts of that structure revealed. Aggregates support transit rules. We believe that aggregates can map one-to-one to the current Autonomous Systems (see section 11.1.10, "Incremental Deployment", for more information).

11.1.17 Administrative Entities and the EGP/IGP split

The architecture does not make an explicit Interior/Exterior distinction. Hierarchical structure of Aggregates is supported. More than two levels of hierarchy are possible and are expected. The architecture applies equally well at all levels of a hierarchy.

11.1.18 Simplicity

Though there is no existence proof of how fast Radia can explain the architecture, section 3, "Overview of ISLAY", fully describes the basic principles of the architecture and is only two pages long. We hope that Radia can read faster than 30 minutes per page.

11.1.19 Media Independence

This specification does not mention any Layer-2 issues or constructs at all. Therefore, the architecture does not depend on any specific Layer-2 concepts or capabilities, meeting this requirement.

11.1.20 Stand-alone

This specification includes no other components of the Internet. Thus, since the components are not mentioned, there can be no reliance on them and the architecture therefore meets this requirement.

Section 4.2.3.6, "Content Validation", discusses issues surrounding the validity of content advertisements. The architecture does not provide strong validation of content advertisement correctness. To solve this problem, a central digital certificate authority may be used. Doing so would violate this requirement.

11.1.21 Safety of Configuration

We believe that this requirement is more appropriately addressed in the specification, design, and implementation of the protocols.
We do note that proper network design and configuration are required in order to gain the advantages of the architecture. For example, it is quite possible to place each /32 IPv4 address in its own Aggregate and then export those aggregates to the rest of the network. This is not a good plan for building a large, scalable system.

11.1.22 Renumbering of Subnets

When a subnet is renumbered (i.e., assigned a new IP Prefix and therefore a new Destination ID), the content advertisement showing that prefix changes to reflect the new prefix. Thus, subnets may be renumbered.

We suggest that any protocols implementing the architecture include mechanisms to explicitly propagate the renumbering operation. That is, there should be a "Destination X is now known as Destination Y" sort of operation. But this is not necessary. Besides, it is outside of the scope of the architecture.

11.1.23 Multi-prefix Subnets

There is nothing in the architecture that prohibits multi-prefix subnets.

11.1.24 Cooperative Anarchy

There are no "central control points" defined in the architecture. Administratively autonomous entities (e.g., service providers) are free to design their networks and route traffic within their networks as they see fit.

The only central point is IANA/ICANN (and its delegates), for handing out Aggregate IDs and Destination IDs (IP prefixes). As this is well accepted today, we believe that it does not violate either the letter or the spirit of this requirement.

11.1.25 Network Layer Protocols and Forwarding Model

The architecture does not require any new or modified forwarding model. The traditional "hop by hop" paradigm works for IP.

11.1.26 Routing Algorithm

The architecture is specified without reference to a particular routing protocol or algorithm family.
There is nothing in the fundamental architecture that requires one algorithm or the other, though some features may work more efficiently with one or the other algorithm.

11.1.27 Positive Benefit

It is difficult to explain how this criterion is met. As we've stated in other parts of this chapter, the architecture can provide network administrators with much better control over their routing. The architecture also can have much better scaling properties than the current one.

The architecture is easily deployable over the current Internet. The architecture does not require changes in the network layer protocol nor does it require new procedures for allocating IP Addresses (much less would it require that the current address allocations be re-done!).

11.2 Evaluation against [3]

The following subsections evaluate the architecture in relation to the requirements in [3].

11.2.1 TBD

12 References

[1] Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

[2] Kastenholz, F., Ed., Routing Research Group, "Requirements For a New Inter-Domain Routing and Addressing Architecture", draft-irtf-routing-reqs-groupa-00.txt, Work in Progress.

[3] Doria, A. and E. Davies, Eds., "Future Domain Routing Requirements", Group B contribution, draft-irtf-routing-reqs-groupb-00.txt, Work in Progress.

13 Acknowledgments

Moe, Larry, and Curley

14 Author's Addresses

Frank Kastenholz
Unisphere Networks
10 Technology Park
Westford, MA, 01886, USA
Phone: +1 978 589 0286
Email: fkastenholz@unispherenetworks.com