Network Working Group                                    F. Templin, Ed.
Internet-Draft                              Boeing Research & Technology
Intended status: Informational                           August 05, 2011
Expires: February 6, 2012

              The Internet Routing Overlay Network (IRON)
                      draft-templin-ironbis-01.txt

Abstract

Since the Internet must continue to support escalating growth due to increasing demand, it is clear that current routing architectures and operational practices must be updated. This document proposes an Internet Routing Overlay Network (IRON) architecture that supports sustainable growth while requiring no changes to end systems and no changes to the existing routing system. IRON further addresses other important issues including routing scaling, mobility management, multihoming, traffic engineering and NAT traversal. While business considerations are an important determining factor for widespread adoption, they are out of scope for this document.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on February 6, 2012.

Copyright Notice

Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Templin                 Expires February 6, 2012                [Page 1]
Internet-Draft                    IRON                       August 2011

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  The Internet Routing Overlay Network
     3.1.  IRON Client
     3.2.  IRON Serving Router
     3.3.  IRON Relay Router
   4.  IRON Organizational Principles
   5.  IRON Initialization
     5.1.  IRON Relay Router Initialization
     5.2.  IRON Serving Router Initialization
     5.3.  IRON Client Initialization
   6.  IRON Operation
     6.1.  IRON Client Operation
     6.2.  IRON Serving Router Operation
     6.3.  IRON Relay Router Operation
     6.4.  IRON Reference Operating Scenarios
       6.4.1.  Both Hosts within Same IRON Instance
       6.4.2.  Mixed IRON and Non-IRON Hosts
       6.4.3.  Hosts within Different IRON Instances
     6.5.  Mobility, Multiple Interfaces, Multihoming, and Traffic Engineering Considerations
       6.5.1.  Mobility Management
       6.5.2.  Multiple Interfaces and Multihoming
       6.5.3.  Inbound Traffic Engineering
       6.5.4.  Outbound Traffic Engineering
     6.6.  Renumbering Considerations
     6.7.  NAT Traversal Considerations
     6.8.  Multicast Considerations
     6.9.  Nested EUN Considerations
       6.9.1.  Host A Sends Packets to Host Z
       6.9.2.  Host Z Sends Packets to Host A
   7.  Implications for the Internet
   8.  Additional Considerations
   9.  Related Initiatives
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Appendix A.  IRON VPs over Internetworks with Different Address Families
   Appendix B.  Scaling Considerations
   Author's Address

1.  Introduction

Growth in the number of entries instantiated in the Internet routing system has led to concerns regarding unsustainable routing scaling [RADIR]. Operational practices such as the increased use of multihoming with Provider-Independent (PI) addressing are resulting in more and more fine-grained prefixes being injected into the routing system from more and more end user networks.
Furthermore, depletion of the public IPv4 address space has raised concerns for both increased address space fragmentation (leading to yet further routing table entries) and an impending address space run-out scenario. At the same time, the IPv6 routing system is beginning to see growth [BGPMON] which must be managed in order to avoid the same routing scaling issues the IPv4 Internet now faces. Since the Internet must continue to scale to accommodate increasing demand, it is clear that new routing methodologies and operational practices are needed.

Several related works have investigated routing scaling issues. Virtual Aggregation (VA) [GROW-VA] and Aggregation in Increasing Scopes (AIS) [EVOLUTION] are global routing proposals that introduce routing overlays with Virtual Prefixes (VPs) to reduce the number of entries required in each router's Forwarding Information Base (FIB) and Routing Information Base (RIB). Routing and Addressing in Networks with Global Enterprise Recursion (RANGER) [RFC5720] examines recursive arrangements of enterprise networks that can apply to a very broad set of use-case scenarios [RFC6139]. IRON specifically adopts the RANGER Non-Broadcast, Multiple Access (NBMA) tunnel virtual-interface model, and uses Virtual Enterprise Traversal (VET) [INTAREA-VET] and the Subnetwork Encapsulation and Adaptation Layer (SEAL) [INTAREA-SEAL] as its functional building blocks.

This document proposes an Internet Routing Overlay Network (IRON) architecture with goals of supporting sustainable growth while requiring no changes to the existing routing system. IRON borrows concepts from VA and AIS, and further borrows concepts from the Internet Vastly Improved Plumbing (Ivip) [IVIP-ARCH] architecture proposal along with its associated Translating Tunnel Router (TTR) mobility extensions [TTRMOB]. Indeed, the TTR model to a great degree inspired the IRON mobility architecture design discussed in this document.
The Network Address Translator (NAT) traversal techniques adapted for IRON were inspired by the Simple Address Mapping for Premises Legacy Equipment (SAMPLE) proposal [SAMPLE].

IRON supports scalable addressing without changing the current BGP [RFC4271] routing system. IRON observes the Internet Protocol standards [RFC0791][RFC2460], while other network-layer protocols that can be encapsulated within IP packets (e.g., OSI/CLNP (Connectionless Network Protocol) [RFC1070], etc.) are also within scope.

IRON is a global routing system comprising Virtual Service Provider (VSP) overlay networks that service Virtual Prefixes (VPs) from which End User Network (EUN) prefixes (EPs) are delegated to customer sites. IRON is motivated by a growing customer demand for multihoming, mobility management, and traffic engineering while using stable addressing to minimize dependence on network renumbering [RFC4192][RFC5887]. IRON VSP overlay network instances use the existing IPv4 and IPv6 global Internet routing systems as virtual NBMA links for tunneling inner network protocol packets within outer IPv4 or IPv6 headers (see Section 3). Each IRON instance requires deployment of a small number of new BGP core routers and supporting servers, as well as IRON-aware clients that connect customer EUNs. No modifications to hosts, and no modifications to most routers, are required.

The following sections discuss details of the IRON architecture.

2.  Terminology

This document makes use of the following terms:

End User Network (EUN): an edge network that connects an organization's devices (e.g., computers, routers, printers, etc.) to the Internet.

End User Network Prefix (EP): a more specific inner network-layer prefix (e.g., an IPv4 /28, an IPv6 /56, etc.) derived from an aggregated Virtual Prefix (VP) and delegated to an EUN by a Virtual Service Provider (VSP).
End User Network Prefix Address (EPA): a network-layer address belonging to an EP and assigned to the interface of an end system in an EUN.

Forwarding Information Base (FIB): a data structure containing network prefixes to next-hop mappings; usually maintained in a router's fast-path processing lookup tables.

Internet Routing Overlay Network (IRON): the union of all VSP overlay network instances. Each such IRON instance supports routing within the overlay through encapsulation of inner packets with EPA addresses within outer headers that use locator addresses. Each IRON instance connects to the global Internet the same as for any autonomous system.

IRON Client Router/Host ("Client"): a customer's router or host that logically connects the customer's EUNs and their associated EPs to an IRON instance via an NBMA tunnel virtual interface.

IRON Serving Router ("Server"): a VSP's IRON instance router that provides forwarding and mapping services for the EPs owned by customer Clients.

IRON Relay Router ("Relay"): a VSP's IRON instance router that acts as a relay between the IRON and the native Internet.

IRON Agent (IA): generically refers to any of an IRON Client/Server/Relay.

Internet Service Provider (ISP): a service provider that connects customer EUNs to the underlying Internetwork. In other words, an ISP is responsible for providing basic Internet connectivity for customer EUNs.

Locator: an IP address assigned to the interface of a router or end system within a public or private network. Locators taken from public IP prefixes are routable on a global basis, while locators taken from private IP prefixes are made public via Network Address Translation (NAT).

Routing and Addressing in Networks with Global Enterprise Recursion (RANGER): an architectural examination of virtual overlay networks applied to enterprise network scenarios, with implications for a wider variety of use cases.
Subnetwork Encapsulation and Adaptation Layer (SEAL): an encapsulation sublayer that provides extended packet identification and a Control Message Protocol to ensure deterministic network-layer feedback.

Virtual Enterprise Traversal (VET): a method for discovering border routers and forming dynamic tunnel-neighbor relationships over enterprise networks (or sites) with varying properties.

Virtual Prefix (VP): a prefix block (e.g., an IPv4 /16, an IPv6 /20, an OSI Network Service Access Point (NSAP) prefix, etc.) that is owned and managed by a Virtual Service Provider (VSP).

Virtual Service Provider (VSP): a company that owns and manages a set of VPs from which it delegates EPs to EUNs.

VSP Overlay Network: a specialized set of routers deployed by a VSP to service customer EUNs through an IRON instance configured over an underlying Internetwork (e.g., the global Internet).

3.  The Internet Routing Overlay Network

The Internet Routing Overlay Network (IRON) is a union of Virtual Service Provider (VSP) overlay network instances connected to a common Internetwork. While the principles presented in this document are discussed within the context of the public global Internet, they can also be applied to any autonomous Internetwork. The rest of this document therefore refers to the terms "Internet" and "Internetwork" interchangeably except in cases where specific distinctions must be made.

Each IRON instance consists of IRON Agents (IAs) that automatically tunnel the packets of end-to-end communication sessions within encapsulating headers used for Internet routing. IAs use the Virtual Enterprise Traversal (VET) [INTAREA-VET] virtual NBMA link model in conjunction with the Subnetwork Encapsulation and Adaptation Layer (SEAL) [INTAREA-SEAL] to encapsulate inner network-layer packets within outer headers, as shown in Figure 1.
                                      +-------------------------+
                                      |    Outer headers with   |
                                      ~     locator addresses   ~
                                      |     (IPv4 or IPv6)      |
                                      +-------------------------+
                                      |       SEAL Header       |
      +-------------------------+     +-------------------------+
      |   Inner Packet Header   | --> |   Inner Packet Header   |
      ~    with EP addresses    ~ --> ~    with EP addresses    ~
      | (IPv4, IPv6, OSI, etc.) | --> | (IPv4, IPv6, OSI, etc.) |
      +-------------------------+     +-------------------------+
      |                         | --> |                         |
      ~    Inner Packet Body    ~ --> ~    Inner Packet Body    ~
      |                         | --> |                         |
      +-------------------------+     +-------------------------+

         Inner packet before             Outer packet after
            encapsulation                   encapsulation

     Figure 1: Encapsulation of Inner Packets within Outer IP Headers

VET specifies the automatic tunneling mechanisms used for encapsulation, while SEAL specifies the format and usage of the SEAL header as well as a set of control messages. Most notably, IAs use the SEAL Control Message Protocol (SCMP) to deterministically exchange and authenticate control messages such as route redirections, indications of Path Maximum Transmission Unit (PMTU) limitations, destination unreachables, etc. IAs appear as neighbors on an NBMA virtual link, and form bidirectional and/or unidirectional tunnel-neighbor relationships.

Each IRON instance comprises a set of IAs distributed throughout the Internet to serve highly aggregated Virtual Prefixes (VPs). VSPs delegate sub-prefixes from their VPs, which they lease to customers as End User Network Prefixes (EPs). In turn, the customers assign the EPs to their customer edge IAs, which connect their End User Networks (EUNs) to the VSP IRON instance.

VSPs may have no affiliation with the ISP networks from which customers obtain their basic Internet connectivity. Therefore, a customer could procure its summary network services either through a common provider or through separate entities.
In that case, the VSP can open for business and begin serving its customers immediately without the need to coordinate its activities with ISPs or other VSPs. Further details on business considerations are out of scope for this document.

IRON requires no changes to end systems or to most routers in the Internet. Instead, IAs are deployed either as new platforms or as modifications to existing platforms. IAs may be deployed incrementally without disturbing the existing Internet routing system, and act as waypoints (or "cairns") for navigating VSP overlay networks. The functional roles for IAs are described in the following sections.

3.1.  IRON Client

An IRON Client (or, simply, "Client") is a customer's router or host that logically connects the customer's EUNs and their associated EPs to its VSP's IRON instance via tunnels, as shown in Figure 2. Client routers obtain EPs from their VSPs and use them to number subnets and interfaces within their EUNs.

A Client can be deployed on the same physical platform that also connects the customer's EUNs to its ISPs, but it may also be a separate router or even a standalone server system located within the EUN. (This model applies even if the EUN connects to the ISP via a Network Address Translator (NAT) -- see Section 6.7). Finally, a Client may also be a simple end system that connects a singleton EUN and exhibits the outward appearance of a host.

[Diagram not reproduced: a Host in the EUN connects to the Client, which attaches through its ISP to the Internet and reaches the IRON Instance via tunnels.]

Figure 2: IRON Client Router Connecting EUN to IRON Instance

3.2.  IRON Serving Router

An IRON Serving Router (or, simply, "Server") is a VSP's router that provides forwarding and mapping services within the IRON instance for the EPs owned by customer Client routers. In typical deployments, a VSP will deploy many Servers around the IRON instance in a globally distributed fashion (e.g., as depicted in Figure 3) so that Clients can discover those that are nearby.

[Diagram not reproduced: Servers in Boston, Tokyo, Seattle, Paris, Moscow, Sydney, and Cairo, each attached to the IRON Instance.]

Figure 3: IRON Serving Router Global Distribution Example

Each Server acts as a tunnel-endpoint router that forms bidirectional tunnel-neighbor relationships with each of its Client customers and also serves as the tunnel egress of dynamically discovered unidirectional tunnel-neighbors. Each Server also associates with a set of Relays that can forward packets from the IRON out to the native Internet and vice versa, as discussed in the next section.

3.3.  IRON Relay Router

An IRON Relay Router (or, simply, "Relay") is a router that acts as a relay between the VSP's IRON instance and the native Internet. Therefore, it also serves as an Autonomous System Border Router (ASBR) that is owned and managed by the VSP.

Each VSP configures one or more Relays that advertise the company's VPs into the IPv4 and IPv6 global Internet BGP routing systems. Each Relay associates with all of the VSP's IRON instance Servers, e.g., via tunnels over the IRON instance, via a direct interconnect such as an Ethernet cable, etc. The Relay role is depicted in Figure 4.
[Diagram not reproduced: a Relay attached to the Internet, connected to some Servers directly (Ethernet) and to other Servers via tunnels over the IRON Instance.]

Figure 4: IRON Relay Router Connecting IRON Instance to Native Internet

4.  IRON Organizational Principles

The IRON consists of the union of all VSP overlay networks configured over a common Internetwork (e.g., the public Internet). Each such IRON instance represents a distinct "patch" on the Internet "quilt", where the patches are stitched together by standard Internet routing. When a new IRON instance is deployed, it becomes yet another patch on the quilt and coordinates its internal routing system independently of all other patches.

Each IRON instance maintains a set of Relays and Servers that provide services to Client customers. In order to ensure adequate customer service levels, the VSP should conduct a traffic scaling analysis and distribute sufficient Relays and Servers for the IRON instance globally throughout the Internet. Figure 5 depicts the logical arrangement of Relays, Servers, and Clients in an IRON instance.

[Diagram not reproduced: layers from top to bottom are the Internet, the Relays, the IRON Instance, the Servers, ISP A through ISP x, NATs, and the Clients and EUNs.]

Figure 5: IRON Organization

Each Relay connects the IRON instance directly to the IPv4 and IPv6 Internets. It also advertises the VSP's IPv4 VPs into the IPv4 BGP routing system and advertises the VSP's IPv6 VPs into the IPv6 BGP routing system. Relays will therefore receive packets with EPA destination addresses sent by end systems in the Internet and forward them via tunnels toward EPA-addressed end systems connected to the VSP's IRON instance.

Each VSP also manages a set of Servers that connect their Clients and associated EUNs to the IRON instance and to the IPv6 and IPv4 Internets via their associations with Relays. IRON Servers therefore need not be BGP routers themselves; they can be simple commodity hardware platforms. The Server and Relay functions can further be deployed together on the same physical platform as a unified gateway, or they may be deployed on separate platforms (e.g., for load balancing purposes).

Each Server maintains a working set of bidirectional tunnel-neighbor Clients for which it caches EP-to-Client mappings in its Forwarding Information Base (FIB). Each Server also, in turn, propagates the list of EPs in its working set to each of the Relays in the IRON instance via a dynamic routing protocol (e.g., an internal BGP instance that carries only the EP-to-Server mappings and does not interact with the external BGP routing system). Therefore, each Server only needs to track the EPs for its current working set of Clients, while each Relay will maintain a full EP-to-Server routing information base that represents reachability information for all EPs in the IRON instance.

Customers establish Clients that obtain their basic Internet connectivity from ISPs and connect to Servers to attach their EUNs to the IRON instance.
Each EUN can further connect to the IRON instance via multiple Clients as long as the Clients coordinate with one another, e.g., to mitigate EUN partitions. Unlike Relays and Servers, Clients may use private addresses behind one or several layers of NATs. Each Client initially discovers a list of nearby Servers, then forms a bidirectional tunnel-neighbor relationship with one of the Servers through an initial exchange followed by periodic keepalives.

After the Client selects a Server, it forwards initial outbound packets from its EUNs by tunneling them to the Server, which may, in turn, forward them to the nearest Relay within the IRON instance. The Client may subsequently receive redirect messages informing it of a more direct route through a different Server within the IRON instance that serves the final destination EUN. This Server in turn provides a unidirectional tunnel-neighbor egress for route optimization purposes.

The IRON can also be used to support VPs of network-layer address families that cannot be routed natively in the underlying Internetwork (e.g., OSI/CLNP over the public Internet, IPv6 over IPv4-only Internetworks, IPv4 over IPv6-only Internetworks, etc.). Further details for the support of IRON VPs of one address family over Internetworks based on other address families are discussed in Appendix A.

5.  IRON Initialization

Each IRON instance is initialized through the startup actions of IAs and customer EUNs. The following sub-sections discuss these startup procedures.

5.1.  IRON Relay Router Initialization

Each IRON Relay is provisioned with the list of VPs that it will serve, as well as the locators for all Servers within the IRON instance. The Relay is also provisioned with external BGP interconnections -- the same as for any BGP router.

Upon startup, the Relay engages in BGP routing exchanges with its peers in the IPv4 and/or IPv6 Internets the same as for any BGP router.
It then connects to all of the Servers in the IRON instance (e.g., via a secured TCP connection over a bidirectional tunnel, via an Internal BGP (IBGP) route reflector, etc.) for the purpose of discovering EP-to-Server mappings. After the Relay has fully populated its EP-to-Server mapping information database, it is said to be "synchronized" with respect to its VPs.

After this initial synchronization procedure, the Relay then advertises the VPs externally. In particular, the Relay advertises the IPv6 VPs into the IPv6 BGP routing system and advertises the IPv4 VPs into the IPv4 BGP routing system. The Relay then engages in ordinary packet-forwarding operations.

5.2.  IRON Serving Router Initialization

Each IRON Server is provisioned with the locators for all Relays within the IRON instance. Upon startup, each Server must connect to all of the Relays within the IRON instance (e.g., via a secured TCP connection, via an IBGP route reflector, etc.) for the purpose of reporting the list of EPs it is currently serving.

The Server then actively listens for Client customers that register their EPs as part of establishing a bidirectional tunnel-neighbor relationship. When a new Client connects, the Server announces the new EP routes to all Relays; when an existing Client disconnects, the Server withdraws its announcements.

5.3.  IRON Client Initialization

Each Client obtains one or more EPs in an initial secured exchange with the VSP, e.g., as part of the initial customer signup agreement. Upon startup, the Client connects to a location broker (e.g., a well-known website run by the VSP) to discover a list of nearby Servers.

After the Client obtains a list of nearby Servers, it initiates a short transaction with one or more Servers (e.g., via a secured TCP connection) in order to establish a bidirectional tunnel-neighbor relationship.
During the transaction, each Server provides the Client with a tunnel-neighbor identifier ("NBR_ID") and a Shared Secret that the Client will use to sign and authenticate certain control messages. The protocol details of the transaction are specific to the VSP, and hence out of scope for this document.

6.  IRON Operation

Following the initialization operations detailed in Section 5, IAs engage in the cooperative process of receiving and forwarding packets. All IAs forward encapsulated packets over the IRON instance using the mechanisms of VET [INTAREA-VET] and SEAL [INTAREA-SEAL], while Relays additionally forward packets to and from the native IPv6 and IPv4 Internets. IAs also use SCMP to coordinate with other IAs, including the process of sending and receiving redirect messages, error messages, etc. Each IA operates as specified in the following sub-sections.

6.1.  IRON Client Operation

After selecting Servers as specified in Section 5.3, the Client registers one or more active ISP connections with each Server. To do so, it sends periodic beacons (e.g., cryptographically signed SRS messages) to the Server via each ISP connection to maintain tunnel-neighbor address mapping state. The beacons should be sent at intervals of no more than 60 seconds (subject to a small random delay) so that state in NATs on the path as well as on the Server itself is refreshed regularly.

Although the Client may connect via multiple ISPs, a single NBR_ID is used to represent the set of all ISP paths the Client has registered with this Server. The NBR_ID therefore names this "bundle" of tunnel-neighbor ISP connections.

If the Client ceases to receive acknowledgements from a Server via a specific ISP connection, it marks the Server as unreachable from that ISP. (The Client should also inform the Server of this outage via one of its working ISP connections.)
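The beacon and per-ISP reachability bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not part of the IRON specification: the names `ServerPaths`, `next_beacon_delay`, `MAX_JITTER`, and `MISSED_BEACON_LIMIT` are hypothetical, and both the size of the random delay and the number of unacknowledged beacons tolerated before an ISP path is considered down are assumptions, since the text does not fix either value.

```python
import random

BEACON_INTERVAL = 60.0    # upper bound on beacon spacing in seconds (from the text)
MAX_JITTER = 5.0          # assumed size of the "small random delay"
MISSED_BEACON_LIMIT = 3   # assumed number of unacknowledged beacons tolerated


def next_beacon_delay(rng=random):
    """Return the wait before the next beacon; never more than BEACON_INTERVAL."""
    return BEACON_INTERVAL - rng.uniform(0.0, MAX_JITTER)


class ServerPaths:
    """One Server's bundle of ISP connections, all named by a single NBR_ID."""

    def __init__(self, nbr_id, isps):
        self.nbr_id = nbr_id
        # Count of beacons sent on each ISP connection since the last acknowledgement.
        self.unacked = {isp: 0 for isp in isps}

    def beacon_sent(self, isp):
        self.unacked[isp] += 1

    def ack_received(self, isp):
        self.unacked[isp] = 0

    def reachable(self, isp):
        """The Server counts as reachable via this ISP until too many beacons go unanswered."""
        return self.unacked[isp] < MISSED_BEACON_LIMIT

    def working_isps(self):
        """ISP connections still usable, e.g., for reporting an outage on another path."""
        return [isp for isp in self.unacked if self.reachable(isp)]


paths = ServerPaths(nbr_id=7, isps=["isp-a", "isp-b"])
for _ in range(MISSED_BEACON_LIMIT):
    paths.beacon_sent("isp-a")      # no acknowledgements arrive via isp-a
paths.beacon_sent("isp-b")
paths.ack_received("isp-b")         # isp-b keeps answering
```

In this sketch the jitter is subtracted from the interval so that the 60-second bound is never exceeded; a Client could equally choose a shorter base interval and add jitter to it.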
If the Client ceases to receive acknowledgements from the Server via multiple ISP connections, it withdraws its registration with this Server and registers with a new nearby Server. The act of withdrawing from the old Server and registering with the new Server will soon propagate the appropriate routing information among the IRON instance's Relay Routers.

When an end system in an EUN sends a flow of packets to a correspondent, the packets are forwarded through the EUN via normal routing until they reach the Client, which then tunnels the initial packets to a Server as the next hop. In particular, the Client encapsulates each packet in an outer header with its locator as the source address and the locator of the Server as the destination address. After sending the initial packets of a flow, the Client may receive important control messages, such as indications of PMTU limitations, redirect messages that indicate a better tunnel-neighbor next hop, etc.

The Client uses the mechanisms specified in VET and SEAL to encapsulate each packet to be forwarded. The Client further uses SCMP to coordinate with Servers, including accepting redirects and other control messages.

6.2.  IRON Serving Router Operation

After the Server is initialized, it accepts Client connections and authenticates the SRS messages it receives from its connected tunnel-neighbor Clients. The Server discards any SRS messages that fail authentication, and responds to authentic SRS messages by returning signed SRAs.

When the Server receives a SEAL-encapsulated data packet from one of its bidirectional tunnel-neighbor Clients, it uses normal longest-prefix-match rules to locate a FIB entry that matches the packet's inner destination address.
If the matching FIB entry is more-specific than default, the next hop is another of the Server's tunnel-neighbor Clients; otherwise, the next hop is a Relay which serves as a default router. The Server then re-encapsulates the packet (i.e., it removes the outer header and replaces it with a new outer header of the same address family), sets the outer destination address to the locator address of the next hop and tunnels the packet to the next hop.

When the Server receives a SEAL-encapsulated data packet from either a Relay or from a unidirectional tunnel-neighbor Client, it again locates a FIB entry that matches the packet's inner destination address. If the matching FIB entry is more-specific than default, the Server re-encapsulates the packet and forwards it to the correct bidirectional tunnel-neighbor Client. If the Client has recently moved to a different Server, however, the Server also returns an SCMP redirect message listing a NULL next hop to inform the previous hop that the Client has moved.

Note that Server-to-Server tunneling is not permitted, since this could result in sustained routing loops in which Server A has a route to Server B, and Server B has a route to Server A. This implies that a Server must never accept and process a redirect message, but must instead relay the redirect message to the appropriate bidirectional Client. The permissible data flow paths for tunneled packets that flow through a Server are therefore:

o  From a bidirectional Client customer to another bidirectional Client customer (i.e., a hairpin route)

o  From a bidirectional Client customer to a default Relay router

o  From a default Relay router to a bidirectional Client customer

o  From a unidirectional foreign Client to a bidirectional Client customer

6.3.  IRON Relay Router Operation

After each Relay has synchronized its VPs (see Section 5.1) it advertises them in the IPv4 and IPv6 Internet BGP routing systems. These prefixes will be represented as ordinary routing information in the BGP, and any packets originating from the IPv4 or IPv6 Internet destined to an address covered by one of the prefixes will be forwarded to one of the VSP's Relays.

When a Relay receives a packet from the Internet destined to an EPA covered by one of its VPs, it behaves as an ordinary IP router. In particular, the Relay looks in its FIB to discover a locator of a Server that serves the EP covering the destination address. The Relay then simply encapsulates the packet with its own locator as the outer source address and the locator of the Server as the outer destination address and forwards the packet to the Server.

6.4.  IRON Reference Operating Scenarios

IRON supports communications when one or both hosts are located within EP-addressed EUNs. The following sections discuss the reference operating scenarios.

6.4.1.  Both Hosts within Same IRON Instance

When both hosts are within EUNs served by the same IRON instance, it is sufficient to consider the scenario in a unidirectional fashion, i.e., by tracing packet flows only in the forward direction from source host to destination host. The reverse direction can be considered separately and incurs the same considerations as for the forward direction. The simplest case occurs when the EUNs that service the source and destination hosts are connected to the same Server, while the general case occurs when the EUNs are connected to different Servers. The two cases are discussed in the following sections.

6.4.1.1.  EUNs Served by Same Server

In this scenario, the packet flow from the source host is forwarded through the EUN to the source's Client. The Client then tunnels the packets to the Server, which simply relays the tunneled packets to the destination's Client.
The destination's Client then removes the packets from the tunnel and forwards them over the EUN to the destination.

Figure 6 depicts the sustained flow of packets from Host A to Host B within EUNs serviced by the same Server(S) via a "hairpinned" route:

[Diagram: Host A attaches natively (<--->) to IRON EUN A and Host B to IRON EUN B. Client(A) tunnels (<===>) via ISP A to Server(S), which tunnels via ISP B to Client(B), all within the VSP IRON instance overlaid on the native Internet. Legend: <---> == Native, <===> == Tunnel.]

Figure 6: Sustained Packet Flow via Hairpinned Route

With reference to Figure 6, Host A sends packets destined to Host B via its network interface connected to EUN A. Routing within EUN A will direct the packets to Client(A) as a default router for the EUN, which then uses VET and SEAL to encapsulate them in outer headers with its locator address as the outer source address, the locator address of Server(S) as the outer destination address, and the NBR_ID parameters associated with its tunnel-neighbor state as the identity. Client(A) then simply forwards the encapsulated packets into the ISP network connection that provided its locator. The ISP will forward the encapsulated packets into the Internet without filtering since the (outer) source address is topologically correct. Once the packets have been forwarded into the Internet, routing will direct them to Server(S).
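
The encapsulation step that Client(A) performs can be sketched as follows. This is only a minimal illustration under stated assumptions, not the VET/SEAL wire format: the `TunnelNeighbor` and `Encapsulated` types, their field names, and the example addresses (taken from documentation ranges) are hypothetical and do not appear in the IRON, VET, or SEAL specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TunnelNeighbor:
    """Hypothetical tunnel-neighbor state kept by a Client for one Server."""
    local_locator: str   # locator the Client obtained from its ISP
    remote_locator: str  # locator of the Server
    nbr_id: int          # identity associated with this tunnel neighbor


@dataclass(frozen=True)
class Encapsulated:
    """Hypothetical view of a tunneled packet: outer header plus inner packet."""
    outer_src: str
    outer_dst: str
    nbr_id: int
    inner: bytes         # original packet from the EUN, carried unchanged


def encapsulate(inner: bytes, nbr: TunnelNeighbor) -> Encapsulated:
    """Wrap an inner packet using the Client's locator as the outer source,
    the Server's locator as the outer destination, and the NBR_ID as identity."""
    return Encapsulated(
        outer_src=nbr.local_locator,
        outer_dst=nbr.remote_locator,
        nbr_id=nbr.nbr_id,
        inner=inner,
    )


# Example: Client(A) tunnels a packet from Host A toward Server(S).
server_s = TunnelNeighbor("192.0.2.10", "198.51.100.1", nbr_id=0x1234)
tunneled = encapsulate(b"packet from Host A to Host B", server_s)
```

Because the outer source address is the ISP-assigned locator, the tunneled packet is topologically correct from the ISP's point of view, which is why no filtering is expected.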

Server(S) receives the encapsulated packets from Client(A), then checks its FIB to discover an entry that covers destination address B with Client(B) as the next hop. Server(S) then re-encapsulates the packets in a new outer header that uses the source address, destination address, and NBR_ID parameters associated with the tunnel-neighbor state for Client(B). Server(S) then forwards these re-encapsulated packets into the Internet, where routing will direct them to Client(B). Client(B) will, in turn, decapsulate the packets and forward the inner packets to Host B via EUN B.

In this scenario, no further route optimization is supported within the IRON framework, since IRON does not make provisions for Client-to-Client binding updates. Each Client therefore need only coordinate its locator-to-EP mappings with its Server(s), and does not update bindings with any of its recent correspondents.

6.4.1.2. EUNs Served by Different Servers

In this scenario, the initial packets of a flow produced by a source host within an EUN connected to the IRON instance by a Client must flow through both the Server of the source host and a nearby Relay, but route optimization can eliminate these elements from the path for subsequent packets in the flow. Figure 7 shows the flow of initial packets from Host A to Host B within EUNs of the same IRON instance:

[Diagram: Host A attaches natively (<--->) to IRON EUN A and Host B to IRON EUN B. Client(A) tunnels (<===>) via ISP A to Server(A), which tunnels to Relay(R), which tunnels to Server(B), which tunnels via ISP B to Client(B), all within the VSP IRON instance overlaid on the native Internet. A redirect (*****) runs from Relay(R) through Server(A) to Client(A). Legend: <---> == Native, <===> == Tunnel, ***** == Redirect.]

Figure 7: Initial Packet Flow Before Redirects

With reference to Figure 7, Host A sends packets destined to Host B via its network interface connected to EUN A. Routing within EUN A will direct the packets to Client(A) as a default router for the EUN, which then encapsulates them in outer headers and forwards the encapsulated packets into the ISP network connection that provided its locator. The ISP will forward the encapsulated packets into the Internet, where routing will direct them to Server(A).

Server(A) receives the encapsulated packets from Client(A), then rewrites the outer source address to one of its own locator addresses and rewrites the outer destination address to the address of a nearby Relay(R). Server(A) then forwards the revised encapsulated packets into the Internet, where routing will direct them to Relay(R).

Relay(R) receives the encapsulated packets from Server(A), then checks its FIB to discover an entry that covers inner destination address B with Server(B) as the next hop. Relay(R) then returns SCMP redirect messages to Server(A), rewrites the outer destination address of the encapsulated packets to the locator address of Server(B), and forwards these revised packets to Server(B).

Server(B) receives the encapsulated packets from Relay(R), then checks its FIB to discover an entry that covers destination address B with Client(B) as the next hop.
Server(B) then re-encapsulates the packets in a new outer header that uses the source address, destination address, and NBR_ID parameters associated with the tunnel-neighbor state for Client(B). Server(B) then forwards these re-encapsulated packets into the Internet, where routing will direct them to Client(B). Client(B) will, in turn, decapsulate the packets and forward the inner packets to Host B via EUN B.

Note that after the initial flow of packets, Server(A) will have received one or more SCMP redirect messages from Relay(R) listing Server(B) as a better next hop. Server(A) will, in turn, proxy the redirects to Client(A), which will establish unidirectional tunnel-neighbor state listing Server(B) as the next hop toward the EP that covers Host B. Client(A) thereafter forwards its encapsulated packets directly to the locator address of Server(B) without involving either Server(A) or Relay(R), as shown in Figure 8.

[Diagram: Host A attaches natively (<--->) to IRON EUN A and Host B to IRON EUN B. Client(A) tunnels (<===>) via ISP A directly to Server(B), which tunnels via ISP B to Client(B), all within the IRON instance overlaid on the native Internet. Legend: <---> == Native, <===> == Tunnel.]

Figure 8: Sustained Packet Flow After Redirects

6.4.2. Mixed IRON and Non-IRON Hosts

The cases in which one host is within an IRON EUN and the other is in a non-IRON EUN (i.e., one that connects to the native Internet instead of the IRON) are described in the following sub-sections.

6.4.2.1. From IRON Host A to Non-IRON Host B

Figure 9 depicts the IRON reference operating scenario for packets flowing from Host A in an IRON EUN to Host B in a non-IRON EUN.

[Diagram: Host A attaches natively (<--->) to IRON EUN A. Client(A) tunnels (<===>) via ISP A to Server(A), which is drawn together with Relay(A) at the edge of the IRON instance. Relay(A) forwards natively across the native Internet via ISP B to Router(B), which serves Host B in EUN B. Legend: <---> == Native, <===> == Tunnel.]

Figure 9: From IRON Host A to Non-IRON Host B

In this scenario, Host A sends packets destined to Host B via its network interface connected to IRON EUN A. Routing within EUN A will direct the packets to Client(A) as a default router for the EUN, which then encapsulates them and sends them into the ISP network. The ISP will pass the packets without filtering since the (outer) source address is topologically correct. Once the packets have been released into the native Internet, the Internet routing system will direct them to Server(A).
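
The next-hop decision that Server(A) makes on arrival follows the longest-prefix-match rule given in Section 6.2: a match more specific than default selects a tunnel-neighbor Client, and anything else falls through to a Relay. The sketch below illustrates that rule only. The FIB layout (a dictionary of prefix strings), the function name, and the example prefixes are assumptions made for illustration and are not defined by IRON.

```python
import ipaddress


def lookup_next_hop(fib: dict, dst: str) -> str:
    """Return the next hop of the longest FIB prefix that covers dst."""
    addr = ipaddress.ip_address(dst)
    best_net, best_hop = None, None
    for prefix, next_hop in fib.items():
        net = ipaddress.ip_network(prefix)
        # Only compare prefixes of the same address family as the destination.
        if net.version != addr.version or addr not in net:
            continue
        if best_net is None or net.prefixlen > best_net.prefixlen:
            best_net, best_hop = net, next_hop
    if best_hop is None:
        raise LookupError(f"no route to {dst}")
    return best_hop


# Hypothetical FIB for Server(A): one EP registered by Client(A), plus default.
FIB = {
    "0.0.0.0/0": "Relay(A)",       # default route toward the Relay
    "192.0.2.0/24": "Client(A)",   # EP served through Client(A)
}

hop_for_epa = lookup_next_hop(FIB, "192.0.2.55")       # covered by the EP
hop_for_non_epa = lookup_next_hop(FIB, "203.0.113.9")  # only default matches
```

In the scenario of Figure 9, Host B's address is not covered by any EP, so only the default entry matches and the packet is handed to Relay(A).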

Server(A) receives the encapsulated packets from Client(A), then re-encapsulates and forwards them to Relay(A), which simply decapsulates them and forwards the unencapsulated packets into the Internet. Once the packets are released into the Internet, routing will direct them to the final destination B. (Note that Server(A) and Relay(A) are depicted in Figure 9 as two halves of a unified gateway. In that case, the "forwarding" between Server(A) and Relay(A) is a zero-instruction imaginary operation within the gateway.)

6.4.2.2. From Non-IRON Host B to IRON Host A

Figure 10 depicts the IRON reference operating scenario for packets flowing from Host B in a non-IRON EUN to Host A in an IRON EUN.

[Diagram: Host B in EUN B is served by Router(B), which forwards natively (<--->) via ISP B across the native Internet to Relay(A). Relay(A) is drawn together with Server(A) at the edge of the IRON instance. Server(A) tunnels (<===>) via ISP A to Client(A), which delivers to Host A in IRON EUN A. Legend: <---> == Native, <===> == Tunnel.]

Figure 10: From Non-IRON Host B to IRON Host A

In this scenario, Host B sends packets destined to Host A via its network interface connected to non-IRON EUN B. Internet routing will direct the packets to Relay(A), which then forwards them to Server(A) using encapsulation if necessary. Server(A) will then check its FIB to discover an entry that covers destination address A with Client(A) as the next hop.
Server(A) then (re-)encapsulates the packets in an outer header that uses the source address, destination address, and NBR_ID parameters associated with the tunnel-neighbor state for Client(A). Next, Server(A) forwards these (re-)encapsulated packets into the Internet, where routing will direct them to Client(A). Client(A) will, in turn, decapsulate the packets and forward the inner packets to Host A via its network interface connected to IRON EUN A.

6.4.3. Hosts within Different IRON Instances

Figure 11 depicts the IRON reference operating scenario for packets flowing between Host A in an IRON instance A and Host B in a different IRON instance B. In that case, forwarding between Hosts A and B always involves the Servers and Relays of both IRON instances, i.e., the scenario is no different than if one of the hosts were serviced by an IRON EUN and the other by a non-IRON EUN.

[Diagram: Host A attaches natively (<--->) to IRON EUN A and Host B to IRON EUN B. Client(A) is connected by a tunnel (<===>) via ISP A to Server(A) in IRON Instance A, and Client(B) by a tunnel via ISP B to Server(B) in IRON Instance B. Relay(A) and Relay(B) exchange packets natively across the native Internet. Legend: <---> == Native, <===> == Tunnel.]

Figure 11: Hosts within Different IRON Instances

6.5. Mobility, Multiple Interfaces, Multihoming, and Traffic Engineering Considerations

While IRON Servers and Relays can be considered as fixed infrastructure, Clients may need to move between different network points of attachment, connect to multiple ISPs, or explicitly manage their traffic flows. The following sections discuss mobility, multihoming, and traffic engineering considerations for IRON Client routers.

6.5.1. Mobility Management

When a Client changes its network point of attachment (e.g., due to a mobility event), it configures one or more new locators. If the Client has not moved far away from its previous network point of attachment, it simply informs its Server of any locator additions or deletions. This operation is performance sensitive and should be conducted immediately to avoid packet loss.

If the Client has moved far away from its previous network point of attachment, however, it repeats the Server discovery procedure described in Section 5.3 to discover whether its candidate set of Servers has changed. If the Client's current Server is also included in the new list received from the VSP, this indicates that the Client has not moved far enough to warrant changing to a new Server. Otherwise, the Client may wish to move to a new Server in order to reduce routing stretch. This operation is not performance critical, and therefore can be conducted over a matter of seconds/minutes instead of milliseconds/microseconds.

To move to a new Server, the Client first engages in the EP registration process with the new Server, as described in Section 5.3. The Client then informs its former Server that it has departed, again via a VSP-specific secured reliable transport connection. The former Server will then withdraw its EP advertisements from the VSP routing system and retain the (stale) FIB entries until their lifetime expires.
In the interim, the Server continues to deliver packets to the Client's last-known locator addresses for the short term while informing any unidirectional tunnel-neighbors that the Client has moved.

6.5.2. Multiple Interfaces and Multihoming

A Client may register multiple ISP connections with each Server. It can assign metrics with its registrations to inform the Server of preferred ISP connections, and it can select outgoing ISP connections according to its outbound traffic requirements. Therefore, multiple interfaces are naturally supported.

A Client may further register with multiple Servers for fault tolerance and reduced routing stretch. In that case, the Client should register each of its ISP connections with each of its Servers unless it has a way of carefully coordinating its ISP-to-Server mappings. (However, unpredictable performance may result if the Client registers only preferred ISP connections with Server A and backup ISP connections with Server B.)

Client registration with multiple Servers results in "pseudo-multihoming", in which the multiple homes are within the same VSP IRON instance. True multihoming would only apply if the Client were to connect to multiple IRON instances and receive a different set of EPs from each instance.

6.5.3. Inbound Traffic Engineering

A Client can dynamically adjust the priorities of its locator registrations with its Server in order to influence inbound traffic flows. It can also change between Servers when multiple Servers are available, but should strive for stability in its Server selection in order to limit VSP network routing churn.

6.5.4. Outbound Traffic Engineering

A Client can select outgoing locators, e.g., based on current Quality-of-Service (QoS) considerations such as minimizing delay or variance.

6.6. Renumbering Considerations

As new link-layer technologies and/or service models emerge, customers will be motivated to select their service providers through healthy competition between ISPs. If a customer's EUN addresses are tied to a specific ISP, however, the customer may be forced to undergo a painstaking EUN renumbering process if it wishes to change to a different ISP [RFC4192][RFC5887].

When a customer obtains EP prefixes from a VSP, it can change between ISPs seamlessly and without need to renumber. If the VSP itself applies unreasonable costing structures for use of the EPs, however, the customer may be compelled to seek a different VSP and would again be required to confront a renumbering scenario.

6.7. NAT Traversal Considerations

The Internet today consists of a global public IPv4 routing and addressing system with non-IRON EUNs that use either public or private IPv4 addressing. The latter class of EUNs connects to the public Internet via Network Address Translators (NATs). When a Client is located behind a NAT, it selects Servers using the same procedures as for Clients with public addresses and can then send SRS messages to Servers in order to get SRA messages in return. The only requirement is that the Client must configure its SEAL encapsulation to use a transport protocol that supports NAT traversal, e.g., UDP, TCP, SSL, etc.

Since the Server maintains state about its Client customers, it can discover locator information for each Client by examining the UDP port number and IP address in the outer headers of the Client's encapsulated SRS packets. When there is a NAT in the path, the transport port number and IP address in each encapsulated packet will correspond to state in the NAT box and might not correspond to the actual values assigned to the Client.
The Server can then encapsulate packets destined to hosts in the Client's EUN within outer headers that use this IP address and UDP port number. The NAT box will receive the packets, translate the values in the outer headers, then forward the packets to the Client. In this sense, the Server's "locator" for the Client consists of the concatenation of the IP address and transport port number.

IRON does not add any new issues to the complications already raised by NAT traversal or by applications that embed address referrals in their payload.

6.8. Multicast Considerations

IRON Servers and Relays are topologically positioned to provide Internet Group Management Protocol (IGMP) / Multicast Listener Discovery (MLD) proxying for their Clients [RFC4605]. Further multicast considerations for IRON (e.g., interactions with multicast routing protocols, traffic scaling, etc.) will be discussed in a separate document.

6.9. Nested EUN Considerations

Each Client configures a locator that may be taken from an ordinary non-EPA address assigned by an ISP or from an EPA address taken from an EP assigned to another Client. In the latter case, the Client is said to be "nested" within the EUN of another Client, and recursive nestings of multiple layers of encapsulations may be necessary.

For example, in the network scenario depicted in Figure 12, Client(A) configures a locator EPA(B) taken from the EP assigned to EUN(B). Client(B) in turn configures a locator EPA(C) taken from the EP assigned to EUN(C). Finally, Client(C) configures a locator ISP(D) taken from a non-EPA address delegated by an ordinary ISP(D). Using this example, the "nested-IRON" case must be examined in which a Host A, which configures the address EPA(A) within EUN(A), exchanges packets with Host Z located elsewhere in the Internet.

[Diagram: Host A, with address EPA(A), attaches to EUN(A), which is served by Client(A). Client(A) has locator EPA(B) within EUN(B), which is served by Client(B). Client(B) has locator EPA(C) within EUN(C), which is served by Client(C). Client(C) has locator ISP(D) and connects through ISP(D) to the Internet, with tunnels running between Client(C) and the Internet. The IRON instance contains Server(A), Server(B), Server(C), Relay(Z), Server(Z), and Client(Z), which serves Host Z.]

Figure 12: Nested EUN Example

The two cases of Host A sending packets to Host Z, and Host Z sending packets to Host A, must be considered separately, as described below.

6.9.1. Host A Sends Packets to Host Z

Host A first forwards a packet with source address EPA(A) and destination address Z into EUN(A). Routing within EUN(A) will direct the packet to Client(A), which encapsulates it in an outer header with EPA(B) as the outer source address and Server(A) as the outer destination address, then forwards the once-encapsulated packet into EUN(B). Routing within EUN(B) will direct the packet to Client(B), which encapsulates it in an outer header with EPA(C) as the outer source address and Server(B) as the outer destination address, then forwards the twice-encapsulated packet into EUN(C). Routing within EUN(C) will direct the packet to Client(C), which encapsulates it in an outer header with ISP(D) as the outer source address and Server(C) as the outer destination address. Client(C) then sends this triple-encapsulated packet into the ISP(D) network, where it will be routed into the Internet to Server(C).
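
The three successive encapsulations performed by Client(A), Client(B), and Client(C) can be sketched as follows. This is a minimal illustration, assuming a hypothetical `Packet` type whose payload is either raw bytes or another packet. The symbolic names EPA(A), EPA(B), EPA(C), ISP(D), and Server(A) through Server(C) come from the example above; everything else is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet: a header (src, dst) around bytes or another Packet."""
    src: str
    dst: str
    payload: Union["Packet", bytes]


def encapsulate(inner: Packet, outer_src: str, outer_dst: str) -> Packet:
    """Add one outer header around an existing packet."""
    return Packet(outer_src, outer_dst, inner)


def depth(pkt: Packet) -> int:
    """Count the encapsulation layers wrapped around the innermost packet."""
    layers = 0
    while isinstance(pkt.payload, Packet):
        pkt = pkt.payload
        layers += 1
    return layers


# Outer (source locator, destination Server) added by each Client, in the
# order the packet meets them on its way out of the nested EUNs.
CLIENT_HOPS = [
    ("EPA(B)", "Server(A)"),  # added by Client(A)
    ("EPA(C)", "Server(B)"),  # added by Client(B)
    ("ISP(D)", "Server(C)"),  # added by Client(C)
]


def send_from_host_a() -> Packet:
    """Build the triple-encapsulated packet that Client(C) sends into ISP(D)."""
    pkt = Packet("EPA(A)", "Z", b"data from Host A")
    for outer_src, server in CLIENT_HOPS:
        pkt = encapsulate(pkt, outer_src, server)
    return pkt


outermost = send_from_host_a()
```

Each Server on the return path toward the destination can then undo exactly one layer by taking `payload` of the packet it receives, which mirrors the layer-by-layer removal performed by Server(C), Server(B), and Server(A).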

When Server(C) receives the triple-encapsulated packet, it removes the outer layer of encapsulation and forwards the resulting twice-encapsulated packet into the Internet to Server(B). Next, Server(B) removes the outer layer of encapsulation and forwards the resulting once-encapsulated packet into the Internet to Server(A). Next, Server(A) checks the address type of the inner address 'Z'. If Z is a non-EPA address, Server(A) simply decapsulates the packet and forwards it into the Internet. Otherwise, Server(A) rewrites the outer source and destination addresses of the once-encapsulated packet and forwards it to Relay(Z). Relay(Z), in turn, rewrites the outer destination address of the packet to the locator for Server(Z), then forwards the packet and sends a redirect to Server(A) (which forwards the redirect to Client(A)). Server(Z) then re-encapsulates the packet and forwards it to Client(Z), which decapsulates it and forwards the inner packet to Host Z. Subsequent packets from Client(A) will then use Server(Z) as the next hop toward Host Z, which eliminates Server(A) and Relay(Z) from the path.

6.9.2. Host Z Sends Packets to Host A

Whether or not Host Z configures an EPA address, its packets destined to Host A will eventually reach Server(A). Server(A) will have a mapping that lists Client(A) as the next hop toward EPA(A). Server(A) will then encapsulate the packet with EPA(B) as the outer destination address and forward the packet into the Internet. Internet routing will convey this once-encapsulated packet to Server(B), which will have a mapping that lists Client(B) as the next hop toward EPA(B). Server(B) will then encapsulate the packet with EPA(C) as the outer destination address and forward the packet into the Internet. Internet routing will then convey this twice-encapsulated packet to Server(C), which will have a mapping that lists Client(C) as the next hop toward EPA(C).
Server(C) will then encapsulate the packet with ISP(D) as the outer destination address and forward the packet into the Internet. Internet routing will then convey this triple-encapsulated packet to Client(C).

When the triple-encapsulated packet arrives at Client(C), it strips the outer layer of encapsulation and forwards the twice-encapsulated packet to EPA(C), which is the locator address of Client(B). When Client(B) receives the twice-encapsulated packet, it strips the outer layer of encapsulation and forwards the once-encapsulated packet to EPA(B), which is the locator address of Client(A). When Client(A) receives the once-encapsulated packet, it strips the outer layer of encapsulation and forwards the unencapsulated packet to EPA(A), which is the host address of Host A.

7. Implications for the Internet

The IRON architecture envisions a hybrid routing/mapping system that benefits from both the shortest-path routing afforded by pure dynamic routing systems and the routing-scaling suppression afforded by pure mapping systems. Therefore, IRON targets the elusive "sweet spot" that pure routing and pure mapping systems alone cannot satisfy.

The IRON system requires a VSP deployment of new routers/servers throughout the Internet to maintain well-balanced virtual overlay networks. These routers/servers can be deployed incrementally without disruption to existing Internet infrastructure and appropriately managed to provide acceptable service levels to customers.

End-to-end traffic that traverses an IRON instance may experience delay variance between the initial packets and subsequent packets of a flow. This is due to the IRON system allowing a longer path stretch for initial packets followed by timely route optimizations to utilize better next hop routers/servers for subsequent packets.

IRON instances also work seamlessly with existing and emerging services within the native Internet.
In particular, customers serviced by an IRON instance will receive the same service enjoyed by customers serviced by non-IRON service providers. Internet services already deployed within the native Internet also need not make any changes to accommodate VSP customers.

The IRON system operates between routers within provider networks and end user networks. Within these networks, the underlying paths traversed by the virtual overlay networks may comprise links that accommodate varying MTUs. While the IRON system imposes an additional per-packet overhead that may cause the size of packets to become slightly larger than the underlying path can accommodate, IRON routers have a method for naturally detecting and tuning out instances of path MTU underruns. In some cases, these MTU underruns may need to be reported back to the original hosts; however, the system will also allow for MTUs much larger than those typically available in current Internet paths to be discovered and utilized as more links with larger MTUs are deployed.

Finally, and perhaps most importantly, the IRON system provides built-in mobility management, multihoming, and traffic engineering capabilities that allow end user devices and networks to move about freely while both imparting minimal oscillations in the routing system and maintaining generally shortest-path routes. This mobility management is afforded through the very nature of the IRON customer/provider relationship, and therefore requires no adjunct mechanisms. The mobility management and multihoming capabilities are further supported by forward-path reachability detection that provides "hints of forward progress" in the same spirit as for IPv6 Neighbor Discovery (ND).

8. Additional Considerations

Considerations for the scalability of Internet Routing due to multihoming, traffic engineering, and provider-independent addressing are discussed in [RADIR].
Other scaling considerations specific to IRON are discussed in Appendix B.

Route optimization considerations for mobile networks are found in [RFC5522].

9. Related Initiatives

IRON builds upon the concepts of the RANGER architecture [RFC5720], and therefore inherits the same set of related initiatives. The Internet Research Task Force (IRTF) Routing Research Group (RRG) mentions IRON in its recommendation for a routing architecture [RFC6115].

Virtual Aggregation (VA) [GROW-VA] and Aggregation in Increasing Scopes (AIS) [EVOLUTION] provide the basis for the Virtual Prefix concepts.

Internet Vastly Improved Plumbing (Ivip) [IVIP-ARCH] has contributed valuable insights, including the use of real-time mapping. The use of Servers as mobility anchor points is directly influenced by Ivip's associated TTR mobility extensions [TTRMOB].

[RO-CR] discusses a route optimization approach using a Correspondent Router (CR) model. The IRON Server construct is similar to the CR concept described in this work; however, the manner in which Clients coordinate with Servers is different and based on the redirection model associated with NBMA links [RFC5214].

Numerous publications have proposed NAT traversal techniques. The NAT traversal techniques adapted for IRON were inspired by the Simple Address Mapping for Premises Legacy Equipment (SAMPLE) proposal [SAMPLE].

The IRON Client-Server relationship is managed in essentially the same way as for the Tunnel Broker model [RFC3053]. Numerous existing tunnel broker provider networks (e.g., Hurricane Electric, SixXS, freenet6, etc.) provide existence proofs that IRON-like overlay network services can be deployed and managed on a global basis [BROKER].

10. Security Considerations

Security considerations that apply to tunneling in general are discussed in [V6OPS-TUN-SEC].
Additional considerations that also apply to IRON are discussed in RANGER [RFC5720], VET [INTAREA-VET] and SEAL [INTAREA-SEAL].

The IRON system further depends on mutual authentication of IRON Clients to Servers and Servers to Relays. This is accomplished through initial authentication exchanges that establish tunnel-neighbor NBR_ID values that can be used to detect off-path attacks. As for all Internet communications, the IRON system also depends on Relays acting with integrity and not injecting false advertisements into BGP (e.g., to mount traffic siphoning attacks).

IRON Servers must ensure that any changes in a Client's locator addresses are communicated only through an authenticated exchange that is not subject to replay. For this reason, Clients periodically send digitally-signed SRS messages to the Server. If the Client's locator address stays the same, the Server can accept the SRS message without verifying the signature as long as the NBR_ID of the SRS matches the value established for that Client. If the Client's locator address changes, the Server must verify the SRS message's signature before accepting the message. Once the message has been authenticated, the Server updates the Client's locator address to the new address.

Each IRON instance requires a means for assuring the integrity of the interior routing system so that all Relays and Servers in the overlay have a consistent view of Client<->Server bindings. Finally, Denial-of-Service (DoS) attacks on IRON Relays and Servers can occur when packets with spoofed source addresses arrive at high data rates. However, this issue is no different than for any border router in the public Internet today.

Middleboxes can interfere with tunneled packets within an IRON instance in various ways. For example, a middlebox may alter a packet's contents, change a packet's locator addresses, inject spurious packets, replay old packets, etc.
These issues are no different than for middlebox interactions with ordinary Internet communications. If man-in-the-middle attacks are a matter for concern in certain deployments, however, IRON Agents can use IPsec to protect the authenticity, integrity and (if necessary) privacy of their tunneled packets.

11.  Acknowledgements

The ideas behind this work have benefited greatly from discussions with colleagues, some of which appear on the RRG and other IRTF/IETF mailing lists. Robin Whittle and Steve Russert co-authored the TTR mobility architecture, which strongly influenced IRON. Eric Fleischman pointed out the opportunity to leverage anycast for discovering topologically close Servers. Thomas Henderson recommended a quantitative analysis of scaling properties.

The following individuals provided essential review input: Jari Arkko, Mohamed Boucadair, Stewart Bryant, John Buford, Ralph Droms, Wesley Eddy, Adrian Farrel, Dae Young Kim, and Robin Whittle.

12.  References

12.1.  Normative References

[RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981.

[RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

12.2.  Informative References

[BGPMON]  BGPmon.net, "BGPmon.net - Monitoring Your Prefixes", http://bgpmon.net/stat.php, June 2010.

[BROKER]  Wikipedia, "List of IPv6 Tunnel Brokers", http://en.wikipedia.org/wiki/List_of_IPv6_tunnel_brokers, August 2011.

[EVOLUTION]  Zhang, B., Zhang, L., and L. Wang, "Evolution Towards Global Routing Scalability", Work in Progress, October 2009.

[GROW-VA]  Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and L. Zhang, "FIB Suppression with Virtual Aggregation", Work in Progress, February 2011.

[INTAREA-SEAL]  Templin, F., Ed., "The Subnetwork Encapsulation and Adaptation Layer (SEAL)", Work in Progress, February 2011.

[INTAREA-VET]  Templin, F., Ed., "Virtual Enterprise Traversal (VET)", Work in Progress, January 2011.

[IVIP-ARCH]  Whittle, R., "Ivip (Internet Vastly Improved Plumbing) Architecture", Work in Progress, March 2010.

[RADIR]  Narten, T., "On the Scalability of Internet Routing", Work in Progress, February 2010.

[RFC1070]  Hagens, R., Hall, N., and M. Rose, "Use of the Internet as a subnetwork for experimentation with the OSI network layer", RFC 1070, February 1989.

[RFC3053]  Durand, A., Fasano, P., Guardini, I., and D. Lento, "IPv6 Tunnel Broker", RFC 3053, January 2001.

[RFC4192]  Baker, F., Lear, E., and R. Droms, "Procedures for Renumbering an IPv6 Network without a Flag Day", RFC 4192, September 2005.

[RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, January 2006.

[RFC4548]  Gray, E., Rutemiller, J., and G. Swallow, "Internet Code Point (ICP) Assignments for NSAP Addresses", RFC 4548, May 2006.

[RFC4605]  Fenner, B., He, H., Haberman, B., and H. Sandick, "Internet Group Management Protocol (IGMP) / Multicast Listener Discovery (MLD)-Based Multicast Forwarding ("IGMP/MLD Proxying")", RFC 4605, August 2006.

[RFC5214]  Templin, F., Gleeson, T., and D. Thaler, "Intra-Site Automatic Tunnel Addressing Protocol (ISATAP)", RFC 5214, March 2008.

[RFC5522]  Eddy, W., Ivancic, W., and T. Davis, "Network Mobility Route Optimization Requirements for Operational Use in Aeronautics and Space Exploration Mobile Networks", RFC 5522, October 2009.

[RFC5720]  Templin, F., "Routing and Addressing in Networks with Global Enterprise Recursion (RANGER)", RFC 5720, February 2010.

[RFC5743]  Falk, A., "Definition of an Internet Research Task Force (IRTF) Document Stream", RFC 5743, December 2009.

[RFC5887]  Carpenter, B., Atkinson, R., and H.
Flinck, "Renumbering Still Needs Work", RFC 5887, May 2010.

[RFC6115]  Li, T., "Recommendation for a Routing Architecture", RFC 6115, February 2011.

[RFC6139]  Russert, S., Fleischman, E., and F. Templin, "Routing and Addressing in Networks with Global Enterprise Recursion (RANGER) Scenarios", RFC 6139, February 2011.

[RO-CR]  Bernardos, C., Calderon, M., and I. Soto, "Correspondent Router based Route Optimisation for NEMO (CRON)", Work in Progress, July 2008.

[SAMPLE]  Carpenter, B. and S. Jiang, "Legacy NAT Traversal for IPv6: Simple Address Mapping for Premises Legacy Equipment (SAMPLE)", Work in Progress, June 2010.

[TTRMOB]  Whittle, R. and S. Russert, "TTR Mobility Extensions for Core-Edge Separation Solutions to the Internet's Routing Scaling Problem", http://www.firstpr.com.au/ip/ivip/TTR-Mobility.pdf, August 2008.

[V6OPS-TUN-SEC]  Krishnan, S., Thaler, D., and J. Hoagland, "Security Concerns With IP Tunneling", Work in Progress, October 2010.

Appendix A.  IRON VPs over Internetworks with Different Address Families

The IRON architecture leverages the routing system by providing generally shortest-path routing for packets with EPA addresses from VPs that match the address family of the underlying Internetwork. When the VPs are of an address family that is not routable within the underlying Internetwork, however (e.g., when OSI/NSAP [RFC4548] VPs are used within an IPv4 Internetwork), a global VP mapping database is required. The mapping database allows the Relays of the local IRON instance to map VPs belonging to other IRON instances to companion prefixes taken from address families that are routable within the Internetwork. For example, an IPv6 VP (e.g., 2001:DB8::/32) could be paired with a companion IPv4 prefix (e.g., 192.0.2.0/24) so that encapsulated IPv6 packets can be forwarded over IPv4-only Internetworks.
In that case, every VP must be represented in a globally distributed Master VP database (MVPd) that maintains VP-to-companion prefix mappings for all VPs in the IRON. The MVPd is maintained by a globally managed assigned numbers authority in the same manner as the Internet Assigned Numbers Authority (IANA) currently maintains the master list of all top-level IPv4 and IPv6 delegations. The database can be replicated across multiple servers for load balancing, in much the same way that FTP mirror sites are used to manage software distributions.

Upon startup, each Relay advertises an IPv4 companion prefix (e.g., 192.0.2.0/24) into the internetwork IPv4 routing system and/or an IPv6 companion prefix (e.g., 2001:DB8::/64) into the internetwork IPv6 routing system for the IRON instance that it serves. The Relay then configures the host number '1' in the IPv4 companion prefix (e.g., as 192.0.2.1) and the interface identifier '0' in the IPv6 companion prefix (e.g., as 2001:DB8::0), and assigns the resulting addresses as "Relay anycast" addresses for the IRON instance.

The Relay then discovers the full set of VPs for all other IRON instances by reading the MVPd. The Relay reads the MVPd from a nearby server and periodically checks the server for deltas since the database was last read. After reading the MVPd, the Relay has a full list of VP-to-companion prefix mappings. The Relay can then forward packets toward EPAs belonging to other IRON instances by encapsulating them in an outer header of the companion prefix address family and using the Relay anycast address as the outer destination address.

Possible encapsulations in this model include IPv6-in-IPv4, IPv4-in-IPv6, OSI/CLNP-in-IPv6, OSI/CLNP-in-IPv4, etc.

Appendix B.  Scaling Considerations

Scaling aspects of the IRON architecture have strong implications for its applicability in practical deployments.
Scaling must be considered along multiple vectors, including interdomain core routing scaling, scaling to accommodate large numbers of customer EUNs, traffic scaling, state requirements, etc.

In terms of routing scaling, each VSP will advertise one or more VPs into the global Internet routing system from which EPs are delegated to customer EUNs. The routing load on the interdomain core will therefore be minimized when each VP covers many EPs. For example, the IPv6 prefix 2001:DB8::/32 contains 2^24 ::/56 EP prefixes for assignment to EUNs; therefore, the IRON could accommodate 2^32 ::/56 EPs with only 2^8 ::/32 VPs advertised in the interdomain routing core. (When even longer EP prefixes are used, e.g., /64s assigned to individual handsets in a cellular provider network, considerable numbers of EUNs can be represented within only a single VP.)

In terms of traffic scaling for Relays, each Relay represents an ASBR of a "shell" enterprise network that simply directs arriving traffic packets with EPA destination addresses towards Servers that service customer EUNs. Moreover, the Relay sheds traffic destined to EPAs through redirection, which removes it from the path for the vast majority of traffic packets. On the other hand, each Relay must handle all traffic packets forwarded between its customer EUNs and the non-IRON Internet. The scaling concerns for this latter class of traffic are no different than for ASBR routers that connect large enterprise networks to the Internet.

In terms of traffic scaling for Servers, each Server services a set of the VSP customer EUNs. The Server services all traffic packets destined to its EUNs but only services the initial packets of flows initiated from the EUNs and destined to EPAs. Therefore, traffic scaling for EPA-addressed traffic is an asymmetric consideration and is proportional to the number of EUNs each Server serves.
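
The prefix arithmetic in the routing scaling example earlier in this appendix can be checked with a short sketch. This is illustrative only: the helper names ep_count and vps_needed are assumptions introduced here, not part of the IRON architecture, and the code simply counts how many EPs of a given length fit within one VP using Python's standard ipaddress module.

```python
import ipaddress


def ep_count(vp: str, ep_prefix_len: int) -> int:
    """Return how many EPs of length ep_prefix_len fit inside one VP.

    Hypothetical helper for illustration; not defined by IRON.
    """
    network = ipaddress.ip_network(vp)
    if not network.prefixlen <= ep_prefix_len <= network.max_prefixlen:
        raise ValueError("EP prefix length must lie within the VP")
    # Each additional prefix bit doubles the number of sub-prefixes.
    return 2 ** (ep_prefix_len - network.prefixlen)


def vps_needed(total_eps: int, vp: str, ep_prefix_len: int) -> int:
    """Return how many VPs of the same size as vp cover total_eps EPs."""
    per_vp = ep_count(vp, ep_prefix_len)
    return -(-total_eps // per_vp)  # ceiling division


# One /32 VP holds 2^24 /56 EPs.
print(ep_count("2001:db8::/32", 56) == 2**24)
# 2^32 /56 EPs require 2^8 /32 VPs in the interdomain core.
print(vps_needed(2**32, "2001:db8::/32", 56) == 2**8)
# With /64 EPs (e.g., per-handset), a single /32 VP holds 2^32 EPs.
print(ep_count("2001:db8::/32", 64) == 2**32)
```

The same calculation applies to any VP and EP length, which makes it a convenient way to estimate how many VPs a VSP would need to advertise for a given customer population.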

In terms of state requirements for Relays, each Relay maintains a list of all Servers in the IRON instance as well as FIB entries for all customer EUNs that each Server serves. This state is therefore dominated by the number of EUNs in the IRON instance. Sizing the Relay to accommodate state information for all EUNs is therefore required during overlay network planning.

In terms of state requirements for Servers, each Server maintains tunnel-neighbor state only for the customer EUNs it serves, and not for the customers served by other Servers in the IRON instance. Finally, neither Relays nor Servers need keep state for final destinations of outbound traffic.

Clients source and sink all traffic packets originating from or destined to the customer EUN. Therefore, traffic scaling considerations for Clients are the same as for any site border router. Clients also retain unidirectional tunnel state for the Servers that serve the final destinations of outbound traffic flows. This can be managed as soft state, since stale entries purged from the cache will be refreshed when new traffic packets are sent.

Author's Address

Fred L. Templin (editor)
Boeing Research & Technology
P.O. Box 3707 MC 7L-49
Seattle, WA  98124
USA

EMail: fltemplin@acm.org