Network Working Group                                           D. Meyer
Internet-Draft                                                  D. Lewis
Intended status: Informational                                     Cisco
Expires: June 11, 2009                                  December 8, 2008


          Architectural Implications of Locator/ID Separation
                 draft-meyer-loc-id-implications-00.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on June 11, 2009.

Abstract

Recent work on Locator/ID Separation has focused primarily on the control plane protocols concerned with finding Identifier-to-Locator mappings. However, experience gained with a trial deployment of a system designed to implement Locator/ID Separation has revealed two general classes of problems that must be resolved after the mapping is found: the Locator Path Liveness Problem and the State Synchronization Problem. These problems have implications for the data plane as well as the control plane.

Meyer & Lewis            Expires June 11, 2009                  [Page 1]

Internet-Draft          Loc/ID Split Implications          December 2008

Table of Contents

1.  Introduction
2.  The Problem Space
3.  The Locator Path Liveness Problem
  3.1.  Complexity
    3.1.1.  Complexity of Host-Based Probing
    3.1.2.  Complexity of Network-Based Probing
  3.2.  Possible Optimizations
  3.3.  Security Issues
4.  Site-Based State Synchronization
  4.1.  Complexity
5.  Conclusions
6.  Acknowledgments
7.  IANA Considerations
8.  References
  8.1.  Normative References
  8.2.  Informative References
Authors' Addresses
Intellectual Property and Copyright Statements

1.  Introduction

Locator/ID Separation (hereafter Loc/ID split) has been proposed as an enhancement to the Internet architecture to facilitate, among other things, scaling of the global routing system [RFC1498][Chiappa99][Fuller06][RFC4984]. The basic idea is that the current number space (the IPv4/IPv6 address space) is overloaded with both location and identity semantics. One consequence of this overloading is that it is difficult to assign routing locators (RLOCs) in a way that is congruent with the underlying network topology; this makes aggregation difficult (if not impossible). This property is sometimes referred to as Rekhter's Law, and is frequently formulated as follows:

   "Addressing can follow topology or topology can follow addressing. Choose one."
Endpoint Identifiers (EIDs), on the other hand, are typically assigned without regard to the underlying network topology (e.g., Host Identity Tags [RFC4423]). This makes it difficult for a single numbering space to efficiently serve both the routing locator and endpoint identifier roles.

Locator/ID Separation can be used to decouple the allocation of EIDs from that of RLOCs, enabling the RLOC space to be aggressively aggregated (i.e., by aligning RLOC allocations with the underlying network topology). The positive effect of such aggregation would be to control the growth of global routing state (note that aggregation in the EID space may also be an issue, but as of this writing it has not been extensively explored).

Recent work on Locator/ID Separation has focused almost exclusively on control plane protocols for finding Identifier-to-Locator mappings (for example, [I-D.fuller-lisp-alt][I-D.jen-apt][I-D.lear-lisp-nerd]). However, experience gained with a trial deployment of a system designed to implement Locator/ID Separation has revealed two general classes of problems that must be resolved after the mapping is found: the Locator Path Liveness Problem and the State Synchronization Problem. These problems have implications for the data plane as well as the control plane.

This document focuses on the Locator Path Liveness and State Synchronization problems, and is organized as follows: Section 2 provides an overview of the problem space. Section 3 discusses the Locator Path Liveness problem, and Section 4 discusses the State Synchronization problem. Finally, Section 5 provides a few conclusions.

2.  The Problem Space

Decoupling location and identity has profound implications for both the control and data planes.
In particular, decoupling location from identity leads to two difficult problems. First, given a set of source locators and a set of destination locators, it must be possible to determine whether a particular destination locator is reachable. We refer to this general problem as the Locator Path Liveness Problem. The Locator Path Liveness Problem is exhibited in host-based architectures such as SHIM6 [I-D.ietf-shim6-proto] and in network-based architectures such as eFIT [EFIT] and LISP [LISP]. The "Hybrid Rewriting" class of architectures (e.g., GSE [ODell97]) exhibits a slight variant of the problem. Locator Liveness is discussed in Section 3.

The second problem is that mapping state may need to be shared among network elements (as opposed to determining whether the locator itself is up or down). This is referred to as the Site-Based State Synchronization Problem, and is specific to network-based architectures. The Site-Based State Synchronization problem is discussed in Section 4.

3.  The Locator Path Liveness Problem

The Locator Path Liveness Problem can be stated as follows: Given a set of source locators and a set of destination locators, can bidirectional connectivity be determined between the address pairs?

A simple example illustrates the problem. Consider the scenario depicted in Figure 1. Here a site S0 is multihomed to provider A and provider B. Further, suppose that S0 has a Provider Assigned (PA) locator from provider A (call it La) and a PA locator, Lb, from provider B. Suppose that provider A peers with provider B. In this case, S0 might "advertise" to its correspondent sites that its EID-prefixes can be reached through nodes La and Lb (either via DNS, an explicit protocol message such as a Map-Reply message [LISP], or some other method).

Now, suppose that a correspondent site S1 is connected to provider C, and that S0 has told S1 that it can reach S0 on either La or Lb.
Suppose further that S1 chooses La to reach S0, so that packets sourced from S1 destined for S0 traverse the path S1->C->B->A->S0. Note that if connectivity between provider B and provider A is disrupted (for either business or technical reasons), La will not be reachable from S1. In this case, S1 must detect that La is no longer reachable and use Lb to restore connectivity (in the event that S1 wants to restore connectivity; in today's Internet, S0 would continue to be unreachable).

                        S1
                        |
                        |
                        C
                        |
                        |
      A-----------------B
       \  peering link /
        \             /
         \           /
          \         /
           \       /
            La    Lb
             \   /
              S0

                 Figure 1: Reachability Failure

The Locator Path Liveness problem arises in subtly different ways, depending on the contents of the mapping database (i.e., EIDs, RLOCs, or some combination of these), who queries the database (host or network element), and how knowledge is distributed between hosts and routing. Note that in general, Locator Path Liveness must be tested in the data plane (although an implementation might take advantage of various "hints"; see Section 3.2).

Host-Based Architectures:  In host-based architectures (e.g., SHIM6 [I-D.ietf-shim6-proto]), the problem arises because queries to the database (DNS in this case) return "addresses" which can be thought of as a concatenation of the RLOC and EID. Since a host is anticipated to have multiple such "addresses" (at least in the SHIM6 case), it must choose a working pair from among its potential source addresses and its correspondent's destination addresses. REAP [I-D.ietf-shim6-failure-detection] is a probe-based reachability protocol designed to address this problem.

Hybrid Network-Based Rewriting Architectures:  In hybrid network-based rewriting architectures (e.g., GSE [ODell97]), the problem arises because there is a knowledge asymmetry between the host and routing.
Specifically, while the host is responsible for selecting the destination Routing Goop (RG) (i.e., the ingress point to the destination domain, essentially the destination RLOC), it is routing that selects the source RG. So while the IGP routing in a domain can be intelligent about egress points from the domain, it is the destination address, chosen by the host, that selects the ingress point in the destination domain. However, it is routing, and not the host, that knows whether the destination is reachable. Section 4.2 of [Zhang06] discusses this issue from a slightly different point of view.

This asymmetry gives rise to the following problem: Hosts will likely want information, at some granularity, about which pairs currently work. However, the host has no information about how many RGs are available to the site or whether they are currently reachable, so the host cannot test the set of pairs for active paths. Routing cannot do so either, unless it snoops on TCP connections (which does not deal with asymmetric paths, UDP flows, or unidirectional flows).

It is worth noting that, unlike most "modern" descriptions of how GSE uses the DNS (e.g., [Zhang06]), the original GSE design [ODell97][ODell08] envisioned that the DNS would have a new resource record type, the RG record, to carry a site's RGs. Hosts would only have AAAA records. The idea was that for a given destination domain, a host in the source domain would compute the Cartesian product {RGs}x{A4s} (the site's RGs crossed with the host AAAA records). Thus alternate path sensing would become a matter of local policy, and would not be hard-wired by the destination domain (or whoever happens to be authoritative for the destination domain's names). Notice, however, that even with the introduction of the RG resource record, the knowledge asymmetry remains.
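
As a rough illustration of the {RGs}x{A4s} computation described above, the following Python sketch enumerates the candidate (RG, address) combinations for one destination domain. It is not taken from the GSE design: the function name and the sample values are hypothetical, and the sketch assumes only that a site's RGs and a host's AAAA-derived values are available as two lists.

```python
from itertools import product


def candidate_pairs(routing_goops, host_addresses):
    """Return every (RG, address) combination for one destination domain.

    Both arguments are plain lists; the names are illustrative only.
    """
    return list(product(routing_goops, host_addresses))


# Hypothetical values, for illustration only.
site_rgs = ["RG-A", "RG-B"]
host_a4s = ["host-1", "host-2", "host-3"]

pairs = candidate_pairs(site_rgs, host_a4s)
print(len(pairs))  # 2 RGs x 3 addresses = 6 candidates
```

Note that enumerating the candidates says nothing about which of them currently work, which is exactly the knowledge asymmetry described above.
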

Network-Based Map-and-Encap Architectures:  In the case of map-and-encap network-based architectures, the problem arises because the mapping element (e.g., an Ingress Tunnel Router, or ITR) must choose among the RLOCs it has learned for a given EID-prefix. Here it is the ITR, which holds the mappings, and not the host, that knows the set of possible remote "addresses". The host may choose among multiple EIDs, but it cannot choose among the possible RLOCs (the host has no access to that information). Hence, if the ITR chooses an RLOC that is not reachable, traffic to the destination site will be blackholed, and the host is left with no recourse.

3.1.  Complexity

The complexity of testing Locator Path Liveness in the data plane is roughly O(M*N), where there are M source addresses and N destination addresses. The following sections analyze the complexity of host-based and network-based liveness probing more closely.

3.1.1.  Complexity of Host-Based Probing

Host-based implementations must keep per-correspondent host liveness state. The complexity of probing in a host-based implementation can be thought of as follows:

   Let C   = the number of correspondent hosts
   Let D_i = the number of destination locators for host C_i
   Let S   = the number of source locators

Then the complexity of host-based probing is O(P_host), where

   P_host = S * sum(D_i), i = 0...C-1

3.1.2.  Complexity of Network-Based Probing

Network-based implementations must keep per-destination egress point liveness.
The complexity of probing in a network-based implementation can be thought of as follows:

   Let N   = the number of EID-prefixes in a network element's cache
   Let L_i = the number of locators for EID-prefix N_i
   Let M   = the number of source locators

Then the complexity of network-based probing is O(P_network), where

   P_network = M * sum(L_i), i = 0...N-1

Note that a network-based probing scheme might have an advantage here, since a single EID-prefix may cover many correspondent hosts. That is,

   sum(L_i), i = 0...N-1  <<  sum(D_i), i = 0...C-1

3.2.  Possible Optimizations

The previous sections analyzed the complexity of explicitly probing to assess Locator Path Liveness. In order to mitigate this complexity, an implementation might attempt to rely on various "hints". The following items, while not intended to be an exhaustive survey, outline some of the Locator Path Liveness hints an implementation may utilize.

Data Traffic:  When data is received, an implementation might assume that the source of that traffic is reachable, and as such probing might not be needed. This is at best a unidirectional "hint" that an implementation might use to determine locator liveness. Only a complete round trip, in which the distant site sends back to the local site something that the local site originally sent to the distant site, can guarantee that the distant site can hear the local site.

A variation on this theme is to "piggyback" liveness testing on user data traffic by adding a Solicit-User-Probe-Reply bit, which tells the far end to send back the next user data packet(s) with the outbound nonce and a User-Probe-Reply bit set. This optimization depends on the existence of some traffic (even if not for the same connection) going between pairs of border elements.
That is, if a particular pair has traffic in only one direction, this method fails. In addition, it requires extra processing on user data packets, extra overhead in the packets (a field, some bits), and extra protocol complication. Moreover, such piggybacking only provides the view from the remote domain, not whether the locator is actually reachable from the recipient of the "User-Probe-Bit".

Protocol Control Messages:  If a protocol control message is received (for example, a Map-Reply), an implementation may conclude that the source of that message is reachable. Again, in the best case this is only a hint, since receipt of the control message proves only unidirectional connectivity.

Piggybacking Liveness Indications:  A network-based architecture might piggyback indications of intra-domain locator liveness on other data and/or protocol messages. An example of this approach is LISP's use of loc-reach bits to indicate which Egress Tunnel Routers in a domain are up (from a perspective inside the domain).

Existence of the Locator in Underlying Routing:  A device which is responsible for locator liveness can utilize underlying routing to determine whether the locator is available at all. If the network prefix (or a covering aggregate) for the destination locator is NOT found in underlying routing, then the path will not be available. This is at best a negative detection: it can show when a path is not available, but not the liveness of a particular locator. A given locator may still be unavailable without this being shown in routing, due to data plane filtering or to the reachability being hidden by aggregation of the particular locator prefix.

Positive Feedback From Other Protocols:  An implementation may be able to deduce some forms of reachability from other protocols. For example, TCP might indicate to the IP layer that it believes that there is bidirectional connectivity between a given address pair. This might be signalled to the source when it receives a SYN-ACK from the destination RLOC.
As pointed out in [I-D.ietf-shim6-failure-detection], this is similar to how IPv6 Neighbor Unreachability Detection can be avoided when upper layers provide information about bidirectional connectivity [RFC4861].

If an implementation has access to higher layer protocols (e.g., BGP), it might get a hint as to the reachability of a given locator. In the case of BGP, an implementation might conclude that the locator is reachable if there is a covering prefix in the BGP Routing Information Base (RIB). Again, this is only a hint, because the correspondent host may be down.

Timeouts:  An implementation may be able to deduce some forms of unreachability from timeouts of other protocols. For example, TCP may indicate a lack of connectivity because it is not receiving ACKs (of course, the signal is overloaded: the cause may instead be congestion).

ICMP Messages:  While ICMP is an available signalling protocol, its utility has been somewhat marginalized by its lack of security (in particular, ease of spoofing [I-D.ietf-tcpm-icmp-attacks]) and by the common policy of blocking or rate limiting ICMP (see Section 3.3). As such, ICMP may perhaps be used as a hint, but beyond that an implementation cannot rely on ICMP as a signalling mechanism.

QQQ: Again, when do I know a locator is up? If I probe and the response is positive, does that mean it is up? It can go down in the interim, so what is the time granularity, and what effect does that have on efficiency?

In general, depending on end-to-end liveness indications is applicable only to host-based solutions (e.g., [I-D.ietf-shim6-proto]). A network-based implementation may rely on higher layer protocols to indicate liveness (for example, an implementation may be able to deduce a limited form of reachability from the existence of a BGP route covering the destination RLOC), but these too can only be used as hints.
In the general case, however, an architecture that implements Loc/ID split (either host-based or network-based) will need to test Locator Path Liveness in the data plane.

3.3.  Security Issues

Mere inspection of insecure traffic may lead to false negative detection due to the insertion of malicious traffic. For instance, packets that masquerade as coming from a site may tamper with the loc-reach bits, making the site's locators look unreachable when in fact they are reachable [LISP].

ICMP Messages:  ICMP messages are easily spoofable [I-D.ietf-tcpm-icmp-attacks], so they may be exploited to provide false negatives. However, they are also rate limited and often outright disabled, which can leave a site sending data to a remote RLOC under the impression that the RLOC is reachable (a false positive side effect).

Existence of the Locator in the BGP RIB:  This vulnerability is shared by non-Loc/ID split architectures (need reference to the Pakistani-YouTube example as a way compromised routing can break path liveness).

Aside from the ability to mislead a poorly implemented probing mechanism with data spoofing, probing creates a fundamentally unscalable relationship between site pairs (see Section 3.1). This leads to both implicit (unscalable) and explicit (vulnerable to probe floods) Denial of Service vulnerabilities in the systems receiving probe requests.

Finally, note that in the case of network-based Loc/ID split architectures, the RLOCs of border elements represent reachability on behalf of an entire site. As a result, failure to detect path liveness can disrupt connectivity to the entire site. In host-based Loc/ID split architectures, on the other hand, only individual hosts are affected.

4.  Site-Based State Synchronization

The Site-Based State Synchronization problem is specific to network-based Loc/ID split architectures.
There are two kinds of state synchronization that might need to be performed: mapping state synchronization and locator liveness synchronization.

The Site-Based State Synchronization problem can most easily be demonstrated by a simple example. Consider the following case: A site has two ITRs; one ITR is on the active path and the other ITR is on a backup path. In this case, all traffic egressing from the site traverses the ITR on the active path, and as a result that ITR is caching the mapping state for all of the active flows. The ITR on the backup path has no mapping state.

Now, when the ITR on the active path fails, traffic is naturally shifted to the ITR on the backup path. If the now active ITR has not synchronized its state with the previously active ITR(s), then it has to reconstruct the mapping state for the flows that were traversing the failed ITR. In particular, the failure, which is local to the site, requires the now active ITR to go off-site to reconstruct the state.

4.1.  Complexity

TBD

5.  Conclusions

Architectures that implement Locator/ID Separation (either host-based or network-based) need to carefully evaluate the complexity inherent in determining Locator Path Liveness. The complexity of mapping state synchronization is an additional concern for network-based architectures.

6.  Acknowledgments

Scott Brim, Noel Chiappa, John Day, Dino Farinacci, Vince Fuller, Mike O'Dell, Andrew Partan, and John Zwiebel provided insightful comments on early versions of this document.

7.  IANA Considerations

This document creates no new requirements on IANA namespaces [RFC2434].

8.  References

8.1.  Normative References

[Chiappa99]  Chiappa, N., "Endpoints and Endpoint Names: A Proposed Enhancement to the Internet Architecture", xxx 1999.

[EFIT]  Massey, D., "A Proposal for Scalable Internet Routing & Addressing", Feb 2007.

[Fuller06]  Fuller, V., "Scaling issues with ipv6 routing+multihoming", Oct 2006.

[I-D.fuller-lisp-alt]  Farinacci, D., "LISP Alternative Topology (LISP+ALT)", draft-fuller-lisp-alt-02 (work in progress), April 2008.

[I-D.ietf-shim6-failure-detection]  Arkko, J. and I. Beijnum, "Failure Detection and Locator Pair Exploration Protocol for IPv6 Multihoming", draft-ietf-shim6-failure-detection-13 (work in progress), June 2008.

[I-D.ietf-shim6-proto]  Nordmark, E. and M. Bagnulo, "Shim6: Level 3 Multihoming Shim Protocol for IPv6", draft-ietf-shim6-proto-10 (work in progress), February 2008.

[I-D.ietf-tcpm-icmp-attacks]  Gont, F., "ICMP attacks against TCP", draft-ietf-tcpm-icmp-attacks-03 (work in progress), March 2008.

[I-D.jen-apt]  Jen, D., Meisel, M., Massey, D., Wang, L., Zhang, B., and L. Zhang, "APT: A Practical Transit Mapping Service", draft-jen-apt-01 (work in progress), November 2007.

[I-D.lear-lisp-nerd]  Lear, E., "NERD: A Not-so-novel EID to RLOC Database", draft-lear-lisp-nerd-04 (work in progress), April 2008.

[LISP]  Farinacci, D., Fuller, V., Oran, D., and D. Meyer, "Locator/ID Separation Protocol (LISP)", draft-farinacci-lisp-10 (work in progress), Oct 2008.

[ODell08]  Odell, M., "GSE - An Alternate Addressing Architecture for IPv6 (Private Communication)", Dec 2008.

[ODell97]  Odell, M., "GSE - An Alternate Addressing Architecture for IPv6", Oct 2006.

[RFC1498]  Saltzer, J., "On the Naming and Binding of Network Destinations", RFC 1498, August 1993.

[RFC2434]  Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 2434, October 1998.

[RFC4423]  Moskowitz, R. and P. Nikander, "Host Identity Protocol (HIP) Architecture", RFC 4423, May 2006.

[RFC4861]  Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, September 2007.

[RFC4984]  Meyer, D., Zhang, L., and K. Fall, "Report from the IAB Workshop on Routing and Addressing", RFC 4984, September 2007.

[Zhang06]  Zhang, L., "An Overview of Multihoming and Open Issues in GSE", Sept 2006.

8.2.  Informative References

Authors' Addresses

   David Meyer
   Cisco

   Email: dmm@1-4-5.net


   Darrel Lewis
   Cisco

   Email: darlewis@cisco.com

Full Copyright Statement

Copyright (C) The IETF Trust (2008).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.