Network Working Group                                              X. Xu
Internet-Draft                                                    Huawei
Intended status: Standards Track                              P. Francis
Expires: April 29, 2009                                       Cornell U.
                                                        October 26, 2008


                         Tunnel Endpoints in BGP
                        draft-xu-idr-tunnel-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 29, 2009.

Abstract

   Virtual Aggregation (VA) is a mechanism for shrinking the size of the
   DFZ FIB in routers [I-D.francis-idr-intra-va].  VA can result in
   longer paths and increased load on routers within the ISP that
   deploys VA.  This document describes a mechanism that allows an AS
   that originates a route to associate a tunnel endpoint terminating at
   itself with the route.  This allows routers in a remote AS to tunnel
   packets to the originating AS.  If transit ASes between the remote AS
   and the originating AS install the prefixes associated with tunnel
   endpoints in their FIBs, then tunneled packets that transit through
   them will take the shortest path.
   This results in reduced load for the transit AS, and better
   performance for the customers at the source and destination.

Table of Contents

   1.  Introduction
     1.1.  Requirements notation
   2.  Endpoint Address Sub-TLV Definition
   3.  Usage of the TE-Attribute with an Endpoint Address Sub-TLV
     3.1.  Originating AS
     3.2.  Non-Originating ASes
   4.  IANA Considerations
   5.  Security Considerations
   6.  Normative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   Virtual Aggregation (VA) [I-D.francis-idr-intra-va] is a mechanism
   for reducing FIB size for routers within the AS that deploys VA.
   This is done through "FIB Suppression", where certain routers in the
   AS may not install routes to certain prefixes in their FIB.  The
   downside of using VA is that packets addressed to suppressed prefixes
   transiting the AS may take a longer path than otherwise necessary.

   For instance, imagine a packet traversing AS-path S-A-B-C-D, where
   ASes S and D are the service providers for their respective
   customers.  Further, assume that ASes A, C, and D are using VA, and
   that A and C are FIB-suppressing the prefix associated with the
   packet.  In this case, when the packet transits A and C, there is a
   good chance that it will take an extra router hop within A and C.
   This increases load for A and C, and degrades performance for S's and
   D's customers.

   The mechanism described in this draft allows D, for instance, to
   associate a tunnel endpoint with the prefixes that it originates.
   The tunnel endpoint is an anycasted address that terminates at all of
   D's routers.  If A and C FIB-install the route to the prefix
   associated with the tunnel endpoint, then packets tunneled to the
   FIB-suppressed prefix will take the shortest path.

   This draft describes a mechanism for advertising the tunnel endpoint
   address in BGP.  It does so without changes to how BGP computes
   routes, and in such a way that packets always follow the expected AS
   path.  In other words, a tunnel T to a prefix P is not used unless
   the AS path of the tunnel route and the AS path of the prefix route
   are the same.

   This draft uses the Tunnel Encapsulation Attribute (TE-Attribute)
   defined in [I-D.ietf-softwire-encaps-safi] to encode the tunnel
   information.  However, whereas [I-D.ietf-softwire-encaps-safi]
   couples the TE-Attribute with the "Encapsulation SAFI", this draft
   uses the TE-Attribute in normal BGP updates transmitted over multiple
   ASes across the Internet.

   This draft extends the use of the optional, transitive TE-Attribute
   defined in section 4 of [I-D.ietf-softwire-encaps-safi].  Its
   purpose, as defined in [I-D.ietf-softwire-encaps-safi], is to allow a
   router acting as a tunnel endpoint to signal its tunnel type and
   tunnel parameters.  The TE-Attribute does not convey the actual IPv4
   or IPv6 address of the tunnel endpoint.  Rather, this information is
   carried in the NEXT_HOP field of BGP.  As such, the scope of
   [I-D.ietf-softwire-encaps-safi] is limited to the set of routers that
   do not change the NEXT_HOP field.

   This draft extends the use of the TE-Attribute so that it can be
   passed from AS to AS in normal BGP reachability updates.

1.1.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Endpoint Address Sub-TLV Definition

   This draft defines a new sub-TLV to be used with the TE-Attribute,
   the "Endpoint Address" sub-TLV.  The sub-TLV Type is TBD.  The
   sub-TLV Value field is defined as:

      +---------------------------------------------------------+
      |          Address Family Identifier (2 octets)           |
      +---------------------------------------------------------+
      |                   Reserved (1 octet)                    |
      +---------------------------------------------------------+
      |      Length of Autonomous System Number (1 octet)       |
      +---------------------------------------------------------+
      |           Autonomous System Number (variable)           |
      +---------------------------------------------------------+
      |               Endpoint Address (variable)               |
      +---------------------------------------------------------+

   The Autonomous System (AS) Number is that of the AS originating the
   route.  The Endpoint Address is that of the tunnel endpoint.

3.  Usage of the TE-Attribute with an Endpoint Address Sub-TLV

   The following usage rules apply only to TE-Attributes that are NOT
   associated with an Encapsulation SAFI (i.e., as defined by
   [I-D.ietf-softwire-encaps-safi]) and that include an Endpoint Address
   Sub-TLV.

3.1.  Originating AS

   Only the router originating a route may include a TE-Attribute.  In
   other words, the TE-Attribute MUST NOT be added to received routes.
   The AS number in the Endpoint Address Sub-TLV MUST match that used as
   the first AS in the AS path.

   The Endpoint Address itself does not have to be from the same AF as
   the reachable NLRI in the update.  The reachable NLRI may be both
   IPv4 and IPv6.  However, there MUST be an NLRI in the UPDATE that
   contains the endpoint address.
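
   As an informal illustration (not part of this specification), the
   Value field layout from Section 2 can be sketched in Python.  Several
   details below are assumptions that this draft does not fix: the
   Length of Autonomous System Number is taken to be a count of octets
   (2 or 4), the Reserved octet is sent as zero, the Endpoint Address
   length is inferred from the AFI (1 for IPv4, 2 for IPv6), and all
   fields use network byte order.  The function names are hypothetical.

```python
import ipaddress
import struct

AFI_IPV4 = 1  # IANA Address Family Numbers
AFI_IPV6 = 2


def encode_endpoint_value(origin_as: int, endpoint: str) -> bytes:
    """Build the Value field: AFI, Reserved, AS length, AS number, address."""
    if not 0 <= origin_as <= 0xFFFFFFFF:
        raise ValueError("AS number out of range")
    addr = ipaddress.ip_address(endpoint)
    afi = AFI_IPV4 if addr.version == 4 else AFI_IPV6
    # Assumption: use 2 octets when the AS number fits, otherwise 4.
    as_len = 2 if origin_as <= 0xFFFF else 4
    # "!HBB": network byte order, 2-octet AFI, 1-octet Reserved (zero),
    # 1-octet AS number length.
    header = struct.pack("!HBB", afi, 0, as_len)
    return header + origin_as.to_bytes(as_len, "big") + addr.packed


def decode_endpoint_value(value: bytes) -> tuple:
    """Parse a Value field and return (origin_as, endpoint_address)."""
    if len(value) < 4:
        raise ValueError("truncated Value field")
    afi, _reserved, as_len = struct.unpack("!HBB", value[:4])
    if as_len not in (2, 4):
        raise ValueError("unsupported AS number length")
    # Assumption: the address length follows from the AFI.
    addr_len = {AFI_IPV4: 4, AFI_IPV6: 16}.get(afi)
    if addr_len is None:
        raise ValueError("unsupported address family")
    if len(value) != 4 + as_len + addr_len:
        raise ValueError("Value field length does not match its contents")
    origin_as = int.from_bytes(value[4:4 + as_len], "big")
    endpoint = ipaddress.ip_address(value[4 + as_len:])
    return origin_as, endpoint


if __name__ == "__main__":
    # Example using a documentation AS number and documentation prefix.
    value = encode_endpoint_value(64500, "192.0.2.1")
    print(value.hex(), decode_endpoint_value(value))
```

   A receiver built along these lines would compare the decoded AS
   number with the first AS in the AS path, per the rule above, before
   treating the endpoint as usable.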

   An originating AS as defined here may be an AS that receives a route
   from a customer that uses a private AS number.

   If a tunnel endpoint router receives a packet on the tunnel, and the
   only known route to the destination is via routes originated by other
   ASes (not including private ASes of customers), then the packet must
   be dropped.  This prevents transient loops whereby the ASes of a
   multi-homed customer both think that the other AS can reach the
   customer.  Once the route withdrawal reaches all other ASes, no more
   packets will be received via the tunnel.

   All routers in the origin AS MUST use the same Endpoint Address,
   which is anycasted across all routers.  The reason for imposing this
   restriction is as follows.  Say that an origin AS used different
   endpoint addresses for different routers, and that an upstream AS
   that does not recognize the TE-Attribute decided to aggregate two
   UPDATEs with different endpoint addresses.  The aggregating AS might
   drop one of the TE-Attributes but include the other, with the result
   that the tunnel endpoint in the resulting UPDATE would be
   undetectably incorrect with respect to some of the NLRI in the
   UPDATE.

   The complete TE-Attribute produced by all routers in the originating
   AS MUST be identical.  The Protocol and Color Sub-TLV types are not
   used.  If the encapsulation technique is GRE, and no key value is
   used, then the Endpoint Address Sub-TLV is the only one required.  If
   the key value is used, or L2TPv3 is the tunnel type, then the
   Encapsulation Sub-TLV associated with the tunnel type is included.

   Note that even though the above paragraph states that the
   TE-Attribute produced by all routers must be identical, in practice
   this is not strictly possible.
   If an AS decides to modify the endpoint address it uses, or decides
   to modify the tunnel type or tunnel parameters it uses, then for some
   period of time different routers will in fact be producing different
   TE-Attributes (i.e., while routers are being reconfigured).  When
   this is the case, all routers MUST be able to receive tunneled
   packets for every TE-Attribute being produced by any router in the
   AS.

   For example, assume that an AS wants to modify its TE-Attributes from
   tunnel A to tunnel B (where A and B have different endpoint
   addresses, different tunnel types, or different tunnel parameters).
   The network administrator would first configure all routers to accept
   both tunnels A and B.  He or she would then reconfigure the routers
   to produce TE-Attributes for tunnel B.  Once that is complete, he or
   she would delete tunnel A from all routers.

   It is for further study whether IP-in-IP encapsulation is required.
   It is also for further study whether multiple encapsulation types are
   required for the same UPDATE (i.e., to allow a remote router with
   limited encapsulation types to be able to select an encapsulation
   type that works for it).

3.2.  Non-Originating ASes

   ASes that have deployed VA SHOULD FIB-install routes containing the
   Endpoint Address.  This will prevent packets tunneled to Endpoint
   Addresses from taking any extra hops.

   When a router in a non-originating AS receives a route with an
   associated Endpoint Address, it must decide whether or not to use the
   tunnel.  The router always has the option of ignoring the tunnel (and
   will do so by default if it does not recognize the TE-Attribute).
   This section describes the criteria that determine when the router
   may use the tunnel.  The router MUST NOT use the tunnel unless all of
   the following criteria are met:

   1.  The AS_PATH to the tunnel endpoint matches the AS path to the
       reachable prefix.

   2.  The AS_PATH advertised by the AS, for all NLRI for which a tunnel
       is used, matches that of the tunnel.

   3.  The first AS in the AS_PATH Attribute is in an AS_SEQUENCE (not
       an AS_SET), and this AS matches the AS in the TE-Attribute.  This
       prevents an error whereby an aggregating AS combines NLRI from
       different originating ASes, and throws away all but one of the
       TE-Attributes, thus resulting in an Endpoint Address that is
       incorrect.

   4.  If there are multiple TE-Attributes in the update, they MUST all
       be identical.  In this case, the AS SHOULD delete all but one of
       the TE-Attributes from UPDATEs it passes on.  If they are not all
       identical, then the AS MUST ignore them and remove all of them
       from any UPDATEs that it passes on.

   Note that the above rules have the characteristic that, if a transit
   AS decides to use one AS path to some prefixes from an origin AS, and
   another AS path to other prefixes from the origin AS, then only one
   of these paths can have a valid endpoint address associated with it.
   Packets sent over the other path cannot be tunneled.  One way to fix
   this, which would require changes to this draft, would be to encode
   the tunnel endpoint as a block of addresses.  In this case, the
   transit AS that wishes to use multiple paths to different prefixes
   from an origin AS can deaggregate the block of addresses, and
   associate one deaggregated tunnel endpoint block with each selected
   path.  Whether this is a good idea is for further study.

4.  IANA Considerations

   IANA must issue a new Sub-TLV type for the Tunnel Encapsulation
   Attribute for the Endpoint Address Sub-TLV.

5.  Security Considerations

   Because there are no changes in the BGP route selection process,
   there are no changes to the security properties of BGP as a result of
   this draft.

6.  Normative References

   [I-D.francis-idr-intra-va]
              Francis, P., Xu, X., and H.
              Ballani, "FIB Suppression with Virtual Aggregation and
              Default Routes", draft-francis-idr-intra-va-01 (work in
              progress), September 2008.

   [I-D.ietf-softwire-encaps-safi]
              Mohapatra, P. and E. Rosen, "BGP Encapsulation SAFI and
              BGP Tunnel Encapsulation Attribute",
              draft-ietf-softwire-encaps-safi-03 (work in progress),
              June 2008.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

Authors' Addresses

   Xiaohu Xu
   Huawei Technologies
   No.3 Xinxi Rd., Shang-Di Information Industry Base,
   Hai-Dian District
   Beijing, Beijing  100085
   P.R.China

   Phone: +86 10 82836073
   Email: xuxh@huawei.com


   Paul Francis
   Cornell University
   4108 Upson Hall
   Ithaca, NY  14853
   US

   Phone: +1 607 255 9223
   Email: francis@cs.cornell.edu

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.