Internet-Draft   BGP community for IPv6 site multihoming   October 2003

Individual Submission                                     Sang-Ha Kim*
Internet Draft                                            Da-Hye Choi*
                                         ChungNam National University*
                                                      Hyoung-Jun Kim**
                                                       Hyun-Wook Cha**
                                                                ETRI**
Expires: 18 April 2004                                 20 October 2003

        An Application of the BGP Extended Community Attribute
                for Distributed IPv6 Site Multihoming

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026 except that the right to
produce derivative works is not granted [1].

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress".

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

This document presents a new IPv6 site multihoming scheme and its
operational requirements. It aims to solve potential deployment
problems in "IPv6 Multihoming Support at Site Exit Routers" [2] by
applying a BGP extended community attribute, called the multihomed
community. When a link fails in [2], the basic operations that
support multihoming depend entirely on the functionality of border
routers in other ISPs. This causes the following problems:
centralized encapsulation overhead for re-routing, packet delivery
along an unoptimized tunneling session, and loss of connectivity in
case of ISP failure.
Also, [2] does not provide any alternative mechanism for long-term
failure. For these reasons, [2] has little applicability when a link
remains failed for a long time.

In this memo, we propose a new IPv6 site multihoming scheme that
establishes multiple direct tunneling sessions between each sender's
site exit router and a reachable site exit router of the receiver,
instead of the single tunneling session between a border router and a
site exit router in [2]. Furthermore, with some additional BGP
operations, it can preserve the connectivity of ongoing sessions
regardless of ISP failure, intermediate link failure, or failure of a
directly connected site link.

Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.

Table of Contents

1. Terminology
2. Introduction
3. Overview
4. Distributed IPv6 Site Multihoming
   4.1 Operations for site link failure
   4.2 Operations for ISP or intermediate link failure
5. Requirements
   5.1 Host requirements
   5.2 Router requirements
6. Algorithm to Identify Failure
7. Design Recommendation
8. Security Considerations
9. References
10. Authors' Addresses

1. Terminology

This memo uses the terminology described in [2].
In addition, the following new terms are defined:

Multihomed community
   A newly defined BGP extended community that is used to convey the
   information and current status of a multihomed network to other
   customer networks.

Primary site exit router
   The site exit router designated, among all site exit routers
   within a multihomed site, for a delegated address block. Packets
   destined to the delegated addresses are routed via this site exit
   router.

2. Introduction

The typical motivation for IP multihoming is to improve reliability
and network performance. In addition, multihoming makes congestion
avoidance through load sharing achievable. However, strict route
aggregation, which is one of the fundamental IPv6 routing policies,
makes it difficult to provide multihoming the way IPv4 does. This is
mainly because strict route aggregation does not allow a site to
announce the address space obtained from one of its providers to both
of its upstream providers.

To support IPv6 multihoming seamlessly, many different approaches
have been discussed in the past few years. They are broadly
classified as host multihoming and site multihoming. A site
multihoming solution is a multihoming scheme that is supported only
by the functionality of routers in the ISPs that have direct physical
links to the multihomed customer network. It therefore requires no
modification to, and has no influence on, the routing tables in core
networks, so such solutions are considered scalable and simple.

Among site multihoming approaches, "IPv6 Multihoming Support at Site
Exit Routers" [2] proposes non-direct BGP peering for multihoming
support. Though it does not require new protocols or mechanisms, it
still has unsolved deployment problems. These problems result from
the basic operational principle of [2].
When link failure or router misbehavior occurs in [2], multihoming
support depends mostly on the functionality of border routers in
other ISPs. It may therefore suffer from centralized re-routing
overhead, increased end-to-end delay due to data delivery along an
unoptimized path, and loss of connectivity in case of ISP failure. In
particular, since it provides no alternative solution for long-term
failure, it has little applicability.

This document defines a new process to support IPv6 multihoming,
which can solve the problems described above by establishing multiple
direct tunneling sessions between the sender's border router and a
site exit router in the multihomed site. Doing so makes it possible
to distribute the centralized overhead to several related ISPs and to
decrease end-to-end delay by constructing a new optimized path. The
detailed procedure is described in section 4.

3. Overview

The goal of this multihoming scheme is to distribute the centralized
re-routing overhead of [2] to each sender's exit router while
preserving connectivity even when ISP or intermediate link failure
occurs. To achieve this, multiple direct tunneling sessions between
the sender's site exit router and reachable exit routers in
multihomed sites are established when a link problem occurs. Applying
this scheme to IPv6 networks therefore requires propagating the
information and current status of each multihomed site. This
information is defined as a new extended BGP community, called the
Multihomed community. The community is encoded in a BGP UPDATE
message and then propagated into the whole network. Based on the
information specified in this community, each border router decides
whether or not to establish a new direct tunneling session.
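The information carried by the community, and the primary designation
that a border router derives from it, can be sketched roughly as
follows. This is only an illustration under stated assumptions: the
draft defines no wire encoding, so the record type
MultihomedCommunity and the helper build_primary_cache are
hypothetical names, the three fields are the ones listed in section
4.1, and "highest preference designates the primary" is assumed from
the preference values of Figures 2 and 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultihomedCommunity:
    """Hypothetical in-memory view of one multihomed community entry."""
    exit_router: str   # address of the advertising site exit router
    block: str         # delegated address block (kept symbolic here)
    preference: int    # larger value means more preferred


def build_primary_cache(entries):
    """Map each delegated block to the exit router with the highest preference."""
    best = {}
    for entry in entries:
        current = best.get(entry.block)
        if current is None or entry.preference > current.preference:
            best[entry.block] = entry
    return {block: entry.exit_router for block, entry in best.items()}


# Values taken from Figures 2 and 3 (normal conditions).
entries = [
    MultihomedCommunity("E-BR-A", "PreA:Presite::/nA+n", 100),
    MultihomedCommunity("E-BR-A", "PreB:Presite::/nB+n", 50),
    MultihomedCommunity("E-BR-B", "PreA:Presite::/nA+n", 50),
    MultihomedCommunity("E-BR-B", "PreB:Presite::/nB+n", 100),
]
cache = build_primary_cache(entries)
```

Under normal conditions the resulting cache names E-BR-A as primary
for the block delegated by ISP A and E-BR-B for the block delegated
by ISP B, so no tunneling session is needed.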
With these operational features, the proposed scheme meets most of
the requirements met by IPv4 multihoming, such as redundancy, load
sharing, and transport-layer survivability. In addition, our scheme
offers the following features.

- Scalability: The centralized encapsulation overhead of delivering
  all data packets toward a reachable site exit router can be
  dispersed to many border routers in the senders' ISPs. The scheme
  therefore scales much better than [2] to a large number of flows
  and to long-term failure.

- Simplicity: No additional control protocol or operational
  requirement is needed. The current link status and information of
  the multihomed network can be propagated to the whole network with
  the extended BGP community in BGP UPDATE messages. Hence, the
  scheme has no impact on hosts, which do not need to do anything to
  support multihoming, for example, source/destination address
  selection.

- Impact on routers: The extended BGP multihomed community is
  transitive across all intermediate routers. Furthermore, the volume
  of this information is very small. Since the community is processed
  only on site exit routers, the impact on routers is minimized.

4. Distributed IPv6 Site Multihoming

4.1 Operations for site link failure

In the configuration below, the multihomed site is connected to the
Internet through two different ISPs, ISP A and ISP B, via link A and
link B, respectively. Each ISP has delegated an address space to the
multihomed site. ISP A has delegated PreA:Presite::/nA+n and ISP B
has delegated PreB:Presite::/nB+n. Link A is defined as the primary
link for PreA:Presite::/nA+n and the secondary link for
PreB:Presite::/nB+n. Conversely, link B is defined as the primary
link for PreB:Presite::/nB+n and the secondary link for
PreA:Presite::/nA+n.
We assume that one sender, host C, is within customer network C and
another sender, host D, is within customer network D. The two senders
simultaneously communicate with a receiver, host A, in the multihomed
site. Host A has two unique global addresses, one delegated from each
ISP. At first, we assume that both senders start communicating with
host A using the PreA:Presite::hostA address. That is, in normal
conditions, the packets issued from both host C and host D are routed
toward host A through ISP A along link A.

+------------------+  +---------+    +---------+  +------------------+
|customer network C|--|  ISP C  |    |  ISP D  |--|customer network D|
|E-BR-C            |  |ISP-BR-C |    |ISP-BR-D |  |E-BR-D            |
+------------------+  +---------+    +---------+  +------------------+
                           |              |
                           |              |
                +-----------------------------------+
                |             Internet              |
                |                                   |
                +-----------------------------------+
                          /                \
                         /                  \
                  +-----------+         +----------+
                  |  ISP A    |         |  ISP B   |
                  | ISP-BR-A  |         | ISP-BR-B |
                  +-----------+         +----------+
                        |                    |
                 link A |                    | link B
                        |                    |
             +------------------------------------------+
             |E-BR-A                             E-BR-B |
             |PreA:Presite::/nA+n   PreB:Presite::/nB+n |
             |                                          |
             |             Multihomed site              |
             +------------------------------------------+

       [Figure 1] Example of configuration in multihomed site

Through the operation of the local interior routing protocol within
the multihomed site, the border routers of the site, E-BR-A and
E-BR-B, can learn that the customer network is multihomed. After
identifying that the network is multihomed, E-BR-A and E-BR-B each
advertise the extended BGP multihomed community toward their upstream
ISP's border router via a BGP UPDATE message. This multihomed
community contains the delegated address blocks, the router's own
address, and the corresponding preference values. E-BR-A and E-BR-B
each assign the higher preference value to the address block
delegated through their own directly connected link. For example,
E-BR-A builds the multihomed community as follows.
In general, the primary link has a greater value than the secondary
link. If there are more than two links, the preference values for the
links other than the primary one can be defined according to the
metrics used in the interior routing protocol.

---------------------------------------------------------------------
Site exit router address    Delegated address block    Preference
---------------------------------------------------------------------
E-BR-A                      PreA:Presite::/nA+n        100
E-BR-A                      PreB:Presite::/nB+n        50
---------------------------------------------------------------------
              [Figure 2] Preference value at E-BR-A

---------------------------------------------------------------------
Site exit router address    Delegated address block    Preference
---------------------------------------------------------------------
E-BR-B                      PreA:Presite::/nA+n        50
E-BR-B                      PreB:Presite::/nB+n        100
---------------------------------------------------------------------
              [Figure 3] Preference value at E-BR-B

Since the newly defined multihomed community is transitive and
transparent to all routers except site exit routers, all intermediate
routers simply re-advertise the multihomed community to neighboring
and remote BGP peers. Propagation of the community continues until a
site exit router receives the BGP UPDATE message that includes the
BGP multihomed community. On receiving the BGP UPDATE message, the
site exit router does not transfer the community to any inter-domain
routers within an ISP, and it records the primary link information
specified in the multihomed community in a cache table. This cache
table consists of the delegated address block of the multihomed site
and the address of the site exit router on the primary link (the
primary site exit router). In this example, both E-BR-C and E-BR-D
record the following information.

1) E-BR-A as primary site exit router for PreA:Presite::/nA+n
2) E-BR-B as primary site exit router for PreB:Presite::/nB+n

When link A is broken, the sender cannot preserve connectivity with
the receiver via link A.
E-BR-A can then advertise this unreachability only to E-BR-B, through
the interior routing protocol, so E-BR-B can recognize the failure of
link A. Once it detects the failure of link A, E-BR-B builds a new
BGP UPDATE message with the multihomed community, which carries the
modified preference value for the PreA:Presite::/nA+n address block.
Since link A is not reachable, link B takes the role of primary link
for both delegated address spaces. E-BR-B therefore encodes, for the
address block of link A, the same preference value as for that of
link B, and then sends the BGP UPDATE message including this
multihomed community. The BGP UPDATE message is gradually propagated
into the whole network. When the site exit routers E-BR-C and E-BR-D
receive this BGP UPDATE message, they find that E-BR-B has been
re-designated as the primary site exit router for
PreA:Presite::/nA+n. They do so by comparing the preference value
specified in the BGP UPDATE message with the recorded one, which
indicates the value in normal conditions.

Once the failure of link A is detected, the packets destined to
PreA:Presite::/nA+n should be detoured toward E-BR-B so that they are
transmitted along link B. This is accomplished by IPv6-in-IPv6
encapsulation at both E-BR-C and E-BR-D. To encapsulate packets, an
additional forwarding entry for PreA:Presite::/nA+n is appended to
the forwarding table on both E-BR-C and E-BR-D. After that, the
packets are encapsulated and then forwarded directly toward E-BR-B,
according to the forwarding tables of the intermediate routers. When
E-BR-B receives these encapsulated packets, it decapsulates them and
recovers the original packets. The original packets are forwarded to
the multihomed host by referring to the interior forwarding table in
the multihomed site.
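The behavior of a sender's site exit router such as E-BR-C can be
sketched as a small state machine. This is a minimal sketch under
assumptions, not a definitive implementation: the class
SenderExitRouter and its methods are hypothetical names, the
forwarding table is reduced to a dictionary of tunnel endpoints, and
the comparison rule ("a non-primary router that advertises at least
the recorded primary preference is the re-designated primary") is one
reading of the preference comparison described above. The branch that
removes the entry reflects the restoration of link A described in
this section.

```python
from dataclasses import dataclass, field


@dataclass
class SenderExitRouter:
    """Hypothetical model of E-BR-C or E-BR-D."""
    # block -> (primary exit router, its preference) in normal conditions
    normal: dict
    # block -> tunnel endpoint; an entry means "encapsulate toward it"
    tunnels: dict = field(default_factory=dict)

    def on_update(self, exit_router, block, preference):
        """Process one multihomed community entry from a BGP UPDATE."""
        primary, primary_pref = self.normal[block]
        if exit_router != primary and preference >= primary_pref:
            # Assumed rule: a secondary router now advertises the primary's
            # preference, so it is the re-designated primary. Append the
            # extra forwarding entry that triggers encapsulation.
            self.tunnels[block] = exit_router
        elif exit_router == primary and preference == primary_pref:
            # The original primary advertises its normal value again:
            # delete the entry and stop encapsulating.
            self.tunnels.pop(block, None)

    def next_hop(self, block):
        """Return ('tunnel', router) or ('native', primary router)."""
        if block in self.tunnels:
            return ("tunnel", self.tunnels[block])
        return ("native", self.normal[block][0])


PRE_A = "PreA:Presite::/nA+n"
router = SenderExitRouter(normal={PRE_A: ("E-BR-A", 100)})

router.on_update("E-BR-B", PRE_A, 100)   # link A failed, E-BR-B takes over
after_failure = router.next_hop(PRE_A)

router.on_update("E-BR-A", PRE_A, 100)   # link A restored
after_recovery = router.next_hop(PRE_A)
```

Keeping the normal-condition values separate from the tunnel entries
means that recovery only has to delete an entry; the recorded primary
is never overwritten.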
When link A is restored to normal conditions, E-BR-A sends a BGP
UPDATE message with the multihomed community, carrying the same
preference value as in the normal case. When both E-BR-D and E-BR-C
receive this message, they stop encapsulating packets destined to
PreA:Presite::/nA+n by deleting the corresponding entry from their
own forwarding tables. After that, all packets destined to
PreA:Presite::/nA+n are delivered to the receivers via link A.

Unlike the inbound case described above, multihoming support for
outbound packets depends entirely on the behavior of the interior
routing protocol. When link A is broken, all packets are delivered
along link B.

4.2 Operations for ISP or intermediate link failure

Section 4.1 describes the operational procedures for failure of one
directly connected link among the multiple links of a multihomed
site. That procedure, like the mechanism in [2], cannot handle ISP or
intermediate node failure. However, with some additional operational
procedures, the connectivity of ongoing sessions can be preserved
regardless of intermediate link or ISP failure. The additional
procedure is to generate periodic BGP UPDATE messages carrying the
multihomed community, so that failures between a sender's site exit
router and each site exit router of the multihomed site can be
identified. This periodic BGP UPDATE message is used to identify any
problem that occurs between the sender's customer network and the
multihomed site. If a problem occurs, the periodic message is not
propagated to each site exit router. That is, a site exit router
cannot receive the periodic BGP UPDATE message from the primary site
exit router for a delegated address block, so a procedure to
establish a new route should be carried out. In this case, the extra
overhead of the periodic update messages should be taken into
account.
However, since the volume of this information is very small and all
routers other than site exit routers simply pass the community on,
the periodic BGP UPDATE messages are not a major concern.

The detailed procedure is described with the following example. In
Figure 1, E-BR-A and E-BR-B each send a BGP UPDATE message with the
multihomed community to their upstream routers, ISP-BR-A and
ISP-BR-B, respectively. This status information of the multihomed
site is propagated into the whole network. When E-BR-C and E-BR-D
receive these BGP UPDATE messages, they designate the primary site
exit router for each address block delegated to the multihomed site,
record the related community information, and set up an associated
timer. When ISP-BR-A fails, E-BR-C and E-BR-D cannot receive the
periodic BGP UPDATE message issued from E-BR-A. On the other hand,
the BGP UPDATE message issued from E-BR-B is still propagated into
the whole network. When the site exit routers E-BR-C and E-BR-D
receive only the BGP UPDATE message generated by E-BR-B, which
indicates a new primary site exit router for PreA:Presite::/nA+n,
they wait for the BGP UPDATE message from the primary site exit
router, E-BR-A, until the timer expires. When the associated timer
expires and the BGP UPDATE message from E-BR-A has not arrived, the
senders' site exit routers, E-BR-C and E-BR-D, regard this as one of
the possible failure situations.

Once E-BR-C and E-BR-D perceive this situation, the packets destined
to PreA:Presite::/nA+n should be detoured toward E-BR-B. To do so, a
new entry is appended to the forwarding table. When the problem is
resolved, E-BR-C and E-BR-D receive BGP UPDATE messages from both
E-BR-A and E-BR-B again. After receiving BGP UPDATE messages from the
two different site exit routers, a sender's site exit router no
longer needs to encapsulate packets. Therefore, the appended entry
related to the multihomed site is removed from the forwarding table.

5. Requirements

5.1 Host requirements

In our scheme, neither a host in the multihomed site nor its
corresponding host has any requirement for supporting multihoming. In
sharp contrast to the host multihoming approach, a host does not need
to select source and destination addresses depending on network
status. Consequently, the host requirements in this scheme are
identical to those in [2].

5.2 Router requirements

All site exit routers within a multihomed site should generate BGP
UPDATE messages periodically and handle the new multihomed community.
Ordinarily, a BGP UPDATE message is generated only when the routing
table changes. In our mechanism, however, each site exit router in
the multihomed site is required to generate periodic BGP UPDATE
messages even when the BGP routing table has not changed. Apart from
this, core routers have no requirements for supporting multihoming.
Since the newly defined BGP community is designed to be transitive, a
core router simply re-advertises the community toward its peering BGP
routers.

6. Algorithm to Identify Failure

In this section, we describe the algorithms that identify failure in
the basic and extended mechanisms. After a failure is identified, a
new direct tunneling session is established and maintained by
appending a forwarding entry.

A) In case of site link failure

When a direct site link is broken, each site exit router in the
multihomed site resets the preference value for the delegated address
block so that one of the reachable site exit routers is automatically
designated as the new primary site exit router. It then generates a
BGP UPDATE message including this community attribute. When each
site exit router receives the BGP UPDATE message, it can easily
identify the re-designated primary site exit router from the modified
value.
When the link failure is recovered, each site exit router in the
multihomed site restores the preference value used in normal
conditions. This status information is propagated into the whole
network, and each site exit router then deletes the corresponding
entry from its forwarding table.

B) In case of ISP or intermediate link failure

In this case, each site exit router in the multihomed site
periodically advertises the multihomed site information. All other
site exit routers record this information in their cache tables. When
a site exit router receives a BGP UPDATE message only from a
non-primary site exit router for a delegated address block, it waits
until the timer expires. If it has not received a BGP UPDATE message
from the primary site exit router by that time, it considers the
multihomed site unreachable through that router for one of the
following probable reasons: ISP failure or intermediate link failure
toward the site exit router. Once restoration is complete, each site
exit router again receives the BGP multihomed community from the
primary site exit router as well as from all other site exit routers
in the multihomed site. After that, the site exit router performs
normal packet forwarding.

7. Design Recommendation

The following is a suggestion for those who are just starting to
deploy the proposed scheme in their own network.

- Since it takes some time to notify all site exit routers of a
  problem with BGP UPDATE messages, service disruption during
  propagation is inevitable. To minimize the disruption of ongoing
  sessions, a temporary tunneling session via non-direct BGP peering
  [2] can be used while packets continue to arrive at the ISP's
  border router. Since the architecture in [2] does not interfere
  with our scheme, combining the two can enhance network performance.

8. Security Considerations

The proposed scheme simply adopts a new extended BGP community
attribute for supporting multihoming.
Its security considerations therefore depend mostly on those of [3].

9. References

[1] J. Yu, "IPv6 Multihoming with Route Aggregation," IETF
    Internet-Draft, draft-ietf-ipngwg-ipv6multihome-with-aggr-01.txt,
    Aug. 2000.

[2] J. Hagino et al., "IPv6 Multihoming Support at Site Exit
    Routers," IETF RFC 3178, Oct. 2001.

[3] Y. Rekhter et al., "A Border Gateway Protocol 4 (BGP-4)," IETF
    RFC 1771, Mar. 1995.

[4] R. Chandra et al., "BGP Communities Attribute," IETF RFC 1997,
    Aug. 1996.

[5] E. Chen et al., "An Application of the BGP Community Attribute in
    Multihoming Routing," IETF RFC 1998, Aug. 1996.

[6] S. R. Sangli et al., "BGP Extended Communities Attribute," IETF
    Internet-Draft, draft-ietf-idr-bgp-ext-communities-06.txt,
    Aug. 2003.

10. Authors' Addresses

Ki-Il Kim
ChungNam National University
220 Gung-dong, Yuseong-gu, Daejon, Korea
Tel: +82 42 821 7451
Fax: +82 42 822 9959
E-mail: kikim@cclab.cnu.ac.kr

Sang-Ha Kim
ChungNam National University
220 Gung-dong, Yuseong-gu, Taejon, Korea
Tel: +82 42 821 6271
Fax: +82 42 822 9959
E-mail: shkim@cclab.cnu.ac.kr

Da-Hye Choi
ChungNam National University
220 Gung-dong, Yuseong-gu, Daejon, Korea
Tel: +82 42 821 7451
Fax: +82 42 822 9959
E-mail: dhchoi@cclab.cnu.ac.kr

Hyoung-Jun Kim
ETRI PEC
161 Gajeong-dong, Yuseong-gu, Daejon, Korea
Tel: +82 42 860 6576
Fax: +82 42 861 5404
E-mail: khj@etri.re.kr

Hyun Wook Cha
ETRI PEC
161 Gajeong-dong, Yuseong-gu, Daejon, Korea
Tel: +82 42 860 1076
Fax: +82 42 861 5404
E-mail: jafy@etri.re.kr