Network working group                                              X. XU
Internet Draft                                                    Huawei
Category: BCP
Expires: December 2009                                      June 9, 2009

     Redundancy and Load Balancing Mechanisms for Stateful Network
                       Address Translators (NAT)
                draft-xu-behave-stateful-nat-standby-00

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on December 9, 2009.

Copyright Notice

Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

Xu.                    Expires December 9, 2009                 [Page 1]

Internet-Draft      Redundancy and Load Balancing              June 2009
                      Mechanism for Stateful NAT

Abstract

This document defines some redundancy and/or load balancing mechanisms for stateful Network Address Translators (NAT), including IPv4->IPv4 NAT, IPv4->IPv6 NAT and IPv6->IPv4 NAT.

Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [RFC2119].

Table of Contents

1. Introduction
2. Terminology
3. Scenario Description
4. Redundancy Mechanisms
   4.1. Cold Standby Mechanism
   4.2. Hot Standby Mechanism
5. Load Balancing Mechanisms
6. Election Protocol Consideration
7. State Synchronization Protocol Consideration
8. Security Considerations
9. IANA Considerations
10. Acknowledgments
11. References
Authors' Addresses

1. Introduction

Network Address Translation (NAT) has been used as an effective way to delay IPv4 address exhaustion and is regarded as a major mechanism for IPv4/IPv6 transition and coexistence. In the Large Scale NAT (LSN) scenarios described in some proposals, e.g., [NAT444], [DS-Lite] and [NAT64], the LSN routers are deployed in large-scale networks (e.g., ISP networks, campus networks or enterprise networks) and serve a huge number of users. Hence redundancy and/or load-balancing capabilities are strongly desired for these LSN routers in order to provide highly available services to users.
However, the failure of a stateful NAT, which maintains per-session state, would interrupt those sessions. This document describes some redundancy and/or load balancing mechanisms for stateful NAT, including IPv4->IPv4 NAT, IPv4->IPv6 NAT and IPv6->IPv4 NAT. Stateless NAT is out of the scope of this document. Unless mentioned otherwise, NAT and LSN throughout this document refer to stateful NAT and stateful LSN respectively.

2. Terminology

Most of the terms used in this document are borrowed almost as is from [RFC2663]. The following terms are specific to this document.

LSN (Large-Scale NAT): a NAT device placed at the border between large-scale user networks (e.g., ISP network, enterprise network, or campus network) and the Internet.

LSN internal address realm (internal realm for short): a realm where the communication initiators (e.g., a client in the client/server application) are located. For IPv4->IPv4 NAT, the internal realm refers to the private networks, as opposed to the IPv4 Internet. For IPv6->IPv4 NAT, the internal realm refers to the IPv6 network or IPv6 Internet. For IPv4->IPv6 NAT, the internal realm refers to the IPv4 network or IPv4 Internet. Accordingly, the hosts located in the internal realm are called internal hosts, and the addresses used in the internal realm are called internal addresses.

LSN external address realm (external realm for short): a realm where the communication responders (e.g., a server in the client/server application) are located. For IPv4->IPv4 NAT, the external realm refers to the IPv4 Internet. For IPv6->IPv4 NAT, the external realm refers to the IPv4 Internet or IPv4 network. For IPv4->IPv6 NAT, the external realm refers to the IPv6 Internet or IPv6 network.
Accordingly, the hosts located in the external realm are called external hosts, and the addresses used in the external realm are called external addresses.

Internal address pool: an address pool used for assigning internal addresses to the external hosts. Note that this address pool is specific to IPv4->IPv6 NAT and IPv6->IPv4 NAT. For IPv4->IPv6 NAT, the IPv4 address pool used for assigning internal IPv4 addresses to external IPv6 hosts is the internal address pool. For IPv6->IPv4 NAT, the prefix64 used for synthesizing internal IPv6 addresses for external IPv4 hosts can be regarded as a special internal address pool.

External address pool: an address pool used for assigning external addresses to the internal hosts. For IPv4->IPv4 NAT and IPv6->IPv4 NAT, the IPv4 address pool is the external address pool. For IPv4->IPv6 NAT, the prefix64 can be regarded as a special external IPv6 address pool from which synthesized IPv6 addresses are assigned to internal IPv4 hosts.

CPE (Customer Premises Equipment): a router in front of internal hosts.

Prefix64: an IPv6 prefix used for synthesizing IPv6 addresses for the IPv4 hosts. See [Prefix64] for more details.

3. Scenario Description

+-------------------------+     +-----------------------+
|                         |     |                       |
|                       +-+-----+-+                     |
|                       |  NAT-A  |                     |
+----+-------------+    +-+-----+-+   +-------------+   |
|CPE/Internal Host |      |     |     |External Host|   |
+----+-------------+      |     |     +-------------+   |
|                       +-+-----+-+                     |
|                       |  NAT-B  |                     |
|     Internal realm    +-+-----+-+   External realm    |
|                         |     |                       |
+-------------------------+     +-----------------------+

Figure 1. General Scenario of Dual NAT Routers

In a typical operational scenario, as illustrated in Figure 1, two NAT routers are deployed for redundancy and/or load balancing purposes. The corresponding mechanisms are therefore described on the basis of this scenario.
Note that these mechanisms also apply to scenarios in which more than two NAT routers are used.

The redundancy and load-balancing mechanisms for IPv4->IPv4 NAT, IPv4->IPv6 NAT and IPv6->IPv4 NAT are almost the same. They differ only in the routes towards the external realm that the NAT routers advertise into the internal realm: a route to the prefix64 in the case of IPv6->IPv4 NAT, a route to the IPv4 Internet (in [NAT444]) or to the tunnel concentrator (in [DS-Lite]) in the case of IPv4->IPv4 NAT, and a route to the IPv4 address pool in the case of IPv4->IPv6 NAT. These mechanisms are therefore described in general terms.

4. Redundancy Mechanisms

The basic idea of NAT redundancy is to make two NAT routers function as a redundancy group, and to select one as the Master and the other as the Backup, either through some election protocol (see section 6) or by manual configuration. In the normal case, the packets between the internal realm and the external realm traverse the Master. Once the Master fails, the Backup takes over the translation role.

There are two redundancy mechanisms: a cold standby mechanism and a hot standby mechanism. The goal of the cold standby mechanism is to keep the NAT failover transparent to the communicating internal hosts. In contrast, the purpose of the hot standby mechanism is to keep the established sessions alive across the NAT failover. The following sections describe them separately.

4.1. Cold Standby Mechanism

To achieve cold standby, the internal addresses for external hosts (as communication responders) should remain unchanged across the NAT failover. In IPv4->IPv4 NAT, the external hosts' internal addresses are the same as their external addresses, so this requirement is met naturally. In IPv6->IPv4 NAT, NAT routers belonging to a redundancy group should be configured with an identical prefix64.
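
As an illustration of why an identical prefix64 keeps the internal addresses of external hosts stable in IPv6->IPv4 NAT, the following minimal Python sketch synthesizes an IPv6 address from a prefix64 and an IPv4 address. It assumes a /96 prefix with the IPv4 address embedded in the low-order 32 bits. That layout, the function name synthesize, and the documentation prefixes and addresses are assumptions made for illustration only; this document does not define them (see [Prefix64] for the candidate formats).

```python
import ipaddress


def synthesize(prefix64: str, ipv4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of a /96 prefix.

    The /96 layout is an assumption of this sketch, not a requirement
    of the draft.
    """
    net = ipaddress.IPv6Network(prefix64)
    if net.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(net.network_address) | int(v4))


# Hypothetical values: documentation prefixes and a documentation address.
SHARED_PREFIX64 = "2001:db8:64::/96"
server_v4 = "192.0.2.33"

# Master and Backup share the prefix64, so both derive the same internal
# IPv6 address for the external IPv4 host without exchanging any state.
addr_from_master = synthesize(SHARED_PREFIX64, server_v4)
addr_from_backup = synthesize(SHARED_PREFIX64, server_v4)

# A router configured with a different prefix64 would present the same
# external host under a different internal address after the failover.
addr_other_prefix = synthesize("2001:db8:65::/96", server_v4)
```

Because the synthesized address depends only on the prefix64 and the IPv4 address, the internal hosts keep using the same destination address after the Backup takes over, provided both routers share the prefix64.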
In IPv4->IPv6 NAT, NAT routers in a redundancy group should be configured with an identical IPv4 address pool. In addition, the state information should be synchronized among these NAT routers through some state synchronization protocol (see section 7), so that the Backup, once it becomes the current Master, can assign the communicating IPv6 hosts the same IPv4 addresses as those assigned by the previous Master.

Each NAT router in a redundancy group is configured with a different external address pool and announces into the external realm a route to that external address pool. In the cases of IPv4->IPv4 NAT and IPv6->IPv4 NAT, the NAT routers are configured with external IPv4 address pools that do not overlap. Otherwise, an address or address/port pair that the previous Master assigned to some internal host could be assigned by the current Master to a different internal host, which would cause confusion. For example, the current Master would treat return packets destined for host A as packets destined for host B. In the case of IPv4->IPv6 NAT, each NAT router is configured with a different prefix64.

In order to make packets towards the external realm always traverse the Master, the Master should announce into the internal realm a route towards the external realm. If the Master and the Backup are specified manually, the Backup should also announce into the internal realm a route towards the external realm to prepare for the takeover.
However, to ensure that the routers in the internal realm select the route advertised by the Master, rather than the one advertised by the Backup, as the best route despite topology changes, the route advertised by the Backup should have a sufficiently higher cost or a coarser granularity. (For example, the Backup announces a route to 10.0.0.0/8, while the Master announces two more specific routes, to 10.0.0.0/9 and 10.128.0.0/9 respectively.) Once its connections to the external realm are lost, the Master should withdraw the route towards the external realm that it previously announced. When the Master fails, packets towards the external realm pass through the Backup.

If the Master and the Backup are automatically elected through some election protocol, the Backup is elected as the new Master when the old Master fails, so the Backup does not need to make the above route announcement.

4.2. Hot Standby Mechanism

To preserve the established sessions during the failover, in addition to keeping the internal addresses for the external hosts unchanged, the external addresses for the internal hosts should also remain unchanged. How to meet the first requirement is not reiterated here, since it is the same as for the cold standby mechanism. To meet the second requirement, NAT routers in a redundancy group should be configured with an identical external address pool, and they should assign the same external address to the same internal host. In the case of IPv4->IPv6 NAT, the NAT routers simply need to be configured with an identical prefix64. For IPv4->IPv4 NAT and IPv6->IPv4 NAT, in addition to configuring the NAT routers with identical IPv4 address pools, the state on the Master should be synchronized to the Backup in a timely fashion.
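
The route preference described in section 4.1 (the Backup announcing a route with a higher cost or a coarser granularity than the Master's) can be sketched in Python as follows. This is a simplified model of route selection, assuming longest-prefix match first and lowest cost as the tie-breaker. The function best_next_hop, the cost values and the test destinations are hypothetical; only the prefixes 10.0.0.0/8, 10.0.0.0/9 and 10.128.0.0/9 come from the example in the text.

```python
import ipaddress
from typing import List, Optional, Tuple

# (advertised prefix, cost, advertising NAT router)
Route = Tuple[ipaddress.IPv4Network, int, str]


def best_next_hop(routes: List[Route], dst: str) -> Optional[str]:
    """Pick the router whose route an internal-realm router would prefer."""
    addr = ipaddress.IPv4Address(dst)
    candidates = [r for r in routes if addr in r[0]]
    if not candidates:
        return None
    # Longest prefix wins; among equal-length prefixes the lower cost wins.
    _, _, hop = max(candidates, key=lambda r: (r[0].prefixlen, -r[1]))
    return hop


# Granularity variant: the Master announces two /9 routes, the Backup one /8.
master_routes: List[Route] = [
    (ipaddress.IPv4Network("10.0.0.0/9"), 10, "Master"),
    (ipaddress.IPv4Network("10.128.0.0/9"), 10, "Master"),
]
backup_routes: List[Route] = [
    (ipaddress.IPv4Network("10.0.0.0/8"), 10, "Backup"),
]

# Normal operation: the more specific routes of the Master are selected.
normal = best_next_hop(master_routes + backup_routes, "10.200.1.1")

# After the Master fails or withdraws its routes, only the /8 remains.
after_failure = best_next_hop(backup_routes, "10.200.1.1")

# Cost variant: same prefix from both routers, higher cost on the Backup.
cost_routes: List[Route] = [
    (ipaddress.IPv4Network("10.0.0.0/8"), 10, "Master"),
    (ipaddress.IPv4Network("10.0.0.0/8"), 100, "Backup"),
]
by_cost = best_next_hop(cost_routes, "10.1.2.3")
```

In this model the granularity variant is the more robust of the two, since a more specific prefix is preferred regardless of how topology changes affect the accumulated cost.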

The Master announces into the internal realm a route towards the external realm and announces into the external realm a route towards the external address pool. If the Master and the Backup are specified manually, the Backup should also announce those routes, but with a sufficiently higher cost or a coarser granularity. Once its connections to either the external realm or the internal realm are lost, the Master should withdraw the above routes. When the Master fails, the packets towards the external realm pass through the Backup.

If the Master and the Backup are automatically elected through some election protocol, the Backup is elected as the new Master when the old Master fails, so the Backup does not need to make the above route announcements.

5. Load Balancing Mechanisms

Based on the above redundancy mechanisms, one can further realize load balancing among a group of NAT routers. The basic idea is to create two redundancy groups (e.g., group A and group B) on these NAT routers, and to make one router the Master for group A and the Backup for group B, and the other router the Master for group B and the Backup for group A. Taking IPv6->IPv4 NAT as an example, the NAT routers are configured with two prefix64s (e.g., prefix64-A and prefix64-B) corresponding to the two redundancy groups (e.g., group A and group B) respectively. One router is designated as the Master for group A and the Backup for group B, and the other as the Backup for group A and the Master for group B. In this way, the IPv6 packets towards the IPv4 external realm are balanced among these NAT routers according to the prefix64 of their destination addresses.

For load balancing together with the cold standby, each NAT router could use either the same external address pool for all of these redundancy groups or a different external address pool for each of them. However, the external address pools on different NAT routers shouldn't have any overlap.
Otherwise, the same address or address/port pair could occasionally be assigned to different internal hosts. In contrast, for load balancing together with the hot standby, different external address pools should be configured for these redundancy groups. Otherwise, the return packets towards the internal realm may be forwarded to the wrong NAT router.

6. Election Protocol Consideration

An election protocol is used to automatically elect one NAT router of a redundancy group as the Master and the others as Backups. Once the Master fails, the Backup with the highest priority takes over the Master role after a short delay. The election protocol is also used to track the connectivity to the external realm and the internal realm. Once its connections to the external realm or the internal realm are lost, a NAT router is no longer qualified to be the Master and withdraws the route towards the external realm that it announced previously. In the case of hot standby, it should also withdraw the route towards the external address pool.

In fact, VRRP [RFC2338] can be used directly as the automatic election protocol. In addition, the interface tracking mechanism can be used to adjust the priority and thereby influence the election results. If two NAT routers are directly connected via an Ethernet network, VRRP can run directly on the Ethernet interfaces. Otherwise, some extra configuration or protocol changes are needed. One option is to create the conditions for VRRP to run among these routers. For example, one can create a VPLS [RFC4761, RFC4762] instance, enable IP functions on the VLAN interfaces that are bound to that VPLS instance, and run VRRP on those interfaces. If enabling IP functions on those interfaces is not supported, one can use the following trick to achieve the same goal, at the cost of consuming two physical interfaces on each NAT router.
The approach is as follows: create a VPLS instance among a set of NAT routers; on each of them, bind one Ethernet interface to that VPLS instance and connect another, IP-enabled Ethernet interface locally to that interface. VRRP can then run on those IP-enabled Ethernet interfaces, which are all connected to that VPLS instance. Another option is to modify VRRP so that VRRP neighbors can be configured manually and VRRP messages can be exchanged directly between two neighbors in a unicast fashion.

7. State Synchronization Protocol Consideration

The Server Cache Synchronization Protocol (SCSP) defined in [RFC2334] is a candidate for the state synchronization protocol. Details about its usage and possible modifications will be explored in the next version of this document.

8. Security Considerations

TBD.

9. IANA Considerations

TBD.

10. Acknowledgments

The author would like to thank Dan Wing and Dave Thaler for their insightful comments and reviews, and Dacheng Zhang and Xuewei Wang for their valuable editorial reviews.

11. References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3022] Srisuresh, P. and K. Egevang, "Traditional IP Network Address Translator (Traditional NAT)", RFC 3022, January 2001.

[RFC2663] Srisuresh, P. and M. Holdrege, "IP Network Address Translator (NAT) Terminology and Considerations", RFC 2663, August 1999.

[RFC2766] Tsirtsis, G. and P. Srisuresh, "Network Address Translation - Protocol Translation (NAT-PT)", RFC 2766, February 2000.

[RFC4966] Aoun, C. and E. Davies, "Reasons to Move the Network Address Translator - Protocol Translator (NAT-PT) to Historic Status", RFC 4966, July 2007.

[RFC2338] Knight, S., et al., "Virtual Router Redundancy Protocol", RFC 2338, April 1998.

[RFC2334] Luciani, J., Armitage, G., Halpern, J., and N.
Doraswamy, "Server Cache Synchronization Protocol (SCSP)", RFC 2334, April 1998.

[RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service (VPLS) Using BGP for Auto-Discovery and Signaling", RFC 4761, January 2007.

[RFC4762] Lasserre, M. and Kompella, V. (Editors), "Virtual Private LAN Service (VPLS) Using Label Distribution Protocol (LDP) Signaling", RFC 4762, January 2007.

[NAT64] Bagnulo, M., Matthews, P., and I. Beijnum, "NAT64: Network Address and Protocol Translation from IPv6 Clients to IPv4 Servers", draft-bagnulo-behave-NAT64-03 (work in progress), March 2009.

[NAT444] Shirasaki, Y., Miyakawa, S., Nakagawa, A., Yamaguchi, J., and H. Ashida, "NAT444 with ISP Shared Address", draft-shirasaki-nat444-isp-shared-addr-00 (work in progress), October 2008.

[DS-Lite] Durand, A., "Dual-stack lite broadband deployments post IPv4 exhaustion", draft-ietf-softwire-dual-stack-lite-00 (work in progress), March 2009.

[Prefix64] Miyata, H., "PREFIX64 Comparison", draft-miyata-behave-prefix64-00 (work in progress), October 2008.

[LSN] Nishitani, T., Miyakawa, S., Nakagawa, A., and H. Ashida, "Common Functions of Large Scale NAT (LSN)", draft-nishitani-cgn-01 (work in progress), November 2008.

[Framework] Baker, F., Li, X., and C. Bao, "Framework for IPv4/IPv6 Translation", draft-baker-behave-v4v6-framework-02 (work in progress), February 2009.

Authors' Addresses

Xiaohu Xu
Huawei Technologies,
No.3 Xinxi Rd., Shang-Di Information Industry Base,
Hai-Dian District, Beijing 100085, P.R. China
Phone: +86 10 82836073
Email: xuxh@huawei.com