Network Working Group B. Sarikaya Internet-Draft Huawei USA Intended status: Standards Track October 17, 2011 Expires: April 19, 2012 Dual Stack Lite and 4rd for Migrating Cloud Networks to IPv6 draft-sarikaya-softwire-v6migration4cloud-00.txt Abstract This document specifies how Dual Stack Lite or 4rd can be used to support IPv4 users in a cloud network that is IPv6 based. The cloud network is IPv6 only which simplifies the network operation and management. IPv4 users are supported either one of the transition technologies of DS-Lite or 4rd. This protocol is ideal for new cloud deployments since no new public IPv4 addresses are needed. Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on April 19, 2012. Copyright Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as Sarikaya Expires April 19, 2012 [Page 1] Internet-Draft DS Lite and 4rd for Cloud October 2011 described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 3. Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 3 4. Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 4 4.1. DS-Lite Architecture . . . . . . . . . . . . . . . . . . . 4 4.2. 4rd Architecture . . . . . . . . . . . . . . . . . . . . . 5 5. DS-Lite Operation . . . . . . . . . . . . . . . . . . . . . . 7 5.1. DS-Lite for Datacenter Considerations . . . . . . . . . . 8 6. 4rd Operation . . . . . . . . . . . . . . . . . . . . . . . . 8 6.1. 4rd for Datacenter Considerations . . . . . . . . . . . . 9 7. Security Considerations . . . . . . . . . . . . . . . . . . . 9 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 9 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 10 10.1. Normative References . . . . . . . . . . . . . . . . . . . 10 10.2. Informative references . . . . . . . . . . . . . . . . . . 10 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 12 Sarikaya Expires April 19, 2012 [Page 2] Internet-Draft DS Lite and 4rd for Cloud October 2011 1. Introduction Data center/cloud networks are being increasingly used by telecom operators as well as by enterprises. Currently these networks are using IPv4 addressing. However due to the depletion of public IPv4 addresses this practice can not go on indefinitely. In this document we argue that data center/cloud network should use IPv6. If IPv6 is used then it becomes an issue how to support IPv4 users which we address in this document. For this purpose dual stack lite [RFC6333] or 4rd [I-D.murakami-softwire-4rd], [I-D.xli-behave-divi], [I-D.despres-softwire-4rd] can be used. Dual stack lite has two elements: Basic Bridging BroadBand (B4) element which receives privately addressed IPv4 packets from the users and encapsulates them in IPv6 and sends them to the Dual Stack Lite Address Family Transition Router (AFTR) element which is in charge of a central Network Address Translation (NAT) for the whole cloud network. Packets received from public IPv4 network by AFTR are tunneled to the destination B4s after NAT operation in IPv6. B4 decapsulates the packet and delivers it to the user. 4rd has two elements: Customer Edge (CE) device/router receives privately addressed IPv4 packets from the users and after a NAT operation sends them to the Border Relay (BR) either encapsulated using IPv4-in-IPv6 encapsulation or translated into IPv6. BR decapsulates or translates the packet before sending to IPv4 network. The operation is stateless and the next packet from CE may be sent to another BR. In current data center deployments IPv4 is being used and [I-D.sakura-6rd-datacenter] describes how IPv6 service can be supported. IPv6 Rapid Deployment (6rd) [RFC5969] is used where 6rd CE is located at the servers and 6rd Border Relay is located at the backbone network. 2. Terminology This document uses the terminology defined in [RFC6333] and in [RFC6333] or 4rd [I-D.murakami-softwire-4rd], [I-D.xli-behave-divi], [I-D.despres-softwire-4rd]. 3. Requirements This section states requirements on data center/cloud network IPv4 and IPv6 support protocol. Sarikaya Expires April 19, 2012 [Page 3] Internet-Draft DS Lite and 4rd for Cloud October 2011 Data center/cloud network MUST support IPv6 natively due to the depletion of public IPv4 addresses. IPv4 support SHOULD enable direct routing among the geographically distributed data centers of the same network. IPv4 support SHOULD minimize the use of public IPv4 addresses. Network address translation (NAT) could be used on the customer side. 4. Architecture Data center/cloud networks consists of thosands of servers in one center and usually an organization operates more than one data center each connected by a backbone IP network. Each server runs multiple Virtual Machines (VM) each with its own IP address. Data center could be Layer-2 based or Layer-3 based. Datacenter is Layer-3 based if packets are routed among the subnets, e.g. in between the racks and switched in a subnet, e.g. inside a rack. Layer-3 based datacenter networks scale well for address resolution protocols like ARP but they make is difficult to move Virtual Machines as some sort of tunneling would be needed [I-D.wkumari-dcops-l3-vmmobility]. Datacenter is Layer-2 based if packets are switched inside a rack and bridged among the racks, i.e. completely in Layer-2. From IP point of view the nodes are connected to a single link. Layer-2 based networks make it easy to move Virtual Machines from one server to another but on the other hand they don't scale well for address resolution protocols like ARP [I-D.narten-armd-problem-statement]. In this document we assume L2-based datacenter/cloud network and design Dual Stack Lite protocol, see Figure 1 and 4rd, see Figure 2. 4.1. DS-Lite Architecture Virtual Machines connect to the network using a virtual interface supported by the guest operating system. These VMs are mapped to the physical servers in the rack. In VM based data center/cloud networks B4 for VMs are placed in the guest OS using operating system patch similar to 6rd CE in [I-D.sakura-6rd-datacenter]. Some data center/cloud networks do not employ Virtual Machines. These networks are server-based. In server-based data center/cloud networks, B4 is placed in the access router. In the storage area network B4 is also placed in the gateway router. Sarikaya Expires April 19, 2012 [Page 4] Internet-Draft DS Lite and 4rd for Cloud October 2011 AFTR is at the border of IPv6 backbone. At B4s IPv4 packets are tunneled to AFTR and then AFTR decapsulates and performs the NAT operation and sends out IPv4 packet to the public internet, see Figure 1. .--. .--. _(. `) _(. `) _( IPv6 `)_ _( IPv4 `)_ ( Internet `) ( Internet `) ( ) ( ) `--(_______)---' `--(_______)---' | | \ +----------+ \ | AFTR | \ +----------+ \ | \ .--. | +----------+ (. `) / | Cloud | _( IPv6 `)/_ +----------+ | | ( Backbone `) |B4 on GW | | Network |====( )====|----------| +----------+ `--(_______)---' | Storage | || | Network | || +----------+ +-------------+ |Access Router| |-------------| | Aggregation | | Switch | +-------------+ | [VM] B4|-|L2 Network |- [Server] [VM] on | | | [VM] guest OS | [Server] [Server] Figure 1: Architecture of DS-Lite for Cloud 4.2. 4rd Architecture In VM based data center/cloud networks Customer Edge router of 4rd for VMs are placed in the guest OS using operating system patch similar to 6rd CE in [I-D.sakura-6rd-datacenter]. Some data center/cloud networks do not employ Virtual Machines. These networks are server-based. In server-based data center/cloud networks, CE is placed in the access router. In the storage area network CE is also placed in the gateway router. Sarikaya Expires April 19, 2012 [Page 5] Internet-Draft DS Lite and 4rd for Cloud October 2011 CEs assign private IPv4 addresses to the virtual machines or users and perform NAT towards the backbone network. On the outgoing interface, CEs may share public IPv4 addresses with port space sharing if there are not enough public IPv4 addresses to assign to the individual CEs. Public IPv4 address sharing is optional, CEs may be configured with a unique IPv4 public address. 4rd Border Relay is at the border of IPv6 backbone. There could be more than one BRs and they all can have a single anycast address which CEs are configured with. At CEs IPv4 packets are sent encapsulated with an IPv6 header. In case of translation, IPv4 packet is translated into an IPv6 packet [RFC6145] and then sent to BR, see Figure 2. At the BR, IPv6 packet is received. In case of encapsulation, the packet is decapsulated and then sent to IPv4 network. In case of translation, BR translates IPv6 packet into IPv4 packet [RFC6145] and then sends it to IPv4 network. Sarikaya Expires April 19, 2012 [Page 6] Internet-Draft DS Lite and 4rd for Cloud October 2011 .--. .--. _(. `) _(. `) _( IPv6 `)_ _( IPv4 `)_ ( Internet `) ( Internet `) ( ) ( ) `--(_______)---' `--(_______)---' | | \ +----------+ \ | BR | \ +----------+ \ | \ .--. | +----------+ (. `) / | Cloud | _( IPv6 `)/_ +----------+ | | ( Backbone `) |CE on GW | | Network |====( )====|----------| +----------+ `--(_______)---' | Storage | || | Network | || +----------+ +-------------+ |Access Router| |-------------| | Aggregation | | Switch | +-------------+ | [VM] CE|-|L2 Network |- [Server] [VM] on | | | [VM] guest OS | [Server] [Server] Figure 2: Architecture of 4rd for Cloud 5. DS-Lite Operation AFTR IPv6 address must be provisioned in B4 elements. DHCPv6 can be used for this purpose. B4 requests AFTR FQDN using the AFTR-Name DHCPv6 Option defined in [RFC6334]. B4 gets IPv6 address from the DNS using the AFTR-Name DHCPv6 Option and configures IPv4-in-IPv6 tunnel based on this address. B4 operates its own DHCPv4 server and hands out private IPv4 addresses to the users. It is the default IPv4 router and advertizes itself as the DNS server. It should also be DNS proxy and serves IPv4 requests and sends the requests using IPv6 to the DNS servers. B4 encapsulates IPv4 packets and sends them to AFTR. Fragmentation and reassembly of these packets are as discussed in Section 5.3 of Sarikaya Expires April 19, 2012 [Page 7] Internet-Draft DS Lite and 4rd for Cloud October 2011 [RFC6333]. Users in the same data center may communicate with others in the same center via certain applications like Skype. Local routing of IPv4 packets generated by such applications is needed at B4 to avoid a round trip to AFTR. AFTR receives IPv4 packets tunneled by B4s and decapsulates them. AFTR performs NAT operation on IPv4 before forwarding it to public IPv4 network. NAT binding table is extended with IPv6 source address, i.e. WAN side address of the B4 which sent this packet. For packets coming back from the Internet, AFTR does a reverse lookup in the extended IPv4 NAT binding table and extracts IPv6 destination address of the B4 and tunnels IPv4 packet to it. 5.1. DS-Lite for Datacenter Considerations In DS-Lite B4 to B4 communication always goes through AFTR. This could incur additional delays in datacenter/cloud networks because such traffic is expected to be high, e.g. VMs/server may need constant communication with the storage area network which is also part of the same data center. IPv6 hardware support in current switches is poor compared with IPv4. Most commonly used switches reserve for IPv6 half of the memory reserved for IPv4 and thus they are slow in processing IPv6 packets. 6. 4rd Operation Instead of public IPv4 address sharing among the CEs of a data center, each CE is assigned a unique public IPv4 address. This simplifies port usage when doing NAT operation. IPv4 address is derived from domain IPv4 prefix which is used by the BRs to receive incoming traffic from IPv4 internet. CEs configure their WAN interface (IPv6 backbone) using regular means such as SLAAC, DHCPv6, etc. and get an IPv6 prefix assigned. Each CE is configured with one or several mapping rules. Each CE uses a reserved interface ID as part of its IPv6 source address to enable marking 4rd traffic and also to effectively create a 4rd interface on its WAN side. On the LAN side (VMs or users), CE is dual stack and supports IPv4 by assigning private IPv4 addresses. When CE receives an IPv4 packet it performs NAT on the packet. CE looks up the destination IPv4 address in the mapping rules. If the prefix matches domain IPv4 prefix then Sarikaya Expires April 19, 2012 [Page 8] Internet-Draft DS Lite and 4rd for Cloud October 2011 this packet sent to another CE. CE derives an IPv6 destination address from IPv4 destination address. Otherwise this packet is sent to BR. CE derives a BR anycast address as destination IPv6 address. Next the packet is either encapsulated in IPv6 [RFC2473] or translated from IPv4 to IPv6 packet [RFC6145]. BR receives IPv6 packet and checks to see if it is 4rd packet, either using the interface ID of the source IPv6 address or looking up the source IPv4 address in the mapping rules. BR decapsulates or translates IPv6 packet and forwards resulting IPv4 packet using IPv4 interface. For an incoming IPv4 packet BR checks if the destination IPv4 address has a match in the mapping rules. Next, BR generates CE IPv6 address from IPv4 destination address. BR either encapsulates IPv4 packet in IPv6 and or translates it into IPv6 and then sends it to CE. When CE receives an IPv6 packet it checks to see if IPv4 source address matches a mapping rule, i.e. the packet is coming from another CE or IPv6 source address matches BR IPv6 address, i.e. the packet is coming from BR. CE decapsulates or translates the packet to obtain IPv4 packet and then forwards it on the LAN interface. 6.1. 4rd for Datacenter Considerations In 4rd CE to CE communication does not go through BR. This is a significant advantage of 4rd over DS-Lite especially in data center networks. 4rd CE and BR process each packet using 4rd mapping rules. There could be over 1000 mapping rules. This per packet processing naturally slows down users access to the internet especially at the CEs which are much less resourceful than BRs. 7. Security Considerations TBD. 8. IANA Considerations TBD. 9. Acknowledgements TBD. Sarikaya Expires April 19, 2012 [Page 9] Internet-Draft DS Lite and 4rd for Cloud October 2011 10. References 10.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC2629] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629, June 1999. [RFC6052] Bao, C., Huitema, C., Bagnulo, M., Boucadair, M., and X. Li, "IPv6 Addressing of IPv4/IPv6 Translators", RFC 6052, October 2010. [RFC6145] Li, X., Bao, C., and F. Baker, "IP/ICMP Translation Algorithm", RFC 6145, April 2011. [RFC6333] Durand, A., Droms, R., Woodyatt, J., and Y. Lee, "Dual- Stack Lite Broadband Deployments Following IPv4 Exhaustion", RFC 6333, August 2011. [RFC6334] Hankins, D. and T. Mrugalski, "Dynamic Host Configuration Protocol for IPv6 (DHCPv6) Option for Dual-Stack Lite", RFC 6334, August 2011. [I-D.murakami-softwire-4rd] Murakami, T., Troan, O., and S. Matsushima, "IPv4 Residual Deployment on IPv6 infrastructure - protocol specification", draft-murakami-softwire-4rd-01 (work in progress), September 2011. [I-D.despres-softwire-4rd] Despres, R., "IPv4 Residual Deployment across IPv6-Service networks (4rd) A NAT-less solution", draft-despres-softwire-4rd-00 (work in progress), October 2010. [I-D.xli-behave-divi] Bao, C., Li, X., Zhai, Y., and W. Shang, "dIVI: Dual- Stateless IPv4/IPv6 Translation", draft-xli-behave-divi-03 (work in progress), July 2011. 10.2. Informative references [RFC5969] Townsley, W. and O. Troan, "IPv6 Rapid Deployment on IPv4 Infrastructures (6rd) -- Protocol Specification", RFC 5969, August 2010. Sarikaya Expires April 19, 2012 [Page 10] Internet-Draft DS Lite and 4rd for Cloud October 2011 [RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in IPv6 Specification", RFC 2473, December 1998. [I-D.narten-armd-problem-statement] Narten, T., "Problem Statement for ARMD", draft-narten-armd-problem-statement-00 (work in progress), July 2011. [I-D.sakura-6rd-datacenter] Tsuchiya, S., Townsley, M., and S. Ohkubo, "IPv6 Rapid Deployment (6rd) in a Large Data Center", draft-sakura-6rd-datacenter-01 (work in progress), July 2011. [I-D.wkumari-dcops-l3-vmmobility] Kumari, W. and J. Halpern, "Virtual Machine mobility in L3 Networks.", draft-wkumari-dcops-l3-vmmobility-00 (work in progress), August 2011. [sourceforge] Source Forge, "OpenWRT DS-Lite Software", February 2011. Sarikaya Expires April 19, 2012 [Page 11] Internet-Draft DS Lite and 4rd for Cloud October 2011 Author's Address Behcet Sarikaya Huawei USA 5340 Legacy Dr. Building 3 Plano, TX 75024 Email: sarikaya@ieee.org Sarikaya Expires April 19, 2012 [Page 12]