SDN Research Group                                                 R. Tu
Internet Draft                                                   C. Zhou
Intended status: Informational                                   J. Zhao
Expires: January 2015                                            X. Wang
                                                        Fudan University
                                                            July 4, 2014

                    SDN Middle Box for Data Centers
                      draft-tu-sdnrg-middle-box-00

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

This Internet-Draft will expire on January 4, 2015.

Tu, et al.              Expires January 4, 2015                 [Page 1]

Internet-Draft      SDN Middle Box for Data Centers            July 2014

Copyright Notice

Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Abstract

This document describes a middle box that improves the utilization of data centers. By combining, for the first time, a Clos network with Software Defined Networking (SDN), the middle box is programmable and nonblocking. The SDN controller in the middle box can collect information such as traffic load, port speed, and resource utilization, which can be used to improve efficiency. The nonblocking property reduces the delay introduced by the middle box and thus improves performance. Furthermore, the middle box is transparent to the original network, so it can be readily deployed in today's data centers.

Table of Contents

   1. Introduction
   2. Design of the Middle Box
      2.1. Architecture of the Middle Box
      2.2. Inside the Middle Box
      2.3. Design Space of the Middle Box
   3. Working Principle of the Middle Box
      3.1. Initialization
         3.1.1. Establishing Connection
         3.1.2. Path initialization
      3.2. Learning and Optimization
      3.3. Load Balancing
   4. Security Considerations
   5. IANA Considerations
   6. Conclusions
   7. References
      7.1. Normative References
      7.2. Informative References

1. Introduction

This document presents a middle box for data centers, based on a Clos network, that is designed to achieve better QoS. Compared with traditional solutions for improving the performance of data centers, we propose, for the first time, a middle box based on a Clos network. The middle box uses Software Defined Networking (SDN), whose centralized controller makes it well suited to load balancing. We propose a protocol that lets the centralized controller communicate with servers and VMs, so that our solution can take full advantage of the information they report to improve efficiency. Furthermore, the proposed middle box is transparent to the original network, so it can be readily deployed in today's data centers.

2. Design of the Middle Box

This section describes the architecture of the middle box and the basic theory for choosing the number of switches.

2.1. Architecture of the Middle Box

The middle box is based on SDN, and its internal topology is a Clos network. Inside the middle box there are several OpenFlow switches and an SDN controller. The controller manages the OpenFlow switches through the OpenFlow protocol.

Clos networks can be divided into three types according to the number of switches in each stage. The rearrangeable nonblocking Clos network is chosen for the middle box in order to obtain better efficiency with fewer switches.

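The three types are not named in this section; a reasonable reading, following the classical results for three-stage Clos networks, is strict-sense nonblocking, rearrangeably nonblocking, and blocking. The sketch below classifies a network by the number of middle stage switches m and the number of inputs n on each ingress switch. The function name and return strings are illustrative assumptions, not part of this document.

```python
def classify_clos(m, n):
    """Classify a symmetric three-stage Clos network.

    m: number of middle stage switches
    n: number of input ports on each ingress stage switch
    Classical thresholds: m >= 2n - 1 is strict-sense nonblocking,
    m >= n is rearrangeably nonblocking, anything smaller can block.
    """
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive")
    if m >= 2 * n - 1:
        return "strict-sense nonblocking"
    if m >= n:
        return "rearrangeably nonblocking"
    return "blocking"


if __name__ == "__main__":
    # The middle box uses n middle stage switches for n-port ingress switches.
    print(classify_clos(3, 3))
    print(classify_clos(5, 3))
    print(classify_clos(2, 3))
```

With m = n, as used for the middle box, the network sits at the rearrangeable threshold, which needs n - 1 fewer middle stage switches than the strict-sense design.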
Rearrangeable nonblocking means that, in order to find an idle path for traffic from an unused input port to an unused output port, we may need to move existing traffic to different middle stage switches. Fortunately, the SDN controller can do this by rewriting the flow tables in the OpenFlow switches.

The middle box is transparent to the original data center: it is placed on the links between the switches and the servers. After the middle box is deployed, switches and servers operate as before. The only difference is that each server needs to report its load and resource usage to the SDN controller periodically, so that the middle box can make informed decisions that improve the efficiency of the data center.

2.2. Inside the Middle Box

As discussed above, the topology inside the middle box is a rearrangeable nonblocking Clos network. Assume that there are r ingress stage switches with n*n ports, n middle stage switches with r*r ports, and r egress stage switches with n*n ports. If the data center has N_servers servers, then:

   r = N_servers/n

The total number of switches in the middle box is then:

   N_switches = N_servers/n + n + N_servers/n = n + 2N_servers/n

By the inequality of arithmetic and geometric means, we have:

   N_switches >= 2sqrt(2N_servers)

This inequality implies that, for a given data center, the switch size must be chosen carefully. When the port number is n = sqrt(2N_servers), the total number of switches inside the middle box reaches its minimum, N_switches = 2sqrt(2N_servers). Unfortunately, this value of n may not be an integer, which cannot be realized in practice. We must guarantee that both n and r are integers.

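For reference, the bound can be derived in one step. The derivation below treats n as a positive real number, which is why the integrality requirement has to be handled separately; the subscripted symbols are the same N_switches and N_servers used in the text.

```latex
\begin{align}
N_{\text{switches}}
  &= n + \frac{2N_{\text{servers}}}{n}
   \;\ge\; 2\sqrt{\,n \cdot \frac{2N_{\text{servers}}}{n}\,}
   \;=\; 2\sqrt{2N_{\text{servers}}}, \\
\text{with equality iff}\quad
n &= \frac{2N_{\text{servers}}}{n}
   \;\Longleftrightarrow\;
   n = \sqrt{2N_{\text{servers}}}.
\end{align}
```

For N_servers = 20000 the optimum is n = 200 and the bound is exactly 400. For N_servers = 100 the optimum is n = sqrt(200), roughly 14.14, which is not an integer, and the bound is roughly 28.28.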
From the discussion above, we can use integer programming to obtain the minimal number of switches for the middle box:

   Min z = 2r + n
   subject to r >= N_servers/n, where N_servers is a positive integer constant and r and n are positive integers

For example, if there are 100 servers in a data center, then the minimal number of switches is 30.

Using the formulation above, we can estimate the total number of switches in the middle box. However, there are some problems. For instance, if the data center is very large (about 20000 servers), we have z_min = 400 when n = 200. This means that the minimal number of switches inside the middle box is 400, but the switch of the middle stage must have 800 ports, which is impractical. Another problem occurs when the data center is very small (such as 8 servers): in this case the number of switches inside the middle box is 8, which equals the number of servers and is therefore costly.

Under these circumstances, a different strategy is needed to build the middle box. If the data center is too small, the Clos network degenerates into a single large-capacity OpenFlow switch. The case where the data center is too large is discussed in the next section.

2.3. Design Space of the Middle Box

In order to solve the problem in large data centers, we consider the design space of the middle box.

When one middle box is used for a data center with N_servers servers, the total number of switches inside the middle box is n+2N_servers/n: N_servers/n ingress stage switches with n*n ports, n middle stage switches with (N_servers/n)*(N_servers/n) ports, and N_servers/n egress stage switches with n*n ports.

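The sizing rule of Section 2.2 and the per-stage breakdown above can be checked with a short brute-force search. This is a sketch under stated assumptions rather than a prescribed method: the function names, the `exact` flag, and the `k` parameter (the number of middle boxes, which the rest of this section goes on to discuss) are illustrative and not defined by this document.

```python
def min_switches(n_servers, exact=False):
    """Minimise z = 2r + n over positive integers n and r with r*n >= n_servers.

    exact=True additionally requires n to divide n_servers, so r = n_servers/n.
    Returns (z, n, r) for the smallest n that attains the minimum.
    """
    best = None
    for n in range(1, n_servers + 1):
        if exact and n_servers % n:
            continue
        r = -(-n_servers // n)          # ceiling division: smallest feasible r
        z = 2 * r + n
        if best is None or z < best[0]:
            best = (z, n, r)
    return best


def box_dimensions(n_servers, n, k=1):
    """Stage sizes when n_servers are split evenly over k middle boxes."""
    r = -(-n_servers // (k * n))        # ingress (and egress) switches per box
    return {
        "ingress_switches": r,          # each with n*n ports
        "middle_switches": n,           # each with r*r ports
        "egress_switches": r,           # each with n*n ports
        "middle_switch_size": r,
        "switches_per_box": n + 2 * r,
        "total_switches": k * (n + 2 * r),
    }


if __name__ == "__main__":
    for servers in (8, 100, 20000):
        print(servers, min_switches(servers), min_switches(servers, exact=True))
    print(box_dimensions(20000, 200, k=1))
    print(box_dimensions(20000, 200, k=2))
```

Under the constraint as written (r >= N_servers/n), the search finds 29 switches for 100 servers (n = 13, r = 8). Requiring n to divide N_servers exactly gives 30 (n = 10, r = 10), the figure quoted in Section 2.2. For 8 and 20000 servers both variants agree with the text: 8 switches at n = 4 and 400 switches at n = 200.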
When two middle boxes are used for a data center with N_servers servers, and each middle box connects to N_servers/2 servers, the number of switches inside one middle box is n+N_servers/n: N_servers/(2n) ingress stage switches with n*n ports, n middle stage switches with (N_servers/(2n))*(N_servers/(2n)) ports, and N_servers/(2n) egress stage switches with n*n ports. The total number of switches is therefore 2n+2N_servers/n.

In general, when k middle boxes are used for a data center with N_servers servers, and each middle box connects to N_servers/k servers, the number of switches inside one middle box is n+2N_servers/(kn): N_servers/(kn) ingress stage switches with n*n ports, n middle stage switches with (N_servers/(kn))*(N_servers/(kn)) ports, and N_servers/(kn) egress stage switches with n*n ports. The total number of switches is therefore kn+2N_servers/n.

From the discussion above, if the data center is very large, multiple middle boxes can be used to reduce the number of ports required on each middle stage switch.

3. Working Principle of the Middle Box

This section describes the working principle of the middle box in detail. To illustrate it, we deploy the middle box in a data center with a fat-tree topology, placing it on the links between the switches and the servers.

The middle box has the following functions:

   1. Initialization: When the middle box is deployed in the data center, it triggers the initialization process.

   2. Learning and optimization: The controller inside the middle box learns the traffic characteristics and then optimizes the paths inside the middle box.

   3. Load balancing: The middle box collects information from the controller and the switches, and then performs load balancing for the data center.

3.1. Initialization

When the middle box is deployed in the data center, it first needs to initialize. Initialization is carried out by the controller and ensures that the data center continues to operate normally. There are two basic rules of initialization:

   1. The controller needs to establish connections to the switches inside the middle box and to the servers inside the data center.

   2. The middle box must be transparent to the data center, which means that traffic must follow its original path after initialization.

For example, suppose the original topology of the data center is a fat-tree in which server 1 and server 2 connect to switch 1.0. When the middle box is deployed in this data center, it must ensure that traffic from server 1 and server 2 is still delivered to switch 1.0 after initialization.

3.1.1. Establishing Connection

Establishing the connection between the SDN switches and the controller is defined in the OpenFlow protocol, so it is easy to realize. To improve efficiency, the controller inside the middle box also needs to establish connections with the servers, so that it has the information required to make good decisions. Establishing connections between the controller and the servers is unique to our middle box, and it allows the controller to collect information about the servers. Since there is no existing method for the controller to communicate with ordinary servers, we need to deploy applications on the servers and on the controller to establish the connection. Because TCP/IP is widely deployed in today's networks, we use sockets for the controller and the servers to communicate with each other.

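A minimal sketch of the socket exchange that this section describes, assuming one server, a loopback TCP connection, and newline-delimited JSON messages. This document does not define a wire format, so the message types ("connection_ok", "report", "request_information"), the field names, and all function names below are hypothetical.

```python
import json
import socket
import threading


def send_msg(sock, msg):
    """Send one message as a line of JSON (the framing is an assumption)."""
    sock.sendall((json.dumps(msg) + "\n").encode())


def recv_msg(reader):
    """Read one JSON line; return None if the peer closed the connection."""
    line = reader.readline()
    return json.loads(line) if line else None


def controller(listener, reports):
    """Accept one server, acknowledge it, then collect two reports."""
    conn, _addr = listener.accept()
    with conn, conn.makefile("r") as reader:
        send_msg(conn, {"type": "connection_ok"})
        reports.append(recv_msg(reader))                  # periodic report
        send_msg(conn, {"type": "request_information"})   # active request
        reports.append(recv_msg(reader))                  # reply to the request


def server(port, stats):
    """Connect to the controller, report once, then answer one request."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        with sock.makefile("r") as reader:
            if recv_msg(reader) != {"type": "connection_ok"}:
                return
            send_msg(sock, {"type": "report", **stats})
            if recv_msg(reader) == {"type": "request_information"}:
                send_msg(sock, {"type": "report", **stats})


def run_demo():
    """Run controller and server in one process and return the reports."""
    reports = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        listener.listen()
        port = listener.getsockname()[1]
        worker = threading.Thread(target=controller, args=(listener, reports))
        worker.start()
        server(port, {"traffic_load": 0.4, "cpu_usage": 0.25, "memory": 0.6})
        worker.join()
    return reports


if __name__ == "__main__":
    for report in run_demo():
        print(report)
```

A real deployment would keep the connection open, report on a timer, and handle many servers concurrently; the sketch only walks through one connection, one periodic report, and one controller-initiated request.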
The socket listening code runs on the controller: when the middle box is deployed in the data center, the controller starts to listen for new connections from servers. The socket connection code runs on the servers: when a server is connected to the middle box, it tries to establish a connection with the controller. If the connection succeeds, the controller can obtain information from the server, such as traffic load, CPU usage, and memory capacity. The servers periodically report their information to the controller, and the controller can also actively request information. The communication process between the controller and the servers is shown in Figure 1.

   Controller                                 Server
       |                                        |
       |<--------Establish Connection-----------|
       |                                        |
       |-----------Connection OK--------------->|
       |                                        |
       |<---------Report Information------------|
       |                                        |
       |<---------Report Information------------|
       |                                        |
       |---------Request Information----------->|
       |                                        |
       |<---------Report Information------------|
       |                 ......                 |

   Figure 1 Communication Process between Controller and Servers

In Figure 1, the server first connects to the controller. After the controller receives the request, it returns the "Connection OK" message. The server then reports its information to the controller periodically. The controller can also request information from the servers.

3.1.2. Path initialization

After the connections are established, the middle box needs to initialize the routes inside the middle box so that the data center can operate as usual. After initialization, traffic must follow its original path.

Consider a rearrangeable nonblocking Clos network with 2 ingress stage switches with 3*3 ports, 3 middle stage switches with 2*2 ports, and 2 egress stage switches with 3*3 ports.
Every ingress stage switch output connects to a middle stage switch input, one per middle stage switch, and every middle stage switch output connects to an egress stage switch input in the same way.

We define the Match Matrix M to describe the matches between the input ports (ingress stage switch input ports) and the output ports (egress stage switch output ports); it is used to initialize the original paths. If input port i matches output port j, which means that the traffic from input i needs to be transmitted to output j, then M(i,j)=1; otherwise M(i,j)=0. That is,

   M(i,j) = 1   if input port i matches output port j
   M(i,j) = 0   otherwise

In addition, the Match Matrix is subject to the following constraints:

   o  For each row, the sum of the entries is at most 1

   o  For each column, the sum of the entries is at most 1

   o  M(i,j) belongs to {0,1}

3.2. Learning and Optimization

Working together with the SDN controller, the middle box is able to learn the traffic characteristics and then optimize the paths inside the middle box to improve the efficiency of the data center.

3.3. Load Balancing

After initialization, the controller in the middle box has established connections with the switches and the servers, and the servers periodically report their resource usage, which is useful for load balancing. There are two kinds of load balancing in the middle box: the first is for the switches inside the middle box, and the second is for the servers in the data center.

Switches inside the middle box may become overloaded as a result of the optimization performed by the controller. When a large volume of traffic passes through one switch inside the middle box, it causes delay and packet loss. Fortunately, the controller can collect information from the internal switches, such as port rate, traffic load, and queue length. Using this information, the controller can dynamically change the routes inside the middle box.

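The following sketch ties the Match Matrix of Section 3.1.2 to the rerouting step just described: it checks the three constraints on M and then routes each matched port pair through the least-loaded middle stage switch that still has free links. It assumes a circuit-style model in which each link between stages carries one flow, and it is a simple greedy heuristic, not a full rearrangement algorithm. The names `is_valid_match_matrix`, `assign_paths`, and the `load` list of per-switch load figures are hypothetical.

```python
def is_valid_match_matrix(M):
    """Check the constraints on M: 0/1 entries, row and column sums at most 1."""
    entries_ok = all(v in (0, 1) for row in M for v in row)
    rows_ok = all(sum(row) <= 1 for row in M)
    cols_ok = all(sum(col) <= 1 for col in zip(*M))
    return entries_ok and rows_ok and cols_ok


def assign_paths(M, n, load):
    """Greedily pick a middle stage switch for every matched (input, output) pair.

    M:    Match Matrix over input ports (rows) and output ports (columns)
    n:    ports per ingress/egress switch, so port p sits on switch p // n
    load: load[m] is the reported load of middle stage switch m
    Returns {(i, j): m}, or None when the greedy choice gets stuck and
    existing flows would have to be rearranged.
    """
    if not is_valid_match_matrix(M):
        raise ValueError("not a valid Match Matrix")
    up_busy, down_busy = set(), set()    # (ingress, middle) and (middle, egress)
    paths = {}
    for i, row in enumerate(M):
        for j, matched in enumerate(row):
            if not matched:
                continue
            a, b = i // n, j // n        # ingress and egress switch indices
            free = [m for m in range(len(load))
                    if (a, m) not in up_busy and (m, b) not in down_busy]
            if not free:
                return None
            m = min(free, key=lambda k: load[k])   # least-loaded free switch
            up_busy.add((a, m))
            down_busy.add((m, b))
            paths[(i, j)] = m
    return paths


if __name__ == "__main__":
    # The example network: 2 ingress switches (3*3), 3 middle, 2 egress (3*3).
    identity = [[1 if i == j else 0 for j in range(6)] for i in range(6)]
    print(assign_paths(identity, n=3, load=[0.5, 0.1, 0.3]))
```

When the greedy pass returns None, a controller would need to move already-placed flows to other middle stage switches, which is the rearrangement that the flow-table rewriting of Section 2.1 makes possible.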
As described in Section 3.1.1, the servers establish connections with the controller during initialization. Servers periodically report their information to the controller, so the controller can take action if servers are overloaded. In data centers, some servers provide the same function. If one such server is overloaded, the controller can redirect traffic to another. Conversely, if the load on servers with the same function is light, the controller can aggregate the traffic onto one server and shut down the others to improve the efficiency of the data center.

4. Security Considerations

TBD.

5. IANA Considerations

This document introduces no additional considerations for IANA.

6. Conclusions

In this document, we proposed a middle box to improve the utilization of data centers. By combining, for the first time, the Clos network with Software Defined Networking, the middle box is programmable and nonblocking. The centralized controller inside the middle box can collect information from SDN switches and servers, and this information can significantly improve the utilization of the data center. In addition, the middle box is able to learn the traffic characteristics of the data center, so it can achieve better efficiency. The evaluation indicates that the middle box can significantly improve utilization compared with the original data center. Furthermore, the middle box is transparent to the original network and can be readily deployed in today's data centers.

7. References

7.1. Normative References

[1] Greenberg A, Hamilton J, Maltz D A, et al. The cost of a cloud: research problems in data center networks[J]. ACM SIGCOMM Computer Communication Review, 2008, 39(1): 68-73.

[2] Liu Xiaoqian. Research on Data Center Structure and Scheduling Mechanism in Cloud Computing[D]. Hefei: University of Science and Technology of China.

[3] Meng X, Pappas V, Zhang L. Improving the scalability of data center networks with traffic-aware virtual machine placement[C]//INFOCOM, 2010 Proceedings IEEE. IEEE, 2010: 1-9.

[4] Chidiebele U, Inyiama H C, Kennedy O, et al. QoS Parameters Investigations and Load Intensity Analysis, (A Case for Reengineered DCN)[J]. International Journal, 2012.

[5] Beloglazov A, Buyya R. Energy efficient resource management in virtualized cloud data centers[C]//Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing. IEEE Computer Society, 2010: 826-831.

[6] Leiserson C E. Fat-trees: universal networks for hardware-efficient supercomputing[J]. IEEE Transactions on Computers, 1985, 34(10): 892-901.

7.2. Informative References

[7] Greenberg A, Hamilton J R, Jain N, et al. VL2: a scalable and flexible data center network[C]//ACM SIGCOMM Computer Communication Review. ACM, 2009, 39(4): 51-62.

[8] Guo C, Lu G, Li D, et al. BCube: a high performance, server-centric network architecture for modular data centers[J]. ACM SIGCOMM Computer Communication Review, 2009, 39(4): 63-74.

[9] Cao J, Xia R, Yang P, et al. Per-packet load-balanced, low-latency routing for clos-based data center networks[C]//Proceedings of the ninth ACM conference on Emerging networking experiments and technologies. ACM, 2013: 49-60.

[10] Heller B, Seetharaman S, Mahadevan P, et al. ElasticTree: Saving Energy in Data Center Networks[C]//NSDI. 2010, 10: 249-264.

[11] Beloglazov A, Buyya R. Energy efficient resource management in virtualized cloud data centers[C]//Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing. IEEE Computer Society, 2010: 826-831.

[12] Clos C. A study of non-blocking switching networks[J]. Bell System Technical Journal, 1953, 32(2): 406-424. doi:10.1002/j.1538-7305.1953.tb01433.x. ISSN 0005-8580.

[13] Qazi Z A, Tu C C, Chiang L, et al. SIMPLE-fying middlebox policy enforcement using SDN[C]//Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM. ACM, 2013: 27-38.

[14] Open Networking Foundation. Software-defined networking: The new norm for networks[J]. ONF White Paper, 2012.

Authors' Addresses

   Renlong Tu
   Fudan University
   525, Zhang Hen Road, Zhangjiang
   Shanghai 201203
   P.R. China
   Email: 13210240004@fudan.edu.cn

   Chuanjie Zhou
   Fudan University
   525, Zhang Hen Road, Zhangjiang
   Shanghai 201203
   P.R. China
   Email: 13210240050@fudan.edu.cn

   Jin Zhao
   Fudan University
   525, Zhang Hen Road, Zhangjiang
   Shanghai 201203
   P.R. China
   Email: jzhao@fudan.edu.cn

   Xin Wang
   Fudan University
   525, Zhang Hen Road, Zhangjiang
   Shanghai 201203
   P.R. China
   Email: xinw@fudan.edu.cn