Methodologies for Scaling SLB Environments February 2006 
 
 
   Network Working Group                                                
   Internet Draft                                              Z. Naseh 
   Document: draft-naseh-scaling-slb-00.txt               Cisco Systems 
   Expires: August 2006                                   February 2006 
    
    
       Methodologies for Scaling Server Load Balancing Environments 
    
    
Status of this Memo 
    
   By submitting this Internet-Draft, each author represents that any 
   applicable patent or other IPR claims of which he or she is aware 
   have been or will be disclosed, and any of which he or she becomes 
   aware will be disclosed, in accordance with Section 6 of BCP 79. 
    
    
   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups.  Note that 
   other groups may also distribute working documents as Internet-
   Drafts. 
    
   Internet-Drafts are draft documents valid for a maximum of six months 
   and may be updated, replaced, or obsoleted by other documents at any 
   time.  It is inappropriate to use Internet-Drafts as reference 
   material or to cite them other than as "work in progress." 
    
   The list of current Internet-Drafts can be accessed at 
   http://www.ietf.org/ietf/1id-abstracts.txt 
    
   The list of Internet-Draft Shadow Directories can be accessed at 
   http://www.ietf.org/shadow.html. 
 
 
   The IETF takes no position regarding the validity or scope of any 
   Intellectual Property Rights or other rights that might be claimed to 
   pertain to the implementation or use of the technology described in 
   this document or the extent to which any license under such rights 
   might or might not be available; nor does it represent that it has 
   made any independent effort to identify any such rights.  Information 
   on the procedures with respect to rights in RFC documents can be 
   found in BCP 78 and BCP 79. 
    
    
Naseh                   Expires - August 2006                [Page 1] 
 
Abstract 
    
   This document describes several design methodologies and current
   best practices for scaling server load balancing (also known as
   content switching) environments in enterprise data centers. The
   scenarios covered use protocols and technologies ranging from IPv4
   routing to DNS to scale server load balancing.
    
 
Table of Contents 
    
   1. Introduction
   2. Benefits of Scaling Content Switching
      2.1 Scalability
      2.2 Performance
   3. Scaling Methodologies
      3.1 Distribution of Applications
      3.2 Using DNS for Application Scalability
      3.3 Using IP Routing for Application Scalability
   4. Application Distribution Approach
   5. DNS-Based Scaling Approach
      5.1 Predictable Traffic Flow
      5.2 Ease of Management and Maintenance
   6. IP Routing Based Scaling Approach
   7. Scaling Beyond Server Capacity
   8. IANA Considerations
   9. Security Considerations
   10. Acknowledgments
   11. Author's Addresses
   12. Copyright Notice
   13. Disclaimer
   14. Intellectual Property
    
    
1.   Introduction 
    
   An enterprise data center is a complex environment that hosts
   mission-critical internal and external application servers,
   databases, and data storage devices. The data center infrastructure
   is tied together using Layer 2 and Layer 3 devices. Firewalls and
   intrusion detection devices secure the compute infrastructure and
   the networking devices. The key service used within the data center
   to scale applications is server load balancing (SLB), also known as
   content switching. The idea is to distribute user requests across a
   group, or farm, of servers hosting the same application or daemon.
    

   This document focuses on how a load-balanced environment can be
   scaled. As the user base grows over time, the key question is how to
   scale an application so that user experience is not compromised.
   Typically, an application is scaled either by adding more servers to
   the load-balanced server farms within the same data center or by
   replicating the application in another data center. Running the same
   application in two in-service data centers serves two purposes:
   scalability and disaster recovery. There are several ways to provide
   redundancy between data centers; these approaches are not covered in
   this memo.
    
   This document introduces the concepts and design methodologies for
   scaling load-balancing services within a single data center. It
   discusses how load-balanced applications can be scaled using DNS or
   IP routing, and how they can be scaled when the capacity of the
   servers behind a load balancer is exhausted. In other words, it
   covers how the same application is load balanced on two different
   SLB devices, the approaches to SLB device selection, and the methods
   of scaling when the servers behind an SLB device have reached their
   maximum capacity or have failed health checks.
    
 
2.   Benefits of Scaling Content Switching 
    
   The motivations behind scaling content switching within a data center 
   are similar to the motivations for server load balancing. Scalability 
   and performance are the main reasons behind server load balancing in 
   general.  
    
    
2.1    Scalability 
    
   A server load-balancing device can reach its limits in connections
   per second, concurrent connections, probes or health checks to
   servers, slow-path (process-switching) performance, or software
   configuration limits. Once any of these limits is reached, another
   server load-balancing device has to be deployed.
     
    
2.2    Performance 
    
   Performance of the load balancer in terms of packet switching and raw
   throughput is another key reason to deploy multiple pairs of server
   load-balancing devices in the same data center. To overcome
   performance limitations of the load balancer, one option is to design
   the SLB environment so that only load-balanced traffic passes through
   the SLB devices. If this does not help, another pair of load balancers
   can be deployed for a different application or the same application.
   The next few sections discuss several design methodologies that can
   be used to scale SLB environments.
    
    
3.   Scaling Methodologies 
    
   This section introduces some of the design approaches and
   technologies that can be used to scale within a data center. The key
   scaling methodologies are distributing applications over multiple SLB
   devices, using smart DNS to distribute the same application traffic
   over multiple SLB devices, and using IP routing to distribute the
   same application traffic over multiple SLB devices.
    
    
3.1    Distribution of Applications 
    
   Distributing applications across multiple SLB devices is the simplest 
   and the most manageable of all the design approaches. The idea is to 
   segment an existing fully utilized SLB device’s configuration with 
   respect to applications. One set of applications will reside in the 
   old SLB device, and a different set of applications will reside in a 
   new SLB device. 
   In this approach, each application owner would have to service a 
   particular SLB device. Routing infrastructure changes or DNS changes 
   are not required. 
    
    
3.2    Using DNS for Application Scalability 
    
   In a situation where the SLB device is overloaded by a single 
   application, distribution of applications will not help. In that 
   case, the particular application traffic will have to be split across 
   multiple pairs of SLB devices. One way to do this is to host the 
   application using different sets of server farms and different 
   virtual IP addresses on separate pairs of SLB devices. A smart DNS 
   server (a DNS device that can do health checks and verify 
   availability of the virtual IP addresses) can be used to distribute 
   the users across the two virtual IP addresses being hosted on 
   separate pairs of SLB devices.  
   A typical smart DNS appliance offers source IP-based stickiness or
   source IP-based hashing, which can be used to provide client
   persistence to a particular SLB device.
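
   The source IP-based hash mentioned above can be illustrated with a
   minimal sketch. It is not taken from any particular DNS appliance;
   the function name pick_vip, the choice of SHA-256, and the two VIP
   values are assumptions made only for illustration. The point is
   that the same client address always maps to the same VIP, which
   gives persistence to one SLB pair.

```python
import hashlib
import ipaddress

# Illustrative VIPs, one per SLB pair (assumed values, not a real config).
VIPS = ["10.11.12.15", "10.12.13.15"]


def pick_vip(client_ip, vips=VIPS):
    """Map a client source address to one VIP, deterministically."""
    # Hash the packed address so the result does not depend on how the
    # address string happens to be formatted.
    packed = ipaddress.ip_address(client_ip).packed
    digest = hashlib.sha256(packed).digest()
    index = int.from_bytes(digest[:4], "big") % len(vips)
    return vips[index]


if __name__ == "__main__":
    # Repeated queries from one client always land on the same VIP.
    print(pick_vip("10.11.42.31"), pick_vip("10.11.42.31"))
```

   A plain modulo hash like this one reshuffles most clients when the
   VIP list changes in size; that trade-off is acceptable for a sketch
   but may matter in a real deployment.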
    
    
3.3    Using IP Routing for Application Scalability 
    
   In environments where the use of DNS is not desired and where the
   client base is internal, an IP-based load-balancing solution can be
   used to scale the SLB device. This design requires the application to
   be hosted using different sets of server farms but the same Virtual
   IP address on separate pairs of SLB devices. Based on the
   availability of the servers, the Virtual IP address is injected into
   the routing table of the adjacent routers. For this methodology, the
   SLB device needs to establish routing peering with its adjacent
   router. If servers are operational on both SLB devices, then two
   host-based routes for the same Virtual IP address show up in the
   routing domain. As client requests are routed, they end up at the SLB
   device closest to them in terms of routing metrics.
    
    
4.   Application Distribution Approach 
 
   The application distribution approach is fairly straightforward
   conceptually. It requires the configurations of the SLB device to be
   split along the lines of different applications or business units or 
   owners. The segmented configuration resides on different SLB devices. 
   The complex part of this approach is to distribute the server 
   resources. The servers belonging to a particular application should 
   have their default gateway pointing to the SLB device hosting that 
   specific application. This becomes difficult when load-balanced 
   servers associated with different applications reside in the same 
   VLAN.  
   To ensure that load-balanced server return traffic traverses the 
   appropriate SLB device, we can take several measures. These methods 
   range from re-IP addressing and placement of the servers in 
   appropriate VLANs behind the SLB device to redesigning the SLB 
   environment to a one-armed design approach where the physical and 
   logical presence of the application servers does not matter. 
 
   For example, partner.example.com and shop.example.com resided on the 
   same SLB device. As the usage of the SLB device went to 100 percent, 
   clients started experiencing delays. These delays were resolved by 
   splitting the SLB device configurations into two different units, one 
   hosting partner.example.com and the other shop.example.com. 
    
    
5.   DNS-Based Scaling Approach 
 
   DNS is typically used to load balance applications across multiple 
   data centers. This is known as global site (or server) load balancing 
   (GSLB). Similarly, we can use DNS to perform GSLB within a data 
   center to scale SLB devices.  
    
   The DNS-based scaling approach is simple to integrate into most
   existing infrastructures and can be migrated to over time. For
   example, a pair of intelligent DNS appliances is deployed within a
   data center, and the DNS appliances are authoritative for the domains
   shop.example.com and partner.example.com. These applications are load
   balanced by the SLB pairs within the data center. Each SLB pair load
   balances traffic across a local set of servers for the same
   application.
 
   Consider the example of shop.example.com. Its virtual IP addresses
   are 10.11.12.15 in data center segment 1 and 10.12.13.15 in data
   center segment 2. When a client wants to access shop.example.com,
   the DNS appliance responds with the next available VIP, 10.12.13.15
   or 10.11.12.15, based on the load-balancing algorithm. The
   load-balancing methods on the DNS appliance range from simple round
   robin to static proximity based on the requestor’s source IP address.
   For example, when a client with an IP address in subnet 10.11.0.0/16
   queries the DNS appliance, the response should contain the VIP of
   data center segment 1 unless that VIP is down. The DNS appliance
   accomplishes this by looking at the source IP address of the
   requestor. If the source IP address is that of a client or server in
   subnet 10.11.0.0/16, the answer in the DNS response contains the
   data center segment 1 VIP (10.11.12.15). The steps of DNS resolution
   are as follows:
    
     1. A client with IP address 10.11.42.31 issues a DNS query for 
        shop.example.com. 
     2. The internal example.com name server receives the request and 
        responds with both DNS appliances’ IP addresses; that is, 
        10.11.10.171 and 10.12.11.161. 
     3. The client sends the request to the first member in the answer 
        list; that is, 10.11.10.171. (If 10.11.10.171 times out, the 
        client queries 10.12.11.161.) 
     4. The DNS appliance inspects the query and finds a match for
        shop.example.com; the policy configured on the DNS appliance
        directs it to look at the source IP address of the requestor and
        respond accordingly. The static policy configuration is similar
        to the following:
          a. If the source address is 10.11.0.0/16, then respond with 
             10.11.12.15; if 10.11.12.15 is down, respond with 
             10.12.13.15. 
          b. If the source address is 10.12.0.0/16, then respond with 
             10.12.13.15; if 10.12.13.15 is down, respond with 
             10.11.12.15. 
          c. Alternatively, do a source IP hash and balance between 
             10.11.12.15 and 10.12.13.15. 
     5. When the client receives the response containing the local SLB
        device VIP (10.11.12.15), it initiates a TCP connection to that
        VIP.
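
   The static policy in step 4 (items a and b) can be sketched as
   follows. This is an illustrative model only, not the configuration
   syntax of any DNS appliance. The names POLICY and resolve, and the
   vip_is_up mapping that stands in for the appliance's health-check
   results, are assumptions.

```python
import ipaddress

# One row per rule in steps 4a and 4b:
# (source prefix, preferred VIP, fallback VIP).
POLICY = [
    (ipaddress.ip_network("10.11.0.0/16"), "10.11.12.15", "10.12.13.15"),
    (ipaddress.ip_network("10.12.0.0/16"), "10.12.13.15", "10.11.12.15"),
]


def resolve(client_ip, vip_is_up):
    """Return the VIP to place in the DNS answer, or None.

    vip_is_up is a hypothetical dict of VIP -> bool that represents
    the outcome of the appliance's health checks.
    """
    source = ipaddress.ip_address(client_ip)
    for prefix, preferred, fallback in POLICY:
        if source in prefix:
            if vip_is_up.get(preferred, False):
                return preferred
            if vip_is_up.get(fallback, False):
                return fallback
            return None  # both VIPs for this rule are down
    return None  # no rule matches this source address


if __name__ == "__main__":
    health = {"10.11.12.15": True, "10.12.13.15": True}
    print(resolve("10.11.42.31", health))
```

   With both VIPs healthy, the client at 10.11.42.31 from step 1 is
   answered with the segment 1 VIP; marking that VIP down in the health
   mapping moves the answer to the segment 2 VIP.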
      
   The DNS-based solution meets the requirements of proximity and
   stickiness to the selected SLB device. The key advantages of the
   DNS-based approach are:
     . Predictable traffic flow
     . Ease of management and maintenance of each SLB node
    
    
5.1    Predictable Traffic Flow 
    
   The DNS-based solution is independent of routing protocol issues like 
   flaps (rapid fluctuation of routing information) or slow convergence. 
   The decision about which SLB device will be used is made before the
   data center client initiates the session. The SLB node is selected
   based on the client’s source IP address and the configured policy, so
   the flow is known in advance. The configured decision is followed
   unless the SLB node named in the policy is unavailable. Several
   balancing methods similar to the ones used in SLB devices are
   available to load balance client DNS requests.
 
 
5.2    Ease of Management and Maintenance 
    
   In the DNS approach previously mentioned, the DNS appliance is
   configured with the virtual IP addresses on the SLB devices in all
   the data center segments. Taking down an application, a service, or
   an entire SLB node is therefore fairly simple: the service can be
   suspended on the DNS appliance with the click of a button.
 
    
6.   IP Routing Based Scaling Approach 
 
   The IP Routing based approach works best when the client base is 
   internal or in a controlled routing environment. In this solution, 
   the application is hosted on multiple SLB devices with different 
   servers but the same Virtual IP address.  
   In this design methodology, the SLB device injects a host route for
   the Virtual IP address into the adjacent router as long as at least
   one of the servers in the respective server farm is operational.
   Extensive probing is available on the SLB device to check the health
   of the server and the appropriate application daemon that runs on the
   server. When this method is used on multiple SLB devices hosting the
   same application with the same Virtual IP address, the routing domain
   has multiple paths to the same VIP address. The next hop on each of
   these host routes is the IP address of the corresponding SLB device.
   As a user request enters the routing domain, it is sent to the SLB
   device closest to the user based on the routing metrics.
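
   The behavior described in this section can be modeled with a short
   sketch. It does not implement a routing protocol; it only shows
   which host routes would exist and which next hop one observing
   router would prefer. The SlbDevice type, its field names, the
   addresses, and the single integer metric are all assumptions made
   for illustration.

```python
from dataclasses import dataclass
from typing import List

VIP = "10.11.12.15"  # illustrative Virtual IP shared by all SLB devices


@dataclass
class SlbDevice:
    name: str
    address: str            # next hop of the injected host route
    metric: int             # cost as seen from the observing router
    servers_up: List[bool]  # health-check result for each real server


def host_routes(devices):
    """List (prefix, next_hop, metric) for each SLB with a live server."""
    return [
        (VIP + "/32", d.address, d.metric)
        for d in devices
        if any(d.servers_up)  # inject only while one server is healthy
    ]


def best_next_hop(devices):
    """Return the next hop with the lowest metric, or None if no route."""
    routes = host_routes(devices)
    if not routes:
        return None
    return min(routes, key=lambda route: route[2])[1]


if __name__ == "__main__":
    slb1 = SlbDevice("slb1", "10.11.10.1", 10, [True, False])
    slb2 = SlbDevice("slb2", "10.12.10.1", 20, [True, True])
    print(best_next_hop([slb1, slb2]))
```

   When every server behind the closer SLB device fails its probes,
   that device's host route disappears from the list and traffic is
   routed to the remaining device.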
    
 
7.   Scaling Beyond Server Capacity 
 
   The preceding sections discussed how to scale SLB devices beyond
   their capacity limits. This section discusses how to scale beyond the
   capacity of the servers behind an SLB device. Suppose an SLB device
   has 10 servers configured on it for a particular resource-intensive
   web-based application, and each server can serve only 100 user
   sessions concurrently. At any given time, the SLB environment can
   therefore service 10 x 100 = 1000 users. Beyond that, any new user
   load balanced to a server causes that server to become unstable.
   There are several approaches to resolve this server capacity issue.
   These solutions range from increasing the server CPU and memory
   resources to using multiple features on the SLB devices to form a
   scalable environment. The more complex but more comprehensive
   solutions rely on the max connections feature at the real server or
   virtual server level. The max connections configuration for a real
   server informs the SLB device that the server can handle only the
   configured number of concurrent sessions. This protects the real
   servers from excessive user requests that could cause them to become
   unstable. The idea is not to disrupt the existing users’ sessions.
   So, the first step is to detect that the servers have reached their
   max connections limit, and the second step is to take appropriate
   action. If the SLB device detects that all the servers in the server
   farm of a particular application have reached their maximum capacity,
   any of the following measures can be taken:
    
     . Have a server that will serve a turn-away page or a sorry 
        message to the clients. The user will be informed to return to 
        the site at a later time. 
     . Send an HTTP 302 redirect to the new clients and send them to 
        another SLB device hosting the same application. 
     . Deploy the SLB devices in conjunction with a smart DNS
        appliance. When max connections are reached, the SLB device
        informs the DNS appliance to answer user DNS queries with a VIP
        from another SLB device.
     . Send the user request to another SLB device, but source
        translate the packet (source NAT). This ensures that the return
        packet from the second SLB device traverses the first SLB
        device. In other words, the session path is symmetric.
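
   The two steps described above, detecting that max connections has
   been reached and then taking an action, can be sketched as follows.
   This is a simplified model rather than the behavior of any specific
   SLB product. The RealServer type, the dispatch function, the
   least-connections choice, and the overflow action labels are
   assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RealServer:
    name: str
    max_conns: int      # configured max connections for this server
    active: int = 0     # sessions currently assigned
    healthy: bool = True

    def has_capacity(self):
        return self.healthy and self.active < self.max_conns


def dispatch(servers, overflow_action="sorry_page"):
    """Assign one new session, or report the overflow action.

    overflow_action is a hypothetical label such as "sorry_page",
    "http_302_redirect", or "forward_with_source_nat".
    """
    candidates = [s for s in servers if s.has_capacity()]
    if not candidates:
        # Every server is at max connections or has failed its probe:
        # existing sessions are left alone, new ones are turned away.
        return ("overflow", overflow_action)
    target = min(candidates, key=lambda s: s.active)  # least connections
    target.active += 1
    return ("server", target.name)


if __name__ == "__main__":
    farm = [RealServer("web%d" % i, max_conns=100) for i in range(10)]
    results = [dispatch(farm) for _ in range(1001)]
    print(results[999], results[1000])
```

   With 10 servers limited to 100 sessions each, the first 1000 new
   sessions are assigned to servers and the next one triggers the
   overflow action, which matches the capacity figure used in the
   example earlier in this section.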
    
   The overall solution can use a combination of approaches to come up 
   with an environment that best services the hosted application.  
    
   This document introduced the concepts and design methodologies of
   scaling load-balancing services within a single data center. It
   covered how load-balanced applications can be scaled using DNS or IP
   routing, and how they can be scaled when the capacity of the servers
   behind a load balancer is exhausted.

 
8.   IANA Considerations 
 
   This document requests no action by IANA. 
 
 
9.   Security Considerations 
 
   The scenarios and design methods identified in this document should 
   only be implemented in accordance with the network security policies. 
   The SLB environment identified MUST be behind a firewall perimeter.  
    
   Components of an effective information security architecture,
   including network infrastructure and server infrastructure security,
   physical security, security awareness, incident monitoring and
   response, audit, and security improvement processes, are assumed to
   be in place and active.
 
 
10.    Acknowledgments 
    
   The author gratefully acknowledges the contributions of Haroon Khan 
   of Cisco Systems, Inc. 
    
 
11.    Author's Addresses 
 
   Zeeshan Naseh 
   Cisco Systems 
   San Jose, California 
   znaseh@cisco.com 
 
 
12.    Copyright Notice 
    
   Copyright (C) The Internet Society (2006).  This document is subject 
   to the rights, licenses and restrictions contained in BCP 78, and 
   except as set forth therein, the authors retain all their rights.  
 
 
13.    Disclaimer 
       
   This document and the information contained herein are provided on an 
   'AS IS' basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET 
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, 
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE 
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
 
 
14.    Intellectual Property 
    
   The IETF takes no position regarding the validity or scope of any 
   Intellectual Property Rights or other rights that might be claimed to 
   pertain to the implementation or use of the technology described in 
   this document or the extent to which any license under such rights 
   might or might not be available; nor does it represent that it has 
   made any independent effort to identify any such rights.  Information 
   on the ISOC's procedures with respect to rights in ISOC Documents can 
   be found in BCP 78 and BCP 79.  
    
   Copies of IPR disclosures made to the IETF Secretariat and any 
   assurances of licenses to be made available, or the result of an 
   attempt made to obtain a general license or permission for the use of 
   such proprietary rights by implementers or users of this 
   specification can be obtained from the IETF on-line IPR repository at  
   http://www.ietf.org/ipr.   
        
   The IETF invites any interested party to bring to its attention any 
   copyrights, patents or patent applications, or other proprietary 
   rights that may cover technology that may be required to implement 
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.
    
    