Network Working Group                                      J. Mansigian
Internet-Draft                                               Consultant
Category: Informational                                   February 1997
Expires: August 1997


             Clearing the Traffic Jam at Internet Servers
       A Network Layer View of Network Traffic Consolidation

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress".

To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).

I Abstract

The cause of the typically glacial response from popular Internet World
Wide Web servers is seldom a lack of network bandwidth or any deficit
in the client's equipment. The performance is abysmal because the
server is spending an inordinate amount of time managing two problems:
an unnecessarily large number of TCP connections, and masses of
redundant data that are received and transmitted without optimization.
This work addresses both problems.

This document presents an introduction to the concepts and architecture
of network traffic consolidation. It is not intended to describe a
complete protocol with every ancillary feature, but rather to focus on
performance-driven core ideas that could become part of emerging
commercially featured protocols. In that sense it does not define a
replacement for what exists; it is more a source of ideas to influence
emergent communication initiatives.
The scope of network traffic consolidation is confined to file-level
interactions between Internet World Wide Web servers and their clients.
Allowed data is presented literally to the client without being
transformed by client-specific selection below the file level or
client-specific calculated outputs.

The goal of network traffic consolidation is to make an overburdened
file server behave very much as if it were servicing a light flow of
file requests. The methods of network traffic consolidation achieve
this goal by actually making the server's file request flow light.
Network traffic consolidation acts on both the input and output flows
of client-server data. The input phase of network traffic consolidation
is called request reduction; the output phase is called dynamic
multicast.

Request reduction is implemented by a router-resident process that sees
a busy server's file request flow templated by a series of time windows
of small uniform interval. The request reduction process gathers into
one grouping all file requests that arrive in the same time window,
request the same file, and originate from different clients. A common
example would be multiple HTTP requests from different clients
requesting the same HTML document file occurring in the same time
window. All input requests are complete as single packets. Likewise the
output reduced request is a single packet. The output request contains
the payload data common to all input file requests and a unique key,
placed in the client's address field. The reduced request's key
identifies a distribution tree of client addresses that the router
keeps cached for a short time while the request is being fulfilled by
the output phase. The direct advantage of request reduction is a
dramatic decrease in the frequency of host interruption, with an
attendant improvement in server performance.
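The request reduction phase described above can be illustrated with a
short sketch. This sketch is not part of the draft: the names
(RequestReducer, submit, release_due) are hypothetical, and a real
router would operate on packets rather than Python objects. It shows
same-file requests arriving inside one time window being grouped into a
single reduced request, with the recorded routes cached under a
generated key for the output phase.

```python
import time

# Illustrative sketch of request reduction; all names here are
# hypothetical and not taken from the draft. Same-file requests that
# arrive inside one time window are grouped into one reduced request;
# the recorded routes are cached, keyed, for the dynamic multicast phase.

class RequestReducer:
    def __init__(self, window=2.0, max_count=64, clock=time.monotonic):
        self.window = window        # configured small uniform interval
        self.max_count = max_count  # configured maximum reduction count
        self.clock = clock          # injectable clock, for testing
        self.next_key = 0
        self.open_windows = {}      # filename -> open time window state
        self.pending = {}           # request key -> cached client routes

    def submit(self, filename, recorded_route):
        """A single-packet file request arrives: join the open window
        for this file, or open a new one (timer, counter, buffers)."""
        entry = self.open_windows.get(filename)
        if entry is None:
            entry = {"start": self.clock(), "key": self.next_key,
                     "count": 0, "routes": []}
            self.next_key += 1
            self.open_windows[filename] = entry
        entry["count"] += 1                      # reduction counter
        entry["routes"].append(recorded_route)   # distribution tree data

    def release_due(self):
        """Release windows whose interval elapsed or whose count exceeded
        the maximum; emit one reduced request (key, filename, count) per
        window and cache its routes until the response is distributed."""
        released = []
        now = self.clock()
        for filename, e in list(self.open_windows.items()):
            if now - e["start"] >= self.window or e["count"] > self.max_count:
                del self.open_windows[filename]
                self.pending[e["key"]] = e["routes"]
                released.append((e["key"], filename, e["count"]))
        return released
```

A periodic task on the router would call release_due() every few
hundred milliseconds; the pending cache stands in for the per-key
distribution tree that the dynamic multicast phase consults and then
discards.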
Dynamic multicast distributes a server's response to a reduced request
by using the distribution tree of requester addresses that was created
on the fly by the request reduction process. The distribution tree is
very similar to the pruned spanning tree used by MOSPF multicast
implementations. It differs in that, in network traffic consolidation,
the complete distribution tree is included in each of the server's
response packets instead of being held in router cache as in MOSPF.
Network routers effect dynamic multicast by a resident process which
copies each response packet onto the interfaces indicated by the
distribution tree contained in the response packet. The principal
advantage of dynamic multicast is that the server needs to put only one
copy of the file on the wire no matter how many client requests compose
the reduced request.

II Introduction

The Internet is used by millions of people every day for a variety of
purposes. There is no sign that growth of interest in using the
Internet is abating. The demands being placed on the Internet today
could not have been anticipated decades ago when its predecessor,
ARPANET, was designed. However, the effects of those early design
decisions are still very much with the Internet.

The Paradigm Shift and its Effects

The 1990s brought to the Internet a crush of new users from diverse
backgrounds. This influx was mostly the result of the meteoric rise of
the World Wide Web, precipitated by the widespread availability of
easy-to-use graphically interfaced clients such as Mosaic. Two
important changes resulted from this burgeoning movement. One was
explosive growth in activity, both over the network and at the host
interface. The other was that the predominant form of communication
shifted from a peer-to-peer model to a mixture of peer-to-peer and
client-server modalities.
In the early days of internetworking the peer-to-peer model of network
use was exemplified by collaborating researchers sending email and
experiment data to each other. Peer-to-peer traffic remains very
important today but is no longer unchallenged as the predominant form
of communication on the Internet. The ascendancy of the World Wide Web
has created massive client-server traffic on the Internet that differs
qualitatively from the previous network traffic in important ways.

The first difference is that in today's internetworked client-server
model the data is no longer necessarily unique. Colleagues sending each
other email, or collaborating by exchanging files of work-related data
such as the results of an experiment, almost always make progress from
one communication to the next, so the data being transmitted is
essentially unique. Intimate use of the Internet by small numbers of
communicants sending unique data was the predominant style before the
1990s. This was the culture of the Internet before the public embraced
it.

This state of affairs contrasts sharply with the current rage for
accessing HTML pages from the World Wide Web. Popular Web pages change
very slowly relative to their access rate and therefore closely
approach being constant data. The number of clients that access popular
Web pages is spectacular. On another front, the emergence of commercial
and public databases accessible from the Internet has brought about the
commoditization of online information. The commodity data of these
databases also tends to change slowly relative to its access rate and
is therefore another major source of nearly constant data which did not
exist before. Like the Web pages, many of these information files also
experience heavy demand from a growing public audience.

Another important way in which the new Internet traffic differs from
its precursor has to do with the temporal clustering of requests.
With the phenomenal growth in client activity in recent years, the
percentage of requests that arrive almost simultaneously at servers has
also increased dramatically. The confluence of data redundancy,
temporal clustering of requests, and heavy traffic in the new Internet
are crucial factors that affect network performance and provide the
basis for optimization.

The Problems and Their Causes

Frequent Host Interruption

Network-based hosts on which server processes execute are controlled by
general purpose operating systems. The host system does not perform
efficiently when interrupts arrive too frequently. Protocols based on
individual request and response, in an environment of hundreds to
thousands of clients a minute accessing the host, produce such a dense
pattern of interrupts that the host's performance is seriously
degraded.

LAN Saturation

The LANs that Internet-based hosts are connected to are adversely and
unnecessarily affected by the passage of large numbers of individual
requests onto the LAN when the data redundancy of the requests is high.
Every packet that arrives at the LAN must have its Internet address
resolved to a physical address. Carrying request packets that are to be
processed individually keeps the LAN unnecessarily loaded. The
degrading effects of LAN saturation go beyond shackling the performance
delivered to remote clients. Local clients running transient
applications (e.g. word processors) on hosts connected to the LAN also
experience a loss in quality of service.

Host Interface Burdened by Redundant Output

The current state of the art for the internetworked client-server model
has the server, or a proxy, copying data onto the wire as many times as
it is requested, regardless of conditions. Conditions may include many
clients making requests for the same data within a brief time interval.
However, current protocols used to distribute the server's output
cannot use these conditions to optimize the transfer of data from a
memory buffer to the network media. As a server becomes more popular
and develops tighter temporal clustering of same-file requests, the
time it takes to output the data increases at a faster than linear
rate; the rate is superlinear because the system degrades as the
interrupt pattern becomes more dense. The use of on-server caches and
mirror servers cannot address the fact that the data is transferred to
the network substrate as many times as it is requested.

Conclusion To The Introduction

The individually focused request and response paradigm at the core of
current client-server design fails massive public application because
of inefficiency bred of treating every request and every response as an
individual piece of work, regardless of the presence of conditions that
allow optimization. The solution lies in the direction of revised input
and output processing that efficiently exploits patterns of data
redundancy, temporal clustering, and the efficiencies of multicast
routing. This approach cannot be wholly transparent to the network. As
the following sections of this document reveal, network layer protocols
must be involved in the new client-server processing model.

III Client To Server

Collecting Data For Dynamic Multicast

The client transmits requests to the server using the IP Record Route
option. This option causes a packet's path from source to destination
to be recorded in a preformatted area provided within the packet. This
mechanism is used to collect the data that will form the distribution
tree used by dynamic multicast.

Basis for Request Reduction

The basis for advantageous request reduction is the high frequency
arrival of the same request semantics from different clients. The
busiest Web sites today receive HTTP hits at a sustained rate of 300
per second.
Given that most clients will use the same entry point to the site and
the same few layers of the site's HTML document hierarchy, there exist,
within a small time window such as two seconds, scores of requests for
the same HTML file. Even if we scale down from the busiest Web sites by
an order of magnitude, the sixty or so HTTP hits inside the time window
provide a sufficient basis for successful request reduction.

Request Reduction and the Optimizing Router

The request reduction process runs on network traffic consolidation
optimizing routers. The hardware for this type of router differs from
any other packet switching device only in that the device should have
upgraded memory capacity and upgraded processor speed. The request
reduction daemon is the process that receives incoming packets. It acts
as a filter that removes and processes the single-packet file requests
it is responsible for, and passes the rest of the packet traffic
through to conventional router logic.

Procedure for Request Reduction

The request reduction process divides time into small windows of a
configured interval. These windows may overlap. When the single packet
of a complete raw request arrives at the optimizing router, the request
is examined to see if it is a file request. If it is not a file
request, the packet is passed through to be acted upon by conventional
router logic.

If the request received by the request reduction process is a file
request, its payload data is examined to see if it matches the payload
data in the request buffer associated with any existing time window. If
there is no match between the payload data of the new request and the
payload data in any request buffer, then a new time window is allocated
by starting a new timer, allocating a reduction counter set to zero,
allocating a memory buffer for the new request, moving the request into
this newly allocated request buffer, and incrementing the reduction
counter by one.
Another buffer, associated one-to-one with the newly allocated time
window, is also allocated. This is the time window's distribution tree
buffer. The unmatched request's path from source to destination, found
in the request packet, is formatted into a tree and moved to the
distribution tree buffer associated with the newly allocated time
window. That takes care of non-matching file requests.

If there is a match between the new request's payload data and the
payload data found in the request buffer of any time window, then the
source-to-destination path of the newly arrived request is used to
update the matching time window's distribution tree, and the matching
time window's reduction count is incremented by one.

When a time window's release comes due, either because of elapsed time
or because the reduction count has exceeded a configured maximum, the
following happens.

1) The request reduction process invokes the appropriate protocol to
   resolve the Internet destination address of the reduced request to
   the corresponding physical address of the destination station on the
   LAN.

2) The request reduction process generates, for reference by the
   co-resident dynamic multicast process, a unique request key that is
   associated with the reduced request.

3) The request reduction process creates the reduced request record,
   which consists of the payload data from the time window's request
   buffer, the resolved physical address of the reduced request's
   destination station, and the above-cited generated request key
   placed in the reduced request's originator field.

4) A response timer associated with the reduced request is started.

5) The reduced request is copied onto the LAN and is duly received by
   the server to whom it is addressed.

IV Server To Client

MOSPF Implementation of IP Multicast in a Nutshell

Multicast communication involves the sending of packets from one source
to many destinations.
Network routers that run the multicast router daemon copy received
packets onto those interfaces that are part of a shortest-path
distribution tree pruned of superfluous links. This pruned distribution
tree provides just one path from the packet's source to each
destination. Destinations are referenced by a special type of IP
address known as a group address, or Class D Internet address.
Recipients of multicast packets have a standard command interface that
allows them to join and leave a group address, thus controlling what
transmissions they will receive. The architecture of IP multicast is
defined by RFC 1112. MOSPF is defined by RFC 1584 and further discussed
in RFC 1585.

Dynamic Multicast

Like IP multicast, dynamic multicast involves sending packets from one
source to many destinations. It makes use of an optimal path
distribution tree, similar to IP multicast, to efficiently copy the
source packets to their destinations. Dynamic multicast differs from IP
multicast in the following ways.

1) Dynamic multicast derives the addresses in its distribution tree
   from the incoming request packets sent with the IP Record Route
   option enabled. This differs from MOSPF implementations of IP
   multicast, where the distribution tree is created by a router's IP
   multicast process running its route determination algorithm against
   the router's buffered network topology data.

2) Dynamic multicast makes use of unicast addresses in its distribution
   tree. Group addresses are not needed because the distribution tree
   travels with the data being distributed.

3) The data persistence of the distribution tree used by dynamic
   multicast is bounded by the life span of the clients' request. This
   scope of data persistence contrasts with MOSPF implementations of IP
   multicast, where a cached distribution tree persists until an update
   changes it.
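Because the tree travels with the data, a router's forwarding decision
can be computed from the packet alone. The sketch below is an
assumption-laden illustration, not the draft's specification: the
nested-dict tree encoding, the function names (fanout, requesters), and
the participating set are invented for clarity. It shows a router
emitting one copy per participating child subtree, and falling back to
per-requester unicast for a subtree rooted at a router that does not
run dynamic multicast.

```python
# Sketch of dynamic multicast fanout at one router; the tree encoding
# and the names (fanout, requesters) are assumptions for illustration.
# The distribution tree rides inside each response packet; leaf nodes
# are implicit destinations, and interior nodes may be requesters too.

def requesters(node):
    """Collect requester addresses in a subtree. Interior nodes may be
    requesters themselves; leaf nodes are implicit destinations."""
    found = []
    if node["requester"] or not node["children"]:
        found.append(node["addr"])
    for child in node["children"]:
        found.extend(requesters(child))
    return found

def fanout(tree, participating):
    """Decide which copies of one response packet to emit.

    ('addr', 'multicast') forwards the packet with its subtree still
    attached; ('addr', 'unicast') is a plain per-requester copy made on
    behalf of a non-participating child router, with this router doing
    all the work for that subtree at once."""
    copies = []
    for child in tree["children"]:
        if not child["children"] or child["addr"] in participating:
            copies.append((child["addr"], "multicast"))
        else:
            for dest in requesters(child):
                copies.append((dest, "unicast"))
    return copies
```

Each downstream participating router would repeat fanout() on the
subtree it received, so one copy per link suffices until the last hop.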
Outbound Processing

At entry to outbound processing the server has already received the
reduced request, read the requested file from cache or secondary
storage, and formatted the file into packets. Now the server replies to
clients with the following steps.

1) A reply packet is sent by the server to the network traffic
   consolidation optimizing router.

2) The dynamic multicast process running on this router takes the
   unique key (created by the request reduction process and placed in
   the reply packet's destination address field by the server) and uses
   it to find the corresponding distribution tree saved in its memory.

3) The dynamic multicast process allocates memory for a multicast
   response packet for each of the interfaces it should copy to, as
   indicated by the distribution tree.

4) For each of the newly allocated response packets, the dynamic
   multicast process populates the multicast response packet with the
   server's reply packet, the complete distribution tree, and the
   Internet address of the forwarding interface to which the packet is
   destined.

5) All of the response packets are copied onto the router's forwarding
   interfaces as indicated by the packets' destination addresses.

6) Other routers along the distribution tree fanout may or may not run
   network traffic consolidation optimization. If a router executing
   the dynamic multicast process determines from its control data that
   a router it should forward to does not run dynamic multicast, it
   compensates by forwarding copies of response packets individually,
   in unicast fashion, to the forwarding addresses of every requesting
   node in the subtree rooted at the non-participating router. Thus
   packet fanout is completed all at once for that subtree of the
   distribution tree, with the dynamic multicast enabled router doing
   all the work.
7) The distribution tree contained in each packet has data that
   indicates which nodes in the distribution tree are requesters
   themselves as well as serving as part of the path to other
   destination addresses. Whenever a destination node is detected by
   the local dynamic multicast process, that process gives a copy of
   the packet to the LAN of the recipient. Leaf nodes are implicit
   destinations.

The packets now have Internet addresses in their destination fields, so
physical address resolution can take place without a new protocol being
developed. However, the distribution tree is still included in each
packet that arrives at its destination station and must be removed by a
component of the transport process that runs on the client's host.

The above description of how dynamic multicast works involves almost no
discussion of the transport layer. A comprehensive discussion of the
requisite interface to the transport layer is beyond the scope of this
document.

V Advantages Of Network Traffic Consolidation

Network traffic consolidation offers these advantages.

1) The approach to optimization taken by network traffic consolidation
   does not duplicate the progress being made by ideas such as multiple
   asynchronous requests per TCP connection being explored in HTTP-NG
   development. This is because network traffic consolidation optimizes
   horizontally, across a group of like-intentioned clients and a
   server, instead of vertically optimizing the sequence of steps taken
   by just one client and its server. The two approaches find and
   eliminate different forms of inefficiency. Yet, because a reduced
   request is indistinguishable from any other TCP request to the
   transport, network traffic consolidation does not conflict with
   HTTP-NG or HTTP.
2) Although HTTP protocol data, both control and user data, can be
   served very well by network traffic consolidation, this technology
   is future-safe in the sense that it is general enough to process any
   highly redundant fielded data records regardless of format.

3) Network traffic consolidation addresses the bottleneck at the point
   where busy servers, be they primary or proxy servers, transfer data
   from host memory to network media. This transfer consumes
   significant CPU resources on popular servers that regularly have
   dozens of clients simultaneously requesting the same few high-level
   HTML files. This problem is not addressed by any other protocol
   concept save for the multicast communication found in the MBONE.

4) The multicast mode of transmission used by the server-to-client
   phase of network traffic consolidation preserves network bandwidth
   compared to the current unicast method of serving clients.

5) Network traffic consolidation reduces the number of software
   interrupts received by network hosts for a given rate of client
   requests.

6) Network traffic consolidation scales exceptionally well. The worst
   area of Web site overload involves accessing the first few levels of
   HTML document files. There is more redundant data access here than
   anywhere else. Because of the hierarchical structure of a Web site,
   nearly everyone enters from a common top page, and there is a
   slow-moving concentration of traffic at levels near the top page
   that gradually works downward. In network traffic consolidation,
   because every like-intentioned request in the same small time window
   is consolidated into one request, the greatest improvement over the
   conventional one-request one-response mode of service is seen during
   heavy load.

7) As present Web usage trends become more commercial and traffic
   volume continues to wind skyward, clients and servers will operate
   in environments that absolutely require groups of same-file requests
   to be handled as one.
Security Considerations

This Internet-Draft raises no security issues.

VI References

S. Deering, "Host Extensions for IP Multicasting", STD 5, RFC 1112,
Stanford University, August 1989.

J. Moy, "Multicast Extensions to OSPF", RFC 1584, Proteon, Inc.,
March 1994.

J. Moy, "MOSPF: Analysis and Experience", RFC 1585, Proteon, Inc.,
March 1994.

T. Berners-Lee, R. Fielding, H. Frystyk, "Hypertext Transfer Protocol -
HTTP/1.0", RFC 1945, MIT/LCS, UC Irvine, DEC, May 1996.

R. Fielding, J. Gettys, J. Mogul, H. Frystyk, T. Berners-Lee,
"Hypertext Transfer Protocol - HTTP/1.1", RFC 2068, January 1997.

Simon E. Spero, "Analysis of HTTP Performance Problems",
http://sunsite.unc.edu/mdma-release/http-prob.html

Simon E. Spero, "HTTP-NG Architectural Overview",
http://www.w3.org/pub/WWW/Protocols/HTTP-NG/http-ng-arch.html

Simon E. Spero, "Session Control Protocol",
http://www.w3.org/pub/WWW/Protocols/HTTP-NG/http-ng-scp.html

Author's Address

Joseph Mansigian
155 Marlin Rd.
New Britain, CT 06053

Phone: (860) 223-5869
EMail: jman@connix.com