Networking Working Group Zheng Wang Internet Draft Bell Labs Lucent Technologies Expiration: May 1998 Nov 1997 User-Share Differentiation (USD) Scalable bandwidth allocation for differentiated services Status of this Memo This document is an Internet Draft. Internet Drafts are working documents of the Internet Engineering Task Force (IETF), its Areas, and its Working Groups. Note that other groups may also distribute working documents as Internet Drafts. Internet Drafts are draft documents valid for a maximum of six months. Internet Drafts may be updated, replaced, or obsoleted by other documents at any time. It is not appropriate to use Internet Drafts as reference material or to cite them other than as a "working draft" or "work in progress." Please check the 1id-abstracts.txt listing contained in the internet-drafts Shadow Directories on nic.ddn.mil, nnsc.nsf.net, nic.nordu.net, ftp.nisc.sri.com, or munnari.oz.au to learn the current status of any Internet Draft. Table Of Contents 1. Introduction ........................................ 2 2. Objective ........................................... 2 3. Related Proposals ................................... 3 4. User-Share Differentiation (USD) .................... 4 4.1. Overview .......................................... 5 4.2. Per-User Traffic Isolation ........................ 5 4.3. Flexible Aggregation .............................. 6 4.4. Bottleneck Sharing ................................ 6 4.5. Application Marking ............................... 7 4.6. Implementation and Deployment ..................... 8 5. Standardization Issues .............................. 8 6. Acknowledgments ..................................... 9 7. References .......................................... 9 Wang Expiration: May 1998 [Page 1] Internet Draft User-Share Differentiation Nov 1997 1. Introduction In this memo, we present a new differentiated service scheme called ''User-Share Differentiation (USD)''. The USD scheme is based on the principle of link sharing. The scheme allows ISPs to differentiate traffic flows on a per-user basis, providing minimum bandwidth guar- antee and share-based bandwidth sharing. We first look at the objec- tives of differentiated services, and the problems with the current proposals, then present the details of the USD scheme. Finally we examine the implementation and standardization issues 2. Objective The current Internet is built on the best-effort model where all packets are treated as independent datagrams and are serviced on a FIFO basis. The best effort model does not provide any form of traf- fic isolation inside the network and the sharing of the resources essentially depends on how much users ask for. Therefore the cooper- ation of the end systems (such as TCP congestion control mechanisms) is vital to make the system work. In today's Internet, however, such dependence on the end systems' cooperation is increasingly becoming unrealistic. Inevitably, some people start to exploit the weakness of the current best effort model to gain more resource (e.g., Netscape browsers allow users to open multiple TCP connections). Another prob- lem with the best effort model is that it is difficult to make any guarantee, either explicit (e.g., x bps bandwidth) or relative (e.g., a service package 5 times better than another package) as all traffic is aggregated into a single flow. The problems with the best effort model have been long recognized. For the last a couple of years, QoS provision has been one of the hottest areas in networking research, and various aspects of the issue have been extensively studied including traffic analysis, admission control, resource reservation, scheduling, QoS routing, and operating system support. The architectures of various proposed solutions differ in details. Nevertheless, the underlying model is similar. Generally, applications make resource reservation on an end-to-end session basis. In the internet community, RSVP and INT- SERV are examples of protocols and service models based on this end- to-end approach. However, research and experience have shown a number of difficulties in deploying end-to-end reservation in the global Internet. The lacking of accounting support, the high administrative overheads, and the complexity of inter-ISP settlement make it diffi- cult for end-to-end reservation to be widely deployed. There are also scalability concerns for fine granularity classification and per- session signaling. Wang Expiration: May 1998 [Page 2] Internet Draft User-Share Differentiation Nov 1997 The best effort model and the end-to-end model represent the two extremes of the spectrum. The best effort model does no traffic iso- lation at all whilst the end-to-end model provides isolation on the finest granularity (an end-to-end session identified by the 5-tuple). The solution may, therefore, lie somewhere in between the two extremes. For differentiated services, the general approach is to deal with the problem of QoS provision through long-term contract- based service allocation rather than per-session reservation. The key to achieve that is to choose a proper level of granularity for traf- fic isolation that satisfies users' and ISPs' needs, and is also scalable in a global Internet. We also need a solution that can offer immediate help to ISPs, and impose minimum conditions on ISPs' busi- ness models and network infrastructures, yet is flexible to allow further development as the Internet evolves. 3. Related Proposals A number of proposals that have been put forward for differentiated services fall into two groups: drop priority based and delay priority based. In this section, we discuss some of the problems with the related proposals. Profile-Based Tagging The profile-based tagging scheme uses drop priority in conjunction with an admission control mechanism. In the scheme, a user and its ISP agree on a profile, and the traffic flowing into the ISP is checked at the entry points [2]. When the traffic exceeds the pro- file, the packets are let through but tagged. When there is conges- tion inside the network, the routers start to discard tagged packets first. One issue is how to determine a profile for a user. When a network does not have uniform bandwidth provisioning, profiles are likely to be destination-specific. For example, suppose that an ISP has a link to a neighbor ISP with a capacity 600 Mbps and a link to the Internet backbone with a capacity of 100 Mbps. If a user is communicating with another user in the neighbor ISP, the user can have a rate limit of 6 Mbps but only 1 Mbps if the user's traffic goes through the backbone access link. In this case, a profile of either 6 Mbps or 1 Mbps is not appropriate for the user. When an ISP have multiple bot- tleneck links with different bandwidth provision, it is difficult to choose a fixed profile for a user. Another problem is to do with reserve traffic, i.e., the traffic that comes from the server towards the user, as it is the case in most web-based traffic. With reverse traffic, tagging packets at the Wang Expiration: May 1998 [Page 3] Internet Draft User-Share Differentiation Nov 1997 user-ISP boundary has little effects. The profile-based tagging only provides limited protection against mis-behaving source. Since profile-based tagging essentially creates two classes: tagged and un-tagged, the network deals with the tagged packets in a FIFO fashion. A misbehaving source can still gain more bandwidth by injecting excessive traffic. The problem can be aggra- vated when the fixed profiles are significantly over (or below) the level appropriate for the congested links. In such a case, the major- ity of the packets are not tagged (or tagged). Thus the tagging pro- vides little information to enforce differentiation, and network behavior will be close to simple FIFO best effort. Delay Priority Differentiation In a delay priority based scheme, users are classified into two or more classes 1, 3]. When the traffic enters the network, the packets will be set with appropriate delay priority. Under congestion, the packets with higher priority are transmitted prior to the ones with lower priority. Delay priority schemes represent an extreme form of resource alloca- tion, where lower priority classes suffer all the consequences of congestion. Unless the priority traffic only accounts for a very small percentage of the link capacity, lower class traffic may expe- rience significant deterioration under congestion, and may be com- pletely starved from any bandwidth. In fact, TCP has the tendency to grab as much bandwidth as it can. Because the congestion is invisible to the upper class, the network will no longer to provide any conges- tion signal for the upper class traffic. The TCP windows in upper class flows may grow to take up all bandwidth and starve the lower class traffic completely. Delay priority schemes provide isolation between different classes. Within a single class, however, there is no traffic isolation and therefore there is no protection against misbehaving users. Like the profile-based tagging, delay priority schemes also have problems with reverse traffic where setting priority has to be done close to the sources rather than the users. 4. User-Share Differentiation (USD) In this section, we present User-Share Differentiation (USD), a scal- able bandwidth allocation scheme for differentiated services, and discuss the key design principles behind the scheme. Wang Expiration: May 1998 [Page 4] Internet Draft User-Share Differentiation Nov 1997 4.1. Overview In USD, we treat the problem of service allocation as a form of resource sharing: an ISP as a pool of bandwidth resources shared by multiple users. The sharing of the bandwidth resource is controlled with two parameters, user and share. A user is the entity to which the resource is allocated and the share quantifies the amount of resource allocated to a user. At the time when a user subscribes to its ISP for Internet services, the ISP determines the share for the user based on what the user has asked for or the package the user selects. The USD scheme can guar- antee that: 1) The user will have a minimum mount of bandwidth on all or some of the links in the ISP 2) For the bandwidth over the minimum amount, the user shares the bandwidth with other users in proportion to its allocated share For example, suppose that user A and B ask for 2 Mbps and 10 Mbps mini- mum bandwidth respectively, and the ISP allocates share of 1000 and 5000 to the two users in return. The USD scheme guarantees that user A and B have 2 Mbps and 10 Mbps minimum bandwidth, and for bandwidth over the minimum, the allocation to user A and B follows the ratio of 1:5. We now discuss the key design decisions behind the USD scheme. 4.2. Per-User Traffic Isolation As we discussed early, the objective of the differentiated services is to find a granularity that meets users' and ISP's needs and is also scalable. When there is contention of resources, we must deter- mine a rule for allocating limited resources to multiple users. In a commercial Internet, we believe that the rule has to be based on the contracts between the ISP and its users. In USD, we choose "user" as the basic working unit that defines the granularity. A user is the entity with whom the service contract is signed, and it can be a dial-up customer, or one corporate network or a group of individual customers or networks. In USD, all traffic originated from or destined to a user is aggregated into a single flow. The per-user granularity within an ISP strikes a good balance between aggregation and isolation. It provides real traffic isolation between different users yet it does not have to deal with individual Wang Expiration: May 1998 [Page 5] Internet Draft User-Share Differentiation Nov 1997 sessions/flows within the per-user aggregated flow. Thus, the alloca- tion can be done at the subscription time and on a long-term basis. The USD scheme creates a hierarchical management structure. Inside an ISP's network, the traffic which belongs to a user is guaranteed by the ISP according to the user-ISP contract. Within its share of the bandwidth, the user manages its own sessions/flows and decides how the resources should be utilized. Per-user traffic isolation provides full protection against mis- behaving users. If a mis-behaving user ignores the congestion signal, and continues to send traffic at high rates, it can only hurt itself. It gives a strong incentive for users to deploy intelligent control congestion mechanisms and make the best use of its allocated share of bandwidth. 4.3. Flexible Aggregation As the definition of a user changes when traffic goes across ISP boundaries, the level of traffic aggregation also changes accord- ingly. For example, dial-up customers typically are users of a retail ISP, and the retail ISP in turn becomes a user of a backbone ISP. Within the retail ISP, traffic from or to a dial-up user is aggre- gated whilst in the backbone only the retail ISP is visible. Such variable level of aggregation ensures a great deal of scalability. In general, when packets are close to the source ISP, the sender's allo- cation has most influence and when the packets move close to the des- tination, the receiver's allocation becomes more important. Incorporating the concept of user into traffic isolation provides a very flexible tool for ISPs to choose the level of traffic aggrega- tion they want. Note that a user can represent one dial-up customer, or one corporate network or a group of individual customers or net- works. While the maximum granularity is per customer, ISPs can achieve different levels of aggregation by creating user classes. For example, an ISP may create 3 user classes: premium users, basic users and best-effort users. The USD scheme will allow the ISP to guarantee bandwidth allocation to the three classes. 4.4. Bottleneck Sharing As we discussed in the previous section, admission control at the user-ISP boundaries does not work very well since a fixed profile can not cater for multiple bottlenecks with different sizes. Essentially, each user needs to have multiple profiles for different bottlenecks. For this reason, we believe that bandwidth allocation should be done Wang Expiration: May 1998 [Page 6] Internet Draft User-Share Differentiation Nov 1997 at the bottleneck links instead of the edges. In USD, we use share- based allocation scheme that works both for minimum bandwidth alloca- tion and proportional bandwidth sharing. To allocate bandwidth of a single link to multiple users, we can describe the allocation with actual bandwidth or as relative sharing. For example, suppose that an ISP has 4 users A, B, C and D, sharing an access link of 30 Mbps. The agreed allocation is 4 Mbps, 6 Mbps, 8 Mbps and 12 Mbps respectively. This allocation can be described with the actual bandwidth, 4 Mbps, 6 Mbps, 8 Mbps and 12 Mbps. Alternatively, the allocation can also be expressed with relative sharing, 2:3:4:6. When the bandwidth is allocated in proportion to the relative sharing, the two representation leads to the same band- width allocation. However, the relative sharing representation has a number of advan- tage. First of all, it can guarantee the same minimum bandwidth allocation as an explicit allocation does. In addition to that, it allows the bandwidth above the minimum to be shared in proportion to the minimum allocation. For example, suppose that user A and B in the previous example are not using their allocated bandwidth during a period. Now user C and D can share the extra bandwidth in proportion to their relative ratio. The final allocation to user C and D becomes 12 Mbps and 18 Mbps. Most importantly, the relative sharing representation allows proportional allocation on multiple bottle- necks. More importantly, the relative sharing representation works works well with multiple bottlenecks with different bandwidth provision. For example, suppose the ISP of the 4 users has another link with 600 Mbps bandwidth. the 4 users who have shares of 2, 3, 4, and 6 will have minimum guaranteed bandwidth automatically scaled up to 80 Mbps, 120 Mbps, 160 Mbps and 240 Mbps respectively. The relative sharing can be viewed as a flexible profile as it scales up and down according to the bandwidth available whilst guaranteeing the minimum bandwidth. In practice, the unit of share can be defined as certain bandwidth so that the share for a user can be easily derived from the minimum bandwidth allocated. For example, if we define the unit of share is 1 Kbps, a user with 4 Mbps minimum band- width has a share of 4000. 4.5. Application Marking As we discussed before, USD provides traffic isolation on a per-user basis, and does not manage flows/sessions within the per-user aggre- gated flow. However, a user or an application may indicate Wang Expiration: May 1998 [Page 7] Internet Draft User-Share Differentiation Nov 1997 preferences or the nature of the packets through application marking. For example, a user may prioritize its traffic or an application may mark some packets as important thus should be last dropped. It is important to note that the use of packet marking here is very different from that in profile-based tagging and priority-based schemes, where packet marking is used for admission control and traf- fic isolation. The in/out profile bit or the delay class are primar- ily used to enforce the contractual agreement. When packets enter a new ISP, packets may be re-tagged or re-classified based on the new contractual agreement. In application marking, the drop priority or the delay priority reflect only users' preferences and application characteristics and allow the network make application-friendly actions. How the packets are marked are entirely decided by the users and the applications, and the marking is meaningful only among the flows of the same user. Traffic isolation and application marking can be viewed as two levels of traffic management in USD. Traffic isolation guarantees bandwidth sharing among different users, and the application marking allows a user to manage its own traffic flows. 4.6. Implementation and Deployment To support USD, routers need to implement a scheduler that is capable of enforcing proportional bandwidth sharing. There is a wide range of scheduling algorithms that can provide such functions. Examples of such schedulers include Class-Based Queuing (CBQ) [4], a scheduling algorithm originally designed for hierarchical link sharing, Weighted Fair Queuing (WFQ), one of the most studied scheduling algorithms in recent years [9]. Although the original WFQ is expensive to imple- ment, several variations of WFQ have been proposed that support band- width sharing in similar fashion but are optimized for software and hardware implementation [5, 6, 7, 8]. Some of the algorithms can emu- late WFQ closely with O(1) complexity [8]. A complete analysis of those algorithms and buffer management can be found in [5, 10]. USD enforces bandwidth sharing locally on the bottleneck links. Thus it does not require any changes to the end systems and any admission control at the user-ISP boundaries. Consequently, USD can be deployed an incremental fashion. In fact, routers can be upgraded to support USD individually and each upgrade gives incremental improve- ment to the whole network. For example, when USD is installed on the router connected to the access link to the backbone, bandwidth allo- cation is enforced immediately for all traffic that going through the access link. Wang Expiration: May 1998 [Page 8] Internet Draft User-Share Differentiation Nov 1997 Moreover, USD only needs to be deployed at the points in the network that are heavily congested. Once bandwidth sharing is enforced at those points, other links may not require further policing. 5. Standardization Issues Very little of USD needs to be standardized. One addition to the cur- rent system is the distribution of user IDs and its associated shares to the routers. This can be achieved through network management such as SNMP. Therefore, we need to agree on a common MIB for network man- agement systems to install user-specific shares on the routers. 6. Acknowledgments I would like to thank G. Armitage, T. V. Lakshman, S. Mithal, D. Stiliadis, B. Suter for their feedback and discussion. 7. References [1] K. Kilkki, "Simple Integrated Media Access," Internet Draft, June 1997, [2] D. Clark, J. Wroclawski, "An Approach to Service Allocation in the Internet," Internet Draft, July 1997, [3] P. Ferguson, "Simple Differential Services: IP TOS and Precedence, Delay Indication, and Drop Preference," Internet Draft, Nov 1997, [4] S. Floyd, and V. Jacobson, "Link-sharing and Resource Management Models for Packet Networks," IEEE/ACM Transactions on Networking, Vol. 3 No. 4, August 1995. [5] D. Stiliadis, "Traffic Scheduling in Packet Switched Networks: Analysis, Design and Implementation," Ph.D. Thesis, UC Santa Cruz, June 1996, [6] S. Golestani, "A self-clocked fair queuing scheme for broadband applications," IEEE INFOCOM'94, pp. 636-646, April, 1994. [7] J. Bennett and H. Zhang, "WF2Q: Worst-acse fair weighted fair Wang Expiration: May 1998 [Page 9] Internet Draft User-Share Differentiation Nov 1997 queuing," IEEE INFOCOM'96, March 1996. [8] M. Shreedhar and G. Varghese, "Efficient Fair Queueing using Deficit Round Robin," ACM SIGCOMM'95, Sept 1995. [9] A. K. Parekh, R. G. Gallager, "A generalized processor sharing apporach to flow control - the single node case", IEEE INFOCOM'92, Vol 2, pp 915-924, May 1992. [10] B. Suter, T. V. Lakshman, D. Stiliadis, A. Choudhury, "Efficient Active Queue Management for Internet Routers", submitted to INTEROP, November 1997 Authors' Address: Zheng Wang Bell Labs Lucent Technologies 101 Crawfords Corner Road Holmdel, NJ 07733 Email: zhwang@dnrc.bell-labs.com Wang Expiration: May 1998 [Page 10]