Internet Engineering Task Force                               Haitao Wu 
Internet Draft                                              Keping Long 
Expires: February 2001                                    Shiduan Cheng 
                                         Beijing Univ. of Posts & Tele. 
                                                                Jian Ma 
                                                 Nokia China R&D Center 
                                                            August 2000 
 
    
    
                 A Direct Congestion Control Scheme for  
          Non-responsive Flow Control in Diff-Serv IP Networks 
                <draft-wuht-diffserv-dccs-00.txt, .pdf> 
 
Status of Memo 
    
   This document is an Internet-Draft and is in full conformance with 
   all provisions of Section 10 of RFC2026. 
    
   Internet-Drafts are working documents of the Internet Engineering 
   Task Force (IETF), its areas, and its working groups. Note that other 
   groups may also distribute working documents as Internet-Drafts.  
    
   Internet-Drafts are draft documents valid for a maximum of six months 
   and may be updated, replaced, or obsoleted by other documents at any 
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."  
    
   The list of current Internet-Drafts can be accessed at 
   http://www.ietf.org/ietf/1id-abstracts.txt 
    
   The list of Internet-Draft Shadow Directories can be accessed at 
   http://www.ietf.org/shadow.html. 
    
   For potential updates to the above required-text see: 
   http://www.ietf.org/ietf/1id-guidelines.txt 
    
    
Abstract 
    
   This draft considers the potentially negative impacts of the
   increasing deployment of non-congestion-controlled, or
   non-responsive, traffic on the Internet. Unresponsive traffic can
   cause extreme unfairness toward responsive TCP traffic and severe
   service degradation. Differentiated Services (DS) [5,6], recently
   proposed by the IETF, aims to provide scalable service
   differentiation in the Internet that can be used for differentiated
   payment. We argue that this could increase customers' incentive to
   use unresponsive flows in order to obtain better service assurance
   against competing traffic.
    
   We argue that the responsiveness of a flow should be identified not
   by its transport protocol, but by its reaction to congestion in the
   network. However, the Traffic Conditioner (TC) at the boundary has
   no knowledge of the dynamic
  
Wu, Long, Cheng, Ma     Expires: February 2001                [Page 1] 

Draft-wuht-diffserv-dccs-00                                August 2000 
 
 
   traffic conditions in the DS network. To remedy these problems, we
   propose a general direct congestion control scheme that regulates
   the traffic conditioner at the boundary of a DS domain, using
   congestion information generated at the interior nodes, in order to
   provide fairness between responsive and unresponsive traffic. This
   mechanism gives a network provider stronger control over the
   traffic entering the DS domain. As a result, better resource
   utilization and fairer resource sharing between different traffic
   types can be achieved.
    
   The pdf version of this document is available at: 
   http://wuht.topcool.net/publications.htm 
    
    
1. Introduction
    
   In the traditional IP network model, all user packets compete
   equally for network resources, and no Quality of Service (QoS)
   guarantee can be given. With the development of new Internet
   applications such as voice, video, and the WWW, the demand for QoS
   is growing steadily.
    
   The Differentiated Services architecture, recently proposed by the
   IETF, provides a scalable means of delivering IP QoS based on the
   handling of traffic aggregates. Traffic classification state is
   conveyed by means of IP-layer packet marking using the DS field.
   Packets are classified and marked to receive a particular per-hop
   behavior (PHB) on nodes along their path. Sophisticated
   classification and traffic conditioning, including marking,
   policing, and shaping operations, need only be implemented at
   network boundaries or hosts. Within the DS domain, core routers
   forward packets according to the DSCP value in the packet header. A
   detailed description of Diff-serv is given in the architecture [6]
   and DS field [5] documents.
    
   Compared to Int-serv, Diff-serv is more scalable in terms of
   implementation. This is achieved by handling aggregated traffic
   using a small number of PHBs within the core network rather than
   handling traffic on a per-flow basis. However, many recent studies
   have shown that great unfairness still exists between responsive
   and unresponsive traffic flows in a Diff-serv network
   [2,3,4,10,11].
    
   To remedy this problem, we introduce a general direct congestion
   control scheme that regulates the traffic conditioner at the edge of
   a DS domain in order to provide fairness between responsive and
   unresponsive traffic flows. In addition, this scheme follows the
   original idea of the Differentiated Services network, i.e., keep the
   operation of the core routers simple and make the edge routers
   smart enough to control the traffic entering the DS domain.
    
    
2. Motivation and Related Works 
    

  
   To date, the IETF has defined only two sets of PHBs [7,8]: the
   Expedited Forwarding (EF) PHB and the Assured Forwarding (AF) PHBs.
   Since EF builds a low-loss, low-latency, low-jitter,
   assured-bandwidth, end-to-end service through DS domains, it needs
   strict policing and shaping. Therefore, one EF customer cannot
   affect another much, as long as there are enough resources in the
   DS network to support all EF requirements.
    
   The definition of AF is much looser; in fact, there are no
   quantitative requirements for the AF PHBs. There are 4 independent
   AF classes, with 3 drop precedence levels in each class. An IP
   packet that belongs to AF class x and has drop precedence y is
   marked with the AF codepoint AFxy. This draft uses DP0 to denote the
   drop precedence with the lowest drop probability and DP2 to denote
   the drop precedence with the highest drop probability within an AF
   class.
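
   As an illustration of the AFxy notation, the following Python
   sketch computes the codepoint recommended in RFC 2597 [7] for an AF
   class and a DP index. It assumes that DP0, DP1, and DP2 of this
   draft correspond to the low, medium, and high drop precedence (1,
   2, 3) of RFC 2597; the function name af_dscp is hypothetical.

```python
def af_dscp(af_class: int, dp: int) -> int:
    """Return the recommended DSCP for AF class 1..4 and DP index 0..2.

    Assumption: DP0/DP1/DP2 of this draft map to the low/medium/high
    drop precedence (1/2/3) of RFC 2597.
    """
    if af_class not in (1, 2, 3, 4):
        raise ValueError("AF class must be 1..4")
    if dp not in (0, 1, 2):
        raise ValueError("DP index must be 0..2")
    # RFC 2597 codepoint layout: three class bits, two drop precedence
    # bits, and a trailing zero bit ('cccdd0').
    return (af_class << 3) | ((dp + 1) << 1)
```

   For example, af_dscp(1, 0) yields 10 (binary 001010), the
   recommended codepoint for the lowest drop precedence of AF class 1.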
    
   Recent studies have shown that, under various conditions, existing
   Diff-serv mechanisms may suffer from unfairness and inefficient
   resource utilization, thereby failing to achieve the desired QoS
   [4,11,12]. Much work has been done to alleviate the unfairness
   between responsive TCP and non-responsive UDP by mapping the
   in-profile and out-of-profile packets of TCP and UDP to different
   drop precedences in an AF class.
    
   In [11], Seddigh points out that in an over-provisioned network the
   share of excess bandwidth depends on the mapping of out-of-profile
   packets, while in an under-provisioned network fair degradation for
   TCP and UDP cannot be achieved with different drop precedences.
   Goyal [4] argues that a fair allocation of excess network bandwidth
   between congestion-sensitive and congestion-insensitive flows can be
   achieved if their packets are 'colored' differently, but that if the
   network operates close to its capacity, even three drop precedences
   (colors) cannot achieve fairness.
    
   Seddigh [11] also suggests that TCP and UDP may coexist fairly if
   they are put into separate queues or AF classes. We argue, however,
   that since a core router in a DS domain does not know the reserved
   bandwidth of the current TCP and UDP flows, and since traffic is
   dynamic in nature, the core router cannot decide how to allocate
   link bandwidth between them. Besides, there is no flow isolation
   inside the core of the DS network: even if a core router knows the
   reserved bandwidth of the aggregates, it cannot judge how much
   resource an individual flow of an aggregate should receive.
   Although some form of call admission control (CAC) may help
   alleviate the problem, we argue that CAC is necessary but not
   sufficient. Since the problem stems from the dynamic nature of
   network load and network capacity, and from the different reactions
   of transport protocols to congestion (which is indicated by packet
   loss), only a dynamic control mechanism at the DS boundary can
   solve it at the root.
    
   Chow [12] points out that these problems arise because there is no
   dynamic control at the Diff-serv boundary and the network relies on
   the transport protocols to react. He also proposes a framework
   similar to the Resource Management cells in ATM networks, in which
   boundary routers periodically obtain information from the core of
   the network and update their TC using that information. The main
   drawback of this scheme is that the core routers need to maintain
   all the state information and the boundary routers must send probe
   traffic periodically. Nevertheless, it is a significant step
   forward, since the boundary routers can adjust their TC according to
   the traffic dynamics in the DS network.
    
   We argue that the responsiveness of a flow cannot be identified by
   its transport protocol, but only by its behavior, i.e., its reaction
   to congestion in the DS network. Therefore, the boundary of the DS
   domain needs additional dynamic information from the interior of
   the network, reporting the behavior of unresponsive or
   non-TCP-friendly flows, in order to regulate its TC. To remedy
   these problems, we propose a direct congestion control scheme for
   controlling non-responsive or non-TCP-friendly flows in a DS
   network, so as to achieve fairness between responsive and
   non-responsive flows.
    
    
3. Direct Congestion Control Scheme (DCCS) 
    
   In this draft, a core router means an interior node of a DS domain,
   which forwards packets so as to implement a particular PHB
   according to the DSCP in the packet header. An edge router means an
   ingress/egress node of a DS domain, which performs TC functions on
   the traffic entering the DS domain. An edge router could be a host,
   provided that it can perform the TC functions for its own traffic
   entering the DS domain and is connected directly to a core router.
    
   We propose a general direct congestion control scheme to overcome
   the issues described in the preceding sections. The basic concept of
   DCCS is that when packets of some PHB experience congestion, core
   routers generate congestion control messages and send them directly
   to the edge routers, and the edge routers adaptively regulate the
   Traffic Conditioner (TC) of the corresponding aggregates or flows
   according to the control messages received. When congestion occurs
   at an edge router, the router can adjust its TC directly, and no
   control messages are generated.
    
   This mechanism conforms to the essential idea of Diff-serv, that is,
   to push the complex conditioning functions to the edge of the
   network and to have the core routers simply forward packets
   according to the DSCP marked by the edge routers in order to
   implement a particular PHB.
    
   The following figure shows the prototype of DCCS. 
    
    
   Figure 1. Prototype of Direct Congestion Control Scheme (DCCS) 
    
    
3.1 Core router requirements 
    
   In addition to the basic packet forwarding function, a core router
   is extended with a load monitor function. This does not add much
   overhead, since most core routers will use RIO [2,3] or Multi-RED
   [11] to implement the different drop precedences according to
   current recommendations. In addition, RED [1] is the active queue
   management mechanism recommended by the IETF, and most routers
   currently in use already implement it.
    
   According to the IETF RFCs, different independent PHBs will likely
   be implemented with different queues, while an AF class with
   multiple drop precedences will be implemented with a single FIFO
   queue with RIO or Multi-RED enabled. This simplifies our scheme,
   since RED is sensitive to incipient congestion and marks packets
   with a probability that depends on the current average queue
   length. Therefore, the core routers can send congestion control
   messages according to the current load.
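
   The load monitor described above can be sketched in Python as
   follows. This is a simplified illustration of the RED [1] average
   queue computation, not a complete RED, RIO, or Multi-RED
   implementation: it omits the idle-time handling and the
   packet-count correction of RED, and the class name RedMonitor and
   the default weight are assumptions made for this example.

```python
class RedMonitor:
    """Minimal RED-style load monitor for one queue (illustrative only)."""

    def __init__(self, min_th: float, max_th: float, max_p: float,
                 weight: float = 0.002) -> None:
        if not 0 <= min_th < max_th:
            raise ValueError("need 0 <= min_th < max_th")
        self.min_th = min_th    # below this average, nothing is marked
        self.max_th = max_th    # at or above this average, all is marked
        self.max_p = max_p      # marking probability just below max_th
        self.weight = weight    # EWMA weight (assumed default)
        self.avg = 0.0          # average queue length

    def update(self, queue_len: int) -> float:
        """Fold the instantaneous queue length into the moving average."""
        self.avg = (1.0 - self.weight) * self.avg + self.weight * queue_len
        return self.avg

    def mark_probability(self) -> float:
        """Base probability: 0 below min_th, linear up to max_p, then 1."""
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        span = self.max_th - self.min_th
        return self.max_p * (self.avg - self.min_th) / span
```

   A core router could call update() on every packet arrival and use
   the average, or the resulting probability, as its measure of
   current load when deciding whether to send a control message.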
    
   When a core router is congested for packets of the lowest drop
   precedence, e.g., AFx0, it should generate a congestion control
   message for an aggregate or for individual flows, depending on the
   granularity of the implementation, and send the message to the edge
   router that performs the TC function for the corresponding
   aggregate or flows. Upon receiving such a control packet, other
   core routers should forward it to the ingress edge router as a
   network control message. It may be argued that this mechanism adds
   overhead to the core routers in a DS domain. However, since the
   message is generated only during congestion and directly and
   adaptively controls the TC at the boundary routers, we believe that
   it alleviates congestion effectively. Considering this
   effectiveness, the cost is worthwhile, and the granularity of the
   implementation can be adjusted to reduce the overhead.
    
    
3.2 Edge router requirements 
    
   A boundary router generally performs TC functions to ensure that
   the traffic entering a DS domain conforms to the rules specified in
   the TCS, in accordance with the domain's service provisioning
   policy. In DCCS, we propose to make these TC functions adapt
   dynamically to the control messages from the core routers. The
   general rules of the Dynamic TC (DTC) are:

   (1) Under normal conditions, the DTC performs the TC functions
   normally.

   (2) When it receives a congestion control message from a core
   router, it should dynamically adjust the TC functions of the
   corresponding aggregates or flows.

   (3) If it uses a mechanism that can distinguish the micro-flows in
   an aggregate and treat them fairly, it can regulate an individual
   micro-flow directly rather than the whole aggregate.

   (4) An edge router should recognize a control message whose
   destination is a host or network that accesses the DS domain
   through this router, and should terminate the message.
  
    
3.3 Congestion control message creation and handling  
    
   A congestion control message should be distinguishable from other
   packets in a DS network. Several alternatives exist for making such
   a control packet easy to identify:
    
   (1) Using a special DSCP, in which control information is carried in 
   the data field of the packet; 
    
   (2) Using a new IP option, in which a special extension is defined 
   for control message; 
    
   (3) Using an ICMP packet containing the control information; 
    
   (4) Using a bit in the IP header to indicate that the packet is a
   congestion control message.
    
   If a control message traverses from one DS domain to another,
   further negotiation is needed to ensure that the control message is
   effective across the two DS domains. If a DS domain does not want
   to cooperate with another domain on such control messages, it can
   simply drop them at the edge. The other domain can then adjust its
   own TC without affecting other domains.

   We strongly recommend that the control message be given a special
   DSCP when this scheme is implemented in a DS domain. A core router
   can then distinguish such a message more clearly and forward it as
   a network control message, and an edge router can identify it more
   easily.
    
3.4 Congestion control message 
    
   Various fields can be carried in such a control message. It should
   include the following fields:

   (1) Version: since there is not yet a single consistent
   implementation, an identifier is needed to indicate which
   implementation is in use;

   (2) PHB in congestion: indicates which PHB's packets are currently
   experiencing congestion;

   (3) TS: a timestamp or a sequence number, used to identify the
   control messages from a particular core router when multiple core
   routers are congested simultaneously;

   (4) Granularity: either aggregate or flow;

   (5) Other identifiers: the content of this field is determined by
   the granularity;
    

  
   (6) Control power: indicates to what extent the DTC should adjust
   its functions. Alternatively, the core router could send its
   current load information as the control power.
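
   The fields listed above could be encoded as in the following Python
   sketch. This draft does not define field sizes or a wire format, so
   the layout string _FMT, the field widths, the granularity codes,
   and the use of an IPv4 source/destination/protocol triple as the
   "other identifiers" are all assumptions made for illustration.

```python
import socket
import struct
from dataclasses import dataclass

# Hypothetical layout in network byte order: version (1 byte), PHB in
# congestion as a DSCP (1), TS/SN (4), granularity (1), source IPv4 (4),
# destination IPv4 (4), transport protocol (1), control power (2).
_FMT = "!BBIB4s4sBH"

GRAN_AGGREGATE = 0  # assumed code for aggregate granularity
GRAN_FLOW = 1       # assumed code for flow granularity


@dataclass(frozen=True)
class ControlMessage:
    version: int
    phb_dscp: int
    ts: int
    granularity: int
    src_ip: str
    dst_ip: str
    protocol: int
    control_power: int

    def pack(self) -> bytes:
        """Serialize the message into the assumed 18-byte layout."""
        return struct.pack(_FMT, self.version, self.phb_dscp, self.ts,
                           self.granularity,
                           socket.inet_aton(self.src_ip),
                           socket.inet_aton(self.dst_ip),
                           self.protocol, self.control_power)

    @classmethod
    def unpack(cls, data: bytes) -> "ControlMessage":
        """Parse a message produced by pack()."""
        v, phb, ts, gran, src, dst, proto, power = struct.unpack(_FMT, data)
        return cls(v, phb, ts, gran, socket.inet_ntoa(src),
                   socket.inet_ntoa(dst), proto, power)
```

   A fixed-size layout keeps parsing at the edge router trivial; a
   real encoding would also have to cover the aggregate case and IPv6.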
    
    
3.5 Dynamic Traffic Conditioning (DTC) 
    
   A DTC should perform its normal functions when there is no
   congestion indication. When it receives a congestion control
   message, it should adjust the TC parameters of the corresponding
   aggregate or flows according to the received message.

   One way to implement this is to use two sets of parameters: one is
   the set of original TC profile parameters, which is static; the
   other is a set of supplementary parameters, which changes with the
   control messages. If no control message has been received, the
   supplementary set should be identical to the static one. If no
   further control message is received after the DTC has adjusted its
   supplementary parameters, it should return the supplementary
   parameters to the static values step by step.
    
   DTC parameters include parameters for shaping, dropping and marking.  
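
   The two-parameter-set approach can be sketched in Python as
   follows, using a single rate as the only TC parameter. The
   multiplicative decrease, the additive recovery step, and the lower
   bound are assumptions chosen for this example; this draft does not
   prescribe how the supplementary parameters are adjusted.

```python
class DynamicTC:
    """Two-parameter-set rate limiter for one aggregate or flow."""

    def __init__(self, profile_rate: float, decrease: float = 0.1,
                 recover_step: float = 0.05, floor: float = 0.1) -> None:
        self.profile_rate = profile_rate  # static TC profile parameter
        self.current_rate = profile_rate  # supplementary parameter
        self.decrease = decrease          # rate cut per unit of power
        self.recover_step = recover_step  # recovery per quiet interval
        self.floor = floor                # lowest fraction of the profile

    def on_control_message(self, control_power: int) -> float:
        """Tighten the supplementary rate according to control power."""
        factor = max(self.floor, 1.0 - self.decrease * control_power)
        self.current_rate = max(self.floor * self.profile_rate,
                                self.current_rate * factor)
        return self.current_rate

    def on_quiet_interval(self) -> float:
        """Step back toward the static profile when no message arrived."""
        step = self.recover_step * self.profile_rate
        self.current_rate = min(self.profile_rate,
                                self.current_rate + step)
        return self.current_rate
```

   With a profile rate of 100 and the defaults above, a message with
   control power 2 reduces the rate to about 80, and each quiet
   interval then restores 5 until the static profile is reached.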
    
    
4. An implementation example 
    
   To date, the IETF has defined only two sets of PHBs, i.e., the EF
   PHB and the AF PHBs. EF needs strict policing and shaping;
   therefore, one EF customer cannot affect another much, as long as
   there are enough resources in the DS network to support all EF
   requirements.
    
   AF, in contrast, has no quantitative requirements, so a customer may
   exceed the subscribed profile with the understanding that the
   excess traffic is not delivered with as high a probability as the
   traffic that is within the profile. When the network is congested,
   however, responsive TCP flows decrease their window size and back
   off, while unresponsive UDP flows continue sending packets, which
   leads to great unfairness between TCP and UDP flows.
    
   A core router should keep track of the packet drops of DP1 and DP2,
   since customers use these drop precedences to grab excess network
   resources. Drops of DP0 packets, however, very likely indicate
   incipient congestion, and the core router should generate
   congestion control messages when many DP0 packets are being
   dropped. Since a normal TCP responds to packet drops with back-off
   and slow start, we believe that the end-to-end congestion control of
   TCP is sufficient, so congestion control messages are generated
   only for non-responsive flows. One issue must be solved first,
   however: how to identify an unresponsive or non-TCP-friendly
   aggregate or flow.
    
    
Core router: 
    

  
   A core router should implement RIO, Multi-RED, or a similar
   mechanism for the different drop precedences.
   A core router should maintain a table that stores information about
   dropped DP0, DP1, and DP2 packets, which may include the following
   fields:
    
   +--------+------+----------+------+------+-------+-------+-------+ 
   | Source | Dest | Transport| Time | AF   | DP0   | DP1   | DP2   | 
   | IP     | IP   | Protocol | Stamp| Class| Counts| Counts| Counts| 
   +--------+------+----------+------+------+-------+-------+-------+ 
       
   The first three fields identify the flow and are determined by the
   granularity of the scheme. The time stamp records the last time the
   row was updated; after a specified period (Tupd) has elapsed since
   then, the row should be cleared. The counts record the number of
   dropped packets. Tupd should be a configurable parameter.
    
   When an AF packet is dropped, the core router searches the table
   and updates it with the information of the dropped packet. It
   should then generate a congestion control message with a
   probability (pcc) that is determined by the counts and the current
   load.
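
   The table handling described above can be sketched in Python as
   follows. The formula used for pcc (a gain times the DP0 drop count
   times the load, capped at 1) is purely an assumption, since this
   draft only states that pcc depends on the counts and the current
   load; the names DropTable, DropEntry, and gain are hypothetical.

```python
import random
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class DropEntry:
    af_class: int
    stamp: float  # time of the last update of this row
    counts: List[int] = field(default_factory=lambda: [0, 0, 0])  # DP0..DP2


class DropTable:
    def __init__(self, t_upd: float = 1.0, gain: float = 0.01) -> None:
        self.t_upd = t_upd  # Tupd: lifetime of a row without updates
        self.gain = gain    # assumed scaling factor for pcc
        self.rows: Dict[Tuple[str, str, int], DropEntry] = {}

    def record_drop(self, src: str, dst: str, proto: int, af_class: int,
                    dp: int, now: Optional[float] = None) -> DropEntry:
        """Update the row of the flow whose packet was just dropped."""
        now = time.monotonic() if now is None else now
        key = (src, dst, proto)
        row = self.rows.get(key)
        if row is None or now - row.stamp > self.t_upd:
            row = DropEntry(af_class, now)  # new row, or cleared after Tupd
            self.rows[key] = row
        row.counts[dp] += 1
        row.stamp = now
        return row

    def pcc(self, row: DropEntry, load: float) -> float:
        """Assumed formula: grows with DP0 drops and load, capped at 1."""
        return min(1.0, self.gain * row.counts[0] * load)

    def should_notify(self, row: DropEntry, load: float,
                      rng: Callable[[], float] = random.random) -> bool:
        """Decide with probability pcc whether to send a control message."""
        return rng() < self.pcc(row, load)
```

   Keying the table on source IP, destination IP, and transport
   protocol corresponds to flow granularity; a coarser key would be
   used for aggregates.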
    
   The congestion control message created by a core router uses a
   special PHB to identify itself. The Source IP and Dest IP of this
   message are the Dest IP and Source IP of the dropped packet,
   respectively.

   The data field of this message includes the following:
    
     +--------+-----------+-------+------------+--------------+--------+ 
     | Version| PHB in    | TS/SN | Granularity| Transport    | Control| 
     |        | Congestion|       |            | Protocol Type| Power  | 
     +--------+-----------+-------+------------+--------------+--------+ 
    
    
   The meaning of these fields has been explained in the preceding
   section. In this example, we use flow granularity, where a flow is
   identified by Source IP, Dest IP, and Transport Protocol type. The
   control power is determined by the number of dropped packets of the
   corresponding flow and the current load. Normally, the control
   power is set to 1, and it increases linearly as congestion becomes
   more severe.
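
   One possible reading of this rule is sketched below in Python. The
   step size drops_per_step, the upper bound max_power, and the use of
   the product of drop count and load as the severity measure are
   assumptions; this draft only requires a base value of 1 and a
   linear increase.

```python
def control_power(dp0_drops: int, load: float, drops_per_step: int = 10,
                  max_power: int = 255) -> int:
    """Return 1 under mild congestion, rising linearly with severity.

    Severity is taken here as dp0_drops * load (an assumption); every
    drops_per_step units of severity add one to the control power.
    """
    if dp0_drops < 0 or load < 0:
        raise ValueError("drops and load must be non-negative")
    extra = int(dp0_drops * load) // drops_per_step
    return min(max_power, 1 + extra)
```

   With the default step of 10, a flow with 25 dropped DP0 packets at
   load 1.0 would be reported with a control power of 3.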
    
Edge router: 
    
   An edge router should perform the DTC functions described in the
   preceding section. When it receives a control message, it should
   dynamically adjust the TC parameters of the corresponding flows.
    
   When a flow traverses multiple core routers that are experiencing
   congestion, the edge router through which this flow enters the DS
   domain will receive congestion control messages from different core
   routers. The routers sending the control messages can be
   identified by the TS/SN field in the control message. The edge
   router should respond to the most severe congestion indication
   rather than to all the messages. In this way, it avoids unfairness
   to flows that traverse a long path with many hops in the DS domain.
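
   The selection of the most severe indication can be sketched in
   Python as follows. It assumes that the edge router collects the
   messages received for one flow during one adjustment interval as
   (router identifier, control power) pairs and that a higher control
   power means more severe congestion; both the interval and the pair
   representation are assumptions of this example.

```python
from typing import Iterable, Optional, Tuple


def most_severe(
    messages: Iterable[Tuple[str, int]],
) -> Optional[Tuple[str, int]]:
    """Return the (router_id, control_power) pair with the highest power.

    Returns None when no message was received in the interval, in
    which case the DTC would start recovering toward its profile.
    """
    return max(messages, key=lambda m: m[1], default=None)
```

   Reacting only to the maximum, instead of applying every message in
   turn, keeps a flow that crosses three congested routers from being
   throttled three times for the same congestion episode.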
    
    
5. Summary 
    
   In this draft, we propose a Direct Congestion Control Scheme (DCCS)
   for the congestion control of unresponsive flows in a DS IP
   network. We present the general requirements of this scheme and an
   implementation example. The scheme is simple, and we expect it to
   be effective. We believe that it will enhance the fairness between
   responsive and non-responsive flows and will benefit traffic
   control and resource utilization in real networks. We are currently
   working on the simulation of this scheme and will publish the
   results soon.
    
   Finally, we list the advantages and disadvantages of our scheme: 
    
   Advantages: 
    
   1) Fairness between responsive TCP flows and unresponsive UDP flows
   can be achieved;

   2) No changes to the transport protocols (such as TCP and UDP) at
   the end hosts are required;

   3) Edge routers can dynamically adjust their TC according to the
   current network load and capacity;

   4) The end-to-end congestion control of TCP remains usable and
   compatible: ECN or other mechanisms can be used together with our
   scheme.
    
   Disadvantages: 
    
   1) It adds some overhead to the core routers and the network for
   control message generation and delivery;

   2) Control messages need identification; we recommend using a new
   PHB to indicate such a control message.
    
    
    
7. References
    
   [1] Floyd, S., and Jacobson, V., "Random Early Detection gateways for 
   Congestion Avoidance ", IEEE/ACM Transactions on Networking, V.1 N.4, 
   August 1993, p. 397-413. 
    
   [2] Clark D. and Fang W., "Explicit Allocation of Best Effort Packet 
   Delivery Service", ACM Transactions on Networking, August 1998. 
   http://diffserv.lcs.mit.edu/exp-alloc-ddc-wf.ps 
    
  
   [3] Ibanez, J. and Nichols, K., "Preliminary Simulation Evaluation
   of an Assured Service", Internet Draft,
   <draft-ibanez-diffserv-assured-eval-00.txt>, August 1998.

   [4] Goyal, M., Misra, P. and Jain, R., "Effect of number of drop
   precedences in Assured Forwarding". Available from
   http://www.cis.ohio-state.edu/~jain/papers/dpstdy_globecom99.htm
    
   [5] Nichols, K., Blake, S., Baker, F. and D. Black, "Definition of 
   the Differentiated Services Field (DS Field) in the IPv4 and IPv6 
   Headers", RFC 2474, December 1998. 
    
   [6] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z and W. 
   Weiss, "An Architecture for Differentiated Services", RFC 2475, 
   December 1998. 
    
   [7] Heinanen J., Baker F., Weiss W., and Wroclawski J., "Assured 
   Forwarding PHB Group", Internet RFC 2597, June 1999. 
    
   [8] Jacobson, V., Nichols, K. and Poduri, K., "An Expedited
   Forwarding PHB", RFC 2598, June 1999.
    
   [9] Heinanen, J. and R. Guerin, "A Two Rate Three Color Marker", RFC 
   2698, September 1999. 
    
   [10] Seddigh, N., Nandy, B. and Pieda, P., "Bandwidth Assurance
   Issues for TCP flows in a Differentiated Services Network",
   GLOBECOM 99, Rio De Janeiro, December 1999.

   [11] Seddigh, N., Nandy, B. and Pieda, P., "Study of TCP and UDP
   Interaction for the AF PHB", Internet Draft,
   draft-nsbnpp-diffserv-tcpudpaf-01.pdf, August 1999.

   [12] Chow, H. and Leon-Garcia, A., "A Feedback Control Extension to
   Differentiated Services", Internet Draft,
   draft-chow-diffserv-fbctrl-00.pdf, March 1999.
 
 
8. Authors' Addresses
 
    
   Haitao Wu, Keping Long, Shiduan Cheng 
    
   National Key Lab of Switching Technology and Telecommunication 
   Networks, 
   P.O.Box 206, Beijing University of Posts & Telecommunications,  
   Beijing 100876, P.R.China 
   Tel: +86 10 62283761; Fax: +86 10 62283412 
    
   E-mail: {xiaodan, lkp, chsd}@bupt.edu.cn 
   Homepage: http://wuht.topcool.net/ 
    
    
  
    
   Jian Ma 
    
   Nokia China R&D Center,  
   Nokia House 1, No.11, He Ping Li Dong Jie,  
   Beijing, 1000013, P.R.China 
   Tel: +86 10 8422 9922 Ext.2940; Fax: +86 10 8422 2439  
    
   E-mail: jian.j.ma@nokia.com