Internet Engineering Task Force                   Integrated Services WG
INTERNET-DRAFT                              Shenker/Partridge/Wroclawski
draft-ietf-intserv-control-del-svc-01.txt                  Xerox/BBN/MIT
                                                              June, 1995
                                                       Expires: 12/25/95


          Specification of Controlled Delay Quality of Service


Status of this Memo


   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

   This document is a product of the Integrated Services working group
   of the Internet Engineering Task Force.  Comments are solicited and
   should be addressed to the working group's mailing list at
   int-serv@isi.edu and/or the author(s).


Abstract


      This memo describes the network element behavior required to
      implement the Controlled Delay traffic handling service. The
      controlled delay service is designed for service-adaptive and
      delay-adaptive applications; i.e., applications that are prepared
      to dynamically adapt to changing packet transmission delays and to
      dynamically change the level of packet delivery delay control they
      request from the network when their current level of service is
      not adequate.  The controlled delay service imposes relatively
      minimal requirements on network components which implement it, and


Shenker/Partridge/WroclawskiExpires 12/25/95                    [Page 1]
INTERNET-DRAFT draft-ietf-intserv-control-del-svc-01.txt     March, 1995


      is intended to be usable in situations ranging from small
      centrally managed private IP networks to the global Internet.


Introduction


   This memo is one of a series of documents that specify network
   element behavior in IP internetworks that provide multiple
   qualities of service to their clients. Services described in these
   documents are useful both in the global Internet and private IP
   networks.

   This document is based on the service specification template given in
   [xxx]. Please refer to that document for definitions and additional
   information about the specification of qualities of service within
   the IP protocol family.


Motivation


   Controlled delay service is designed for service-adaptive and delay-
   adaptive applications. These applications are sensitive to packet
   delivery delay, but are prepared to adapt to dynamically changing
   delays, and to change their service request at any time if the
   current level of service received from the network is not adequate.
   This flexibility allows such applications to operate successfully and
   efficiently over a wide range of network conditions.

   Many applications which transmit interactive data, such as audio and
   video conferencing sessions, are well suited to operation with the
   controlled delay service. Applications which desire proven guarantees
   on packet delivery time, such as real-time control and servoing
   systems, are generally not in this category.

   The controlled delay service does not provide any quantified level of
   assurance about packet delays.  Instead, it merely promises to avoid
   overloads by turning excess traffic away.  Criteria for determining
   when a resource is overloaded are not specified in this definition,
   but are left to the individual vendor.  Thus, while this service
   offers some control over packet delay, it does not meet the stricter
   condition of offering a guaranteed or statistical bound.

   The end-to-end behavior obtained with controlled delay service
   provides a middle ground between the employment of adaptive
   applications in a pure best-effort network and the employment of a
   network which rigidly controls delay.  Strengths of this middle
   ground are that applications can obtain some load control and
   delivery preference for their packets while still benefiting from
   their adaptive behavior; that the service can be usefully deployed in
   large, unstructured internetworks; and that the specification is
   amenable to highly efficient implementation and use of network
   resources.

   One important model for the use of controlled delay may be thought of
   as "best-effort with a floor"; applications suited for use with this
   service may choose to operate with no QoS control at all much of the
   time, requesting controlled-delay service from the network only when
   the uncontrolled service slips below acceptable minimums.


Network Element Data Handling Requirements


   The network element must ensure that packet delays are
   controlled.  This must be accomplished through active admission
   control.  In particular, overprovisioning is not sufficient to
   deliver controlled delay service; the element must be able to turn
   flows away if accepting them would cause the element to have
   excessive queueing delays.  However, no quantitative specification of
   average, statistical, or maximal delays is required.

   There are three different logical levels of service. A network
   element may internally implement fewer (or more) actual levels of
   service, but must map them into three logical levels at the
   controlled delay service invocation interface.  The levels have
   different degrees of delay control, with level 1 having the most
   tightly controlled delay, and level 3 having the least tightly
   controlled delay.  The different levels do not have to give strictly
   ordered delays for each packet; that is, the network element need not
   ensure that every packet given level 1 service experiences less delay
   than if it were given level 2 service.  The element need only ensure
   that the typical delays are no greater in level 1 than in level 2
   (and similarly for levels 2 and 3).

   All three levels should receive better service, i.e., more tightly
   controlled delay, than uncontrolled best-effort traffic.

   The controlled delay service must maintain a very low level of packet
   loss. Although packet losses may occur, any noticeable loss
   represents a "failure" of the admission control algorithm.

   The controlled delay service definition does not require any control
   of short-term packet jitter (variation in network element transit
   delay between different packets in the flow) beyond the control
   already exercised on delay. Network element implementors who find it
   advantageous to do so may use resource scheduling algorithms which
   exercise some jitter control.


Invocation Information


   The controlled delay service is invoked by specifying the flow's
   proposed traffic pattern (TSpec) and service request (RSpec) to the
   network element.

   The TSpec takes the form of a token bucket, with bucket depth b and
   token rate r.  Both b and r must be positive.

   The rate r is measured in bytes per 1/100th of a second, and can
   range from 1 to 10^12.  This range allows data rates as small as 800
   bits per second (a reasonable minimum) as well as data rates as large
   as 400 terabits per second (or 5 times what is believed to be the
   maximum theoretical bandwidth of a single strand of fiber) to be
   requested. The representation of this value should be precise to at
   least 0.1 percent of the value.  The use of floating point
   representations in implementations is encouraged.

   The bucket depth b is measured in bytes and can range from 1 byte to
   125 gigabytes.  Again, the representation of this value should be
   precise to at least 0.1 percent of the value, and the use of floating
   point representations is encouraged.

   The range of values is intentionally large to allow for future
   bandwidths.  The range is not intended to imply that a network
   element must support the entire range.

   The RSpec is a service level.  The service level is specified by one
   of the integers 1, 2, or 3.  Implementations should internally choose
   representations which leave a range of at least 256 service levels
   undefined, for possible extension in the future.

   The TSpec can be represented by two 16-bit values, each using the
   following form:

                    15        10 9                 0
                    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                    | Exponent  |     Value         |
                    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   In this format, the 6 most significant bits of the word encode an
   exponent (E), and the 10 least significant bits encode a value (V).
   This format encodes a number of the form V * (2**E).  It is chosen
   to allow easy representation of a wide range of values, while
   avoiding over-precise representations.  The 16-bit word is placed
   into packets in network byte order.

   The first value is the rate (r) and the second value is the bucket
   size (b).

   The RSpec may be represented as an unsigned 16-bit integer carried in
   network byte order.
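
   As an illustration only, the exponent/value form and the wire layout
   of the TSpec and RSpec might be handled as in the following Python
   sketch.  The function names and the round-to-nearest rule are
   assumptions of this example; the memo does not say how a value that
   is not exactly representable should be rounded.

```python
import struct

EXP_BITS = 6                      # upper 6 bits: exponent E
VAL_BITS = 10                     # lower 10 bits: value V
MAX_EXP = (1 << EXP_BITS) - 1     # 63
MAX_VAL = (1 << VAL_BITS) - 1     # 1023


def encode_word(n: int) -> int:
    """Return the 16-bit word (E << 10) | V such that V * 2**E is near n.

    Assumption of this sketch: round to the nearest representable value.
    """
    if n < 1:
        raise ValueError("TSpec values must be positive")
    e, v = 0, n
    while v > MAX_VAL:
        e += 1
        v = (n + (1 << (e - 1))) >> e   # n / 2**e, rounded to nearest
    if e > MAX_EXP:
        raise ValueError("value too large for a 6-bit exponent")
    return (e << VAL_BITS) | v


def decode_word(word: int) -> int:
    """Return V * 2**E for a 16-bit word in the form shown above."""
    return (word & MAX_VAL) << (word >> VAL_BITS)


def pack_tspec(rate: int, depth: int) -> bytes:
    """Rate word first, then bucket depth word, in network byte order."""
    return struct.pack("!HH", encode_word(rate), encode_word(depth))


def unpack_tspec(data: bytes) -> tuple:
    """Return (rate, depth) decoded from two 16-bit words."""
    rate_word, depth_word = struct.unpack("!HH", data)
    return decode_word(rate_word), decode_word(depth_word)


def pack_rspec(level: int) -> bytes:
    """Service level as an unsigned 16-bit integer in network byte order."""
    if level not in (1, 2, 3):
        raise ValueError("service level must be 1, 2, or 3")
    return struct.pack("!H", level)
```

   With round-to-nearest, a decoded value differs from the original by
   roughly 0.1 percent at most, because V stays at or above 512 whenever
   the exponent is non-zero.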


Exported Information


   Each controlled delay service module exports at least the following
   information. All of the data elements described below are
   characterization parameters.

   For each level of service, the network element exports three
   measurements of delay (thus making nine quantities in total).  Each
   of these characterization parameters is the maximal packet transit
   delay experienced over a previous time interval T.  The three time
   intervals T are 1 second, 60 seconds, and 3600 seconds.  The exported
   parameters can be somewhat stale, in that they can reflect
   measurements taken as long as time 2T ago.  However, they may also be
   updated more frequently, and can also reflect any subsequent packet
   delays that exceed the current advertised quantity.
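
   One way an element might maintain such a parameter is sketched below
   in Python.  The class name, the use of fixed consecutive measurement
   windows of length T, and the convention of advertising 0 when the
   last complete window saw no packets are all assumptions of this
   example, not requirements of this memo.

```python
class MaxDelayEstimator:
    """Maximal transit delay for one service level and one interval T."""

    def __init__(self, interval_s: float, start: float = 0.0):
        self.interval = interval_s
        self.window_start = start
        self.current_max = 0   # largest delay seen in the window in progress
        self.advertised = 0    # value currently exported

    def _roll(self, now: float) -> None:
        elapsed_windows = int((now - self.window_start) // self.interval)
        if elapsed_windows >= 1:
            # If exactly one window has closed, advertise its maximum.
            # If more have closed, the last complete window was idle.
            self.advertised = self.current_max if elapsed_windows == 1 else 0
            self.current_max = 0
            self.window_start += elapsed_windows * self.interval

    def record(self, delay_us: int, now: float) -> None:
        """Account for one packet's transit delay (microseconds)."""
        self._roll(now)
        self.current_max = max(self.current_max, delay_us)
        # Early update: reflect any delay exceeding the advertised value.
        self.advertised = max(self.advertised, delay_us)

    def value(self, now: float) -> int:
        """Return the value to export at time `now`."""
        self._roll(now)
        return self.advertised
```

   In this sketch the exported value never reflects a measurement older
   than 2T, since it is taken from the window that closed most recently.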

   There is no requirement that these characterization parameters be
   based on exact measurements.  In particular, these delay measurements
   can be based on estimates of packet delays or aggregate measurements
   of queue loading.  This looseness is allowed to avoid placing undue
   burdens on network element designs in which obtaining precise delay
   measurements is difficult.

   These delay parameters have an additive composition rule. For each
   parameter the composition function computes the sum, enabling a setup
   protocol to deliver the cumulative sum along the path to the end
   nodes.

   The delays are measured in units of one microsecond.  An individual
   element can advertise a delay value between 1 and 2**28 (somewhat
   over four minutes) and the total delay added across all elements can
   range as high as 2**32-1.  Should the sum of the different elements'
   delays exceed 2**32-1, the end-to-end advertised delay should be
   2**32-1.
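
   The additive composition rule with its ceiling of 2**32-1 can be
   sketched as follows.  The function and constant names are
   illustrative assumptions of this example.

```python
MAX_PER_ELEMENT_US = 2**28      # largest delay one element may advertise
MAX_COMPOSED_US = 2**32 - 1     # ceiling on the end-to-end sum


def compose_delay(per_hop_delays_us):
    """Sum per-element delays (microseconds), saturating at 2**32-1."""
    total = 0
    for delay in per_hop_delays_us:
        if not 1 <= delay <= MAX_PER_ELEMENT_US:
            raise ValueError("per-element delay out of range")
        total = min(total + delay, MAX_COMPOSED_US)
    return total
```

   For example, three elements advertising 100, 250 and 4000
   microseconds compose to 4350 microseconds.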


   Note that while the granularity of measurement is microseconds, a
   conforming element is free to measure delays more loosely.  The
   minimum requirement is that the element estimate its delay at a
   granularity of 100 microseconds or finer.  Elements that can
   measure more accurately are, of course, encouraged to do so.

      NOTE: Measuring in milliseconds is not acceptable, because if the
      minimum delay value is a millisecond, a path with several hops
      will lead to a composed delay of at least several milliseconds,
      which is likely to be misleading.

   The characterization parameters may be represented as a sequence of
   nine 32-bit unsigned integers in network byte order.  The first three
   integers are the parameters for T=1, T=60 and T=3600 for level 1, the
   next three integers are for T=1, T=60, T=3600 for level 2, and the
   last three integers are for T=1, T=60, T=3600 for level 3.
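
   A minimal Python sketch of this layout follows.  The dictionary
   keyed by (level, T) and the function names are assumptions made for
   the example only.

```python
import struct

LEVELS = (1, 2, 3)
INTERVALS = (1, 60, 3600)   # T, in seconds

# Level-major order: all three intervals for level 1, then level 2, ...
KEYS = [(level, t) for level in LEVELS for t in INTERVALS]


def pack_characterization(delays_us):
    """Pack a {(level, T): delay in microseconds} map as nine 32-bit words."""
    return struct.pack("!9I", *(delays_us[key] for key in KEYS))


def unpack_characterization(data):
    """Inverse of pack_characterization."""
    return dict(zip(KEYS, struct.unpack("!9I", data)))
```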

   The following values are assigned from the characterization parameter
   namespace.

   The controlled delay service is service_name 1.

   The delay characterization parameters receive parameter_names one
   through nine, in the order given above. That is,

      parameter_name          definition

      1                       Service Level = 1, T = 1
      2                       Service Level = 1, T = 60
      3                       Service Level = 1, T = 3600
      4                       Service Level = 2, T = 1
      5                       Service Level = 2, T = 60
      6                       Service Level = 2, T = 3600
      7                       Service Level = 3, T = 1
      8                       Service Level = 3, T = 60
      9                       Service Level = 3, T = 3600


   The end-to-end composed results are assigned parameter_names N+10,
   where N is the value of the per-hop name given above.
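
   The numbering above follows a simple pattern, sketched here in
   Python.  The helper names are hypothetical and are not part of the
   namespace assignment itself.

```python
LEVELS = (1, 2, 3)
INTERVALS = (1, 60, 3600)   # T, in seconds
END_TO_END_OFFSET = 10      # composed results use N + 10


def per_hop_parameter(level, interval_s):
    """Return the per-hop parameter (1..9) for a service level and T."""
    if level not in LEVELS:
        raise ValueError("service level must be 1, 2, or 3")
    # INTERVALS.index raises ValueError for an unknown interval.
    return 3 * (level - 1) + INTERVALS.index(interval_s) + 1


def end_to_end_parameter(level, interval_s):
    """Return the parameter for the end-to-end composed result."""
    return per_hop_parameter(level, interval_s) + END_TO_END_OFFSET
```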

   No other exported data is required by this specification.


Policing


   Policing is done at the edge of the network and at all source merge
   points within the interior of the network.  A source merge point is
   any point where data from multiple sources is merged into a single
   flow.  It is the responsibility of the invoker of the service (a
   setup protocol, local configuration tool, or similar mechanism) to
   identify points where policing is required.

   Nonconforming packets are dropped. Other actions, such as delaying
   packets until they fit within the bounds of the policing function,
   are not permitted.
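
   A token bucket policing function of this kind might look like the
   following Python sketch.  The class and method names, the use of
   seconds for timestamps, and the choice to start with a full bucket
   are assumptions of this example.  The rate argument uses the TSpec
   unit of bytes per 1/100th of a second.

```python
class TokenBucketPolicer:
    """Decides whether each arriving packet conforms to a (r, b) TSpec."""

    def __init__(self, rate_per_10ms, depth_bytes, start=0.0):
        if rate_per_10ms <= 0 or depth_bytes <= 0:
            raise ValueError("both r and b must be positive")
        self.bytes_per_second = rate_per_10ms * 100
        self.depth = depth_bytes
        self.tokens = float(depth_bytes)   # assumption: bucket starts full
        self.last = start

    def conforms(self, packet_len, now):
        """Return True to forward the packet, False to drop it."""
        elapsed = max(0.0, now - self.last)
        self.last = now
        # Refill at rate r, never beyond the bucket depth b.
        self.tokens = min(self.depth,
                          self.tokens + elapsed * self.bytes_per_second)
        if packet_len <= self.tokens:
            self.tokens -= packet_len
            return True
        return False   # nonconforming: dropped, not delayed
```

   A nonconforming packet consumes no tokens in this sketch, so later
   packets that fit within the TSpec are unaffected by the drop.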

      NOTE: This point is open to discussion. The requirement given
      above may be too strict; it may be better to permit some delaying
      of a packet if that delay would allow it to pass the policing
      function.  Intuitively, a plausible approach is to allow a delay
      of (roughly) up to the maximum queueing delay experienced by
      completely conforming packets before declaring that a packet has
      failed to pass the policing function and dropping it. The merit of
      this approach, and the precise wording of the specification which
      describes it, require further study.

   Implementors should note that traffic entering the network with a
   certain burstiness factor will in many circumstances tend to grow
   more bursty as it traverses the network. Thus, a well-chosen policing
   function for network elements serving as interior source merge points
   will allow for somewhat more traffic burstiness than would be
   suggested by simple consideration of the traffic's TSpec at that
   point.

      NOTE: This modification of the policing function to allow for
      increased burstiness as traffic flows through a network is
      independent of any modification of the TSpec or policing function
      needed to handle the merging of multiple service requests.


Ordering and Merging


   Traffic specifications (TSpecs) presented to a controlled delay
   service module are ordered as follows: TSpec A is substitutable for
   ("as good or better than") TSpec B if both the token bucket depth and
   rate of TSpec A are greater than or equal to the token bucket depth
   and rate of TSpec B.

   A merged TSpec may be calculated over a set of TSpecs by taking the
   largest rate and largest bucket size across all members of the set.
   This use of the word "merging" is similar to that in the RSVP
   protocol; a merged TSpec is one which is adequate to describe the
   traffic from any one of a number of flows.

   In other circumstances it is necessary to compute the "additive"
   TSpec, which describes the traffic of all of the input flows
   simultaneously. The rate and bucket size parameters of an additive
   TSpec are computed by adding together the rate and bucket size
   parameters of each of the original TSpecs.

   It is possible to combine these functions to compute a TSpec for the
   traffic from any N of a set of M original flows, where N ranges from
   1 to M.

   Service request specifications (RSpecs) are ordered by their
   numerical values (in inverse order); service level 1 is substitutable
   for service level 2, and service level 2 is substitutable for service
   level 3.
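
   The ordering and merging rules of this section can be sketched in
   Python as follows.  The type and function names are illustrative.
   The "any N of M" function reflects one reading of how merging and
   addition combine (the merge, over every N-member subset, of that
   subset's additive TSpec) and should be treated as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TSpec:
    rate: int    # token rate r
    depth: int   # bucket depth b


def substitutable(a, b):
    """True if TSpec a is "as good or better than" TSpec b."""
    return a.rate >= b.rate and a.depth >= b.depth


def merge(tspecs):
    """TSpec adequate for the traffic of any one flow in the set."""
    return TSpec(max(t.rate for t in tspecs), max(t.depth for t in tspecs))


def add(tspecs):
    """TSpec describing all of the flows simultaneously."""
    return TSpec(sum(t.rate for t in tspecs), sum(t.depth for t in tspecs))


def any_n_of(tspecs, n):
    """TSpec adequate for any n of the flows (assumed reading, see above)."""
    if not 1 <= n <= len(tspecs):
        raise ValueError("n must range from 1 to M")
    rates = sorted((t.rate for t in tspecs), reverse=True)
    depths = sorted((t.depth for t in tspecs), reverse=True)
    return TSpec(sum(rates[:n]), sum(depths[:n]))


def rspec_substitutable(a, b):
    """Service level a may stand in for level b if it is numerically lower."""
    return a <= b
```

   Note that the TSpec ordering is only a partial order: two TSpecs may
   be such that neither is substitutable for the other.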


Resulting Service


   The resulting end-to-end service is one that offers applications
   several levels of delay to choose from.  Furthermore, this service
   promises that the levels of experienced delays will be controlled,
   such that all three levels of service will have lower average delays
   than best effort service.  However, this service makes no assurances
   about the absolute levels of delay or jitter the receiving
   application will experience.

   This service is designed for use by adaptive playback applications,
   among others. These applications may be willing to accept varying
   degrees of perceived loss due to late packets. At a given level of
   controlled delay service, the application can vary the loss rate by
   using a more or less conservative delay adaptation function to
   adjust its "playback point"; the application will compromise
   between minimal playback delay and percentage of packets arriving too
   late to be useful. If the best compromise at a given service level is
   not good enough, the application may select a lower-delay service
   level.


Guidelines for Implementors


   It is expected that the service levels implemented at a particular
   element will offer significantly different levels of delay control.
   There seems little advantage in offering levels which differ only
   slightly in the level of delay control.  So, while a particular
   element may offer fewer than three levels of service, the levels of
   service it does offer should have notably different queueing delays.

   An additional service currently being considered is the "predictive"
   service described in [xxx].  It is expected that an element offering
   both predictive service and controlled delay service need not
   implement both, but can use the predictive service as a controlled
   delay service.  This is allowed since (1) the required behavior of
   predictive service meets all of the requirements of controlled delay
   service, (2) the invocations are compatible, and (3) the ordering
   relationships defined in the predictive service specification
   document are such that a given level of predictive service is at
   least as good as the same level of controlled delay service.

      NOTE: The inter-service mapping with predictive service, mentioned
      above, is omitted from the "Ordering and Merging" section of this
      draft of the controlled delay service specification because the
      exact definition of both services is still under discussion.
      Should the final definitions include an inter-service mapping
      function, the Ordering and Merging sections of each document might
      contain words similar to the following:

      "In addition, the controlled delay service is related to the
      predictive service in the sense that a given level of predictive
      service is considered at least as good as the same level of
      controlled delay service.  See additional comments in the
      guidelines section."

Evaluation Criteria


   Evaluating a network element's implementation of controlled delay
   service is somewhat difficult, since the quality of service depends
   on overall traffic load, the traffic pattern presented and the degree
   of delay control implemented.  In this section we sketch out a
   methodology for testing an element's controlled delay service.

   The idea is that one chooses a particular traffic mix (for instance,
   30 percent level 1, 10 percent level 2, 20 percent level 3 and 40
   percent uncontrolled best-effort traffic) and loads the network
   element with progressively higher amounts of this traffic mix (e.g.,
   40% of capacity, then 50% of capacity, and so on beyond 100% of
   capacity).  For each load level, one measures the utilization, mean
   delays, and the packet loss rate for each level of service
   (including best effort).  Each test run at a particular load should
   involve enough traffic that it is a reasonable predictor of the
   performance a long-lived application such as a video conference
   would experience (e.g., an hour or more of traffic).

   This memo does not specify particular traffic mixes to test.
   However, we expect in the future that as the nature of real-time
   Internet traffic is better understood, the traffic used in these
   tests will be chosen to reflect the current and future Internet load.


Examples of Implementation


   A possible implementation of controlled delay service would be to
   have a queueing mechanism with three priority levels, with level 1
   packets being highest priority and level 3 packets being lowest
   priority.  Each controlled delay service level would be associated
   with a target queue utilization level, say 20% for level 1, 50% for
   the combination of levels 1 and 2, and 70% for the combination of all
   three levels.  The utilization of the link, by each of the three
   levels, would be measured over some relatively short time period
   (say, 5 seconds, or 10000 MTU packet transmission times).  A new flow
   would be admitted to level 1 if the measured usage of level 1, plus
   the token bucket rate of the new flow, was below the target
   utilization of level 1.  Similarly, a new flow would be admitted to
   level 2 if the measured usage of levels 1 and 2, plus the token
   bucket rate of the new flow, was below the target utilization of
   levels 1 and 2.
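
   The admission test in this example might be expressed as in the
   Python sketch below.  The function name is hypothetical.  Treating
   the targets as fractions of link capacity, and extending the rule to
   level 3 by analogy with levels 1 and 2, are assumptions of the
   sketch.

```python
# Cumulative target utilization for levels 1..k, as a fraction of capacity.
TARGETS = {1: 0.20, 2: 0.50, 3: 0.70}


def admit(level, new_flow_rate, measured_usage, link_capacity):
    """Decide whether a new flow may be admitted at the given level.

    measured_usage maps each level to its recently measured usage, in
    the same units as new_flow_rate and link_capacity.
    """
    if level not in TARGETS:
        raise ValueError("service level must be 1, 2, or 3")
    # Usage of this level and of every higher-priority level.
    used = sum(measured_usage[k] for k in range(1, level + 1))
    return used + new_flow_rate < TARGETS[level] * link_capacity
```

   Because admission relies on measured usage rather than on the sum of
   admitted TSpecs, the result depends on how representative the
   measurement interval is.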


Examples of Use


   We give two examples of use, both involving an interactive
   application.

   In the first example, we assume that either the receiving application
   is ignoring characterizations or the network is not delivering the
   characterizations to the end-nodes. We further assume that the
   application's data transmission units are timestamped.  The receiver,
   by inspecting the timestamps, can determine the end-to-end delays and
   react if they are excessive by asking for a better level of service.
   If the delays are well below the required level, the application can
   ask for a worse level of service.  A
   protocol useful to applications providing this capability is the
   proposed IETF Real-Time Transport Protocol [xxx].

   In the second example, we assume that characterization parameters are
   delivered to the receiving application.  The receiver chooses the
   service level whose characterizations for the maximal delays for all
   intervals are under the required level after network latencies are
   considered. If the actual delays during the course of operation are
   worse than expected, the application can ask for a better level of
   service.
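
   The selection step in the second example could be sketched as
   follows.  The function name, the modelling of "network latencies" as
   a single fixed term added to the composed delay, and the preference
   for the least demanding level that qualifies are assumptions of this
   example.

```python
INTERVALS = (1, 60, 3600)   # T, in seconds


def choose_level(composed_us, fixed_latency_us, required_us):
    """Pick a service level from end-to-end characterizations.

    composed_us maps (level, T) to the composed maximal delay in
    microseconds.  Returns the highest-numbered level whose delays for
    all intervals stay under the requirement, or None if none does.
    """
    for level in (3, 2, 1):
        worst = max(composed_us[(level, t)] for t in INTERVALS)
        if worst + fixed_latency_us < required_us:
            return level
    return None
```

   A result of None indicates that, on the advertised figures, no level
   of controlled delay service meets the application's requirement.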


Security Considerations

   Security considerations are not discussed in this memo.

Authors' Addresses:


   Scott Shenker
   Xerox PARC
   3333 Coyote Hill Road
   Palo Alto, CA  94304-1314
   shenker@parc.xerox.com
   415-812-4840
   415-812-4471 (FAX)

   Craig Partridge
   BBN
   2370 Amherst St
   Palo Alto, CA  94306
   craig@bbn.com

   John Wroclawski
   MIT Laboratory for Computer Science
   545 Technology Sq.
   Cambridge, MA  02139
   jtw@lcs.mit.edu
   617-253-7885
   617-253-2673 (FAX)