Internet Draft                Load Control                    April 2000


                   Load Control of Real-Time Traffic
                     draft-westberg-loadcntr-03.txt
                         Document Revision: 1.3
                          2000/04/19 12:43:19

                  A Two-bit Resource Allocation Scheme

                               April 2000


                              L. Westberg
                             Z. R. Turanyi
                               D. Partain

                                Ericsson


Various Authors           Expires October 2000                  [Page 1]


Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups.  Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.


1.  Abstract

   The purpose of this memo is to present a new resource allocation
   scheme for DiffServ (DS) networks, called Load Control.  The main
   purpose of Load Control is to provide a simple and scalable solution
   to the resource provisioning problem.

   Load Control addresses two particular issues:
     1. Measurement-based access control, whereby a probe packet is
        sent along the forwarding path in a network to determine
        whether a flow can be admitted based upon the current
        congestion state of the network
     2. A lightweight reservation of a certain amount of network
        resources.

   Load Control uses two-bit markers in packet headers to carry load
   information from core routers to edge devices. The scheme provides
   the capability of controlling the traffic load in the network without
   requiring signaling or any per-flow processing in the core routers.
   The complexity of Load Control is kept to a minimum to make
   implementation simple.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].


2.  Background and Motivation

   The amount of traffic carried on the Internet is now greater than the
   traffic on the world's telephony network. Still, Internet-based
   communication services generate less income than plain old telephony
   services. Enabling value-added services over the Internet is
   therefore crucial for service providers. One significant class of
   such value-added services requires real-time packet transportation.
   It can be expected that these real-time services will be popular as
   they replicate or are natural extensions of existing communication
   services like telephony.  Exact and reliable resource management
   (e.g., admission control) is essential for achieving high utilization
   in networks with real-time transportation capabilities. The problem
   is difficult mainly due to scalability issues.

   With the introduction of differentiated services (DS) [RFC2475], it
   is now possible to provide large scale, real-time services. The basic
   idea of DiffServ is that, rather than classifying packets at each
   router, packets are only classified at the edge devices.  The result
   - the required packet treatment - is stored and carried in the packet
   headers, and core routers can carry out appropriate scheduling.

   The current definition of DiffServ, however, does not contain any
   simple, scalable solution to the problem of resource provisioning and
   control. A number of approaches to solving the problem already exist
   [Berson97, Guerin97, Stoica99, Bernet99].  The scheme presented in
   this document does not require any state aggregation and aims at
   extreme simplicity and low cost of implementation along with good
   scaling properties. Load control operates edge-to-edge in a DS
   domain, or between two RSVP-capable routers, where only the edge
   devices keep flow state and do per-flow processing.  The main purpose
   of Load Control is to provide a simple and scalable solution to the
   resource provisioning problem.


3.  Overview

   Load control is achieved by two actions: measurement-based admission
   control of incoming requests and the dropping of admitted flows in
   case of exceptional events such as link failures.  Load Control uses
   two-bit markers in the packet headers to gather information about the
   load level along various paths through the network.  The core routers
   are able to mark passing packets to signal the exhaustion of
   resources to the edge devices.

   For admission control, the resource state of core routers is gathered
   by sending a specially marked packet, denoted a "probe" packet, from
   the ingress to the egress edge device.  The probe result is then used
   by the ingress to decide flow acceptance or rejection and to set up
   traffic conditioning/policy.  If rigid admission control is required,
   soft-state based reservations are also supported. In this case the
   probe packet does both the probing and allocation of resources along
   the path. The latter method is comparable to signaling based schemes
   but does not require processing of signaling messages in the core
   routers.

   Under normal circumstances, admission control is enough to control
   the load in the network. Nevertheless, when exceptional events (such
   as link failures) cause too much traffic to be re-routed over a link,
   the resulting severe congestion may degrade the quality of all of the
   flows on the link. In that case, the best solution might be to keep
   existing flows and suffer the loss of quality. However, for some
   services, it may be desirable to drop some of the previously admitted
   flows to protect the quality of the remaining flows. Thus, when
   severe congestion occurs, the core routers mark the headers of all
   (not only probe) packets to notify the edge devices of the congestion
   condition.

   In the following sections, we assume a DS (DiffServ) domain where
   connection requests arrive at the edges of the domain via RSVP, at
   the request of a Bandwidth Broker, or by other means.  The requests
   may be generated directly at the edge by a gateway, which provides
   connection to other types of networks, or in hosts that are connected
   directly to the domain.


4.  Operation of Load Control

   The load control scheme has two modes of operation:

      a) 'Simple marking':  This refers to a measurement-based admission
      scheme where routers measure the traffic volume and base the
      marking on these results.

      b) 'Unit-based reservations':  A "unit" represents a share of
      bandwidth in the network that could be reserved by the edge
      devices.  This mode makes it possible to perform resource
      reservations, independently of the amount of traffic that is
      actually transmitted.


   Both modes can perform admission control of incoming requests and
   indicate exceptional events.

   In the appendices, we present some analysis of Load Control
   properties, but a more detailed investigation can be found in
   [Tur99].


4.1.  Simple Marking

   The idea of simple marking is that core routers measure the traffic,
   and, if they encounter near exhaustion of resources, they mark
   passing probe packets and thereby notify the edge devices of the lack
   of resources.

   The scheme has the following steps of operation:

      1) Resource Probing: Before establishing the flow, the initiating
      edge device sends a probe packet into the network.  The probe
      packet passes through the same routers as the actual traffic will
      pass through (at least with a high degree of probability) and is
      exposed to the marking function in each router. The marking
      function performs an OR operation on the router's own status and
      the status carried by the incoming probe packet (once marked, a
      packet must not be unmarked).  When
      the packet reaches the egress edge device, its header will reflect
      the aggregated resource status along that path.

      2) Send resource status to ingress: When the egress edge device
      receives the probe packet, it copies the marker from the header to
      the header/payload of a reverse packet and sends it back to the
      initiating party (the ingress edge device). The probe packet may
      be discarded, converted to an ordinary data packet, or
      encapsulated (as mentioned above) and sent to the ingress edge
      device. The packet containing the probing result can also serve as
      a probe packet for the reverse path. This allows the initiating
      party to check for bi-directional resources.

      3) Acceptance/Rejection: The report packet is returned to the
      initiating ingress edge device, which uses the result of the probe
      to admit or block the request by setting up appropriate packet
      filtering, measuring, and marking rules.


      4) Reaction to exceptional events: If a core router detects severe
      congestion on an interface, it starts marking all packets on that
      interface. If the egress edge device receives a marked packet
      which is not a probe packet, this can be interpreted as a sign of
      severe congestion along the path.  The fact that the incoming
      marked packet was not sent as a probe packet can be determined
      from the packet content, by multi-field classification or by
      checking the admittance state at the egress edge device.  If
      severe congestion occurs, a signaling message can be sent to the
      ingress edge device, which can then take the appropriate action.
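
   Steps 1 to 3 above can be sketched as follows.  This is only an
   illustration in Python under simplifying assumptions, not part of
   the scheme itself: the names CoreRouter, probe_path and admit_flow,
   and the measured_load/threshold fields, are hypothetical, and the
   way a router decides that it is near exhaustion is left open by
   this document.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CoreRouter:
    """Hypothetical core router: a measured load and an admission threshold."""
    measured_load: float   # measured real-time traffic, e.g. in kbit/s
    threshold: float       # load at or above which new flows are refused

    def congested(self) -> bool:
        return self.measured_load >= self.threshold


def probe_path(routers: Iterable[CoreRouter]) -> bool:
    """Return True if the probe arrives marked at the egress edge device.

    Each router ORs its own status into the marker, so a probe that has
    been marked once stays marked for the rest of the path.
    """
    marked = False
    for router in routers:
        marked = marked or router.congested()
    return marked


def admit_flow(routers: Iterable[CoreRouter]) -> bool:
    """Ingress decision: admit only if the probe came back unmarked."""
    return not probe_path(routers)


path = [CoreRouter(400, 900), CoreRouter(950, 900), CoreRouter(100, 900)]
print(admit_flow(path))       # False: the second router marks the probe
print(admit_flow(path[:1]))   # True: no router on this path is congested
```

   Because the marker is only ever set and never cleared, the egress
   sees the worst status along the path without any router keeping
   per-flow state.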


   To make the scheme more robust against packet loss, the initiating
   edge device MAY maintain a timer associated with each probe packet.
   If a probe packet is lost, the device simply re-transmits on time-
   out.  How often and how many times the probe packet should be
   retransmitted before failure is declared is an implementation issue,
   but these parameters SHOULD be configurable (e.g., via an SNMP MIB).
   Furthermore, whether probes are retransmitted at all SHOULD be
   configurable.
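
   A minimal sketch of such a retransmission policy is given below.
   The function probe_with_retries, the send_probe callback and the
   default values for timeout_s and max_attempts are assumptions made
   for the example; this document does not define them.

```python
from typing import Callable, Optional


def probe_with_retries(
    send_probe: Callable[[float], Optional[bool]],
    timeout_s: float = 0.5,
    max_attempts: int = 3,
) -> Optional[bool]:
    """Send a probe, retransmitting on time-out.

    send_probe(timeout_s) is a hypothetical callback that returns True if
    the report says "marked", False if it says "unmarked", and None if no
    report arrived within timeout_s.  Setting max_attempts to 1 disables
    retransmission.  Returns the probe result, or None if every attempt
    timed out.
    """
    for _ in range(max_attempts):
        result = send_probe(timeout_s)
        if result is not None:
            return result
    return None  # failure declared; the caller decides how to treat it


# Usage: the first two probes are lost, the third comes back unmarked.
replies = iter([None, None, False])
print(probe_with_retries(lambda timeout: next(replies)))  # False
```

   Both parameters are plain arguments here so that they can be bound
   to whatever configuration mechanism the edge device offers.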


4.2.  Unit-based Reservations

   While measurement-based admission control has important advantages
   over non-measurement based algorithms, it has disadvantages as well.
   Unit-based reservations allow the sources to keep their reservations
   irrespective of the volume of the traffic they transmit.  Although
   the admission scheme is very similar to the simple marking case, the
   presence of actual reservations is a fundamental difference.

   Each flow can occupy any number of units of resources, and even
   fractions of units by allowing a number of flows to share a common
   resource unit.  The unit is not necessarily a simple bandwidth value:
   it may be defined in terms of any resource unit (e.g., effective
   bandwidth) to support statistical multiplexing at the packet level
   (e.g., use of silence periods). The definition of the unit may vary
   from network
   to network and is outside the scope of this document.  The basic idea
   of unit-based reservation is to allow the edge devices periodically
   to mark some of the data packets to refresh resource reservation.
   Each refresh packet reserves one unit of resources for one refresh
   period. Reservations are timed out after a refresh period and have to
   be refreshed in a soft state manner.  The length of the refresh
   period must be the same throughout the DS domain and SHOULD be
   configurable.

   Core routers estimate the number of reservations by counting the
   number of refresh packets during a refresh interval. If the router
   runs out of units, it goes into blocking state, starts to mark probe
   packets indicating congestion and thereby rejects new flows.  The
   probe packets that pass the router unmarked and the refresh packets
   reserve one unit of resources for the following refresh period.
   (Editor note:  It is clear that we need to have the capability of
   reserving more than one unit, but it is not yet clear how that will
   be encoded in the packet header.  See below.) Thus, after the probe
   packet has passed along the path unmarked, the ingress edge device is
   required to send the first reservation refresh packet during the next
   refresh period.

   If a flow occupies more than one unit, more than one probe packet may
   be sent to allocate the required number of resources (an alternative
   using only one packet should be defined).  Similarly, more than one
   refresh packet must be sent for such a flow. By proper definition of
   the unit, a wide range of flows can be described and handled using
   this simple mechanism.

   If a probe packet was forwarded unmarked by a core router, but was
   marked later downstream, that core router will not be notified and
   will incorrectly maintain the reservation. However, as the flow is
   rejected, no refresh packets will arrive, and the reservation will
   time out at the end of the refresh period and will be released.

   Severe congestion is handled in the same way as in 'Simple marking'
   (see Section 4.1).

   If a refresh packet is lost, the downstream routers will
   underestimate the number of reserved units. Refresh and probe packets
   should therefore be protected from losses in the manner described
   above.

   Core routers estimate the number of allocated units by counting the
   number of refresh packets during a refresh period.  The accuracy of
   the estimate can be increased by generating refresh packets evenly
   spread in time over the refresh period. This minimizes errors
   resulting from time alignment differences between routers and edge
   devices.
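
   One way for an edge device to spread its refresh packets evenly is
   sketched below.  The function name refresh_offsets is hypothetical,
   and centring each packet in its slot is a design choice of this
   example rather than a requirement of the scheme.

```python
from typing import List


def refresh_offsets(units: int, refresh_period_s: float) -> List[float]:
    """Offsets (seconds from the start of a period) for the refresh packets.

    One refresh packet is sent per reserved unit.  Packets are spaced
    refresh_period_s / units apart and centred in their slots, so that
    none falls exactly on a period boundary.
    """
    if units <= 0:
        return []
    slot = refresh_period_s / units
    return [(i + 0.5) * slot for i in range(units)]


print(refresh_offsets(4, 2.0))  # [0.25, 0.75, 1.25, 1.75]
```

   Keeping packets away from the period boundary reduces the chance
   that a small clock offset between the edge device and a router
   moves a refresh packet into the neighbouring counting interval.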


4.3.  Multiple Unit reservation

   In some cases it might be feasible to add functionality for the
   reservation of several units in a single reservation request.
   Semantics similar to those of the two-bit reservation scheme could
   be used to provide such functionality, but this would require the
   addition of an integer value denoting the number of units.

   The coding of such a proposal is still under discussion and needs to
   be studied further.


4.4.  Codepoints for Flow Types

   In both variants of Load Control, routers making marking decisions
   have very little information about the resource or QoS requirement of
   the flow in question. The DS field of the probe packet can be used to
   indicate the DiffServ class the flow will arrive on and thus the QoS
   requirements.  The marking function of core routers can take the
   required PHB into account when deciding on the marking.

   Information on the resource requirements for incoming flows can also
   be expressed using the DS field by dividing real-time traffic into
   classes based on resource requirements and using different codepoints
   for different classes. If the DSCPs denote not only the PHB that the
   flow is to receive, but implicitly also the bandwidth requirements
   for the flow, core routers will be able to mark packets more
   intelligently, resulting in less resource waste and greater
   flexibility.

   In the unit-based case, the major benefit is that the size of the
   unit can be different in different classes, making it possible to
   allocate resources with finer granularity.


5.  Objects for Standardization

   A forthcoming standard might only include the encoding of the Load
   Control information into the IP header and some design
   recommendations.


5.1.  Packet Types

   We need four types of packets in the algorithm:

      - Ordinary Packet (OP)
      - Probe Packet (PP)
      - Marked Packet (MP)
      - Refresh Packet (RP)

   During transport through the network, a probe packet can be changed
   to a marked packet. This indicates that at least one router does not
   accept the reservation associated with the probe packet.

   ------       Rejection       ------
   | PP |---------------------->| MP |
   ------                       ------

   An ordinary packet can also be changed to a marked packet, meaning
   that some exceptional event caused severe congestion on one link of
   the path the packet took.

   ------  Severe Congestion   ------
   | OP |---------------------->| MP |
   ------                       ------

   In the simple marking scheme, only three packet types are used.
   Refresh packets are treated as ordinary packets, except that these
   packets cannot be changed to marked packets.
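
   The permitted changes of packet type can be summarised in a few
   lines of Python.  The sketch is illustrative only: the names
   PacketType and remark, and the two boolean router states, are
   assumptions of the example.  The rule that probe packets are also
   marked under severe congestion follows the behavior description of
   the core routers in this document.

```python
from enum import Enum


class PacketType(Enum):
    ORDINARY = "OP"
    PROBE = "PP"
    MARKED = "MP"
    REFRESH = "RP"


def remark(packet: PacketType, rejecting: bool,
           severe_congestion: bool) -> PacketType:
    """Packet type after the packet has passed one core router.

    rejecting: the router refuses new flows, so probes become marked.
    severe_congestion: the router marks ordinary packets as well.
    Refresh packets and already marked packets are never changed.
    """
    if packet is PacketType.PROBE and (rejecting or severe_congestion):
        return PacketType.MARKED
    if packet is PacketType.ORDINARY and severe_congestion:
        return PacketType.MARKED
    return packet


print(remark(PacketType.PROBE, rejecting=True, severe_congestion=False).value)
# MP
```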


5.2.  Coding of Packet Types

   We have two alternative solutions for storing Load Control related
   information in the packet headers: using new DS codepoints or using
   the two currently unused bits (intended for ECN) in the DS byte.  The
   latter case is only considered in Appendix E.

   In the first alternative (where PHBs are intended to be used together
   with Load Control), two or three new codepoints would have to be
   defined for probe, marked and (optionally) refresh packets. For
   example, in the case of the EF PHB, in addition to the codepoint used
   for the EF packets, EF-probe, EF-marked and EF-refresh packets can
   also be sent. The new codepoints can be drawn from the LU/EXP space.



5.3.  Behavior Description

   The behavior of the edge devices depends greatly on the application
   or signaling protocol that uses the load control scheme. Below we
   only describe the few aspects of the edge device behavior that are
   necessary for interworking with the core routers.


5.3.1.  Behavior of the Core Routers

   All core routers continuously maintain a state of accepting or
   rejecting more flows.  If the state is accepting, the router passes
   all packets unchanged. If the state is rejecting, then the router
   changes the marking of incoming probe packets to marked.

   If the router is capable of detecting severe congestion, and this
   occurs, then the router forwards both ordinary and probe packets as
   marked.  The router MUST NOT change the marking of refresh packets.

    Addition for Unit-based Reservations:

      The router uses the refresh and probe markers in packets to
      maintain its estimation of reserved resources. A refresh packet
      signals previously admitted resource usage, while a probe packet
      signals a new request. When passed unmarked, both types of packets
      reserve one unit for one refresh period.


5.3.2.  Behavior of the Edge Devices

   When a new reservation is needed, the ingress edge device should send
   the appropriate number of packets marked as probe.

   If the egress edge device receives a probe packet that is marked,
   this means that the network has insufficient capacity along the path
   between the two edge devices. The egress edge device should take care
   of blocking the flow by notifying the ingress device.  If the egress
   device receives a marked packet that was not originally sent as a
   probe packet, it shall inform the ingress device so that it can drop
   previously admitted flows.  Whether a marked packet was sent as a
   probe can be determined from the packet content, by multi-field
   classification of the IP header, or by checking the admittance state
   at the egress edge device.

    Addition for Unit-based Reservations:


      For the unit-based reservation scheme, the ingress edge device
      should generate the required number of refresh packets per refresh
      period and per flow. If there are not enough data packets to mark
      as refresh packets, the ingress device must generate dummy packets
      and mark those as refresh packets.  The generated refresh packets
      should be as uniformly distributed through the refresh interval as
      possible to minimize the effect of refresh interval timing between
      routers.


6.  Interworking with RSVP/Intserv

   Load control can also be used in DiffServ regions (backbones) that
   connect RSVP/Intserv regions. This inter-operation is described in
   detail in [Bernet99]. For load control, border routers of the
   DiffServ region must be RSVP-aware in order to detect the arrival of
   new connections.

   RSVP PATH messages can be used as probe packets to gather congestion
   information along the path between the two border routers. When a new
   RSVP path state is installed at the egress border router, the
   collective admission state of the path (collected in the packet of
   the PATH message) is also stored. If a RESV message for the installed
   state arrives within a time period during which the congestion state
   can be considered valid, then the egress border router can perform
   the admission control for the DiffServ network as well. If the first
   RESV message arrives too late, then the egress border router MUST
   solicit a new (dummy) probe packet from the ingress router to
   determine the current congestion state.
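
   The decision taken by the egress border router on arrival of the
   first RESV message can be sketched as follows.  The PathState
   record, the on_resv function, the returned strings and the
   validity_s parameter are all hypothetical; in particular, this
   document does not specify how long the collected congestion state
   remains valid.

```python
from dataclasses import dataclass


@dataclass
class PathState:
    """Hypothetical per-session state kept at the egress border router."""
    probe_marked: bool     # admission state collected by the PATH message
    installed_at: float    # time the path state was installed, in seconds


def on_resv(state: PathState, now: float, validity_s: float) -> str:
    """Decide what the egress does when the first RESV message arrives."""
    if now - state.installed_at > validity_s:
        return "solicit-probe"   # stored congestion state is too old
    return "reject" if state.probe_marked else "admit"


fresh = PathState(probe_marked=False, installed_at=10.0)
print(on_resv(fresh, now=10.2, validity_s=1.0))   # admit
print(on_resv(fresh, now=12.0, validity_s=1.0))   # solicit-probe
```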

   When the egress receives a marked packet that is neither a PATH message
   nor a dummy probe packet, this signals a severe congestion state
   along the path. The identity of the ingress router can easily be
   determined from the path state, but in this case the egress router
   can itself decide to drop certain reservations. The ingress router
   can be notified via ResvTear messages while the receiver end systems
   get ResvErr messages.

   RSVP routers can also be placed inside the domain. In this case,
   probing is performed between RSVP routers instead of edge devices.
   Thus, by adding a simple and cheap extension to non-RSVP capable
   routers, correct admission control is possible on non-RSVP capable
   parts of an end-to-end path.

   Unit-based reservations can also be used to provide resources in a DS
   domain that is used to provide VPN tunnels between customer sites.
   Using a load control scheme, it is fast and easy to modify the size
   of these tunnels. Thus, tunnel size selection can be a very dynamic
   process. Note that tunnels are not necessarily real-time tunnels.
   Packets of any DSCP can travel on them after receiving the
   appropriate PHB. Even best-effort tunnels can be reserved this way.
   Provisioning can be done on a per-DSCP basis or in aggregates as the
   service provider wishes.


7.  Security Considerations

   We propose using two-bit markers in packet headers (DS field) to
   reserve resources within a DiffServ domain. This poses similar
   security problems to the use of the DS field to differentiate packets
   in general [RFC2475].

   If the interior of the DS domain fully contains a tunnel, then by
   copying the outer marking into the inner header at de-encapsulation,
   load control can be exercised over the links of the tunnel as well.
   The procedure is similar to the one described in [RFC2481]. As IPSec
   [RFC2402, 2406] does not allow the copying of the DS field from the
   outer to the inner header at de-encapsulation, load control cannot be
   exercised over regions where IPSec tunnels are used.


8.  Identification of Edge Nodes

   In the absence of RSVP, an alternative method for identification of
   edge nodes will be required.  This section needs to be written.


9.  Multicast-related Issues

[RFC2406] Kent, S. and R. Atkinson, "IP Encapsulating Security Payload
   (ESP)", RFC 2406, November 1998.

[Bernet99] Bernet, Y., Yavatkar, R., Ford, P., Baker, F., Zhang, L.,
   Speer, M., Braden, R., "Interoperation of RSVP/Intserv and Diffserv
   Networks", Work in Progress, March 1999.

[Stoica99] Stoica, I., et al., "Per Hop Behaviors Based on Dynamic Packet
   States", Work in Progress, February 1999.


[Berson97] Berson, S. and Vincent, R., "Aggregation of Internet
   Integrated Services State", Work in Progress, December 1997.

[Guerin97] Guerin, R., Blake, S. and Herzog, S., "Aggregating RSVP based
   QoS Requests", Work in Progress, November 1997.

[Gross99] Grossglauser, M., Tse, D. N. C., "A Time-Scale Decomposition
   Approach to Measurement-Based Admission Control", Infocom '99.

[Tur99] Turanyi, Z. R., Westberg, L., "Load Control: Lightweight
   Provisioning of Internet Resources", submitted to Networking 2000,
   Paris, May 2000, http://www.ericsson.co.hu/ethzrt/

[IAB-QoS] G. Huston (Internet Architecture Board), "Next Steps for the
   IP QoS Architecture", Work in Progress, March 2000.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
   Requirement Levels", BCP 14, RFC2119, March 1997.


Appendix A. Admission Precision of Simple Marking

   Simple marking is basically a measurement-based admission control
   scheme, where flows do not say anything about their traffic
   characteristics. In addition, flow departure is not signaled
   explicitly.

   When the network carries multiple flow types with different bandwidth
   requirements, the core routers do not know the bandwidth requirements
   of the incoming flows. They simply declare whether they will accept
   more flows or not irrespective of the bandwidth demands of the new
   flow. Thus the marking algorithm in the routers should conservatively
   always expect the largest type of flow that the network carries and
   start rejecting flows when there is not enough bandwidth left for one
   such flow.  On the positive side, this will result in fair rejection
   among different flow types, but on the negative side, some bandwidth
   will be wasted.  However, if the links of our domain can carry at
   least several hundred requests even from the most bandwidth-demanding
   types of flow, then this is not a significant waste.


Appendix B. Effect of Delays on Admission

   When a probe packet is passed unmarked without correcting the
   estimate of the free resources, we in fact admit a flow without
   immediately reserving resources for it.  The reservation will be
   implicitly done later by the arriving traffic or refresh packets of
   the flow. During the time between admission and the arrival of the
   traffic of the flow, new requests can be admitted without taking the
   previously admitted flow into account.  To illustrate the effects of
   this delay, we took an old and simple Markovian example. Flows are
   identical with an average flow-holding time of 180 seconds and flow
   arrivals and departures follow a Poisson process. Let the link be
   able to carry N calls and let the delay be T. The link starts
   refusing flows when the measured traffic exceeds N-H calls. We can
   say that a space of size H is put aside to cater for the errors
   caused by the delay.

   If the link is properly dimensioned, then the usual blocking ratio
   should not exceed 1%. However, in a mass call situation (such as
   occurs on New Year's Eve, for example) it can be considerably higher.
   In this example, 50% blocking was chosen to demonstrate the extreme
   load case. Thus, the offered traffic is roughly twice the link
   capacity.

   QoS violation occurs if during time T the difference between the
   number of arriving and departing flows is larger than H. Under the
   above assumptions, the chance of QoS violation can be calculated.
   Naturally the larger H is, the less the chance is that QoS will be
   violated. The required value of H can be determined for a low value
   of QoS violation probability (e.g., 10^-5).

   The following table presents the value of H as a function of link
   size (N), delay length (T) and load (causing 1% or 50% blocking).

         |   T=1ms   |  T=10ms   |  T=100ms  |  T=500ms  |   T=1s    |
       N | 1%  | 50% | 1%  | 50% | 1%  | 50% | 1%  | 50% | 1%  | 50% |
   ------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
      50 |  2  |  2  |  2  |  3  |   3 |   4 |   4 |   5 |   5 |   7 |
     100 |  2  |  2  |  3  |  3  |   4 |   4 |   4 |   7 |   6 |   9 |
     500 |  2  |  3  |  3  |  4  |   4 |   7 |   9 |  13 |  12 |  18 |
    1000 |  3  |  3  |  4  |  4  |   5 |   9 |  12 |  18 |  16 |  25 |
    5000 |  3  |  4  |  5  |  7  |  12 |  18 |  24 |  44 |  33 |  69 |
   10000 |  4  |  4  |  7  |  9  |  16 |  25 |  33 |  69 |  47 | 113 |

   Relative to link size, the required safety margin is highest for
   small links, since less statistical multiplexing is possible there.
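
   The calculation outlined in this appendix can be approximated with
   the short Python sketch below.  It rests on assumptions that go
   beyond the text: the link is taken to be full, so that departures
   during the delay T are Poisson with mean N*T/180; all requests
   arriving during T are admitted and are Poisson with mean
   offered_load*N*T/180 (offered_load = 2 corresponds to the "roughly
   twice the link capacity" case); and a violation is the event that
   arrivals minus departures exceed H.  The function names are
   hypothetical, and the sketch is not claimed to reproduce the
   figures in the table exactly.

```python
import math
from typing import List


def poisson_pmf(mean: float, k_max: int) -> List[float]:
    """Poisson probabilities P(X = k) for k = 0..k_max."""
    pmf = [math.exp(-mean)]
    for k in range(1, k_max + 1):
        pmf.append(pmf[-1] * mean / k)
    return pmf


def violation_probability(n: int, delay_s: float, offered_load: float,
                          margin: int, holding_s: float = 180.0) -> float:
    """P(arrivals - departures > margin) during one delay interval."""
    mean_arr = offered_load * n * delay_s / holding_s
    mean_dep = n * delay_s / holding_s
    biggest = max(mean_arr, mean_dep)
    # Truncate both distributions far out in the tail.
    k_max = int(biggest + 12 * math.sqrt(biggest) + 30)
    arr = poisson_pmf(mean_arr, k_max)
    dep = poisson_pmf(mean_dep, k_max)
    # dep_cdf[d] = P(departures <= d)
    dep_cdf, total = [], 0.0
    for p in dep:
        total += p
        dep_cdf.append(total)
    prob = 0.0
    for a in range(margin + 1, k_max + 1):
        # With a arrivals, a violation needs departures <= a - margin - 1.
        prob += arr[a] * dep_cdf[a - margin - 1]
    return prob


def required_margin(n: int, delay_s: float, offered_load: float,
                    target: float = 1e-5) -> int:
    """Smallest margin H whose violation probability is below target."""
    h = 0
    while violation_probability(n, delay_s, offered_load, h) >= target:
        h += 1
    return h


for delay in (0.1, 1.0):
    print(delay, required_margin(1000, delay, offered_load=2.0))
```

   The starting value exp(-mean) underflows for very large means, so
   the sketch is only suitable for link sizes and delays of the order
   used in this appendix.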


Appendix C. A Simple Algorithm for Core Routers

   In this appendix, we present an algorithm for core routers that use
   unit-based reservations. The algorithm is simple, so it can be easily
   implemented in hardware by simple counters. Its inputs are the
   refresh interval and the number of flows allowed on the link. The
   latter is denoted by <threshold>. (We assume flows with similar
   characteristics (e.g., voice) and that one flow sends one refresh
   packet per refresh interval.) If the network uses several DSCPs for
   real-time traffic, then a separate copy of the algorithm may be run
   for each DSCP, resulting in per-DSCP admission.

   The algorithm counts the number of refresh and admitted probe packets
   in refresh intervals (<count>). The result of the counting is an
   upper limit on the number of units reserved on the link, as some
   reservations may have gone by the end of the refresh interval. The
   value of this counter is used in the next interval to decide on
   admission (<last>). When a new reservation is admitted, this value is
   increased to take the new reservation into account. If this value is
   high above the admission limit, then we start sending severe
   congestion notifications by marking regular packets as well.

      On initialization:
         last = 0
         count = 0

      On arrival of a refresh packet:
         count++

      On arrival of a probe packet:
         if last < threshold then
            last++
            count++
         else
            Mark Packet
         endif

      On arrival of a regular packet:
         if last > threshold*1.1 then
            Mark Packet
         endif

      At the end of the refresh interval:
         last = count
         count = 0
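
   The counters above can be sketched in Python as follows. This is a
   minimal illustration, not a normative implementation: the class and
   method names are hypothetical, and, following the prose description,
   regular packets are marked only once <last> exceeds the threshold by
   10%. Each handler returns whether the packet should be marked.

```python
class CoreRouterCounter:
    """Per-link (or per-DSCP) counters for unit-based reservations.

    Hypothetical names; only `last`, `count` and `threshold` come from
    the pseudocode in this appendix.
    """

    def __init__(self, threshold, severe_factor=1.1):
        self.threshold = threshold          # flows allowed on the link
        self.severe_factor = severe_factor  # 1.1 as in the pseudocode
        self.last = 0    # estimate carried over from the previous interval
        self.count = 0   # refresh + admitted probe packets in this interval

    def on_refresh(self):
        """A refresh packet renews one existing reservation."""
        self.count += 1

    def on_probe(self):
        """Return True if the probe must be marked (flow rejected)."""
        if self.last < self.threshold:
            self.last += 1
            self.count += 1
            return False
        return True

    def on_regular(self):
        """Return True if a regular packet must be marked (severe congestion)."""
        return self.last > self.threshold * self.severe_factor

    def end_of_interval(self):
        """Roll the counters over at the end of the refresh interval."""
        self.last = self.count
        self.count = 0


# Usage: with a threshold of 2, the third probe in an interval is marked.
router = CoreRouterCounter(threshold=2)
print([router.on_probe() for _ in range(3)])
```

   Returning a boolean instead of modifying a packet keeps the sketch
   independent of any particular packet representation.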


Appendix D. Simulation Results

   The purpose of the simulations described in this appendix is to give
   some insight into the performance of load control. The simulation
   cases are by no means representative, and the scheme may work
   differently in other situations. In appendix D.1, the simple marking
   case is demonstrated with a purely measurement-based admission
   algorithm by using a single link with both constant bit-rate and
   on/off sources. In appendix D.2, the unit-based reservation method is
   shown, using the algorithm in appendix C.

   Severe congestion signaling is not used in any of the examples; only
   admission control is used.

   We simulated a very simple network of one link. This can be viewed as
   the single bottleneck in the domain. The link had a 2 Mbit/s
   throughput, 50% of which was designated to carry real-time traffic.
   The round trip propagation delay was set to 100 ms. The real-time
   flows arrived according to a Poisson process, and holding times were
   exponentially distributed with a mean of 90 seconds. The arrival rate
   of flows was set to produce approximately 50% blocking. Only
   real-time traffic was simulated, so scheduling was simple FIFO.
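
   As a rough illustration of how an arrival rate for a given blocking
   target could be estimated, the sketch below uses the Erlang B formula.
   This is an assumption made for illustration only: it treats the link
   as a loss system that holds a fixed number of equal flows (62 here,
   chosen as an example), which is not how the measurement-based
   simulations admit flows, and the draft does not state how its arrival
   rate was chosen. The function names are hypothetical.

```python
def erlang_b(offered_load, servers):
    """Blocking probability of an M/M/n/n loss system (Erlang B recursion)."""
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


def load_for_blocking(target, servers, lo=0.0, hi=1000.0):
    """Bisect for the offered load (in erlangs) that gives the target blocking.

    Relies on Erlang B blocking increasing monotonically with the load.
    """
    for _ in range(60):
        mid = (lo + hi) / 2
        if erlang_b(mid, servers) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


HOLDING_TIME_S = 90.0  # mean holding time from the simulation setup
FLOWS_ON_LINK = 62     # assumption: illustrative fixed flow limit

load = load_for_blocking(0.5, FLOWS_ON_LINK)
arrival_rate = load / HOLDING_TIME_S  # flows per second
print(f"offered load {load:.1f} erlangs, {arrival_rate:.2f} flows/s")
```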


D.1 Simple Marking


D.1.1 Constant Bit-Rate Sources

   In the first case, flows emitted 40 byte long packets every 20 ms,
   producing a constant 16 kbit/s load. The 1 Mbit/s capacity assigned
   to this traffic can thus carry 62.5 flows. From the table in appendix
   A, we can see that 4 calls should be reserved in addition to the
   62.5. After an initial transient of 5 minutes, we simulated 2.5
   hours.
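
   The figures in this paragraph, and the 1.064 Mbit/s (66.5 calls) level
   used for the violation measurement, follow from simple arithmetic. The
   sketch below reproduces them; the constant names are illustrative.

```python
PACKET_BITS = 40 * 8                # 40 byte packets
PACKET_INTERVAL_MS = 20             # one packet every 20 ms
REAL_TIME_CAPACITY_BPS = 1_000_000  # 50% of the 2 Mbit/s link
SAFETY_MARGIN_FLOWS = 4             # extra calls quoted in the text

# Per-flow rate in bit/s (integer arithmetic avoids rounding noise).
flow_rate_bps = PACKET_BITS * 1000 // PACKET_INTERVAL_MS

# Number of such flows that fit into the real-time share of the link.
flows_at_capacity = REAL_TIME_CAPACITY_BPS / flow_rate_bps

# Load level corresponding to capacity plus the safety margin.
violation_level_bps = (flows_at_capacity + SAFETY_MARGIN_FLOWS) * flow_rate_bps

print(flow_rate_bps, flows_at_capacity, violation_level_bps)
```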

   During the 2.5 hour simulation time, utilization was measured over
   5-minute intervals. Utilization was also measured in 20ms slots and
   the percentage of slots in which it was above 1.064 Mbit/s (66.5
   calls) was counted.

      min/avg/max of the utilization was: 881 / 899 / 914 kbit/s
      min/avg/max of the violation ratio was: 98.96% / 99.78% / 100%


D.1.2 On/Off Sources

   In the second simulation case, on/off sources were used. During an
   "off" period, no packets were generated, while in the "on" state the
   behavior was the same as in the previous case: 40 byte long packets
   20 ms apart. The lengths of the on and off periods were both drawn
   from a Pareto distribution with a shape parameter of 1.1 and a mean
   of 5 seconds. The average bit-rate of the sources is thus 8 kbit/s.
   The flow arrival rate was doubled to produce 50% blocking, since the
   link is capable of carrying nearly twice the number of flows. The
   same set of measurements was carried out as in the previous case.
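
   A source of this kind could be generated as sketched below. The sketch
   assumes the standard Pareto distribution with minimum value (scale)
   x_m, whose mean is shape * x_m / (shape - 1); the function names and
   the choice to start in the "on" state are illustrative and not taken
   from the simulator used here. With a shape of 1.1 the tail is very
   heavy, so sample means converge slowly towards the 5 second mean.

```python
import random

SHAPE = 1.1             # Pareto shape parameter from the text
MEAN_PERIOD_S = 5.0     # mean length of both on and off periods
PEAK_RATE_BPS = 16_000  # rate while "on": 40 byte packets every 20 ms

# Scale (minimum value) that gives the requested mean for this shape.
SCALE_S = MEAN_PERIOD_S * (SHAPE - 1) / SHAPE


def draw_period(rng):
    """Draw one on or off period length in seconds."""
    # paretovariate() samples a Pareto distribution with scale 1.
    return SCALE_S * rng.paretovariate(SHAPE)


def on_off_schedule(rng, periods):
    """Return a list of (is_on, duration_s) pairs, starting in the on state."""
    return [(i % 2 == 0, draw_period(rng)) for i in range(periods)]


def mean_rate_bps():
    """Long-run average rate: peak rate times the fraction of time spent on."""
    return PEAK_RATE_BPS * MEAN_PERIOD_S / (MEAN_PERIOD_S + MEAN_PERIOD_S)


rng = random.Random(2000)
for is_on, duration in on_off_schedule(rng, 4):
    print("on " if is_on else "off", f"{duration:.2f} s")
```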

      min/avg/max of the utilization was: 808 / 819 / 837 kbit/s
      min/avg/max of the violation ratio was: 98.98% / 99.40% / 99.70%

It can be seen that although the measurement-based approach was not able
to prevent the over-use of the real-time resources in this high load
case, it is a viable alternative. In no case did the 20 ms measurements
exceed 1.15 Mbit/s, so the over-use just means a temporary steal from
the resources provisioned to the lower priority traffic.


D.1.3 The Router Algorithm

   The MBAC algorithm used by the router is presented here only for the
   completeness of the simulation description. The marking strategy was
   the same for both types of traffic. The router counts the number of
   bytes transmitted in every 20 ms interval and calculates the average
   bit rate in each of these 20 ms slots. It then smooths these values
   in time through an exponentially weighted moving average (EWMA)
   filter. The window size of the EWMA was set to 9 seconds, i.e., when
   a unit step function is run through it, the output reaches 0.63 after
   9 seconds. The algorithm also maintains a histogram of the difference
   between the original slot values and the filtered values. The
   histogram has 1000 bins covering the range from -1 to +1 Mbit/s. The
   99% quantile of the histogram was calculated every 100 seconds. The
   router marks all passing packets if the sum of the output of the EWMA
   filter and the calculated quantile is greater than 1 Mbit/s. The
   router makes no correction to its measurements when a new flow is
   admitted.

   Thus, the target violation probability was set to 1%, which was in
   fact fulfilled in the long run.

   On arrival of a new packet, only counters are incremented. Every 20
   ms a new value for the ewma must be calculated, a marking decision
   must be made for the next 20 ms and the value of one bin in the
   histogram must be increased. Every 100 seconds, the 99% quantile
   value must be looked up in the histogram and the histogram must be
   initialized.

   The interested reader can read more about the design rationale of the
   above algorithm in [Gross99].
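
   The following Python sketch shows one way the description above could
   be realized. It is not the code used in the simulations. Several
   details are assumptions made for illustration: the EWMA gain is
   derived from the 9 second time constant as 1 - exp(-slot/9 s), the
   difference is taken against the already updated EWMA value, the
   quantile is zero until the first 100 second period has elapsed,
   out-of-range differences are clamped to the outermost bins, and the
   quantile is reported as the upper edge of its bin. All names are
   hypothetical.

```python
import math

SLOT_S = 0.020                # measurement slot length
EWMA_TIME_CONSTANT_S = 9.0    # step response reaches 0.63 after 9 s
QUANTILE_PERIOD_SLOTS = 5000  # 100 s / 20 ms
BINS = 1000
RANGE_BPS = 1_000_000.0       # histogram covers -1 .. +1 Mbit/s
LIMIT_BPS = 1_000_000.0       # real-time capacity
QUANTILE = 0.99


class MbacRouter:
    def __init__(self):
        self.alpha = 1.0 - math.exp(-SLOT_S / EWMA_TIME_CONSTANT_S)
        self.ewma = 0.0
        self.hist = [0] * BINS
        self.quantile_bps = 0.0
        self.slots = 0
        self.bytes_in_slot = 0
        self.marking = False

    def on_packet(self, size_bytes):
        """Per-packet work is one counter update; returns the mark decision."""
        self.bytes_in_slot += size_bytes
        return self.marking

    def end_of_slot(self):
        """Every 20 ms: update the EWMA and histogram, decide on marking."""
        rate = self.bytes_in_slot * 8 / SLOT_S
        self.bytes_in_slot = 0
        self.ewma += self.alpha * (rate - self.ewma)
        diff = rate - self.ewma
        idx = int((diff + RANGE_BPS) / (2 * RANGE_BPS) * BINS)
        self.hist[min(max(idx, 0), BINS - 1)] += 1
        self.slots += 1
        if self.slots == QUANTILE_PERIOD_SLOTS:
            self.quantile_bps = self._quantile()
            self.hist = [0] * BINS
            self.slots = 0
        self.marking = self.ewma + self.quantile_bps > LIMIT_BPS

    def _quantile(self):
        """Upper edge of the first bin at which 99% of the samples are covered."""
        target = QUANTILE * sum(self.hist)
        seen = 0
        for i, n in enumerate(self.hist):
            seen += n
            if seen >= target:
                return (i + 1) * 2 * RANGE_BPS / BINS - RANGE_BPS
        return RANGE_BPS
```

   As in the description, the per-packet path only increments a counter,
   while the heavier work runs once per slot or once per 100 seconds.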


D.2 Unit-Based Reservations

   In this section we demonstrate the unit-based reservation scheme. The
   routers use the simple algorithm in Appendix C, except that it never
   marks regular packets. The simulation setup is otherwise the same as
   in the previous section. The traffic inside the flows does not affect
   the admission algorithm, so during simulation, sources send only
   probe and refresh packets. The definition of the unit is a peak
   bit-rate of 16 kbit/s. The flow number threshold was set to 62 flows,
   resulting in close to the same target utilization of 1 Mbit/s as in
   appendix D.1. The length of the refresh period was varied between
   100 ms and 10 seconds. The actual number of flows on the link never
   exceeded 62 (no violation), so only the utilization values are shown
   in kbit/s.

                      | interval | min | avg | max |
                      +----------+-----+-----+-----+
                      |    --    | 968 | 972 | 976 |
                      | 100 ms   | 952 | 954 | 959 |
                      |  1 sec.  | 941 | 946 | 949 |
                      |  2 sec.  | 927 | 933 | 936 |
                      |  4 sec.  | 908 | 913 | 920 |
                      |  7 sec.  | 861 | 870 | 879 |
                      | 10 sec.  | 827 | 837 | 852 |

The first line shows the utilization value for the case when the source
limits itself to 62 flows, i.e., blocking is not done by the network,
but by the source. This emulates the case when the refresh period is
infinitely short or when a state approach is used, as in RSVP. The
utilization is not 100% due to the burstiness of the arrivals.

It can be seen that as the refresh packets become less frequent, more
resources are wasted, as the resources allocated to departing flows
remain allocated until the end of the next refresh period. The result is
not only lower average utilization, but lower maximal utilization as
well. When the refresh period is 10 seconds long, the highest
utilization experienced was 952 kbit/s, which is 3 units below the
limit.

This motivates the use of as short a refresh period as possible.
However, too short a refresh period will increase the effects of clock
differences between edge and core devices (which was not taken into
account during simulation). It also decreases the chance of finding a
packet to mark as refresh if the flow is currently transmitting below
its reserved rate.


Appendix E: Marking using ECN bits

If the ECN bits were to be used for load control marking, the values are
encoded in the two unused bits as described below, and the DS field
contains the PHB.

               DS byte    Load Control
               01234567   codepoint (in ECN)
               -----------------------------
               xxxxxx00   Ordinary
               xxxxxx01   Probe
               xxxxxx10   Marked
               xxxxxx11   Refresh

               The interpretation of the two unused bits remains
               unspecified for other PHBs that do not support Load
               Control. This is done so as not to interfere with
               possible ECN deployment [RFC2481].
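
The table above can be expressed as a small encode/decode helper. The
sketch below is illustrative only: the names are hypothetical, and it
assumes the usual IETF bit numbering in which bit 0 is the most
significant bit, so that the two ECN bits (bits 6 and 7) are the two
least significant bits of the DS byte.

```python
from enum import IntEnum


class LoadControlCodepoint(IntEnum):
    """Load Control packet types as carried in the two ECN bits."""
    ORDINARY = 0b00
    PROBE = 0b01
    MARKED = 0b10
    REFRESH = 0b11


ECN_MASK = 0b00000011  # bits 6 and 7 of the DS byte


def get_codepoint(ds_byte):
    """Extract the Load Control codepoint from a DS byte."""
    return LoadControlCodepoint(ds_byte & ECN_MASK)


def set_codepoint(ds_byte, codepoint):
    """Return the DS byte with its ECN bits replaced; the PHB bits are kept."""
    return (ds_byte & ~ECN_MASK & 0xFF) | int(codepoint)


# Usage: a core router marking a probe packet without touching the PHB bits.
ds = 0b10111001  # example PHB bits with the Probe codepoint
if get_codepoint(ds) is LoadControlCodepoint.PROBE:
    ds = set_codepoint(ds, LoadControlCodepoint.MARKED)
print(f"{ds:08b}")
```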


Table of Contents


1 Abstract ........................................................    2
2 Background and Motivation .......................................    3
3 Overview ........................................................    3
4 Operation of Load Control .......................................    4
4.1 Simple Marking ................................................    5
4.2 Unit-based Reservations .......................................    6
4.3 Multiple Unit reservation .....................................    8
4.4 Codepoints for Flow Types .....................................    8
5 Objects for Standardization .....................................    8
5.1 Packet Types ..................................................    9
5.2 Coding of Packet Types ........................................    9
5.3 Behavior Description ..........................................   10
5.3.1 Behavior of the Core Routers ................................   10
5.3.2 Behavior of the Edge Devices ................................   10
6 Interworking with RSVP/Intserv ..................................   11
7 Security Considerations .........................................   12
8 Identification of Edge Nodes ....................................   12
9 Multicast-related Issues ........................................   12
 Appendix A. Admission Precision of Simple Marking ................   13
 Appendix B. Effect of Delays on Admission ........................   14
 Appendix C. A Simple Algorithm for Core Routers ..................   15
 Appendix D. Simulation Results ...................................   16
 D.1 Simple Marking ...............................................   16
 D.1.1 Constant Bit-Rate Sources ..................................   16
 D.1.2 On/Off Sources .............................................   17
 D.1.3 The Router Algorithm .......................................   17
 D.2 Unit-Based Reservations ......................................   18
 Appendix E: Marking using ECN bits ...............................   19


Authors' Addresses

Lars Westberg
Ericsson Research
Kistagangen 26
SE-164 80 Stockholm
Sweden
EMail: Lars.Westberg@era-t.ericsson.se

Zoltan R. Turanyi
Ericsson Telecommunications
Budapest, Laborc u. 1
H-1037
Hungary
EMail: Zoltan.Turanyi@ericsson.com

David Partain
Ericsson Radio Systems AB
P.O. Box 1248
SE-581 12  Linkoping
Sweden
EMail: David.Partain@ericsson.com

