Internet Draft                Load Control                April 2000

Load Control of Real-Time Traffic
draft-westberg-loadcntr-03.txt
Document Revision: 1.3 2000/04/19 12:43:19

A Two-bit Resource Allocation Scheme
April 2000

L. Westberg
Z. R. Turanyi
D. Partain
Ericsson

Various Authors               Expires October 2000

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

1. Abstract

The purpose of this memo is to present a new resource allocation scheme for DiffServ (DS) networks, called Load Control. The main purpose of Load Control is to provide a simple and scalable solution to the resource provisioning problem. Load Control addresses two particular issues:

1. Measurement-based access control, whereby a probe packet is sent along the forwarding path in a network to determine whether a flow can be admitted based upon the current congestion state of the network

2. A lightweight reservation of a certain amount of network resources.

Load Control uses two-bit markers in packet headers to carry load information from core routers to edge devices. The scheme provides the capability of controlling the traffic load in the network without requiring signaling or any per-flow processing in the core routers.
The complexity of Load Control is kept to a minimum to make implementation simple.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Background and Motivation

The amount of traffic carried on the Internet is now greater than the traffic on the world's telephony network. Still, Internet-based communication services generate less income than plain old telephony services. Enabling value-added services over the Internet is therefore crucial for service providers. One significant class of such value-added services requires real-time packet transportation. It can be expected that these real-time services will be popular as they replicate or are natural extensions of existing communication services like telephony.

Exact and reliable resource management (e.g., admission control) is essential for achieving high utilization in networks with real-time transportation capabilities. The problem is difficult mainly due to scalability issues.

With the introduction of differentiated services (DS) [RFC2475], it is now possible to provide large scale, real-time services. The basic idea of DiffServ is that, rather than classifying packets at each router, packets are only classified at the edge devices. The result - the required packet treatment - is stored and carried in the packet headers, and core routers can carry out appropriate scheduling.

The current definition of DiffServ, however, does not contain any simple, scalable solution to the problem of resource provisioning and control. A number of approaches to solving the problem already exist [Berson97, Guerin97, Stoica99, Bernet99].
The scheme presented in this document does not require any state aggregation and aims at extreme simplicity and low cost of implementation along with good scaling properties. Load control operates edge-to-edge in a DS domain, or between two RSVP-capable routers, where only the edge devices keep flow state and do per-flow processing. The main purpose of Load Control is to provide a simple and scalable solution to the resource provisioning problem.

3. Overview

Load control is achieved by two actions: measurement-based admission control of incoming requests and the dropping of admitted flows in case of exceptional events such as link failures. Load Control uses two-bit markers in the packet headers to gather information about the load level along various paths through the network. The core routers are able to mark passing packets to signal the exhaustion of resources to the edge devices.

For admission control, the resource state of core routers is gathered by sending a specially marked packet, denoted a "probe" packet, from the ingress to the egress edge device. The probe result is then used by the ingress to decide flow acceptance or rejection and to set up traffic conditioning/policy.

If rigid admission control is required, soft-state based reservations are also supported. In this case the probe packet does both the probing and allocation of resources along the path. The latter method is comparable to signaling based schemes but does not require processing of signaling messages in the core routers.

Under normal circumstances, admission control is enough to control the load in the network. Nevertheless, when exceptional events (such as link failures) cause too much traffic to be re-routed over a link, the resulting severe congestion may degrade the quality of all of the flows on the link. In that case, the best solution might be to keep existing flows and suffer the loss of quality.
However, for some services, it may be desirable to drop some of the previously admitted flows to protect the quality of the remaining flows. Thus, when severe congestion occurs, the core routers mark the headers of all (not only probe) packets to notify the edge devices of the congestion condition.

In the following sections, we assume a DS (DiffServ) domain where connection requests arrive at the edges of the domain via RSVP, at the request of a Bandwidth Broker, or by other means. The requests may be generated directly at the edge by a gateway, which provides connection to other types of networks, or in hosts that are connected directly to the domain.

4. Operation of Load Control

The load control scheme has two modes of operation:

a) 'Simple marking': This refers to a measurement-based admission scheme where routers measure the traffic volume and base the marking on these results.

b) 'Unit-based reservations': A "unit" represents a share of bandwidth in the network that could be reserved by the edge devices. This mode makes it possible to perform resource reservations, independently of the amount of traffic that is actually transmitted.

Both modes can perform admission control of incoming requests and indicate exceptional events. In the appendices, we present some analysis of Load Control properties, but a more detailed investigation can be found in [Tur99].

4.1. Simple Marking

The idea of simple marking is that core routers measure the traffic, and, if they encounter near exhaustion of resources, they mark passing probe packets and thereby notify the edge devices of the lack of resources. The scheme has the following steps of operation:

1) Resource Probing: Before establishing the flow, the initiating edge device sends a probe packet into the network.
The probe packet passes, with a high degree of probability, through the same routers as the actual traffic will pass through and is exposed to the marking function in each router. The marking function performs an OR operation on the router's own status and the status carried by the incoming probe packet (a packet, once marked, must not be changed). When the packet reaches the egress edge device, its header will reflect the aggregated resource status along that path.

2) Send resource status to ingress: When the egress edge device receives the probe packet, it copies the marker from the header to the header/payload of a reverse packet and sends it back to the initiating party (the ingress edge device). The probe packet may be discarded, converted to an ordinary data packet, or encapsulated (as mentioned above) and sent to the ingress edge device. The packet containing the probing result can also serve as a probe packet for the reverse path. This allows the initiating party to check for bi-directional resources.

3) Acceptance/Rejection: The report packet is returned to the initiating ingress edge device, which uses the result of the probe to admit or block the request by setting up appropriate packet filtering, measuring, and marking rules.

4) Reaction to exceptional events: If a core router detects severe congestion on an interface, it starts marking all packets on that interface. If the egress edge device receives a marked packet which is not a probe packet, this can be interpreted as a sign of severe congestion along the path. The fact that the incoming marked packet was not sent as a probe packet can be determined from the packet content, by multi-field classification or by checking the admittance state at the egress edge device. If severe congestion occurs, a signaling message can be sent to the ingress edge device, which can then take the appropriate action.
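The four steps above can be illustrated with a minimal sketch. It assumes that each core router exposes two booleans (congested, severely congested) and that packets carry one of three types; the names PacketType, core_router_forward, probe_path, ingress_admits and egress_sees_severe_congestion are hypothetical and are not defined by this draft.

```python
from enum import Enum


class PacketType(Enum):
    # Hypothetical names for the three types used in simple marking.
    ORDINARY = "ordinary"
    PROBE = "probe"
    MARKED = "marked"


def core_router_forward(packet, congested, severely_congested=False):
    """Return the packet type as it leaves one core router."""
    if packet is PacketType.MARKED:
        return packet  # a packet once marked must not be changed
    if severely_congested:
        return PacketType.MARKED  # severe congestion: mark all packets
    if congested and packet is PacketType.PROBE:
        return PacketType.MARKED  # near exhaustion: mark probes only
    return packet


def probe_path(router_congested):
    """Steps 1 and 2: carry a probe across the routers of one path."""
    packet = PacketType.PROBE
    for congested in router_congested:
        packet = core_router_forward(packet, congested)
    return packet  # the OR of the router states along the path


def ingress_admits(probe_result):
    """Step 3: admit only if the probe arrived unmarked."""
    return probe_result is PacketType.PROBE


def egress_sees_severe_congestion(received, sent_as_probe):
    """Step 4: a marked packet that was not a probe signals trouble."""
    return received is PacketType.MARKED and not sent_as_probe


print(ingress_admits(probe_path([False, False, False])))  # True
print(ingress_admits(probe_path([False, True, False])))   # False
```

How the egress learns whether a marked packet was originally a probe (packet content, multi-field classification, or admittance state) is left out of the sketch and passed in as the sent_as_probe argument.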
To make the scheme more robust against packet loss, the initiating edge device MAY maintain a timer associated with each probe packet. If a probe packet is lost, the device simply re-transmits on time-out. How often and how many times the probe packet should be retransmitted before failure is declared is an implementation issue, but these parameters SHOULD be configurable (e.g., via an SNMP MIB). Furthermore, whether probes are retransmitted at all SHOULD be configurable.

4.2. Unit-based Reservations

While measurement-based admission control has important advantages over non-measurement based algorithms, it has disadvantages as well. Unit-based reservations allow the sources to keep their reservations irrespective of the volume of the traffic they transmit. Although the admission scheme is very similar to the simple marking case, the presence of actual reservations is a fundamental difference.

Each flow can occupy any number of units of resources, and even fractions of units by allowing a number of flows to share a common resource unit. The unit is not necessarily a simple bandwidth value: it may be defined in terms of any resource unit (e.g., effective bandwidth) to support statistical multiplexing at packet level (use of silence periods). The definition of the unit may vary from network to network and is outside the scope of this document.

The basic idea of unit-based reservation is to allow the edge devices to periodically mark some of the data packets to refresh the resource reservation. Each refresh packet reserves one unit of resources for one refresh period. Reservations are timed out after a refresh period and have to be refreshed in a soft state manner. The length of the refresh period must be the same throughout the DS domain and SHOULD be configurable.

Core routers estimate the number of reservations by counting the number of refresh packets during a refresh interval.
If the router runs out of units, it goes into blocking state, starts to mark probe packets indicating congestion and thereby rejects new flows. The probe packets that pass the router unmarked and the refresh packets reserve one unit of resources for the following refresh period.

(Editor note: It is clear that we need to have the capability of reserving more than one unit, but it is not yet clear how that will be encoded in the packet header. See below.)

Thus, after the probe packet has passed along the path unmarked, the ingress edge device is required to send the first reservation refresh packet during the next refresh period. If a flow occupies more than one unit, more than one probe packet may be sent to allocate the required number of resources (an alternative using only one packet should be defined). Similarly, more than one refresh packet must be sent for such a flow. By proper definition of the unit, a wide range of flows can be described and handled using this simple mechanism.

If a probe packet was forwarded unmarked by a core router, but was marked later downstream, that core router will not be notified and will incorrectly maintain the reservation. However, as the flow is rejected, no refresh packets will arrive, and the reservation will time out at the end of the refresh period and will be released.

Severe congestion is handled in the same way as in 'Simple marking' (see Section 4.1).

If a refresh packet is lost, the downstream routers will underestimate the number of reserved units. Refresh and probe packets should therefore be protected from losses in the manner described above.

Core routers estimate the number of allocated units by counting the number of refresh packets during a refresh period. The accuracy of the estimate can be increased by generating refresh packets evenly spread in time over the refresh period. This minimizes errors resulting from time alignment differences between routers and edge devices.
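As an illustration of the even spacing recommended above, the following sketch schedules the refresh packets of one flow and counts how many fall into a router's counting window. The function names refresh_send_times and count_in_window, the use of seconds as the time unit, and the half-open window are assumptions made for this example only.

```python
def refresh_send_times(units, refresh_period, periods=1):
    """Evenly spaced send times for a flow holding `units` units.

    One refresh packet is sent per unit and per refresh period.
    """
    spacing = refresh_period / units
    return [i * spacing for i in range(units * periods)]


def count_in_window(times, window_start, refresh_period):
    """Refresh packets a router counts in [window_start, window_start + period)."""
    window_end = window_start + refresh_period
    return sum(1 for t in times if window_start <= t < window_end)


# A flow holding 4 units with a 2-second refresh period, over 3 periods.
times = refresh_send_times(units=4, refresh_period=2.0, periods=3)
print(times[:4])                         # [0.0, 0.5, 1.0, 1.5]
print(count_in_window(times, 0.3, 2.0))  # 4
print(count_in_window(times, 1.9, 2.0))  # 4
```

With evenly spaced packets, a counting window that is not aligned with the sender's period still sees one packet per reserved unit in this example, which is the property the text relies on.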
4.3. Multiple Unit Reservation

In some cases it might be feasible to add functionality for the reservation of several units in a single reservation request. Semantics similar to those of the two-bit reservation scheme could be used to provide such functionality, but this would require the addition of an integer value denoting the number of units. The coding of such a proposal is still under discussion and needs to be studied further.

4.4. Codepoints for Flow Types

In both variants of Load Control, routers making marking decisions have very little information about the resource or QoS requirement of the flow in question. The DS field of the probe packet can be used to indicate the DiffServ class the flow will arrive on and thus the QoS requirements. The marking function of core routers can take the required PHB into account when deciding on the marking.

Information on the resource requirements for incoming flows can also be expressed using the DS field by dividing real-time traffic into classes based on resource requirements and using different codepoints for different classes. If the DSCPs denote not only the PHB that the flow is to receive, but implicitly also the bandwidth requirements for the flow, core routers will be able to mark packets more intelligently, resulting in less resource waste and greater flexibility. In the unit-based case, the major benefit is that the size of the unit can be different in different classes, making it possible to allocate resources with finer granularity.

5. Objects for Standardization

A forthcoming standard might only include the encoding of the Load Control information into the IP header and some design recommendations.

5.1. Packet Types

We need four types of packets in the algorithm:

- Ordinary Packet (OP)
- Probe Packet (PP)
- Marked Packet (MP)
- Refresh Packet (RP)

During transport through the network, a probe packet can be changed to a marked packet. This indicates that at least one router does not accept the reservation associated with the probe packet.

   ------       Rejection       ------
   | PP |---------------------->| MP |
   ------                       ------

An ordinary packet can also be changed to a marked packet, meaning that some exceptional event caused severe congestion on one link of the path the packet took.

   ------   Severe Congestion   ------
   | OP |---------------------->| MP |
   ------                       ------

In the simple marking scheme, only three packet types are used. Refresh packets are treated as ordinary packets, except that these packets cannot be changed to marked packets.

5.2. Coding of Packet Types

We have two alternative solutions for storing Load Control related information in the packet headers: using new DS codepoints or using the two currently unused bits (intended for ECN) in the DS byte. The latter case is only considered in Appendix E.

In the first alternative (where PHBs are intended to be used together with Load Control), two or three new codepoints would have to be defined for probe, marked and (optionally) refresh packets. For example, in the case of the EF PHB, in addition to the codepoint used for the EF packets, EF-probe, EF-marked and EF-refresh packets can also be sent. The new codepoints can be drawn from the LU/EXP space.

5.3. Behavior Description

The behavior of the edge devices depends greatly on the application or signaling protocol that uses the load control scheme. Below we only describe the few aspects of the edge device behavior that are necessary for interworking with the core routers.

5.3.1. Behavior of the Core Routers

All core routers continuously maintain a state of accepting or rejecting more flows.
If the state is accepting, the router passes all packets unchanged. If the state is rejecting (i.e., the router is congested), then the router changes the marking of incoming packets from probe to marked. If the router is capable of detecting severe congestion, and this occurs, then the router forwards both ordinary and probe packets as marked. The router MUST NOT change the marking of refresh packets.

Addition for Unit-based Reservations:

The router uses the refresh and probe markers in packets to maintain its estimation of reserved resources. A refresh packet signals previously admitted resource usage, while a probe packet signals a new request. When passed unmarked, both types of packets reserve one unit for one refresh period.

5.3.2. Behavior of the Edge Devices

When a new reservation is needed, the ingress edge device should send the appropriate number of packets marked as probe.

If the egress edge device receives a probe packet that is marked, this means that the network has insufficient capacity along the path between the two edge devices. The egress edge device should take care of blocking the flow by notifying the ingress device.

If the egress device receives a marked packet that was not initially sent as a probe packet, it shall inform the ingress device to reject admitted flows. Whether a packet was initially sent as a probe can be determined from the packet content, multi-field classification of the IP header, or by checking the admittance state at the egress edge device.

Addition for Unit-based Reservations:

For the unit-based reservation scheme, the ingress edge device should generate the required number of refresh packets per refresh period and per flow. If there are not enough data packets to mark as refresh packets, the ingress device must generate dummy packets and mark those as refresh packets.
The generated refresh packets should be as uniformly distributed through the refresh interval as possible to minimize the effect of refresh interval timing differences between routers.

6. Interworking with RSVP/Intserv

Load control can also be used in DiffServ regions (backbones) that connect RSVP/Intserv regions. This inter-operation is described in detail in [Bernet99]. For load control, border routers of the DiffServ region must be RSVP-aware in order to detect the arrival of new connections. RSVP PATH messages can be used as probe packets to gather congestion information along the path between the two border routers. When a new RSVP path state is installed at the egress border router, the collective admission state of the path (collected in the packet of the PATH message) is also stored. If a RESV message for the installed state arrives within a time period during which the congestion state can be considered valid, then the egress border router can perform the admission control for the DiffServ network as well. If the first RESV message arrives too late, then the egress border router MUST solicit a new (dummy) probe packet from the ingress router to determine the current congestion state.

When the egress receives a marked packet that is neither a PATH message nor a dummy probe packet, this signals a severe congestion state along the path. The identity of the ingress router can easily be determined from the path state, but in this case the egress router can itself decide to drop certain reservations. The ingress router can be notified via ResvTear messages while the receiver end systems get ResvErr messages.

RSVP routers can also be placed inside the domain. In this case, probing is performed between RSVP routers instead of edge devices. Thus, by adding a simple and cheap extension to non-RSVP capable routers, correct admission control is possible on non-RSVP capable parts of an end-to-end path.
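The egress decision on arrival of a RESV message, as described in this section, can be sketched as follows. This is an illustration under stated assumptions, not an RSVP implementation: the PathState record, the on_resv function, the string results and the validity parameter (the period during which the stored congestion state is considered valid) are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class PathState:
    # Admission state collected by the PATH message acting as a probe.
    marked: bool        # True if some router on the path marked the probe
    recorded_at: float  # time (seconds) at which the path state was stored


def on_resv(state, now, validity):
    """Decide what the egress border router does when a RESV arrives.

    `validity` is an assumed, configurable number of seconds for which
    the stored congestion state is trusted.
    """
    if now - state.recorded_at > validity:
        # Too late: solicit a new (dummy) probe from the ingress router.
        return "reprobe"
    return "reject" if state.marked else "admit"


print(on_resv(PathState(marked=False, recorded_at=10.0), now=10.5, validity=1.0))  # admit
print(on_resv(PathState(marked=True, recorded_at=10.0), now=10.5, validity=1.0))   # reject
print(on_resv(PathState(marked=False, recorded_at=10.0), now=12.0, validity=1.0))  # reprobe
```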
Unit-based reservations can also be used to provide resources in a DS domain that is used to provide VPN tunnels between customer sites. Using a load control scheme, it is fast and easy to modify the size of these tunnels. Thus, tunnel size selection can be a very dynamic process. Note that tunnels are not necessarily real-time tunnels. Packets of any DSCP can travel on them after receiving the appropriate PHB. Even best-effort tunnels can be reserved this way. Provisioning can be done on a per-DSCP basis or in aggregates as the service provider wishes.

7. Security Considerations

We propose using two-bit markers in packet headers (DS field) to reserve resources within a DiffServ domain. This poses similar security problems to the use of the DS field to differentiate packets in general [RFC2475].

If the interior of the DS domain fully contains a tunnel, then by copying the outer marking into the inner header at de-encapsulation, load control can be exercised over the links of the tunnel as well. The procedure is similar to the one described in [RFC2481].

As IPSec [RFC2402, 2406] does not allow the copying of the DS field from the outer to the inner header at de-encapsulation, load control cannot be exercised over regions where IPSec tunnels are used.

8. Identification of Edge Nodes

In the absence of RSVP, an alternative method for identification of edge nodes will be required. This section needs to be written.

9. Multicast-related Issues

[RFC2406] Kent, S. and R. Atkinson, "IP Encapsulating Security Payload (ESP)", RFC 2406, November 1998.
[Bernet99] Bernet, Y., Yavatkar, R., Ford, P., Baker, F., Zhang, L., Speer, M., Braden, R., "Interoperation of RSVP/Intserv and Diffserv Networks", Work in Progress, March 1999.

[Stoica99] Stoica, I., et al., "Per Hop Behaviors Based on Dynamic Packet States", Work in Progress, February 1999.

[Berson97] Berson, S. and Vincent, R., "Aggregation of Internet Integrated Services State", Work in Progress, December 1997.

[Guerin97] Guerin, R., Blake, S. and Herzog, S., "Aggregating RSVP based QoS Requests", Work in Progress, November 1997.

[Gross99] Grossglauser, M., Tse, D. N. C., "A Time-Scale Decomposition Approach to Measurement-Based Admission Control", Infocom '99.

[Tur99] Z. R. Turanyi, L. Westberg, "Load Control: Lightweight Provisioning of Internet Resources", submitted to Networking 2000, Paris, May 2000, http://www.ericsson.co.hu/ethzrt/

[IAB-QoS] G. Huston (Internet Architecture Board), "Next Steps for the IP QoS Architecture", Work in Progress, March 2000.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC2119, March 1997.

Appendix A. Admission Precision of Simple Marking

Simple marking is basically a measurement-based admission control scheme, where flows do not say anything about their traffic characteristics. In addition, flow departure is not signaled explicitly.

When the network carries several types of flows with different bandwidth requirements, the core routers do not know the bandwidth requirements of the incoming flows. They simply declare whether they will accept more flows or not irrespective of the bandwidth demands of the new flow. Thus the marking algorithm in the routers should conservatively always expect the largest type of flow that the network carries and start rejecting flows when there is not enough bandwidth left for one such flow.
On the positive side, this will result in fair rejection among different flow types, but on the negative side, some bandwidth will be wasted. However, if the links of our domain can carry at least several hundred requests even from the most bandwidth-demanding types of flow, then this is not a significant waste.

Appendix B. Effect of Delays on Admission

When a probe packet is passed unmarked without correcting the estimate of the free resources, we in fact admit a flow without immediately reserving resources for it. The reservation will be implicitly done later by the arriving traffic or refresh packets of the flow. During the time between admission and the arrival of the traffic of the flow, new requests can be admitted without taking the previously admitted flow into account.

To illustrate the effects of this delay, we took an old and simple Markovian example. Flows are identical with an average flow-holding time of 180 seconds and flow arrivals and departures follow a Poisson process. Let the link be able to carry N calls and let the delay be T. The link starts refusing flows when the measured traffic exceeds N-H calls. We can say that a space of size H is put aside to cater for the errors caused by the delay.

If the link is properly dimensioned, then the usual blocking ratio should not exceed 1%. However, in a mass call situation (such as occurs on New Year's Eve, for example) it can be considerably higher. In this example, 50% blocking was chosen to demonstrate the extreme load case. Thus, the offered traffic is roughly twice the link capacity.

QoS violation occurs if during time T the difference between the number of arriving and departing flows is larger than H. Under the above assumptions, the chance of QoS violation can be calculated. Naturally, the larger H is, the smaller the chance that QoS will be violated.
The required value of H can be determined for a low value of QoS violation probability (e.g. 10e-5). The following table presents the value of H as a function of link size (N), delay length (T) and load (causing 1% or 50% blocking).

         |    1ms    |   10ms    |   100ms   |   500ms   |    1s     |
     N   | 1%  | 50% | 1%  | 50% | 1%  | 50% | 1%  | 50% | 1%  | 50% |
   ------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
      50 |  2  |  2  |  2  |  3  |  3  |  4  |  4  |  5  |  5  |  7  |
     100 |  2  |  2  |  3  |  3  |  4  |  4  |  4  |  7  |  6  |  9  |
     500 |  2  |  3  |  3  |  4  |  4  |  7  |  9  | 13  | 12  | 18  |
    1000 |  3  |  3  |  4  |  4  |  5  |  9  | 12  | 18  | 16  | 25  |
    5000 |  3  |  4  |  5  |  7  | 12  | 18  | 24  | 44  | 33  | 69  |
   10000 |  4  |  4  |  7  |  9  | 16  | 25  | 33  | 69  | 47  | 113 |

Relative to the link size, the required safety margin is highest for small links, since less statistical multiplexing is possible there.

Appendix C. A Simple Algorithm for Core Routers

In this appendix, we present an algorithm for core routers that use unit-based reservations. The algorithm is simple, so it can be easily implemented in hardware by simple counters. Its inputs are the refresh interval and the number of flows allowed on the link. The latter is denoted by 'threshold'. (We assume flows with similar characteristics (e.g., voice) and that one flow sends one refresh packet per refresh interval.) If the network uses more DSCPs for real-time traffic, then a separate copy of the algorithm may be run for each DSCP, resulting in per-DSCP admission.

The algorithm counts the number of refresh and admitted probe packets in refresh intervals ('count'). The result of the counting is an upper limit on the number of units reserved on the link, as some reservations may have ended by the end of the refresh interval. The value of this counter is used in the next interval to decide on admission ('last'). When a new reservation is admitted, this value is increased to take the new reservation into account.
If this value is high above the admission limit, then we start sending severe congestion notifications by marking regular packets as well.

   On initialization:
      last = 0
      count = 0

   On arrival of a refresh packet
      count++

   On arrival of a probe packet
      if last < threshold then
         last++
         count++
      else
         Mark Packet
      endif

   On arrival of a regular packet
      if last > threshold*1.1 then
         Mark Packet
      endif

   At the end of the refresh interval
      last = count
      count = 0

Appendix D. Simulation Results

The purpose of the simulations described in this appendix is to give some insight into the performance of load control. The simulation cases are by no means representative, and the scheme may work differently in other situations.

In appendix D.1, the simple marking case is demonstrated with a purely measurement-based admission algorithm by using a single link with both constant bit-rate and on/off sources. In appendix D.2, the unit-based reservation method is shown, using the algorithm in Appendix C. Severe congestion signaling is not used in any of the examples; only admission control is used.

We simulated a very simple network of one link. This can be viewed as the single bottleneck in the domain. The link had a 2 Mbit/s throughput, 50% of which was designated to carry real-time traffic. The round trip propagation delay was set to 100ms. The real-time flows arrived according to a Poisson process, and the holding time was exponential with a 90 second mean. The arrival rate of flows was set to produce approximately 50% blocking. Only real-time traffic was simulated, so scheduling was simple FIFO.

D.1 Simple Marking

D.1.1 Constant Bit-Rate Sources

In the first case, flows emitted 40 byte long packets every 20 ms, producing a constant 16 kbit/s load. The 1 Mbit/s capacity assigned to this traffic can thus carry 62.5 flows. From the table in Appendix B, we can see that 4 calls should be reserved in addition to the 62.5.
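The figures in this paragraph follow from simple arithmetic, reproduced below as a small check. The constant names are chosen for this illustration only; the safety margin of 4 calls is taken from the text, and the resulting 66.5-call level corresponds to the 1.064 Mbit/s threshold used in the slot measurements of this appendix.

```python
PACKET_BYTES = 40                  # packet size of one flow
PACKET_INTERVAL_MS = 20            # one packet every 20 ms
REALTIME_CAPACITY_BPS = 1_000_000  # 50% of the 2 Mbit/s link
SAFETY_MARGIN_FLOWS = 4            # margin quoted in the text

packets_per_second = 1000 // PACKET_INTERVAL_MS
flow_rate_bps = PACKET_BYTES * 8 * packets_per_second
flows_at_capacity = REALTIME_CAPACITY_BPS / flow_rate_bps
threshold_bps = (flows_at_capacity + SAFETY_MARGIN_FLOWS) * flow_rate_bps

print(flow_rate_bps)      # 16000
print(flows_at_capacity)  # 62.5
print(threshold_bps)      # 1064000.0
```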
After an initial transient of 5 minutes, we simulated 2.5 hours. During the 2.5 hour simulation time, utilization was measured over 5-minute intervals. Utilization was also measured in 20ms slots and the percentage of slots in which it was above 1.064 Mbit/s (66.5 calls) was counted.

   min/avg/max of the utilization was:     881 / 899 / 914 kbit/s
   min/avg/max of the violation ratio was: 98.96% / 99.78% / 100%

D.1.2 On/Off Sources

In the second simulation case, on/off sources were used. During an "off" period, no packets were generated, while in the "on" state the behavior is the same as in the previous case: 40 byte long packets 20 ms apart. The lengths of the on and off periods were both drawn from a Pareto distribution with a shape parameter of 1.1 and a mean of 5 seconds. The average bit-rate of the sources is thus 8 kbit/s. The flow arrival rate has been doubled to produce 50% blocking, since the link is capable of carrying nearly twice the number of flows. The same set of measurements was carried out as in the previous case.

   min/avg/max of the utilization was:     808 / 819 / 837 kbit/s
   min/avg/max of the violation ratio was: 98.98% / 99.40% / 99.70%

It can be seen that although the measurement-based approach was not able to prevent the over-use of the real-time resources in this high load case, it is a viable alternative. In no case did the 20 ms measurements exceed 1.15 Mbit/s, so the over-use just means a temporary steal from the resources provisioned to the lower priority traffic.

D.1.3 The Router Algorithm

The measurement-based admission control (MBAC) algorithm used by the router is presented here only for the completeness of the simulation description. The marking strategy was the same for both types of traffic.

The router counts the number of bytes transmitted in every 20 ms interval and calculates the average bit rate in these 20 ms slots.
Then it smooths these values in time through an exponentially weighted moving average (EWMA) filter. The window size of the EWMA was set to 9 seconds, i.e., when a unit step function is run through the filter, the output reaches 0.63 after 9 seconds. The algorithm also calculated the histogram of the difference between the original slot values and the filtered values. The histogram was collected in 1000 bins over the range of -1 to +1 Mbit/s. The 99% quantile of the histogram was calculated every 100 seconds.

The router marks all passing packets if the sum of the output of the EWMA filter and the calculated quantile is greater than 1 Mbit/s. The router makes no correction to its measurements when a new flow is admitted. With the 99% quantile, the target violation probability was thus set to 1%, which was in fact fulfilled in the long run.

On arrival of a new packet, only counters are incremented. Every 20 ms, a new value for the EWMA must be calculated, a marking decision must be made for the next 20 ms, and the value of one bin in the histogram must be increased. Every 100 seconds, the 99% quantile value must be looked up in the histogram and the histogram must be reinitialized. The interested reader can read more about the design rationale of the above algorithm in [Gross99].

D.2 Unit-Based Reservations

In this section we demonstrate the unit-based reservation scheme. The routers use the simple algorithm in Appendix C, except that they never mark regular packets. The simulation setup is otherwise the same as in the previous section. The traffic inside the flows does not affect the admission algorithm, so during simulation, sources send only probe and refresh packets. The definition of the unit is a peak bit-rate of 16 kbit/s. The flow number threshold was set to 62 flows, resulting in close to the same target utilization of 1 Mbit/s as in Appendix D.1.
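The counting behavior used in this setup can be sketched in Python as follows. This is a minimal illustration of the core router pseudocode with the marking of regular packets left out, not a reference implementation; the class name UnitCounter and its method names are assumptions introduced for this example.

```python
class UnitCounter:
    """Hypothetical per-PHB unit counter for a core router (no per-flow state)."""

    def __init__(self, threshold):
        self.threshold = threshold  # admission limit, in units
        self.last = 0               # units seen in the previous refresh interval
        self.count = 0              # units seen so far in the current interval

    def on_refresh(self):
        # A refresh packet renews one unit for the current interval.
        self.count += 1

    def on_probe(self):
        """Return True if the probe passes unmarked, False if it is marked."""
        if self.last < self.threshold:
            self.last += 1
            self.count += 1
            return True
        return False

    def end_of_interval(self):
        # Units that were not refreshed are released only at this point.
        self.last = self.count
        self.count = 0


counter = UnitCounter(threshold=62)
admitted = sum(counter.on_probe() for _ in range(63))
print(admitted)  # 62: the 63rd probe is marked
```

In this sketch, a departing flow simply stops sending refresh packets, and its unit stays counted in "last" until end_of_interval() runs. A probe arriving before that point can therefore be marked even though capacity has already been freed.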
The length of the refresh period was varied between 100 ms and 10 seconds. The actual number of flows on the link never exceeded 62 (no violation), so only the utilization values are shown, in kbit/s.

   | interval | min | avg | max |
   +----------+-----+-----+-----+
   |    --    | 968 | 972 | 976 |
   |  100 ms  | 952 | 954 | 959 |
   |   1 sec. | 941 | 946 | 949 |
   |   2 sec. | 927 | 933 | 936 |
   |   4 sec. | 908 | 913 | 920 |
   |   7 sec. | 861 | 870 | 879 |
   |  10 sec. | 827 | 837 | 852 |

The first line shows the utilization value for the case when the source limits itself to 62 flows, i.e., blocking is not done by the network, but by the source. This emulates the case when the refresh period is infinitely short or when a per-flow state approach is used, as in RSVP. The utilization is not 100% due to the burstiness of the arrivals.

It can be seen that as the refresh packets become less frequent, more resources are wasted, because the resources allocated to departing flows remain allocated until the end of the next refresh period. The result is not only lower average utilization, but lower maximal utilization as well. When the refresh period is 10 seconds long, the highest utilization experienced was 952 kbit/s, which is 3 units below the limit.

This motivates the use of as short a refresh period as possible. However, too short a refresh period will increase the effects of clock differences between edge and core devices (which were not taken into account during simulation). It also decreases the chance of finding a packet to mark as refresh if the flow is currently transmitting below its reserved rate.

Appendix E: Marking using ECN bits

If the ECN bits were to be used for Load Control marking, the values would be encoded in the two unused bits as described below, and the DS field would contain the PHB.
   DS byte      Load Control
   01234567     codepoint (in ECN)
   -----------------------------
   xxxxxx00     Ordinary
   xxxxxx01     Probe
   xxxxxx10     Marked
   xxxxxx11     Refresh

The interpretation of the two unused bits remains unspecified for other PHBs that do not support Load Control. This is done so as not to interfere with possible ECN deployment [RFC2481].

Table of Contents

   1 Abstract
   2 Background and Motivation
   3 Overview
   4 Operation of Load Control
   4.1 Simple Marking
   4.2 Unit-based Reservations
   4.3 Multiple Unit reservation
   4.4 Codepoints for Flow Types
   5 Objects for Standardization
   5.1 Packet Types
   5.2 Coding of Packet Types
   5.3 Behavior Description
   5.3.1 Behavior of the Core Routers
   5.3.2 Behavior of the Edge Devices
   6 Interworking with RSVP/Intserv
   7 Security Considerations
   8 Identification of Edge Nodes
   9 Multicast-related Issues
   Appendix A. Admission Precision of Simple Marking
   Appendix B. Effect of Delays on Admission
   Appendix C. A Simple Algorithm for Core Routers
   Appendix D. Simulation Results
   D.1 Simple Marking
   D.1.1 Constant Bit-Rate Sources
   D.1.2 On/Off Sources
   D.1.3 The Router Algorithm
   D.2 Unit-Based Reservations
   Appendix E: Marking using ECN bits

Authors' Addresses

   Lars Westberg
   Ericsson Research
   Kistagangen 26
   SE-164 80 Stockholm
   Sweden
   EMail: Lars.Westberg@era-t.ericsson.se

   Zoltan R. Turanyi
   Ericsson Telecommunications
   Budapest, Laborc u. 1
   H-1037
   Hungary
   EMail: Zoltan.Turanyi@ericsson.com

   David Partain
   Ericsson Radio Systems AB
   P.O. Box 1248
   SE-581 12 Linkoping
   Sweden
   EMail: David.Partain@ericsson.com