Internet Engineering Task Force                   Integrated Services WG
INTERNET-DRAFT                              Shenker/Partridge/Wroclawski
draft-ietf-intserv-control-del-svc-01.txt                  Xerox/BBN/MIT
                                                              June, 1995
                                                       Expires: 12/25/95


          Specification of Controlled Delay Quality of Service


Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

This document is a product of the Integrated Services working group of the Internet Engineering Task Force. Comments are solicited and should be addressed to the working group's mailing list at int-serv@isi.edu and/or the author(s).

Abstract

This memo describes the network element behavior required to implement the Controlled Delay traffic handling service. The controlled delay service is designed for service-adaptive and delay-adaptive applications; i.e., applications that are prepared to dynamically adapt to changing packet transmission delays and to dynamically change the level of packet delivery delay control they request from the network when their current level of service is not adequate.
The controlled delay service imposes relatively minimal requirements on network components which implement it, and is intended to be usable in situations ranging from small centrally managed private IP networks to the global Internet.

Introduction

This memo is one of a series of documents which specify network element behavior in IP internetworks which provide multiple qualities of service to their clients. Services described in these documents are useful both in the global Internet and private IP networks.

This document is based on the service specification template given in [xxx]. Please refer to that document for definitions and additional information about the specification of qualities of service within the IP protocol family.

Motivation

Controlled delay service is designed for service-adaptive and delay-adaptive applications. These applications are sensitive to packet delivery delay, but are prepared to adapt to dynamically changing delays, and to change their service request at any time if the current level of service received from the network is not adequate. This flexibility allows such applications to operate successfully and efficiently over a wide range of network conditions.

Many applications which transmit interactive data, such as audio and video conferencing sessions, are well suited to operation with the controlled delay service. Applications which desire proven guarantees on packet delivery time, such as real-time control and servoing systems, are generally not in this category.

The controlled delay service does not provide any quantified level of assurance about packet delays. Instead, it merely promises to avoid overloads by turning excess traffic away. Criteria for determining when a resource is overloaded are not specified in this definition, but are left to the individual vendor.
Thus, while this service offers some control over packet delay, it does not meet the stricter condition of offering a guaranteed or statistical bound.

The end-to-end behavior obtained with controlled delay service provides a middle ground between the employment of adaptive applications in a pure best-effort network and the employment of a network which rigidly controls delay. Strengths of this middle ground are that applications can obtain some load control and delivery preference for their packets while still benefiting from their adaptive behavior; that the service can be usefully deployed in large, unstructured internetworks; and that the specification is amenable to highly efficient implementation and use of network resources.

One important model for the use of controlled delay may be thought of as "best-effort with a floor"; applications suited for use with this service may choose to operate with no QoS control at all much of the time, requesting controlled-delay service from the network only when the uncontrolled service slips below acceptable minimums.

Network Element Data Handling Requirements

The network element must assure that the packet delays are controlled. This must be accomplished through active admission control. In particular, overprovisioning is not sufficient to deliver controlled delay service; the element must be able to turn flows away if accepting them would cause the element to have excessive queueing delays. However, no quantitative specification of average, statistical, or maximal delays is required.

There are three different logical levels of service. A network element may internally implement fewer (or more) actual levels of service, but must map them into three logical levels at the controlled delay service invocation interface.
The levels have different degrees of delay control, with level 1 having the most tightly controlled delay, and level 3 having the least tightly controlled delay.

The different levels do not have to give strictly ordered delays for each packet; that is, the network element need not ensure that every packet given level 1 service experiences less delay than if it were given level 2 service. The element need only ensure that the typical delays are no greater in level 1 than in level 2 (and similarly for levels 2 and 3).

All three levels of service should be given better service, i.e. more tightly controlled delay, than uncontrolled best effort traffic.

The controlled delay service must maintain a very low level of packet loss. Although packet losses may occur, any noticeable loss represents a "failure" of the admission control algorithm.

The controlled delay service definition does not require any control of short-term packet jitter (variation in network element transit delay between different packets in the flow) beyond the control already exercised on delay. Network element implementors who find it advantageous to do so may use resource scheduling algorithms which exercise some jitter control.

Invocation Information

The controlled delay service is invoked by specifying the flow's proposed traffic pattern (TSpec) and service request (RSpec) to the network element.

The TSpec takes the form of a token bucket, with bucket depth b and token rate r. Both b and r must be positive. The rate r is measured in bytes per 1/100th of a second, and can range from 1 to 10^12. This range allows data rates as small as 800 bits per second (a reasonable minimum) as well as data rates as large as 400 terabits per second (or 5 times what is believed to be the maximum theoretical bandwidth of a single strand of fiber) to be requested.
The representation of this value should be precise to at least 0.1 percent of the value. The use of floating point representations in implementations is encouraged. The bucket depth b is measured in bytes and can range from 1 byte to 125 gigabytes. Again, the representation of this value should be precise to at least 0.1 percent of the value, and the use of floating point representations is encouraged. The range of values is intentionally large to allow for future bandwidths. The range is not intended to imply that a network element must support the entire range.

The RSpec is a service level. The service level is specified by one of the integers 1, 2, or 3. Implementations should internally choose representations which leave a range of at least 256 service levels undefined, for possible extension in the future.

The TSpec can be represented by two 16-bit values, each using the following form.

   15        10 9                 0
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Exponent  |       Value       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

In this format, the 6 most significant bits of the word encode an exponent (E), and the 10 least significant bits encode a value (V). This format encodes a number of the form V * (2**E). This format is chosen to allow easy representation of a wide range of values, while avoiding over-precise representations. The 16-bit word is placed into packets in network byte order. The first value is the rate (r) and the second value is the bucket size (b).

The RSpec may be represented as an unsigned 16-bit integer carried in network byte order.

Exported Information

Each controlled delay service module exports at least the following information. All of the data elements described below are characterization parameters.

For each level of service, the network element exports three measurements of delay (thus making nine quantities in total).
Each of these characterization parameters is the maximal packet transit delay experienced over a previous time interval T. The three time intervals T are 1 second, 60 seconds, and 3600 seconds. The exported parameters can be somewhat stale, in that they can reflect measurements taken as long as time 2T ago. However, they may also be updated more frequently, and can also reflect any subsequent packet delays that exceed the current advertised quantity.

There is no requirement that these characterization parameters be based on exact measurements. In particular, these delay measurements can be based on estimates of packet delays or aggregate measurements of queue loading. This looseness is allowed to avoid placing undue burdens on network element designs in which obtaining precise delay measurements is difficult.

These delay parameters have an additive composition rule. For each parameter the composition function computes the sum, enabling a setup protocol to deliver the cumulative sum along the path to the end nodes.

The delays are measured in units of one microsecond. An individual element can advertise a delay value between 1 and 2**28 (somewhat over two minutes) and the total delay added across all elements can range as high as 2**32-1. Should the sum of the different elements' delays exceed 2**32-1, the end-to-end advertised delay should be 2**32-1.

Note that while the granularity of measurement is microseconds, a conforming element is free to measure delays more loosely. The minimum requirement is that the element estimate its delay accurately to the nearest 100 microsecond granularity. Elements that can measure more accurately are, of course, encouraged to do so.
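
The additive composition rule and its ceiling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name compose_delays and the constant names are hypothetical and not part of this specification, and rejecting out-of-range per-element values is one possible choice that this memo does not mandate.

```python
# Hypothetical names; a sketch of the additive composition rule only.
MAX_ELEMENT_DELAY_US = 2**28       # largest value one element may advertise
MAX_COMPOSED_DELAY_US = 2**32 - 1  # ceiling on the end-to-end sum


def compose_delays(per_element_delays_us):
    """Sum per-element delays (microseconds), saturating at 2**32-1."""
    total = 0
    for delay in per_element_delays_us:
        # Assumption: values outside 1..2**28 are treated as errors.
        if not 1 <= delay <= MAX_ELEMENT_DELAY_US:
            raise ValueError(f"per-element delay out of range: {delay}")
        # Saturate instead of wrapping past the 32-bit maximum.
        total = min(total + delay, MAX_COMPOSED_DELAY_US)
    return total
```

For example, compose_delays([100, 250, 4000]) gives 4350 microseconds, while twenty elements each advertising 2**28 give the saturated value 2**32-1.
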

NOTE: Measuring in milliseconds is not acceptable, because if the minimum delay value is a millisecond, a path with several hops will lead to a composed delay of at least several milliseconds, which is likely to be misleading.

The characterization parameters may be represented as a sequence of nine 32-bit unsigned integers in network byte order. The first three integers are the parameters for T=1, T=60 and T=3600 for level 1, the next three integers are for T=1, T=60, T=3600 for level 2, and the last three integers are for T=1, T=60, T=3600 for level 3.

The following values are assigned from the characterization parameter namespace.

The controlled delay service is service_name 1.

The delay characterization parameters receive parameter_numbers one through nine, in the order given above. That is,

   parameter_name       definition (T in seconds)
         1              Service Level = 1, T = 1
         2              Service Level = 1, T = 60
         3              Service Level = 1, T = 3600
         4              Service Level = 2, T = 1
         5              Service Level = 2, T = 60
         6              Service Level = 2, T = 3600
         7              Service Level = 3, T = 1
         8              Service Level = 3, T = 60
         9              Service Level = 3, T = 3600

The end-to-end composed results are assigned parameter_names N+10, where N is the value of the per-hop name given above.

No other exported data is required by this specification.

Policing

Policing is done at the edge of the network and at all source merge points within the interior of the network. A source merge point is any point where data from multiple sources is merged into a single flow. It is the responsibility of the invoker of the service (a setup protocol, local configuration tool, or similar mechanism) to identify points where policing is required.

Nonconforming packets are dropped. Other actions, such as delaying packets until they fit within the bounds of the policing function, are not permitted.

NOTE: This point is open to discussion.
The requirement given above may be too strict; it may be better to permit some delaying of a packet if that delay would allow it to pass the policing function. Intuitively, a plausible approach is to allow a delay of (roughly) up to the maximum queueing delay experienced by completely conforming packets before declaring that a packet has failed to pass the policing function and dropping it. The merit of this approach, and the precise wording of the specification which describes it, require further study.

Implementors should note that traffic entering the network with a certain burstiness factor will in many circumstances tend to grow more bursty as it traverses the network. Thus, a well-chosen policing function for network elements serving as interior source merge points will allow for somewhat more traffic burstiness than would be suggested by simple consideration of the traffic's TSpec at that point.

NOTE: This modification of the policing function to allow for increased burstiness as traffic flows through a network is independent of any modification of the TSpec or policing function needed to handle the merging of multiple service requests.

Ordering and Merging

Traffic specifications (TSpecs) presented to a controlled delay service module are ordered as follows: TSpec A is substitutable for ("as good or better than") TSpec B if both the token bucket depth and rate of TSpec A are greater than or equal to the token bucket depth and rate of TSpec B.

A merged TSpec may be calculated over a set of TSpecs by taking the largest rate and largest bucket size across all members of the set. This use of the word "merging" is similar to that in the RSVP protocol; a merged TSpec is one which is adequate to describe the traffic from any one of a number of flows.
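
The ordering and merging rules above can be illustrated with a short Python sketch. The class name TSpec, its field names, and the functions substitutable and merge are hypothetical names chosen for this illustration; they are not defined by this memo or by any setup protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TSpec:
    """Token bucket traffic specification (hypothetical representation)."""
    rate: float   # token rate r, in bytes per 1/100th of a second
    depth: float  # bucket depth b, in bytes


def substitutable(a, b):
    """True if TSpec a is "as good or better than" TSpec b."""
    return a.rate >= b.rate and a.depth >= b.depth


def merge(tspecs):
    """Merged TSpec: largest rate and largest bucket depth in the set."""
    tspecs = list(tspecs)
    if not tspecs:
        raise ValueError("cannot merge an empty set of TSpecs")
    return TSpec(rate=max(t.rate for t in tspecs),
                 depth=max(t.depth for t in tspecs))
```

Note that the ordering is only partial: TSpec(100, 5000) and TSpec(300, 2000) are not substitutable for one another in either direction, but their merged TSpec(300, 5000) is substitutable for both.
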

In other circumstances it is necessary to compute the "additive" TSpec, which describes the traffic of all of the input flows simultaneously. The rate and bucket size parameters of an additive TSpec are computed by adding together the rate and bucket size parameters of each of the original TSpecs.

It is possible to combine these functions to compute a TSpec for the traffic from any N of a set of M original flows, where N ranges from 1 to M.

Service request specifications (RSpecs) are ordered by their numerical values (in inverse order); service level 1 is substitutable for service level 2, and service level 2 is substitutable for service level 3.

Resulting Service

The resulting end-to-end service is one that offers applications several levels of delay to choose from. Furthermore, this service promises that the levels of experienced delays will be controlled, such that all three levels of service will have lower average delays than best effort service. However, this service makes no assurances about the absolute levels of delay or jitter the receiving application will experience.

This service is designed for use by adaptive playback applications, among others. These applications may be willing to accept varying degrees of perceived loss due to late packets. At a given level of controlled delay service the application can vary the loss rate by utilizing a more or less conservative delay adaptation function to adjust its "playback point"; the application will compromise between minimal playback delay and percentage of packets arriving too late to be useful. If the best compromise at a given service level is not good enough, the application may select a lower-delay service level.

Guidelines for Implementors

It is expected that the service levels implemented at a particular element will offer significantly different levels of delay control.
There seems little advantage in offering levels which differ only slightly in the level of delay control. So, while a particular element may offer fewer than three levels of service, the levels of service it does offer should have notably different queueing delays.

An additional service currently being considered is the "predictive" service described in [xxx]. If an element offers both predictive service and controlled delay service, it is expected that it need not implement both but can use the predictive service as a controlled delay service. This is allowed since (1) the required behavior of predictive service meets all of the requirements of controlled delay service, (2) the invocations are compatible, and (3) the ordering relationships defined in the predictive service specification document are such that a given level of predictive service is at least as good as the same level of controlled delay service.

NOTE: The inter-service mapping with predictive service, mentioned above, is omitted from the "Ordering and Merging" section of this draft of the controlled delay service specification because the exact definition of both services is still under discussion. Should the final definitions include an inter-service mapping function, the Ordering and Merging sections of each document might contain words similar to the following:

"In addition, the controlled delay service is related to the predictive service in the sense that a given level of predictive service is considered at least as good as the same level of controlled delay service. See additional comments in the guidelines section."

Evaluation Criteria

Evaluating a network element's implementation of controlled delay service is somewhat difficult, since the quality of service depends on overall traffic load, the traffic pattern presented and the degree of delay control implemented.
In this section we sketch out a methodology for testing an element's controlled delay service.

The idea is that one chooses a particular traffic mix (for instance, 30 percent level 1, 10 percent level 2, 20 percent level 3 and 40 percent uncontrolled best-effort traffic) and loads the network element with progressively higher amounts of this traffic mix (e.g., 40% of capacity, then 50% of capacity, and so on beyond 100% of capacity). For each load level, one measures the utilization, mean delays, and the packet loss rate for each level of service (including best effort). Each test run at a particular load should involve enough traffic that it is a reasonable predictor of the performance a long-lived application such as a video conference would experience (e.g., an hour or more of traffic).

This memo does not specify particular traffic mixes to test. However, we expect in the future that as the nature of real-time Internet traffic is better understood, the traffic used in these tests will be chosen to reflect the current and future Internet load.

Examples of Implementation

A possible implementation of controlled delay service would be to have a queueing mechanism with three priority levels, with level 1 packets being highest priority and level 3 packets being lowest priority. Each controlled delay service level would be associated with a target queue utilization level, say 20% for level 1, 50% for the combination of levels 1 and 2, and 70% for the combination of all three levels. The utilization of the link, by each of the three levels, would be measured over some relatively short time period (say, 5 seconds, or 10000 MTU packet transmission times). A new flow would be admitted to level 1 if the measured usage of level 1, plus the token bucket rate of the new flow, was below the target utilization of level 1.
Similarly, a new flow would be admitted to level 2 if the measured usage of levels 1 and 2, plus the token bucket rate of the new flow, was below the target utilization of levels 1 and 2.

Examples of Use

We give two examples of use, both involving an interactive application.

In the first example, we assume that either the receiving application is ignoring characterizations or the network is not delivering the characterizations to the end-nodes. We further assume that the application's data transmission units are timestamped. The receiver, by inspecting the timestamps, can determine the end-to-end delays and react if they are excessive. If so, then the application asks for a better level of service. If the delays are well below the required level, the application can ask for a worse level of service. A protocol useful to applications providing this capability is the proposed IETF Real-Time Transport Protocol [xxx].

In the second example, we assume that characterization parameters are delivered to the receiving application. The receiver chooses the service level whose characterizations for the maximal delays for all intervals are under the required level after network latencies are considered. If the actual delays during the course of operation are worse than expected, the application can ask for a better level of service.

Security Considerations

Security considerations are not discussed in this memo.

Authors' Address:

Scott Shenker
Xerox PARC
3333 Coyote Hill Road
Palo Alto, CA 94304-1314
shenker@parc.xerox.com
415-812-4840
415-812-4471 (FAX)

Craig Partridge
BBN
2370 Amherst St
Palo Alto, CA 94306
craig@bbn.com

John Wroclawski
MIT Laboratory for Computer Science
545 Technology Sq.
Cambridge, MA 02139
jtw@lcs.mit.edu
617-253-7885
617-253-2673 (FAX)