Internet Engineering Task Force Hyogon Kim INTERNET-DRAFT Telcordia Technologies Expires: October 1999 (formerly Bellcore) April 1999 A Fair Marker Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet- Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Abstract This draft defines Fair Marker (FM), which can be used as a component in a Differentiated Services traffic conditioner. FM controls token distribution from the token bucket to the flows originating from the same subscriber network, in order to enforce fairness among them. FM may use different fair token allocation algorithms. Presently, we assume that FM is used to mark packets for Assured Forwarding. For a supplementary note, this draft also presents the result of using fair token allocation algorithms. These algorithms are derived from fair buffer allocation algorithms, for which the duality between packet buffer and token bucket marker is exploited. Preliminary simulation results show that FM significantly increases the number of tokens allocated to adaptive and fragile flows, Kim [Page 1] INTERNET-DRAFT A Fair Marker April 1999 leading to improved fairness in assured bandwidth allocation. 1. Introduction Internet Differentiated Services ("Diffserv") architecture[DSARCH] defines the traffic conditioner as an essential component. Recent proposals other than classical token bucket and shaper include TCP- friendly conditioner[TCPCOND], Three Color Markers[SRTCM,TRTCM], and Time Sliding Window conditioner[EABE] among others. One aspect of traffic conditioning that these existing proposals have not addressed is the fairness between the flows as they are marked. In best-effort Internet, it is sufficient to assume that some local mechanism (e.g. CBQ) is to enforce bandwidth fairness among the same subscriber flows before they enter the network. So the fairness is an issue only inside the network, between subscribers. In DiffServ architecture, however, it is no longer true that any such fairness mechanism detached from token allocation behavior at the traffic conditioner can achieve bandwidth fairness among the flows. A somewhat contrived example that can show the need of direct control over token consumption is where a WFQ is used to control bandwidth allocation between two competing flows before they are subject to marking. Suppose the WFQ equally divides bandwidth between a TCP flow and a non-adaptive flow, and the WFQ service rate is twice the profile rate installed at the marker. Further assume that the packets are perfectly interleaved between the two flows by the WFQ, i.e. UTUTUTUT..., where U is the packet from the non- adaptive flow and T is the packet from the TCP flow. After the token bucket is depleted, it can happen that each token is generated in exact synchrony with the arrival of a packet from the non-adaptive flow. Since the packets from the non-adaptive flow consume tokens first, TCP packets are always marked out-of-profile. Or it could be the other way around and TCP wins all tokens. In essence, fair bandwidth allocation done before the flows are subject to traffic conditioning does not warrant fair assured bandwidth allocation. We do not argue that other fairness mechanism detached from the traffic conditioner are unnecessary; these mechanisms in combination with fair allocation of tokens in traffic conditioner might further improve fairness. Rather, regardless of whether or not they are used, we argue that direct control over token consumption MUST be exercised at the traffic conditioner. In this draft, we define a fair marker (FM) that directly controls the token allocation to fairly distribute tokens to flows originating from the same subscriber network. We assume that FM is used to mark packets for Assured Forwarding [AFPG]. Note that in Kim [Page 2] INTERNET-DRAFT A Fair Marker April 1999 this draft we do not define the granularity of "flow", which is a policy decision. Also, we do not specify the exact location of FM. We only require that it should be where all subscriber flows converge to be marked, before they enter the provider's network. 2. Configuration Figure 1 shows the block diagram of FM. In specifying the configuration of each component, we follow the convention used in Three Color Marker drafts[SRTCM,TRTCM]. FM MAY be specified as operating on a per packet or a per byte basis. This MUST be made explicit. If FM operates on a per packet basis, it MUST be made clear what packet size is assumed by the module or if this is a configurable parameter. The METER component in FM is configured by assigning values to two traffic parameters: a Profile Rate (PR) and a Burst Size (BS). The PR is measured in bytes of IP packets per second, i.e. it includes the IP header, but not link-specific headers. The BS is measured in bytes and it MUST be configured to be greater than 0. It is RECOMMENDED that it be configured to be equal to or greater than the size of the largest IP packet in the stream. PER-FLOW CONTROL is configured by installing one or more Flow Identifiers of the monitored flows, a packet count variable for each monitored flow, and a fair allocation algorithm. In case there is only one monitored Flow Identifier, which is the default at initialization, it should match all flows. Each Flow Identifier is defined by a 5-tuple, where some fields MAY be unspecified (i.e. wildcard). The packet count variables are initialized to zero. The fair allocation algorithm MAY have its own configurable parameters. 3. Metering and marking The token bucket is initially (at time 0) full, i.e., the token count T(0) = BS. Thereafter, the token count T is incremented by one PR times per second up to BS. When a packet of size B bytes arrives at time t, PACKET CLASSIFIER extracts the Flow Identifier from the packet, and feeds it and the packet size to PER-FLOW CONTROL. The fair allocation algorithm in PER-FLOW CONTROL tests if the flow is using disproportionate amount of tokens compared with other flows. The behavior of fair allocation Kim [Page 3] INTERNET-DRAFT A Fair Marker April 1999 algorithms is described in Section 4. And at the METER, the Final Result is calculated by the following rule: If T(t) - B >= 0 and PER-FLOW CONTROL returns in-profile, the packet is in-profile and T is decremented by B Else the packet is out-of-profile The MARKER reflects the metering result by setting the DS field of the packet to a particular codepoint. In case of the AF PHB [AFPG], the "in-profile" can be coded as AFx1 and "out-of-profile" as AFx{2,3}. A drop precedence assignment rule compatible with Three Color Markers will be provided in the next version of this document. Also, other extensions of FM, such as "color-aware" mode of operation or "weighted" FM will be discussed in the next version of this draft. Final Result +------------+ | | | V Packet +----------+ +-------+ +--------+ Marked Stream ===>| PACKET |==>| METER |==>| MARKER |===> Packet from |CLASSIFIER| | | | | Stream subscriber +----------+ +-------+ +--------+ to network | ^ Diff-Serv | | cloud Flow | | Per-flow Id. V | test result +-----------------+ | PER-FLOW | | CONTROL | +-----------------+ Figure 1. A fair marker (FM) 4. Fair marking algorithms In order to fairly allocate tokens to flows, we need to maintain per-flow states on token usage. Obviously, it is wasteful to maintain the states for all flows ever marked, especially if flow granularity is fine. So we maintain only the states of the flows that consumed tokens during the last token bucket fill time. The fill time is defined as the token bucket size divided by the token generation rate (i.e. profile rate). This is analogous to maintaining flow states for only the flows that have packets queued Kim [Page 4] INTERNET-DRAFT A Fair Marker April 1999 in the buffer in fair buffering algorithms (e.g. [DTQ,FRED]). Pushing the analogy further, packet is dropped if buffer is full; packet is marked if token bucket is empty. Router processing is wasted if the buffer is empty and no packet arrives; token is wasted if the token bucket is full and no packet arrives. Non-empty router queue shrinks at service rate; non-empty token bucket is filled at token generation rate. Exploiting this analogy, we can easily adapt fair buffer management algorithms to token allocation. Namely, we regard a packet consuming tokens as the packet's "trace" replacing the consumed tokens in the token bucket. In practice, FM has a separate complementary queue that holds the packet traces. And whenever a new token is generated, we see if the accumulated tokens are sufficient to erase the trace of the oldest packet from the complementary If so, we decrement the token count by the packet size and deque the packet trace from the complementary queue. So we maintain the states of the flows (i.e. Flow Identifier and the packet count in the token bucket) whose packet traces are queued in the complementary queue, and apply fair buffer management algorithms over the queue of traces. Owing to this complementary relationship between tokens and packet traces, determining if the packet should consume a token directly translates to determining if a packet trace should be enqueued in the token bucket. And we use a fair buffer management algorithm to determine if the packet trace should be enqueued. The fair marking procedure is as follows. When a packet pkt comes for marking, the token bucket is updated first. Tokens worth of (t_now - t_last) * PR are added to the bucket (up to BS), where t_last is the last time a packet arrived and PR is the profile rate. The number of packet traces that can be dequeued from the token bucket as a result is (floor(T_now/Pkt_Size) - floor(T_last/Pkt_Size)), assuming fixed packet size for simplicity. T_now is the updated token count, and T_last is the token count at the last packet arrival. As packet traces are dequeued, per-flow statistics and global variables are accordingly updated for the packet trace queue, according to the given fair buffering algorithm (e.g. Flow RED [FRED] or Dynamic Threshold [DTQ]). Then pkt is marked. If tokens are not enough, it is marked out-of-profile and sent off. Its trace is not queued in the token bucket, either. It is analogous to the buffer full event in router buffers. Since the trace is not queued, no state change occurs including average token bucket length and per-flow statistics. On the other hand, if there are enough tokens for pkt, this flow's recent token usage determines the marking. If pkt was going to consume disproportionate amount of tokens, its trace is rejected according to the fair buffering algorithm. In this case, even if there are tokens, pkt is marked out-of-profile. Otherwise, pkt is marked in-profile and tokens are Kim [Page 5] INTERNET-DRAFT A Fair Marker April 1999 consumed from the token bucket. In its place, the packet's trace is queued. Due to the enqueing of the trace, per-flow and global variables for the queue are updated. 5. Preliminary simulation In this section, we evaluate the performance of FM using simulation. We compare the fairness of the FM with that of the vanilla token bucket marker. For simplicity, we simulate packet-mode operation. In Figure 2, we show the simulation topology. There are always 4 TCP flows (TCP 1-4) competing with either another TCP flow (TCP5) or a non-adaptive flow (CBR). These sources feed the marker, which is either the FM or the vanilla marker. The link between the marker and the router equipped with RIO [EABE] has enough capacity to digest all traffic from the sources. It abstracts the network path up to the bottleneck node, wherever the bottleneck is. TCP 1----\ d=1ms \ b/w=10Mbps TCP 2---\ \ d=1ms d=1ms \ \ Marker or b/w=50Mbps b/w=1.5Mbps ===>Fair marker--------------RIO------------Sink1 / / ^ \ TCP 3---/ / | d=60ms \ d=1ms / | b/w=10Mbps \b/w=1.5Mbps TCP 4----/ | \ TCP 5 or Sink2 CBR Figure 2. Simulation topology The RIO buffer size is set to 50, which is larger than the (bottleneck) bandwidth-delay product of the network. Other parameter values are set as follows: minth(out) = 7, maxth(out) = minth(in) = 15, maxth(in) = 30, wq = 0.002, maxp = 0.1. The token bucket size in FM is set to 32. For FRED, parameter values are as follows: bucket_size = 32, minq = 4, maxq = minth = bucket_size/2, maxth = bucket_size, maxp = 0.1, wq = 0.002. In case of DT, it automatically adapts the threshold to the remaining buffer slots (i.e. tokens), so no configuration parameter other than the token bucket size needs to be provided. Unless otherwise stated, Sink1 is the sink for all 5 flows. In the first experiment, we have the 4 TCP connections compete with the CBR flow. The rate of the CBR flow is set to 8Mbps, large enough to overrun the bottleneck link. The profile rate at the marker is set to 1.5Mbps. Since the profile rate is the same as the bottleneck rate, all out-of-profile packets should be dropped by the RIO Kim [Page 6] INTERNET-DRAFT A Fair Marker April 1999 router, and only in-profile packets get through the bottleneck link. So if the tokens are fairly allocated at the marker, each flow should get 0.3Mbps throughput. Table 1 compares the throughput and fairness of the TCP connections and those of the non-adaptive flow. The fairness index is defined as (Sum xi)^2 / Sum n*(xi)^2, where xi is the throughput of the ith flow and n is the number of flows[CSART]. Throughputs are in Mbps and fairness index is unitless. M denotes vanilla marker, while FM/DT denotes fair marker with DT algorithm, and FM/FRED denotes fair marker with FRED algorithm. =================================================================== Min TCP Max TCP Avg. TCP CBR Fairness thruput thruput thruput thruput ------------------------------------------------------------------- M 0 0 0 1.499 0.200 ------------------------------------------------------------------- FM/FRED 0.287 0.295 0.2915 0.335 0.997 ------------------------------------------------------------------- FM/DT 0.220 0.220 0.220 0.620 0.779 ------------------------------------------------------------------- Table 1. Fairness between 4 TCP connections and a non-adaptive flow The bandwidth fairness in Table 1 almost exactly reflects the token allocation fairness, which we do not show. With vanilla marker, CBR takes almost all tokens. This is true even if we do not have the four TCPs and CBR directly compete at the bottleneck, by terminating the CBR flow at Sink2 instead of Sink1. In either case, CBR completely shuts out the TCP connections under vanilla marker. Although FM with FRED seems to give better fairness results than with DT, the above result is not at all intended to compare FRED with DT. Rather, it simply shows that plugging in any fair buffering algorithms significantly improves fairness. Different fair buffering algorithms may yield different fairness results under different settings. We do not advocate any particular fair buffering algorithm in this draft, although finding a better algorithm is a worthwhile extension of this work. The next experiment is where 4 TCP connections compete with another TCP connection with a larger RTT. TCP5 RTT varies from 64 to 204ms, compared with just 6ms RTT for the 4 TCP connections. Table 2 shows the result, and the fair marker greatly improves the throughput of TCP5, achieving better fairness. =================================================================== TCP-5 Min TCP1-4 Max TCP1-4 Avg TCP1-4 TCP 5 Fairness RTT thruput thruput thruput thruput ------------------------------------------------------------------- Kim [Page 7] INTERNET-DRAFT A Fair Marker April 1999 M 64ms 0.343 0.380 0.362 0.052 0.853 ----------------------------------------------------------------- FM/FRED 64ms 0.294 0.304 0.299 0.305 0.999 ------------------------------------------------------------------- FM/DT 64ms 0.311 0.314 0.312 0.251 0.993 ------------------------------------------------------------------- M 124ms 0.361 0.374 0.366 0.035 0.837 ------------------------------------------------------------------- FM/FRED 124ms 0.312 0.318 0.315 0.240 0.990 ------------------------------------------------------------------- FM/DT 124ms 0.322 0.327 0.324 0.203 0.975 ------------------------------------------------------------------- M 204ms 0.362 0.386 0.373 0.009 0.809 ------------------------------------------------------------------- FM/FRED 204ms 0.317 0.331 0.324 0.203 0.974 ------------------------------------------------------------------- FM/DT 204ms 0.328 0.331 0.330 0.182 0.963 ------------------------------------------------------------------- Table 2. Fairness between different RTT connections These simulation results are preliminary and intended to simply demonstrate the usefulness of FM. Further simulation results will be provided in the ensuing revisions as necessary. 6. Scaling properties We observe that unlike backbone routers, edge traffic markers can exercise fine-grain control such as per-flow marking without causing excessive scalability problems. The complexity of FM is bounded by the number of packets that can fit in the given token bucket size. 7. Security concerns FM has no known security concerns at this point. 8. Example services A subscriber network can enforce fair distribution of assured bandwidth among its flows using FM alone or in combination with other local bandwidth distribution mechanisms. For instance, it can make FM treat all hosts in its network with equal weight, distributing the assured bandwidth equally among active hosts that have packets registered at FM. Then the Flow Identifiers in PER-FLOW CONTROL will be of the form <*, src-addr, *, *, *> Kim [Page 8] INTERNET-DRAFT A Fair Marker April 1999 where src-addr is the originating host address of an active flow, and * denotes wildcard. 9. References [TCPCOND] F. Azeem et al., "TCP-Friendly Traffic Conditioners for Differentiated Services," Internet Draft, March 1999. [DSARCH] D. Black et al., "An Architecture for Differentiated Services", Internet Draft, May 1998. [DTQ] A. Choudhury and E. Hahne, "Dynamic Queue Length Thresholds in a Shared Memory ATM Switch," IEEE INFOCOM '96, pp. 679-687. [EABE] W. Fang and D. Clark, "Explicit Allocation of Best Effort Packet Delivery Service," submitted for publication. [SRTCM] J. Heinanen and R. Guerin, "A Single Rate Three Color Marker," Internet Draft, March 1999. [TRTCM] J. Heinanen and R. Guerin, "A Two Rate Three Color Marker," Internet Draft, March 1999. [CSART] R. Jain, "The Art of Computer Systems Performance Analysis," John Wiley and Sons Inc., 1991. [FRED] D. Lin and R. Morris, "Dynamics of Random Early Detection," ACM Sigcomm '97. [AFPG] J. Heinanen et al., "Assured Forwarding PHG Group," Internet Draft, February 1999. [FDSTC] K. Nichols and B. Carpenter, "Format for Diffserv Working Group Traffic Conditioner Drafts," Internet Draft, February 1999. 10. Author Address Hyogon Kim Telcordia Technologies (formerly Bellcore) 331 Newman Springs Road Red Bank, NJ 07701 U.S.A. Email: hkim@research.teclcordia.com Kim [Page 9]