Network Working Group A. Charny Internet-Draft Cisco Systems, Inc. Intended status: Standards Track J. Zhang Expires: January 10, 2008 Cisco Systems, Inc. and Cornell University F. Le Faucheur V. Liatsos Cisco Systems, Inc. July 9, 2007 Pre-Congestion Notification Using Single Marking for Admission and Termination draft-charny-pcn-single-marking-02.txt Status of this Memo By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on January 10, 2008. Copyright Notice Copyright (C) The IETF Trust (2007). Abstract Pre-Congestion Notification described in [I-D.eardley-pcn-architecture] and earlier in Charny, et al. Expires January 10, 2008 [Page 1] Internet-Draft PCN with Single Marking July 2007 [I-D.briscoe-tsvwg-cl-architecture] approach proposes the use of an Admission Control mechanism to limit the amount of real-time PCN traffic to a configured level during the normal operating conditions, and the use of a Flow Termination mechanism to tear-down some of the flows to bring the PCN traffic level down to a desirable amount during unexpected events such as network failures, with the goal of maintaining the QoS assurances to the remaining flows. In [I-D.eardley-pcn-architecture], Admission and Flow Termination use two different markings and two different metering mechanisms in the internal nodes of the PCN region. This draft proposes a mechanism using a single marking and metering for both Admission and Flow Termination, and presents a preliminary analysis of the tradeoffs. A side-effect of this proposal is that a different marking a nd metering Admission mechanism than that proposed in [I-D.eardley-pcn-architecture] may be also feasible, and may result in a number of benefits. In addition, this draft proposes a migration path for incremental deployment of this approach as an intermediate step to the dual-marking approach. Requirements Language The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. Charny, et al. Expires January 10, 2008 [Page 2] Internet-Draft PCN with Single Marking July 2007 Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.1. Changes from -01 version . . . . . . . . . . . . . . . . . 5 1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 1.3. Background and Motivation . . . . . . . . . . . . . . . . 5 2. The Single Marking Approach . . . . . . . . . . . . . . . . . 7 2.1. High Level description . . . . . . . . . . . . . . . . . . 7 2.2. Operation at the PCN-interior-node . . . . . . . . . . . . 
8 2.3. Operation at the PCN-egress-node . . . . . . . . . . . . . 8 2.4. Operation at the PCN-ingress-node . . . . . . . . . . . . 8 2.4.1. Admission Decision . . . . . . . . . . . . . . . . . . 8 2.4.2. Flow Termination Decision . . . . . . . . . . . . . . 9 3. Benefits of Allowing the Single Marking Approach . . . . . . . 10 4. Impact on PCN Architectural Framework . . . . . . . . . . . . 10 4.1. Impact on the PCN-Internal-Node . . . . . . . . . . . . . 11 4.2. Impact on the PCN-boundary nodes . . . . . . . . . . . . . 11 4.2.1. Impact on PCN-Egress-Node . . . . . . . . . . . . . . 11 4.2.2. Impact on the PCN-Ingress-Node . . . . . . . . . . . . 12 4.3. Summary of Proposed Enhancements Required for Support of Single Marking Options . . . . . . . . . . . . . . . . 13 4.4. Proposed Optional Renaming of the Marking and Marking Thresholds . . . . . . . . . . . . . . . . . . . . . . . . 13 4.5. An Optimization Using a Single Configuration Parameter for Single Marking . . . . . . . . . . . . . . . . . . . . 14 5. Incremental Deployment Considerations . . . . . . . . . . . . 14 6. Tradeoffs, Issues and Limitations of Single Marking Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 6.1. Restrictions on Termination-to-admission Thresholds . . . 16 6.2. Assumptions on Loss . . . . . . . . . . . . . . . . . . . 16 6.3. Effect of Reaction Timescale of Admission Mechanism . . . 16 6.4. Performance Implications and Tradeoffs . . . . . . . . . . 16 6.5. Effect on Proposed Anti-Cheating Mechanisms . . . . . . . 17 6.6. ECMP Handling . . . . . . . . . . . . . . . . . . . . . . 17 6.7. Traffic Engineering Considerations . . . . . . . . . . . . 18 7. Performance Evaluation Comparison . . . . . . . . . . . . . . 22 7.1. Relationship to other drafts . . . . . . . . . . . . . . . 22 7.2. Limitations, Conclusions and Direction for Future Work . . 22 7.2.1. High Level Conclusions . . . . . . . . . . . . . . . . 22 7.2.2. Future work . . . . . . . . . . . . . . . . . . . . . 23 8. Appendix A: Simulation Details . . . . . . . . . . . . . . . 23 8.1. Network and Signaling Models . . . . . . . . . . . . . . . 23 8.2. Traffic Models . . . . . . . . . . . . . . . . . . . . . . 26 8.2.1. Voice Traffic Models . . . . . . . . . . . . . . . . . 26 8.2.2. "Synthetic Video": High Rate ON-OFF traffic with Video-like Mean and Peak Rates ("SVD") . . . . . . . 27 8.2.3. Real Video Traces (VTR) . . . . . . . . . . . . . . . 28 8.2.4. Randomization of Base Traffic Models . . . . . . . . . 29 Charny, et al. Expires January 10, 2008 [Page 3] Internet-Draft PCN with Single Marking July 2007 8.3. Parameter Settings . . . . . . . . . . . . . . . . . . . . 29 8.3.1. Queue-based settings . . . . . . . . . . . . . . . . . 29 8.3.2. Token Bucket Settings . . . . . . . . . . . . . . . . 29 8.4. Simulation Details . . . . . . . . . . . . . . . . . . . . 30 8.4.1. Sensitivity to EWMA weight and CLE . . . . . . . . . . 30 8.4.2. Effect of Ingress-Egress Aggregation . . . . . . . . . 32 8.4.3. Effect of Multiple Bottlenecks . . . . . . . . . . . . 38 9. Security Considerations . . . . . . . . . . . . . . . . . . . 42 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 42 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 42 11.1. Normative References . . . . . . . . . . . . . . . . . . . 42 11.2. Informative References . . . . . . . . . . . . . . . . . . 42 11.3. References . . . . . . . . . . . . . . . . . . . . . . . . 43 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 
43
Intellectual Property and Copyright Statements . . . . . . . . . . 45

Charny, et al. Expires January 10, 2008 [Page 4]

Internet-Draft PCN with Single Marking July 2007

1. Introduction

1.1. Changes from -01 version

o Added miscellaneous clarifications based on comments received on version -01

o Removed Terminology section and replaced it with a pointer to [I-D.eardley-pcn-architecture].

o Added a section on standards implications and considerations for incremental deployment

o Added a section on ECMP handling

o Added a section on traffic engineering considerations and tradeoffs.

o Updated the Appendix to include new results and consolidate some of the old ones

1.2. Terminology

This draft uses the terminology defined in [I-D.eardley-pcn-architecture].

1.3. Background and Motivation

The Pre-Congestion Notification approach [I-D.eardley-pcn-architecture] proposes to use an Admission Control mechanism to limit the amount of real-time PCN traffic to a configured level during normal operating conditions, and to use a Flow Termination mechanism to tear down some of the flows to bring the PCN traffic level down to a desirable amount during unexpected events such as network failures, with the goal of maintaining the QoS assurances to the remaining flows. In [I-D.eardley-pcn-architecture], Admission and Flow Termination use two different markings and two different metering mechanisms in the internal nodes of the PCN region.

Admission Control algorithms for variable-rate real-time traffic such as video have traditionally been based on the observation of the queue length; re-using these techniques and ideas in the context of pre-congestion notification is therefore highly attractive, and motivated the virtual-queue-based marking and metering approach specified in [I-D.briscoe-tsvwg-cl-architecture] for Admission. On the other hand, for Flow Termination, it is desirable to know how many flows need to be terminated, and that in turn motivates rate-based Flow Termination metering. This provides some motivation for employing different metering algorithms for Admission and for Flow Termination.

Charny, et al. Expires January 10, 2008 [Page 5]

Internet-Draft PCN with Single Marking July 2007

Furthermore, it is frequently desirable to trigger Flow Termination at a substantially higher traffic level than the level at which no new flows are to be admitted. There are multiple reasons for the requirement to enforce a different configured-admissible-rate and configured-termination-rate. These include, for example:

o End-users are typically more annoyed by their established call dying than by getting a busy tone at call establishment. Hence decisions to terminate flows may need to be made at a higher load level than the decision to stop admitting.

o There are often very tight (possibly legal) obligations on network operators to not drop established calls.

o Voice Call Routing often has the ability to route/establish the call on another network (e.g., PSTN) if it is determined at call establishment that one network (e.g., packet network) can not accept the call. Therefore, not admitting a call on the packet network at initial establishment may not impact the end-user. In contrast, it is usually not possible to reroute an established call onto another network mid-call. This means that call Termination can not be hidden from the end-user.

o Flow Termination is typically useful in failure situations where some loads get rerouted thereby increasing the load on remaining links.
Because the failure may only be temporary, the operator may be ready to tolerate a small degradation during the interim failure period. This also argues for a higher configured-termination-rate than configured-admissible-rate.

o A congestion notification based Admission scheme has some inherent inaccuracies because of its reactive nature and thus may potentially over admit in some situations (such as a burst of call arrivals). If the Flow Termination scheme reacted at the same rate threshold as the Admission scheme, calls may get routinely dropped after establishment because of over admission, even under steady state conditions.

These considerations argue for metering for Admission and Flow Termination at different traffic levels and hence, implicitly, for different markings and metering schemes. Different marking schemes require different codepoints. Thus, such separate markings consume valuable real estate in the packet header, which is especially scarce in the case of MPLS Pre-Congestion Notification [I-D.davie-ecn-mpls]. Furthermore, two different metering techniques involve additional complexity in the data path of the internal routers of the PCN-domain.

Charny, et al. Expires January 10, 2008 [Page 6]

Internet-Draft PCN with Single Marking July 2007

To this end, [I-D.briscoe-tsvwg-cl-architecture] proposes an approach, referred to as "implicit Preemption marking" in that draft, that does not require separate termination-marking. However, it does require two separate measurement schemes: one measurement for Admission and another measurement for Flow Termination. Furthermore, this approach mandates that the configured-termination-rate be equal to a drop rate. This approach effectively uses dropping as the way to convey information about how much traffic can "fit" under the configured-termination-rate, instead of using a separate termination marking. This is a significant restriction in that it results in flow termination only taking effect once packets actually get dropped.

This document presents an approach that allows the use of a single PCN marking and a single metering technique at the internal devices without requiring that the dropping and flow termination thresholds be the same. We also argue that this approach can be used as an intermediate step in implementation and deployment of a full-fledged dual-marking PCN implementation.

2. The Single Marking Approach

2.1. High Level description

The proposed approach is based on several simple ideas:

o Replace virtual-queue-based marking for Admission Control by excess rate marking:

* meter traffic exceeding the configured-admissible-rate and mark *excess* traffic (e.g. using a token bucket with the rate equal to the configured-admissible-rate)

* at the PCN-boundary-node, stop admitting traffic when the fraction of marked traffic for a given edge-to-edge aggregate exceeds a configured threshold (e.g. stop admitting when 3% of all traffic in the edge-to-edge aggregate received at the ingress is marked)

o Impose a PCN-domain-wide constraint on the ratio U between the configured-admissible-rate on a link and the level of the PCN load on the link at which Flow Termination needs to be triggered (but do not explicitly configure configured-termination-rate). For example, one might impose a policy that Flow Termination is triggered when PCN traffic exceeds 120% of the configured-admissible-rate on any link of the PCN-domain.

Charny, et al.
Expires January 10, 2008 [Page 7] Internet-Draft PCN with Single Marking July 2007 The remaining part of this section describes the possible operation of the system. 2.2. Operation at the PCN-interior-node The PCN-interior-node meters the aggregate PCN traffic and marks the excess rate. A number of implementations are possible to achieve that. A token bucket implementation is particularly attractive because of its relative simplicity, and even more so because a token bucket implementation is readily available in the vast majority of existing equipment. The rate of the token bucket is configured to correspond to the configured-admissible-rate, and the depth of the token bucket can be configured by an operator based on the desired tolerance to PCN traffic burstiness. Note that no configured-termination-rate is explicitly configured at the PCN-interior-node, and the PCN-interior-node does nothing at all to enforce it. All marking is based on the single configured rate threshold (configured-admissible-rate). 2.3. Operation at the PCN-egress-node The PCN-egress-node measures the rate of both marked and unmarked traffic on a per-ingress basis, and reports to the PCN-ingress-node two values: the rate of unmarked traffic from this ingress node, which we deem Sustainable Admission Rate (SAR) and the Congestion Level Estimate (CLE), which is the fraction of the marked traffic received from this ingress node. Note that Sustainable Admission Rate is analogous to the sustainable Preemption rate of [I-D.briscoe-tsvwg-cl-architecture], except in this case it is based on the configured-admissible- rather than termination threshold, while the CLE is exactly the same as that of [I-D.briscoe-tsvwg-cl-architecture]. The details of the rate measurement are outside the scope of this draft. 2.4. Operation at the PCN-ingress-node 2.4.1. Admission Decision Just as in [I-D.briscoe-tsvwg-cl-architecture], the admission decision is based on the CLE. The ingress node stops admission of new flows if the CLE is above a pre-defined threshold (e.g. 3%). Note that although the logic of the decision is exactly the same as in the case of [I-D.briscoe-tsvwg-cl-architecture], the detailed semantics of the marking is different. This is because the marking used for admission in this proposal reflects the excess rate over the configured-admissible-rate, while in [I-D.briscoe-tsvwg-cl-architecture], the marking is based on Charny, et al. Expires January 10, 2008 [Page 8] Internet-Draft PCN with Single Marking July 2007 exceeding a virtual queue threshold. Notably, in the current proposal, if the average sustained rate of admitted traffic is 5% over the admission threshold, then 5% of the traffic is expected to be marked, whereas in the context of [I-D.briscoe-tsvwg-cl-architecture] a steady 5% overload should eventually result in 100% of all traffic being admission marked. A consequence of this is that for "smooth" constant-rate traffic, the approach presented here will not mark any traffic at all until the rate of the traffic exceeds the configured admission threshold by the amount corresponding to the chosen CLE threshold. At first glance this may seem to result in a violation of the pre- congestion notification premise that attempts to stop admission before the desired traffic level is reached. However, in reality one can simply embed the CLE level into the desired configuration of the admission threshold. 
That is, if a certain rate X is the actual target admission threshold, then one should configure the rate of the metering device (e.g. the rate of the token bucket) to X-y where y corresponds to the level of CLE that would trigger admission blocking decision. A more important distinction is that virtual-queue based marking reacts to short-term burstiness of traffic, while the excess-rate based marking is only capable of reacting to rate violations at the timescale chosen for rate measurement. Based on our investigation, it seems that this distinction is not crucial in the context of PCN when no actual queuing is expected even if the virtual queue is full. More discussion on this is presented later in the draft. 2.4.2. Flow Termination Decision When the ingress observes a non-zero CLE and Sustainable Admission Rate (SAR), it first computes the Sustainable Termination Rate (STR) by simply multiplying SAR by the system-wide constant u, where u is the system-wide ratio between Preemption and admission thresholds on all links in the PCN domain: STR = SAR*U. The PCN-ingress-node then performs exactly the same operation as is proposed in [I-D.briscoe-tsvwg-cl-architecture] with respect to STR: it preempts the appropriate number of flows to ensure that the rate of traffic it sends to the corresponding egress node does not exceed STR. Just as in the case of [I-D.briscoe-tsvwg-cl-architecture], an implementation may decide to slow down the termination process by preempting fewer flows than is necessary to cap its traffic to STR by employing a variety of techniques such as safety factors or hysteresis. In summary, the operation of Termination at the ingress node is identical to that of [I-D.briscoe-tsvwg-cl-architecture], with the only exception that the sustainable Termination rate is computed from the sustainable admission rate rather than derived from a separate Charny, et al. Expires January 10, 2008 [Page 9] Internet-Draft PCN with Single Marking July 2007 marking. As discussed earlier, this is enabled by imposing a system- wide restriction on the termination-to-admission thresholds ratio and changing the semantics of the admission marking. 3. Benefits of Allowing the Single Marking Approach The following is a summary of benefits associated with enabling the Single Marking Approach. Some tradeoffs will be discussed in section 7 below. o Reduced implementation requirements on core routers due to a single metering implementation instead of two different ones. o Ease of use on existing hardware: given that the proposed approach is particularly amenable to a token bucket implementation, the availability of token buckets on virtually all commercially available routers makes this approach especially attractive. o Enabling incremental implementation and deployment of PCN (see section 4). o Reduced number of codepoints which need to be conveyed in the packet header. If the PCN-bits used in the packets header to convey the congestion notification information are the ECN-bits in an IP core and the EXP-bits in an MPLS core, those are very expensive real-estate. The current proposals need 5 codepoints, which is especially important in the context of MPLS where there is only a total of 8 EXP codepoints which must also be shared with DiffServ. Eliminating one codepoint considerably helps. o A possibility of using a token-bucket-, excess-rate- based implementation for admission provides extra flexibility for the choice of an admission mechanism, even if two separate markings and thresholds are used. 
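As a concrete illustration of the token-bucket implementation mentioned in the last bullet above (and of the PCN-interior-node behavior of Section 2.2), the following sketch shows one way excess-rate marking could be realized. It is a non-normative example; the class and parameter names are purely illustrative assumptions.

   # Illustrative, non-normative sketch of excess-rate marking at a
   # PCN-interior-node using a token bucket (all names are hypothetical).
   class ExcessRateMarker:
       def __init__(self, admissible_rate_bps, bucket_depth_bits):
           self.rate = admissible_rate_bps    # configured-admissible-rate
           self.depth = bucket_depth_bits     # tolerance to PCN traffic burstiness
           self.tokens = bucket_depth_bits
           self.last_update = 0.0             # time of last update, in seconds

       def should_mark(self, now, packet_size_bits):
           # Refill tokens at the configured-admissible-rate, capped at the depth.
           elapsed = now - self.last_update
           self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
           self.last_update = now
           if self.tokens >= packet_size_bits:
               self.tokens -= packet_size_bits
               return False    # traffic fits under the admission threshold: leave unmarked
           return True         # excess traffic: apply the PCN mark

The fraction of traffic marked in this way is what the PCN-egress-node reports as the CLE, as described in Section 2.3.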
Subsequent sections argue that these benefits can be achieved with relatively minor enhancements to the proposed PCN architecture as defined in [I-D.eardley-pcn-architecture], allowing simpler implementations at the PCN-interior-nodes and only trivial modifications at the PCN-boundary nodes.

4. Impact on PCN Architectural Framework

The goal of this section is to propose several minor changes to the PCN architecture framework as currently described in [I-D.eardley-pcn-architecture] in order to enable the single marking approach.

Charny, et al. Expires January 10, 2008 [Page 10]

Internet-Draft PCN with Single Marking July 2007

4.1. Impact on the PCN-Internal-Node

No changes are required to the PCN-internal-node in the architectural framework of [I-D.eardley-pcn-architecture] in order to support the Single Marking Proposal. The current architecture [I-D.eardley-pcn-architecture] already allows only one marking and metering scheme rather than two by supporting either "admission only" or "termination only" functionality. To support the Single Marking proposal, a single threshold (i.e. configured-termination-rate) must be configured at the PCN-internal-node, and excess-rate marking should be used to mark packets as described in [I-D.briscoe-tsvwg-cl-architecture]. Note however that the meaning of this single threshold and the marking in this case is not related to the termination function, but rather to the admission function. The configuration parameter(s) at the PCN-ingress-nodes and PCN-egress-nodes, described in section 4.2 below, will determine whether the marking should be interpreted as the admission-marking (as appropriate for the Single Marking approach) or as termination-marking (as appropriate for the Dual Marking approach of [I-D.briscoe-tsvwg-cl-architecture]).

We note that from the implementation standpoint, a PCN-internal-node supporting Single Marking implements only a subset of the functionality needed for Dual Marking.

4.2. Impact on the PCN-boundary nodes

We propose the addition of one global configuration parameter MARKING_MODE to be used at all PCN boundary nodes. If MARKING_MODE = DUAL_MARKING, the behavior of the appropriate PCN-boundary-node is as described in the current version of [I-D.eardley-pcn-architecture]. If MARKING_MODE = SINGLE_MARKING, the behavior of the appropriate boundary nodes is as described in the subsequent subsections.

4.2.1. Impact on PCN-Egress-Node

If MARKING_MODE=SINGLE_MARKING, the Congestion-Level-Estimate (CLE) is measured against termination-marked packets. If MARKING_MODE=DUAL_MARKING, the CLE is measured against admission-marked packets. The method of measurement does not depend on the choice of the marking against which the measurement is performed. Regardless of the setting of the MARKING_MODE parameter, the Sustainable-Aggregate-Rate is measured against termination-marked packets, as currently defined in [I-D.briscoe-tsvwg-cl-architecture].

Charny, et al. Expires January 10, 2008 [Page 11]

Internet-Draft PCN with Single Marking July 2007

We note that from the implementation point of view, the same two functions (measuring the CLE and measuring the Sustainable-Aggregate-Rate) are required by both the Single Marking approach and the Dual Marking approach, so the difference in the implementation complexity of the PCN-egress-node is quite negligible.

4.2.2. Impact on the PCN-Ingress-Node

If MARKING_MODE=DUAL_MARKING, the PCN-ingress-node behaves exactly as described in [I-D.eardley-pcn-architecture].
If MARKING_MODE = SINGLE_MARKING, then an additional global parameter U is defined. U must be configured at all PCN_ingess nodes and has the meaning of the desired ratio between the traffic level at which termination should occur and the desired admission threshold, as described in section 2.4 above. The value of U must be greater than or equal to 1. The value of this constant U is used to multiply the Sustainable Aggregate Rate received from a given PCN-egress-node to compute the rate threshold used for flow termination decisions. In more detail, if MARKING_MODE=SINGLE_MARKING, then o A PCN-ingress-node receives CLE and/or Sustainable Aggregate Rate from each PCN-egress-node it has traffic to. This is fully compatible with PCN architecture as described in [I-D.eardley-pcn-architecture]. o A PCN-ingress-node bases its admission decisions on the value of CLE. Specifically, once the value of CLE exceeds a configured threshold, the PCN-ingress-node stops admitting new flows. It restarts admitting when the CLE value goes down below the specified threshold. This is fully compatible with PCN architecture as described in [I-D.eardley-pcn-architecture]. o A PCN-ingress node receiving a Sustainable Rate from a particular PCN-egress node measures its traffic to that egress node. This again is fully compatible with PCN architecture as described in [I-D.eardley-pcn-architecture]. o The PCN-ingress-node computes the desired Termination Rate to a particular PCN-egress-node by multiplying the Sustainable Aggregate Rate from a given PCN-egress-node by the value of the configuration parameter U. This computation step represents a proposed change to the current version of [I-D.eardley-pcn-architecture]. o Once the Termination Rate is computed, it is used for the flow termination decision in a manner fully compatible with [I-D.eardley-pcn-architecture]. Namely the PCN-ingress-node Charny, et al. Expires January 10, 2008 [Page 12] Internet-Draft PCN with Single Marking July 2007 compares the measured traffic rate destined to the given PCN- egress-node with the computed Termination rate for that egress node, and terminates a set of traffic flows to reduce the rate exceeding that Termination rate. This is fully compatible with [I-D.eardley-pcn-architecture]. We note that as in the case of the PCN-egress-node, the change in the implementation of the PCN-ingress-node to support Single Marking is quite negligible (a single multiplication per ingress rate measurement interval for each egress node). 4.3. Summary of Proposed Enhancements Required for Support of Single Marking Options The enhancements to the PCN architecture as defined in [I-D.eardley-pcn-architecture], in summary, amount to: o defining a global (within the PCN domain) configuration parameter MARKING_MODE at PCN-boundary nodes o Defining a global (within the PCN domain) configuration parameter U at the PCN-ingress_nodes. This parameter signifies the implicit ratio between the termination and admission thresholds at all links o Multiplication of Sustainable-Aggregate-Rate by the constant U at the PCN-ingress-nodes if MARKING_MODE=SINGLE_MARKING o Using the MARKING_MODE parameter to guide which marking is used to measure the CLE at the PCN-egress-node (but the actual measurement functionality is unchanged). 4.4. 
Proposed Optional Renaming of the Marking and Marking Thresholds Previous work on example mechanisms [I-D.briscoe-tsvwg-cl-architecture] implementing the architecture of [I-D.eardley-pcn-architecture] assumed that the semantics of admission control marking and termination marking differ. Specifically, it was assumed that for termination purposes the semantics of the marking is related to the excess rate over the configured (termination) rate, or even more precisely, the amount of traffic that remains unmarked (sustainable rate) after the excess traffic is marked. Some of the recent proposals assume yet different marking semantics [I-D.babiarz-pcn-3sm], [I-D.westberg-pcn-load-control]. Even though specific association with marking semantics and function (admission vs termination) has been assumed in prior work, it is Charny, et al. Expires January 10, 2008 [Page 13] Internet-Draft PCN with Single Marking July 2007 important to note that in the current architecture draft [I-D.eardley-pcn-architecture], the associations of specific marking semantics (virtual queue vs excess rate) with specific functions (admission vs termination) are actually *not* directly assumed. In fact , the architecture document does not explicitly define the marking mechanism, but rather states the existence of two different marking mechanisms, and also allows implementation of either one or both of these mechanisms in a PCN-domain. We argue that this separation of the marking semantics from the functional use of the marking is important to make sure that devices supporting the same marking can interoperate in delivering the function which is based on specific supported marking semantics. To explicitly divorce the function (admission vs termination) and the semantics (excess rate marking, virtual queue marking), it may be beneficial to rename the marking to be associated with the semantics rather than the function. Specifically, it may be beneficial to change the "admission-marking" and "termination-marking" currently defined in the architecture as "Type Q" or "virtual-queue-based" marking, and "Type R" or "excess-rate-based" marking. Of course, other choices of the naming are possible (including keeping the ones currently used in [I-D.eardley-pcn-architecture]). With this renaming, the dual marking approach in [I-D.briscoe-tsvwg-cl-architecture] would require PCN-internal-nodes to support both Type R and Type Q marking, while Single Marking would require support of Type-R marking only. We conclude by emphasizing that the changes proposed here amount to merely a renaming rather than a change to the proposed architecture, and are therefore entirely optional. 4.5. An Optimization Using a Single Configuration Parameter for Single Marking We note finally that it is possible to use a single configuration constant U instead of two constants (U and MARKING_TYPE). Specifically, one can simply interpret the value of U=1 as the dual- marking approach (equivalent to MARKING_TYPE=DUAL_MARKING) and use U>1 to indicate Single Marking. 5. Incremental Deployment Considerations As most of today's routers already implement a token bucket, implementing token-bucket based excess-rate marking at PCN-ingress nodes is a relatively small incremental step for most of today's Charny, et al. Expires January 10, 2008 [Page 14] Internet-Draft PCN with Single Marking July 2007 implementations. 
Implementing an additional metering and marking scheme in the datapath required by the dual-marking approach without encountering performance degradation is a larger step. The single-marking approach may be used as an intermediate step towards the deployment of a dual-marking approach in the sense that routers implementing single-marking functionality only may be incrementally deployed. The deployment steps might be as follows:

o Initially all PCN-internal-nodes might implement excess-rate (Type R) marking and metering only

o All PCN-boundary nodes implement the full functionality as described in this document (including the configuration parameters MARKING_MODE and U) from the start. Since the PCN-boundary-node behavior is enabled by simply changing the values of the configuration parameters, all boundary nodes become immediately compatible with both dual-marking and single-marking.

o Initially all boundary nodes are configured with parameter settings indicating the Single Marking option.

o When a PCN-internal node with dual-marking functionality replaces a subset of PCN-internal-nodes, the virtual-queue-based (Type Q) marking is simply ignored by the boundary nodes until all PCN-internal-nodes in the PCN-domain implement the dual-marking metering and marking. At that time the values of the configuration parameters may be reset at all boundary nodes to indicate the Dual Marking configuration.

o Note that if a subset of PCN-boundary-nodes communicates only with each other, and all PCN-internal-nodes their traffic traverses have been upgraded, this subset of nodes can be upgraded to the dual-marking behavior while the rest of the PCN-domain can still run the single marking case. This would entail configuring two thresholds at the PCN-internal-nodes, and setting the value of the configuration parameters appropriately in this subset.

o Finally note that if the configuration parameter U is configured per ingress-egress-pair rather than per boundary node, then each ingress-egress pair can be upgraded to the dual marking simultaneously. While we do not recommend that U be defined per ingress-egress pair, this possibility should be noted and considered.

Charny, et al. Expires January 10, 2008 [Page 15]

Internet-Draft PCN with Single Marking July 2007

6. Tradeoffs, Issues and Limitations of Single Marking Approach

6.1. Restrictions on Termination-to-admission Thresholds

An obvious restriction necessary for the single-marking approach is that the ratio of (implicit) Preemption and admission thresholds remains the same on all links in the PCN region. While clearly a limitation, this does not appear to be particularly crippling, and does not appear to outweigh the benefits of reducing the overhead in the router implementation and savings in codepoints in the case of a single PCN domain, or in the case of multiple concatenated PCN regions. The case when this limitation becomes more inconvenient is when an operator wants to merge two previously separate PCN regions (which may have different admission-to-preemption ratios) into a single PCN region. In this case it becomes necessary to do a network-wide reconfiguration to align the settings.

The fixed ratio between the implicit termination rate and the configured-admissible-rate also has implications for traffic engineering considerations. Those are discussed in section 6.7 below.

6.2.
Assumptions on Loss Just as in the case of [I-D.briscoe-tsvwg-cl-architecture], the approach presented in this draft assumes that the configured- admissible-rate is configured at each link below the service rate of the traffic using PCN. This assumption is significant because the algorithm relies on the fact that if admission threshold is exceeded, enough marked traffic reaches the pcn-egress-node to reach the configured CLE level. If this condition does not hold, then traffic may get dropped without ever triggering admission decision. 6.3. Effect of Reaction Timescale of Admission Mechanism As mentioned earlier in this draft, there is a potential concern that slower reaction time of admissions mechanism presented in this draft compared to [I-D.briscoe-tsvwg-cl-architecture] may result in overshoot when the load grows rapidly, and undershoot when the load drops rapidly. While this is a valid concern theoretically, it should be noted that at least for the traffic and parameters used in the simulation study reported here, there was no indication that this was a problem. 6.4. Performance Implications and Tradeoffs Replacement of a relatively well-studied queue-based measurement- based admission control approach by a cruder excess-rate measurement Charny, et al. Expires January 10, 2008 [Page 16] Internet-Draft PCN with Single Marking July 2007 technique raises a number of algorithmic and performance concerns that need to be carefully evaluated. For example, a token-bucket excess rate measurement is expected to be substantially more sensitive to traffic burstiness and parameter setting, which may have a significant effect in the case of lower levels of traffic aggregation, especially for variable-rate traffic such as video. In addition, the appropriate timescale of rate measurement needs to be carefully evaluated, and in general it depends on the degree of expected traffic variability which is frequently unknown. In view of that, an initial performance comparison of the token- bucket based measurement is presented in the following section. Within the constraints of this study, the performance tradeoffs observed between the queue-based technique suggested in [I-D.briscoe-tsvwg-cl-architecture] and a simpler token-bucket-based excess rate measurement do not appear to be a cause of substantial concern for cases when traffic aggregation is reasonably high at the bottleneck links as well as on a per ingress-egress pair basis. Details of the simulation study, as well as additional discussion of its implications are presented in section 6. Also, one mitigating consideration in favor of the simpler mechanism is that in a typical DiffServ environment, the real-time traffic is expected to be served at a higher priority and/or the target admission rate is expected to be substantially below the speed at which the real-time queue is actually served. If these assumptions hold, then there is some margin of safety for an admission control algorithm, making the requirements for admission control more forgiving to bounded errors - see additional discussion in section 6. 6.5. Effect on Proposed Anti-Cheating Mechanisms Replacement of the queue-based admission control mechanism of [I-D.briscoe-tsvwg-cl-architecture] by an excess-rate based admission marking changing the semantics of the pre-congestion marking, and consequently interferes with mechanisms for cheating detection discussed in [I-D.briscoe-tsvwg-re-ecn-border-cheat]. 
Implications of excess-rate based marking on the anti-cheating mechanisms need to be considered.

6.6. ECMP Handling

An issue not directly addressed by either the dual-marking approach described in [I-D.briscoe-tsvwg-cl-architecture] or the single-marking approach described in this draft is that if ECMP is enabled in the PCN-domain, then the PCN-boundary-nodes do not have a way of knowing whether specific flows in the ingress-egress aggregate followed the same path or not.

Charny, et al. Expires January 10, 2008 [Page 17]

Internet-Draft PCN with Single Marking July 2007

If multiple paths are followed, then some of those paths may be experiencing pre-congestion marking, while others are not. Hence, for example, an ingress node may choose to terminate a flow which takes an entirely uncongested path. This will not only unnecessarily terminate some flows, but also will not eliminate congestion on the actually congested path. While eventually, after several iterations, the correct number of flows might be terminated on the congested path, this is clearly suboptimal, as the termination takes longer, and many flows are potentially terminated unnecessarily.

Two approaches for solving this problem were proposed in [I-D.babiarz-pcn-3sm] and [I-D.westberg-pcn-load-control]. The former handles ECMP by terminating those flows that are termination-marked as soon as the termination marking is seen. The latter uses an additional DiffServ marking/codepoint to mark all packets of the flows passing through a congestion point, with the PCN-boundary-nodes terminating only those flows which are marked with this additional marking. Both of these approaches also differ in the termination-marking semantics, but we omit the discussion of these differences as they can be considered largely independent of the ECMP issue.

It should be noted that although not proposed in this draft, either of these ideas can be used with the dual- and single-marking approaches discussed here. Specifically, when a PCN-ingress-node decides which flows to terminate, it can choose for termination only those flows that are termination-marked. Likewise, at the cost of an additional (DiffServ) codepoint, a PCN-internal-node can mark all packets of all flows using this additional marking, and then the PCN-boundary-nodes can use this additional marking to guide their flow termination decisions. Either of these approaches appears to imply changes to the PCN architecture as proposed in [I-D.eardley-pcn-architecture]. Such changes have not been considered in this draft at this point.

6.7. Traffic Engineering Considerations

Dual-marking PCN can be viewed as a replacement for Resilient Network Provisioning (RNP). It is reasonable to expect that an operator currently using DiffServ provisioning for real-time traffic might consider a move to PCN. For such a move it is necessary to understand how to set the PCN rate thresholds to make sure that the move to PCN does not detrimentally affect the guarantees currently offered by the operator.

Charny, et al. Expires January 10, 2008 [Page 18]

Internet-Draft PCN with Single Marking July 2007

The key question addressed in this section is how to set PCN admission and termination thresholds in the dual-marking approach, or the single admission threshold and the scaling factor U (reflecting the implicit termination threshold) in the single-marking approach, to ensure that the result is "not worse than provisioning" in the amount of traffic that can be admitted.
Even more specifically we will address what if any are the tradeoffs between the dual-marking and the single-approach arise when answering this question. This question was first raised in [Menth] and is further addressed below. Typically, RNP would size the network (in this specific case for traffic that is expected to use PCN) by making sure that capacity available for this (PCN) type of traffic is sufficient under "normal" circumstances ( that is, under no failure condition, for a given traffic matrix), and under a specific set of single failure scenarios (e.g. failure of each individual single link). Some of the obvious limitations of such provisioning is that o the traffic matrix is often not known well, and at times, especially during flash-crowds, the actual traffic matrix can differ substantially from the one assumed by provisioning o unpredicted, non-planned failures can occur (e.g. multiple links, nodes, etc), causing overload. It is specifically such unplanned cases that serve as the motivation for PCN. Yet, one may want to make sure that for cases that RNP can (and does today) plan for, PCN does no worse when an operator makes the decision to implement PCN on a currently provisioned network. This question directly relates to the choice of the PCN configured admission and termination thresholds. For the dual-marking approach, where the termination and admission thresholds are set independently on any link, one can address this issue as follows [Menth]. If a provisioning tool is available, for a given traffic matrix, one can determine the utilisation of any link used by traffic expected to use PCN under the no-failure condition, and simply set the configured-admissible-rate to that "no-failure utilization". Then a network using PCN will be able to admit as much traffic as the RNP, and will reject any traffic that exceeds the expected traffic matrix. To address resiliency against a set of planned failures, one can use RNP to find the worst-case utilization of any link under the set of all provisioned failures, and then set the configured-termination-rate to that worst case utilisation. Clearly, such setting of PCN thresholds with the dual-marking approach will achieve the following goals: Charny, et al. Expires January 10, 2008 [Page 19] Internet-Draft PCN with Single Marking July 2007 o PCN will admit the same traffic matrix as used by RNP and will protect it against all planned failures without terminating any traffic o When traffic deviates from the planned traffic matrix, PCN will admit such traffic as long as the total usage of any link (without failure) does not exceed the configured-admission threshold, and all admitted traffic will be protected against all planned failures o Additional traffic will not be admitted under the normal, no- failure conditions o Traffic exceeding configure-termination threshold during non- planned failures will be terminated o Under non-planned failures, some of the planned traffic matrix may be terminated, but the remaining traffic will be able to receive its QoS treatment. The above argues that an operator moving from a purely provisioned network to a PCN network can find the settings of the PCN threshold with dual marking in such a way that all admitted traffic is protected against all planned failures. It is easy to see that with the single-marking scheme, the above approach does not work directly [Menth]. 
Indeed, the ratio between the configured-termination thresholds and the configured-admissible- rate used by the dual-marking approach as described above may not be constant on all links. Since the single-marking approach requires the (implicit) termination rate to be within a fixed factor of the configured admission rate, it can be argued (as was argued in [Menth].) that one needs to set the system-wide ratio U between the (implicit) termination threshold and the configured admission threshold to correspond to the largest ratio between the worst case resilient utilization and the no-failure utilization of RNP, and set the admission threshold on each link to the worst case resilient utilization divided by that system wide ratio. Such approach would result in lower admission thresholds on some links than that of the dual-marking setting of the admission threshold proposed above. It can therefore be argued that PCN with single marking will be able to admit *less* traffic that can be fully protected under the planned set of failures than both RNP and the dual-marking approach. However, the settings of the single-marking threshold proposed above are not the only ones possible, and in fact we propose here that the settings are chosen differently. Such different settings (described below) will result in the following properties of the PCN network: Charny, et al. Expires January 10, 2008 [Page 20] Internet-Draft PCN with Single Marking July 2007 o PCN will admit the same traffic matrix as used by RNP *or more* o The traffic matrix assumed by RNP will be fully protected against all planned failures without terminating any admitted traffic o When traffic deviates from the planned traffic matrix, PCN will admit such traffic as long as the total usage of any link (without failure) does not exceed the configured-admission threshold, However, not all admitted traffic will be protected against all planned failures (i.e. even under planned failures, traffic exceeding the planned traffic matrix may be preempted) o Under non-planned failures, some of the planned traffic matrix may be terminated, but the remaining traffic will be able to receive its QoS treatment. It is easy to see that all of these properties can be achieved if instead of using the largest ratio between worst case resilient utilisation to the no-failure utilisation of RNP across all links for setting the system wide constant U in the single-marking approach as proposed in [Menth], one would use the *smallest* ratio, and then set the configured-admissible-rate to the worst case resilient utilization divided by that ratio. With such setting, the configured-admissions threshold on each link is at least as large as the non-failure RNP utilisation (and hence the planned traffic matrix is always admitted), and the implicit termination threshold is at the worst case planned resilient utilisation of RNP on each link (and hence the planned traffic matrix will be fully protected against the planned failures). Therefore, with such settings, the single-marking draft does as well as RNP or dual-marking with respect to the planned matrix and planned failures. In fact, unlike the dual marking approach, it can admit more traffic on some links than the planned traffic matrix would allow, but it is only guaranteed to protect up to the planned traffic matrix under planned failures. 
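To make the difference between the two settings concrete, the following small worked example (with invented, purely illustrative utilization numbers for two links) contrasts the setting suggested in [Menth] with the setting proposed above.

   # Hypothetical example, not simulation data: two links with RNP
   # utilizations expressed as a fraction of link capacity.
   #   link  no-failure   worst-case resilient   ratio
   #    L1      0.4              0.6              1.5
   #    L2      0.5              1.0              2.0
   no_failure = {"L1": 0.4, "L2": 0.5}
   worst_case = {"L1": 0.6, "L2": 1.0}
   ratio = {l: worst_case[l] / no_failure[l] for l in no_failure}

   # Setting of [Menth]: U = largest ratio (2.0); admission threshold =
   # worst-case utilization / U.  On L1 this gives 0.3, which is below the
   # no-failure utilization of 0.4, so less traffic is admitted than by RNP.
   U_menth = max(ratio.values())
   admission_menth = {l: worst_case[l] / U_menth for l in worst_case}

   # Setting proposed here: U = smallest ratio (1.5).  The admission
   # threshold is then 0.4 on L1 and about 0.67 on L2, i.e. at least the
   # no-failure utilization on every link, while the implicit termination
   # threshold (admission threshold * U) equals the worst-case resilient
   # utilization on every link.
   U_proposed = min(ratio.values())
   admission_proposed = {l: worst_case[l] / U_proposed for l in worst_case}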
In summary, we have argued that both the single-marking approach and the dual-marking approach can be configured to ensure that PCN "does no worse" than RNP for the planned matrix and the planned failure conditions, (and both can do better than RNP under non-planned conditions). The tradeoff between the two is that although the planned traffic matrix can be admitted with protection guarantees against planned failures with both approaches, the nature of the guarantee for the admitted traffic is different. Dual marking (with the settings proposed) would protect all admitted traffic but would not admit more than planned), while single marking (with the settings proposed) will admit more traffic than planned, but will not guarantee protection against planned failures for traffic exceeding planned utilization. Charny, et al. Expires January 10, 2008 [Page 21] Internet-Draft PCN with Single Marking July 2007 7. Performance Evaluation Comparison 7.1. Relationship to other drafts Initial simulation results of admission and termination mechanisms of [I-D.briscoe-tsvwg-cl-architecture] were reported in [I-D.briscoe-tsvwg-cl-phb]. A follow-up study of these mechanisms is presented in a companion draft [I-D.zhang-pcn-performance-evaluation]. The current draft concentrates on a performance comparison of the admission control mechanism of [I-D.briscoe-tsvwg-cl-phb] and the token-bucket-based admission control described in section 2 of this draft. 7.2. Limitations, Conclusions and Direction for Future Work Due to time constraints, the study performed so far was limited to a small set of topologies, described in the Appendix. The key questions that have been investigated are the comparative sensitivity of the two schemes to parameter settings and the effect of traffic burstiness and of the degree of aggregation on a per ingress-egress pair on the performance of the admission control algorithms under study. The study is limited to the case where there is no packet loss. While this is a reasonable initial assumption for an admission control algorithm that is supposed to maintain the traffic level significantly below the service capacity of the corresponding queue, nevertheless future study is necessary to evaluate the effect of packet loss. 7.2.1. High Level Conclusions The results of this (preliminary) study indicate that there may be a reasonable complexity/performance tradeoff for the choice of admission control algorithm. In turn, this suggests that using a single codepoint and metering technique for admission and Preemption may be a viable option. The key high-level conclusions of the simulation study comparing the performance of queue-based and token-based admission control algorithms are summarized below: 1. At reasonable level of aggregation at the bottleneck and per ingress-egress pair traffic, both algorithms perform reasonably well for the range of traffic models considered. 2. Both schemes are stressed for low levels of ingress-egress aggregation, especially for the burstier traffic models such as SVD. The token bucket scheme is substantially more sensitive to parameter variations than the virtual-queue scheme at the very Charny, et al. Expires January 10, 2008 [Page 22] Internet-Draft PCN with Single Marking July 2007 low levels of ingress-egress aggregation )(1-2 flows per ingress- egress pair), and in general is quite brittle at these very low aggregation levels. 
It also displays substantial performance degradation with BATCH traffic, and is sensitive to CBR synchronization effects resulting in substantial over-admission (see section 8.4.2). Luckily, these issues quickly diminish as the level of ingress-egress aggregation increases.

3. The absolute value of round-trip time (RTT) or the RTT difference between different ingress-egress pairs within the range of continental propagation delays does not appear to have a visible effect on the performance of either algorithm.

4. There is no substantial effect on the bottleneck utilization of multi-bottleneck topologies for both schemes. Both schemes suffer substantial unfairness (and possibly complete starvation) of the long-haul aggregates traversing multiple bottlenecks compared to short-haul flows (a property shared by other MBAC algorithms as well). The token-bucket scheme displayed somewhat larger unfairness than the virtual-queue scheme.

7.2.2. Future work

This study is but the first step in performance evaluation of the token-bucket based admission control. Further evaluation should include a range of investigations, including the following:

o interactions between admission control and termination

o effect of signaling delays/probing

o effect of loss of marked packets

8. Appendix A: Simulation Details

8.1. Network and Signaling Models

Network topologies used in this study are shown in the Figures below. The network is modeled as either Single Link (Fig. A.1), Multi Link Network with a single bottleneck (termed "RTT", Fig. A.2), or a range of multi-bottleneck topologies shown in Fig. A.3 (termed "Parking Lot").

Charny, et al. Expires January 10, 2008 [Page 23]

Internet-Draft PCN with Single Marking July 2007

   A --- B

   Figure A.1: Simulated Single Link Network.

   A
    \
   B - D - F
    /
   C

   Figure A.2: Simulated Multi Link Network.

   A--B--C      A--B--C--D      A--B--C--D--E--F
   |  |  |      |  |  |  |      |  |  |  |  |  |
   |  |  |      |  |  |  |      |  |  |  |  |  |
   D  E  F      E  F  G  H      G  H  I  J  K  L

     (a)            (b)              (c)

   Figure A.3: Simulated Multiple-bottleneck (Parking Lot) Topologies.

Figure A.1 shows a single link between an ingress and an egress node; all flows enter at node A and depart at node B. This topology is used for the basic verification of the behavior of the algorithms with respect to a single ingress-egress aggregate in isolation.

In Figure A.2, a set of ingresses (A,B,C) are connected to an interior node in the network (D). This topology is used to study the behavior of the algorithm where many ingress-egress aggregates share a single bottleneck link. The number of ingresses varied in different simulation experiments in the range of 2-100. All links have generally different propagation delays, in the range 1ms - 100ms (although in some experiments all propagation delays are set the same). Node D in turn is connected to the egress (F). In this topology, different sets of flows between each ingress and the egress converge on the single link D-F, where the pre-congestion notification algorithm is enabled. The capacities of the ingress links are not limiting, and hence no PCN is enabled on those. The bottleneck link D-F is modeled with a 10ms propagation delay in all simulations. Therefore the range of round-trip delays in the experiments is from 22ms to 220ms.

Another type of network of interest is the multi-bottleneck (or Parking Lot, PLT for short) topology. The simplest PLT with 2 bottlenecks is illustrated in Fig A.3(a).

Charny, et al. Expires January 10, 2008 [Page 24]

Internet-Draft PCN with Single Marking July 2007
An example traffic matrix with this network on this topology is as follows: o an aggregate of "2-hop" flows entering the network at A and leaving at C (via the two links A-B-C) o an aggregate of "1-hop" flows entering the network at D and leaving at E (via A-B) o an aggregate of "1-hop" flows entering the network at E and leaving at F (via B-C) In the 2-hop PLT shown in Fig. A.3(a) the points of congestion are links A--B and B--C. Capacity of all other links is not limiting. We also experiment with larger PLT topologies with 3 bottlenecks(see Fig A.3(b)) and 5 bottlenecks ( Fig A.3 (c)). In all cases, we simulated one ingress-egress pair that carries the aggregate of "long" flows traversing all the N bottlenecks (where N is the number of bottleneck links in the PLT topology), and N ingress-egress pairs that carry flows traversing a single bottleneck link and exiting at the next "hop". In all cases, only the "horizontal" links in Fig. A.3 were the bottlenecks, with capacities of all "vertical" links non-limiting. Propagation delays for all links in all PLT topologies are set to 1ms. Due to time limitations, other possible traffic matrices (e.g. some of the flows traversing a subset of several bottleneck links) have not yet been considered and remain the area for future investigation. Our simulations concentrated primarily on the range of capacities of 'bottleneck' links with sufficient aggregation - above 10 Mbps for voice and 622 Mbps for SVD, up to 2.4 Gbps. But we also investigated slower 'bottleneck' links down to 512 Kbps in some experiments. Higher rate bottleneck speeds wee not considered due to the simulation time limitations. It should generally be expected that the higher link speeds will result in higher levels of aggregation, and hence generally better performance of the measurement-based algorithms. Therefore is seems reasonable to believe that the link speeds studied do provide meaningful evaluation targets. In the simulation model, a call requests arrives at the ingress and immediately sends a message to the egress. The message arrives at the egress after the propagation time plus link processing time (but no queuing delay). When the egress receives this message, it immediately responds to the ingress with the current Congestion- Level-Estimate. If the Congestion-Level-Estimate is below the specified CLE-threshold, the call is admitted, otherwise it is rejected. An admitted call sends packets according to one of the Charny, et al. Expires January 10, 2008 [Page 25] Internet-Draft PCN with Single Marking July 2007 chosen traffic models for the duration of the call (see next section). Propagation delay from source to the ingress and from destination to the egress is assumed negligible and is not modeled. In the simulation model of admission control, a call request arrives at the ingress and immediately sends a message to the egress. The message arrives at the egress after the propagation time plus link processing time (but no queuing delay). When the egress receives this message, it immediately responds to the ingress with the current Congestion Level Estimate. If the Congestion Level Estimate is below the specified CLE- threshold, the call is admitted, otherwise it is rejected. For Flow Termination, once the ingress node of a PCN- domain decides to terminate a flow, that flow is preempted immediately and sends no more packets from that time on. The life of a flow outside the domain described above is not modelled. 
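For clarity, the ingress-side decision logic, as modelled in the simulations and described in Section 2.4, can be summarized by the following sketch. The parameter values, the rate-measurement details and the choice of which flows to preempt are illustrative assumptions only, not behavior mandated by this draft.

   # Non-normative sketch of the simulated per ingress-egress-pair logic.
   CLE_THRESHOLD = 0.03   # e.g. stop admitting when 3% of received traffic is marked
   U = 1.2                # assumed system-wide termination-to-admission ratio

   def admit_new_flow(cle):
       # Admission: accept the call only if the reported CLE is below the threshold.
       return cle < CLE_THRESHOLD

   def flows_to_terminate(flow_rates_bps, measured_aggregate_bps, sar_bps):
       # Termination: compute STR = SAR * U and preempt flows until the
       # aggregate sent towards the egress no longer exceeds STR.
       str_bps = sar_bps * U
       excess = measured_aggregate_bps - str_bps
       victims = []
       for flow_id, rate in sorted(flow_rates_bps.items(), key=lambda kv: kv[1]):
           if excess <= 0:
               break
           victims.append(flow_id)
           excess -= rate
       return victims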
8.2. Traffic Models

Four types of traffic were simulated: CBR voice; on-off traffic
approximating voice with silence compression; on-off traffic with
higher peak and mean rates, which we term "Synthetic Video" (SVD)
because the chosen peak and mean rates are similar to those of an
MPEG video stream (for SVD no attempt was made to match any other
parameters of this traffic to those of a video stream); and, finally,
real video traces from
http://www.tkn.tu-berlin.de/research/trace/trace.html (courtesy of
the Telecommunication Networks Group of the Technical University of
Berlin).

The distribution of flow duration was chosen to be exponential with a
mean of 1 min, regardless of the traffic type. In most of the
experiments flows arrived according to a Poisson process with the
mean arrival rate chosen to achieve a desired amount of overload over
the configured-admissible-rate in each experiment. Overloads in the
range 1x to 5x and an underload of 0.95x have been investigated. Note
that the rationale for looking at loads of 1x and below is to see
whether any significant amount of "false rejects" would be seen
(i.e. one would expect all traffic to be accepted if the total demand
is below the admission threshold). For on-off traffic, on and off
periods were exponentially distributed with the specified means.
Traffic parameters for each type are summarized below.

8.2.1. Voice Traffic Models

Table A.1 below describes all voice codecs we modeled in our
simulations.

The first two rows correspond to our two basic models, corresponding
to the older G.711 encoding with and without silence compression.
These two models are referred to simply as "CBR" and "VBR" in the
reported simulation results. We also simulated several "mixes" of the
different codecs listed in the table. The primary mix consists of
equal proportions of all the voice codecs listed below. We have also
simulated various other mixes consisting of different proportions of
a subset of the codecs; those results are not reported in this draft
because of their similarity to the results for the primary mix.

 -------------------------------------------------------------------
| Name/Codecs|Packet Size|Inter-Arrival|On/Off Period |Average Rate |
|            |  (Bytes)  |  Time (ms)  |    Ratio     |   (kbps)    |
 -------------------------------------------------------------------
|   "CBR"    |    160    |     20      |      1       |     64      |
 -------------------------------------------------------------------
|   "VBR"    |    160    |     20      |     0.34     |    21.75    |
 -------------------------------------------------------------------
| G.711 CBR  |    200    |     20      |      1       |     80      |
 -------------------------------------------------------------------
| G.711 VBR  |    200    |     20      |     0.4      |     32      |
 -------------------------------------------------------------------
| G.711 CBR  |    120    |     10      |      1       |     96      |
 -------------------------------------------------------------------
| G.711 VBR  |    120    |     10      |     0.4      |    38.4     |
 -------------------------------------------------------------------
| G.729 CBR  |     60    |     20      |      1       |     24      |
 -------------------------------------------------------------------
| G.729 VBR  |     60    |     20      |     0.4      |     9.6     |
 -------------------------------------------------------------------

Table A.1 Simulated Voice Codecs.

8.2.2. "Synthetic Video": High Rate ON-OFF Traffic with Video-like
       Mean and Peak Rates ("SVD")

This model is on-off traffic with a video-like peak-to-mean ratio and
a mean rate approximating that of an MPEG-2 video stream. No attempt
is made to simulate any other aspects of a real video stream, and
this model is merely that of on-off traffic. Although there is no
claim that this model adequately represents the performance of video
traffic under the algorithms in question, intuitively this model
should be more challenging for a measurement-based algorithm than
actual MPEG video, and as a result 'good' or 'reasonable' performance
for this traffic model indicates that MPEG traffic should perform at
least as well. We term this type of traffic SVD, for "Synthetic
Video".

o Long-term average rate: 4 Mbps

o On-period mean duration: 340ms; during the on-period packets are
  sent at 12 Mbps (1500-byte packets, 1ms packet inter-arrival time)

o Off-period mean duration: 660ms

8.2.3. Real Video Traces (VTR)

We used a publicly available library of frame-size traces of long
MPEG-4 and H.263 encoded video obtained from
http://www.tkn.tu-berlin.de/research/trace/trace.html. Each trace in
that repository is roughly 60 minutes long and consists of a list of
per-frame records. Among the 160 available traces, we picked the two
with the highest average rate (averaged over the trace length, in
this case 60 minutes); these two traces also have similar average
rates. The trace file used in the simulation is the concatenation of
the two.

Since the duration of a flow in our simulation is much smaller than
the length of the trace, we checked whether the expected rate of a
flow corresponds to the trace's long-term average. To do so, we
simulated a number of flows starting at random locations in the
trace, with durations chosen to be exponentially distributed with a
mean of 1 min. The results show that the expected rate of a flow is
roughly the same as the trace's average. In summary, our simulations
use a set of segments of the 120-min trace chosen at random offsets
from the beginning and with a mean duration of 1 min.

Since the traces provide only the frame sizes, we also simulated
packetization of each frame as a CBR segment with packet size and
inter-arrival time corresponding to those of our SVD model. Since the
frame size is not always a multiple of the chosen packet size, the
last packet of a frame may be shorter than the 1500 bytes chosen for
the SVD encoding. The traffic characteristics of our VTR model are
summarized below:

o Average rate: 769 Kbps

o Each frame is sent with a packet length of 1500 bytes and a packet
  inter-arrival time of 1ms

o No traffic is sent between frames.

8.2.4. Randomization of Base Traffic Models

To emulate some degree of network-introduced jitter, in some
experiments we implemented a limited randomization of the base models
by randomly moving each packet by a small amount of time around its
transmission time in the corresponding base traffic model. More
specifically, for each packet we chose a random number R, picked from
a uniform distribution over a "randomization-interval", and delayed
the packet by R compared to its ideal departure time. We chose the
randomization-interval to be a fraction of the packet-inter-arrival
time of the CBR portion of the corresponding base model.
To simulate a range of queueing delays, we varied this fraction from
0.0001 to 0.1. While we do not claim this to be an adequate model of
network-introduced jitter, we chose it for its simplicity of
implementation, as a means to gain insight into any simulation
artifacts of strictly CBR traffic generation. We implemented
randomized versions of all 5 traffic streams (CBR, VBR, MIX, SVD and
VTR) by randomizing the CBR portion of each model.

8.3. Parameter Settings

8.3.1. Queue-based settings

All the queue-based simulations were run with the following Virtual
Queue thresholds:

o virtual-queue-rate: configured-admissible-rate, 1/2 of the link
  speed

o min-marking-threshold: 5ms at the virtual-queue-rate

o max-marking-threshold: 15ms at the virtual-queue-rate

o virtual-queue-upper-limit: 20ms at the virtual-queue-rate

At the egress, the CLE is computed as an exponential weighted moving
average (EWMA) on an interval basis, with a 100ms measurement
interval chosen in all simulations. We simulated EWMA weights ranging
from 0.1 to 0.9. The CLE threshold is chosen to be 0.05, 0.15, 0.25,
or 0.5.

8.3.2. Token Bucket Settings

The token bucket rate is set to the configured-admissible-rate, which
is half of the link speed in all experiments. The token bucket depth
ranges from 64 to 512 packets. Our simulation results indicate that
the depth of the token bucket has no significant impact on the
performance of the algorithm and hence, in the rest of this section,
we only present the results with a 256-packet bucket depth.

The CLE is calculated using an EWMA just as in the case of the
virtual-queue settings, with weights from 0.1 to 0.9. The CLE
thresholds are chosen to be 0.0001, 0.001, 0.01, and 0.05 in this
case. Note that since the meaning of the CLE is different for the
token-bucket and the queue-based algorithms, there is no direct
correspondence between the choice of the CLE thresholds in the two
cases.

8.4. Simulation Details

To evaluate the performance of the algorithms, we recorded the actual
admitted load at a granularity of 50ms, from which the mean admitted
load over the duration of the simulation run can be computed. We
verified that the actual admitted load at any time does not deviate
much from the mean admitted load in each experiment by computing the
coefficient of variation (the CV is consistently 0.07 for CBR, 0.15
for VBR, 0.17 for VTR and 0.51 for SVD across all experiments).
Finally, the performance of the algorithms is evaluated using a
metric called over-admission-percentage, which is calculated as the
percentage difference between the mean admitted load (with the mean
taken over the duration of the experiment) and the configured
admission rate. Given the reasonably small deviation of the admitted
rate from its mean in the experiments, this seems reasonable.

8.4.1. Sensitivity to EWMA weight and CLE

Table A.2 summarizes the over-admission-percentage values from 15
experiments with different [weight, CLE threshold] settings for each
type of traffic and each topology. The ratio of the demand on the
bottleneck link to the configured admission threshold is set to 5x.
(The results for the 0.95x load can be found in the previous version
of this draft.) For parking lot topologies we report the worst case
result across all bottlenecks. We present here only the extreme
values over the range of resulting over-admission-percentage values.
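As a concrete illustration of the egress-side measurement used with
both sets of settings above (Sections 8.3.1 and 8.3.2), the sketch
below shows one plausible way to compute the CLE as an EWMA over
100ms measurement intervals. The exact definition of the
instantaneous marked fraction (packets vs. octets) is an assumption
of this sketch, not a statement about the simulated code:

   class CleMeter:
       """Per ingress-egress-pair CLE measurement at the
       PCN-egress-node: every measurement interval (100 ms in these
       simulations) the fraction of marked traffic observed in that
       interval is folded into an EWMA:

           cle <- weight * marked_fraction + (1 - weight) * cle
       """
       def __init__(self, weight=0.3, interval_s=0.100):
           self.weight = weight
           self.interval_s = interval_s
           self.cle = 0.0
           self._marked = 0
           self._total = 0

       def on_packet(self, size_bytes, is_marked):
           # Called for every packet of the aggregate seen at the egress.
           self._total += size_bytes
           if is_marked:
               self._marked += size_bytes

       def end_of_interval(self):
           # Called once per measurement interval; returns the updated
           # CLE, which the egress reports to the ingress on call
           # requests.
           fraction = (self._marked / self._total) if self._total else 0.0
           self.cle = self.weight * fraction + (1.0 - self.weight) * self.cle
           self._marked = 0
           self._total = 0
           return self.cle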
We found that the virtual-queue admission control algorithm works
reliably within the range of parameters we simulated, for all five
types of traffic. In addition, except for SVD, the performance is
insensitive to parameter changes under all tested topologies. For
SVD, the algorithm does show some sensitivity to the tested
parameters. The high-level conclusion that can be drawn is that
(predictably) the high peak-to-mean-ratio SVD traffic is
substantially more stressful to the queue-based admission control
algorithm, but a set of parameters exists that keeps the
over-admission within about -4% to +7% of the expected load even for
the bursty SVD traffic.

The token-bucket-based admission control algorithm shows higher
sensitivity to the parameter settings compared to the
virtual-queue-based algorithm. It is important to note here that for
the token-bucket-based admission control no traffic will be marked
until the rate of traffic exceeds the configured admission rate by
the chosen CLE. As a consequence, even with ideal performance of the
algorithm, the over-admission-percentage will not be 0; rather, it is
expected to equal the CLE threshold if the algorithm performs as
expected. Therefore, a more meaningful metric for the token-bucket
results is actually the over-admission-percentage (listed below)
minus the corresponding (CLE threshold * 100). For example, for
CLE = 0.01, one would expect that 1% over-admission is inherently
embedded in the algorithm.

When comparing the performance of the token bucket (with the adjusted
over-admission-percentage) to the corresponding virtual-queue
results, we found that the token bucket performs only slightly worse
for the voice-like CBR, VBR, and MIX traffic.

The results for SVD traffic require some additional commentary. Note
from the results in Table A.2 that in the Single Link topology the
performance of the token-bucket-based solution is comparable to the
performance of the queue-based scheme. However, for the RTT topology,
the worst case performance for SVD traffic becomes very bad, with up
to 23% over-admission at high overload. We investigated two potential
causes of this drastic degradation of performance by concentrating on
two key differences between the Single Link and the RTT topologies:
the difference in round-trip times and the degree of aggregation of a
per-ingress-egress-pair aggregate.

To investigate the effect of the difference in round-trip times, we
also conducted a subset of the experiments described above using an
RTT topology that has the same RTT across all ingress-egress pairs
rather than a range of RTTs in one experiment. We found that neither
the absolute nor the relative difference in RTT between different
ingress-egress pairs appears to have any visible effect on the
overload performance or the fairness of either algorithm (we do not
present these results here as they are essentially identical to those
in Table A.2). In view of that, and noting that in the RTT topology
we used for these experiments there is only 1 highly bursty flow per
ingress for the SVD traffic, we believe that the severe degradation
of performance in this topology is directly attributable to the lack
of traffic aggregation on the ingress-egress pair basis.
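For reference, the over-admission-percentage metric defined in
Section 8.4, together with the CLE-threshold adjustment for the
token-bucket results discussed above, can be written as the following
small Python sketch (the function names are illustrative):

   def over_admission_percentage(mean_admitted_load,
                                 configured_admission_rate):
       """Percentage difference between the mean admitted load
       (averaged over the duration of the run) and the configured
       admission rate."""
       return (100.0 * (mean_admitted_load - configured_admission_rate)
               / configured_admission_rate)


   def adjusted_token_bucket_metric(over_admission_pct, cle_threshold):
       """For the token-bucket scheme no traffic is marked until the
       admitted rate exceeds the configured admission rate by the
       chosen CLE threshold, so roughly (CLE threshold * 100) percent
       of over-admission is built in even for an ideal algorithm;
       subtract it out for a fairer comparison."""
       return over_admission_pct - cle_threshold * 100.0


   # Example: a raw over-admission of 3% with CLE threshold 0.01
   # corresponds to roughly 2% over-admission beyond what the scheme
   # inherently allows.
   print(adjusted_token_bucket_metric(3.0, 0.01))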
 -------------------------------------------------
| Type |  Topo  |   Over Admission Perc Stats     |
|      |        |  Queue-based  |  Bucket-Based   |
|      |        |  Min    Max   |  Min     Max    |
|------|--------|---------------------------------|
|      | S.Link |  0.224  1.105 | -0.99    1.373  |
| CBR  |  RTT   |  0.200  1.192 |  6.495   9.403  |
|      |  PLT   | -0.93   0.990 | -2.24    2.215  |
|-------------------------------------------------|
|      | S.Link | -0.07   1.646 | -2.94    2.760  |
| VBR  |  RTT   | -0.11   1.830 | -1.92    6.384  |
|      |  PLT   | -1.48   1.644 | -4.34    3.707  |
|-------------------------------------------------|
|      | S.Link | -0.14   1.961 | -2.85    2.153  |
| MIX  |  RTT   | -0.46   1.803 | -3.18    2.445  |
|      |  PLT   | -1.62   1.031 | -3.69    2.955  |
|-------------------------------------------------|
|      | S.Link | -0.05   1.581 | -2.36    2.247  |
| VTR  |  RTT   | -0.57   1.313 | -1.44    4.947  |
|      |  PLT   | -1.24   1.071 | -3.05    2.828  |
|-------------------------------------------------|
|      | S.Link | -2.73   6.525 | -11.25   6.227  |
| SVD  |  RTT   | -2.98   5.357 | -4.30   23.48   |
|      |  PLT   | -4.84   4.294 | -11.40   6.126  |
 -------------------------------------------------

Table A.2 Parameter sensitivity: Queue-based vs. Token-bucket-based.
For the single-bottleneck topologies (S.Link and RTT) the overload
represents the ratio of the mean demand on the bottleneck link to the
configured admission threshold. For parking lot topologies we report
the worst case result across all bottlenecks. We present here only
the worst case value over the range of resulting
over-admission-percentage values.

8.4.2. Effect of Ingress-Egress Aggregation

To investigate the effect of ingress-egress aggregation, we fix a
particular EWMA weight and CLE setting (in this case weight=0.3, with
CLE=0.05 for the virtual-queue scheme and CLE=0.0001 for the
token-bucket scheme) and vary the level of ingress-egress aggregation
by using RTT topologies with different numbers of ingresses. Table
A.3 shows the change of the over-admission-percentage with respect to
the increase in the number of ingresses for both the virtual-queue
and the token-bucket schemes. For all traffic types, the leftmost
column represents the case with the largest aggregation (only two
ingresses), while the rightmost column represents the lowest level of
aggregation (the expected number of calls per ingress is just 1 in
this case). In all experiments the aggregate load on the bottleneck
is the same across each traffic type (with the aggregate load evenly
divided between all ingresses).

As seen from Table A.3, the virtual-queue-based approach is
relatively insensitive to the level of ingress-egress aggregation. On
the other hand, the token-bucket-based approach performs
significantly worse at lower levels of ingress-egress aggregation.
For example, for CBR (with an expected 1 call per ingress), the
over-admission-percentage can be as bad as 45%.
 ---------------------------------------------------------------------
|       | Type |              Number of Ingresses                     |
|       |------|------------------------------------------------------|
|       |      |   2   |  10   |  70   |  300   |  600  | 1000  |
|       | CBR  | 1.003 | 1.024 | 0.976 | 0.354  | -1.45 | 0.396 |
|       |------|------------------------------------------------------|
|       |      |   2   |  10   |  70   |  300   |  600  | 1800  |
|       | VBR  | 1.021 | 1.117 | 1.006 | 0.979  | 0.721 | -0.85 |
|       |------|------------------------------------------------------|
|Virtual|      |   2   |  10   |  70   |  300   |  600  | 1000  |
| Queue | MIX  | 1.080 | 1.163 | 1.105 | 1.042  | 1.132 | 1.098 |
| Based |------|------------------------------------------------------|
|       |      |   2   |  10   |  70   |  140   |  300  |  600  |
|       | VTR  | 1.109 | 1.053 | 0.842 | 0.859  | 0.856 | 0.862 |
|       |------|------------------------------------------------------|
|       |      |   2   |  10   |  35   |   70   |  140  |  300  |
|       | SVD  | -0.08 | 0.009 | -0.11 | -0.286 | -1.56 | 0.914 |
 ---------------------------------------------------------------------

 ---------------------------------------------------------------------
|       | Type |              Number of Ingresses                     |
|       |------|------------------------------------------------------|
|       |      |   2   |  10   |  100  |  300   |  600  | 1000  |
|       | CBR  | 0.725 | 0.753 | 7.666 | 21.16  | 33.69 | 44.58 |
|       |------|------------------------------------------------------|
|       |      |   2   |  10   |  100  |  300   |  600  | 1800  |
|       | VBR  | 0.532 | 0.477 | 1.409 | 3.044  | 5.812 | 14.80 |
|Token  |------|------------------------------------------------------|
|Bucket |      |   2   |  10   |  100  |  300   |  600  | 1800  |
|Based  | MIX  | 0.736 | 0.649 | 1.960 | 4.652  | 10.31 | 27.69 |
|       |------|------------------------------------------------------|
|       |      |   2   |  10   |  70   |  140   |  300  |  600  |
|       | VTR  | 0.758 | 0.889 | 1.335 | 1.694  | 4.128 | 13.28 |
|       |------|------------------------------------------------------|
|       |      |   2   |  10   |  35   |  100   |  140  |  300  |
|       | SVD  | -1.64 | -0.93 | 0.237 | 4.732  | 7.103 | 8.799 |
 ---------------------------------------------------------------------

Table A.3 Synchronization effect with low Ingress-Egress Aggregation:
Queue-based vs. Token-bucket-based.

Our investigation reveals that the cause of the poor performance of
the token-bucket scheme in our experiments is directly attributable
to the same "synchronization" effect as was earlier described for the
Termination (preemption) results in
[I-D.zhang-pcn-performance-evaluation], to which we refer the reader
for a more detailed description of this effect. In short, for CBR
traffic a periodic pattern arises in which the packets of a given
flow see roughly the same state of the token bucket at the
bottleneck, and hence either all get marked or none get marked. As a
result, at low levels of aggregation a subset of the ingresses always
gets its packets marked, while the other ingresses do not. As
reported in [I-D.zhang-pcn-performance-evaluation], in the case of
Termination this synchronization effect is beneficial to the
algorithm. In contrast, for Admission, this synchronization is
detrimental to the algorithm's performance at low aggregation. This
can be easily explained by noting that ingresses whose packets do not
get marked continue admitting new traffic even if the aggregate
bottleneck load has been reached or exceeded.
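The following small, self-contained Python experiment reproduces the
flavor of this effect. It is an illustrative model only, not the
simulator used to produce Table A.3, and it uses a deliberately
simplified marker (a packet is marked whenever the bucket holds fewer
tokens than the packet size); it also applies the uniform per-packet
jitter of Section 8.2.4 to show how randomization breaks the
synchronization:

   import random
   from collections import defaultdict

   def marked_fraction_per_flow(num_flows=20, jitter_fraction=0.0,
                                periods=200, pkt_bytes=160, gap_s=0.020):
       """Send num_flows periodic CBR-like flows (one packet per flow
       every gap_s seconds, evenly phased) through a token bucket whose
       rate is 90% of the aggregate rate, and return the fraction of
       marked packets per flow.  With jitter_fraction=0 the bucket
       state seen by a given flow repeats every period, so a fixed
       subset of flows absorbs all the marking; with jitter the marking
       spreads out."""
       random.seed(1)
       token_rate = 0.9 * num_flows * pkt_bytes / gap_s   # bytes/second
       depth = 4 * pkt_bytes                              # bucket depth
       tokens, last_t = float(depth), 0.0
       marked = defaultdict(int)

       arrivals = []
       for p in range(periods):
           for flow in range(num_flows):
               t = p * gap_s + flow * gap_s / num_flows
               t += random.uniform(0.0, jitter_fraction * gap_s)
               arrivals.append((t, flow))
       arrivals.sort()

       for t, flow in arrivals:
           tokens = min(depth, tokens + token_rate * (t - last_t))
           last_t = t
           if tokens >= pkt_bytes:
               tokens -= pkt_bytes      # enough tokens: packet not marked
           else:
               marked[flow] += 1        # excess traffic: packet marked

       return {flow: marked[flow] / periods for flow in range(num_flows)}

   # With no jitter, roughly 10% of the flows see (almost) every packet
   # marked while the others see none; with jitter_fraction=0.05 the
   # same amount of marking is spread much more evenly across flows.
   print(marked_fraction_per_flow(jitter_fraction=0.0))
   print(marked_fraction_per_flow(jitter_fraction=0.05))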
Since most of the other traffic patterns contain large CBR segments,
this effect is seen with other traffic types as well, although to a
different extent.

A natural initial reaction might be to write off this effect as
purely a simulation artifact. Indeed, one can expect that if some
jitter is introduced into the strict CBR traffic pattern, so that the
packet transmission is no longer strictly periodic, then the
"synchronization" effect might be easily broken. To verify whether
this is indeed the case, we ran the experiments with the same
topologies and parameter settings, but with randomized versions of
the base traffic types. The results are summarized in Table A.4. Note
that the columns labeled with fractions (e.g. 0.0001) correspond to
randomized traffic with a randomization-interval of (specified
fraction) x packet-inter-arrival-time. This also means that, on
average, the packets are delayed by (specified fraction) x
packet-inter-arrival-time / 2. The column labeled "No-Rand"
corresponds to the token-bucket results in Table A.3.

It turns out that introducing enough jitter does indeed break the
synchronization effect, and the performance of the algorithm improves
considerably. However, it takes a sufficient amount of randomization
before the improvement is noticed. For instance, for CBR the only
column that shows no aggregation effect is the one labeled "0.05",
which translates to an expected packet deviation from its ideal CBR
transmit time of 0.5ms. While a 0.5ms per-hop deviation is not
unreasonable to expect, in well-provisioned networks with a
relatively small amount of voice traffic in the priority queue one
might find lower levels of network-induced jitter. In any case, these
results indicate that the "synchronization" effect cannot be
completely written off as a simulation artifact. The good news,
however, is that this effect is visible only at very low
ingress-egress aggregation levels, and as the ingress-egress
aggregation increases, the effect quickly disappears.

We observed the synchronization effect consistently across all types
of traffic we tested, with the exception of VTR. VTR also exhibits
some aggregation effect; however, randomization of its CBR portion
has almost no effect on performance. We suspect this is because the
randomization we perform is at the packet level, while the
synchronization that seems to be causing the performance degradation
at low ingress-egress aggregation for VTR traffic occurs at the frame
level. Although our investigation of this issue is not yet complete,
our preliminary results show that if we calculate the random
deviation for our artificially induced jitter using the frame
inter-arrival time instead of the packet-inter-arrival time, we can
reduce the over-admission-percentage for VTR to roughly 3%. It is
unclear, however, whether such randomization at the frame level
meaningfully reflects network-introduced jitter.

 ----------------------------------------------------------------
|     | No.  |              Randomization Interval               |
|     | Ingr | No-Rand | 0.0001 | 0.001 | 0.005 | 0.01  | 0.05  |
|----------------------------------------------------------------|
|     |   2  |  0.725  | 0.683  | 0.784 | 0.725 | 0.772 | 0.787 |
|     |  10  |  0.753  | 0.725  | 0.543 | 0.645 | 0.733 | 0.854 |
|     |  100 |  7.666  | 5.593  | 2.706 | 1.454 | 1.226 | 0.692 |
| CBR |  300 |  21.16  | 15.52  | 6.699 | 3.105 | 2.478 | 1.624 |
|     |  600 |  33.69  | 25.51  | 11.41 | 6.021 | 4.676 | 2.916 |
|     | 1000 |  44.58  | 36.20  | 17.03 | 7.094 | 5.371 | 3.076 |
|----------------------------------------------------------------|
|     |   2  |  0.532  | 0.645  | 0.670 | 0.555 | 0.237 | 0.740 |
|     |  10  |  0.477  | 0.596  | 0.703 | 0.494 | 0.662 | 0.533 |
|     |  100 |  1.409  | 1.236  | 1.043 | 0.810 | 1.202 | 1.016 |
| VBR |  300 |  3.044  | 2.652  | 2.093 | 1.588 | 1.755 | 1.671 |
|     |  600 |  5.812  | 4.913  | 3.539 | 2.963 | 2.803 | 2.277 |
|     | 1800 |  14.80  | 12.59  | 8.039 | 6.587 | 5.694 | 4.733 |
|----------------------------------------------------------------|
|     |   2  |  0.736  | 0.753  | 0.627 | 0.751 | 0.850 | 0.820 |
|     |  10  |  0.649  | 0.737  | 0.780 | 0.824 | 0.867 | 0.787 |
|     |  100 |  1.960  | 1.705  | 1.428 | 1.160 | 1.149 | 1.034 |
| MIX |  300 |  4.652  | 4.724  | 3.760 | 2.692 | 2.449 | 2.027 |
|     |  600 |  10.31  | 9.629  | 7.289 | 5.520 | 4.958 | 3.710 |
|     | 1000 |  17.21  | 15.96  | 11.05 | 8.700 | 7.382 | 5.061 |
|     | 1800 |  27.69  | 23.46  | 16.53 | 12.04 | 10.84 | 8.563 |
|----------------------------------------------------------------|
|     |   2  |  0.758  | 0.756  | 0.872 | 0.894 | 0.825 | 0.849 |
|     |  10  |  0.889  | 0.939  | 0.785 | 0.704 | 0.843 | 0.574 |
|     |  70  |  1.335  | 1.101  | 1.066 | 1.181 | 0.978 | 0.946 |
| VTR |  140 |  1.694  | 1.162  | 1.979 | 1.791 | 1.684 | 1.573 |
|     |  300 |  4.128  | 4.191  | 3.545 | 3.307 | 3.964 | 3.465 |
|     |  600 |  13.28  | 13.76  | 13.81 | 13.18 | 12.97 | 12.35 |
|----------------------------------------------------------------|
|     |   2  |  -1.64  | -2.30  | -2.14 | -1.61 | -1.01 | -0.89 |
|     |  10  |  -0.93  | -1.65  | -2.41 | -2.98 | -2.58 | -2.27 |
|     |  35  |  0.237  | -0.31  | -0.35 | -1.02 | -0.96 | -2.16 |
| SVD |  100 |  4.732  | 4.640  | 4.152 | 2.287 | 1.887 | -0.03 |
|     |  140 |  7.103  | 6.002  | 5.560 | 4.974 | 3.619 | 0.091 |
|     |  300 |  8.799  | 10.72  | 9.840 | 7.530 | 6.281 | 4.270 |
 ----------------------------------------------------------------

Table A.4 Ingress-Egress Aggregation: Token-bucket-based results for
randomized traffic. Columns labeled with fractions correspond to a
randomization-interval of (specified fraction) x
packet-inter-arrival-time.

Finally, we investigated the impact of the call arrival assumptions
at different levels of ingress-egress aggregation by comparing the
results with Poisson and BATCH arrivals. We reported in
[I-D.zhang-pcn-performance-evaluation] that virtual-queue-based
admission is relatively insensitive to BATCH vs. Poisson arrivals,
even at lower aggregation levels. In contrast, the call arrival
assumption does affect the performance of the token-bucket-based
algorithm, and causes substantial degradation of performance at low
ingress-egress aggregation levels. An example result with CBR traffic
is presented in Table A.5; here we use batch arrivals with mean 5.
The results show that at the lowest aggregation the batch arrivals
give worse results than the Poisson arrivals; however, as the level
of aggregation becomes sufficient (e.g. 100 ingresses, 10 calls per
ingress), the difference becomes insignificant. This behavior is
consistent across all types of traffic.
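To make the two arrival models concrete, the sketch below shows one
way to generate Poisson and batched call-request arrivals with the
same long-run arrival rate. The draft states only that the mean batch
size is 5; the choice of a geometric batch-size distribution and of
Poisson batch epochs is an assumption made purely for illustration:

   import random

   def poisson_arrivals(rate_per_s, horizon_s):
       """Poisson call arrivals: i.i.d. exponential inter-arrival times."""
       t, times = 0.0, []
       while True:
           t += random.expovariate(rate_per_s)
           if t > horizon_s:
               return times
           times.append(t)

   def batch_arrivals(rate_per_s, horizon_s, mean_batch_size=5):
       """Batched call arrivals with the same long-run rate: batch
       epochs form a Poisson process of rate rate_per_s /
       mean_batch_size, and each epoch brings a geometrically
       distributed number of simultaneous requests (assumed
       distribution, mean mean_batch_size)."""
       def geometric(mean):
           k, p = 1, 1.0 / mean
           while random.random() > p:
               k += 1
           return k

       t, times = 0.0, []
       while True:
           t += random.expovariate(rate_per_s / mean_batch_size)
           if t > horizon_s:
               return times
           times.extend([t] * geometric(mean_batch_size))

   # Example: roughly 600 call requests over 10 minutes under either model.
   print(len(poisson_arrivals(1.0, 600.0)), len(batch_arrivals(1.0, 600.0)))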
 ----------------------------------------------------------------
|     | No.  |              Randomization Interval               |
|     | Ingr | No-Rand | 0.0001 | 0.001 | 0.005 | 0.01  | 0.05  |
|----------------------------------------------------------------|
|     |   2  |  0.918  | 1.007  | 0.836 | 0.933 | 1.014 | 0.971 |
|     |  10  |  1.221  | 0.936  | 0.767 | 0.906 | 0.920 | 0.857 |
|     |  100 |  8.857  | 7.092  | 3.265 | 1.821 | 1.463 | 1.036 |
| CBR |  300 |  29.39  | 22.59  | 8.596 | 4.979 | 4.550 | 2.165 |
|     |  600 |  43.36  | 37.12  | 17.37 | 10.02 | 8.005 | 4.223 |
|     | 1000 |  63.60  | 50.36  | 25.48 | 12.82 | 9.339 | 6.219 |
 ----------------------------------------------------------------

Table A.5 Ingress-Egress Aggregation with batch arrivals:
Token-bucket-based results.

8.4.3. Effect of Multiple Bottlenecks

The results in Table A.2 (Section 8.4.1, the parameter sensitivity
study) implied that, from the bottleneck point of view, the
performance on the multiple-bottleneck topologies, for all types of
traffic, is comparable to that on the Single Link topology, for both
the queue-based and the token-bucket-based algorithms. However, the
results in Table A.2 show only the worst case values over all
bottleneck links. In this section we consider two other aspects of
the multiple-bottleneck case: the relative performance at individual
bottlenecks, and the fairness of bandwidth usage between the short-
and the long-haul ingress-egress aggregates.

8.4.3.1. Relative performance of different bottlenecks

In Table A.6 we show a snapshot of the behavior on the 5-bottleneck
topology, with the goal of studying the performance at the different
bottlenecks more closely. Here, the over-admission-percentage
displayed is an average across all 15 experiments with different
[weight, CLE] settings. (We observe the same behavior in each of the
individual experiments, hence providing summarized statistics is
meaningful.)

One difference between the token-bucket-based and the queue-based
admission in the PLT topologies revealed in Table A.6 is that for the
token bucket there appears to be a consistent relationship between
the position of a bottleneck link (how far downstream it is) and its
over-admission-percentage. The data shows that the further downstream
the bottleneck is, the more it tends to over-admit, regardless of the
type of traffic. The exact cause of this phenomenon is yet to be
explained, but its effect seems to be insignificant in magnitude, at
least in the experiments we ran.
 -----------------------------------------------------------
|        | Traffic |          Bottleneck LinkId              |
|        |  Type   |   1   |   2   |   3   |   4   |   5   |
|        |-------------------------------------------------|
|        |  CBR    | 0.288 | 0.286 | 0.238 | 0.332 | 0.306 |
|        |-------------------------------------------------|
|        |  VBR    | 0.319 | 0.420 | 0.257 | 0.341 | 0.254 |
| Queue  |-------------------------------------------------|
| Based  |  MIX    | 0.363 | 0.394 | 0.312 | 0.268 | 0.205 |
|        |-------------------------------------------------|
|        |  VTR    | 0.466 | 0.309 | 0.223 | 0.363 | 0.317 |
|        |-------------------------------------------------|
|        |  SVD    | 0.319 | 0.420 | 0.257 | 0.341 | 0.254 |
|-----------------------------------------------------------
|        | Traffic |          Bottleneck LinkId              |
|        |  Type   |   1   |   2   |   3   |   4   |   5   |
|        |-------------------------------------------------|
|        |  CBR    | 0.121 | 0.300 | 0.413 | 0.515 | 0.700 |
|        |-------------------------------------------------|
| Token  |  VBR    | -0.07 | 0.251 | 0.496 | 0.698 | 1.044 |
| Bucket |-------------------------------------------------|
| Based  |  MIX    | 0.042 | 0.350 | 0.468 | 0.716 | 0.924 |
|        |-------------------------------------------------|
|        |  VTR    | 0.277 | 0.488 | 0.642 | 0.907 | 1.117 |
|        |-------------------------------------------------|
|        |  SVD    | -2.64 | -2.50 | -1.72 | -1.57 | -1.19 |
 -----------------------------------------------------------

Table A.6 Bottleneck Performance: Queue-based vs. Token-bucket-based.

8.4.3.2. (Un)Fairness Between Different Ingress-Egress Pairs

It was reported in [I-D.zhang-pcn-performance-evaluation] that
virtual-queue-based admission control significantly favors short-haul
connections over long-haul connections. As was discussed there, this
property is in fact common to measurement-based admission control
algorithms (see for example [Jamin] for a discussion). It is common
knowledge that in the limit of large demands, long-haul connections
can be completely starved. We show in
[I-D.zhang-pcn-performance-evaluation] that in fact starvation of
long-haul connections can occur even with relatively small (but
constant) overloads. We identify there that the primary reason for
this is a de-synchronization of the "congestion periods" at different
bottlenecks, resulting in the long-haul connections almost always
seeing at least one bottleneck in a congested state and hence almost
never being allowed to admit new flows. We refer the reader to that
draft for more detail.

Here we investigate the comparative behavior of the
token-bucket-based and the virtual-queue-based schemes with respect
to fairness. The fairness is illustrated using the ratio between the
bandwidth of the long-haul aggregates and that of the short-haul
aggregates. As is intuitively expected (and also confirmed
experimentally), the unfairness is larger the higher the demand is
and the more bottlenecks the long-haul aggregate traverses.
Therefore, we report here the "worst case" results across our
experiments, corresponding to the 5x demand overload and the
5-bottleneck PLT topology.

Table A.7 summarizes, at 5x overload, with CLE=0.05 (for the virtual
queue) and CLE=0.0001 (for the token bucket), the fairness results
for different weights and topologies. We display the ratio as a
function of time, in 10-second increments (the reported ratios are
averaged over the corresponding 10 simulation-second interval). The
results presented in this section use the aggregates that traverse
the first bottleneck; the results for all other bottlenecks are
extremely similar.
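A minimal sketch of how such a fairness ratio can be computed from
the admitted traffic is shown below (illustrative only; the class and
function names are invented, and the ratio is formed between a pair
of aggregates' bandwidths per 10-second interval, as in Table A.7):

   from collections import defaultdict

   class AggregateBandwidth:
       """Accumulates admitted bytes of one ingress-egress aggregate
       into 10-second reporting bins."""
       def __init__(self, interval_s=10.0):
           self.interval_s = interval_s
           self.bytes_per_bin = defaultdict(int)

       def on_packet(self, time_s, size_bytes):
           self.bytes_per_bin[int(time_s // self.interval_s)] += size_bytes

       def rate_bps(self, bin_index):
           return 8.0 * self.bytes_per_bin[bin_index] / self.interval_s


   def fairness_ratio(agg_a, agg_b, bin_index):
       """Per-interval ratio of the bandwidth of aggregate A to that of
       aggregate B (Table A.7 reports this for the long- and short-haul
       aggregates)."""
       b = agg_b.rate_bps(bin_index)
       return agg_a.rate_bps(bin_index) / b if b else float("inf")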
 --------------------------------------------------------------------
|       |Topo|Weight|              Simulation Time (s)               |
|       |    |      |  10 |  20 |  30 |  40 |  50 |  60 |  70 |  80 |
|       |-----------------------------------------------------------|
|       |    | 0.1  |0.99 |1.04 |1.14 |1.14 |1.23 |1.23 |1.35 |1.46 |
|       |PLT5| 0.5  |1.00 |1.17 |1.24 |1.41 |1.81 |2.13 |2.88 |3.05 |
|       |    | 0.9  |1.03 |1.42 |1.74 |2.14 |2.44 |2.91 |3.83 |4.20 |
|       |-----------------------------------------------------------|
|Virtual|    | 0.1  |1.02 |1.08 |1.15 |1.29 |1.33 |1.38 |1.37 |1.42 |
|Queue  |PLT3| 0.5  |1.02 |1.04 |1.07 |1.19 |1.24 |1.30 |1.34 |1.33 |
|Based  |    | 0.9  |1.02 |1.09 |1.23 |1.41 |1.65 |2.10 |2.63 |3.18 |
|       |-----------------------------------------------------------|
|       |    | 0.1  |1.02 |0.98 |1.03 |1.11 |1.22 |1.21 |1.25 |1.31 |
|       |PLT2| 0.5  |1.02 |1.06 |1.14 |1.17 |1.15 |1.31 |1.41 |1.41 |
|       |    | 0.9  |1.02 |1.04 |1.11 |1.30 |1.56 |1.61 |1.62 |1.67 |
 --------------------------------------------------------------------

 --------------------------------------------------------------------
|       |Topo|Weight|              Simulation Time (s)               |
|       |    |      |  10 |  20 |  30 |  40 |  50 |  60 |  70 |  80 |
|       |-----------------------------------------------------------|
|       |    | 0.1  |1.03 |1.48 |1.83 |2.34 |2.95 |3.33 |4.32 |4.65 |
|       |PLT5| 0.5  |1.08 |1.53 |1.90 |2.44 |3.04 |3.42 |4.47 |4.83 |
|       |    | 0.9  |1.08 |1.48 |1.80 |2.26 |2.82 |3.19 |4.23 |4.16 |
|       |-----------------------------------------------------------|
|Token  |    | 0.1  |1.02 |1.26 |1.45 |1.57 |1.69 |1.76 |1.92 |1.94 |
|Bucket |PLT3| 0.5  |1.07 |1.41 |1.89 |2.36 |2.89 |3.63 |3.70 |3.82 |
|Based  |    | 0.9  |1.07 |1.33 |1.59 |1.94 |2.41 |2.80 |2.75 |2.90 |
|       |-----------------------------------------------------------|
|       |    | 0.1  |1.03 |1.10 |1.43 |2.06 |2.28 |2.85 |3.09 |2.90 |
|       |PLT2| 0.5  |1.07 |1.32 |1.47 |1.72 |1.71 |1.81 |1.89 |1.94 |
|       |    | 0.9  |1.09 |1.27 |1.51 |1.86 |1.82 |1.88 |1.88 |2.06 |
 --------------------------------------------------------------------

Table A.7 Fairness performance: Virtual Queue vs. Token Bucket. The
numbers in the cells represent the ratio between the bandwidth of the
long- and short-haul aggregates. Each row represents the time series
of these results in 10-simulation-second increments.

To summarize, we observed a consistent beatdown effect across all
experiments for both the virtual-queue and the token-bucket admission
algorithms, although the exact extent of the unfairness depends on
the demand overload, the topology and the parameter settings. To
further quantify the effect of these factors remains an area of
future work. We also note that the cause of the beatdown effect
appears to be largely independent of the specific algorithm, and is
likely to be relevant to other PCN proposals as well.

9. Security Considerations

TBD

10. IANA Considerations

11. References

11.1. Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.

11.2. Informative References

[I-D.babiarz-pcn-3sm]
           Babiarz, J., "Three State PCN Marking",
           draft-babiarz-pcn-3sm-00 (work in progress), July 2007.
[I-D.briscoe-tsvwg-cl-architecture] Briscoe, B., "An edge-to-edge Deployment Model for Pre- Congestion Notification: Admission Control over a DiffServ Region", draft-briscoe-tsvwg-cl-architecture-04 (work in progress), October 2006. [I-D.briscoe-tsvwg-cl-phb] Briscoe, B., "Pre-Congestion Notification marking", draft-briscoe-tsvwg-cl-phb-03 (work in progress), October 2006. [I-D.briscoe-tsvwg-re-ecn-border-cheat] Briscoe, B., "Emulating Border Flow Policing using Re-ECN on Bulk Data", draft-briscoe-tsvwg-re-ecn-border-cheat-01 (work in progress), June 2006. [I-D.briscoe-tsvwg-re-ecn-tcp] Briscoe, B., "Re-ECN: Adding Accountability for Causing Congestion to TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-03 (work in progress), October 2006. [I-D.davie-ecn-mpls] Davie, B., "Explicit Congestion Marking in MPLS", draft-davie-ecn-mpls-01 (work in progress), October 2006. [I-D.eardley-pcn-architecture] Eardley, P., "Pre-Congestion Notification Architecture", Charny, et al. Expires January 10, 2008 [Page 42] Internet-Draft PCN with Single Marking July 2007 draft-eardley-pcn-architecture-00 (work in progress), June 2007. [I-D.lefaucheur-emergency-rsvp] Faucheur, F., "RSVP Extensions for Emergency Services", draft-lefaucheur-emergency-rsvp-02 (work in progress), June 2006. [I-D.westberg-pcn-load-control] Westberg, L., "LC-PCN - The Load Control PCN solution", draft-westberg-pcn-load-control-00 (work in progress), May 2007. [I-D.zhang-pcn-performance-evaluation] Zhang, X., "Performance Evaluation of CL-PHB Admission and Pre-emption Algorithms", draft-zhang-pcn-performance-evaluation-01 (work in progress), March 2007. 11.3. References [Jamin] "A Measurement-based Admission Control Algorithm for Integrated Services Packet Networks", 1997. [Menth] "PCN-Based Resilient Network Admission Control: The Impact of a Single Bit", 2007. Authors' Addresses Anna Charny Cisco Systems, Inc. 1414 Mass. Ave. Boxborough, MA 01719 USA Email: acharny@cisco.com Xinyang (Joy) Zhang Cisco Systems, Inc. and Cornell University 1414 Mass. Ave. Boxborough, MA 01719 USA Email: joyzhang@cisco.com Charny, et al. Expires January 10, 2008 [Page 43] Internet-Draft PCN with Single Marking July 2007 Francois Le Faucheur Cisco Systems, Inc. Village d'Entreprise Green Side - Batiment T3 , 400 Avenue de Roumanille, 06410 Biot Sophia-Antipolis, France Email: flefauch@cisco.com Vassilis Liatsos Cisco Systems, Inc. 1414 Mass. Ave. Boxborough, MA 01719 USA Email: vliatsos@cisco.com Charny, et al. Expires January 10, 2008 [Page 44] Internet-Draft PCN with Single Marking July 2007 Full Copyright Statement Copyright (C) The IETF Trust (2007). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
Intellectual Property The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79. Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr. The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org. Acknowledgment Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA). Charny, et al. Expires January 10, 2008 [Page 45]