Network Working Group M. Menth Internet-Draft F. Lehrieder Expires: May 22, 2008 University of Wuerzburg November 19, 2007 Performance Evaluation of PCN-Based Algorithms draft-menth-pcn-performance-01 Status of this Memo By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on May 22, 2008. Copyright Notice Copyright (C) The IETF Trust (2007). Menth & Lehrieder Expires May 22, 2008 [Page 1] Internet-Draft PCN Performance Evaluation November 2007 Abstract This document presents a summary of performance studies for PCN-based admission control and flow termination. The numerical results were obtained by simulation or mathematical analysis. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Comparison of Marking Algorithms for PCN-Based Admission Control . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 3.1. Definition of Simulated Entities and Simulation Setup . . 6 3.1.1. Metering and Marking Mechanisms . . . . . . . . . . . 6 3.1.2. Congestion Level Estimator . . . . . . . . . . . . . . 6 3.1.3. Simulation Setup . . . . . . . . . . . . . . . . . . . 7 3.2. Impact of the Marking Threshold T and the Queue Size S . . 8 3.3. Two Marking Strategies with Different Admission Control Policies . . . . . . . . . . . . . . . . . . . . . 8 3.3.1. Marking with Clear Decisions (MCD) . . . . . . . . . . 8 3.3.2. Marking with Early Warning (MEW) . . . . . . . . . . . 8 3.4. Impact of Ramp Marking . . . . . . . . . . . . . . . . . . 8 3.5. Impact of the Memory M of the Congestion Level Estimator . . . . . . . . . . . . . . . . . . . . . . . . 9 3.6. Impact of Traffic Characteristics . . . . . . . . . . . . 9 3.7. Response Time of the Marking to Sudden Overload . . . . . 10 3.8. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . 10 4. Performance Evaluation of Admission Control with Single-Marking . . . . . . . . . . . . . . . . . . . . . . . . 12 4.1. Impact on Probing . . . . . . . . . . . . . . . . . . . . 12 5. Performance Evaluation of Measured Rate Termination (MRT) . . 14 5.1. Two Options for MRT . . . . . . . . . . . . . . . . . . . 14 5.1.1. Direct MRT . . . . . . . . . . . . . . . . . . . . . . 14 5.1.2. Indirect MRT . . . . . . . . . . . . . . . . . . . . . 14 5.2. Impact of Packet Loss . . . . . . . . . . . . . . . . . . 15 5.2.1. Direct MRT under Lose & Mark . . . . . . . . . . . . . 15 5.2.2. Indirect MRT under Lose & Mark . . . . . . . . . . . . 15 5.2.3. Direct MRT under Mark & Lose . . . . . . . . . . . . . 15 5.2.4. Indirect MRT under Mark & Lose . . . . . . . . . . . . 16 5.2.5. Summary . . . . . . . . . . . . . . . . . . . . . . . 16 5.3. Unintended Traffic Termination with Indirect MRT through Badly Aligned Measurement Intervals . . . . . . . 16 5.3.1. Experiments with Almost CBR Traffic . . . . . . . . . 17 5.3.2. Experiments with VBR Traffic . . . . . . . . . . . . . 17 5.3.3. Experiments with On/Off Traffic . . . . . . . . . . . 17 5.3.4. Experiments with Rerouted Traffic . . . . . . . . . . 18 5.3.5. Summary . . . . . . . . . . . . . . . . . . . . . . . 18 Menth & Lehrieder Expires May 22, 2008 [Page 2] Internet-Draft PCN Performance Evaluation November 2007 6. Performance Evaluation of Marked Flow Termination on a Single Link . . . . . . . . . . . . . . . . . . . . . . . . . 19 6.1. Relative Termination Aggressiveness g_rel . . . . . . . . 19 6.2. Impact of System Parameters . . . . . . . . . . . . . . . 19 6.3. Improvements to the Metering and Marking Algorithm . . . . 20 7. Performance Evaluation of Marked Flow Termination (MFT) with Multiple Bottleneck Links . . . . . . . . . . . . . . . . 22 7.1. Several Serial Links Carrying Only a Common Aggregate . . 22 7.2. Two Serial Links Carrying a Common Aggregate with Cross Traffic on the Second Link . . . . . . . . . . . . . 23 7.3. Two Serial Links Carrying a Common Aggregate with Cross Traffic on the First Link . . . . . . . . . . . . . 23 7.4. Two Serial Links Carrying a Common Aggregate with Cross Traffic on Both Links . . . . . . . . . . . . . . . 24 7.5. Summary . . . . . . . . . . . . . . . . . . . . . . . . . 25 8. Changes from Previous Revisions . . . . . . . . . . . . . . . 26 8.1. Changes from Version -00 to Version -01 . . . . . . . . . 26 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 27 9.1. Normative References . . . . . . . . . . . . . . . . . . . 27 9.2. Informative References . . . . . . . . . . . . . . . . . . 27 9.3. Other References . . . . . . . . . . . . . . . . . . . . . 27 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 28 Intellectual Property and Copyright Statements . . . . . . . . . . 29 Menth & Lehrieder Expires May 22, 2008 [Page 3] Internet-Draft PCN Performance Evaluation November 2007 1. Introduction Pre-congestion notification (PCN) is based on the idea of marking packets when a certain load threshold on a link is exceeded by PCN traffic. Then, the marking of a packet at the PCN egress node provides information whether the rate threshold of at least one link of the path over which the packet was carried was exceeded by PCN traffic. This information can be used for admission control and flow termination. Several approaches such as Single-Marking (SM) [I-D.charny-pcn-single-marking], CL [I-D.briscoe-tsvwg-cl-architecture], 3SM [I-D.babiarz-pcn-3sm] have been proposed for that purpose. An overview of the basic concept is given in [I-D.ietf-pcn-architecture]. The University of Wuerzburg is conducting performance studies to understand basic mechanisms and to compare different approaches. This document is intended to collect and present summaries of performance results documented in more detail in technical papers that are available online. Currently, it covers the following studies. o A summary of the results of [TR437] is presented in Section 3. [TR437] studies the impact of virtual queue (token bucket) parameters on marking results for threshold and ramp marking and gives a comparison. o A summary of the results in [Menth07] regarding "Performance Evaluation of Admission Control with Single-Marking" is presented in Section 4. o A summary of the results in [Menth07] regarding "Performance Evaluation of Measured Rate Termination (MRT)" is presented in Section 5. o A summary of the results in [Menth07] regarding "Performance Evaluation of Marked Flow Termination (MFT) on a Single Link" is presented in Section 6. o A summary of the results in [Menth07] regarding "Performance Evaluation of Marked Flow Termination (MFT) with Multiple Bottleneck Links" is presented in Section 7. The next section clarifies some terminology issues. Menth & Lehrieder Expires May 22, 2008 [Page 4] Internet-Draft PCN Performance Evaluation November 2007 2. Terminology The terminology used in this document conforms to the topology of [I-D.ietf-pcn-architecture]. We use the following exceptions for better readability and provide the synonyms defined in [I-D.ietf-pcn-architecture]. o Admissible rate: PCN-lower-rate o Supportable rate: PCN-upper-rate o Admission-stop marking: first encoding or PCN-lower-rate-marking o Excess-traffic marking: second encoding or PCN-upper-rate-marking Menth & Lehrieder Expires May 22, 2008 [Page 5] Internet-Draft PCN Performance Evaluation November 2007 3. Comparison of Marking Algorithms for PCN-Based Admission Control The following presents a short summary of [TR437] without the graphs and exact numerical results that are provided in the technical report. The interested reader is referred to that document. 3.1. Definition of Simulated Entities and Simulation Setup In this study, we investigate the behaviour of different metering and marking algorithms under different configuration and use a congestion level estimator to observe the packet markings. 3.1.1. Metering and Marking Mechanisms PCN requires metering and marking algorithms in the interior nodes. [TR437] defines o threshold marking and o ramp marking based on a virtual queue (VQ), but there are equivalent descriptions based on token buckets. The parameters are o the size S of the VQ, o the rate R of the VQ, o the marking threshold T for threshold marking, which is also the upper threshold for ramp marking, o the marking threshold T_ramp, which is the lower threshold for ramp marking 3.1.2. Congestion Level Estimator Furthermore, a congestion level estimator is defined that calculates a congestion level estimate (CLE) at the PCN egress node based on an exponentially weighted moving average (EWMA). Marked packets count 1 and unmarked packets count 0. The CLE is computed as CLE = w o CLE + (1 - w) o X where X is the observed packet marking and w<1 is the weight parameter. If w is large, CLE has a long memory M, if it is low, CLE has a short memory M. The time between CLE updates also influences Menth & Lehrieder Expires May 22, 2008 [Page 6] Internet-Draft PCN Performance Evaluation November 2007 the memory M. A formal definition of the memory M is given in 3.4.2 of [TR437]. The CLE is used to observe the packet markings of the simulations. 3.1.3. Simulation Setup We simulate a single link scenario. Packets from n independent, homogeneous traffic sources are multiplexed onto a single link with infinite bandwidth and pass a meter and marker. The markings are evaluated by a subsequent congestion level estimator. If not mentioned differently, we simulate around n = 100 homogeneous flows for sufficiently long time to obtain reliable results. However, we omit confidence intervals in all our graphs for the sake of clarity. We choose a Gamma distribution to generate the inter- arrival times A between consecutive packets within a flow with a mean of E[A] = 20 ms and a coefficient of variation of cvar[A] = 0.1. The packet sizes B are independent and distributed according to a deterministic phase of 30 bytes plus a negative binomial distribution. Their overall mean is E[B] = 60 bytes and their coefficient of variation is cvar[B] = 0.5. The values for E[A] and E[B] are motivated by typical voice connections that periodically send every 20 ms a packet with 20 bytes payload using a 40 bytes IP/ UDP/RTP header. However, our flow model is not periodic and has variable packet sizes. We use it for two reasons. The simulation of multiplexed, strictly periodic traffic requires special care due to the non-ergodicity of the system and is very time consuming. Therefore, we relax cvar[A] = 0.0 to cvar[A] = 0.1. Furthermore, we use cvar[B] = 0.5 instead of cvar[B] = 0.0 because realtime traffic consists of packets from different applications with and without compression which leads to different packet sizes. However, our findings are general and do not depend on special parameter settings. The rate of the virtual queue is R = 2.4 Mbit/s such that at most 100 flows can pass unmarked. The congestion level estimator implements an exponentially weighted moving average (EWMA) and counts packets with admission-stop marks as 1 and those without as 0. As mentioned previously, its memory M depends on the packet rate and the weight parameter w such that w needs to be adapted to the desired M and the packet frequency in the experiment for which we take the maximum packet rate that can pass unmarked. Thus, we set the weight parameter to w = 0.998 which corresponds to a memory of 0.1 s when 100 default flows are active. If the packet rate changes due to more bursty traffic, we adapt the weight parameter w to achieve the same memory. Menth & Lehrieder Expires May 22, 2008 [Page 7] Internet-Draft PCN Performance Evaluation November 2007 3.2. Impact of the Marking Threshold T and the Queue Size S We measure the percentage of marked packets depending on the PCN rate (number of flows n) and the queue parameters size S and marking threshold T. The ideal marker marks 1. no packets if the PCN rate is below the VQ rate R and 2. all packets if it is above. We found out that 1. is increasingly achieved with increasing threshold T and 2. is increasingly achieved with increasing remaining queue size S-T. 3.3. Two Marking Strategies with Different Admission Control Policies We construct threshold markers with two different CLE characteristics (=function describing the percentage of marked packets depending on PCN rate). 3.3.1. Marking with Clear Decisions (MCD) Marking with clear decisions (MCD) means that the above objectives (1) and (2) are well achieved. This can be obtained for threshold marking with a large marking threshold T and a large remaining queue size S - T. Then, hardly any fluctuations in marking are observed. 3.3.2. Marking with Early Warning (MEW) Marking with early warning (MEW) means that (3) the percentage of marked packets already increases when the PCN rate approaches the VQ rate and (4) is 100% when the PCN rate is above the VQ rate. This can be obtained for threshold marking with a small marking threshold T and a large remaining queue size S - T. 3.4. Impact of Ramp Marking Ramp marking already marks packets probabilistically if the virtual queue length is below the marking threshold T. Therefore, it marks more packets than threshold marking with the same marking threshold T and queue size S. In our study we always set the lower marking threshold to T_ramp = 0. We found out that ramp marking with this configuration cannot achieve MCD because it marks a small percentage of packets when the PCN rate is below the VQ rate, but it can well achieve MEW. MEW can be achieved both with threshold and ramp Menth & Lehrieder Expires May 22, 2008 [Page 8] Internet-Draft PCN Performance Evaluation November 2007 marking, but threshold marking requires a smaller threshold parameter T to get the same marking results as with ramp marking. 3.5. Impact of the Memory M of the Congestion Level Estimator The memory M of the congestion level estimator does not have an impact on the percentage of marked packets that were observed over the simulation time, but it impacts the degree to which the CLE fluctuates. If the memory is long, the fluctuation of CLE is small. If the memory M is short, the fluctuation of CLE is large. When we configure the queue for MCD, i.e., the threshold T and the remaining queue size S-T were chosen sufficiently large, the CLE is almost 0 for PCN rates smaller than the VQ rate and it is 1 for PCN rates larger than the VQ rate. This holds even for a very small memory M of the congestion level estimator. 3.6. Impact of Traffic Characteristics Traffic characteristics have a significant impact on the marking result. o Decreased variance of packet sizes: no impact on the CLE characteristics in case of MCD, slightly lower curves in case of MEW o Increased variance of packet sizes: little impact on the CLE characteristics in case of MCD, significantly higher curves in case of MEW and larger fluctuation of CLE o Increased aggregation level: no impact on the CLE characteristics in case of MCD, slightly higher curves in case of MEW and less fluctuation of CLE o Increased variance of inter-arrival times: little impact on the CLE characteristics in case of MCD, slightly higher curves in case of MEW and larger fluctuation of CLE o Increased burstiness (fewer but larger packets): little impact on the CLE characteristics in case of MCD, significantly higher curves in case of MEW and large fluctuations of CLE o On/off traffic instead of continuous flows: large impact on the CLE characteristics in case of MCD and MEW, in particular very large fluctuations of the CLE Menth & Lehrieder Expires May 22, 2008 [Page 9] Internet-Draft PCN Performance Evaluation November 2007 3.7. Response Time of the Marking to Sudden Overload Large marking thresholds T and remaining queue sizes S-T lead to stable marking results for MCD, but large parameters slow down the reaction time of the marker when the PCN rate exceeds the VQ rate. 3.8. Conclusion One option for pre-congestion notification (PCN) based admission control requires that all packets are marked if the current link rate exceeds a pre-configured admissible rate. This can be achieved by virtual queue based marking algorithms such as simple threshold marking or more complex ramp marking. The objective of [TR437] was to study how marking algorithms can support admission control in order to limit the utilization of the links of a network. We did not consider the use of marking algorithms to support admission control in order to limit the packet delay because we assume that PCN will be used in high-speed networks where packet delay caused by queuing is negligible as long as link utilizations are moderate. We investigated the influence of the parameters of the marking algorithms on their marking results which are translated into a congestion level estimate (CLE) using EWMA-based averaging. We showed that two different marking strategies can be pursued: marking such that the CLE leads to clear decisions (MCD) and marking such that the CLE yields early warning (MEW) when the rate of PCN traffic on a link approaches its admissible rate. We provided recommendations for the configuration of the marking threshold T and the size S of the virtual queue in both cases. Ramp marking increases the level of early warning compared to threshold marking, but this can be approximated by smaller marking thresholds for simple threshold marking such that there is no obvious need for ramp marking. The CLE values for MEW fluctuate, therefore, it is difficult to infer the exact, current traffic rate from the CLE values which is required to take advantage of early warning. A sensitivity study revealed that the average CLE values for MEW depend heavily on the traffic characteristics. This makes the use of early warning difficult: either the marking parameters need to be adapted to produce similar warnings for different traffic types or the mechanism taking early warning into account requires knowledge about the traffic characteristics to correctly interpret the CLE level. In contrast, CLE values for MCD show hardly any variation and are robust against different traffic types. Menth & Lehrieder Expires May 22, 2008 [Page 10] Internet-Draft PCN Performance Evaluation November 2007 For the sake of simplicity, we advocate for the use of MCD for PCN based admission control instead of MEW because the interpretation of early warning is difficult due to its high variation and dependency on traffic characteristics. Furthermore, we think that ramp marking is not needed for PCN since similar markings can be obtained by appropriately configured threshold marking and we do not see any benefit that justifies the implementation complexity of ramp marking. Menth & Lehrieder Expires May 22, 2008 [Page 11] Internet-Draft PCN Performance Evaluation November 2007 4. Performance Evaluation of Admission Control with Single-Marking When the PCN rate exceeds the admissible rate of a link, all unmarked packets are re-marked to "admission-stop" in the CL [I-D.briscoe-tsvwg-cl-architecture] and 3sm [I-D.babiarz-pcn-3sm] proposal. This is different with Single-Marking (SM) [I-D.charny-pcn-single-marking] since it marks only those packets with "admission-stop" that exceed the admissible rate. 4.1. Impact on Probing In SM, the egress node calculates the fraction of marked and all packets which is the so-called congestion level estimate (CLE). If the CLE exceeds the admission-stop threshold of about 4 %, the ingress node stops the admission of further flows. If the CLE falls below an admission-continue threshold of about 2 %, the ingress node continues the admission of further flows. To support probing, SM needs to send several probe messages because if the admissible rate of a link is exceeded by the current PCN rate, only a fraction of the packets is marked and this fraction is the current CLE value which can be interpreted as marking probability. When n_p probe messages are used for probing, the new flow is rejected when at least one message returns with an admission-stop mark. However, an error probability remains that can be computed as follows: p_err(CLE, n_p) = (1 - CLE)^(n_p) This error probability decreases exponentially with an increasing number of probe packets and it significantly depends on the CLE value. The question in practice is: how many probes are required to achieve a reliable probing method, i.e. the error probability of probing is less than p_max^err? To that end, we find n_min(CLE,p_max^err) = min_(n_p) ( p_err(CLE, n_p) < p_max^err ) = ceil(log(p_max^err )/log(1 - CLE)). We the following table shows the minimum number of probes n_min(CLE,p_max^err) required to achieve a probing reliability of p_max^err when the fraction of marked packets is CLE. Menth & Lehrieder Expires May 22, 2008 [Page 12] Internet-Draft PCN Performance Evaluation November 2007 Table 1: Minimum number of required probe packets depending on the CLE and the desired reliability p_max^err. +---------------------------------------------------------+ |CLE | 0.01 | 0.02 | 0.04 | 0.05 | 0.1 | 0.15 | 0.20 | +---------------------------------------------------------+ | 10% | 230 | 114 | 57 | 45 | 22 | 15 | 11 | | 1% | 459 | 228 | 113 | 90 | 44 | 29 | 21 | | 0.1% | 688 | 342 | 170 | 135 | 66 | 43 | 31 | +---------------------------------------------------------+ |p_max^err| minimum number of required probe packets | +---------------------------------------------------------+ Thus, the number of required probe messages to achieve a reliable probing method is rather large. To get representative probes, the messages should not be sent in one shot but rather with exponential inter-arrival times [SoBa05] such that the large number of required probes causes substantial delay. Menth & Lehrieder Expires May 22, 2008 [Page 13] Internet-Draft PCN Performance Evaluation November 2007 5. Performance Evaluation of Measured Rate Termination (MRT) Measured rate termination (MRT) assumes that PCN interior nodes mark packets with "excess-traffic" (ET) when they exceed the supportable rate (SR) of a link with some tolerance. This marking is explained in [I-D.babiarz-pcn-3sm] and used in the CL architecture [I-D.briscoe-tsvwg-cl-architecture]. The PCN egress node measures the rate of marked and unmarked packets and communicates them to the PCN egress node. Based on this information, the PCN ingress node calculates the rate that needs to be terminated and chooses an appropriate set of flows for termination. This is not a trivial task since the rate of the flows are possibly unknown, but we do not further study this issue. There are two options for MRT: direct and indirect MRT. 5.1. Two Options for MRT 5.1.1. Direct MRT The PCN egress node measures the rate of ET-marked packets per ingress-egress aggregate. If this excess traffic rate (ETR) is larger than zero, the PCN egress node communicates it to the corresponding PCN ingress node using a "traffic-reduction" message. Then, the PCN ingress node terminates sufficiently many flows to achieve a reduction of that rate. A minimum interval, a so-called inter-termination time, is required between consecutive rate- reduction messages because it takes some time until the effect of the previous termination action becomes visible in the measured rate of ET-marked traffic at the PCN egress node. Direct MRT is used as a non-standard option to terminate traffic in the E3Tunnel deployment model of 3sm [I-D.babiarz-pcn-3sm]. 5.1.2. Indirect MRT The marking is again the same as in the CL architecture. In contrast to direct MRT, with indirect MRT the PCN egress node measures the rate of traffic that is not marked with ET (nETR) per ingress-egress- aggregate. This is the so-called sustainable rate and the PCN egress node immediately sends it to the PCN ingress node. The PCN ingress node measures the rate of traffic to be transmitted per ingress- egress aggregate, the so-called ingress rate (IR). When the PCN ingress node receives the sustainable rate from the PCN egress node, it calculates the difference between the timely corresponding ingress rate and sustainable rate. This difference is positive if traffic had been ET-marked because then the ingress rate is larger than the sustainable rate. If the difference is positive, it is used as an estimate for the traffic that should be terminated. Indirect MRT is used as the preferred option in [I-D.briscoe-tsvwg-cl-architecture]. Menth & Lehrieder Expires May 22, 2008 [Page 14] Internet-Draft PCN Performance Evaluation November 2007 5.2. Impact of Packet Loss Packet loss has an impact on the rate of marked and unmarked packets received by the PCN egress node. In addition, the fraction of marked and unmarked packets depends on whether packets are first lost or marked. We illustrate this issue on a link with 8 Mbit/s and its supportable rate being set to 4 Mbit/s. Due to a reroute or some other reason, the link is suddenly confronted with a traffic rate of 16 Mbit/s. We consider the behaviour of direct and indirect MRT under the conditions that traffic is first lost and then marked (L&M) or first marked and then lost (M&R). 5.2.1. Direct MRT under Lose & Mark Under L&M, 8 Mbit/s are lost. 4 Mibt/s out of the remaining 8 Mbit/s are AS-marked and the remaining 4 Mbit/s are ET-marked. As a consequence, 4 Mbit/s are terminated such that 12 Mbit/s will remain for the next round. Then, 4 Mbit/s are lost. 4 Mibt/s out of the remaining 8 Mbit/s are AS-marked and the remaining 4 Mbit/s are ET- marked. As a consequence, 4 Mbit/s are terminated such that 8 Mbit/s will remain for the next round. Then, no traffic is lost anymore, but out of the 8 Mbit/s 4 Mbit/s are AS-marked and the remaining 4 Mbit/s are ET-marked. When these 4 Mbit/s have been terminated, there is no overload anymore. 5.2.2. Indirect MRT under Lose & Mark Under L&M, 8 Mbit/s are lost. 4 Mibt/s out of the remaining 8 Mbit/s are AS-marked and the remaining 4 Mbit/s are ET-marked. As a consequence, there is a sustainable rate of 4 Mbit/s such that 12 Mbit/s are terminated. Thus, the overload is already removed after a single termination step. 5.2.3. Direct MRT under Mark & Lose Under M&L, 4 Mbit/s out of the initial 16 Mbit/s are AS-marked and the remaining 12 Mbit/s are ET-marked. Then, 8/16 of the traffic is lost such that 6 Mbit/s arrive with ET-marks. After 6 Mbit/s are terminated, the system is still confronted with a load of 10Mbit/s out of which 4Mbit/s are AS-marked and the remaining 6 Mbit/s are ET- marked. Then, 2/10 of the traffic is lost such that 5.2 Mbit/s arrive with ET-marks. After 5.2 Mbit/s are terminated, the system is confronted with a load of 4.8 Mbit/s out of which 4 Mibt/s are AS- marked and the remaining 0.8 Mbit/s are ET-marked. As no loss occurs anymore, these 0.8 Mbit/s are terminated such that there is no overload anymore. Menth & Lehrieder Expires May 22, 2008 [Page 15] Internet-Draft PCN Performance Evaluation November 2007 5.2.4. Indirect MRT under Mark & Lose Under M&L, 4 Mbit/s out of the initial 16 Mbit/s are AS-marked and the remaining 12 Mbit/s are ET-marked. Then, 8/16 of the traffic is lost such that 6 Mbit/s arrive with ET-marks. As a consequence, there is a sustainable rate of 2 Mbit/s such that 14 Mbit/s are terminated. Thus, the overload is already removed after a single termination step, but with a substantial amount of overtermination. 5.2.5. Summary Table 2 summarizes the termination behavior of direct and indirectMRT under L&M and M&L conditions. With indirectMRT, sudden SR-overload is removed after a single termination step while for direct MRT, the removal of SR-overload requires several termination steps. When packet loss occurs before packets are marked by the meter and marker, indirect MRT works well. However, when packets are metered and marked before they are lost, then indirectMRT can lead to substantial overtermination. Table 2: Impact of the order of packet marking and loss on the termination behavior with direct and indirect MRT. +-----------------------------------------------------------------+ | Time | Lose&Mark | Lose&Mark | Mark&Lose | Mark&Lose | | | Direct MRT| Ind. MRT | Direct MRT| Ind. MRT | +-----------------------------------------------------------------+ |Start of overload|16.0 Mbit/s|16.0 Mbit/s|16.0 Mbit/s|16.0 Mbit/s| |After 1st term. |12.0 Mbit/s| 4.0 Mbit/s|10.0 Mbit/s| 2.0 Mbit/s| |After 2nd term. | 8.0 Mbit/s| 4.0 Mbit/s| 5.2 Mbit/s| 2.0 Mbit/s| |After 3rd term. | 4.0 Mbit/s| 4.0 Mbit/s| 4.0 Mbit/s| 2.0 Mbit/s| +-----------------------------------------------------------------+ 5.3. Unintended Traffic Termination with Indirect MRT through Badly Aligned Measurement Intervals Indirect MRT compares the rate of non-ET marked traffic (nETR) at the egress node and compares it with the rate of the PCN traffic at the ingress node (IR). This implies that the same packets are measured in the corresponding measurement intervals which is hard to achieve. When the delay between PCN ingress and egress node is some fixed transmission delay X, the measurement interval at the PCN egress node needs to start X time later than the corresponding one at the PCN ingress node and it must have the same length to observe the same set of packets. However, X is not exactly fixed due to queuing and other effects. Therefore, exact alignment cannot be achieved. This possibly leads to wrong rate differences when results from badly aligned measurement intervals are compared. In this section, we investigate its impact as it can lead to unintended traffic Menth & Lehrieder Expires May 22, 2008 [Page 16] Internet-Draft PCN Performance Evaluation November 2007 termination. We use a measurement interval length of 100 ms and the misalignment is 50 ms such that the first packet in the measurement interval a the PCN egress is contained in the measurement interval at the PCN ingress. 5.3.1. Experiments with Almost CBR Traffic First, we use exactly periodic constant bit rate (CBR) traffic and several variants of almost periodic or almost CBR traffic. Those are CBR traffic with at most 1 ms uniformly distributed packet arrival delays and constant packet sizes, exactly periodic traffic with little variation in packet sizes (coefficient of variation cvar[B] = 0.1), and almost periodic traffic with little variation in packet inter-arrival times (coefficient of variation cvar[A] = 0.1) and constant packet sizes. We consider n = 200 flows and the supportable rate is set to infinity such that no packet is marked. Experiments show that in case of exact periodicity and equal packet sizes, no flows are terminated. This is different, when little variation is introduced where flows are continuously terminated without any packet being lost or ET-marked. To repair this, we enhance the indirect MRT method by the side condition, that flows are only terminated if a tolerance of Tf is exceeded. This tolerance is usually given as a rate, but we think of it as a number of flows as we deal only with equal rate flows in this section. Experiments show that a tolerance of T_f = 1 flow removes the unintended flow termination effect. 5.3.2. Experiments with VBR Traffic We now conduct the same experiment with more variable traffic and study the impact of different tolerance values T_f on unintended traffic termination. We performed experiments for exactly periodic flows having packet sizes with the same mean E[B] = 200 bytes but a coefficient of variation of cvar[B] = 0.5 and experiments for flows with constant packet sizes but inter-arrival times with the same mean E[A] = 20 ms but a coefficient of variation of cvar[A] = 0.5. In both cases, a tolerance value of T_f = 1 flow cannot avoid unintended flow termination and even a tolerance of T_f = 5 flows cannot fully remove it. 5.3.3. Experiments with On/Off Traffic We now look at on/off traffic, i.e. traffic sources have exponentially distributed on and off phases during which they send CBR traffic or are silent. We performed experiments for mean phase Menth & Lehrieder Expires May 22, 2008 [Page 17] Internet-Draft PCN Performance Evaluation November 2007 durations of 0.5 s and 5 s, respectively. Again, we see substantial unintended flow termination that is mitigated by an increasing tolerance value T_f. 5.3.4. Experiments with Rerouted Traffic In the presence of network failures, large amounts of traffic are rerouted, i.e. shifted to other links. The PCN ingress and egress nodes perform rate measurement using intervals of length of 100 ms starting at a global time of 0 s. We simulate what happens in such a scenario at the PCN ingress node. At time 50 ms a reroute of 8Mbit/s occurs at the PCN ingress node and after another 50 ms the traffic arrives at the PCN egress node. At that time, the PCN egress node measures a sustainable rate of 0 Mbit/s for its first measurement interval and sends this value to the PCN ingress node. At time 150 ms this value arrives there and the PCN ingress node compares the ingress rate of 4 Mbit/s measured in the first measurement interval with the sustainable rate of 0 Mbit/s. As a consequence, it terminates 4 Mbit/s. 5.3.5. Summary The disadvantage of indirect MRT is that the measurement intervals at the PCN ingress and egress nodes require some timely alignment. Otherwise, unintended small positive differences can occur without any traffic being ET-marked or lost which results in unintended traffic termination. Therefore, the traffic rate to be terminated needs to exceed a tolerance threshold $T_f$ before traffic is really terminated. However, it turns out that the value of that threshold should increase with increasing traffic variability. Moreover, in case of sudden extra traffic as it can occur with reroutes, a substantial fraction of the traffic can be terminated without any packets being ET-marked. Menth & Lehrieder Expires May 22, 2008 [Page 18] Internet-Draft PCN Performance Evaluation November 2007 6. Performance Evaluation of Marked Flow Termination on a Single Link Marked flow termination (MFT) is used in 3sm [I-D.babiarz-pcn-3sm]. In contrast to measured rate termination (MRT) it requires that the metering and marking algorithms in interior nodes mark only some of the traffic that exceeds the supportable rate. This is done by adding a slowdown factor of "S" tokens to the bucket whenever a packet is marked with ET. This is called marking frequency reduction (MFR). 6.1. Relative Termination Aggressiveness g_rel Given the average flow rate r_f, the flow termination time D_T, and the slowdown factor S, we can define the relative termination aggressiveness g_rel = r_f * D_T / S. The termination behaviour of an MFT system is the function describing the time-dependent rate from the beginning of the overload until termination of further flows stops. Most interesting is the time to remove the SR-overload and the overtermination which is the percentage of traffic relative to the supportable rate that was unintentionally terminated. 6.2. Impact of System Parameters Adapting the slowdown factor S for different flow rates r_avg caused by different inter-arrival times or packet sizes produces can achieve the same relative termination aggressiveness. This produces the same termination behaviour. We performed experiments showing o an increasing aggressiveness leads to shorter time to remove the overload, but to more overtermination o a decreasing aggressivenss leads to longer time to remove the overload, but to less overtermination A relative termination aggressiveness of g_rel=0.5 leads to short time to remove overload and avoids overtermination. Different degrees of SR-overload and different aggregation levels (n=10, 100, 1000 flows) lead to different curves but to similar time to remove the overload and to the same overtermination. On/off traffic leads to the same termination behaviour if the on/off Menth & Lehrieder Expires May 22, 2008 [Page 19] Internet-Draft PCN Performance Evaluation November 2007 phases are sufficiently long (10 ms), in case of short phase durations the time to reduce overload is longer, but it also works well. Increasing flow termination delay D_T extends the time to remove overload linearly, but it has no influence on the overtermination. Flows with different termination delays see the same termination probability. Modifying the inter-arrival time or the packet size changes the average flow rates. Keeping the slowdown factor constant also changes the relative termination aggressiveness such that the time to reduce overload and the overtermination differs significantly. Flows with larger packets or more frequent packets face a larger flow blocking probability. 6.3. Improvements to the Metering and Marking Algorithm The objective is to make the termination behaviour insensitive to different packet sizes. To that end, two different improvements are introduced. o Packet size independent marking (PSIM) marks a packet already if the token bucket has less tokens than the maximum transfer unit (MTU) instead of the actual packet size. o Proportional marking frequency reduction (PMFR) adds packet.size/ MTU*S tokens to the bucket whenever a packet is marked. Both improvements are documented in [I-D.babiarz-pcn-3sm]. PSIM achieves that the termination probability of a flows is independent of its packet size. PMFR achieves that the termination behaviour of the same for different aggregates irrespective of the average packet size. Both options are simple to implement and their combination achieves that the relative termination aggressiveness becomes independent of packet sizes. It can be calculated by g_rel = MTU * D_T / S / E[A]. Thus the termination behaviour only depends on the average inter- arrival time E[A]. This has also a rather strong effect on the relative termination aggressiveness, but given the fact that most real-time applications have E[A] between 10 ms and 40 ms allows the following robust configuration. Setting S such that a relative aggressiveness of g_rel=0.5 is achieved for E[A]=20 ms, leads to at most 10% overtermination for E[A]=10 ms and to a time to remove overload of not more than 10*D_T. Different tradeoffs can be chosen. This analysis shows that marking frequency reduction (MFT) can be Menth & Lehrieder Expires May 22, 2008 [Page 20] Internet-Draft PCN Performance Evaluation November 2007 well controlled when the marking algorithm implements PMFR with PSIM. Menth & Lehrieder Expires May 22, 2008 [Page 21] Internet-Draft PCN Performance Evaluation November 2007 7. Performance Evaluation of Marked Flow Termination (MFT) with Multiple Bottleneck Links In the presence of several bottleneck links, flows can be ET-marked by different interior nodes. As a consequence, an overloaded link loses flows whose packets were ET-marked by itself, but it also loses flows whose packets were ET-marked by another overloaded link. This impacts the termination speed and possibly leads to unintended overtermination. To avoid this effect, we propose to remove S bytes from the virtual queue for each ET-marked packet sent over the corresponding link (general MFR) and not only for each packet that has been ET-marked by that link itself (local MFR). We illustrate the impact of this change to the metering algorithm. 7.1. Several Serial Links Carrying Only a Common Aggregate We start with a single aggregate crossing several links as shown in Figure 1. All links have a supportable rate of 4Mbit/s such that 125 flows can be supported. However, the observed aggregate initially carries 8 Mbit/s. Simulation results show that with local MFR, the aggregate rate is reduced to 3.0 Mbit/s, 2.15 Mbit/s, and to 1.6 Mbit/s in case of 2, 3, and 4 consecutive links. In contrast, with global MFR, the aggregate rate decreases only to 3.9 Mbit/s in all studied scenarios. Thus, we observe a significant degree of overtermination for local MFR while with global MFR the reduction of the traffic rate meets the expected value quite well and independently of the number of serial bottlenecks. +-----------+ +-----------+ +-----------+ Aggregate a0 ---| |---| |---| |--- +-----------+ +-----------+ +-----------+ Link l0 Link l1 Link l2 Several serial links carry only a common aggregate. Figure 1 To reconstruct this experiment, it is important to avoid the synchronization of the consecutive meters which is achieved when they are configured with (a) the same supportable rate (SR) and (b) the same supportable burst size (SBS) and (c) the same slowdown factor S, and (d) if there is no cross traffic at all. In our first experiment depicted in Figure Figure 1, we have chosen a slightly different SR for each link. In the other experiments that are depicted in Figure 2- Figure 4, the links carry cross traffic such that precondition (d) for the synchronization effect is violated anyway. Menth & Lehrieder Expires May 22, 2008 [Page 22] Internet-Draft PCN Performance Evaluation November 2007 7.2. Two Serial Links Carrying a Common Aggregate with Cross Traffic on the Second Link We consider two serial links carrying an aggregate a0 and the second link carrying cross traffic from an aggregate a1 (cf. Figure 2). +-----------+ +-----------+ ---| |---| |--- Aggregate a0 +-----------+ | |--- Aggregate a1 /+-----------+ Link l0 / Link l1 Two serial links carrying a common aggregate with cross traffic on the second link. Figure 2 The first link l0 has SR(l0) = 4 Mbit/s and the second link l1 has SR(l1) = 8Mbit/s. Initially, there is an SR-overload of 100% since both aggregates carry 8 Mbit/s. Simulation results show that the rate of aggregate a0 decreases from 8 Mbit/s to 2.9 Mibt/s with local MFR and to 3.0 Mbit/s with global MFR. The rate of aggregate a1 (cross traffic) is reduced from 8 Mibt/s to 4.4 Mbit/s with local MFR and to 4.8 Mbit/s with global MFR. As a consequence, the remaining rate on the second link l1 is 7.3Mbit/s with localMFR and 7.8Mbit/s with globalMFR. Thus, global MFT leads to visibly less terminated traffic in this experiment. 7.3. Two Serial Links Carrying a Common Aggregate with Cross Traffic on the First Link We consider two serial links carrying an aggregate a0 and the first link carrying cross traffic from an aggregate a1 (cf. Figure 3). +-----------+ +-----------+ Aggregate a0 ---| |---| |--- | | +-----------+ /+-----------+\ Aggregate a1 / Link l0 \ Link l1 Two serial links carrying a common aggregate with cross traffic on the first link. Figure 3 We consider two serial links carrying an aggregate a0 and the first link carrying cross traffic from an aggregate a1 (cf. Figure 3). The first link l0 has SR(l0) = 8 Mbit/s and the second link l1 has SR(l1) Menth & Lehrieder Expires May 22, 2008 [Page 23] Internet-Draft PCN Performance Evaluation November 2007 = 4 Mbit/s. Initially, there is an SR-overload of 100% since both aggregates carry 8 Mbit/s. Initially, there is an SR-overload of 100% since both aggregates carry 8 Mbit/s. Simulation results show that the rate of aggregate a0 decreases from 8 Mbit/s to 2.8 Mibt/s with local MFR and to 3.8 Mbit/s with global MFR. The rate of aggregate a1 (cross traffic) is reduced from 8 Mibt/s to 4.5 Mbit/s with local MFR and to 4.0 Mbit/s with global MFR. As a consequence, the remaining rate on the second link l0 is 7.3 Mbit/s with local MFR and 7.8 Mbit/s with global MFR. Thus, global MFT leads to visibly less terminated traffic in this experiment and to a fairer treatment of flows crossing a different numbers of bottlenecks. 7.4. Two Serial Links Carrying a Common Aggregate with Cross Traffic on Both Links We consider two serial links carrying a common aggregate with cross traffic on both links (cf. Figure 4). +-----------+ +-----------+ Aggregate a0 ---| |-----| |--- Aggregate a1 ---| | | |--- Aggregate a2 +-----------+\ /+-----------+ Link l0 \ / Link l1 Two serial links carrying a common aggregate with cross traffic on both links. Figure 4 We consider two serial links carrying an aggregate a0. The first link carries cross traffic from aggregate a1 and the second link carries cross traffic from aggregate a2 (cf. Figure 4). Both links have a supportable rate of 8 Mbit/s. Initially, there is an SR- overload of 100% since all three aggregates carry 8 Mbit/s. Simulation results show that the rate of aggregate a0 decreases from 8 Mbit/s to 2.6 Mibt/s with local MFR and to 2.7 Mbit/s with global MFR. The rate of aggregate a1 (cross traffic on link l0) is reduced from 8 Mibt/s to 4.5 Mbit/s with local MFR and to 4.3 Mbit/s with global MFR and the one of aggregate a2 (cross traffic on link l1) is reduced from 8 Mibt/s to 4.3 Mbit/s with local MFR and to 5.1 Mbit/s with global MFR. As a consequence, the remaining rate on the first link l0 is 7.2 Mbit/s with local MFR and 7.8 Mbit/s with global MFR and the one on the second link l1 is 7.2 Mbit/s with local MFR and 7.1 Mbit/s with global MFR. Thus, global MFT leads to visibly less terminated cross traffic on the second link in this experiment. Menth & Lehrieder Expires May 22, 2008 [Page 24] Internet-Draft PCN Performance Evaluation November 2007 7.5. Summary Global MFR leads in all studied experiments to less terminated traffic than local MFR. Especially in case of several serial links carrying only a single aggregate the degree of overtermination is tremendous with local MFR while global MFR does not lead to overtermination in that scenario. In other experiments with cross traffic the difference between local and global MFR is still visible but less dramatic. Nevertheless, global MFR is clearly better than localMFR. However, multiple simultaneous bottleneck links are rather unlikely and, in particular, the situation with several serial bottleneck links where tremendous overtermination with local MFR occurs is rather pathologic. Therefore, local MFR may still be used instead of global MFR if it reduces implementation complexity. Menth & Lehrieder Expires May 22, 2008 [Page 25] Internet-Draft PCN Performance Evaluation November 2007 8. Changes from Previous Revisions 8.1. Changes from Version -00 to Version -01 o Added section on "Performance Evaluation of Admission Control with Single-Marking" o Added section on "Performance Evaluation of Measured Rate Termination (MRT)" o Added section on "Performance Evaluation of Marked Flow Termination on a Single Link" o Added section on "Performance Evaluation of Marked Flow Termination with Multiple Bottleneck Links" Menth & Lehrieder Expires May 22, 2008 [Page 26] Internet-Draft PCN Performance Evaluation November 2007 9. References 9.1. Normative References 9.2. Informative References [I-D.babiarz-pcn-3sm] Babiarz, J., "Three State PCN Marking", draft-babiarz-pcn-3sm-00 (work in progress), July 2007. [I-D.briscoe-tsvwg-cl-architecture] Briscoe, B., "An edge-to-edge Deployment Model for Pre- Congestion Notification: Admission Control over a DiffServ Region", draft-briscoe-tsvwg-cl-architecture-04 (work in progress), October 2006. [I-D.charny-pcn-single-marking] Charny, A., "Pre-Congestion Notification Using Single Marking for Admission and Termination", draft-charny-pcn-single-marking-02 (work in progress), July 2007. [I-D.ietf-pcn-architecture] Eardley, P., "Pre-Congestion Notification Architecture", draft-ietf-pcn-architecture-01 (work in progress), October 2007. 9.3. Other References [Menth07] Menth, M. and F. Lehrieder, "Performance Evaluation of PCN-Based Admission Control and Flow Termination (work in progress)", November 2007, . [SoBa05] Sommers, J., Barford, P., , N., and A. Ron, "Improving Accuracy in End-to-End Packet Loss Measurement. In: ACM SIGCOMM", 2005. [TR437] Menth, M. and F. Lehrieder, "Comparison of Marking Algorithms for PCN-Based Admission Control, Technical Report No. 437", October 2007, . Menth & Lehrieder Expires May 22, 2008 [Page 27] Internet-Draft PCN Performance Evaluation November 2007 Authors' Addresses Michael Menth University of Wuerzburg Am Hubland Wuerzburg D-97074 Germany Phone: +49-931-888-6644 Email: menth@informatik.uni-wuerzburg.de Frank Lehrieder University of Wuerzburg Am Hubland Wuerzburg D-97074 Germany Phone: +49-931-888-6634 Email: lehrieder@informatik.uni-wuerzburg.de Menth & Lehrieder Expires May 22, 2008 [Page 28] Internet-Draft PCN Performance Evaluation November 2007 Full Copyright Statement Copyright (C) The IETF Trust (2007). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Intellectual Property The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79. Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr. The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org. Acknowledgment Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA). Menth & Lehrieder Expires May 22, 2008 [Page 29]