                                                             V. Raisanen
Internet Draft                                                     Nokia
draft-raisanen-ippm-npmps-results-00.txt                   November 2000
Category: Informational

                  npmps-style measurement results

0. Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [1].

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

1. Abstract

Actual measurement results obtained with an npmps-like [2] measurement setup are described, and their relation to the npmps draft is discussed. The presented results use a subset of the npmps metric, and are presented to give an example of the use of the metric for reporting real results. An example of derivative use of measurement results is provided.

The full text of the draft, including figures, can be downloaded in PDF format from http://www-nrc.nokia.com/netperf/results.pdf.

2. Introduction

Active VoIP media stream emulation QoS measurements allow for a controlled QoS test for IP transport [2]. Below, results from media stream emulation measurements are provided for an international measurement between endpoints in Canada and Finland. The relation of the measurement to the metric specified in the npmps draft is discussed.

The reported measurements were made as round-trip delay measurements for simplicity. The implementation of measurement using npmps-style metrics, however, is more general.
In [4], a method for performing a measurement suitable for both one-way and round-trip measurements is described.

Finally, an example derivative use of measurement results is presented. The "raw" QoS characteristics resulting from an active measurement typically include delays, loss percentages, jitter, and derivative metrics such as delay percentiles. These quantities, while representative of transport QoS for media streams at the time of the measurement, are rather abstract from the point of view of an end user. For this reason, a method for mapping measurement results to Mean Opinion Score (MOS) values is outlined. The procedure is described in more detail elsewhere [6].

3. Network transport QoS measurement

The first media stream emulation QoS measurement was performed in August 1999 between domestic endpoints in Finland (Helsinki and Oulu, ca. 500 km apart). The second, international, measurement was performed between endpoints in Helsinki and Ottawa (Canada) in December 1999. The first measurement used a dedicated IP network, whereas the second one was made over the public Internet. IPv4 was used in both measurements.

3.1 Measurement arrangement

The 1999 measurements described here were of the round-trip variety, so no host clock synchronization was necessary.

The hardware used in the measurements consisted of a Pentium-class IBM Thinkpad 600 with a 3Com Megahertz 10BaseT PCMCIA NIC as the mobile measurement station ("sender") and a garden-variety 400 MHz Pentium II desktop PC with a Kingston 10/100 Mbit/s NIC as the fixed measurement station ("pinger"). Both hosts were running RedHat Linux 5.2.

The measurement described here was performed above transport level, i.e. the analysis used only data collected above the Src and Dst socket interfaces.
The performance of the hardware, NIC drivers, and operating system was verified extensively prior to the network performance measurements, with the following results:

- delays due to hardware, NIC driver & TCP/IP stack are negligible if the Ethernet segment that the hosts are connected to is not overloaded.
- no round-trip delay measurement result discontinuities due to the OS were observed.
- under the same condition, no packet losses due to the measurement setup were observed.

The analysis was based on measurement results obtained at BSD socket level. Thus, for example, no information about possibly corrupted payloads is available.

3.2 Accuracy of results

The standard scheduling quantum of RedHat Linux 5.2 is 10 ms. This did not cause problems, as the inter-packet transmission interval incT used was a multiple of this. Host clock skew was typically of the order of 0.5%, but did not affect our results, as only round-trip delay measurements were used in the analysis. Host clock accuracy for delay measurements was well below 1 ms, which was sufficient for the target delay measurement accuracy of 1 ms.

3.3 Metric parameters used in the measurements

The measurement results reported here pertain to G.711 (PCM) media stream emulation with a single 20 ms audio frame per IP packet, i.e. a 160-byte payload at 20 ms intervals. To simulate the effect of RTP headers, an additional 12 bytes were added to the payload of each measurement packet.

Multiple measurements at different times of day were performed, only one of which (per endpoint pair) is reported here. As the measurements were of the round-trip variety, only metrics collected at Src are described.

Accuracy for delay measurements: 1 ms.

Path: N/A. For the Finnish measurement, a dedicated single-provider VPN based on an ATM backbone was used.

Global metric parameters:

- src IP: varied with test
- dst IP: N/A.
- incT = 20 ms.
- Tf-T0 = 90000 x incT (international connection), 180000 x incT (domestic connection).
- p(j) = 172 bytes for all j above transport level, 200 bytes (IPv4) above link level.
- dTloss = 5 seconds.
- Tcons = dTloss.

As the delay measurement was of round-trip type, only metrics collected at Src are described below. For the same reason, the set of metrics is essentially a union of metrics 4.2.2 and 4.2.3 in the npmps draft.

Metrics collected at Src:

- Tstamp(Src1)[i]: transmission timestamp at Src
- Tstamp(Src2)[i]: reception timestamp at Src
- PktID[i]: reception sequence number at Src.
- PktSiTy[i]: not applicable here (all packets of the same type).
- PktStatus[i]: header/payload corrupt not detectable.

Metrics obtained with an analysis of Src Metrics:

- Tstamp(Src1)[i], Tstamp(Src2)[i]
- PktID[i]
- PktSiTy[i]: not applicable
- PktStatus[i]: received/lost/out of sequence
- SDV[i]

3.4 Examples of derivative metrics

The following analyses have been performed for relatively long measurements (30-60 minutes), which should be kept in mind when studying the results.

For concreteness, an effective one-way delay was computed, estimated as one-half of the round-trip delay. It is well known that this method does not always yield accurate results because propagation delays may be asymmetric, but it nevertheless provides a usable estimate. Below, the effective one-way delay distributions for the two measurements are shown.

Figure 1: One-way delay histogram of the Oulu-Helsinki (left) and Ottawa-Helsinki (right) measurement.

In the domestic case, the relative variations in the delay distribution are larger than in the international case. The reason for this is that the average one-way delay is only about one-tenth of that of the international connection, so even small QoS variations (e.g., simultaneous data transfers) cause large relative variations.

It is a straightforward exercise to present the above results in the form of cumulative delay distributions, which can be used for verifying transport QoS. Furthermore, delays could be studied as a time series, for example.
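The effective one-way delay estimate and the cumulative view mentioned above can be sketched as follows. This is a minimal illustration and not the analysis software used for the reported measurements. The function names, the use of integer millisecond timestamps, and the use of None for a packet that never returned are assumptions made for the example. Applying dTloss to the round-trip time is likewise an assumption.

```python
from typing import List, Optional, Sequence, Tuple

DT_LOSS_MS = 5000  # dTloss = 5 seconds (Section 3.3), expressed in ms


def effective_one_way_delays(
    tx_ms: Sequence[int],
    rx_ms: Sequence[Optional[int]],
    dt_loss_ms: int = DT_LOSS_MS,
) -> Tuple[List[float], int]:
    """Return (delays, lost) from timestamps taken at Src.

    tx_ms[i] is the transmission timestamp of packet i and rx_ms[i] the
    reception timestamp of its echo (None if it never came back).
    Each delay is one-half of the round-trip time. A packet whose
    round-trip time exceeds dt_loss_ms is counted as lost (assumption).
    """
    delays: List[float] = []
    lost = 0
    for sent, received in zip(tx_ms, rx_ms):
        if received is None or received - sent > dt_loss_ms:
            lost += 1
        else:
            delays.append((received - sent) / 2.0)
    return delays, lost


def loss_percentage(sent_count: int, lost_count: int) -> float:
    """Packet loss over the whole measurement, in percent."""
    return 100.0 * lost_count / sent_count


def fraction_within(delays: Sequence[float], limit_ms: float) -> float:
    """One point of the cumulative delay distribution:
    the fraction of received packets with delay <= limit_ms."""
    if not delays:
        return 0.0
    return sum(1 for d in delays if d <= limit_ms) / len(delays)


if __name__ == "__main__":
    # Four packets sent at incT = 20 ms; the second one is lost.
    tx = [0, 20, 40, 60]
    rx = [100, None, 160, 260]
    delays, lost = effective_one_way_delays(tx, rx)
    print(delays, loss_percentage(len(tx), lost))
    print(fraction_within(delays, 60))
```

Evaluating fraction_within over a range of limits yields the cumulative delay distribution. Keeping the per-packet delays in transmission order gives the time series view.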
Packet loss, as computed over the duration of the whole measurement, was 0.8% for the domestic connection and 1.6% for the international connection.

A derivative metric of npmps, for example packet loss correlation figures [5] or loss pattern metrics, could also be reported.

3.5 A further example of the use of results

The network QoS performance measurement results can be related to user experience of QoS by performing listening tests based on measurements such as those described above. The methodology and the results of such a set of listening tests have been reported in [6]. Below, only an outline of the mapping between network performance measurements and listening test material is provided, and an example of the results is shown.

The measurement results were mapped to listening test material by using "QoS budgets". This means that a limit was set for per-packet end-to-end delay, and packets for which the delay exceeded this value were interpreted as losses (in addition to "true" packet losses). The effective, QoS budget-based packet loss is referred to as "delay limited packet loss". Three different delay budgets were used for the two measurements, referred to as "loose", "medium", and "tight". The size of the delay budgets was based on the delay histograms shown above. More precisely, the "loose" budget means that the delay limited packet loss percentage is identical to the "pure" packet loss percentage; "medium" corresponds to a packet loss percentage of approximately 2%, and "tight" to approximately 15%.

The results of the listening test are shown in Figure 2.

Figure 2: Results of the listening test. "Total" indicates the actual Mean Opinion Score (MOS), "CIU" and "CIL" denote the upper and lower limits of the 95% confidence interval, respectively. For reference, the two leftmost columns depict results for the case of no errors at 12.2 kbit/s and 7.95 kbit/s, respectively. The higher the MOS value, the better the result.
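The "delay limited packet loss" defined above can be illustrated with the short sketch below. It is an example only and not the procedure of [6]. The function names and the input layout (one entry per sent packet, with None marking a "true" loss) are assumptions, and budget_for_target_loss is a hypothetical helper showing one possible way to pick a budget for a target loss percentage; the measurements reported here derived the budgets from the delay histograms.

```python
from typing import Optional, Sequence


def delay_limited_loss(
    delays_ms: Sequence[Optional[float]], budget_ms: float
) -> float:
    """Delay limited packet loss in percent.

    delays_ms holds one entry per sent packet: the end-to-end delay in
    ms, or None for a packet that was truly lost. A packet counts as
    lost if it was truly lost or if its delay exceeds budget_ms.
    """
    if not delays_ms:
        raise ValueError("no packets in measurement")
    late_or_lost = sum(1 for d in delays_ms if d is None or d > budget_ms)
    return 100.0 * late_or_lost / len(delays_ms)


def budget_for_target_loss(
    delays_ms: Sequence[Optional[float]], target_percent: float
) -> Optional[float]:
    """Hypothetical helper: smallest observed delay that, used as the
    budget, keeps delay limited loss at or below target_percent.
    Returns None if true losses alone already exceed the target."""
    received = sorted(d for d in delays_ms if d is not None)
    for candidate in received:
        if delay_limited_loss(delays_ms, candidate) <= target_percent:
            return candidate
    return None


if __name__ == "__main__":
    # Ten packets, one of them truly lost (made-up values).
    delays = [50, 60, None, 100, 55, 70, 65, 58, 300, 62]
    for name, budget in (("loose", 1000), ("medium", 100), ("tight", 60)):
        print(name, budget, delay_limited_loss(delays, budget))
```

With a budget above the largest observed delay, the result equals the "pure" packet loss percentage, which corresponds to the "loose" budget in the text.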
Listening tests were performed with the AMR (3GPP Adaptive Multi-Rate) codec to study the performance of the codec with bursty packet losses. The metric parameters of the measurement stream do not correspond to those of an AMR media stream. The use of AMR in the listening test is justified by control measurements, which showed that the results for low-bandwidth streams (<= 64 kbit/s payload rate) were nearly identical as long as the transmission interval incT was the same. Please note that this is not always the case.

4. Discussion

The results of a network transport QoS measurement were reported using npmps metrics. The original npmps metric is one-way, but it was demonstrated that it can also be applied to a round-trip measurement.

The results have been provided in an attempt to demonstrate that npmps-like metrics provide a basis for reporting measured IP network transport quality. The proposed metric does not specify derivative metrics in detail, but is intended to be a general way of obtaining transport delays, as well as per-packet status ("OK", "lost", ...). A sound base metric allows the derivative result representations to be tailored according to need.

5. References

1  Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

2  V. Raisanen and G. Grotefeld, "Network performance measurement for periodic streams", draft-ietf-ippm-npmps-02.txt (work in progress).

3  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

4  ETSI standard TIPHON/TS 101 329 part 5 (to be published in October).

5  R. Koodli and R. Ravikanth, "One-way Loss Pattern Sample Metrics", draft-ietf-ippm-loss-pattern-03.txt (work in progress).

6  A. Lakaniemi, J. Rosti, and V. Raisanen, "Subjective VoIP speech quality evaluation based on network measurements", submitted.

6. Author's Address

Vilho Raisanen
Nokia Corp.
P.O. Box 407
FIN-00045 Nokia Group
FINLAND
Phone: +358 9 4376 1
E-mail: Vilho.Raisanen@nokia.com

7.
Full Copyright Statement

Copyright (C) The Internet Society (date). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into in the final draft output.

Raisanen                Informational - expires May, 2001