Internet Engineering Task Force                            Takumi Kimura
INTERNET-DRAFT                                                       NTT
Expires in: April 2004                                      Jerry Perser
                                                                 Spirent
                                                            October 2003

         Benchmarking Terminology for Protection Performance

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This document addresses common terminology and metrics for the
   performance benchmarking of sub-IP layer protection technologies:
   Automatic Protection Switching (APS) for SONET/SDH, Fast Reroute for
   Multi-Protocol Label Switching (MPLS), and Resilient Packet Ring
   (RPR) standardized by the IEEE. The benchmarks describe performance
   in terms of its effects at the IP layer, so that they do not depend
   on a specific sub-IP layer protection technology.

Table of Contents

   1. Introduction
   2. Existing definitions

Kimura & Perser            Expires April 2004                   [Page 1]

INTERNET-DRAFT    Protection Performance Terminology        October 2003

   3. Term definitions
      3.1 Path
         3.1.1 Path
         3.1.2 Working Path
         3.1.3 Ordinary Path
         3.1.4 Recovery Path
         3.1.5 Recovery Span
      3.2 Protection
         3.2.1 Path Failure
         3.2.2 Failure Detection
         3.2.3 Switch Over
         3.2.4 Protection Switching
         3.2.5 Protection-Capable Node
         3.2.6 Protection System
      3.3 Reference Model for Protection Benchmarking
         3.3.1 Pseudo-Failure Equipment
         3.3.2 Trigger for Failure Protection
         3.3.3 Reference Model for Protection Benchmarking
      3.4 Metrics
         3.4.1 Errored Packet
         3.4.2 Lost Packet
         3.4.3 Sequence-Error Period
         3.4.4 Loss Period
         3.4.5 Base Latency
         3.4.6 Additive Latency
         3.4.7 Induced Latency
         3.4.8 Unstable-latency Period
         3.4.9 Recovery Time
   4. Security Considerations
   5. Acknowledgements
   6. References
   7. Authors' Addresses
   8. Full Copyright Statement

1. Introduction

   Reliability is needed in today's IP networks, because the Internet
   has already become an important communication infrastructure and
   quality-sensitive applications are being used on it. Protection
   technologies have been implemented in sub-IP layers to improve
   IP-layer reliability.
   Automatic Protection Switching (APS) is for SONET/SDH, Fast Reroute
   is for Multi-Protocol Label Switching (MPLS), and Resilient Packet
   Ring (RPR) is standardized by the IEEE. The recovery time observed
   at the IP layer differs from that in the sub-IP layers, because IP
   routers need time to recognize that an interface has gone down or
   come up and because they buffer packets. Protection performance
   benchmarks, and methodologies for testing them, are required to
   allow an objective comparison of implementations. The benchmark
   definitions in this document are based on effects at the IP layer,
   so that they are independent of any particular protection
   technology and allow different protection technologies to be
   compared.

2. Existing definitions

   This document draws on existing terminology defined in other BMWG
   work. Examples include, but are not limited to:

      Latency                   [RFC 1242, section 3.8]
      Frame Loss Rate           [RFC 1242, section 3.6]
      Throughput                [RFC 1242, section 3.17]
      Device Under Test (DUT)   [RFC 2285, section 3.1.1]
      System Under Test (SUT)   [RFC 2285, section 3.1.2]
      Out-of-sequence Packet    [Ref.[4], section 3.3.1]
      Out-of-order Packet       [Ref.[4], section 3.3.2]
      Duplicate Packet          [Ref.[4], section 3.3.3]

   This document adopts the definition format in Section 2 of RFC 1242.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

3. Term definitions

3.1 Path

3.1.1 Path

   Definition:
      A sequence of nodes, <R1, ..., Rn>, with the following
      properties:

      - R1 is the ingress node; it forwards IP packets that enter the
        DUT/SUT to R2 as sub-IP frames.
      - Ri is a node which forwards data frames to R[i+1] for all i, 1

3.3.3 Reference Model for Protection Benchmarking

   [Figure 1: Node 1 and Node 2, joined by a Recovery Path, inside the
   System Under Test (SUT)]

   Discussion:
      A reference model for protection benchmarking is shown in
      Figure 1. A SUT consists of two protection-capable nodes
      connected by both an ordinary path and a recovery path. The
      ordinary path contains pseudo-failure equipment. Test equipment,
      which is placed outside the two nodes, continuously sends IP
      packets that carry sequence numbers and time stamps to one of the
      nodes and receives packets from the other node. After the test
      equipment has sent a trigger for failure protection to the
      pseudo-failure equipment, the system detects the failure and
      switches from the failed ordinary path to the recovery path. The
      test equipment records the sequence numbers and time stamps in
      the IP packets, as well as the packet-reception times, throughout
      the time that protection switching takes to detect the failure
      and finish responding to it.

   Measurement units:
      n/a

   Issues:

   See Also:
      Working Path (3.1.2)
      Ordinary Path (3.1.3)
      Recovery Path (3.1.4)
      Path Failure (3.2.1)
      Failure Detection (3.2.2)
      Switch Over (3.2.3)
      Protection Switching (3.2.4)
      Protection-Capable Node (3.2.5)
      Pseudo-Failure Equipment (3.3.1)
      Trigger for Failure Protection (3.3.2)

3.4 Metrics

   Performance metrics for protection benchmarking include Lost
   Packets (related to Frame Loss Rate in RFC 1242), which include
   Errored Packets; Out-of-order Packets (Ref.[4]); Duplicate Packets
   (Ref.[4]); Induced Latency; and Recovery Time.

3.4.1 Errored Packet

   Definition:
      A received packet that fails at least one error-detection check
      in a sub-IP layer (FCS) or in the IP layer (IP checksum).
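
   As an illustration of the IP-layer check named in the definition
   above, the following Python sketch verifies an IPv4 header checksum
   by one's-complement summation and counts the headers that fail. The
   function name ipv4_header_checksum_ok and the sample header bytes
   are assumptions made for this example and are not part of this
   document; the sub-IP FCS check is not shown.

```python
import struct


def ipv4_header_checksum_ok(header: bytes) -> bool:
    """Return True if the IPv4 header passes its checksum.

    Hypothetical helper: the 16-bit words of a valid header, checksum
    field included, add up to 0xFFFF in one's-complement arithmetic.
    """
    if len(header) < 20 or len(header) % 2:
        return False  # too short or not a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


# Sample 20-byte header (illustrative values only).
good = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
bad = bytes([good[0], good[1] ^ 0x01]) + good[2:]  # flip one bit

received = [good, bad, good]
errored = sum(not ipv4_header_checksum_ok(h) for h in received)
print(errored)  # 1
```

   Test equipment would normally apply such a check to every received
   packet and report the count of failures as the Errored Packet
   metric.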

   Discussion:
      Packets may have these errors due to a failure or to protection
      switching in a sub-IP layer. Packets with one or more errors are
      equivalent to lost packets from the viewpoint of upper layers,
      because the errors are detected in the IP or lower layers.

   Measurement units:
      Packet count

   Issues:

   See Also:
      Lost Packet (3.4.2)

3.4.2 Lost Packet

   Definition:
      A packet which either has one or more errors or is dropped from
      the buffer in a DUT/SUT node.

   Discussion:
      The input traffic rate SHOULD be less than or equal to the
      smaller of the two Throughputs (RFC 1242) of the paths before and
      after protection switching. This metric is related to the Frame
      Loss Rate defined in RFC 1242, but the quantity of interest here
      is the number of packets lost during testing.

   Measurement units:
      Packet count

   Issues:
      Lost packets cannot be observed directly, because the test
      equipment never receives them.

   See Also:
      Throughput (RFC 1242)
      Frame Loss Rate (RFC 1242)
      Errored Packet (3.4.1)

3.4.3 Sequence-Error Period

   Definition:
      The time between the first and the last observation of
      Out-of-sequence Packets (Ref.[4]) at the output of the DUT/SUT
      over the entire test.

   Discussion:
      Observing out-of-sequence packets tracks lost packets (which
      include errored packets), out-of-order packets, and duplicate
      packets.

   Measurement units:
      Seconds

   Issues:

   See Also:
      Out-of-sequence Packet (Ref.[4])
      Errored Packet (3.4.1)
      Lost Packet (3.4.2)
      Out-of-order Packet (Ref.[4])
      Duplicate Packet (Ref.[4])

3.4.4 Loss Period

   Definition:
      The time duration calculated as dt*(Ns-Nr) if Ns > Nr, or 0 if
      Ns <= Nr, where dt is the constant inter-packet time with which
      the test equipment sends packets.
      Ns is the number of packets sent from the test equipment and Nr
      is the number of packets received by the test equipment.

   Discussion:
      Test packets do not need to carry sequence numbers for this
      metric to be measured.

   Measurement units:
      Seconds

   Issues:

   See Also:
      Lost Packet (3.4.2)

3.4.5 Base Latency

   Definition:
      Latency in the absence of network changes: no path failures, no
      route changes, and no traffic overload.

   Discussion:
      The base latency before a path failure is the latency of the
      ordinary path, and the base latency after protection switching is
      the latency of the recovery path. If the recovery path takes more
      hops than the ordinary path, the base latency is increased by
      protection switching. Between the path failure and the completion
      of protection switching, base latency cannot be determined under
      the above definition, because the working path is changing. For
      this interval, base latency is defined to be the base latency
      before the path failure. Base latency can therefore change during
      a test.

   Measurement units:
      Seconds

   Issues:

   See Also:
      Ordinary Path (3.1.3)
      Recovery Path (3.1.4)
      Latency (RFC 1242)

3.4.6 Additive Latency

   Definition:
      The base latency of the recovery path minus the base latency of
      the ordinary path.

   Discussion:
      If the recovery path takes more hops than the ordinary path, the
      latency is increased by protection switching.

   Measurement units:
      Seconds

   Issues:

   See Also:
      Ordinary Path (3.1.3)
      Recovery Path (3.1.4)
      Latency (RFC 1242)
      Base Latency (3.4.5)

3.4.7 Induced Latency

   Definition:
      The latency measured during testing minus the base latency.

   Discussion:
      This latency may be induced by buffering in nodes during
      protection switching, and it may vary with time.
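
   The relationship between Base Latency (3.4.5), Additive Latency
   (3.4.6), and Induced Latency (3.4.7) can be sketched in Python as
   follows. This is a minimal illustration rather than a prescribed
   procedure: the function names, the switch_done parameter (the time
   at which protection switching completed), the use of integer
   microseconds, and all sample values are assumptions introduced for
   the example.

```python
def additive_latency(base_ordinary, base_recovery):
    """Base latency of the recovery path minus that of the ordinary path."""
    return base_recovery - base_ordinary


def induced_latencies(packets, base_ordinary, base_recovery, switch_done):
    """Per-packet induced latency: measured latency minus base latency.

    packets      -- iterable of (send_time, receive_time) pairs
    switch_done  -- hypothetical input: time at which protection
                    switching completed; packets sent earlier are
                    compared with the ordinary-path base latency, as
                    the Base Latency discussion prescribes
    """
    result = []
    for sent, received in packets:
        base = base_ordinary if sent < switch_done else base_recovery
        result.append((received - sent) - base)
    return result


# All times in integer microseconds (an assumption that keeps the
# arithmetic exact); the values are made up for illustration.
packets = [(0, 100), (1000, 1400), (3000, 3150)]
print(additive_latency(100, 150))                  # 50
print(induced_latencies(packets, 100, 150, 2000))  # [0, 300, 0]
```

   In this example the second packet is sent before switching completes
   and is delayed by buffering, so it shows 300 microseconds of induced
   latency; the third packet travels the recovery path at its base
   latency and shows none.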

   Measurement units:
      Seconds

   Issues:
      Every packet must carry a timestamp for this metric to be
      measured.

   See Also:
      Ordinary Path (3.1.3)
      Recovery Path (3.1.4)
      Latency (RFC 1242)
      Base Latency (3.4.5)

3.4.8 Unstable-latency Period

   Definition:
      The time between the first and the last time at which test
      packets, injected with a constant period dt, are received at the
      output of the DUT/SUT with an inter-packet time that is not equal
      to dt, over the entire test.

   Discussion:
      An observed inter-packet time T is treated as equal to dt if
      dt - s < T < dt + s, where s bounds the measurement error. The
      test equipment measures the inter-packet times of the packets it
      receives, because the value of the induced latency itself is not
      needed. Observing packet intervals tracks induced latency
      indirectly.

   Measurement units:
      Seconds

   Issues:

   See Also:
      Ordinary Path (3.1.3)
      Recovery Path (3.1.4)
      Latency (RFC 1242)

3.4.9 Recovery Time

   Definition:
      The time from the earlier of the start times of the
      Unstable-latency Period and the Sequence-Error Period to the
      later of their end times.

   Discussion:
      Recovery time could be the sum of the failure detection time, the
      switch-over time, and the time taken for the system to stabilize.
      That is, it is the time taken until protection switching in
      response to a path failure has finished and stability is
      restored, so that packets are forwarded normally; i.e., lost,
      errored, out-of-order, and duplicate packets are no longer
      present, induced latency has decreased, and latency has become
      stable. The Loss Period may be used as an alternative metric for
      the recovery time, but it may be less accurate.
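
   The definitions of Sequence-Error Period (3.4.3), Unstable-latency
   Period (3.4.8), and Recovery Time (3.4.9) can be sketched in Python
   as follows. This is an illustration under stated assumptions, not a
   test methodology: the function names and sample data are invented
   for the example, an irregular interval is taken to begin at the
   arrival of the packet that precedes it, and an Out-of-sequence
   Packet is approximated as one whose sequence number is not one
   greater than that of the previously received packet.

```python
def unstable_latency_period(recv_times, dt, s):
    """(start, end) spanned by inter-packet times T outside dt-s < T < dt+s.

    Assumption: an irregular interval is taken to begin at the arrival
    of the packet that precedes it and to end at the arrival that
    follows it. Returns None if every interval is regular.
    """
    irregular = [(t0, t1) for t0, t1 in zip(recv_times, recv_times[1:])
                 if not (dt - s < t1 - t0 < dt + s)]
    if not irregular:
        return None
    return irregular[0][0], irregular[-1][1]


def sequence_error_period(records):
    """(start, end) of arrivals whose sequence number is not previous + 1.

    records -- list of (sequence_number, receive_time) in arrival order.
    Treating such arrivals as Out-of-sequence Packets is an assumption
    about Ref.[4]. Returns None if no such arrival is seen.
    """
    times = [t1 for (n0, _), (n1, t1) in zip(records, records[1:])
             if n1 != n0 + 1]
    if not times:
        return None
    return times[0], times[-1]


def recovery_time(*periods):
    """Earliest start to latest end among the periods actually observed."""
    seen = [p for p in periods if p is not None]
    if not seen:
        return 0
    return max(end for _, end in seen) - min(start for start, _ in seen)


# Made-up arrivals, times in integer microseconds; packets 4-6 are lost.
records = [(1, 1000), (2, 2000), (3, 3000), (7, 7200),
           (8, 8100), (9, 9100), (10, 10100)]
unstable = unstable_latency_period([t for _, t in records], dt=1000, s=50)
seq_err = sequence_error_period(records)
print(unstable)                          # (3000, 8100)
print(seq_err)                           # (7200, 7200)
print(recovery_time(unstable, seq_err))  # 5100
```

   Here the unstable-latency period starts earlier and ends later than
   the sequence-error period, so it alone determines the recovery time
   of 5100 microseconds.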
      If the Loss Period is used as an alternative to the recovery
      time, it MUST be referred to as "Recovery Time by Loss Period".

   Measurement units:
      Seconds

   Issues:

   See Also:
      Ordinary Path (3.1.3)
      Recovery Path (3.1.4)
      Failure Detection (3.2.2)
      Switch Over (3.2.3)
      Protection Switching (3.2.4)
      Latency (RFC 1242)
      Errored Packet (3.4.1)
      Lost Packet (3.4.2)
      Out-of-order Packet (Ref.[4])
      Duplicate Packet (Ref.[4])
      Induced Latency (3.4.7)
      Sequence-Error Period (3.4.3)
      Loss Period (3.4.4)
      Unstable-latency Period (3.4.8)

4. Security Considerations

   This document only addresses terminology for the performance
   benchmarking of protection systems, and the information contained in
   this document shall have no effect on the security of the Internet.

5. Acknowledgements

   The editors gratefully acknowledge the contribution of Al Morton in
   reviewing this document.

6. References

   [1] Bradner, S., "The Internet Standards Process -- Revision 3",
       RFC 2026, October 1996.

   [2] Bradner, S., Editor, "Benchmarking Terminology for Network
       Interconnection Devices", RFC 1242, July 1991.

   [3] Mandeville, R., "Benchmarking Terminology for LAN Switching
       Devices", RFC 2285, February 1998.

   [4] Perser, J., et al., "Terminology for Benchmarking Network-layer
       Traffic Control Mechanisms", Internet Draft, Work in Progress,
       draft-ietf-bmwg-dsmterm-07.txt, June 2003.

   [5] Bradner, S., "Key words for use in RFCs to Indicate Requirement
       Levels", RFC 2119, March 1997.

   [6] Paxson, V., et al., "Framework for IP Performance Metrics",
       RFC 2330, May 1998.

7. Authors' Addresses

   Takumi Kimura
   NTT Service Integration Laboratories
   3-9-11 Midori-cho
   Musashino-shi, Tokyo 180-8585
   Japan
   Phone: +81 422 59 3026
   EMail: takumi.kimura@lab.ntt.co.jp

   Jerry Perser
   Spirent Communications
   26750 Agoura Road
   Calabasas, CA 91302
   USA
   Phone: +1 818 676 2300
   EMail: jerry.perser@spirentcom.com

8. Full Copyright Statement

   Copyright (C) The Internet Society (2003). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.