Internet Engineering Task Force                            Takumi Kimura
INTERNET-DRAFT                                                       NTT
Expires in: April 2003                                      Jerry Perser
                                                                 Spirent
                                                            October 2002

          Benchmarking Terminology for Protection Performance

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026[1].

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This document addresses terminology and metrics for the performance
   benchmarking of lower-layer protection systems: Automatic Protection
   Switching (APS) for SONET/SDH, the Resilient Packet Ring (RPR) for
   Ethernet, and Multi-Protocol Label Switching (MPLS) for just below
   the IP-layer.  The benchmarks describe system performance in the
   IP-layer.

Table of Contents

   1. Introduction
   2. Existing definitions
   3. Term definitions

Kimura & Perser            Expires April 2003                   [Page 1]

INTERNET-DRAFT   Protection Performance Terminology        October 2002

      3.1 Path
         3.1.1 Path
         3.1.2 Unidirectional Path
         3.1.3 Bidirectional Path
         3.1.4 Path Failure
         3.1.5 Working Path
         3.1.6 Recovery Path
      3.2 Protection System
         3.2.1 Failure Detection
         3.2.2 Switch-Over
         3.2.3 Protection Switching
         3.2.4 Protection-Capable Node
         3.2.5 Protection System
      3.3 Reference Model for Protection Benchmarking
         3.3.1 Pseudo-Failure Equipment
         3.3.2 Trigger for Protection
         3.3.3 Reference Model for Protection Benchmarking
      3.4 Metrics
         3.4.1 Lost Packets
         3.4.2 Bit-Error Packets
         3.4.3 Loss of Service
         3.4.4 Out-of-Order Packets
         3.4.5 Duplicated Packets
         3.4.6 Additive Latency
         3.4.7 Induced Latency
         3.4.8 Loss-of-Service Time
         3.4.9 Recovery Time
   4. Security Considerations
   5. References
   6. Authors' Addresses

1. Introduction

   Today's IP networks need to be reliable, because the Internet has
   become an important communication infrastructure and carries
   quality-sensitive applications.  To improve IP-layer reliability,
   protection technologies are implemented in lower layers: Automatic
   Protection Switching (APS) for SONET/SDH, the Resilient Packet Ring
   (RPR) for Ethernet, and Multi-Protocol Label Switching (MPLS) just
   below the IP layer.  Recovery time as observed in the IP layer
   differs from recovery time in the lower layers, because of the
   mechanism by which IP routers recognize that interfaces have gone
   up or down and because of the buffering effect of IP routers.
   Protection performance specifications and methodologies for testing
   them are required to provide for the objective comparison of
   implementations.  Performance metrics are based on effects in the IP
   layer; different protection technologies are thus made comparable.

2. Existing definitions

   RFC 1242[2] "Benchmarking Terminology for Network Interconnection
   Devices" and RFC 2285[3] "Benchmarking Terminology for LAN Switching
   Devices" should be consulted before attempting to make use of this
   document.  This document adopts the definition format in Section 2
   of RFC 1242[2].

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119[4].

3. Term definitions

3.1 Path

3.1.1 Path

   Definition:
      A path is defined as a lower-layer path between two IP nodes; one
      path is regarded as equivalent to one link in the IP layer.
      Examples are a SONET/SDH path, a Label Switched Path for MPLS,
      and an optical path.

   Discussion:

   Measurement units:
      n/a

   Issues:

   See Also:

3.1.2 Unidirectional Path

   Definition:
      A path which transmits traffic in only one direction.  One node
      pair has at least two unidirectional paths, one in each
      direction.  The two paths are treated as independent of each
      other.

   Discussion:

   Measurement units:
      n/a

   Issues:

   See Also:
      Path
      Bidirectional Path

3.1.3 Bidirectional Path

   Definition:
      A path which transmits traffic in both directions.  One
      bidirectional path consists of two unidirectional paths.  One
      node pair has at least one bidirectional path.

   Discussion:

   Measurement units:
      n/a

   Issues:

   See Also:
      Path
      Unidirectional Path

3.1.4 Path Failure

   Definition:
      A lower-layer path failure, causing path continuity to be lost.
      It is caused by faults in link(s) or node(s) in a lower layer.

   Discussion:

   Measurement units:
      n/a

   Issues:

   See Also:

3.1.5 Working Path

   Definition:
      A path which is active and used.

   Discussion:

   Measurement units:
      n/a

   Issues:

   See Also:
      Recovery Path

3.1.6 Recovery Path

   Definition:
      A path which is prepared against the eventuality of path failure,
      and is used to recover path continuity in case of working-path
      failure.

   Discussion:
      There are at least two types of recovery path: the dedicated
      recovery path (1+1) and the shared recovery path (m:n).

   Measurement units:
      n/a

   Issues:

   See Also:
      Working Path

3.2 Protection System

3.2.1 Failure Detection

   Definition:
      The detection of a path failure that is caused by a fault or
      faults in a link or node in a lower layer.

   Discussion:

   Measurement units:
      n/a

   Issues:

   See Also:

3.2.2 Switch-Over

   Definition:
      The change of the active path, in case of failure, from the
      working path to the recovery path.

   Discussion:

   Measurement units:
      n/a

   Issues:

   See Also:

3.2.3 Protection Switching

   Definition:
      The detection of path failures and the response to them by
      switching the active path from the working path to the recovery
      path.  A protection-switching scheme includes mechanisms for
      failure detection and switch-over.

   Discussion:

   Measurement units:
      n/a

   Issues:

   See Also:

3.2.4 Protection-Capable Node

   Definition:
      A node which includes functional elements for handling
      protection switching.

   Discussion:

   Measurement units:
      n/a

   Issues:

   See Also:

3.2.5 Protection System

   Definition:
      A system which consists of two or more protection-capable nodes
      connected to each other by both working paths and recovery paths.
   Discussion:
      When a path failure occurs, the system detects the failure and
      switches the active path from the failed working path to the
      recovery path.  Lower-layer technologies for this include
      MPLS-based recovery, SONET/SDH-based recovery, and optical path
      recovery [5]-[8].

   Measurement units:
      n/a

   Issues:

   See Also:
      Failure Detection
      Switch-Over
      Protection-Capable Node
      Protection Switching

3.3 Reference Model for Protection Benchmarking

3.3.1 Pseudo-Failure Equipment

   Definition:
      Equipment which creates a pseudo path failure after receiving a
      signal from a tester.

   Discussion:
      An item of pseudo-failure equipment is used in benchmarking
      protection systems, since it provides more reliable and
      reproducible testing than actual path failure.

   Measurement units:
      n/a

   Issues:

   See Also:
      Trigger for Protection

3.3.2 Trigger for Protection

   Definition:
      A signal which is sent from a tester to make an item of
      pseudo-failure equipment create a pseudo failure of a working
      path.

   Discussion:

   Measurement units:
      n/a

   Issues:

   See Also:
      Pseudo-Failure Equipment

3.3.3 Reference Model for Protection Benchmarking

   Definition:
      A fundamental model which is used in benchmarking protection
      systems.  A System Under Test (SUT) consists of two
      protection-capable nodes connected by both a working path and a
      recovery path.  An item of pseudo-failure equipment is placed at
      a point along the working path.  A tester is set outside the two
      nodes and generates IP traffic.  The tester also sends the
      triggers for protection that cause the item of pseudo-failure
      equipment to simulate path failures.
                        +-----------+
   +--------------------|  Tester   |<-------------------+
   |                    +-----------+                    |
   |                          | Trigger                  |
   |                          | for Protection           |
   |              Working     v                          |
   |    +--------+ Path  +---------+       +--------+    |
   |    |        |-------| Failure |------>|        |    |
   +--->| Node 1 |       +---------+       | Node 2 |----+
        |        |- - - - - - - - - - - - >|        |
        +--------+      Recovery Path      +--------+
        |                                           |
        +-------------------------------------------+
                   System Under Test (SUT)

                          Figure 1

   Discussion:
      Figure 1 shows the reference model for protection benchmarking.
      A SUT consists of two protection-capable nodes connected by both
      a working path and a recovery path.  Pseudo-failure equipment is
      placed on the working path.  A tester, which is placed outside
      the two nodes, continuously sends IP packets that include
      sequence numbers and time stamps to one of the nodes and receives
      packets from the other node.  After the tester sends a trigger
      for protection to the pseudo-failure equipment, the system
      detects the failure and switches from the failed working path to
      the recovery path.  The tester records the sequence numbers and
      time stamps in the IP packets, as well as the packet-reception
      times, during the time protection switching takes to detect and
      finish responding to a failure.

   Measurement units:
      n/a

   Issues:

   See Also:

3.4 Metrics

3.4.1 Lost Packets

   Definition:
      Packets which are lost during the time protection switching takes
      to detect and finish responding to a failure.

   Discussion:
      The input traffic rate SHOULD be less than or equal to the
      Throughput[2].

   Measurement units:
      Number of N-octet frames

   Issues:

   See Also:
      Throughput[2]

3.4.2 Bit-Error Packets

   Definition:
      Received packets that have incurred bit-errors during the time
      protection switching takes to detect and finish responding to a
      failure.
   Discussion:

   Measurement units:
      Number of N-octet frames

   Issues:

   See Also:

3.4.3 Loss of Service

   Definition:
      The volume of dropped packets and bit-error packets during the
      time protection switching takes to detect and finish responding
      to a failure.

   Discussion:
      The input traffic rate SHOULD be equal to the Throughput[2].

   Measurement units:
      Octets

   Issues:

   See Also:
      Throughput[2]

3.4.4 Out-of-Order Packets

   Definition:
      Received packets with a lower sequence number than was expected.
      These are referred to elsewhere [9] as re-ordered packets.

   Discussion:

   Measurement units:
      Number of N-octet frames

   Issues:

   See Also:
      Out-of-order Packet[9]

3.4.5 Duplicated Packets

   Definition:
      Received packets with the same sequence number as a packet that
      has already been received.

   Discussion:

   Measurement units:
      Number of N-octet frames

   Issues:

   See Also:

3.4.6 Additive Latency

   Definition:
      Difference between working-path latency and recovery-path
      latency.

   Discussion:
      If a recovery path takes more hops than a working path, the
      latency is increased by protection switching.

   Measurement units:
      Seconds

   Issues:

   See Also:
      Latency[2]

3.4.7 Induced Latency

   Definition:
      A momentary increase in latency beyond the latency of either the
      working path or the recovery path.

   Discussion:
      This latency may be induced by buffering in nodes during
      protection switching.

   Measurement units:
      Seconds

   Issues:

   See Also:
      Latency[2]

3.4.8 Loss-of-Service Time

   Definition:
      Loss-of-Service Time is defined as the time taken to switch over
      to a recovery path and to restart the forwarding of packets after
      a path failure.

   Discussion:
      Loss-of-service time may be the sum of the failure-detection and
      switch-over times.  This time is derived from Loss of Service and
      the input traffic rate.
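      The derivation from Loss of Service and the input traffic rate
      can be sketched as a simple division.  The sketch below is
      illustrative only and rests on assumptions this document does not
      state: the input traffic rate is constant and expressed in bits
      per second, and Loss of Service is counted in octets as in
      3.4.3.  The function name and the example figures are
      hypothetical.

```python
def loss_of_service_time(loss_of_service_octets: int, input_rate_bps: float) -> float:
    """Estimate Loss-of-Service Time in seconds.

    Assumption (not from the draft): traffic is offered at a constant
    rate given in bits per second, so the time during which service was
    lost is the lost volume divided by that rate.
    """
    if input_rate_bps <= 0:
        raise ValueError("input traffic rate must be positive")
    # Loss of Service is measured in octets; convert to bits first.
    return loss_of_service_octets * 8 / input_rate_bps


# Hypothetical figures: 625,000 octets dropped or errored at 100 Mbit/s.
print(loss_of_service_time(625_000, 100e6))  # prints 0.05
```

      With these example figures the estimate is 50 milliseconds.  If
      the offered load is bursty rather than constant, this division
      only approximates the time during which forwarding was
      interrupted.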
   Measurement units:
      Seconds

   Issues:

   See Also:
      Loss of Service

3.4.9 Recovery Time

   Definition:
      Recovery time is defined as the time when protection switching in
      response to a path failure is finished and stability is restored
      to the forwarding of packets: abnormal packets (lost, bit-error,
      out-of-order, and duplicated) are no longer present, induced
      latency has subsided, and latency has become stable.

   Discussion:
      Recovery time may be the sum of the failure-detection time, the
      switch-over time, and the time taken for the system to stabilize.

   Measurement units:
      n/a

   Issues:

   See Also:

4. Security Considerations

   This document addresses only terminology for the performance
   benchmarking of protection systems, and the information contained in
   this document has no effect on the security of the Internet.

5. References

   [1] Bradner, S., "The Internet Standards Process -- Revision 3",
       RFC 2026, October 1996.

   [2] Bradner, S., Editor, "Benchmarking Terminology for Network
       Interconnection Devices", RFC 1242, July 1991.

   [3] Mandeville, R., "Benchmarking Terminology for LAN Switching
       Devices", RFC 2285, February 1998.

   [4] Bradner, S., "Key words for use in RFCs to Indicate Requirement
       Levels", RFC 2119, March 1997.

   [5] Lai, W.S., et al., "Network Hierarchy and Multilayer
       Survivability", Internet Draft, Work in Progress,
       draft-ietf-tewg-restore-hierarchy-01.txt, July 2002.

   [6] Owens, K., et al., "Network Survivability Considerations for
       Traffic Engineered IP Networks", Internet Draft, Work in
       Progress, draft-owens-te-network-survivability-03.txt, May 2002.

   [7] Sharma, V., et al., "Framework for MPLS-based Recovery",
       Internet Draft, Work in Progress,
       draft-ietf-mpls-recovery-frmwrk-07.txt, September 2002.
   [8] Mannie, E., et al., "Recovery (Protection and Restoration)
       Terminology for GMPLS", Internet Draft, Work in Progress,
       draft-ietf-ccamp-gmpls-recovery-terminology-00.txt, June 2002.

   [9] Perser, J., et al., "Terminology for Benchmarking Network-layer
       Traffic Control Mechanisms", Internet Draft, Work in Progress,
       draft-ietf-bmwg-dsmterm-03.txt, June 2002.

6. Authors' Addresses

   Takumi Kimura
   NTT Service Integration Laboratories
   3-9-11 Midori-cho
   Musashino-shi, Tokyo 180-8585 Japan
   Phone: +81 422 59 3026
   EMail: takumi.kimura@lab.ntt.co.jp

   Jerry Perser
   Spirent Communications
   26750 Agoura Road
   Calabasas, CA 91302 USA
   Phone: +1 818 676 2300
   EMail: jerry.perser@spirentcom.com