Internet Engineering Task Force                           Takumi Kimura
INTERNET-DRAFT                                            NTT
Expires in: April 2003                                    Jerry Perser
                                                          Spirent
                                                          October 2002


          Benchmarking Terminology for Protection Performance

                 <draft-kimura-protection-term-00.txt>


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026[1].

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


Abstract

   This document addresses terminology and metrics for the performance
   benchmarking of lower-layer protection systems: Automatic Protection
   Switching (APS) for SONET/SDH, the Resilient Packet Ring (RPR) for
   Ethernet, and Multi-Protocol Label Switching (MPLS), which operates
   just below the IP layer.  The benchmarks describe system performance
   as observed in the IP layer.


Table of Contents

    1. Introduction
    2. Existing definitions
    3. Term definitions


Kimura & Perser            Expires April 2003                   [Page 1]

INTERNET-DRAFT     Protection Performance Terminology       October 2002


      3.1 Path
        3.1.1 Path
        3.1.2 Unidirectional Path
        3.1.3 Bidirectional Path
        3.1.4 Path Failure
        3.1.5 Working Path
        3.1.6 Recovery Path
      3.2 Protection System
        3.2.1 Failure Detection
        3.2.2 Switch-Over
        3.2.3 Protection Switching
        3.2.4 Protection-Capable Node
        3.2.5 Protection System
      3.3 Reference Model for Protection Benchmarking
        3.3.1 Pseudo-Failure Equipment
        3.3.2 Trigger for Protection
        3.3.3 Reference Model for Protection Benchmarking
      3.4 Metrics
        3.4.1 Lost Packets
        3.4.2 Bit-Error Packets
        3.4.3 Loss of Service
        3.4.4 Out-of-Order Packets
        3.4.5 Duplicated Packets
        3.4.6 Additive Latency
        3.4.7 Induced Latency
        3.4.8 Loss-of-Service Time
        3.4.9 Recovery Time
    4. Security Considerations
    5. References
    6. Authors' Addresses


1. Introduction

   Today's IP networks must be reliable, because the Internet has
   become an important communication infrastructure and carries
   quality-sensitive applications.  To improve IP-layer reliability,
   protection technologies are implemented in lower layers: Automatic
   Protection Switching (APS) for SONET/SDH, the Resilient Packet Ring
   (RPR) for Ethernet, and Multi-Protocol Label Switching (MPLS), which
   operates just below the IP layer.  Recovery time in the IP layer
   differs from that in the lower layers, both because of the mechanism
   by which IP routers recognize that interfaces have gone down or come
   up and because of buffering in IP routers.  Protection performance
   specifications, and methodologies for testing against them, are
   required so that implementations can be compared objectively.  The
   performance metrics in this document are based on effects observed
   in the IP layer, which makes different protection technologies
   comparable.



2.  Existing definitions

   RFC 1242[2] "Benchmarking Terminology for Network Interconnection
   Devices" and RFC 2285[3] "Benchmarking Terminology for LAN Switching
   Devices" should be consulted before attempting to make use of this
   document.  This document adopts the definition format in Section 2 of
   RFC 1242[2].

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119[4].


3. Term definitions

3.1 Path

3.1.1 Path

    Definition:
        A path is defined as a lower-layer path between two IP nodes.
        One path is regarded as being equivalent to one link in the IP
        layer.  Examples include a SONET/SDH path, an MPLS Label
        Switched Path (LSP), and an optical path.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:


3.1.2 Unidirectional Path

    Definition:
        A path which transmits traffic in only one direction.  One node
        pair has at least two unidirectional paths, one in each
        direction.  The two paths are treated as independent of each
        other.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:
        Path
        Bidirectional Path


3.1.3 Bidirectional Path

    Definition:
        A path which transmits traffic in both directions.  One
        bidirectional path consists of two unidirectional paths.  One
        node pair has at least one bidirectional path.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:
        Path
        Unidirectional Path


3.1.4 Path Failure

    Definition:
        A failure of a lower-layer path that causes path continuity to
        be lost.  It is caused by a fault in one or more links or nodes
        in a lower layer.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:


3.1.5 Working Path

    Definition:
        A path that is active and carries traffic in the absence of a
        path failure.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:
        Recovery Path


3.1.6 Recovery Path

    Definition:
        A path that is prepared in advance against path failure and is
        used to restore path continuity when the working path fails.

    Discussion:
        There are at least two types of recovery path: the dedicated
        recovery path (1+1) and the shared recovery path (m:n).

    Measurement units:
        n/a

    Issues:

    See Also:
        Working Path


3.2 Protection System

3.2.1 Failure Detection

    Definition:
        The act of detecting a path failure caused by one or more
        faults in a link or node in a lower layer.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:


3.2.2 Switch-Over

    Definition:
        The act of changing the active path from the working path to
        the recovery path when a path failure occurs.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:


3.2.3 Protection Switching

    Definition:
        The detection of a path failure and the response to it by
        switching the active path from the working path to the recovery
        path.  A protection-switching scheme includes mechanisms for
        failure detection and switch-over.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:


3.2.4 Protection-Capable Node

    Definition:
        A node which includes functional elements for handling
        protection switching.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:


3.2.5 Protection System

    Definition:
        A system which consists of two or more protection-capable nodes
        connected to each other by both working paths and recovery
        paths.

    Discussion:
        When a path failure occurs, the system detects the failure and
        switches the active path from the failed working path to the
        recovery path.  Lower-layer technologies that provide this
        function include MPLS-based recovery, SONET/SDH-based recovery,
        and optical path recovery [5]-[8].

    Measurement units:
        n/a

    Issues:

    See Also:
        Failure Detection
        Switch-Over
        Protection-Capable Node
        Protection Switching


3.3 Reference Model for Protection Benchmarking

3.3.1 Pseudo-Failure Equipment

    Definition:
        Equipment which creates a pseudo path failure after receiving a
        signal from a tester.

    Discussion:
        Pseudo-failure equipment is used in benchmarking protection
        systems because it makes tests more reliable and reproducible
        than inducing an actual path failure.

    Measurement units:
        n/a

    Issues:

    See Also:
        Trigger for Protection


3.3.2 Trigger for Protection

    Definition:
        A signal which is sent from a tester to make an item of pseudo-
        failure equipment create a pseudo failure of a working path.

    Discussion:

    Measurement units:
        n/a

    Issues:

    See Also:
        Pseudo-Failure Equipment


3.3.3 Reference Model for Protection Benchmarking

    Definition:
        A fundamental model which is used in benchmarking protection
        systems.  A System Under Test (SUT) consists of two protection-
        capable nodes connected by both a working path and a recovery
        path.  An item of pseudo-failure equipment is placed at a point
        along the working path.  A tester is set outside the two nodes
        and generates IP traffic.  The tester also sends the triggers
        for protection that cause the item of pseudo-failure equipment
        to simulate path failures.


                                 +-----------+
            +--------------------|  Tester   |<-------------------+
            |                    +-----------+                    |
            |                          | Trigger                  |
            |                          |   for Protection         |
            |              Working     v                          |
            |    +--------+  Path +---------+       +--------+    |
            |    |        |-------| Failure |------>|        |    |
            +--->| Node 1 |       +---------+       | Node 2 |----+
                 |        |- - - - - - - - - - - - >|        |
                 +--------+      Recovery Path      +--------+

                 |                                           |
                 +-------------------------------------------+
                              System Under Test (SUT)

                                    Figure 1


    Discussion:
        Figure 1 shows the reference model for protection benchmarking.
        A SUT consists of two protection-capable nodes connected by both
        a working path and a recovery path.  Pseudo-failure equipment is
        inserted in the working path.  A tester, which is placed outside
        the two nodes, continuously sends IP packets that include
        sequence numbers and time stamps to one of the nodes and
        receives the packets from the other node.  After the tester
        sends a trigger for protection to the pseudo-failure equipment,
        the system detects the failure and switches from the failed
        working path to the recovery path.  The tester records the
        sequence numbers and time stamps in the IP packets, as well as
        the packet-reception times, during the time protection switching
        takes to detect and finish responding to a failure.
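
        As an illustration only, the per-packet information that the
        tester records could be held in a structure like the following
        Python sketch.  The class name PacketRecord, its field names,
        and the assumption that both times are taken from the same
        tester clock are hypothetical and are not defined by this
        document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketRecord:
    """One received test packet as logged by the tester (hypothetical)."""

    sequence_number: int  # sequence number carried in the IP packet
    tx_timestamp: float   # time stamp carried in the packet, seconds
    rx_time: float        # packet-reception time at the tester, seconds

    @property
    def latency(self) -> float:
        # Latency through the SUT; assumes both times use one clock.
        return self.rx_time - self.tx_timestamp


# Two example records, as they might be logged before any failure.
records = [
    PacketRecord(sequence_number=1, tx_timestamp=0.000, rx_time=0.002),
    PacketRecord(sequence_number=2, tx_timestamp=0.001, rx_time=0.003),
]
```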

    Measurement units:
        n/a

    Issues:

    See Also:


3.4 Metrics

3.4.1 Lost Packets

    Definition:
        Packets which are lost during the time protection switching
        takes to detect and finish responding to a failure.

    Discussion:
        The input traffic rate SHOULD be less than or equal to the
        Throughput[2].
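
        A minimal sketch of how a tester might count lost packets from
        the recorded sequence numbers.  It assumes that the tester
        numbered the packets it sent consecutively from 0; the function
        name and the numbering scheme are assumptions of this example.

```python
def count_lost_packets(sent_count, received_sequence_numbers):
    """Count sent packets whose sequence number was never received.

    Assumes the sent packets were numbered 0 .. sent_count - 1.
    """
    received = set(received_sequence_numbers)
    return sum(1 for seq in range(sent_count) if seq not in received)


# Ten packets were sent; packets 4, 5 and 6 never arrived.
lost = count_lost_packets(10, [0, 1, 2, 3, 7, 8, 9])
```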

    Measurement units:
        Number of N-octet frames

    Issues:

    See Also:
        Throughput[2]


3.4.2 Bit-Error Packets

    Definition:
        Received packets that have incurred bit-errors during the time
        protection switching takes to detect and finish responding to a
        failure.

    Discussion:

    Measurement units:
        Number of N-octet frames

    Issues:

    See Also:


3.4.3 Loss of Service

    Definition:
        The volume of lost packets and bit-error packets during the
        time protection switching takes to detect and finish responding
        to a failure.

    Discussion:
        The input traffic rate SHOULD be equal to the Throughput[2].
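
        As a sketch, if every test frame has the same size of N octets
        (an assumption of this example, matching the N-octet frames used
        as the unit for the packet counts), Loss of Service in octets
        could be computed from the two counts as follows.  The function
        and parameter names are hypothetical.

```python
def loss_of_service_octets(lost_packets, bit_error_packets,
                           frame_size_octets):
    """Volume, in octets, of lost plus bit-error packets.

    Assumes all test frames have the same size, frame_size_octets.
    """
    return (lost_packets + bit_error_packets) * frame_size_octets


# 3 lost packets and 1 bit-error packet, with 64-octet frames.
volume = loss_of_service_octets(3, 1, 64)
```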

    Measurement units:
        Octets

    Issues:

    See Also:
        Throughput[2]


3.4.4 Out-of-Order Packets

    Definition:
        Received packets with a lower sequence number than was expected.
        These are elsewhere [9] referred to as re-ordered packets.

    Discussion:
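        One possible reading of "expected", shown here only as a
        sketch, is that the next expected sequence number is one more
        than the highest sequence number received so far.  That
        reading, the function name, and the choice to skip duplicates
        so that they are counted only as duplicated packets are all
        assumptions of this example.

```python
def count_out_of_order(received_sequence_numbers):
    """Count packets that arrive below the next expected number."""
    seen = set()
    next_expected = 0
    out_of_order = 0
    for seq in received_sequence_numbers:
        if seq in seen:
            continue  # duplicate; counted separately, not here
        if seq < next_expected:
            out_of_order += 1
        else:
            next_expected = seq + 1
        seen.add(seq)
    return out_of_order


# Packets 2 and 3 arrive after packet 4.
reordered = count_out_of_order([0, 1, 4, 2, 3, 5])
```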

    Measurement units:
        Number of N-octet frames

    Issues:

    See Also:
        Out-of-order Packet[9]


3.4.5 Duplicated Packets

    Definition:
        Received packets with the same sequence number as a packet that
        has already been received.

    Discussion:
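        A minimal sketch of counting duplicated packets from the
        sequence numbers in arrival order.  The function name is
        hypothetical; each arrival after the first one with a given
        sequence number is counted once.

```python
def count_duplicates(received_sequence_numbers):
    """Count arrivals whose sequence number was already received."""
    seen = set()
    duplicates = 0
    for seq in received_sequence_numbers:
        if seq in seen:
            duplicates += 1
        else:
            seen.add(seq)
    return duplicates


# Sequence number 1 arrives three times: two duplicated packets.
duplicated = count_duplicates([0, 1, 1, 2, 1])
```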

    Measurement units:
        Number of N-octet frames

    Issues:

    See Also:


3.4.6 Additive Latency

    Definition:
        Difference between recovery-path latency and working-path
        latency (recovery-path latency minus working-path latency).

    Discussion:
        If a recovery path takes more hops than a working path, the
        latency is increased by protection switching.
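
        As a sketch, the difference could be estimated from latency
        samples taken before the failure and after protection switching
        has finished.  Summarizing each set of samples by its median is
        an assumption of this example, as are the function and
        parameter names; this document does not prescribe a method.

```python
from statistics import median


def additive_latency(working_latencies, recovery_latencies):
    """Recovery-path latency minus working-path latency, in seconds.

    Each argument is a non-empty list of latency samples in seconds.
    """
    return median(recovery_latencies) - median(working_latencies)


# About 2.0 ms on the working path and 3.5 ms on the recovery path.
delta = additive_latency([0.0020, 0.0021, 0.0020],
                         [0.0035, 0.0034, 0.0035])
```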

    Measurement units:
        Seconds

    Issues:

    See Also:
        Latency[2]


3.4.7 Induced Latency

    Definition:
        An instantaneous increase in latency to a value greater than
        the latency of either the working path or the recovery path.

    Discussion:
        This latency may be induced by buffering in nodes during
        protection switching.
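
        One way to express this numerically, given only as a sketch, is
        the amount by which the peak latency seen during protection
        switching exceeds the larger of the two path latencies.  This
        interpretation and all names below are assumptions of the
        example.

```python
def induced_latency(peak_latency, working_latency, recovery_latency):
    """Excess of the peak latency over both path latencies, seconds.

    Returns 0.0 when the peak does not exceed either path latency.
    """
    excess = peak_latency - max(working_latency, recovery_latency)
    return max(excess, 0.0)


# A 10 ms peak against 2.0 ms (working) and 3.5 ms (recovery).
induced = induced_latency(0.010, 0.0020, 0.0035)
```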

    Measurement units:
        Seconds

    Issues:

    See Also:
        Latency[2]


3.4.8 Loss-of-Service Time

    Definition:
        Loss-of-Service Time is defined as the time taken to switch over
        to a recovery path and to restart the forwarding of packets
        after a path failure.

    Discussion:
        Loss-of-service time may be the sum of the failure-detection and
        switch-over times.  This time is derived from Loss of Service
        and the input traffic rate.
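
        A minimal sketch of that derivation, assuming a constant input
        traffic rate expressed in bits per second and counting only the
        octets of the offered packets.  The function name, the unit of
        the rate, and the constant-rate assumption are not defined by
        this document.

```python
def loss_of_service_time(loss_of_service_octets, input_rate_bps):
    """Estimate Loss-of-Service Time in seconds.

    loss_of_service_octets: Loss of Service, in octets.
    input_rate_bps: constant input traffic rate, in bits per second.
    """
    if input_rate_bps <= 0:
        raise ValueError("input rate must be positive")
    return loss_of_service_octets * 8 / input_rate_bps


# 1,250,000 octets of Loss of Service at a constant 1 Gbit/s.
estimate = loss_of_service_time(1_250_000, 1_000_000_000)
```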

    Measurement units:
        Seconds

    Issues:

    See Also:
        Loss of Service


3.4.9 Recovery Time

    Definition:
        Recovery time is defined as the time taken, after a path
        failure, for protection switching to finish and for stability
        to be restored to the forwarding of packets: abnormal packets
        (lost, bit-error, out-of-order, and duplicated) are no longer
        present, induced latency has subsided, and latency has become
        stable.

    Discussion:
        Recovery time may be the sum of the failure detection time,
        switch-over time and time taken for the system to be stabilized.

    Measurement units:
        Seconds

    Issues:

    See Also:


4. Security Considerations

   This document addresses only terminology for the performance
   benchmarking of protection systems, and the information contained in
   this document has no effect on the security of the Internet.


5. References

   [1]  Bradner, S., "The Internet Standards Process -- Revision 3",
        RFC 2026, October 1996.

   [2]  Bradner, S., Editor, "Benchmarking Terminology for
        Network Interconnection Devices", RFC 1242, July 1991.

   [3]  Mandeville, R., "Benchmarking Terminology for LAN
        Switching Devices", RFC 2285, February 1998.

   [4]  Bradner, S., "Key words for use in RFCs to Indicate
        Requirement Levels", RFC 2119, March 1997.

   [5]  Lai, W.S., et al., "Network Hierarchy and Multilayer
        Survivability", Internet Draft, Work in Progress,
        draft-ietf-tewg-restore-hierarchy-01.txt, July 2002.

   [6]  Owens, K., et al., "Network Survivability Considerations
        for Traffic Engineered IP Networks",
        Internet Draft, Work in Progress,
        draft-owens-te-network-survivability-03.txt, May 2002.

   [7]  Sharma, V., et al., "Framework for MPLS-based Recovery",
        Internet Draft, Work in Progress,
        draft-ietf-mpls-recovery-frmwrk-07.txt, September 2002.

   [8]  Mannie, E., et al., "Recovery (Protection and Restoration)
        Terminology for GMPLS", Internet Draft, Work in Progress,
        draft-ietf-ccamp-gmpls-recovery-terminology-00.txt, June 2002.

   [9]  Perser, J., et al., "Terminology for Benchmarking Network-layer
        Traffic Control Mechanisms",
        Internet Draft, Work in Progress,
        draft-ietf-bmwg-dsmterm-03.txt, June 2002.


6. Authors' Addresses

   Takumi Kimura
   NTT Service Integration Laboratories
   3-9-11 Midori-cho
   Musashino-shi, Tokyo 180-8585
   Japan
   Phone: +81 422 59 3026
   EMail: takumi.kimura@lab.ntt.co.jp

   Jerry Perser
   Spirent Communications
   26750 Agoura Road
   Calabasas, CA 91302
   USA
   Phone: +1 818 676 2300
   EMail: jerry.perser@spirentcom.com

