Intarea working group                             Janardhanan Narasimhan
Internet Draft                                Balaji Venkat Venkataswami
Intended Status: Proposed Standard                          Dell-Force10
Expires: 23 July 2012                                        Rich Groves
                                                               Microsoft
                                                             Peter Hoose
                                                                Facebook
                                                         23 January 2012


                               Traceflow
                 draft-janapath-intarea-traceflow-00.txt


Abstract

   This document describes a new OAM protocol, TraceFlow, that captures
   information pertaining to a traffic flow along the path that the flow
   takes through the network. TraceFlow is ECMP and link-aggregation
   aware and captures information about the constituent members through
   which the traffic flow passes. TraceFlow gathers information that is
   relevant to the flow, such as the Layer 3 address of the outgoing
   interface, the next hop to which packets of the flow are forwarded,
   and the effect of network policies such as access control lists on
   the flow. This draft requires the TraceFlow protocol to be processed
   by Layer 3 devices only. Layer 2 devices and MPLS LERs/LSRs along
   the way pass TraceFlow packets through without processing them. IP
   tunnels such as IP-in-IP and IP-in-GRE are likewise expected to
   carry TraceFlow packets in pass-through mode. To achieve its
   purpose, TraceFlow uses a specific UDP destination port to be
   assigned by IANA.


Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

 


Janardhanan et.al.         Expires July 2012                    [Page 1]

INTERNET DRAFT                 Traceflow                    January 2012


   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html


Copyright and License Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the document
   authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.



Table of Contents

   1  Introduction  . . . . . . . . . . . . . . . . . . . . . . . . .  5
     1.1  Terminology . . . . . . . . . . . . . . . . . . . . . . . .  5
   2. Motivation  . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     2.1. Evolution of IP networks  . . . . . . . . . . . . . . . . .  5
   3. Packet Formats  . . . . . . . . . . . . . . . . . . . . . . . .  7
     3.1. Flow Discovery Request/Response Packet Format . . . . . . .  7
     3.2. Flow Discovery Request TLVs . . . . . . . . . . . . . . . .  8
       3.2.1. Flow Descriptor TLV . . . . . . . . . . . . . . . . . .  8
       3.2.2. Originator Address TLV  . . . . . . . . . . . . . . . . 10
       3.2.3. Information Request bitmap TLV  . . . . . . . . . . . . 11
       3.2.4. Termination TLV . . . . . . . . . . . . . . . . . . . . 12
     3.3. Flow Discovery Response TLVs  . . . . . . . . . . . . . . . 13
       3.3.1. Information Response TLV  . . . . . . . . . . . . . . . 13
         3.3.1.1 Utilization Anomaly TLV  . . . . . . . . . . . . . . 16
       3.3.2. Result TLV  . . . . . . . . . . . . . . . . . . . . . . 19
       3.3.3. Additional Informational Code TLV . . . . . . . . . . . 21
     3.4. TLVs common to Flow Discovery Request and Response  . . . . 22
       3.4.1. Encapsulated Packet TLV . . . . . . . . . . . . . . . . 22
       3.4.2. Encapsulated Packet Mask TLV  . . . . . . . . . . . . . 24
       3.4.3. Record Route TLV  . . . . . . . . . . . . . . . . . . . 25
   4. Protocol Operation  . . . . . . . . . . . . . . . . . . . . . . 26
       4.0.1 Assessing why redundant responses come through.  . . . . 30
     4.1. Using Hardware to gather details for the response packet. . 31
     4.2 Interaction with MPLS based transit devices. . . . . . . . . 31
     4.3 Applicability to Layer 2 devices.  . . . . . . . . . . . . . 31
     4.4 Applicability to platforms that have trouble determining
         incoming Interface.  . . . . . . . . . . . . . . . . . . . . 31
     4.5 Applicability to Network Address Translators . . . . . . . . 31
   5. Application Scenarios . . . . . . . . . . . . . . . . . . . . . 32
     5.1. Troubleshooting network failures  . . . . . . . . . . . . . 32
     5.2. Network flow planning . . . . . . . . . . . . . . . . . . . 33
       5.2.1 Programmatic migration to mitigate LAG link 
             polarization . . . . . . . . . . . . . . . . . . . . . . 34
   6. Security Considerations . . . . . . . . . . . . . . . . . . . . 35
   7. Hardware pre-requisites for implementing Traceflow. . . . . . . 35
     7.1 filter to trap packets with UDP destination port . . . . . . 35
     7.2 Packet injection mode directly to egress port. . . . . . . . 36
     7.3 Packet injection mode through hardware engine but not to
         output port. . . . . . . . . . . . . . . . . . . . . . . . . 36
     7.4 Hardware rate limiter support (preventing DOS attacks) . . . 36
     7.5 RPF check support in hardware (security consideration) . . . 36
     7.6 Regular Security ACLs in the boundary of the network.  . . . 37
     7.7 Implementing the LAG / ECMP using software state . . . . . . 37
     7.8 Implementation considerations  . . . . . . . . . . . . . . . 37
     7.7.l Using ingress port as part of the LAG/ECMP hashing
           function.  . . . . . . . . . . . . . . . . . . . . . . . . 37
   8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 38
   9. Contributors  . . . . . . . . . . . . . . . . . . . . . . . . . 38
   APPENDIX A:  . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
     A.1. Encapsulation Format Choices  . . . . . . . . . . . . . . . 38
       A.1.1. Carrying a separate Flow Descriptor TLV inside the 
              Flow  . . . . . . . . . . . . . . . . . . . . . . . . . 38
       A.1.2. Using the traffic flow's parameter values in the 
              external header.  . . . . . . . . . . . . . . . . . . . 39
     A.2. Layer 4 Protocol Choices and Router Alert option  . . . . . 39
       A.2.1. UDP Encapsulation . . . . . . . . . . . . . . . . . . . 39
       A.2.2. ICMP Encapsulation  . . . . . . . . . . . . . . . . . . 39
     A.3. Legacy Devices (Not supporting TraceFlow) . . . . . . . . . 40
     A.4. TTL Scoping . . . . . . . . . . . . . . . . . . . . . . . . 40
     A.5. Additional Information in the Flow Discovery Response . . . 40
     A.6. Choices for supporting remote TraceFlow requests  . . . . . 41
       A.6.1. Terminating the request at the Proxy device and
              re-originate it . . . . . . . . . . . . . . . . . . . . 41
       A.6.2. Source-Routing the request through the Proxy device . . 41
     A.7. Applicability to Multicast  . . . . . . . . . . . . . . . . 41
     A.8. Applicability to Layer 2 networks . . . . . . . . . . . . . 41
     A.9. Applicability to IPv6 . . . . . . . . . . . . . . . . . . . 42
     A.10. Applicability to MPLS  . . . . . . . . . . . . . . . . . . 42
     A.11. Flow Discovery and Response packet fragmentation . . . . . 42
   9. References  . . . . . . . . . . . . . . . . . . . . . . . . . . 42
     9.1. Normative References  . . . . . . . . . . . . . . . . . . . 42
     9.2. Informative References  . . . . . . . . . . . . . . . . . . 42
   Author's Addresses . . . . . . . . . . . . . . . . . . . . . . . . 43

1  Introduction

   The TraceFlow protocol allows a user to determine the path taken by a
   flow through a network. It provides the capability to collect, at
   each hop of the network, information that pertains to the forwarding
   of the flow. This information can include individual member
   information in a link-aggregation group (LAG) or ECMP set.

   There is a need for a mechanism that allows a user to determine the
   path that a flow takes through a network [3]. Current solutions (such
   as traceroute) do not provide details about the exact physical or
   logical interface through which the flow passes in cases where LAG
   and/or ECMP are employed or policy-based routing is in effect.

   Such information at intermediate hops in the network can prove
   useful to network operators in troubleshooting network failures.

1.1  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].


2. Motivation 

   Network operators have traditionally managed IP networks with classic
   OAM tools like Ping and Traceroute [2]. Operators typically use Ping
   to perform end-to-end connectivity checks, and Traceroute to trace
   the hop-by-hop path to a given destination. Traceroute is also used
   to isolate the point of failure along the path to a given
   destination. These tools have performed very well for the IP
   networks they were designed for.

2.1. Evolution of IP networks 

   With the passage of time, networks have morphed into more complex
   heterogeneous entities. Layer 2 switches and MPLS LSRs are often
   intermixed with IP routers. MPLS ping and MPLS traceroute, also
   known as LSP ping and LSP traceroute, identify the intermediate hops
   through which they travel using methods such as the router alert
   label. The relevant RFCs specify these methods for MPLS
   troubleshooting. This document does not intend to interfere with the
   MPLS OAM methods. TraceFlow is intended exclusively for pure Layer 3
   troubleshooting and will not troubleshoot Layer 2 device failures or
   MPLS transit node failures. Plain IP-in-IP tunneling is also outside
   the scope of this document.

   An increasing number of networks use multipath configurations to
   improve load balancing and redundancy. These multipaths can take the
   form of end-to-end ECMP paths or LAGs between directly connected
   hops. Existing tools such as Ping and Traceroute, which follow the
   destination-IP-address-based routing model, may not follow the path
   taken by the actual traffic in multipath and/or policy-based routing
   scenarios. The forwarding of actual traffic in such scenarios is
   based on a set of packet header fields. The OAM tools have not kept
   up with the new requirements of these evolving networks. Hence there
   is a need to extend the OAM tools so that operators can execute new
   OAM functions:

      1. Perform Ping or Traceroute based on a set of link layer and/or
         TCP/IP header fields of actual user traffic. This feature is
         useful for troubleshooting network problems and for
         planning/provisioning network resources.

      2. Trace end-to-end paths comprising a mix of Layer 2 hops and
         IP+MPLS routers along the way. Layer 2 hops and MPLS hops are
         traversed in pass-through mode.

      3. Collect richer information to enable operators to perform
         more detailed problem analysis.

   This document proposes a new OAM protocol, TraceFlow, that attempts
   to bridge the gap between today's fast-evolving networks and the
   traditional OAM tools. Section 3 presents the packet formats used by
   TraceFlow first, to avoid forward references in subsequent sections.
   First-time readers may wish to skip Section 3 and start with Protocol
   Operation in Section 4. Application scenarios are discussed in
   Section 5 and security considerations in Section 6.

3. Packet Formats 

3.1. Flow Discovery Request/Response Packet Format 

   Flow Discovery Request and Response packets follow the general format
   shown below. The TLVs included in each message type may be
   different.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |     Version   |   Hopcount    |          Length               | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |     Type      |   Reserved    |         Query ID              | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |    16-byte opaque System Identifier of the Requestor.         | 
   //                                                             //
   |    Used as a unique identifier of the system requesting.      | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |    16-byte opaque System Identifier of the Responder.         | 
   //                                                             //
   |    Used as a unique identifier of the system Responding.      | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
   |                             TLVs...                           | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 


   The Flow Discovery Request packet SHOULD be sent with the DF bit set
   in the external IP header. 

   Version: The version number of the protocol. This document defines
   protocol version 1. 

   Hopcount: Allows keeping track of the number of transit nodes that
   processed the Flow Discovery Request packet. This field is
   decremented at each device that processes the Flow Discovery Request
   packet. This field also helps in determining if there were any legacy
   devices not supporting TraceFlow protocol along the way. 

   Length: Length of the packet, including the header. The length of
   the payload is obtained by subtracting the header length from the
   value of this field.

      Type: 1   Direct Flow Discovery Request - Ping mode

            2   Direct Flow Discovery Request - Traceroute mode

            3   Indirect Flow Discovery Request - Ping mode

            4   Indirect Flow Discovery Request - Traceroute mode

            5   Response for the Flow Discovery Request

   Reserved: This field should be set to zero on transmission and
   ignored on receipt. A later version of the protocol may define a use
   for it.

   Query ID: A unique identifier generated by the originator that allows
   it to correlate the responses from the transit nodes with the Flow
   Discovery Request packet it generated.

   System Identifier (Requestor and Responder): An opaque 16-byte field
   that is unique per node in the network. Administrators define what
   the identifier means within their network, as long as they ensure
   that it is unique across all nodes in that network. The Requestor
   fills in its System Identifier in the request packet. The Responder
   copies the Requestor field from the received packet and fills in the
   Responder field with its own System Identifier. Thus the Flow
   Discovery Request packet contains the Requestor System Identifier,
   and the Response packet contains both the Requestor and Responder
   System Identifiers.
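
   The common header described above can be sketched as follows. This
   is a minimal illustration, not a reference implementation. The field
   widths are read off the diagram; network byte order and the helper
   names pack_header and unpack_header are assumptions that do not
   appear in this document.

```python
import struct

# Layout taken from the header diagram: Version(1) Hopcount(1) Length(2)
# Type(1) Reserved(1) Query ID(2) Requestor ID(16) Responder ID(16).
# Assumption: all fields are in network byte order.
HEADER = struct.Struct("!BBHBBH16s16s")   # 40 bytes in total


def pack_header(hopcount, msg_type, query_id, requestor_id,
                responder_id=bytes(16), tlvs=b"", version=1):
    """Build the common header followed by already-encoded TLVs."""
    length = HEADER.size + len(tlvs)      # Length covers header + payload
    return HEADER.pack(version, hopcount, length, msg_type, 0,
                       query_id, requestor_id, responder_id) + tlvs


def unpack_header(packet):
    """Return the header fields as a dict, plus the TLV payload bytes."""
    (version, hopcount, length, msg_type, _reserved, query_id,
     requestor_id, responder_id) = HEADER.unpack_from(packet)
    fields = {"version": version, "hopcount": hopcount, "length": length,
              "type": msg_type, "query_id": query_id,
              "requestor_id": requestor_id, "responder_id": responder_id}
    # Payload length is Length minus the fixed header length.
    return fields, packet[HEADER.size:length]
```

   In a request the Responder field is left as zeros in this sketch; a
   responder would copy the Requestor field and fill in its own
   identifier before replying.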

      The TLVs are divided into three categories:  

      1. TLVs that can show up in the Flow Discovery Request packet 

      2. TLVs that can show up in the Flow Discovery Response packet 

      3. TLVs that can show up in the Flow Discovery Request as well as 
        Response packet 

   TLVs that an implementation of an earlier protocol version does not
   understand are ignored by that implementation. Such TLVs SHOULD be
   treated as opaque and passed along to the next transit device along
   the path; that is, they are transitive for versions of the protocol
   that do not understand them.

3.2. Flow Discovery Request TLVs 

3.2.1. Flow Descriptor TLV 

   This TLV is included in the Flow Discovery Request packet and
   identifies the traffic flow that the originator device is interested
   in probing. This is a mandatory TLV. 

   The definition of a traffic flow varies from one network to another.
   Most traffic flows in today's networks can be uniquely identified
   using fields from the data packet's headers. The TraceFlow protocol
   requires the first 256 bytes of the traffic flow's data packet to be
   encoded in this Flow Descriptor TLV. For version 1 and all later
   versions, these 256 bytes SHOULD include the Layer 2 headers as well.
   This way, when TraceFlow supports Layer 2 devices, the information in
   the 256 bytes will help to discover intermediate Layer 2 devices as
   well.

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Length               |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                    Value...                   |    padding    |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 


   Type: The type of the TLV. In this case, the value is 1 meaning Flow
   Descriptor TLV 

   Code: The Code identifies the sub-type of the TLV. In this case, this
   field is not defined. It SHOULD be set to 0. 

   Length: The length of the TLV 

   Value: The value encoded in this TLV depending on the Type and the
   Code specified 

   Padding: This might be necessary to ensure the packet ends on a word
   boundary
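
   A minimal sketch of encoding and walking TLVs in this layout is
   given below. It assumes that Length counts only the unpadded Value
   bytes, that padding rounds each TLV up to a 4-byte word, and that
   fields are in network byte order; none of these points is stated in
   this document. The function names are illustrative only.

```python
import struct

TLV_HEADER = struct.Struct("!BBH")  # Type(1) Code(1) Length(2)
FLOW_DESCRIPTOR = 1                 # Type value given in the text
MAX_FLOW_BYTES = 256                # "first 256 bytes" of the data packet


def encode_tlv(tlv_type, code, value):
    """Encode one TLV and pad it to a 4-byte boundary.

    Assumption: Length counts only the unpadded Value bytes.
    """
    padding = -len(value) % 4
    return TLV_HEADER.pack(tlv_type, code, len(value)) + value + bytes(padding)


def iter_tlvs(payload):
    """Yield (type, code, value) for each TLV found in a payload."""
    offset = 0
    while offset + TLV_HEADER.size <= len(payload):
        tlv_type, code, length = TLV_HEADER.unpack_from(payload, offset)
        start = offset + TLV_HEADER.size
        if start + length > len(payload):
            raise ValueError("truncated TLV")
        yield tlv_type, code, payload[start:start + length]
        offset = start + length + (-length % 4)   # skip value and padding


def flow_descriptor_tlv(data_packet):
    """Wrap the leading bytes of a data packet (Layer 2 header included)."""
    return encode_tlv(FLOW_DESCRIPTOR, 0, data_packet[:MAX_FLOW_BYTES])
```

   Because iter_tlvs returns every TLV it encounters, a caller can
   process the types it understands and carry the remaining ones along
   unchanged.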

   Refer to Section 3.4.1 (Encapsulated Packet TLV), which describes how
   a data packet can be used to specify the traffic flow.

3.2.2. Originator Address TLV 

   This TLV carries the address of the originator of the Flow Discovery
   Request packet. The responses from the intermediate devices
   processing the request are sent to this address. This is an optional
   TLV, to be included only when an Indirect Flow Discovery Request is
   originated.

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Length               |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                  Value...                     |    padding    |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 


      Type: 2  Originator Address 

      Code: 1  IPv4 Address 

            2  IPv6 Address 
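
   The Originator Address TLV could be encoded along the following
   lines. This is a sketch under assumptions: Type, Code and Length are
   taken to be 1, 1 and 2 bytes in network byte order, and Length is
   taken to count only the address bytes. The function names are
   hypothetical.

```python
import ipaddress
import struct

ORIGINATOR_ADDRESS = 2          # Type value from the text
CODE_IPV4, CODE_IPV6 = 1, 2     # Code values from the text


def originator_address_tlv(address):
    """Encode an Originator Address TLV for an IPv4 or IPv6 address.

    Assumption: Length counts only the 4 or 16 address bytes, which are
    already word-aligned, so no padding is needed.
    """
    ip = ipaddress.ip_address(address)
    code = CODE_IPV4 if ip.version == 4 else CODE_IPV6
    value = ip.packed
    return struct.pack("!BBH", ORIGINATOR_ADDRESS, code, len(value)) + value


def decode_originator_address(tlv):
    """Return the address carried in an Originator Address TLV."""
    tlv_type, code, length = struct.unpack_from("!BBH", tlv)
    if tlv_type != ORIGINATOR_ADDRESS or code not in (CODE_IPV4, CODE_IPV6):
        raise ValueError("not an Originator Address TLV")
    return ipaddress.ip_address(tlv[4:4 + length])
```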

3.2.3. Information Request bitmap TLV 

   This TLV is used by the originator device to specify the information
   requested for the flow identified by the Flow Descriptor TLV in the
   Flow Discovery Request packet. This is an optional TLV. In the
   absence of this TLV, the transit and end devices processing the Flow
   Discovery Request packet respond with the default set of
   information.

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Length               |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                           Flags...                            |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 


      Type: 3  Information Request 

      Code: 1  Incoming Interface related 

            2  Outgoing Interface related 

      Flags: 

      Bit 0 : IP Address 

      Bit 1 : SNMP ifName 

      Bit 2 : SNMP ifIndex and ifType 

      Bit 3 : LAG details

      Bit 4 : ECMP details. To be specified only for Outgoing interface.

      Bit 5 : Hash algorithm. To be specified only for Outgoing
              interface.

   Note that the Hash algorithm mask TLVs can be specified in the
   response packet, but the actual hash algorithm need not be specified
   in the response packet.

      Code: 3  Global information 

      Flags: 

      Bit 0 : Next Hop Router Address  
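
   The flag bitmap above might be built as shown in this sketch. The
   document does not say whether "Bit 0" is the most or the least
   significant bit of the 32-bit Flags word; the sketch assumes the
   least significant bit, network byte order, and a Length of 4 that
   counts only the Flags word. All names are illustrative.

```python
import struct
from enum import IntFlag

INFORMATION_REQUEST = 3                    # Type value from the text
CODE_INCOMING, CODE_OUTGOING, CODE_GLOBAL = 1, 2, 3
NEXT_HOP_ROUTER_ADDRESS = 1 << 0           # Bit 0 for Code 3


class InterfaceInfo(IntFlag):
    """Flags for Code 1 and 2. Assumption: bit 0 is the least significant."""
    IP_ADDRESS = 1 << 0
    IF_NAME = 1 << 1
    IF_INDEX_AND_TYPE = 1 << 2
    LAG_DETAILS = 1 << 3
    ECMP_DETAILS = 1 << 4      # outgoing interface only
    HASH_ALGORITHM = 1 << 5    # outgoing interface only


OUTGOING_ONLY = InterfaceInfo.ECMP_DETAILS | InterfaceInfo.HASH_ALGORITHM


def information_request_tlv(code, flags):
    """Encode an Information Request TLV with a 32-bit Flags word."""
    if code == CODE_INCOMING and flags & OUTGOING_ONLY:
        raise ValueError("ECMP and hash details apply to the outgoing "
                         "interface only")
    # Assumption: Length counts the 4 bytes of the Flags word.
    return struct.pack("!BBHI", INFORMATION_REQUEST, code, 4, int(flags))
```

   A request for the IP address and LAG details of the outgoing
   interface would then be information_request_tlv(CODE_OUTGOING,
   InterfaceInfo.IP_ADDRESS | InterfaceInfo.LAG_DETAILS).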
 



3.2.4. Termination TLV 

   This TLV includes a list of addresses. If a device notices that it
   owns any of the addresses listed in this TLV, it MUST NOT forward the
   Flow Discovery request packet any further and MUST respond to the
   originator with a Flow Discovery Response packet. 

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Length               |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |  Address-type |  Address...                                   |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |  Address-type |  Address...                                   |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   //                                                             //   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |  Address-type |  Address...                                   |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 


      Address-type: 

      0x1: IPv4 Address 

      0x2: IPv6 Address 

      Address: The address where the request MUST be terminated. 
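
   The termination check can be sketched as follows. The diagram does
   not make clear how entries are aligned, so the sketch assumes that
   each entry is a 1-byte Address-type followed immediately by a 4-byte
   or 16-byte address with no per-entry padding. The function names
   and the way local addresses are supplied are hypothetical.

```python
import ipaddress

ADDRESS_SIZES = {0x1: 4, 0x2: 16}   # Address-type -> address length in bytes


def termination_addresses(value):
    """Decode the address list carried in a Termination TLV value.

    Assumption: entries are packed back to back as a 1-byte Address-type
    followed by the address, with no per-entry padding.
    """
    addresses, offset = [], 0
    while offset < len(value):
        size = ADDRESS_SIZES.get(value[offset])
        if size is None or offset + 1 + size > len(value):
            raise ValueError("malformed Termination TLV")
        raw = value[offset + 1:offset + 1 + size]
        addresses.append(ipaddress.ip_address(raw))
        offset += 1 + size
    return addresses


def must_terminate(value, local_addresses):
    """True if this device owns any listed address, so it MUST NOT forward."""
    owned = {ipaddress.ip_address(a) for a in local_addresses}
    return any(addr in owned for addr in termination_addresses(value))
```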

3.3. Flow Discovery Response TLVs 

3.3.1. Information Response TLV 

   This TLV is used by the devices processing the Flow Discovery Request
   packet to provide the information requested by the originator device.
   This is a mandatory TLV. It should be included in the response sent
   to the device originating the Flow Discovery Request packet.

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Sub-Code             |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |          Length               |    Value...   |    padding    |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 


      Type: 5  Information Response 

      Code: 1  Incoming Interface related 

            2  Outgoing Interface related 

      Sub-Code:  

       0 : IP Address 

       1 : SNMP ifName 

       2 : SNMP ifIndex and ifType 

       3 : LAG details

       4 : ECMP details. To be specified only for Outgoing interface.

       5 : Hash algorithm. To be specified only for Outgoing interface.
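
   Unlike the request TLVs, the Information Response TLV carries a
   16-bit Sub-Code ahead of its Length. A sketch of that layout is
   given below. It assumes network byte order, a Length that counts
   only the Value bytes, and padding of the whole TLV to a 4-byte
   word. The function names and the encoding of each Value are
   illustrative, not defined by this document.

```python
import struct

INFO_RESPONSE = 5                           # Type value from the text
RESPONSE_HEADER = struct.Struct("!BBHH")    # Type, Code, Sub-Code, Length

INTERFACE = {1: "incoming", 2: "outgoing"}
SUB_CODES = {0: "IP Address", 1: "SNMP ifName", 2: "SNMP ifIndex and ifType",
             3: "LAG details", 4: "ECMP details", 5: "Hash algorithm"}


def encode_information_response(code, sub_code, value):
    """Encode one Information Response TLV, padded to a 4-byte boundary."""
    if sub_code in (4, 5) and code != 2:
        raise ValueError("ECMP and hash details apply to the outgoing "
                         "interface only")
    head = RESPONSE_HEADER.pack(INFO_RESPONSE, code, sub_code, len(value))
    padding = -(len(head) + len(value)) % 4
    return head + value + bytes(padding)


def decode_information_response(tlv):
    """Return (interface, sub-code name, value) for one response TLV."""
    tlv_type, code, sub_code, length = RESPONSE_HEADER.unpack_from(tlv)
    if tlv_type != INFO_RESPONSE:
        raise ValueError("not an Information Response TLV")
    start = RESPONSE_HEADER.size
    return INTERFACE[code], SUB_CODES[sub_code], tlv[start:start + length]
```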

   The LAG and ECMP details are described further here. The following
   is the TLV format used if the originator device requested LAG or
   ECMP related details.

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Sub-Code             |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |          Length               |       No. of members          |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                    Component Link Information..               |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   //                                                             //   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                    Component Link Information..               |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 

   No. of members: This is the number of members in the LAG or the ECMP
   segment that is being described 

   Component Link Information: Individual component links are encoded in
   this field. The "No. of members" field describes how many component
   links are listed. 

   The frame format for the "Component Link Information" portion of the
   TLV is shown below. 

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                      SNMP ifIndex                             |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                      SNMP ifType                              |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |           Flags               |       SNMP ifName length      |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                      SNMP ifName...                           |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 


      SNMP ifIndex: The ifIndex of the component link being specified 

      SNMP ifType: The ifType of the component link being specified 

      Flags: 

      0x1: If set, the Component Link is administratively down. 

      0x2: If set, the Component Link is operationally down. 

      If the above cannot be determined then the flags SHOULD be set to
   0.

      The rest of the bits in the Flags field are reserved. 
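
   A decoder for the Component Link Information entries could look
   like the sketch below. It assumes network byte order, an ifName
   stored as ASCII, and padding after each ifName so that the next
   entry starts on a 4-byte boundary. These details and the function
   name are assumptions, not part of this document.

```python
import struct

LINK_FIXED = struct.Struct("!IIHH")   # ifIndex, ifType, Flags, ifName length
ADMIN_DOWN, OPER_DOWN = 0x1, 0x2      # Flag values from the text


def parse_component_links(data, member_count):
    """Decode `member_count` Component Link Information entries.

    Assumptions: network byte order, ifName stored as ASCII, and each
    ifName padded so the next entry starts on a 4-byte boundary.
    """
    links, offset = [], 0
    for _ in range(member_count):
        if_index, if_type, flags, name_len = LINK_FIXED.unpack_from(data, offset)
        offset += LINK_FIXED.size
        name = data[offset:offset + name_len].decode("ascii")
        offset += name_len + (-name_len % 4)   # skip the name and its padding
        links.append({"if_index": if_index, "if_type": if_type,
                      "if_name": name,
                      "admin_down": bool(flags & ADMIN_DOWN),
                      "oper_down": bool(flags & OPER_DOWN)})
    return links
```

   The member_count argument corresponds to the "No. of members" field
   of the enclosing LAG or ECMP details TLV.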

3.3.1.1 Utilization Anomaly TLV

   An optional TLV is also defined to report a LAG utilization anomaly.
   The user can configure a threshold for the divergence in utilization
   between the least utilized member and the most utilized member of
   the LAG. For example, if the threshold is configured as 80% and the
   difference in utilization between the least utilized member and the
   most utilized member of the LAG exceeds that threshold, an anomaly
   TLV is sent to report the condition. On receiving this Utilization
   Anomaly TLV, the originator device can report it to the user, and a
   subsequent NMS query to the appropriate device can reveal more
   information about the anomaly.
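
   The threshold check can be illustrated with the following sketch.
   It assumes that divergence is measured as the difference, in
   percentage points, between the utilization of the most and least
   utilized members, and that an anomaly is reported only when that
   difference exceeds the configured threshold. The function name and
   the shape of its input are hypothetical.

```python
def lag_utilization_anomaly(utilization, threshold):
    """Return anomaly details if LAG members diverge beyond `threshold`.

    `utilization` maps each member's SNMP ifIndex to its utilization in
    percent. Assumption: divergence is the difference, in percentage
    points, between the most and least utilized members.
    """
    if len(utilization) < 2:
        return None                      # nothing to compare against
    least = min(utilization, key=utilization.get)
    most = max(utilization, key=utilization.get)
    divergence = utilization[most] - utilization[least]
    if divergence <= threshold:
        return None                      # within the configured threshold
    return {"threshold": threshold, "least_used_ifindex": least,
            "most_used_ifindex": most, "divergence": divergence}
```

   The returned values correspond to the threshold, ifIndex and
   divergence fields of the TLV.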

   The TLV format for this Utilization anomaly TLV would be as follows.

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Sub-Code             |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |          Length               |Configured Divergence Threshold|  
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |      SNMP ifIndex of Least used component link in the LAG     |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |      SNMP ifIndex of Most  used component link in the LAG     |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |Actual Divergence in percentage|   Padding...                  |  
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 

   Note that a device that reports this optional TLV to the originator
   should keep track of the rate at which each member of the LAG is
   forwarding traffic and report the divergence in terms of that rate.
   If the implementation cannot keep track of the rate, it has to
   report the divergence in terms of packet counts. The latter might
   lead to misinterpretation in the case of link up/down events or
   other conditions.

   The following TLV format specifies the packet fields that are used by
   the hash algorithm configured on the device.

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Sub-Code             |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |          Length               |       No. of hash parameters  |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |      byte-offset-1            |         no. of bytes          |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |      byte-offset-2            |         no. of bytes          |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |      ...                                                      |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                     Encapsulated Packet ...                   |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 


   No. of hash parameters: The number of parameters in the packet that
   the hash algorithm uses to calculate the egress port.

   Byte-offset-N: The offset to the start of the Nth parameter that the
   hash algorithm uses to calculate the egress port.

   No. of bytes: The number of bytes, starting at the corresponding
   byte-offset, that the hash algorithm uses.

   Encapsulated Packet: The encapsulated packet that the device received
   in the Flow Discovery Request packet on its input port, returned in
   the response packet. This should be the packet that the device
   processing the Flow Discovery Request packet uses in its egress
   component link calculations.

   Note that Hash algorithm mask TLVs can be included in the response
   packet, but the hash algorithm itself need not be specified in the
   response packet.
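
   As an illustration of how an originator might consume this TLV, the
   following Python sketch parses the fixed fields and the
   (byte-offset, no. of bytes) pairs, then extracts the hashed bytes
   from the returned packet. The function names are hypothetical. The
   sketch assumes that multi-octet fields are in network byte order and
   that the encapsulated packet occupies the rest of the buffer; this
   section of the draft does not state either point.

```python
import struct


def parse_hash_mask_tlv(buf: bytes) -> dict:
    """Parse the fixed fields and (byte-offset, no. of bytes) pairs of the TLV."""
    # Type (8 bits), Code (8), Sub-Code (16), Length (16), No. of hash parameters (16)
    tlv_type, code, sub_code, length, count = struct.unpack_from("!BBHHH", buf, 0)
    params = []
    pos = 8
    for _ in range(count):
        byte_offset, num_bytes = struct.unpack_from("!HH", buf, pos)
        params.append((byte_offset, num_bytes))
        pos += 4
    # Assumption: everything after the last pair is the encapsulated packet.
    return {"type": tlv_type, "code": code, "sub_code": sub_code,
            "length": length, "params": params, "packet": buf[pos:]}


def hashed_fields(params, packet: bytes):
    """Return the byte strings of the packet that feed the hash algorithm."""
    return [packet[offset:offset + size] for offset, size in params]
```

   With these pieces an originator could display, per hop, exactly
   which bytes of the flow packet influenced the egress link choice
   without knowing the vendor's hash function.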









 




   The following TLV is mandatory.

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Sub-Code             |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |          Length               |    Reserved                   |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |  Address-type |  Next Hop Address ...                         |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 


      The next hop address is encoded as shown above. 

      Code: 3  Global information 

      Sub-Code:  

      1 Next Hop Address 

      Address-type: 

      0x1: IPv4 Address 

      0x2: IPv6 Address 

      Next Hop Address: This field carries the next hop address. 
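
   The encoding above can be sketched in Python as follows. The
   function name is hypothetical, and the Type value is passed in as a
   parameter because it is not stated in this part of the draft. The
   sketch also assumes that Length counts all octets of the TLV and
   that no padding follows the address; both are assumptions rather
   than rules taken from the text.

```python
import ipaddress
import struct

CODE_GLOBAL_INFORMATION = 3   # "Code: 3  Global information"
SUB_CODE_NEXT_HOP = 1         # "1 Next Hop Address"


def encode_next_hop_tlv(tlv_type: int, next_hop: str) -> bytes:
    """Encode a Next Hop Address TLV for an IPv4 or IPv6 next hop."""
    addr = ipaddress.ip_address(next_hop)
    addr_type = 0x1 if addr.version == 4 else 0x2
    body = bytes([addr_type]) + addr.packed
    length = 8 + len(body)  # assumption: Length covers the whole TLV
    # Type (8), Code (8), Sub-Code (16), Length (16), Reserved (16)
    header = struct.pack("!BBHHH", tlv_type, CODE_GLOBAL_INFORMATION,
                         SUB_CODE_NEXT_HOP, length, 0)
    return header + body
```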




















 




3.3.2. Result TLV 

   The device processing the Flow Discovery Request packet includes a
   Result TLV in the response to the originator device to indicate the
   result of the processing. This TLV is mandatory. 

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Length               |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |  Result Code  |  Sub-code     |   Diagnostic Data..           |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                  Diagnostic Data...           |    padding    |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 


      Type: 7  Result TLV  

   Result Code: This field carries a value indicating the result of the
   processing of the Flow Discovery Request packet 

   Sub-Code: This field further qualifies the "Result Code" field and
   provides more information about the result of processing the Flow
   Discovery Request packet 

   Diagnostic Data: This field is used in conjunction with the "Result
   Code" and "Sub-code" to return any information that may be useful to
   the originator of the Flow Discovery Request packet. Its format is
   defined based on the "Result Code" and "Sub-code" field. 

      Result Code: 1  Success  

      Result Sub-code: 0   



      Result Code: 2  Administratively disabled   

      Result Sub-code: 0   

   Diagnostic Data: A list of the Information Request Sub-Codes that are
   not being fulfilled. These Sub-Codes can indicate whether the
   outgoing interface is currently disabled. If the hardware forwarding
   tables point to an interface that has been administratively disabled,
   those tables are in error, which suggests that the software state is
   not in sync with the hardware.
 




      Result Code: 3  Routing failure  

      Result Sub-code: 1  No route in table  

      Result Sub-code: 2  RPF check failed  

      Result Sub-code: 3  ARP Failure.



      Result Code: 4  Packet Error  

      Result Sub-code: 1  hopcount = 0  

   This covers the case where the TTL has counted down to 0 in IPv4 or
   the Hopcount has counted down to 0 in IPv6. Even if the ICMP "Time to
   Live Exceeded" packets are dropped on the way back, this result lets
   the Originator determine that the TTL counted down to zero.

      Result Code: 5  Malformed packet  

      Result Sub-code: 1 Unknown TLVs for this version.

   In this case the packet is not dropped; it is forwarded to the next
   transit device with the unknown TLVs intact. This allows a device
   running an older version of the protocol to report back to the
   originator that it processed the packet but encountered one or more
   unknown TLVs.


      Result Code: 6  Data-path Error 

      Result Sub-code: 1  Fragmentation needed but not allowed by Flow
   Information TLV in Flow Discovery Request packet 


      Result Code: 7  Generic Error  

      Result Sub-code: 0  (TBD: Sub-codes to identify the type of error
   may need to be defined) 
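
   The Result Codes and Sub-codes listed above lend themselves to a
   simple lookup table at the originator. The Python sketch below
   decodes the fixed part of a Result TLV and maps the (Result Code,
   Sub-code) pair to text. The function name is hypothetical, the field
   order follows the diagram above, and network byte order is assumed.
   Diagnostic Data is returned uninterpreted, including any padding,
   because its format depends on the code pair.

```python
import struct

TYPE_RESULT_TLV = 7

RESULT_TEXT = {
    (1, 0): "Success",
    (2, 0): "Administratively disabled",
    (3, 1): "Routing failure: no route in table",
    (3, 2): "Routing failure: RPF check failed",
    (3, 3): "Routing failure: ARP failure",
    (4, 1): "Packet error: hopcount = 0",
    (5, 1): "Malformed packet: unknown TLVs for this version",
    (6, 1): "Data-path error: fragmentation needed but not allowed",
    (7, 0): "Generic error",
}


def decode_result_tlv(buf: bytes):
    """Return (result_code, sub_code, text, diagnostic_data) for a Result TLV."""
    # Type (8), Code (8), Length (16), Result Code (8), Sub-code (8)
    tlv_type, _code, _length, result, sub = struct.unpack_from("!BBHBB", buf, 0)
    if tlv_type != TYPE_RESULT_TLV:
        raise ValueError("not a Result TLV")
    text = RESULT_TEXT.get((result, sub), "Unknown result")
    return result, sub, text, buf[6:]
```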







 




3.3.3. Additional Informational Code TLV 

   This TLV may accompany the Result TLV if the device processing the
   Flow Discovery Request packet has any additional information that the
   originator device may be interested in. This TLV is optional. 

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Length               |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |  Status Code  |  Sub-code     |   Additional Data..           |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |                  Additional Data...                           |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 


      Type: 8 Additional Informational Code 

      Status code: 1  ACL drop  

      Status Sub-code: 1  Ingress ACL drop  

      Status Sub-code: 2  Egress ACL drop  



      Status code: 2  Dataplane failure  

      Status Sub-code: 1  Switch fabric failure  

      Status Sub-code: 2  Linecard failure  

      Status Sub-code: 3  Port failure  


      Status Code: 3  Generic Information  

      Status Sub-code: 1  TTL/Hopcount mismatch noticed  

      Status Sub-code: 2  Default route used to forward packet  

      Status Sub-code: 3  Per-packet load-balancing enabled.

   In case of a TTL/Hopcount mismatch, the "Additional Data" field
   carries the difference between the Hopcount and the IP TTL field
   values. This may provide an indication of the number of previous hop
   routers that did not support the TraceFlow protocol.
 




3.4. TLVs common to Flow Discovery Request and Response 

3.4.1. Encapsulated Packet TLV 

   This TLV is included in the Flow Discovery Request and is returned in
   the Flow Discovery Response packet by devices processing the request
   packet. In the response packet, this TLV contains the encapsulated
   packet as it was received from the previous-hop device. It helps the
   originator keep track of how the data packet gets modified along the
   way. This TLV is mandatory. 

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Length               |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |          Flags                |   First Hdr   | Reserved      |    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Encapsulated Packet...                                    |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 


      Type: 1 Flow Discovery Request 

      Code: 1 Encapsulated traffic flow data packet 

      Encapsulated Packet: The first 256 bytes of a data packet
   belonging to the flow are encapsulated in this field of the packet 

      Flags: 

      0x1: fan-out option; if set, the transit node SHOULD forward the
   Flow Discovery Request packet to all possible egress links for the
   specified flow. Because the fan-out option can create one instance of
   the packet for each possible egress link in a LAG or ECMP situation,
   it should be used with caution. The device should provide an
   administrative configuration knob to turn this option on or off. Even
   if fan-out is requested in the Flags, fan-out discovery is performed
   only if the transit device permits it through that knob.

      First Hdr: Specifies the first header that appears in the
   encapsulated packet. The values defined by this document are: 

      0x1: Layer 2 MAC Header 

      0x2: IPv4 Header

      0x3: IPv6 Header

      0x4: MPLS Header 
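
   A minimal Python sketch of how an originator might build this TLV is
   shown below. The Type, Code, flag, and First Hdr values come from
   the text above; the function name is hypothetical, and the sketch
   assumes network byte order and that Length counts all octets of the
   TLV, neither of which is stated in this section.

```python
import struct

TYPE_FLOW_DISCOVERY_REQUEST = 1
CODE_ENCAPSULATED_PACKET = 1
FLAG_FAN_OUT = 0x1
FIRST_HDR = {"mac": 0x1, "ipv4": 0x2, "ipv6": 0x3, "mpls": 0x4}
MAX_ENCAPSULATED_BYTES = 256  # only the first 256 bytes are carried


def build_encapsulated_packet_tlv(packet: bytes, first_hdr: str,
                                  fan_out: bool = False) -> bytes:
    """Build an Encapsulated Packet TLV from a sample packet of the flow."""
    payload = packet[:MAX_ENCAPSULATED_BYTES]
    flags = FLAG_FAN_OUT if fan_out else 0
    length = 8 + len(payload)  # assumption: Length covers the whole TLV
    # Type (8), Code (8), Length (16), Flags (16), First Hdr (8), Reserved (8)
    header = struct.pack("!BBHHBB", TYPE_FLOW_DISCOVERY_REQUEST,
                         CODE_ENCAPSULATED_PACKET, length, flags,
                         FIRST_HDR[first_hdr], 0)
    return header + payload
```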













































 




3.4.2. Encapsulated Packet Mask TLV 

   This TLV allows the operator to specify what portion of the
   encapsulated packet carries flow data and what portion is left
   unspecified. This allows the intermediate nodes to determine if they
   have enough information to calculate an egress interface to forward
   the Flow Discovery Request packet. If this TLV is omitted from the
   Flow Discovery Request packet, no portion of the packet is left
   unspecified and the transit device may use any of the fields to make
   the forwarding decision. This TLV is optional. 

      This TLV includes a sequence of <byte-offset, number of bytes,
   mask> tuples.

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Length               |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |    No. of tuples              |      byte-offset-1            |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |      no. of bytes             |      byte-mask-1              |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |      byte-offset-2            |      no. of bytes             |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |      byte-mask-2              |      ...                      |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 


      No. of tuples: Total number of <byte-offset, no. of bytes, mask>
   tuples carried in this TLV.

      Byte-offset: The byte offset of the field being specified.

      No. of bytes: The number of bytes to consider, starting at the
   byte-offset.

      Mask: The mask to be applied to the bytes starting at the
   byte-offset. Within the span of "no. of bytes" bytes, the mask
   selects the bits that are to be used in calculating the egress
   interface on which to forward the Flow Discovery Request packet.
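
   The tuple layout can be read back with a few lines of Python, as
   sketched below. The names are hypothetical and network byte order is
   assumed. The sketch only parses the tuples; it does not apply the
   masks, because the text does not say how the 16-bit mask field maps
   onto a field longer than two bytes.

```python
import struct
from typing import List, NamedTuple


class MaskTuple(NamedTuple):
    byte_offset: int
    num_bytes: int
    mask: int


def parse_mask_tuples(buf: bytes) -> List[MaskTuple]:
    """Parse the <byte-offset, no. of bytes, mask> tuples of the TLV."""
    # Type (8), Code (8), Length (16), No. of tuples (16)
    _type, _code, _length, count = struct.unpack_from("!BBHH", buf, 0)
    tuples = []
    pos = 6
    for _ in range(count):
        # Each tuple is three 16-bit fields, as drawn in the diagram.
        tuples.append(MaskTuple(*struct.unpack_from("!HHH", buf, pos)))
        pos += 6
    return tuples
```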






 




3.4.3. Record Route TLV 

   This TLV is used to record the information about the path taken by a
   Flow Discovery Request packet as it traverses through the network. It
   is included by the originator and each transit device processing the
   Flow Discovery Request packet includes information about its incoming
   interface in this TLV. This TLV is included in the response sent by
   the transit nodes (in trace-route mode) to the originator of the Flow
   Discovery Request packet. This TLV is optional. However, if it is
   included by the originator node in the Flow Discovery Request packet,
   the subsequent nodes SHOULD prepend their incoming interface address
   to the list of addresses.

      0                   1                   2                   3     
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1    
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |     Type      |     Code      |          Length               |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   
   |  Address-type |  Incoming interface Address...                |   
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 

      Type : 9  Record-Route TLV 

      Code: 1

      Address-type: 

      0x1: IPv4 Address 

      0x2: IPv6 Address 

      Incoming interface Address: This field carries the incoming
   interface address at the device processing the Flow Discovery Request
   packet.  Each node receiving the request packet with this TLV should
   prepend its incoming interface address to this TLV. 

   The device SHOULD include the Record-Route TLV, as it was received on
   its input interface, in the Flow Discovery Response packet it sends
   out.
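
   The prepend behaviour can be sketched in Python as follows. The
   function names are hypothetical. The sketch assumes that each
   recorded hop is stored as an Address-type octet followed by the
   address, that entries are simply concatenated, and that Length
   counts all octets of the TLV; the draft shows only a single entry in
   the diagram, so these are assumptions.

```python
import ipaddress
import struct

TYPE_RECORD_ROUTE = 9
CODE_RECORD_ROUTE = 1
ADDRESS_SIZE = {0x1: 4, 0x2: 16}  # Address-type -> address length in octets


def new_record_route_tlv() -> bytes:
    """Empty Record-Route TLV as the originator might include it."""
    return struct.pack("!BBH", TYPE_RECORD_ROUTE, CODE_RECORD_ROUTE, 4)


def prepend_incoming_address(tlv: bytes, incoming: str) -> bytes:
    """Return a copy of the TLV with this node's incoming address prepended."""
    addr = ipaddress.ip_address(incoming)
    entry = bytes([0x1 if addr.version == 4 else 0x2]) + addr.packed
    entries = entry + tlv[4:]
    header = struct.pack("!BBH", TYPE_RECORD_ROUTE, CODE_RECORD_ROUTE,
                         4 + len(entries))
    return header + entries


def recorded_addresses(tlv: bytes):
    """List the recorded addresses, most recent hop first."""
    addresses = []
    pos = 4
    while pos < len(tlv):
        size = ADDRESS_SIZE[tlv[pos]]
        addresses.append(str(ipaddress.ip_address(tlv[pos + 1:pos + 1 + size])))
        pos += 1 + size
    return addresses
```

   Because each node prepends, the list read by the originator runs
   from the responding node back towards the first hop.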








 




4. Protocol Operation 

   A Flow Discovery Request packet is a UDP packet addressed to a well-
   known destination port. The source UDP port in the packet is
   ephemeral. The packet carries a "Flow Descriptor" TLV that allows the
   originator of the request to encode a flow data packet in the TLV.
   A UDP port is most useful on Layer 3 devices and on multi-layer
   devices that incorporate Layer 3 based forwarding. Hardware support
   is needed in the form of a programmable filter that inspects packets
   for the specific UDP destination port and punts matching packets to
   software. Layer 2 devices in L2 clouds are passed through, and so are
   MPLS LSRs. On pure L3 devices, installing the filter that enables
   Traceflow should be controlled by a per-device configuration knob.

   Certain fields in a traffic flow data packet get modified by the
   transit devices as the data packet traverses the network. A transit
   device that processes a Flow Discovery Request packet would need to
   edit those fields in the encapsulated data packet that represents the
   flow. Some such fields are source and destination MAC Addresses and
   MPLS label stack.  

   Consider a transit device that uses the source or destination MAC
   address of a data packet in order to determine the egress port. The
   transit device could choose to pick up the MAC addresses from the
   external header of the Flow Discovery Request packet or from the
   encapsulated packet. 

      TraceFlow can operate in two separate modes: 

      1) Trace-route mode: In the traceroute mode of operation, each
   transit device and the end node respond to the Flow Discovery Request
   packet by sending a flow discovery response. 

      2) Ping mode: Transit nodes do not send a response message to the
   originator. The rest of the behavior is the same as in traceroute
   mode.

   The following applies to Ping and Traceroute mode unless otherwise
   specified.

   The destination address of the Flow Discovery Request packet is the
   destination address for the desired traffic flow. In Ping mode a
   separate TLV, the Termination TLV, may be included that specifies a
   list of addresses. If a device processing the Flow Discovery Request
   packet finds that one of its IP addresses matches one of the
   addresses specified in the Termination TLV, the device MUST NOT
   forward the Flow Discovery Request packet further and instead sends a
   response packet to the originator.
 




   The Flow Discovery Request packet travels the exact same path that a
   data packet for the specified traffic flow would have followed. This
   includes the exact physical or logical interface that belongs to a
   LAG or a set of ECMP paths. This relies on the hardware supporting a
   mechanism that determines where the packet would be forwarded,
   reports the result to the software, and injects the packet towards
   the next hop along the way to the destination.

   If per-packet load balancing is enabled on a node on the way to the
   destination, the Flow Discovery Response from that node is ambiguous:
   another iteration of flow discovery packets through the node may be
   forwarded across an interface (logical or otherwise) different from
   the one used in the previous iteration. A node on which per-packet
   load balancing is enabled on its multipaths (ECMP or otherwise)
   therefore reports this configuration in the response packet; a status
   code is reserved for this purpose. When two or more ECMP or UCMP
   paths exist, the path taken by a Traceflow packet may differ entirely
   from the path taken by an actual data packet.

   The device interested in receiving information about the traffic flow
   originates a Flow Discovery Request packet. The Flow Descriptor TLV
   in this packet specifies the flow of interest whereas a Requested
   Information TLV specifies the flow related information that the
   originator device is requesting from each transit router. The Flow
   Discovery packet needs to be processed by all routers along the path
   to the destination. This can be achieved by using a well-known UDP
   port as the destination port in the UDP header. When a transit device
   receives a Flow Discovery Request packet, it reads the flow
   information from the Flow Descriptor TLV, looks up the local
   forwarding database(s) and determines an egress port or ports for
   this traffic flow. The transit device forwards the Flow Discovery
   packet along the egress port calculated using this lookup. The egress
   port is calculated based on the flow information from the Flow
   Descriptor TLV in the request packet and not based on destination IP
   address in the IP header of the Flow Discovery Request packet. 
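
   The key point of the paragraph above, that the egress port is chosen
   from the encapsulated flow packet rather than from the outer header
   of the request, can be illustrated with the Python sketch below. The
   function name and the member names in the usage are hypothetical,
   and CRC-32 modulo the member count merely stands in for whatever
   vendor-specific hash a real device uses; the draft does not
   prescribe a hash algorithm.

```python
import zlib


def select_egress(encapsulated_packet: bytes, hash_fields, members):
    """Pick the LAG/ECMP member that the described flow would hash to.

    hash_fields is a list of (byte_offset, num_bytes) pairs into the
    encapsulated packet; members is the ordered list of candidate links.
    """
    key = b"".join(encapsulated_packet[offset:offset + size]
                   for offset, size in hash_fields)
    # Stand-in hash (assumption): CRC-32 of the selected bytes.
    return members[zlib.crc32(key) % len(members)]
```

   Because only the selected bytes of the encapsulated packet feed the
   hash, two requests describing the same flow always map to the same
   member, which is what lets the request follow the data path.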

   When processing the Flow Discovery Request packet, the transit node
   MUST consider the packet length specified in the encapsulated packet
   in the Flow descriptor TLV. 

   The transit device also gathers the relevant information for the flow
   which could include details such as: 

      1. incoming and outgoing interface related details such as
   ifIndex, IP Address, LAG and ECMP related information.

      2. Next-Hop Router information 
 




   The transit device processing the Flow Discovery Request packet may
   choose to respond to only a subset of the information requested in
   the Flow Discovery Request packet. 

   The transit device includes additional information related to the
   incoming or outgoing LAG or ECMP interface. This additional
   information includes the number of LAG or ECMP links that are
   configured and their operational status and the parameters included
   in the hashing algorithm that is used to select an egress port for
   the traffic flow. 

   This information is sent back to the IP address specified as the
   Originator IP Address in the Flow Discovery Request packet. If an
   Indirect Request is used, the Originator TLV specifies this IP
   address; otherwise the source IP address in the outer header is the
   Originator address.

   The Flow Discovery Request packet includes a hop count field which is
   initialized to the same value as the IP header's TTL field. This hop
   count field is decremented by one at each intermediate hop router
   that processes the Flow Discovery Request. In conjunction with the
   TTL field in the IP header this hop count field can help determine if
   there are any intermediate routers that do not support the TraceFlow
   protocol. When an intermediate hop router detects that the hop count
   field is greater than the IP header TTL field it indicates that one
   or more previous hop routers do not support the TraceFlow protocol. 
   This information is added to the response sent to the Originator IP
   Address. Thus the first Traceflow-capable router that follows one or
   more devices not supporting Traceflow detects that those previous
   devices did not support it. The output at the Originator can be
   customized to display this in the following format:

   Device 1: (Description)          (Traceflow capable)

   Unknown Devices : n (where n >= 1)

   Device 2: (Description)          (Traceflow capable)
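
   The hop count comparison can be modelled with a short Python sketch
   that walks a path and produces output in the format shown above. The
   function name and the device names in the usage are hypothetical.
   The sketch assumes that a capable router compares the two fields on
   receipt, before its own decrement, and that the hop count is never
   resynchronized with the TTL, so the difference is cumulative and
   only its increase is attributed to the most recent gap. It folds the
   routers and the originator's display logic into one function purely
   for illustration.

```python
def trace(path, initial_ttl=64):
    """path: list of (name, supports_traceflow). Returns display lines."""
    ttl = hop_count = initial_ttl
    already_reported = 0  # mismatch attributed to earlier gaps
    lines = []
    for name, supports_traceflow in path:
        if supports_traceflow:
            mismatch = hop_count - ttl  # compared on receipt
            if mismatch > already_reported:
                lines.append(f"Unknown Devices : {mismatch - already_reported}")
                already_reported = mismatch
            lines.append(f"{name}: (Traceflow capable)")
            hop_count -= 1  # only Traceflow routers decrement the hop count
        ttl -= 1  # every router decrements the IP TTL
    return lines
```

   For a path of Device 1, two routers without Traceflow support, and
   Device 2, the sketch yields the three lines shown in the format
   above with n = 2.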

   The IP TTL field as well as the hopcount field SHOULD be initialized
   to values that limit the Flow Discovery Request packet to the desired
   network boundary. This may be required to restrict the Traceflow
   packets to specific boundaries within an administrative domain,
   provided that such boundaries are well defined within the domain.

   A router can originate periodic Flow Discovery Requests for a traffic
   flow. The Query ID field in the Flow Discovery Request packet helps
   the originator identify the responses from the transit routers as
   they process the request. 
 




   When a device along the path towards the destination processes a Flow
   Discovery Request packet, it may encounter an error condition that
   prevents it from continuing to process the packet. Some examples of
   such error conditions are:

      1. TraceFlow protocol has been administratively disabled 

      2. Unicast RPF check failed for the flow specified in the Flow
   Discovery Request packet 

      3. No route exists in the routing table to route the flow
   specified in the Flow Descriptor TLV. 

      4. IP TTL or the Hop Count field in the Flow Discovery Packet
   becomes zero. 

   The "Result TLV" is used to carry this information back to the
   originator of the Flow Discovery Request packet. 

   It is also possible that the device successfully processes the Flow
   Discovery Request packet but encounters a condition during the
   processing that may be of interest to the originator. Some examples
   of such conditions are:

      1. The flow specified in the Flow Descriptor TLV would be dropped
   due to Ingress ACL or Egress ACL policies 

      2. Dataplane failure may prevent the specified flow from being
   successfully switched/routed. 

      3. IP TTL and the Hop-count field in the Flow Discovery Request
   packet do not match possibly due to one or more previous hop routers
   not supporting the TraceFlow protocol. 

      4. The specified flow would be routed using default route in the
   routing table. 

   This information is returned to the originator of the Flow Discovery
   Request packet using the "Additional Informational Code TLV".

   The originator of the Flow Discovery Request packet may set the fan-
   out bit in the Flow Descriptor TLV to request the transit node to
   forward the request packet through all possible egress ports for the
   specified flow. The transit device then processes the Flow Discovery
   Request packet as described above and forwards it out of all possible
   egress ports in multipath scenarios. When the fan-out option is
   selected and an egress interface is a LAG, the request packet is
   forwarded only on the primary port of that LAG interface. How the
   primary port is selected may differ from vendor to vendor. This helps
   reduce the number of redundant request packets generated as a result
   of the fan-out behavior. Even so, the originator of a request packet
   with the fan-out option enabled may get redundant responses in
   certain circumstances.

   Note that the LAG details are provided in the response packet, only
   if the LAG exists on an L3 device. This is due to the fact that L2
   devices supporting LAG do not have the capability to process the
   Traceflow protocol for now. In future drafts L2 support may be added
   to the Traceflow protocol and at that point it may be dealt with in
   detail.

4.0.1 Assessing why redundant responses come through.

   If a fan-out happens at an early point in the path towards the
   destination, the paths may diverge, cover a few transit devices, and
   then re-converge at one or more points before the destination. In
   this case the multiple fan-out Discovery packets may result in
   redundant responses from the transit devices at and after the point
   of re-convergence. This can be used to find out whether totally
   disjoint paths to the destination exist. If the redundant responses
   come only from the ultimate destination, it is reasonably easy to
   conclude that totally disjoint paths to the destination exist. If
   redundant responses come from transit devices well before the
   destination, the paths must be assumed to have re-converged before
   the ultimate destination (the partially disjoint case). All possible
   paths can then be found by correlating the information received at
   the originator, using a network management station on an appliance or
   otherwise.
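
   The correlation described above could be approximated at the
   originator or a management station with a sketch like the following
   Python function. The function name and the labels it returns are
   hypothetical, and the input is simplified to a list of responder
   identifiers, one per response received for a single fan-out request.

```python
from collections import Counter


def classify_paths(responders, destination):
    """Classify fan-out results from the responders of one request."""
    counts = Counter(responders)
    duplicated = {name for name, seen in counts.items() if seen > 1}
    if not duplicated:
        return "single path"          # no redundant responses at all
    if duplicated == {destination}:
        return "disjoint"             # paths meet only at the destination
    return "partially disjoint"       # paths re-converge at a transit device
```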

   The Flow Discovery Request packet SHOULD pass through the Layer 2 or
   MPLS routed segments along the path in pass-through mode as data
   packets. The appendix discusses the possibility of extending the
   TraceFlow protocol to allow the devices in the Layer 2 and MPLS
   segments along the path of the traffic flow to respond to the Flow
   Discovery Request packet. But this is saved for future work.

   The discussion so far has assumed that the Flow Discovery Request
   packet would originate on one device (say device A) and terminate on
   some other device (say device B). It is likely that a third device
   (say device C) would be interested in obtaining the flow related
   information for a flow traversing from device A to device B. In this
   case, device C sends a Flow Discovery Packet to device A. The Flow
   Discovery Request type specified in the packet would indicate to
   device A that this is an indirect request from device C to obtain
   information relevant to the flow specified in the Flow Descriptor
   TLV. Device A then generates a new Flow Discovery Request packet with
   the destination IP set to device B and the Originator IP Address set
   to device C. All transit routers that process this request would send
   their responses to device C. See security considerations to get more
   information on issues with the indirect mode and ways to mitigate
   them.

4.1. Using Hardware to gather details for the response packet.

   It is RECOMMENDED that the TLVs be filled with as much information as
   possible gathered directly by reading the hardware elements that are
   used in forwarding the flow.

4.2 Interaction with MPLS based transit devices.

   The current MPLS ping standard supports ping/traceroute between
   ingress and egress LSRs only. There is a need for a single probe that
   traces all types of hops, including MPLS LSRs, and the Traceflow
   protocol can address this need. However, this draft supports only the
   pipe (pass-through) mode of tracing, in which the entire MPLS LSP is
   treated as a single interface. Uniform mode, in which every hop along
   the LSP is traced, is excluded from this scheme. It may be taken up
   as future work.

   In the MPLS case given the difference in the TTL value one can arrive
   at the conclusion that the MPLS network in the middle did a pass
   through of the packet. The egress LER can begin to send back the
   Discovery responses from where the Ingress LER left off.

4.3 Applicability to Layer 2 devices.

   Layer 2 devices are bypassed entirely by Traceflow in this version of
   the draft. L2 devices are expected to merely forward the Traceflow
   frames. Future work may extend Traceflow to support Layer 2 devices.

4.4 Applicability to platforms that have trouble determining incoming
   Interface.

   On platforms that have trouble determining which interface a packet
   arrived on, appropriate hardware assists need to be provided to
   indicate the incoming interface to the software.

4.5 Applicability to Network Address Translators

   This aspect has not yet been studied well, and future revisions of
   this draft or addendum documents may make the behaviour clearer. The
   main concern is delivering the response packet back to the originator
   when the outer IP header is subject to translation. Both the
   encapsulated packet and the outer IP header may need to undergo
   translation. Firewalls that surround NATs, or that have NAT
   capability built in, normally drop packets for which the port
   assignments are not set up for pass-through or translation. Some hole
   punching on the firewall may therefore be required to get the
   response packet back to the originator. This aspect has to be thought
   through and documented in subsequent versions of this draft, or in
   additional drafts that modify the behaviour to enable NAT traversal
   of Traceflow packets.

   One advantage, though, is that since the request and response are not
   ICMP packets, the Traceflow packets may be treated as mere data
   packets and may pass through without a hitch. Trust boundaries
   enforced by firewalls may, however, not welcome the intrusion.

5. Application Scenarios 

   This section discusses applications of this proposal. The
   application scenarios can broadly be divided into two categories:

      1. Troubleshooting network failures 

      2. Network planning 

5.1. Troubleshooting network failures 

   Several network monitoring tools let operators monitor the health of
   a network by polling information from the network devices (primarily
   through the use of SNMP). They help in detecting network failures,
   imminent failures or other anomalies in the network. 

   For troubleshooting these failures, network operators typically
   rely initially on tools such as ping and traceroute. Unfortunately
   these tools do not provide detailed information about the affected
   traffic flow, for several reasons: 

      1. It is likely that ping and traceroute control packets follow a
   different path through the network compared to the traffic flow that
   is being investigated - for example when policy-based routing is in
   effect or when there are one or more ECMP segments along the path of
   the traffic flow. 

      2. Ping and traceroute do not provide us with details about the
   constituent members of a port-channel trunk through which the
   affected flow would have traversed. 
 


      3. It is common practice to rate limit ping and traceroute traffic
   at the router. As a result, responses to ping and traceroute are not
   deterministic. 

   Being able to trace the exact path that a particular flow might have
   taken through the network and obtain all relevant information about
   the hops along that path provides the network operator with enough
   information to troubleshoot a network failure quickly. 

   By setting the fan-out bit in the Flow Descriptor TLV, the operator
   should be able to determine all possible paths through the network
   that traffic to a particular destination may take. Along with the
   paths, the operator should also be able to obtain information
   relevant to the traffic flow from transit devices along the paths. 
   This might prove to be useful in troubleshooting certain types of
   network problems. 
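
   As a rough illustration of what a fan-out probe lets the originator
   reconstruct, the sketch below enumerates every path through a small
   topology from per-hop next-hop sets. The topology, the device names
   and the function all_paths are hypothetical and are not part of the
   protocol; in practice the next-hop sets would be assembled from Flow
   Discovery Response packets.

```python
from typing import Dict, List

# Hypothetical next-hop sets, as an originator might assemble them from
# fan-out responses: device -> equal-cost next hops toward the destination.
NEXT_HOPS: Dict[str, List[str]] = {
    "R1": ["R2", "R3"],
    "R2": ["R4"],
    "R3": ["R4", "R5"],
    "R4": ["DST"],
    "R5": ["DST"],
}


def all_paths(node: str, dst: str, hops: Dict[str, List[str]]) -> List[List[str]]:
    """Return every loop-free path from node to dst (depth-first walk)."""

    def walk(current: str, seen: List[str]) -> List[List[str]]:
        if current == dst:
            return [seen + [current]]
        paths: List[List[str]] = []
        for nxt in hops.get(current, []):
            if nxt not in seen and nxt != current:  # skip forwarding loops
                paths.extend(walk(nxt, seen + [current]))
        return paths

    return walk(node, [])


paths = all_paths("R1", "DST", NEXT_HOPS)
for p in paths:
    print(" -> ".join(p))
```

   Each printed line is one candidate path for traffic to the
   destination; an operator could compare this set against the intended
   ECMP design.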

5.2. Network flow planning 

   During production, it may be useful to know which ephemeral source
   port steers a flow onto a suitable LAG member or ECMP component
   link. This can be determined by sending Traceflow packets with
   different ephemeral source ports drawn from a range.
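
   The port sweep can be sketched as follows. This is only an
   illustration: toy_probe stands in for sending a Traceflow request and
   reading the member reported in the response, and it uses a toy CRC32
   hash that is an assumption made here, not any vendor's algorithm. The
   member names and addresses are likewise hypothetical.

```python
import zlib
from typing import Callable, Dict, List

MEMBERS = ["po1/1", "po1/2", "po1/3", "po1/4"]  # hypothetical LAG members


def toy_probe(src_port: int) -> str:
    """Stand-in for a Traceflow probe: a toy CRC32 hash over the flow's
    5-tuple. Real devices use vendor-specific hashes; only the device's
    own answer (obtained via Traceflow) is authoritative."""
    key = f"10.0.0.1|10.0.1.1|17|{src_port}|5000".encode()
    return MEMBERS[zlib.crc32(key) % len(MEMBERS)]


def ports_by_member(probe: Callable[[int], str],
                    ports: range) -> Dict[str, List[int]]:
    """Group candidate source ports by the member each one maps to."""
    result: Dict[str, List[int]] = {}
    for port in ports:
        result.setdefault(probe(port), []).append(port)
    return result


mapping = ports_by_member(toy_probe, range(49152, 49152 + 64))
for member in sorted(mapping):
    print(member, mapping[member][:4])
```

   An operator who wants a flow on a particular member would pick any
   source port listed under that member.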

   It would also be useful to verify that the network access-lists are
   properly configured and that the traffic would not be blocked
   inadvertently by an access-list somewhere. 

   Typically the issues listed above are discovered once the network is
   in production. 

   The ability to exercise the traffic flow's data path before it
   starts handling production traffic would help the operator to: 

      1. Rectify any configuration issues such as ACL policies. 

      2. Modify the ephemeral source port to get the flow traffic to
   flow across a specific constituent member of a port-channel trunk or
   an ECMP path. 

   Note that this application of the Traceflow protocol may not be
   relevant to all types of networks. Campus networks, enterprise
   networks and datacenters with well defined traffic flow patterns may
   benefit from the capability to detect the above problems. For tier 1
   providers, however, this application of TraceFlow has limited
   relevance, as the traffic flows are not well defined. 

   The operator may use the fan-out bit in the Flow Descriptor TLV to
   request that the transit devices report all the paths that a traffic
   flow to a certain destination address could take. This allows the
   operator to validate the ECMP or LAG configuration in the network. 

5.2.1 Programmatic migration to mitigate LAG link polarization 

   In later versions of the OpenFlow specification, virtual ports such
   as LAGs are exposed to the OpenFlow forwarding path. It is imperative
   that the controller has a standards-based ability to discover LAG
   hashing behaviour. Through the Traceflow discovery and fan-out
   process, the controller can proactively determine which action to
   take to move flows from one LAG member to another. This will aid in
   the automated troubleshooting of link polarization problems.

6. Security Considerations 

   This section discusses threats to which TraceFlow might be vulnerable
   and discusses means by which those threats might be mitigated. 

   There is a concern that this protocol might allow an external user to
   probe the detailed path that a flow takes through a network. 

   The network operator can associate different access levels with the
   different types of information that are included in the response to
   a Flow Discovery Request packet. For example only the "Next Hop
   Router" may
   be marked as publicly accessible information whereas everything else
   may be marked as private information. On receiving a Flow Discovery
   Request packet originating outside the local network, only the
   publicly accessible information is included in the response to the
   originator. However if the request was originated locally the device
   includes all requested information in the response. 
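
   A minimal sketch of this filtering is shown below. The information
   names, the two visibility levels and the function build_response are
   illustrative assumptions; they are not TLV names or procedures
   defined by this document.

```python
from typing import Dict

# Hypothetical operator policy: information type -> visibility level.
# The names are illustrative and are not TLV names from this document.
VISIBILITY: Dict[str, str] = {
    "next_hop_router": "public",
    "outgoing_interface": "private",
    "lag_member": "private",
    "acl_result": "private",
}


def build_response(requested: Dict[str, str],
                   request_is_local: bool) -> Dict[str, str]:
    """Keep every requested item for local requests; keep only items
    marked public for requests that originated outside the network."""
    if request_is_local:
        return dict(requested)
    # Unknown information types default to private (fail closed).
    return {k: v for k, v in requested.items()
            if VISIBILITY.get(k, "private") == "public"}


gathered = {
    "next_hop_router": "192.0.2.1",
    "outgoing_interface": "te0/1",
    "acl_result": "permit",
}
print(build_response(gathered, request_is_local=False))
print(build_response(gathered, request_is_local=True))
```

   Treating unlisted information types as private is a design choice of
   this sketch, made so that newly added fields are not leaked by
   default.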

   The Result TLV and Additional Information Codes TLV provide detailed
   information about the processing of the Flow Discovery Request packet
   and may possibly leak information about the locally configured
   policies. The amount of information to be included in these TLVs
   should also depend on whether the request was originated externally
   or internally. The network operator may choose to silently drop the
   Flow Discovery Request packet without providing any indication of the
   reason for doing so if the request was originated externally. 

   Today most network operators throttle conventional OAM traffic (for
   example ping and traceroute) that is serviced by the device to
   protect against Denial-of-Service attacks. Such mechanisms should be
   employed for TraceFlow packets for the same reason. Rate limiting of
   packets punted to software can cover traffic destined for the
   management plane. Many platforms can rate limit such traffic to M
   packets per second or per minute. Facilities like these can be used
   to admit only a rate-limited quantum of Traceflow traffic to the
   management plane. M would be a user-configurable option with a
   suitable default.

   Hardware assisted rate limiting would be a pre-requisite for this
   feature.

7. Hardware pre-requisites for implementing Traceflow.

7.1 Filter to trap packets with UDP destination port

   Filters with a corresponding PUNT-to-software action should be
   programmable in hardware to trap packets whose UDP destination port
   identifies them as Traceflow packets. Platforms that support
   hardware-based filtering would benefit most from this filter
   support. All Layer 3 devices are appropriate candidates for
   programming this filter. Note, however, that the UDP port based
   filter SHOULD NOT be applied to MPLS packets or IP-in-IP tunneled
   packets. Tunneled packets, whether MPLS or IP-in-IP (including
   IP-GRE), are out of scope of this document.
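
   The filter decision can be modelled in software as below. This is a
   sketch only: the port value is a placeholder because no IANA
   assignment exists yet, and the Packet fields and the function
   filter_action are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Placeholder only: the Traceflow UDP port has not been assigned by IANA.
TRACEFLOW_UDP_PORT = 65000


@dataclass
class Packet:
    encapsulation: str   # outermost form: "ip", "mpls", "ip-in-ip" or "gre"
    ip_protocol: int     # 17 means UDP
    udp_dst_port: int


def filter_action(pkt: Packet) -> str:
    """Return "punt" for native IP/UDP Traceflow packets, else "forward"."""
    if pkt.encapsulation != "ip":
        return "forward"          # tunneled or labeled: pass through
    if pkt.ip_protocol == 17 and pkt.udp_dst_port == TRACEFLOW_UDP_PORT:
        return "punt"             # hand to the software control plane
    return "forward"


print(filter_action(Packet("ip", 17, TRACEFLOW_UDP_PORT)))
print(filter_action(Packet("mpls", 17, TRACEFLOW_UDP_PORT)))
```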

7.2 Packet injection mode directly to egress port.

   To make Traceflow packets take the proper output member in a LAG or
   ECMP case, the hardware should support a packet injection mode that
   sends a packet directly to an egress port without interference from
   the hardware forwarding engine. Once the software control plane for
   Traceflow has received and updated the packet, it sends the packet
   to the appropriate next-hop transit device through the LAG or ECMP
   member calculated by the hardware algorithm. In this mode the packet
   goes from the software control plane straight to the egress port,
   bypassing the hardware forwarding engine, so that it leaves on the
   appropriate LAG or ECMP member.

7.3 Packet injection mode through hardware engine but not to output
   port.

   To let Traceflow report which LAG / ECMP member the packet will go
   out on, the hardware should provide an assist that allows the CPU to
   inject the packet and obtain the forwarding result without actually
   routing or switching the packet to the next-hop.
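
   How the two injection modes of sections 7.2 and 7.3 might be used
   together is sketched below. FakeHardware, lookup_only and send_direct
   are hypothetical stand-ins invented for this illustration; no real
   SDK or driver API is implied, and the fixed results returned by the
   fake are arbitrary.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class LookupResult:
    egress_port: Optional[str]   # chosen LAG/ECMP member, None if dropped
    drop_reason: Optional[str]   # e.g. "acl-deny", None if forwarded


@dataclass
class FakeHardware:
    """Hypothetical stand-in for a forwarding engine that offers both
    injection modes; it only records what it is asked to send."""
    sent: List[Tuple[bytes, str]] = field(default_factory=list)

    def lookup_only(self, packet: bytes) -> LookupResult:
        # Section 7.3 mode: compute the forwarding result, do not transmit.
        if packet.startswith(b"DENY"):
            return LookupResult(None, "acl-deny")
        return LookupResult("po1/2", None)

    def send_direct(self, packet: bytes, egress_port: str) -> None:
        # Section 7.2 mode: transmit on the port, bypassing the engine.
        self.sent.append((packet, egress_port))


def process_request(hw: FakeHardware, flow_packet: bytes,
                    request: bytes) -> LookupResult:
    """Ask the hardware where the flow would go, then send the updated
    request out of that same member. Nothing is sent if the flow would
    be dropped; the drop reason is returned for the response."""
    result = hw.lookup_only(flow_packet)
    if result.egress_port is not None:
        hw.send_direct(request, result.egress_port)
    return result


hw = FakeHardware()
print(process_request(hw, b"flow-packet", b"updated-request"))
print(process_request(hw, b"DENY-flow-packet", b"updated-request"))
```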

7.4 Hardware rate limiter support (preventing DoS attacks)

   The hardware should support a filter-based rate limiter so that DoS
   attacks cannot be mounted on the control plane, that is, the
   software part of the Traceflow engine. Normally the control plane of
   the Traceflow engine resides in the Route Processor Module of the
   transit devices, or of the end device, to which Traceflow traceroute
   and ping packets respectively are sent. The hardware rate limiter
   uses the filter to count the number of packets per unit time (for
   example, per minute) to determine whether too many Traceflow packets
   are being sent to the control plane in the Route Processor Module.
   This is another requirement on the hardware.
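
   The counting behaviour can be modelled in software as a fixed-window
   limiter, as sketched below. The class name, its parameters and the
   choice of a fixed window (rather than, say, a token bucket) are
   assumptions of this illustration and do not describe any particular
   hardware.

```python
class FixedWindowLimiter:
    """Allow at most max_packets per window_seconds; a software model of
    the counting behaviour, not a hardware implementation."""

    def __init__(self, max_packets: int, window_seconds: float) -> None:
        self.max_packets = max_packets
        self.window_seconds = window_seconds
        self.window_start = 0.0
        self.count = 0

    def allow(self, now: float) -> bool:
        """Return True if a packet arriving at time now may be punted."""
        if now - self.window_start >= self.window_seconds:
            self.window_start = now   # start a new counting window
            self.count = 0
        if self.count < self.max_packets:
            self.count += 1
            return True
        return False                  # over the limit: drop the packet


# Three packets per minute: the fourth packet in the window is dropped,
# and a packet arriving after the window has elapsed is allowed again.
limiter = FixedWindowLimiter(max_packets=3, window_seconds=60.0)
decisions = [limiter.allow(t) for t in (0.0, 1.0, 2.0, 3.0, 61.0)]
print(decisions)
```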

7.5 RPF check support in hardware (security consideration)

   To implement security across trust boundaries, a Reverse Path
   Forwarding check (RPF check) should be enabled on the domain's
   boundary devices. This ensures that IP addresses internal to the
   domain cannot be used by outside entities to initiate a Traceflow
   from outside the boundary of the domain in question.

7.6 Regular Security ACLs in the boundary of the network.

   Apart from the RPF check, which detects an Originator IP address
   that is internal to the network but is being spoofed by an entity
   outside the boundary, regular security ACLs should be programmed at
   the boundary to ensure that outside entities are not allowed to send
   Traceflow packets across the boundary into the network domain.
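
   The combined effect of the checks in sections 7.5 and 7.6 can be
   sketched as below. The RPF check is simplified here to "an internal
   source address seen on an external interface", which is an
   assumption of this sketch; a real RPF check consults the routing
   table. The prefix, the port value and the function names are
   placeholders.

```python
import ipaddress
from typing import List

INTERNAL_PREFIXES = [ipaddress.ip_network("10.0.0.0/8")]  # example domain
TRACEFLOW_UDP_PORT = 65000  # placeholder; no IANA assignment exists yet


def is_internal(addr: str, prefixes: List[ipaddress.IPv4Network]) -> bool:
    """Return True if addr falls inside any of the domain's prefixes."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in prefixes)


def boundary_permits(src_ip: str, udp_dst_port: int,
                     from_outside: bool) -> bool:
    """Decide whether a boundary device lets the packet into the domain."""
    if not from_outside:
        return True
    if is_internal(src_ip, INTERNAL_PREFIXES):
        return False   # spoofed: internal source on an external interface
    if udp_dst_port == TRACEFLOW_UDP_PORT:
        return False   # security ACL: no Traceflow from outside entities
    return True


print(boundary_permits("10.1.2.3", 443, from_outside=True))
print(boundary_permits("198.51.100.7", TRACEFLOW_UDP_PORT, from_outside=True))
print(boundary_permits("198.51.100.7", 443, from_outside=True))
```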

7.7 Implementing the LAG / ECMP using software state

   Earlier, the exact hashing function or functions that the hardware
   implements were required to be implemented in the software control
   plane of the Traceflow engine in the Route Processor Module, in order
   to determine the LAG or ECMP member through which the packet would be
   forwarded if sent through hardware. This mimicking is not sufficient,
   because the hardware and software state may be out of sync at that
   point in time. In that case the result computed by mimicking the
   hardware in software may differ from what the hardware would do if
   actually exercised. The hardware assist should therefore support
   packet injection from the CPU and return the required results to the
   CPU Traceflow control process.

7.8 Implementation considerations

   Hardware platforms commonly use internal packet headers to record
   aspects of an incoming packet such as the ingress port, ACL-based
   packet drops, etc. The codes corresponding to the reasons why a
   packet is dropped should be obtained through the packet injection
   mode available in the hardware, in part by using these internal
   headers. When a packet that is to be forwarded is actually dropped
   in hardware, the reason codes (ACL-based drop, policing, etc.) should
   be available to the software control plane so that it can fill in the
   appropriate fields of the Traceflow response packet.

7.8.1 Using ingress port as part of the LAG/ECMP hashing function.

   The LAG / ECMP hashing function on certain platforms also uses the
   ingress port to arrive at the LAG / ECMP member on which the packet
   is to be forwarded. Platforms that support packet injection mode
   normally provide the ability to inject a packet into the hardware
   Forwarding Engine and make it look as if the packet came in on a
   specific ingress port. On some vendor platforms this may not be
   possible. Platforms where the ingress port is not an input to the
   hashing function can support Traceflow with normal packet injection.

   When the ingress port is involved and cannot be specified at
   injection time, CPU injection MAY still be used. In that case the LAG
   or ECMP member that the packet takes may be different from the one
   that would be chosen if the ingress port were taken into account.

   This limitation arises only because the ingress port is an input to
   the hashing function that determines a LAG / ECMP member and some
   platforms do not support packet injection from software with the
   ingress port under consideration.

8. IANA Considerations 

   The TraceFlow protocol needs a UDP port assignment from IANA, to be
   used as the destination port in TraceFlow packets. 

9. Contributors 

   The original version of this document was submitted to the IETF on
   August 16th 2008 by A. Viswanathan, S. Krishnamurthy, R. Manur and
   V. Zinjuvadia, who at that time were part of Force10 Networks, with
   inputs and suggestions from Shane Amante. We would like to
   acknowledge their contribution to the original version of this
   draft.



   This document was prepared using Nroff Internet Draft Editor. 

APPENDIX A:  

A.1. Encapsulation Format Choices 

A.1.1. Carrying a separate Flow Descriptor TLV inside the Flow Discovery
   Request packet 

   This is the approach selected for this proposal. In order to specify
   a flow, the originating device encapsulates the entire data packet
   belonging to the traffic flow of interest in the Flow Descriptor TLV.
   

   If a traffic flow data packet is not readily available, the operator
   may have to generate a data packet from the traffic flow information
   available and encapsulate that in the Flow Descriptor TLV. 

   Future revisions of this document may update the Flow Descriptor TLV
   if there is a need to allow the Flow Descriptor TLV to carry
   individual flow parameters (such as the Source IP Address,
   Destination IP Address, UDP/TCP Port numbers, etc.) in sub-TLV format
   rather than using an encapsulated data packet. 
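
   Generating a data packet from known flow parameters could look
   roughly like the sketch below, which builds a minimal IPv4/UDP packet
   and wraps it in a TLV. The TLV layout used here (1-byte type, 2-byte
   length) and the type value are placeholders assumed for illustration;
   the actual Flow Descriptor TLV encoding is the one defined in the
   body of this document. The addresses and function names are likewise
   hypothetical.

```python
import socket
import struct


def ipv4_checksum(header: bytes) -> int:
    """Standard Internet checksum (RFC 1071) over an even-length header."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)   # fold the carries
    return ~total & 0xFFFF


def synthetic_udp_packet(src: str, dst: str, sport: int, dport: int) -> bytes:
    """Build a minimal IPv4/UDP packet (no payload) describing the flow."""
    udp = struct.pack("!HHHH", sport, dport, 8, 0)   # UDP checksum 0 = unused
    total_len = 20 + len(udp)
    # Version/IHL, TOS, total length, ID, flags/fragment, TTL, protocol
    # (17 = UDP), checksum (zero while computing), source, destination.
    hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_len, 0, 0, 64, 17, 0,
                      socket.inet_aton(src), socket.inet_aton(dst))
    csum = ipv4_checksum(hdr)
    return hdr[:10] + struct.pack("!H", csum) + hdr[12:] + udp


# Hypothetical TLV layout and type code, for illustration only.
FLOW_DESCRIPTOR_TYPE = 0x01


def flow_descriptor_tlv(packet: bytes) -> bytes:
    """Wrap the packet as type (1 byte), length (2 bytes), value."""
    return struct.pack("!BH", FLOW_DESCRIPTOR_TYPE, len(packet)) + packet


pkt = synthetic_udp_packet("192.0.2.10", "198.51.100.20", 49152, 80)
tlv = flow_descriptor_tlv(pkt)
print(len(pkt), len(tlv))
```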

A.1.2. Using the traffic flow's parameter values in the external header.

   This approach encapsulates the Flow Discovery Request packet using
   the traffic flow's header as the outer header. This ensures that the
   Flow Discovery Request packet would take the same path as the
   traffic flow would have. A Layer 2 EtherType could be used to
   differentiate between this OAM packet and the data packets belonging
   to the traffic flow. 

   This approach was not selected because it requires the intermediate
   devices to process a new EtherType, support for which might be
   limited by hardware. Moreover it is likely that the OAM packet would
   have to make a stop at the intermediate device anyway in order to
   gather the relevant information for the traffic flow specified. 

   If the Flow Discovery Request packet does not use a special
   EtherType, it would be difficult for the network operator to filter
   these OAM packets, as they would be indistinguishable from the
   traffic flow. Moreover such TraceFlow OAM packets may be considered
   as 'spoofed' packets. 

   Even though this approach is not selected for the TraceFlow protocol
   in this document, it would help the TraceFlow protocol support
   certain networks with legacy devices (not supporting TraceFlow). This
   approach may be reconsidered in future revisions of this document. 

A.2. Layer 4 Protocol Choices and Router Alert option 

A.2.1. UDP Encapsulation 

   This approach has been selected in this proposal. The Traceflow
   packets are UDP packets with a well-known destination port number (to
   be requested from IANA). 

A.2.2. ICMP Encapsulation 

   This approach involves sending TraceFlow packets as ICMP packets. 
   This was not selected in this proposal due to the simplicity of the
   UDP approach. 
 


A.3. Legacy Devices (Not supporting TraceFlow) 

   It is necessary that the entire flow information available through
   the encapsulated packet in the Flow Discovery Request packet be used
   in determining the egress port. If the Flow Discovery Request packet
   reaches a legacy device that does not support TraceFlow, it is likely
   that the request packet gets forwarded along a different egress link
   compared to the egress link through which the data packets belonging
   to the traffic flow would have been forwarded. Hence the information
   received from the transit routers beyond the legacy device in a
   TraceFlow probe may not be useful. Typically if the legacy device
   does not employ LAGs or ECMP paths or policy-based routing, the
   TraceFlow packet may proceed in the direction that the traffic flow
   would have taken and subsequent transit nodes may still be able to
   provide useful and relevant information to the originator of the Flow
   Discovery Request packet. 

A.4. TTL Scoping 

   Conventional traceroute employs TTL Scoping as a means to determine
   the path followed by destination address based hop-by-hop routing of
   a packet. 

   TraceFlow protocol does not employ TTL Scoping in the current
   specification. However using TraceFlow with TTL Scoping has certain
   applications in networks that contain some legacy devices that do not
   support TraceFlow. This may be explored in future revisions of this
   document if there is interest in the community to solve this problem.
   

   An implementation may allow the operator to send out the TraceFlow
   packets with TTL Scoping just like conventional traceroute. In such a
   mode the following points should be noted: 

      1) The originator node may receive multiple packets from the
   transit nodes - an ICMP 'TTL Expired' packet and a TraceFlow response
   packet.

      2) In this mode, the transit devices SHOULD send out the TraceFlow
   response packet only if the TTL has also expired for that Flow
   Discovery Request packet on that device. This is needed to prevent
   duplicate Flow Discovery Response packets from the transit node for
   each request packet that the originator device sends when performing
   TTL Scoping.  
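
   The response rule in point 2 can be sketched as a single decision
   function. Modelling TTL expiry as "TTL on arrival is 1 or less" and
   the name should_respond are assumptions of this illustration.

```python
def should_respond(ttl_on_arrival: int, ttl_scoping: bool) -> bool:
    """Decide whether a transit device sends a Flow Discovery Response.

    Without TTL Scoping every TraceFlow-capable transit device responds.
    With TTL Scoping only the device where the TTL expires responds.
    """
    if not ttl_scoping:
        return True
    return ttl_on_arrival <= 1   # TTL would reach zero on this device


# A probe sent with initial TTL 3 crosses three hops; each hop sees the
# TTL reduced by the hops before it, so only the third hop responds.
responders = [hop for hop in (1, 2, 3)
              if should_respond(ttl_on_arrival=3 - (hop - 1), ttl_scoping=True)]
print(responders)
```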

A.5. Additional Information in the Flow Discovery Response 

   This document lists the information that can be requested by the
   originator of the TraceFlow Flow Discovery Request packet and that
   may be included by the transit devices in their response. Future
   revisions of this document may modify this list based on feedback
   from the community. For example, QoS related statistics and queue
   depth information may be included in the Flow Discovery Response
   packets for the traffic flow being investigated. 

A.6. Choices for supporting remote TraceFlow requests 

A.6.1. Terminating the request at the Proxy device and re-originating it 

   This approach was selected in this proposal. For indirect Flow
   Discovery Requests, the originating device sends the request to
   another proxy device that is the intended starting point for probing
   the flow and gathering relevant information about the flow. This
   proxy device receives the Flow Discovery Request packet, processes it
   and re-originates a Flow Discovery Request towards the destination of
   the flow. 

A.6.2. Source-Routing the request through the Proxy device 

   This approach involved sending the Flow Discovery Request with IP
   Source Routing option that forced the packet to be received by the
   proxy device that is the intended starting point for probing the flow
   and gathering relevant information about the flow.  It was not
   selected for this proposal. 

A.7. Applicability to Multicast 

   Multicast networks have also evolved into more complex heterogeneous
   networks in recent years. These advancements place more burden on
   the multicast OAM tools employed by network operators.
   Troubleshooting network problems, monitoring network performance,
   and network planning and provisioning become difficult because of
   the gap between the complexity of the network and the capabilities
   of the OAM tools. Mtrace [4] has evolved into a useful OAM tool to
   address some of the problems faced in multicast networks. However it
   does not address all the problems discussed in this document. We
   believe that the TraceFlow protocol can be extended to assist network
   operators with their multicast deployments. Specific mechanics of
   any such extensions may be defined in later versions of the draft. 

A.8. Applicability to Layer 2 networks 

   The Layer 2 devices in the path taken by the TraceFlow packets should
   be able to snoop on the higher layer headers in the packet to
   determine that it is a TraceFlow Flow Discovery Request packet. Most
   of the TraceFlow packet processing and operations discussed in this
   document should apply to Layer 2 devices also. However, the current
   version of the draft treats Layer 2 devices as pass-through. Refer to
   section 4.3 for more discussion of this issue.

   Specific mechanics of any separate extensions necessary for Layer 2
   networks may be defined in later versions of the protocol.

A.9. Applicability to IPv6 

   The TraceFlow protocol described in this document should apply to
   IPv6 networks or IPv4-IPv6 dual stack networks with straight-forward
   extensions.

   Specific mechanics of extensions to address IPv6 networks may be
   defined in the later versions of the draft. 

A.10. Applicability to MPLS 

   MPLS networks are to be considered at a later point in time. Support
   for MPLS networks is currently out of scope of this document and may
   be added through revisions or addenda to this proposal.

A.11. Flow Discovery and Response packet fragmentation 

   It is highly RECOMMENDED that the network allow the Flow Discovery
   Request packet to travel through to the destination without
   fragmentation. The Flow Discovery Response packet that is originated
   by the transit devices processing the request packet may be
   fragmented on its way to the originator device. 

10. References 

10.1. Normative References 

   [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC1776]  Crocker, S., "The Address is the Message", RFC 1776, April
              1 1995.

   [TRUTHS]   Callon, R., "The Twelve Networking Truths", RFC 1925,
              April 1 1996.


10.2. Informative References 

 


   [EVILBIT]  Bellovin, S., "The Security Flag in the IPv4 Header",
              RFC 3514, April 1 2003.

   [RFC5513]  Farrel, A., "IANA Considerations for Three Letter
              Acronyms", RFC 5513, April 1 2009.

   [RFC5514]  Vyncke, E., "IPv6 over Social Networks", RFC 5514, April 1
              2009.

Author's Addresses 

      Janardhanan Narasimhan.P,
      Dell-Force10,
      Olympia Technology Park,
      Fortius block, 7th & 8th Floor,
      Plot No. 1, SIDCO Industrial Estate,
      Guindy, Chennai - 600032.
      TamilNadu, India.
      Tel: +91 (0) 44 4220 8400
      Fax: +91 (0) 44 2836 2446

      Email: Pathangi_janardhanan@dell.com 

      Balaji Venkat Venkataswami,
      Dell-Force10,
      Olympia Technology Park,
      Fortius block, 7th & 8th Floor,
      Plot No. 1, SIDCO Industrial Estate,
      Guindy, Chennai - 600032.
      TamilNadu, India.
      Tel: +91 (0) 44 4220 8400
      Fax: +91 (0) 44 2836 2446

      Email: BALAJI_VENKAT_VENKAT@dell.com 

      Richard Groves,
      Microsoft Corporation,
      One Microsoft Way,
      Redmond, WA 98052

      Email: rgroves@microsoft.com

      Peter Hoose,
      Facebook,
      Willow Rd., 
      Menlo Park, CA 94025

      Email: phoose@fb.com


