A. Viswanathan
S. Krishnamurthy
R. Manur
V. Zinjuvadia
Internet Draft                                       Force10 Networks
Intended status: Standard Track                       August 16, 2008
Expires: February 2009

                              TraceFlow
                  draft-zinjuvadia-traceflow-02.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

This Internet-Draft will expire on February 16, 2009.

Copyright Notice

Copyright (C) The IETF Trust (2008).

Abstract

This document describes a new OAM protocol, TraceFlow, that captures information pertaining to a traffic flow along the path that the flow takes through the network. TraceFlow is ECMP and link-aggregation aware and captures the information about constituent members through which the traffic flow passes. TraceFlow gathers information that is relevant to the flow such as interface address, interface statistics, effect of network policies on the flow and so on.

Zinjuvadia             Expires February 16, 2009               [Page 1]
Internet-Draft                 TraceFlow                    August 2008

Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [1].

Table of Contents

   1. Introduction
   2. Motivation
      2.1. Evolution of IP networks
   3. Packet Formats
      3.1. Flow Discovery Request/Response Packet Format
      3.2. Flow Discovery Request TLVs
         3.2.1. Flow Descriptor TLV
            3.2.1.1. Flow Information TLV
         3.2.2. Originator Address TLV
         3.2.3. Information Request bitmap TLV
         3.2.4. Termination TLV
      3.3. Flow Discovery Response TLVs
         3.3.1. Information Response TLV
         3.3.2. Result TLV
         3.3.3. Additional Informational Code TLV
      3.4. TLVs common to Flow Discovery Request and Response
         3.4.1. Encapsulated Packet TLV
         3.4.2. Encapsulated Packet Mask TLV
         3.4.3. Record Route TLV
         3.4.4. Authentication TLV
   4. Protocol Operation
      4.1. Out-of-Sync Hardware
   5. Application Scenarios
      5.1. Troubleshooting network failures
      5.2. Performance monitoring
      5.3. Network planning
   6. Security Considerations
   7. IANA Considerations
   8. Contributors
   APPENDIX A:
      A.1. Encapsulation Format Choices
         A.1.1. Carrying a separate Flow Descriptor TLV inside the Flow Discovery Request packet
         A.1.2. Using the traffic flow's parameter values in the external header to encapsulate the Flow Discovery Request packet
      A.2. Layer 4 Protocol Choices and Router Alert option
         A.2.1. UDP Encapsulation
         A.2.2. ICMP Encapsulation
      A.3. Legacy Devices (Not supporting TraceFlow)
      A.4. TTL Scoping
      A.5. Additional Information in the Flow Discovery Response
      A.6. Choices for supporting remote TraceFlow requests
         A.6.1. Terminating the request at the Proxy device and re-originate it
         A.6.2. Source-Routing the request through the Proxy device
      A.7. Applicability to Multicast
      A.8. Applicability to Layer 2 networks
      A.9. Applicability to IPv6
      A.10. Applicability to MPLS
      A.11. Flow Discovery and Response packet fragmentation
      A.12. Authentication TLV
   9. References
      9.1. Normative References
      9.2. Informative References
   Author's Addresses
   Intellectual Property Statement
   Disclaimer of Validity

1. Introduction

The TraceFlow protocol allows a user to determine the path taken by a flow through a network. It provides the capability to collect, at each hop of the network, information that pertains to the forwarding of the flow. This information can include individual member information in a link-aggregation group (LAG) or ECMP.

There is a need for a mechanism that allows a user to determine the path that a flow takes through a network [3]. Current solutions (such as traceroute) do not provide details about the exact physical or logical interface through which the flow passes in cases where LAG and/or ECMP are employed or policy based routing is in effect. Furthermore, current OAM techniques do not collect any detailed information relevant to a traffic flow as it traverses the network. Such information at intermediate hops in the network can prove useful to network operators in troubleshooting network failures and in monitoring network performance.

2. Motivation

Network operators have traditionally managed IP networks with classic OAM tools like Ping and Traceroute [2]. Operators typically use Ping to perform end-to-end connectivity checks, and Traceroute to trace the hop-by-hop path to a given destination. Traceroute is also used to isolate the point of failure along the path to a given destination. These tools have performed very well for the IP networks they were designed for.

2.1. Evolution of IP networks

With the passage of time, networks have morphed into more complex heterogeneous entities. Layer-2 switches and MPLS LSRs are often intermixed with IP routers. Also, an increasing number of networks use multipath configurations to improve load-balancing and redundancy. These multipaths could be in the form of end-to-end ECMP paths, or LAGs between directly connected hops.
Existing tools such as Ping and Traceroute that follow the destination IP address based routing model may not follow the path taken by the actual traffic in multipath and/or policy based routing scenarios. The forwarding of actual traffic in such scenarios is based on a set of packet header fields.

Clearly, the OAM tools have not kept up with the new requirements of the evolving networks. Hence there is a need to extend the OAM tools so that operators can perform new OAM functions:

   1. Perform Ping or traceroute based on a set of link layer and/or TCP/IP header fields of actual user traffic. This feature will be very useful for troubleshooting network problems, and planning/provisioning network resources.

   2. Trace end-to-end paths comprising a mix of Layer-2 hops and IP/MPLS routers along the way.

   3. Collect more intelligent and useful information to enable operators to perform more detailed problem analysis.

This document proposes a new OAM protocol, TraceFlow, that attempts to bridge the gap between today's fast evolving networks and the traditional OAM tools.

Section 3 describes the packet formats used by TraceFlow first, so that subsequent sections need no forward references. It is suggested that first-time readers skip Section 3 and read the Protocol Operation description in Section 4. Application scenarios are discussed in Section 5 and the security considerations in Section 6.

3. Packet Formats

3.1. Flow Discovery Request/Response Packet Format

Flow Discovery Request and Response packets follow the general format shown below. The TLVs included in each message type may be different.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Version    |   Hopcount    |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |   Reserved    |           Query ID            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            TLVs...                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Flow Discovery Request packet SHOULD be sent with the DF bit set in the external IP header.

Version: The version number of the protocol. This document defines protocol version 1.

Hopcount: Allows keeping track of the number of transit nodes that processed the Flow Discovery Request packet. This field is decremented at each device that processes the Flow Discovery Request packet. This field also helps in determining if there were any legacy devices not supporting the TraceFlow protocol along the way.

Length: Length of the packet

Type:
   1  Direct Flow Discovery Request - Ping mode
   2  Direct Flow Discovery Request - Traceroute mode
   3  Indirect Flow Discovery Request - Ping mode
   4  Indirect Flow Discovery Request - Traceroute mode
   5  Response for the Flow Discovery Request

Query ID: A unique identifier generated by the originator that allows it to correlate the responses from the transit nodes with the Flow Discovery Request packet it generated.

The TLVs are divided into three categories:

   1. TLVs that can show up in the Flow Discovery Request packet

   2. TLVs that can show up in the Flow Discovery Response packet

   3. TLVs that can show up in the Flow Discovery Request as well as Response packet

3.2. Flow Discovery Request TLVs

3.2.1. Flow Descriptor TLV

This TLV is included in the Flow Discovery Request packet and identifies the traffic flow that the originator device is interested in probing. This is a mandatory TLV.

The definition of a traffic flow varies from one network to another.
Most traffic flows in today's networks can be uniquely identified using fields from the data packet's headers. The TraceFlow protocol requires the first 256 bytes of the traffic flow's data packet to be encoded in this Flow Descriptor TLV.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Value...                    |    padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type: The type of the TLV. In this case, the value is 1, meaning Flow Descriptor TLV

Code: The Code identifies the sub-type of the TLV. In this case, this field is not defined. It SHOULD be set to 0.

Length: The length of the TLV

Value: The value encoded in this TLV, depending on the Type and the Code specified

Padding: This might be necessary to ensure the packet ends on a word boundary

Refer to Section 3.4.1 (Encapsulated Packet TLV), which describes how a data packet can be used to specify the traffic flow.

3.2.1.1. Flow Information TLV

There are certain attributes of a flow that are not carried in the traffic flow's data packet header. Such attributes include the MTU requirements of the traffic flow, the traffic rate for the flow, whether the data packets of the traffic flow may be fragmented, and so on. To specify such traffic flow attributes, a Flow Information TLV is defined. This is an optional TLV.

The Flow Information TLV allows the operator to specify the traffic flow related parameters that are not carried inside the flow packet. The following are some such parameters.

   1. MTU requirement

   2. Fragmentation Information

   3. Traffic Rate requirement

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Packet Rate                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              MTU              |             Flags             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type: 1 Flow Information TLV

Code: Not defined. It SHOULD be set to 0.

MTU: MTU requirement for the flow

Packet Rate: Packet Rate in Kbps required by the flow

Flags: Not defined. It SHOULD be set to 0.

If the traffic flow's data packets should not be fragmented, the DF bit in the encapsulated IP packet should be set to 1.

3.2.2. Originator Address TLV

This TLV carries the address of the originator of the Flow Discovery Request packet. The responses from the intermediate devices processing the request are sent to this address. This is an optional TLV, to be included only when an Indirect Flow Discovery Request is originated.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Value...                    |    padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type: 2 Originator Address

Code:
   1  IPv4 Address
   2  IPv6 Address

3.2.3. Information Request bitmap TLV

This TLV is used by the originator device to specify the information requested for the flow identified by the Flow Descriptor TLV in the Flow Discovery Request packet. This is an optional TLV. In the absence of this TLV, the transit and end devices processing the Flow Discovery Request packet respond with the default set of information.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Flags...                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type: 3 Information Request

Code:
   1  Incoming Interface related
   2  Outgoing Interface related

Flags:
   Bit 0 : IP Address
   Bit 1 : SNMP ifName
   Bit 2 : SNMP ifIndex and ifType
   Bit 3 : incoming packet count
   Bit 4 : outgoing packet count
   Bit 5 : incoming packet rate
   Bit 6 : outgoing packet rate
   Bit 7 : incoming packet error count
   Bit 8 : outgoing packet error count
   Bit 9 : LAG details
   Bit 10: ECMP details
   Bit 11: Hash algorithm

Code:
   3  Global information

Flags:
   Bit 0 : Timestamps (incoming and outgoing)
   Bit 1 : Next Hop Router Address

3.2.4. Termination TLV

This TLV includes a list of addresses. If a device notices that it owns any of the addresses listed in this TLV, it MUST NOT forward the Flow Discovery Request packet any further and MUST respond to the originator with a Flow Discovery Response packet.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Address-type  |                  Address...                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Address-type  |                  Address...                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   //                                                             //
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Address-type  |                  Address...                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Address-type:
   0x1: IPv4 Address
   0x2: IPv6 Address

Address: The address where the request MUST be terminated.

3.3. Flow Discovery Response TLVs

3.3.1. Information Response TLV

This TLV is used by the devices processing the Flow Discovery Request packet to provide the information requested by the originator device. This is a mandatory TLV. It should be included in the response sent to the device originating the Flow Discovery Request packet.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |           Sub-Code            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Length             |   Value...    |    padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type: 5 Information Response

Code:
   1  Incoming Interface related
   2  Outgoing Interface related

Sub-Code:
   0 : IP Address
   1 : SNMP ifName
   2 : SNMP ifIndex and ifType
   3 : incoming packet count
   4 : outgoing packet count
   5 : incoming packet rate
   6 : outgoing packet rate
   7 : incoming packet error count
   8 : outgoing packet error count
   9 : LAG details
   10: ECMP details
   11: Hash algorithm

The LAG and ECMP details are described further below. The following frame format is used if the originator device requested LAG or ECMP related details.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |           Sub-Code            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Length             |        No. of members         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Component Link Information..                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   //                                                             //
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Component Link Information..                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

No. of members: This is the number of members in the LAG or the ECMP segment that is being described

Component Link Information: Individual component links are encoded in this field. The "No. of members" field describes how many component links are listed.

The frame format for the "Component Link Information" portion of the TLV is shown below.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         SNMP ifIndex                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          SNMP ifType                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Input % Utilization                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Output % Utilization                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Flags             |      SNMP ifName length       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        SNMP ifName...                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

SNMP ifIndex: The ifIndex of the component link being specified

SNMP ifType: The ifType of the component link being specified

Input % Utilization: The percentage utilization of the input link at the device processing the Flow Discovery Request packet

Output % Utilization: The percentage utilization of the output link at the device processing the Flow Discovery Request packet

Flags:
   0x1: If set, the Component Link is administratively down.
   0x2: If set, the Component Link is operationally down.
   The rest of the bits in the Flags field are reserved.

If the hash algorithm information is requested in the Flow Discovery Request packet, the following TLV format is used to encode it. This TLV format specifies the packet fields that are used by the hash algorithm configured on the device.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |           Sub-Code            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Length             |    No. of hash parameters     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         byte-offset-1         |         no. of bytes          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         byte-offset-2         |         no. of bytes          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Encapsulated Packet ...                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

No. of hash parameters: This specifies the number of parameters in the packet that are used by the hash algorithm to calculate the egress port

Byte-offset-N: This is the offset to the start of the Nth parameter that is used by the hash algorithm to calculate the egress port

No. of bytes: For the byte-offset specified, the number of bytes starting at that offset that are used by the hash algorithm

Encapsulated Packet: The encapsulated packet received in the Flow Discovery Request packet on the input port by the device is returned in the response packet. This should be the packet that is used in the egress component link calculations by the device processing the Flow Discovery Request packet.
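
As an illustration of how an originator might interpret the byte-offset and "no. of bytes" pairs described above, the following minimal Python sketch extracts the hashed fields from a returned encapsulated packet. It assumes that offsets are counted from the first byte of the encapsulated packet, which this document does not state explicitly. The function name hash_input_fields and the sample IPv4 header are hypothetical and serve only as an example.

```python
from typing import List, Tuple


def hash_input_fields(packet: bytes, params: List[Tuple[int, int]]) -> List[bytes]:
    """Return the byte ranges of `packet` selected by (byte_offset, no_of_bytes) pairs.

    Assumption: offsets are relative to the start of the encapsulated packet.
    """
    fields = []
    for offset, length in params:
        # Reject pairs that point outside the packet instead of silently truncating.
        if offset < 0 or length <= 0 or offset + length > len(packet):
            raise ValueError(f"hash parameter ({offset}, {length}) is out of range")
        fields.append(packet[offset:offset + length])
    return fields


# Hypothetical 20-byte IPv4 header: source 10.0.0.1, destination 10.0.0.2.
sample_ipv4 = bytes([
    0x45, 0x00, 0x00, 0x14,  # version/IHL, TOS, total length
    0x00, 0x00, 0x40, 0x00,  # identification, flags (DF set), fragment offset
    0x40, 0x11, 0x00, 0x00,  # TTL, protocol (UDP), checksum (left zero here)
    10, 0, 0, 1,             # source address at offset 12
    10, 0, 0, 2,             # destination address at offset 16
])

# A device hashing on source and destination address would report two parameters.
selected = hash_input_fields(sample_ipv4, [(12, 4), (16, 4)])
```

With the fields recovered this way, an operator can see which parts of the flow actually influence the egress choice at that hop, even though the hash function itself is not disclosed by the TLV.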

If timestamps are requested, they are encoded as shown below.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |           Sub-Code            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Length             |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                      Incoming Timestamp                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                      Outgoing Timestamp                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Code: 3 Global information

Sub-Code: 0 Timestamps (incoming and outgoing)

Timestamp size: 64 bits

Incoming Timestamp: The timestamp corresponding to the time at which the packet was received

Outgoing Timestamp: The timestamp corresponding to the time at which the packet was forwarded to the next device

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |           Sub-Code            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Length             |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Address-type  |             Next Hop Address ...              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The next hop address is encoded as shown above.

Code: 3 Global information

Sub-Code: 1 Next Hop Address

Address-type:
   0x1: IPv4 Address
   0x2: IPv6 Address

Next Hop Address: This field carries the next hop address.

3.3.2. Result TLV

The device processing the Flow Discovery Request packet includes a Result TLV in the response to the originator device to indicate the result of the processing. This TLV is mandatory.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Result Code  |   Sub-code    |       Diagnostic Data..       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              Diagnostic Data...               |    padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type: 7 Result TLV

Result Code: This field carries a value indicating the result of the processing of the Flow Discovery Request packet

Sub-Code: This field further qualifies the "Result Code" field and provides more information about the result of processing the Flow Discovery Request packet

Diagnostic Data: This field is used in conjunction with the "Result Code" and "Sub-code" to return any information that may be useful to the originator of the Flow Discovery Request packet. Its format is defined based on the "Result Code" and "Sub-code" field.

Result Code: 1 Success
   Result Sub-code: 0

Result Code: 2 Authentication Failure
   Result Sub-code: 0

Result Code: 3 Administratively disabled
   Result Sub-code: 0
   Diagnostic Data: A list of Information Request Sub-Codes that are not being fulfilled.

Result Code: 4 Routing failure
   Result Sub-code: 1 No route in table
   Result Sub-code: 2 RPF check failed

Result Code: 5 Packet Error
   Result Sub-code: 1 hopcount = 0

Result Code: 6 Malformed packet
   Result Sub-code: 1 Unknown TLVs
   (TBD: More sub-codes to identify the type of error in the packets may need to be defined)

Result Code: 7 Data-path Error
   Result Sub-code: 1 Fragmentation needed but not allowed by Flow Information TLV in Flow Discovery Request packet
   (TBD: Sub-codes to identify the type of error in the TLV may need to be defined)

Result Code: 8 Generic Error
   Result Sub-code: 0
   (TBD: Sub-codes to identify the type of error may need to be defined)

3.3.3. Additional Informational Code TLV

This TLV may accompany the Result TLV if the device processing the Flow Discovery Request packet has any additional information that the originator device may be interested in. This TLV is optional.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Status Code  |   Sub-code    |       Additional Data..       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Additional Data...                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type: 8 Additional Informational Code

Status Code: 1 ACL drop
   Status Sub-code: 1 Ingress ACL drop
   Status Sub-code: 2 Egress ACL drop

Status Code: 2 Dataplane failure
   Status Sub-code: 1 Switch fabric failure
   Status Sub-code: 2 Linecard failure
   Status Sub-code: 3 Port failure

Status Code: 3 Generic Information
   Status Sub-code: 1 TTL/Hopcount mismatch noticed
   Status Sub-code: 2 Default route used to forward packet

In case of a TTL/Hopcount mismatch, the "Additional Data" field carries the difference between the Hopcount and the IP TTL field values. This may provide an indication of the number of previous hop routers that did not support the TraceFlow protocol.

3.4. TLVs common to Flow Discovery Request and Response

3.4.1. Encapsulated Packet TLV

This TLV is included in the Flow Discovery Request and is returned in the Flow Discovery Response packet by devices processing the request packet. In the response packet, this TLV contains the encapsulated packet as it was received from the previous-hop device. It helps the originator keep track of how the data packet gets modified along the way. This TLV is mandatory.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Flags     |   First Hdr   |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Encapsulated Packet...                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type: 1 Flow Discovery Request

Code: 1 Encapsulated traffic flow data packet

Encapsulated Packet: The first 256 bytes of a data packet belonging to the flow are encapsulated in this field of the packet

Flags:
   0x1: fan-out option; if set, the transit node SHOULD forward the Flow Discovery Request packet to all possible egress links for the specified flow

First Hdr: Specifies the first header that appears in the encapsulated packet. The values defined by this document are:
   0x1: Layer 2 MAC Header
   0x2: IPv4 Header
   0x3: IPv6 Header
   0x4: MPLS Header

3.4.2. Encapsulated Packet Mask TLV

This TLV allows the operator to specify what portion of the encapsulated packet carries flow data and what portion is left unspecified. This allows the intermediate nodes to determine if they have enough information to calculate an egress interface on which to forward the Flow Discovery Request packet. If this TLV is omitted from the Flow Discovery Request packet, no portion of the packet is left unspecified and the transit device may use any of the fields to make the forwarding decision. This TLV is optional.

This TLV includes a sequence of tuples.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         No. of tuples         |         byte-offset-1         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         no. of bytes          |          byte-mask-1          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         byte-offset-2         |         no. of bytes          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          byte-mask-2          |              ...              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

No. of tuples: Total number of tuples carried in this TLV

Byte-offset: The byte offset for the field being specified

No. of bytes: The number of bytes from the byte-offset to consider

Mask: The mask to be applied to the bytes starting at the byte-offset.

3.4.3. Record Route TLV

This TLV is used to record information about the path taken by a Flow Discovery Request packet as it traverses the network. It is included by the originator, and each transit device processing the Flow Discovery Request packet adds information about its incoming interface to this TLV. This TLV is included in the response sent by the transit nodes (in traceroute mode) to the originator of the Flow Discovery Request packet. This TLV is optional. However, if it is included by the originator node in the Flow Discovery Request packet, the subsequent nodes SHOULD prepend their incoming interface address to the list of addresses.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Address-type  |         Incoming interface Address...         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type: 9 Record-Route TLV

Code: Not defined. This SHOULD be set to 0.

Address-type:
   0x1: IPv4 Address
   0x2: IPv6 Address

Incoming interface Address: This field carries the incoming interface address at the device processing the Flow Discovery Request packet. Each node receiving the request packet with this TLV should prepend its incoming interface address to this TLV.

The device SHOULD include the Record-Route TLV, as received on its input interface, in the Flow Discovery Response packet it sends out.

3.4.4. Authentication TLV

This TLV is used by any device sending or receiving TraceFlow packets to identify itself and validate its credentials.
The TraceFlow protocol uses a shared secret for authenticating the peer's identity. The hash is calculated by appending the shared secret to the Flow Discovery Request packet.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Code | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Value... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type: 10 Authentication TLV

Code:
1 NULL
2 HMAC MD5 Digest
3 HMAC SHA1 Digest

4. Protocol Operation

A Flow Discovery Request packet is a UDP packet addressed to a well-known destination port. The source UDP port in the packet is ephemeral. The packet carries a "Flow Descriptor" TLV that allows the originator of the request to encode a flow data packet in the TLV.

Certain fields in a traffic flow data packet get modified by the transit devices as the data packet traverses the network. A transit device that processes a Flow Discovery Request packet would need to edit those fields in the encapsulated data packet that represents the flow. Examples of such fields are the source and destination MAC addresses and the MPLS label stack. Consider a transit device that uses the source or destination MAC address of a data packet in order to determine the egress port. The transit device could choose to pick up the MAC addresses from the external header of the Flow Discovery Request packet or from the encapsulated packet.

TraceFlow can operate in two separate modes:

1) Trace-route mode: In the trace-route mode of operation, each transit device and the end node respond to the Flow Discovery Request packet by sending a Flow Discovery Response.

2) Ping mode: Transit nodes do not send a response message to the originator. The rest of the behavior is the same as in trace-route mode.
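The difference between the two modes can be sketched in a few lines of Python. The names Mode and should_respond are illustrative assumptions and are not defined by this document; the sketch only encodes the rule stated above, namely that transit devices respond in trace-route mode but not in Ping mode, while the end node responds in both.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical labels for the two TraceFlow operating modes."""
    TRACE_ROUTE = 1
    PING = 2


def should_respond(mode: Mode, is_end_node: bool) -> bool:
    """Decide whether a node sends a Flow Discovery Response.

    The end node responds in both modes; a transit node responds
    only in trace-route mode.
    """
    if is_end_node:
        return True
    return mode is Mode.TRACE_ROUTE


if __name__ == "__main__":
    for mode in Mode:
        for is_end_node in (False, True):
            print(mode.name, is_end_node, should_respond(mode, is_end_node))
```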
The destination address of the Flow Discovery Request packet is the destination address for the desired traffic flow. In Ping mode a separate TLV (the Termination TLV) may be included that specifies a list of addresses. If a device processing the Flow Discovery Request packet notices that one of its IP addresses matches one of the addresses specified in the Termination TLV, then the device MUST NOT forward the Flow Discovery Request packet further; instead, it sends a response packet to the originator.

The Flow Discovery Request packet travels the exact same path that a data packet for the specified traffic flow would have followed. This includes the exact physical or logical interface that belongs to a LAG or a set of ECMP paths.

The device interested in receiving information about the traffic flow originates a Flow Discovery Request packet. The Flow Descriptor TLV in this packet specifies the flow of interest, whereas an Information Request bitmap TLV specifies the flow-related information that the originator device is requesting from each transit router. The Flow Discovery Request packet needs to be processed by all routers along the path to the destination. This can be achieved by using a well-known UDP port as the destination port in the UDP header.

When a transit device receives a Flow Discovery Request packet, it reads the flow information from the Flow Descriptor TLV, looks up the local forwarding database(s) and determines an egress port or ports for this traffic flow. The transit device forwards the Flow Discovery Request packet along the egress port calculated using this lookup. The egress port is calculated based on the flow information from the Flow Descriptor TLV in the request packet and not based on the destination IP address in the IP header of the Flow Discovery Request packet.
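The following Python sketch illustrates the point that the egress member is chosen from the flow carried in the Flow Descriptor TLV rather than from the outer header of the request. It is a minimal sketch under stated assumptions: the FlowKey fields, the select_egress name and the use of CRC32 as the hash are all hypothetical, since this document does not define a hashing algorithm and real devices use implementation-specific hashes.

```python
import zlib
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FlowKey:
    """Hypothetical flow fields parsed from the ENCAPSULATED packet."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def select_egress(flow: FlowKey, members: Sequence[str]) -> str:
    """Pick one LAG/ECMP member for the encapsulated flow.

    Only the encapsulated flow is hashed; the outer IP/UDP header of
    the Flow Discovery Request is deliberately not an input.
    CRC32 is a stand-in for a device-specific hash (assumption).
    """
    if not members:
        raise ValueError("no egress members available")
    key = "|".join(
        str(field)
        for field in (flow.src_ip, flow.dst_ip, flow.protocol,
                      flow.src_port, flow.dst_port)
    )
    return members[zlib.crc32(key.encode("ascii")) % len(members)]


if __name__ == "__main__":
    flow = FlowKey("192.0.2.1", "198.51.100.7", 6, 49152, 443)
    print(select_egress(flow, ["port-1", "port-2", "port-3", "port-4"]))
```

Because the function is deterministic for a given flow and member list, the request and the data packets of the same flow would map to the same member, which is the property the text above relies on.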
When processing the Flow Discovery Request packet, the transit node MUST consider the packet length specified in the encapsulated packet in the Flow Descriptor TLV.

The transit device also gathers the relevant information for the flow, which could include details such as:

1. Incoming and outgoing interface related details such as ifIndex, IP Address, packet count, packet rate, error count, LAG and ECMP related information

2. Specified flow related statistics

3. Next-Hop Router information

The transit device processing the Flow Discovery Request packet may choose to respond with only a subset of the information requested in the Flow Discovery Request packet.

The transit device includes additional information related to the incoming or outgoing LAG or ECMP interface. This additional information includes details about the percentage utilization on the component links of the LAG, the number of LAG or ECMP links that are configured and their operational status, and the parameters included in the hashing algorithm that is used to select an egress port for the traffic flow. This information is sent back to the IP address specified as the Originator IP Address in the Flow Discovery Request packet.

The Flow Discovery Request packet includes a hop count field which is initialized to the same value as the IP header's TTL field. This hop count field is decremented by one at each intermediate hop router that processes the Flow Discovery Request. In conjunction with the TTL field in the IP header, this hop count field can help determine if there are any intermediate routers that do not support the TraceFlow protocol. When an intermediate hop router detects that the hop count field is greater than the IP header TTL field, it indicates that one or more previous hop routers do not support the TraceFlow protocol. This information is added to the response sent to the Originator IP Address.
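The hop count comparison can be illustrated with a short Python sketch. The function names are hypothetical, and the sketch assumes that both fields are compared as received, before the local device decrements either of them; this document does not state at which point the comparison is made.

```python
from typing import List, Sequence, Tuple


def legacy_hops_detected(hop_count: int, ip_ttl: int) -> int:
    """Return how many previous hops apparently did not process the request.

    A hop count greater than the IP TTL means that some routers
    decremented the TTL without decrementing the hop count.
    """
    if hop_count < 0 or ip_ttl < 0:
        raise ValueError("hop count and TTL must be non-negative")
    return max(hop_count - ip_ttl, 0)


def observed_on_arrival(supports_traceflow: Sequence[bool],
                        initial: int = 64) -> List[Tuple[int, int]]:
    """Simulate a path and return (hop_count, ttl) as seen on arrival
    at each TraceFlow-capable router (assumption: every router
    decrements the TTL when forwarding; only TraceFlow-capable
    routers also decrement the hop count)."""
    ttl = hop_count = initial
    seen = []
    for supports in supports_traceflow:
        if supports:
            seen.append((hop_count, ttl))
            hop_count -= 1
        ttl -= 1
    return seen


if __name__ == "__main__":
    # Path: TraceFlow router, legacy router, TraceFlow router.
    for hop_count, ttl in observed_on_arrival([True, False, True]):
        # First router sees (64, 64) -> 0; last router sees (63, 62) -> 1.
        print(hop_count, ttl, legacy_hops_detected(hop_count, ttl))
```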
The IP TTL field as well as the hop count field SHOULD be initialized to values that limit the Flow Discovery Request packet to the desired network boundary.

A router can originate periodic Flow Discovery Requests for a traffic flow. The Query ID field in the Flow Discovery Request packet helps the originator identify the responses from the transit routers as they process the request.

The TraceFlow protocol provides options to choose no authentication, MD5 based authentication or SHA1 based authentication. Refer to Appendix A.12 for more information on the use of the Authentication TLV.

When processing a Flow Discovery Request packet at a device along the path towards the destination, the device may encounter an error condition and be unable to continue processing the packet. Some examples of such error conditions are:

1. Authentication failure

2. TraceFlow protocol has been administratively disabled

3. Unicast RPF check failed for the flow specified in the Flow Discovery Request packet

4. No route exists in the routing table to route the flow specified in the Flow Descriptor TLV.

5. IP TTL or the Hop Count field in the Flow Discovery Request packet becomes zero.

The "Result TLV" is used to carry this information back to the originator of the Flow Discovery Request packet.

It is also possible that the device is able to successfully process the Flow Discovery Request packet but encounters a condition during the processing that may be of interest to the originator. Some examples of such conditions are:

1. The flow specified in the Flow Descriptor TLV would be dropped due to Ingress ACL or Egress ACL policies

2. Dataplane failure may prevent the specified flow from being successfully switched/routed.

3. IP TTL and the Hop Count field in the Flow Discovery Request packet do not match, possibly due to one or more previous hop routers not supporting the TraceFlow protocol.

4.
The specified flow would be routed using the default route in the routing table.

This information is returned to the originator of the Flow Discovery Request packet using the "Additional Information Code TLV".

The originator of the Flow Discovery Request packet may set the fan-out bit in the Flow Descriptor TLV to request the transit node to forward the request packet through all possible egress ports for the specified flow. The transit device would process the Flow Discovery Request packet as described above and forward it out of all possible egress ports in multipath scenarios. If the fan-out option is selected, only the Flow Discovery Request packet received on the primary port of the LAG interface is forwarded. This helps reduce the number of redundant request packets generated as a result of the fan-out behavior. The originator of a request packet with the fan-out option enabled may get redundant responses in certain circumstances.

The Flow Discovery Request packet could pass through the Layer 2 or MPLS routed segments along the path in pass-through mode as data packets. The appendix discusses the possibility of extending the TraceFlow protocol to allow the devices in the Layer 2 and MPLS segments along the path of the traffic flow to respond to the Flow Discovery Request packet.

The discussion so far has assumed that the Flow Discovery Request packet would originate on one device (say device A) and terminate on some other device (say device B). It is likely that a third device (say device C) would be interested in obtaining the flow related information for a flow traversing from device A to device B. In this case, device C sends a Flow Discovery Request packet to device A. The Flow Discovery Request type specified in the packet would indicate to device A that this is an indirect request from device C to obtain information relevant to the flow specified in the Flow Descriptor TLV.
Device A then generates a new Flow Discovery Request packet with the destination IP set to device B and the Originator IP Address set to device C. All transit routers that process this request would send their responses to device C. In such usage of the TraceFlow protocol it is strongly RECOMMENDED to include the Authentication TLV in the Flow Discovery Request packet to identify the device originating the request - device A in the above example. This would ensure that the request is authenticated at each hop between device B and device C.

4.1. Out-of-Sync Hardware

Since the TraceFlow packets are processed by the CPU, the information collected in protocol TLVs potentially reflects the forwarding state as seen by the CPU software. It is possible that the hardware forwarding entries that forward the actual flow through a hop do not match the information that the CPU software uses to fill in the TLVs. It is RECOMMENDED that the TLVs be filled, as far as possible, with information gathered directly by reading the hardware elements that are used in forwarding the flow. When this is not done, the implicit assumption is that the software implements synchronization and auditing techniques to keep hardware and software in sync, so that the information gathered in the TLVs reflects the exact forwarding treatment that the flow will encounter on a node. The technique for achieving this synchronization is outside the scope of this document.

5. Application Scenarios

This section discusses some of the applications of this proposal. The application scenarios can broadly be divided into the following categories:

1. Troubleshooting network failures

2. Performance monitoring

3. Network planning

5.1. Troubleshooting network failures

Several network monitoring tools provide us the capability to monitor the health of a network by polling information from the network devices (primarily through the use of SNMP).
They help us in detecting network failures, imminent failures or other anomalies in the network. For troubleshooting these failures, network operators typically rely initially on tools such as ping and traceroute. Unfortunately, these tools do not provide detailed information about the affected traffic flow, for a couple of reasons:

1. It is likely that ping and traceroute control packets follow a different path through the network compared to the traffic flow that is being investigated - for example when policy-based routing is in effect or when there are one or more ECMP segments along the path of the traffic flow.

2. Ping and traceroute do not provide us with details about the constituent members of a port-channel trunk through which the affected flow would have traversed.

Being able to trace the exact path that a particular flow might have taken through the network and obtain all relevant information about the hops along that path provides the network operator with enough information to troubleshoot a network failure quickly.

For example, consider a traffic flow that is reported to be experiencing a certain percentage of packet loss. The network operator originates a Flow Discovery Request packet for that flow. The information returned by the intermediate devices for the specified traffic flow shows that, on a certain intermediate device, some of the members of a LAG that is supposed to be carrying the traffic flow are reporting high bandwidth utilization. This could be due to uneven load balancing among the LAG members. The network operator should be able to correlate this observation with the reported problem symptoms and reach a conclusion much faster than the conventional techniques would allow.
By setting the fan-out bit in the Flow Descriptor TLV, the operator should be able to determine all possible paths through the network that traffic to a particular destination may take. Along with the paths, the operator should also be able to obtain information relevant to the traffic flow from transit devices along the paths. This might prove to be useful in troubleshooting certain types of network problems.

5.2. Performance monitoring

It is important to be able to measure the performance of the network after it is provisioned and is in production. Traffic monitoring tools provide us with information about the traffic rates throughout the network, allowing us to determine if the traffic is unevenly distributed across network segments. However, the conventional techniques have a couple of shortcomings:

1. Even though a port-channel trunk as a whole may display normal traffic levels, it is likely that a constituent member is over-subscribed and may be dropping traffic.

2. For a given traffic flow it is difficult to determine the end-to-end performance throughout a network.

The TraceFlow protocol allows us to monitor the link utilization of the constituent members of a LAG or an ECMP path segment. Being able to monitor several network parameters for each individual flow with the same ease of use as is associated with ping and traceroute would help the network operator monitor the network's performance better.

5.3. Network planning

In the network planning and configuration stage, it would be useful to see how the network would behave once it is in production.

1. It would be useful to see if the typical traffic flow patterns get evenly distributed among the constituent members of a LAG or an ECMP path along the route to the destination.

2.
It would be useful to see whether the traffic gets appropriately queued and buffered and that no network resource is unintentionally over-subscribed.

3. It would be useful to determine that the network access-lists are properly configured and the traffic would not get blocked inadvertently by an access-list somewhere.

Typically the issues listed above are discovered once the network is in production. Having the ability to exercise the traffic flow's data path before it starts handling production traffic would help the operator to:

1. Rectify any configuration issues such as ACL policies.

2. Modify the hashing algorithms to evenly distribute traffic among the constituent members of a port-channel trunk or an ECMP path

Note that this application of the TraceFlow protocol may not be relevant to all types of networks. Campus networks, enterprise networks and datacenters with well-defined traffic flow patterns may benefit from the capability to detect the above problems. However, for tier 1 providers this application of TraceFlow has limited relevance as the traffic flows are not well-defined.

The operator may use the fan-out bit in the Flow Descriptor TLV to request the transit devices to provide all the paths that a traffic flow to a certain destination address would take. This allows the operator to validate the ECMP or LAG configuration in the network.

6. Security Considerations

This section discusses threats to which TraceFlow might be vulnerable and discusses means by which those threats might be mitigated.

There is a concern that this protocol might allow an external user to probe the detailed path that a flow takes through a network. The TraceFlow protocol supports an Authentication TLV that allows each intermediate device to authenticate the originator. It is strongly RECOMMENDED that the Authentication TLV be included in all TraceFlow packets.
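As an illustration of how a device might generate and verify the digest carried in the Authentication TLV, the following Python sketch uses the standard hmac module. It is a sketch under stated assumptions, not a definition of the wire procedure: the function names are hypothetical, the digest is computed over the packet bytes with the Authentication TLV excluded (as described in Appendix A.12), and the standard HMAC construction is assumed because the TLV codes in Section 3.4.4 name HMAC MD5 and HMAC SHA1; this document does not spell out the exact computation.

```python
import hashlib
import hmac

# Authentication TLV codes as listed in Section 3.4.4.
AUTH_CODE_NULL = 1
AUTH_CODE_HMAC_MD5 = 2
AUTH_CODE_HMAC_SHA1 = 3

_DIGESTS = {
    AUTH_CODE_HMAC_MD5: hashlib.md5,
    AUTH_CODE_HMAC_SHA1: hashlib.sha1,
}


def compute_digest(code: int, packet_without_auth_tlv: bytes,
                   secret: bytes) -> bytes:
    """Compute the Authentication TLV value for a packet.

    The caller passes the packet bytes with the Authentication TLV
    left out (assumption based on Appendix A.12).
    """
    if code == AUTH_CODE_NULL:
        return b""
    if code not in _DIGESTS:
        raise ValueError(f"unknown authentication code: {code}")
    return hmac.new(secret, packet_without_auth_tlv, _DIGESTS[code]).digest()


def verify_digest(code: int, packet_without_auth_tlv: bytes,
                  secret: bytes, received: bytes) -> bool:
    """Check a received digest using a constant-time comparison."""
    expected = compute_digest(code, packet_without_auth_tlv, secret)
    return hmac.compare_digest(expected, received)


if __name__ == "__main__":
    secret = b"example-shared-secret"   # placeholder value
    packet = b"example request bytes"   # placeholder value
    digest = compute_digest(AUTH_CODE_HMAC_SHA1, packet, secret)
    print(verify_digest(AUTH_CODE_HMAC_SHA1, packet, secret, digest))
```

A receiving device that fails this check would report the "Authentication failure" error condition listed in Section 4.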
The network operator can associate different access levels with the different types of information that are included in the response to a Flow Discovery Request packet. For example, only the "Next Hop Router" may be marked as publicly accessible information whereas everything else may be marked as private information. On receiving a Flow Discovery Request packet originating outside the local network, only the publicly accessible information is included in the response to the originator. However, if the request was originated locally, the device includes all requested information in the response. Devices belonging to the same network may be configured with a shared secret that allows them to authenticate each other through the use of the Authentication TLV. Refer to Appendix A.12 for more information on the Authentication TLV. Moreover, interfaces on a device may be marked as internal or external through the use of per network subnet configuration.

The Result TLV and Additional Information Code TLV provide detailed information about the processing of the Flow Discovery Request packet and may possibly leak information about the locally configured policies. The amount of information to be included in these TLVs should also depend on whether the request was originated externally or internally. The network operator may choose to silently drop the Flow Discovery Request packet without providing any indication of the reason for doing so if the request was originated externally.

It is also possible to encrypt the responses back to the originator using a shared secret mechanism. This will ensure that even when, for whatever reason, the responses are directed outside of an administrative domain, the network sensitive information remains secure.

Today most network operators throttle conventional OAM traffic (for example, ping and traceroute) that is serviced by the device to protect against Denial-of-Service attacks.
Such mechanisms should be employed for TraceFlow packets for the same reason.

7. IANA Considerations

The TraceFlow protocol needs a UDP port assignment to be used as the destination port in the TraceFlow packets.

8. Contributors

This document is a result of discussions amongst the authors with inputs and suggestions from Shane Amante.

This document was prepared using 2-Word-v2.0.template.dot.

APPENDIX A:

A.1. Encapsulation Format Choices

A.1.1. Carrying a separate Flow Descriptor TLV inside the Flow Discovery Request packet

This is the approach selected for this proposal. In order to specify a flow, the originating device encapsulates the entire data packet belonging to the traffic flow of interest in the Flow Descriptor TLV. If a traffic flow data packet is not readily available, the operator may have to generate a data packet with the traffic flow information available and encapsulate that in the Flow Descriptor TLV. Future revisions of this document may update the Flow Descriptor TLV if there is a need to allow the Flow Descriptor TLV to carry individual flow parameters (such as the Source IP Address, Destination IP Address, UDP/TCP Port numbers, etc.) in sub-TLV format rather than using an encapsulated data packet.

A.1.2. Using the traffic flow's parameter values in the external header to encapsulate the Flow Discovery Request packet

This approach involves using the traffic flow's header as the outer header of the Flow Discovery Request packet. This ensures that the Flow Discovery Request packet would take the same path as the traffic flow would have. We could use a Layer 2 EtherType to differentiate between this OAM packet and the data packets belonging to the traffic flow. This approach was not selected due to the added requirement on the intermediate devices to process a new EtherType, which might be limited by hardware.
Moreover, it is likely that the OAM packet would have to make a stop at the intermediate device anyway in order to gather the relevant information for the traffic flow specified. If the Flow Discovery Request packet does not use a special EtherType, it would be difficult for the network operator to filter these OAM packets as they would be indistinguishable from the traffic flow. Moreover, such TraceFlow OAM packets may be considered as 'spoofed' packets. Even though this approach is not being selected for the TraceFlow protocol in this document, it could help the TraceFlow protocol support certain networks with legacy devices (not supporting TraceFlow). This approach may be reconsidered in future revisions of this document.

A.2. Layer 4 Protocol Choices and Router Alert option

A.2.1. UDP Encapsulation

This approach has been selected in this proposal. The TraceFlow packets are UDP packets with a well-known destination port number (to be requested from IANA).

A.2.2. ICMP Encapsulation

This approach involves sending TraceFlow packets as ICMP packets. This was not selected in this proposal due to the simplicity of the UDP approach.

A.3. Legacy Devices (Not supporting TraceFlow)

It is necessary that the entire flow information available through the encapsulated packet in the Flow Discovery Request packet be used in determining the egress port. If the Flow Discovery Request packet reaches a legacy device that does not support TraceFlow, it is likely that the request packet gets forwarded along a different egress link compared to the egress link through which the data packets belonging to the traffic flow would have been forwarded. Hence the information received from the transit routers beyond the legacy device in a TraceFlow probe may not be useful.
Typically, if the legacy device does not employ LAGs or ECMP paths or policy-based routing, the TraceFlow packet may proceed in the direction that the traffic flow would have taken, and subsequent transit nodes may still be able to provide useful and relevant information to the originator of the Flow Discovery Request packet.

A.4. TTL Scoping

Conventional traceroute employs TTL Scoping as a means to determine the path followed by destination address based hop-by-hop routing of a packet. The TraceFlow protocol does not employ TTL Scoping in the current specification. However, using TraceFlow with TTL Scoping has certain applications in networks that contain some legacy devices that do not support TraceFlow. This may be explored in future revisions of this document if there is interest in the community to solve this problem.

An implementation may allow the operator to send out the TraceFlow packets with TTL Scoping just like conventional traceroute. In such a mode the following points should be noted:

1) The originator node may receive multiple packets from the transit nodes - an ICMP 'TTL Expired' packet and a TraceFlow response packet

2) In this mode, the transit devices SHOULD send out the TraceFlow response packet only if the TTL has also expired for that Flow Discovery Request packet on that device. This is needed to prevent duplicate Flow Discovery Response packets from the transit node for each request packet that the originator device sends when performing TTL Scoping.

A.5. Additional Information in the Flow Discovery Response

This document lists the information that can be requested by the originator of the TraceFlow Flow Discovery Request packet and that may be included by the transit devices in their response. Future revisions of this document may modify this list based on the feedback from the community.
For example, the QoS related statistics and queue depth information may be included in the Flow Discovery Response packets for the traffic flow being investigated.

A.6. Choices for supporting remote TraceFlow requests

A.6.1. Terminating the request at the Proxy device and re-originating it

This approach was selected in this proposal. For indirect Flow Discovery Requests, the originating device sends the request to another proxy device that is the intended starting point for probing the flow and gathering relevant information about the flow. This proxy device receives the Flow Discovery Request packet, processes it and re-originates a Flow Discovery Request towards the destination of the flow.

A.6.2. Source-Routing the request through the Proxy device

This approach involves sending the Flow Discovery Request with the IP Source Routing option, which forces the packet to be received by the proxy device that is the intended starting point for probing the flow and gathering relevant information about the flow. It was not selected for this proposal.

A.7. Applicability to Multicast

Multicast networks have also evolved into more complex heterogeneous networks in recent years. These advancements place more burden on the multicast OAM tools employed by network operators. Troubleshooting network problems, monitoring network performance, and network planning and provisioning become difficult due to the gap between the complexity of the network and the capabilities of the OAM tools. Mtrace [4] has evolved into a useful OAM tool to address some of the problems faced in multicast networks. However, it does not address all the problems discussed in this document. We believe that the TraceFlow protocol can be extended to assist network operators with their multicast deployments. Specific mechanics of any such extensions may be defined in later versions of the draft.

A.8.
Applicability to Layer 2 networks

The Layer 2 devices in the path taken by the TraceFlow packets should be able to snoop on the higher layer headers in the packet to determine that it is a TraceFlow Flow Discovery Request packet. Most of the TraceFlow packet processing and operations discussed in this document should apply to the Layer 2 devices also. However, specific mechanics of any separate extensions necessary for Layer 2 networks may be defined in later versions of the draft.

A.9. Applicability to IPv6

The TraceFlow protocol described in this document should apply to IPv6 networks or IPv4-IPv6 dual stack networks with straightforward extensions. Specific mechanics of extensions to address IPv6 networks may be defined in later versions of the draft.

A.10. Applicability to MPLS

The current MPLS ping standard supports ping/traceroute between ingress and egress LSRs only. There is a need for a single probe that traces all types of hops, including MPLS LSRs, and this protocol can address that need. We intend to support both a pipe (pass-through) mode of trace, in which the entire MPLS LSP is treated as a single interface, and a uniform mode, in which every hop along the way is traced. Current MPLS ping does try to address the ECMP issue partially by providing a way to exercise individual ECMP paths along the way. Our protocol extends this to allow operators to exercise specific multipath components based on a specific user-defined flow. Specific mechanics will be defined in later versions of the draft.

A.11. Flow Discovery and Response packet fragmentation

It is highly RECOMMENDED that the network allow the Flow Discovery Request packet to travel through to the destination without fragmentation. The Flow Discovery Response packet that is originated by the transit devices processing the request packet may be fragmented on its way to the originator device.

A.12.
Authentication TLV

A shared secret key is configured on all devices attached to a common network. The key is used to generate and verify a message digest for all TraceFlow packets. The message digest is a one-way hash function of the TraceFlow packet and the shared secret key. The message digest is included in the Authentication TLV, which is included in the TraceFlow packet. This TLV is not considered a part of the TraceFlow packet for the purposes of calculating the message digest.

TraceFlow allows a choice of no authentication, MD5 based authentication and SHA1 based authentication using a shared secret key.

9. References

9.1. Normative References

[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

9.2. Informative References

[2] Postel, J., "Internet Control Message Protocol", RFC 792.

[3] NG-OAM Requirements, draft-amante-oam-ng-requirements-01.txt

[4] Asaeda H., Jinmei T., Fenner W., Casner S., "Mtrace Version 2: Traceroute Facility for IP Multicast", draft-ietf-mboned-mtrace-v2-01.txt

Authors' Addresses

Arun Viswanathan
Force10 Networks
350 Holger Way
San Jose, CA 95134
Email: arunv@force10networks.com

Subi Krishnamurthy
Force10 Networks
350 Holger Way
San Jose, CA 95134
Email: subi@force10networks.com

Rajeev Manur
Force10 Networks
350 Holger Way
San Jose, CA 95134
Email: rmanur@force10networks.com

Vishal Zinjuvadia
Force10 Networks
350 Holger Way
San Jose, CA 95134
Email: vzinjuvadia@force10networks.com

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does
it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Disclaimer of Validity

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

Copyright (C) The IETF Trust (2008).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

Acknowledgment

Funding for the RFC Editor function is currently provided by the Internet Society.