Real Time Flow Measurement Working Group
Internet-draft

S.W. Handelman
IBM
Hawthorne, NY USA

Nevil Brownlee
U of Auckland, NZ

Greg Ruth
GTE Laboratories, Inc
Waltham, MA USA

July 20, 1997
expires January 20, 1998

RTFM Working Group - New Attributes for Traffic Flow Measurement

1. Status of this Memo

This document is an Internet Draft. Internet Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six months, and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet Drafts as reference material or to cite them other than as "work in progress".

To learn the current status of any Internet Draft, please check the "1id-abstracts.txt" listing contained in the Internet Drafts shadow directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

This memo provides information for the Internet community. This memo does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

Handelman, Brownlee, Ruth [Page 1]
Internet-Draft July 20, 1997

2. Introduction

The Real-time Traffic Flow Measurement (RTFM) Working Group has developed a system for measuring and reporting information about traffic flows in the Internet. This document explores the definition of extensions to the flow measurements as currently defined in [1] and [5]. The new attributes described in this document will be useful for monitoring network performance and will expand the scope of RTFM beyond simple measurement of traffic rates. Performance attributes typically deal with throughput, packet loss, and delays. We will explore the methods by which RTFM can extract values from flows so as to measure these attributes.
We will also look at capturing information on jitter and congestion control.

The RTFM Working Group has defined the concept of a standardized meter which records flows from a traffic stream according to Rule Sets which are active in the meter [1]. Implementations of this meter have been done by Nevil Brownlee at the University of Auckland, NZ, and by Stephen Stibler and Sig Handelman at IBM in Hawthorne, NY, USA. The RTFM WG has also defined the Meter Reader Program, whose job is to fetch flow data from the Meter.

2.1 RTFM's Definition of Flows

The RTFM Meter architecture views a flow as a set of packets between two end-points (as defined by their source and destination attribute values), and as BI-DIRECTIONAL (i.e. the meter effectively monitors two sub-flows, one in each direction).

Reasons why RTFM flows are bi-directional:

- We are interested in understanding the behavior of sessions between end-points.

- We want to perform as much data reduction as possible, so as to reduce the amount of data to be retrieved from a remote meter.

- The endpoint attribute values (the "Address" and "Type" ones) are the same for both directions; storing them in bi-directional flows reduces the meter's memory demands.

2.2 RTFM's Current Definition of Flows and their Attributes

Flows, as described in the "Architecture" document [1], have the following properties:

a. They occur between two endpoints, specified as sets of attribute values in the meter's current rule set. A flow is completely identified by its set of endpoint attribute values.

b. Each flow may also have values for "computed" attributes (Class and Kind). These are directly derived from the endpoint attribute values.

c. A new flow is created when a packet is to be counted which is not classified by the Rule Set into an existing flow. The meter records the time when this new flow is created.

d. Attribute values in (a), (b) and (c) are set when the meter sees the first packet for the flow, and are never changed.

e. Each flow has a "LastTime" attribute, which indicates the time the meter last saw a packet for the flow.

f. Each flow has two packet and byte counters, one for each flow direction (Forward and Backward). These are updated as packets are observed by the meter.

g. ALL the attributes have (more or less) the same meaning for a variety of protocols: IPX, AppleTalk, DECnet and CLNS as well as TCP/IP.

Current flow attributes - as described above - fit very well into the SNMP data model. They are either static, or are continuously updated counters. They are NEVER reset. In this document they will be referred to as "old-style" attributes.

It is easy to add further "old-style" attributes, since they don't require any new features in the architecture. For example:

- Count of the number of "lost" packets (determined by watching sequence number fields for packets in each direction; only available for protocols which have sequence numbers).

- In the future, RTFM could coordinate directly with the Flow Label from the IPv6 header.

At the June, 1996 meeting of the RTFM WG in Montreal, Canada, a proposal was made to extend the work of the group to produce an Internet Draft "New Attributes for Traffic Flow Measurement". That proposal has brought forth this document.

The goal of this work is to produce a simple set of abstractions, which can be easily implemented and at the same time enhance the value of RTFM meters. This document also defines a method for organizing the flow abstractions to preserve the existing RTFM flow table.

2.3 RTFM Flows, Integrated Services, IPPM and Research in Flows

The concept of flows has been studied in various different contexts. For the purpose of extending RTFM, a starting point is the work of the Integrated Services WG.
We will measure quantities that are often set by Integrated Services configuration programs. We will look at the work of the Benchmarking / IP Performance Metrics Working Group, and also look at the work of Claffy, Braun and Polyzos [4]. We will demonstrate how RTFM can compute throughput, packet loss, and delays from flows.

An example of the use of capacity and performance information is found in "The Use of RSVP with IETF Integrated Services" [2]. RSVP's use of Integrated Services revolves around Token Bucket Rate, Token Bucket Size, Peak Data Rate, Minimum Policed Unit, Maximum Packet Size, and the Slack term. These are set by TSpec, ADspec and Flowspec (Integrated Services Keywords), and are used in configuration and operation of Integrated Services. RTFM could explicitly monitor Peak Data Rate, Minimum Policed Unit, Maximum Packet Size, and the Slack term. RTFM could infer details of the Token Bucket. We will develop measures to work with these service metrics.

RTFM will work with several traffic measurements identified by IPPM [3]. There are three broad areas in which RTFM is useful for IPPM:

- An RTFM meter could act as a passive device, gathering traffic and performance statistics at appropriate places in TCP/IP networks (server or client locations).

- RTFM could give detailed analyses of IPPM test flows that pass through the Network segment that RTFM is monitoring.

- RTFM could be used to identify the most-used paths in a network mesh, so that detailed IPPM work could be applied to those paths.

3. Flow Abstractions

Performance attributes include throughput, packet loss, delays, jitter, and congestion measures. RTFM will calculate these attributes in the form of extensions to the RTFM flow attributes according to three general classes:

o 'packet traces' - collections of individual packets in a flow or a segment of a flow

o 'aggregates' - statistics derived from the flow taken as a whole (e.g. mean rate, max packet size).
o 'series' - attributes that depend on more than one packet (e.g. inter-arrival times, short-term traffic rates).

The following sections suggest implementations for each of these classes of extensions.

As an introduction to flow abstractions one fact must be emphasized. Several of the measurements enumerated below can be implemented by a Meter Reader that is tied to the meter with instantaneous response and very high bandwidth. If the Meter Reader and Meter can be arranged in such a way, RTFM could collect Packet Traces with time stamps and provide them to the Meter Reader for further processing.

A more useful alternative is to have the meter calculate some flow statistics locally. This allows a looser coupling between the meter and Meter Reader. RTFM will create an 'extended attribute' depending upon settings in its Rule table. RTFM will not create any 'extended attribute' data without explicit instructions in the Rule table.

3.1 Attribute Types

The previous section described three different classes of attribute; this section considers what the types of these attributes could be.

Packet Traces (as described below) are a special case in that they are tables with each row containing a sequence of values, each of varying type. They are essentially 'compound objects,' and will not be considered further here.

Aggregate attributes are like the 'old-style' ones. Their types are

- Addresses, represented as byte strings (1 to 20 bytes long)
- Counters, represented as 64-bit unsigned integers
- Times, represented as 32-bit unsigned integers

Addresses are set when the first packet of a flow is observed. They do not change with time, and they are used as a key to find the flow's entry in the meter's flow table.

Counters are incremented for each packet, and are never reset. An analysis application can compute differences between readings of the counters, so as to determine rates for these attributes.
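The counter-difference calculation can be sketched as follows. This is only an illustration, not part of the RTFM architecture: the function names and the sample readings are hypothetical, and the modulo step assumes that a 64-bit counter wraps at most once between two readings.

```python
COUNTER_MODULUS = 2 ** 64  # counters are represented as 64-bit unsigned integers


def counter_delta(earlier: int, later: int) -> int:
    """Difference between two readings of a never-reset counter.

    Assumption: the counter wrapped around at most once between readings.
    """
    return (later - earlier) % COUNTER_MODULUS


def rate_per_second(earlier: int, later: int, interval_seconds: float) -> float:
    """Average rate over the interval between two readings."""
    return counter_delta(earlier, later) / interval_seconds


# Two hypothetical readings of a flow's forward octet counter, taken 300 s apart.
first_reading = 1_200_000
second_reading = 4_950_000
forward_octet_rate = rate_per_second(first_reading, second_reading, 300.0)
```

Because the meter never resets its counters, the analysis application needs to keep only the previous reading for each flow.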
For example, if we read flow data at five-minute intervals, we can calculate five-minute packet and byte rates for the flow's two directions.

Times - the FirstTime for a flow is set when its first packet is observed. LastTime is updated for every packet of the flow.

All the above types have the common feature that they are expressed as single values. At least some of the new attributes will require multiple values. If, for example, we are interested in inter-packet time intervals, we can compute an interval for every packet after the first. If we are interested in packet sizes, a new value is produced as each packet arrives. When it comes to storing this data we have two options:

- As a distribution, i.e. in an array of 'buckets.' This does not preserve the individual values; on the other hand, meter storage requirements are well-defined, as is the amount of data to be read from the meter.

- As a sequence of integers. This saves all the information, but does not fit well with the RTFM goal of doing as much data reduction as possible within the meter.

Studies which would be limited by the use of distributions might well use packet traces instead.

For most of RTFM's attributes, a 'distribution' (as described above) appears to be the most effective attribute type. A method of specifying the distribution parameters, and of encoding the distribution so that it can be easily read, is described in section 4.2.

3.2 Packet Traces

The simplest way of collecting a trace in the meter would be to have a new attribute called, say, "PacketTrace." This could be a table, with a column for each property of interest. For example, one could trace

- Arrival time (TimeTicks from SysUptime, or microseconds from FirstTime for the flow).
- Direction (Forward or Backward)
- Sequence number (for protocols with sequence numbers)
- Flags (for TCP at least).

To add a row to the table, we only need a rule which PushPkts the PacketTrace attribute.
To use this, one would write a rule set which selected out a small number of flows of interest, with a 'PushPkt PacketTrace' rule for each of them. A MaxTraceRows default value of 2000 would be enough to allow a Meter Reader to read 1-second ping traces every 10 minutes or so. More realistically, a MaxTraceRows of 500 would be enough for one-minute pings, read once each hour.

Note that packet traces are already implemented in the RMON MIB [6], in the Packet Capture Group; they are therefore a low priority for RTFM.

3.3 Aggregate Attributes

Performance aspects of flows are of interest in the case of a flow between a server and client. TCP and UDP flows yield equivalent performance data, with additional data available from TCP flows. The performance data found by this method define the flow capacity used by the individual flow, as experienced in the locale of the RTFM meter.

For both TCP/IP and UDP, RTFM's "old-style" flow attributes count the bytes and packets for packets which match the rule set for an individual flow. In addition to these totals, RTFM could calculate Packet size and Bit rate statistics. Bit rate statistics point to the throughput-related performance metrics.

Packet size - RTFM's packet flows can be examined to determine the maximum packet size found in a flow. This will give the Network Operator an indication of the MTU being used in a flow. It will also give an indication of the sensitivity to loss of a flow, since losing large packets causes more data to be retransmitted.

Short-term bit rate - The data could also be recorded as the maximum and minimum data rate of the flow, found over specific time periods during the lifetime of a flow; this is a special kind of 'distribution.' Bit rate could be used to define the throughput of a flow, and if the RTFM flow is defined to be the sum of all traffic in a network, one can find the throughput of the network.
If we are interested in '10-second' forward data rates, the meter might compute this for each flow of interest as follows:

- maintain an array of counters to hold the flow's 10-second data rate distribution.

- every 10 seconds, compute the 10-second octet count (the change in the flow's forward octet counter since the last saved copy), increment the corresponding counter in the distribution, and save a copy of the flow's forward octet counter.

To achieve this, the meter will have to keep a list of aggregate flows and the intervals at which they require processing. Careful programming will be needed to achieve this, but provided the meter is not asked to do it for very large numbers of flows, it should not be too difficult!

Note that aggregate attributes are a simple extension of the 'old-style' attributes; their values are never reset. For example, an array of counters could hold a '10-second bit rate' distribution. The counters continue to increase; a meter reader will collect their values at regular intervals, and an analysis application will compute and display distributions of the 10-second bit rate for each collection interval.

3.4 Series Attributes

The notion of series attributes is to keep simple statistics for measures that involve more than one packet. The attribute values would be stored in the meter as a distribution (see above).

TCP and UDP

Inter-arrival statistics - The Meter knows the time that it encounters each individual packet. Statistics can be kept to record the inter-arrival times of the packets, which would give an indication of the jitter found in the Flow.

TCP Only

Packet loss - RTFM can calculate packet loss performance metrics. This is an area for further study. TCP packets have byte sequence numbers and SYNs, FINs, and ACKs associated with them. RTFM could track the sequence numbers in the flows, and calculate the packet loss occurring in a flow, and thus we can develop a metric of lost packets and useful traffic.
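One possible way for a meter to watch sequence numbers in a single flow direction is sketched below. This is a rough illustration under stated assumptions rather than a defined RTFM method: the class and field names are hypothetical, sequence-number wrap-around and the sequence space consumed by SYN and FIN are ignored, and a passive meter cannot distinguish reordering from retransmission, so the two are counted together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SequenceTracker:
    """Hypothetical per-direction state for watching TCP byte sequence numbers."""

    next_expected: Optional[int] = None  # sequence number just past the highest byte seen
    payload_octets: int = 0  # all payload octets observed by the meter
    gap_octets: int = 0      # octets skipped when a segment starts beyond next_expected
    late_octets: int = 0     # octets arriving below next_expected (reordered or retransmitted)

    def observe(self, seq: int, length: int) -> None:
        """Account for one segment carrying `length` payload octets starting at `seq`."""
        if length == 0:
            return  # segments without payload (e.g. pure ACKs) are ignored here
        self.payload_octets += length
        end = seq + length
        if self.next_expected is None:
            self.next_expected = end  # first data segment seen for this direction
            return
        if seq > self.next_expected:
            # Segment starts ahead of what was expected: some octets were not seen.
            self.gap_octets += seq - self.next_expected
        elif seq < self.next_expected:
            # Segment covers octets below the highest sequence number already seen.
            self.late_octets += min(end, self.next_expected) - seq
        self.next_expected = max(self.next_expected, end)


tracker = SequenceTracker()
for seq, length in [(1000, 100), (1100, 100), (1300, 100), (1200, 100)]:
    tracker.observe(seq, length)
```

In this example the third segment opens a 100-octet gap and the fourth arrives late to fill it; an analysis application could compare such counts with the flow's total octets to estimate the proportion of useful traffic.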
Delay analysis - TCP flows could be examined for the timing between transmissions and ACKs, and thus we can get some measure of delay (one of the IPPM performance metrics). This assumes the forward and reverse packets are both visible to the meter. In the case of asymmetric flows, RTFM can be run on multiple paths, and with precise timing create packet traces, which can be compared at later times.

Subflow analysis - TCP flows, e.g. a Web server's httpd flows, actually contain many individual subflows. Given a well-known Web server WW and a client CC, RTFM would normally pick up an aggregation of all the flows of text, graphics, Java programs, etc. that are sent between WW and CC. By analyzing the sequence numbers, RTFM could estimate when each subflow occurs, and thus maintain statistics about the subflows on a network.

Congestion Analysis - In a TCP/IP flow we have information on the negotiation of Window sizes which are used by TCP/IP to control congestion. Well behaved flows honor these requests and in the vast majority of cases the sender will slow down and thus decrease its rate of injecting packets into the congested network. We will look for cases where flows do not honor these congestion controls and are not slowing down. We will also look for flows which have the "precedence" fields turned on and thus are aggressively competing for network resources.

3.5 Actions on Exceptions

The user of RTFM will have the ability to mark flows as having High Watermarks. The existence of abnormal service conditions, such as a non-ending flow or a flow that exceeds a given limit in traffic (e.g. a flow that is exhausting the capacity of the line that carries it), causes an ALERT to be sent to the Meter Reader for forwarding to the Manager. Operations Support may define service situations in many different environments. This is an area for further discussion on Alert and Trap handling.

4. Extensions to the 'Basic' RTFM Meter

4.1 Flow table extensions

The architecture of RTFM has defined the structure of flows, and this draft does not change that structure. The flow table could have ancillary tables called "Distribution Tables" and "Trace Tables"; these would contain rows of values and/or actions as defined above. Each entry in these tables would be marked with the number of its corresponding flow in the RTFM flow table.

In order to identify the data in a Packet Flow Table, the attribute name could be pushed into a string at the head of each row. For example, if a table entry has Bit Rates for a particular flow, the "BitRate" string would be found at the head of the row.

4.2 Specifying Distributions in RuleSets

At first sight it would seem necessary to add extra features to the RTFM Meter architecture to support distributions. This, however, is not necessarily the case.

What is actually needed is a way to specify, in a ruleset, the distribution parameters. These include the number of counters, the lower and upper bounds of the distribution, whether it is linear or logarithmic, and any other details (e.g. the time interval for short-term rate attributes).

Any attribute which is distribution-valued needs to be allocated a RuleAttributeNumber value. These will be chosen so as to extend the list already in the RTFM Meter MIB document [7].

Since distribution attributes are multi-valued it does not make sense to test them. This means that a PushPkt (or PushPkttoAct) action must be executed to add a new value to the distribution. The old-style attributes use the 'mask' field to specify which bits of the value are required, but again, this is not the case for distributions. Lastly, the MatchedValue ('value') field of a PushPkt rule is never used. Overall, therefore, the 'mask' and 'value' fields in the PushPkt rule are available to specify distribution parameters.
Both these fields are at least six bytes long, the size of a MAC address. All we have to do is specify how these bytes should be used! As a starting point, the following is proposed (bytes are numbered left-to-right).

Mask bytes:
   1    Transform      1 = linear, 2 = logarithmic
   2    Scale Factor   Power of 10 multiplier for Limits and Counts
   3-4  Lower Limit    Lowest value for first bucket
   5-6  Upper Limit    Highest value for last bucket

Value bytes:
   1-2  Buckets        Number of buckets. Does not include the 'overflows' bucket
   3-4  Parameter-1    Parameter use depends on
   5-6  Parameter-2    distribution attribute

For example:

   ToPacketSize & 1.0,1,1500 = 100,0,0: PushPkt, Next
   FromBitRate & 2.3,16,2048 = 7,5,0: PushPkt, Next

In these mask and value fields a dot indicates that the next number is a one-byte integer, and the commas indicate that the next number is a two-byte integer.

The first rule specifies that a distribution of packet sizes is to be built. It uses an array of 100 buckets, storing values from 1 to 1500 bytes (i.e. linear steps of 15 bytes each). Any packets with size greater than 1500 will be counted in the 'overflow' bucket, hence there are 101 counters for the distribution.

The second rule specifies a bit-rate distribution, with the rate being calculated every 5 seconds (parameter 1). A logarithmic array of 7 counters (and an overflow bucket) is used for rates from 16 kbps to 2048 kbps. The scale factor of 3 (i.e. a multiplier of 10 to the power 3) indicates that the limits are given in kilobits per second.

These distribution parameters will need to be stored in the meter so that they are available for building the distribution. They will also need to be read from the meter and saved together with the other flow data.

4.3 Reading Distributions

Since RTFM flows are bi-directional, each distribution-valued quantity (e.g. packet size, bit rate, etc.) will actually need two sets of counters, one for packets travelling in each direction.
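As an illustration of the parameter layout proposed in section 4.2, the sketch below decodes six-byte 'mask' and 'value' fields and maps a sample to a bucket. It is a minimal sketch, not a meter implementation. The names DistributionParams, decode_params and bucket_index are hypothetical. Several details the draft leaves open are assumed here: big-endian byte order for the two-byte fields, the scale factor applied to the limits only, samples at or below the lower limit counted in the first bucket, and logarithmic buckets of equal width in log space.

```python
import math
import struct
from dataclasses import dataclass

LINEAR, LOGARITHMIC = 1, 2  # Transform codes from the proposed mask layout


@dataclass(frozen=True)
class DistributionParams:
    transform: int
    lower: int    # lower limit, after applying the power-of-10 scale factor
    upper: int    # upper limit, after applying the power-of-10 scale factor
    buckets: int  # number of buckets, not counting the overflow bucket
    param1: int
    param2: int


def decode_params(mask: bytes, value: bytes) -> DistributionParams:
    """Unpack the first six bytes of a rule's mask and value fields (big-endian assumed)."""
    transform, scale, lower, upper = struct.unpack(">BBHH", mask[:6])
    buckets, param1, param2 = struct.unpack(">HHH", value[:6])
    factor = 10 ** scale
    return DistributionParams(transform, lower * factor, upper * factor,
                              buckets, param1, param2)


def bucket_index(p: DistributionParams, sample: int) -> int:
    """Index of the counter to increment; index p.buckets is the overflow bucket."""
    if sample > p.upper:
        return p.buckets
    if sample <= p.lower:
        return 0  # assumption: small samples are counted in the first bucket
    if p.transform == LINEAR:
        index = (sample - p.lower) * p.buckets // (p.upper - p.lower)
    else:
        index = int(p.buckets * math.log(sample / p.lower) / math.log(p.upper / p.lower))
    return min(index, p.buckets - 1)


# ToPacketSize & 1.0,1,1500 = 100,0,0
size_params = decode_params(bytes([1, 0, 0x00, 0x01, 0x05, 0xDC]),
                            bytes([0x00, 0x64, 0, 0, 0, 0]))
# FromBitRate & 2.3,16,2048 = 7,5,0
rate_params = decode_params(bytes([2, 3, 0x00, 0x10, 0x08, 0x00]),
                            bytes([0x00, 0x07, 0x00, 0x05, 0, 0]))
```

Since each distribution-valued quantity needs one set of counters per direction, a meter built along these lines would presumably keep two counter arrays of size buckets + 1, one for each direction, both indexed by the same function.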
It is tempting to regard these as components of a single 'distribution,' but in many cases only one of the two directions will be sensible; it seems better to keep them in separate distributions. This is similar to the old-style counter-valued attributes such as toOctets and fromOctets.

A distribution should be read by a meter reader as a single, structured object. The components of a distribution object are

- 'mask' and 'value' fields from the rule which created the distribution
- sequence of counters ('buckets' + overflow)

These could be easily collected into a BER-encoded octet string, and be read and referred to as a 'distribution.'

5. Extensions to the Rules Table

The Rules Table of "old-style" attributes will be extended for the new flow types. A list of actions and Keywords, such as "BitRate" for Bit Rate and "MaxPackSize" for Max Packet size, will be developed and used to inform RTFM to collect a set of extended values for a particular flow (or set of flows).

To begin with, here are ten possible distribution-valued attributes:

   ToPacketSize(61)            size of PDUs in bytes (i.e. number
   FromPacketSize(62)          of bytes actually transmitted)

   ToInterarrivalTime(63)      microseconds between successive packets
   FromInterarrivalTime(64)    travelling in the same direction

   ToTurnaroundTime(65)        microseconds between successive packets
   FromTurnaroundTime(66)      travelling in opposite directions

   ToBitRate(67)               short-term flow rate in bits per second
   FromBitRate(68)             Parameter 1 = rate interval in seconds

   ToPDURate(69)               short-term flow rate in PDUs per second
   FromPDURate(70)             Parameter 1 = rate interval in seconds

6. Security Considerations

The attributes considered in this document represent properties of traffic flows; they do not present any security issues in themselves. The attributes may, however, be used in measuring the behaviour of traffic flows, and the collected traffic flow data could be of considerable value.
Anyone making such measurements should have a clearly-defined purpose in doing so. They should also take great care to ensure that the data is properly stored, and is used solely for its intended purpose.

7. Acknowledgments

We thank Stephen Stibler of IBM for his input to, and comments on, this draft.

8. Authors' Addresses

Sig Handelman
IBM Research Division
Hawthorne, NY
Phone: 1-914-784-7626
E-mail: handel@watson.ibm.com

Nevil Brownlee
The University of Auckland
New Zealand
Phone: +64 9 373 7599 x8941
E-mail: n.brownlee@auckland.ac.nz

Greg Ruth
GTE Laboratories
Waltham, MA
Phone: 1 617 466 2448
E-mail: grr1@gte.com

9. References

[1] Brownlee, N., Mills, C., Ruth, G.: "Traffic Flow Measurement: Architecture," RFC 2063, 1997.

[2] Wroclawski, J.: "The Use of RSVP with IETF Integrated Services," Internet Draft, October 1996.

[3] Almes, G. et al: "Framework for IP Performance Metrics," Internet Draft, July 1996.

[4] Claffy, K., Braun, H-W, Polyzos, G.: "A Parameterizable Methodology for Internet Traffic Flow Profiling," IEEE Journal on Selected Areas in Communications, Vol. 13, No. 8, October 1995.

[5] Mills, C., Ruth, G.: "Internet Accounting Background," RFC 1272, 1992.

[6] Waldbusser, S.: "Remote Network Monitoring Management Information Base," RFC 1757, 1995, and RFC 2021, 1997.

[7] Brownlee, N.: "Traffic Flow Measurement: Meter MIB," RFC 2064, 1997.