Traffic Engineering Working Group                          Wai Sum Lai
Internet Draft                                               AT&T Labs
Document:                                                February 2001
Category: Informational

A Framework for Internet Traffic Measurement

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [1].

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

1. Abstract

This document is submitted in response to the call for contributions on the TEM (Traffic Engineering Measurement) category as described in the tewg charter. It is work in progress and proposes a measurement framework to support the traffic engineering of IP-based networks. Consideration for including this document as a tewg working-group item for further development is requested.

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [1].

3. Introduction

This document describes a framework for Internet traffic measurement, with the objective of providing principles and requirements for the development of a set of measurement systems to support the traffic engineering of IP-based networks [2].
A major goal is to provide guidance for establishing protocol-independent and platform-independent traffic measurement standards to achieve multi-vendor inter-operability. It is critical to minimize the possibilities of inconsistencies arising from, e.g., overlapping data collecting and processing at various protocol levels, due to the use of different measurement principles by different vendors. The initial scope is limited to those aspects of measurement pertaining to the intra-domain case, i.e., within a given autonomous system.

Lai                     Category - Expiration                 [Page 1]
Internet-Draft  Framework for Internet Traffic Measurement  Feb. 2001

In this document, the use of traffic measurement in traffic characterization, network monitoring, and traffic control is first described. Depending on the network operations to be performed in these tasks, three different time scales can be identified, ranging from months, through days or hours, to minutes or less. To support these operations, traffic measurement must be able to capture accurately, within a given confidence interval, the traffic variations and peaks without degrading network performance and without generating an immense amount of data. Therefore, specification of a suitable read-out period for each service class for traffic summarization is essential.

Traffic measurement can be performed on the basis of flows, interfaces, node-pairs, or paths. Based on these objects, different measurement entities can be defined, such as traffic volume, throughput, delay, delay variation, packet loss, average holding time, resource usage, and link bandwidth availability. Using these measured traffic data, in conjunction with other network data such as topological data and router configuration data, a traffic matrix and other relevant statistics can be derived for traffic engineering purposes.

Related work in this area includes [3], which proposes a functional architecture for measurement, and [4] on operational measurements.

4. Purposes of Traffic Measurement

Traffic measurement is used to collect traffic data for the following purposes:

Traffic characterization
- identifying traffic patterns, particularly traffic peak patterns, and their variations in statistical analysis
- determining traffic distributions in the network on the basis of flows, interfaces, node-pairs, paths, or destinations
- estimating the traffic load according to service classes in different routers and the network
- observing trends for traffic growth and forecasting traffic demands

Network monitoring
- determining the operational state of the network
- monitoring the continuity and quality of network services, to ensure that QoS/GoS objectives are met for various classes of traffic, to verify the performance of delivered services, or to serve as a means of sectionalizing performance issues seen by a customer
- evaluating the effectiveness of traffic engineering policies, or triggering certain policy-based actions upon threshold crossing

Traffic control
- adaptively optimizing network performance in response to network events, e.g., rerouting to work around congestion or failures
- supporting measurement-based admission control, i.e., predicting the future demands of the aggregate of existing flows so that admission decisions can be made on new flows

5. Time Scales for Network Operations

The information collected by traffic measurement can be provided to the end user or application either in real time or for record in non-real time, depending on the activities to be performed and the network actions to be taken. Traffic control will generally require real-time information. For network planning and capacity management as described below, information may be provided after the processing of raw data in non-real time.
Broadly speaking, three time scales can be distinguished, according to the use of observed traffic information for network operations [5].

Network planning
Information that changes on the order of months is used to make traffic forecasts as a basis for network extensions and long-term network configuration, that is, for planning the topology of the network, planning alternative routes to survive failures, or determining where capacity must be augmented in advance of projected traffic growth. Forecasting and planning may also lead to the introduction of new technology and architecture.

Capacity management
Information that changes on the order of days or hours is used to manage the deployed facilities and to take appropriate maintenance or engineering actions to optimize utilization. For example, new MPLS tunnels may be set up or existing tunnels modified while meeting Service Level Agreements. Also, load balancing may be performed, or traffic may be rerouted for re-optimization after a failure.

Real-time network control
Information that changes on the order of minutes or less is used to adapt to the current network conditions in near real time. Thus, to combat localized congestion, traffic management actions may perform temporary rerouting to redistribute the load. Upon detecting a failure, traffic may be diverted to pre-established, secondary routes until more optimized routes can be arranged.

6. Read-out Periods

A measurement infrastructure must be able to scale with the size and the speed of a network as it evolves. Hence, it is important to minimize the amount of data to be collected, and to condense the collected data by periodic summarization.
This prevents network performance from being adversely affected by excessive loading of router control processors, router memories, transmission facilities, and the administrative support systems.

A measurement interval is the time interval over which measurements are taken. Some traffic data must be collected continuously, while other data may be collected by sampling or on a scheduled basis. For example, peak loads and peak periods can be identified only by continuous measurement, as traffic typically fluctuates irregularly during the whole day. If traffic variations are regular and predictable, it may be possible to measure the expected normal load on predetermined portions of the day. This requires the definition of a busy period. Special studies on selected segments of the network may be conducted on a scheduled basis. Active measurement, with the involvement of the network operator, may be activated manually. For instance, active throughput measurement may be used to identify alternate paths during periods of network congestion.

A measurement interval consists of a sequence of consecutive read-out periods. Summarization is usually done by integrating the raw data over a pre-specified read-out period. The granularity of this period must be suitably chosen. It should be short enough to capture, with acceptable accuracy, the bursty nature of the traffic, i.e., the traffic variations and peaks. Since measurements place a load on the router, the read-out period should not be so short that router performance is degraded and a voluminous quantity of data is produced. Also, read-out may be started when the measured data exceeds a preset threshold, or when the space allocated for temporarily holding the data in a router is exhausted.

For a multi-service IP-based network, each service typically has its own traffic characteristics and performance objectives.
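
As an illustration of read-out summarization, the following Python sketch integrates raw octet counts over consecutive read-out periods, using a separate period for each service class. The class names, the period lengths, the layout of the sample tuples, and the function names summarize and peak_throughput_bps are assumptions made for illustration only; this document does not yet recommend specific values.

```python
from collections import defaultdict

# Hypothetical read-out periods, in seconds, per service class.
# These values are illustrative only and are not recommendations.
READ_OUT_PERIOD_S = {"voice": 60, "best-effort": 300}


def summarize(samples, interval_start, periods=READ_OUT_PERIOD_S):
    """Integrate raw octet counts over consecutive read-out periods.

    samples: iterable of (timestamp_s, service_class, octets) tuples.
    Returns {service_class: {period_index: total_octets}}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for timestamp, service_class, octets in samples:
        period = periods[service_class]
        # Index of the read-out period this sample falls into.
        index = int((timestamp - interval_start) // period)
        totals[service_class][index] += octets
    return {cls: dict(per_period) for cls, per_period in totals.items()}


def peak_throughput_bps(per_period, period_s):
    """Return (period_index, bits per second) for the busiest period."""
    index = max(per_period, key=per_period.get)
    return index, per_period[index] * 8 / period_s


# Made-up raw samples covering one measurement interval.
samples = [
    (0, "voice", 1500), (30, "voice", 1500), (61, "voice", 6000),
    (10, "best-effort", 40000), (290, "best-effort", 20000),
    (310, "best-effort", 15000),
]
summary = summarize(samples, interval_start=0)
print(summary)
print(peak_throughput_bps(summary["voice"], READ_OUT_PERIOD_S["voice"]))
```

In this sketch a shorter period for the hypothetical "voice" class exposes a peak that a 300-second period would average away, at the cost of more summary records per measurement interval.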
To ensure that service-specific features are reflected in the measurement process, different read-out periods may be needed for different classes of service.

(Note: This document should recommend some expected range of service-specific measurement intervals, read-out periods, and busy periods in a future version.)

7. Measurement Bases

Measurements can be classified on the basis of where, and at which level, the traffic data are gathered and aggregated. It is generally assumed that the measurements are taken at network elements such as routers; customer-based measurements are not considered in this document. Also, as far as possible, measurements should be collected by a network element without requiring coordination with other network elements.

Flow-based
This is conceptually similar to the call detail record (CDR) in telecommunication networks. It is primarily used on interfaces at access routers, edge routers, or aggregation routers where traffic originates or terminates, rather than on backbone routers in the core network. Like CDR measurements, flow-based records can be used to collect detailed information about a flow, such as source and destination IP addresses/port numbers, protocol, type of service, timestamps for the start and end of a flow, packet count, octet count, etc.

Interface-based, link-based
SNMP/RMON MIBs use passive monitoring to collect raw data on an interface at an edge or backbone router. This includes data such as counts of packets and octets sent/received, packet discards, and errored packets. (Link bundling is to be considered in the next version of this document.)

Node-pair-based
Active measurements by probing, as specified in the IPPM framework [6], can be conducted between each pair of major routing hubs for determining edge-to-edge performance of a core network.
A problem with this approach that needs to be accounted for is that routing may change among the multiple routes due to, e.g., changes in interdomain policies. This is further discussed in the section on Auxiliary Information.

Path-based
The ability of MPLS to use fixed preferred paths for routing traffic, so-called route pinning, gives the means to develop path-based measurements. This may enable the development of methodologies for such functions as admission control and performance verification of delivered service. (In this document, the term path specifically refers to an MPLS tunnel, or label-switched path.)

Currently, the first three measurement bases are already in use. However, path-based measurement capability remains to be developed.

8. Measurement Entities

A measurement entity defines what is measured: it is a quantity for which data collection must be performed with a certain measurement. A measurement type can be specified by a (meaningful) combination of a measurement entity with the measurement basis described in the previous section.

The following is a partial list of measurement entities. (A more complete list, with definitions and usage/applications, is to be provided in a future version.)

- Traffic volume (mean and variance for normal/high load)
- Throughput (in both bits per second and packets per second)
- Delay (e.g., cross-router delay may be used to measure queueing delay within a router)
- Delay variation
- Packet loss (e.g., excessive packet loss may be used as a means of fault detection)
- Average holding time (e.g., flow duration, duration of an MPLS tunnel)
- Resource usage, such as link/router utilization, buffer occupancy (e.g., fraction of arriving packets finding the buffer above a given set of thresholds)
- Available bandwidth of a link or path (useful for load balancing, and for measurement-based admission control to determine the feasibility of creating a new MPLS tunnel; real-time information can be used for dynamic establishment)

Further study is needed to determine the relevance of, and methods of measurement for, burst characterization and the probability of admission denial.

9. Auxiliary Information

Additional information such as topological data and router configuration data is usually needed to make use of raw measurement data. For example, an important set of data for traffic engineering is point-to-point or point-to-multipoint demands. Because destination-based routing/forwarding as used in OSPF does not provide a network operator with precise control of the paths used for traffic flows, it is not easy to obtain network-wide traffic demands from the local interface measurements taken by different IP routers. As explained in [7, 8], information from diverse network measurements and various configuration files is needed to infer the traffic volume. Based on flow-level measurements, reference [7] describes how to determine the traffic volume from an ingress link to a set of egress links by validating and joining various data sets together.

10. Report Generation

Data storage, data processing, statistics generation and reporting are outside the scope of this document.

11. Security Considerations

Security considerations are not addressed in this version of the draft.

12. References

1  S. Bradner, "Key words for use in RFCs to Indicate Requirement Levels," BCP 14, RFC 2119, March 1997.

2  D.O. Awduche, A. Chiu, A. Elwalid, I. Widjaja, and X. Xiao, "A Framework for Internet Traffic Engineering," Internet-Draft, Work in Progress, July 2000.

3  S. Van den Berghe, P. Vanheuven, P. Demeester, and H. Asgari, "Some Issues for Designing a Measurement Architecture for Traffic Engineered IP Networks," Internet-Draft, Work in Progress, February 2001.

4  B. Christian, B. Davies, and H. Tse, "Operational Measurements for Traffic Engineering," Internet-Draft, Work in Progress, July 2000.

5  G. Ash, "Traffic Engineering & QoS Methods for IP-, ATM-, & TDM-Based Networks," Internet-Draft, Work in Progress, December 2000.

6  V. Paxson, G. Almes, J. Mahdavi, and M. Mathis, "Framework for IP Performance Metrics," RFC 2330, May 1998.

7  A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford, and F. True, "Deriving Traffic Demands for Operational IP Networks: Methodology and Experience," Proc. ACM SIGCOMM 2000, Stockholm, Sweden.

8  A. Feldmann, A. Greenberg, C. Lund, N. Reingold, and J. Rexford, "NetScope: Traffic Engineering for IP Networks," IEEE Network, March/April 2000.

13. Acknowledgments

The support of Gerald Ash on this work and his comments are much appreciated.

14. Author's Addresses

Wai Sum Lai
AT&T Labs
Room D5-3D18
200 Laurel Avenue
Middletown, New Jersey 07748, USA
Phone: 732-420-3712
Email: wlai@att.com

Full Copyright Statement

Copyright (C) The Internet Society (date). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.