Internet Draft                                         Alban Couturier
Document: draft-couturier-nsis-measure-00.txt                  Alcatel
Expires: November 2003                                        May 2003

Signaling for QoS Measurement

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [1].

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

This document defines SQM (Signaling for QoS Measurement), an architecture for QoS measurements of IP flows along their paths. The goal of this architecture is to configure on demand several probes, and to collect and coordinate the results in a multi domain environment, in order to determine in real time the QoS experienced by a flow on several nodes of its path. A new signaling is used to install metering and reporting states in network nodes. A coordination of works related to NSIS, PSAMP and IPFIX could achieve the SQM specification.

Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [2].

Table of Contents

1. Introduction
2. Motivation and scope
3. Terminology
4. The SQM architecture
4.1 Observation point's density
4.2 One way signaling
4.3 SQM measurement state management
4.3.1 State installation alternatives
4.3.2 State installation algorithms
4.4 SQM reports and SQM Collectors
5. Measurement methods of SQM
5.1 Passive delay measure in SQM
5.2 Packet loss in SQM
5.3 Other measures
6. Formal Syntax for the SQM signaling
7. Remaining Work
8. Security issues
8.1 Confidence in reports
9. IANA Consideration
10. References
Acknowledgments
Author's Addresses

1. Introduction

With the development and deployment of end-to-end QoS solutions for IP flows and microflows, there will be a need for QoS measurements of flows. Expected applications using these measurement tools are usage-based accounting, QoS monitoring and troubleshooting.

Several IETF Working Groups currently work on measures. The IPFIX working group [3] aims at describing a standard way of exporting traffic flow information out of IPFIX boxes, such as routers.
The PSAMP working group defines a set of sampling capabilities of network elements, in order to report synthetic flow observations. The PSAMP framework [4] already proposes to use trajectory sampling to track customers' flow performance.

Continuing these efforts, this draft proposes the Signalling for QoS Measurement (SQM) architecture, which allows measurement in real time of the QoS experienced by a flow all along its path, possibly crossing several measurement domains.

In this version, the draft provides a high-level description of the key components and their functions. In addition, it gives the NSIS community an example of a signalling which is not dedicated to resource reservation, and a candidate for an NSIS application protocol.

In section 2, the motivation and scope of the SQM are given. In section 3, the SQM terminology is proposed, reusing as much as possible the existing PSAMP, IPFIX and NSIS terminology. In section 4, the SQM architecture is explained and discussed. In section 5, measurement methods of the SQM are described. In section 6, the formal syntax of the SQM signaling is addressed. In section 7, the remaining work is discussed.

2. Motivation and scope

Classical measurement models are commonly based on two-device architectures. For example in [5], delay is evaluated with active end-to-end measures. Usually, and following the same philosophy, measures in a core network are also done by comparing ingress and egress reports. Such measurement architectures determine end-to-end, or edge-to-edge, performance, but are not sufficient to determine the performance reached by the different parts of a network, or to determine which parts did not provide the expected QoS when the end-to-end QoS is incorrect.

Another problem of end-to-end performance architectures is that they require the ability to configure two edge devices and retrieve information from them.
In case the two probes are not located in the same ISP, this configuration is difficult, if not impossible, as administration of a node, typically through SNMP, is a critical operation that cannot be easily outsourced. This problem is illustrated in the following picture, where observation points are located next to (or integrated in) the first network edge router, in a transit network's border router and in the egress edge router of the last domain:

      +-------+
      | ISP A |
      |manager| ------->?
      +-------+
         / /
  snmp  | |
               ----              -----               ----
        |    /       \         /       \           /      \
      +----+/         \       /         +----+    /        +----+
+---+ |(OP)|         +---+  +---+       |(OP)|  +---+      |(OP)| +----+
|src|-|ER1 |  ISP A  |BR2|--|BR3|       | BR4|--|BR5|      |ER6 |-|sink|
+---+ +----+         +---+  +---+       +----+  +---+      +----+ +----+
             \       /         \       /           \      /
              \_____/           \_____/             \____/

OP: observation point
ER: edge router
BR: border router

This document defines the SQM, a signalling-based architecture for QoS measurements of IP flows. The goal of this architecture is to configure on demand several observation points, and to coordinate the results in a multi domain environment, in order to determine in real time the QoS experienced by a flow on several nodes of its path. This means that a flow is observed at several places it goes through, not only at its source and at its sink, or at a predefined pair of probes.

For example, the transmission delay of any flow can be measured at several places, between the first ingress router and all or several observation points up to the flow sink. These measures can be active or passive measures, the SQM imposing no constraint on the flow to be measured.

With this kind of measure along a flow path, an end user, or a monitoring application, could determine not only the end-to-end QoS, but also the QoS provided on different path segments. Path segments refer here to parts of the IP flow path delimited by observation points.
This precision given by several path segment related measures brings a good analysis and troubleshooting tool for network management and billing applications. Indeed, based on these measures, the localization of QoS misconfigurations and responsibilities is possible in a multidomain environment. For instance, knowing the delay between a source and several successive probes located at each network boundary crossed by the flow provides basic information to determine the network where congestion occurs.

Similarly, the comparison of measures of the volume of a flow in several intermediary nodes can determine the packet loss ratio on the several path segments. Again, in case of congestion in a network, the faulty path segment is located between the last observation point where the volume was almost equal to the first measure in the path, and the next observation point which sees far less volume. In the case of a too high end-to-end packet loss, a "map" of the several segments and their packet loss performances could give indications on what parts of the network to re-dimension, or reconfigure.

The main objectives of this document are then to:

* describe the key architectural components of this SQM architecture,
* define the architectural requirements, e.g., filtering and sampling capacities in network elements, the security and measurement domain's interconnection issues
* provide an overview of the signaling as an NSIS signaling upper layer, also known as an NSIS Signaling Layer Protocol (NSLP)
* list future actions to be taken in order to achieve the standardization of the SQM.

3. Terminology

* Collector: [similar to the draft-ietf-ipfix-architecture-02.txt definition]
The collector receives flow records from one or more exporters.
* Control Information, Data Stream: [similar to the draft-ietf-ipfix-architecture-02.txt definition]
The information that needs to be exported from the IPFIX device can be classified into the following categories:

  * Control Information: This includes the flow type definition, selection criteria for packets within the flow sent by the export process and any IPFIX protocol messages (e.g. Keepalives). This stream carries all the information for the end-points to understand the IPFIX protocol and specifically for the receiver to understand and interpret the data sent by the sender.

* Device:
A device hosting at least an observation point, a metering process and an export process. Typically, corresponding observation point(s), metering process(es), and exporter process(es) are co-located at this device, for example, at a router.

* Export Process: [similar to the draft-ietf-ipfix-architecture-02.txt definition]
The process of sending flow records to one or more collectors.

* Filtering: [identical to draft-ietf-psamp-sample-tech-00.txt]
Filtering selects a subset of packets by applying deterministic functions on parts of the packet content like header fields or parts of the payload. Filtering techniques can also be used to emulate a (pseudo)random selection of packets with a given probability p. A filtering process needs to process the packet (look at packet header and/or payload) in order to make the selection decision.

* Flow Record: [similar to the draft-ietf-ipfix-architecture-02.txt definition]
A flow record contains information about a specific flow that was metered at an observation point. A flow record contains measured properties of the flow (e.g. the total number of bytes of all packets of the flow) and usually characteristic properties of the flow (e.g. source IP address).
* Sampling/identification hash functions:
Formal descriptions of sampling and identification hash functions are presented in [6]. A hash function is a function which associates to a packet, and more precisely to its invariant packet content (part of the packet header, and its payload), a fixed length bit stream. The sampling hash function is a hash function used to select packets based on the hash value. The identification hash function uses the hash function result as a packet identifier.

* Metering Process: [similar to the draft-ietf-ipfix-architecture-02.txt definition]
The metering process generates flow records. Inputs to the process are IP packets observed at an observation point. The metering process consists of a set of functions that includes packet header capturing, timestamping, sampling, classifying, and maintaining flow records.

* Observation Domain:
The set of observation points which is the largest aggregatable set of flow information at the IPFIX Device is termed an observation domain. The observation domain presents a unique ID to the collector for identifying the export packets generated by it. One or more Observation Domains can interface with the same export process. Example: The observation domain could be a router line-card, composed of several interfaces with each interface being an observation point.

* Observation Point: [inspired from the draft-ietf-ipfix-architecture-02.txt and draft-ietf-psamp-framework-01.txt definitions]
The observation point is a location in the network where IP packets can be observed. Examples are a line to which a probe is attached, a shared medium such as an Ethernet-based LAN, a single port of a router, or a set of interfaces (physical or logical) of a router. An observation point could also be a port mirroring destination like a regular traffic analyzer.
An observation point would be called active when it has installed metering and reporting states for a particular flow.

* Path Segment:
The segment of a path delimited by two observation points.

* Reporting Process: [draft-ietf-psamp-framework-01.txt definition]
The creation of a report stream of information on packets selected by a selection process, in preparation for export. The input to a reporting process comprises that information available to a selection process, for the selected packets. The report stream contains two distinguished types of information: packet reports, and report interpretation.

* Selection Process: [draft-ietf-psamp-framework-01.txt definition]
A selection process selects packets for reporting at an observation point. The inputs to the selection process are the packets observed at the observation point (including packet encapsulation headers), information derived from the packets' treatment at the observation point, and selection state that may be maintained by the observation point. Selection is accomplished through operating on these inputs with one or more selection operations.

* Selection State: [draft-ietf-psamp-framework-01.txt definition]
The observation point may maintain state information for use by the reporting process, and/or by multiple selection operations, either on the same packet, or on different packets. Examples include counters, timestamps, iterators for pseudorandom number generators, calculated hash values, and indicators of whether a packet was selected by a given selection operation.

* SQM Initiator:
SQM entity that initiates SQM signaling for a flow measurement. It can be located in the end system, but may reside elsewhere in the flow path.

* Template:
A template is an ordered n-tuple (eg. , TLV), used to completely identify the structure and semantics of a particular piece of information that needs to be communicated from the IPFIX Device to the collector. Each template is uniquely identifiable by some means (eg.
by using a Template ID).

The definitions in this section are intended to be identical to, or inspired by, those in the IPFIX and PSAMP frameworks.

4. The SQM architecture

4.1 Observation point's density

The goal of the SQM architecture is to initiate and retrieve measurements of a flow on several observation points, in order to have a precise understanding of the QoS the flow receives.

A network which implements the SQM capabilities must contain probes disseminated in the network with a certain density, in order to allow measures to be done for any flow in this network. A first level of density would be an observation point on every access router, in order to measure the end-to-end QoS of end user flows. A higher density, defined by one observation point on every ingress or egress link of a domain, allows the performance of this particular domain to be measured. If this observation point density is adopted in a set of peering networks, or even a lighter density such as one observation point on every ingress link, the SQM allows flow measures to be made edge-to-edge for each network crossed by the flow.

In order to get a set of more fine-tuned measures of a flow in one network, other observation points would also be located inside the network, thus making possible a path segment decomposition of the network.

4.2 One way signaling

Because measurement requires a certain amount of computing in probes and in analysis applications, only the flows whose QoS monitoring is needed will be measured, and only the observation points located on the paths of the flows will make the requested measures.

A protocol is then needed to specify which flows must be measured: the emitter of a flow, or an application monitoring the flow, will emit a QoS measurement request to the network(s). In the classical two-device configuration, this protocol is often SNMP or CLI between the collector and the devices.
Here, the devices are spread in the network(s), they must be found on the fly for each flow, and the protocol to configure them is used in interdomain configuration.

In order to determine the available devices on the flow path, and to install metering and reporting processes in these devices in real time, on-path signalling is a natural choice. Signalling follows the IP flow and is dedicated to installing states, usually resource reservation states, in the network elements the flow goes through. In SQM, the signaling is used to install the configurations of the devices for the metering and reporting processes, which is why the SQM signaling is said to install metering and reporting states in devices. In this document, the emitter of the signalling for QoS measurement request is called the signalling initiator.

On-path signalling, like RSVP, can be transported by IP networks which do not implement it. That is an advantage for gradual deployment of a signaling-based solution, and also because the deployment of probes in the network is necessarily limited: precise measures do not require that each node (especially each router) makes measures. However, the question of measure accuracy is to be studied with regard to the observation point's density.

In the RSVP model, an over-provisioned network surrounded by RSVP networks may provide acceptable QoS, thus preserving the RSVP service, even if the routers are not RSVP capable. Concerning SQM, a signalling-unaware node, or network, does not weaken at all the integrity of the signalled service, which is measurement, but it simply reduces the global observation point density, and may reduce locally the accuracy of congestion localization.
The results of the local measurements would not be sent back to the signalling initiator using signalling, for three reasons:

o first, the application requesting the measures is not necessarily the flow emitter or signaling initiator, and can even be off-path, as a classical QoS monitoring application,
o second, if each device sends back reports to the signalling initiator, the amount of measures could be large, and could then overflow the signalling initiator,
o third, for confidentiality and integrity of the measures, the measures performed at the exit of one network should not be sent back to this network, which would give this network the opportunity to corrupt the measures for its benefit. Furthermore, not knowing the measures made downstream or upstream, and not even knowing if measures are performed, a network has little opportunity to send forged measures, as this may be discovered by comparing with the neighbours' measures.

Consequently, the same type of collector as in IPFIX and PSAMP is required to get the results of the measures. One first effect on the signalling is that a response to a SQM request is not needed in normal cases: the signalling is one way, because no backward message is needed for reporting results. The second consequence is that the signalling must carry the information about where and how to send the reports. An address (URL, or IP address and port number) designating the collector is inserted in the SQM request and will be part of the active device's reporting states.

4.3 SQM measurement state management

The installation of metering and reporting processes and associated states is proposed to follow the soft state model, for the same reasons as RSVP: the states are removed if they are not regularly refreshed, which copes gracefully with routing updates and silent crashes of the end point or signaling initiator.
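The soft state model described above can be sketched as follows. This is only an illustration under stated assumptions: the class name SoftStateTable, the 5-tuple flow key and the 30 second lifetime are hypothetical and are not defined by this draft.

```python
import time


class SoftStateTable:
    """Measurement states that disappear unless they are refreshed."""

    def __init__(self, lifetime_s=30.0, clock=time.monotonic):
        # lifetime_s is a hypothetical value; the draft does not fix one.
        self.lifetime_s = lifetime_s
        self.clock = clock
        self._expiry = {}  # flow key -> time at which the state expires

    def refresh(self, flow_key):
        """Install the state, or extend its lifetime if it already exists."""
        self._expiry[flow_key] = self.clock() + self.lifetime_s

    def is_active(self, flow_key):
        """True while the state for this flow has not yet expired."""
        expiry = self._expiry.get(flow_key)
        return expiry is not None and expiry > self.clock()

    def expire(self):
        """Remove every state whose lifetime has elapsed; return their keys."""
        now = self.clock()
        stale = [key for key, expiry in self._expiry.items() if expiry <= now]
        for key in stale:
            del self._expiry[key]
        return stale
```

A device would call refresh() for each SQM request it accepts and run expire() periodically, so that a rerouted flow or a crashed initiator simply stops refreshing and the state is dropped without any explicit teardown message.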
However, the SQM state management is enriched with specific metering and reporting state installation procedures: it is proposed that the devices, routers or probes, can decide whether or not to perform the requested measures.

4.3.1 State installation alternatives

A SQM device has the possibility to intercept a measurement request, and, based on a local procedure, decide not to install the measurement state. In case the probe is inside a mediation device, or inside a router, the request is of course forwarded downstream, as a normal packet would be.

The reasons why a SQM device would decide, for a particular request, to install or not the measurement states are the following:

o systematic state installation is not always needed. It is indeed expected that in certain configurations, SQM capable routers or probes are frequently and regularly spread in the network, in order to allow a large and thus accurate set of measures. This high observation point density allows a very close monitoring of flows that are seen as critical. However, for other flows, this accuracy would not be needed. For example, if the QoS measured in a device close to the flow sink is good, intermediate measures are not so useful as no problem is to be investigated. But if later this flow does not receive a good QoS anymore, the signalling initiator would like to get measurement reports from more devices, or even from all SQM capable devices along the flow path. Such heavy measures would indeed reduce the size of the path segments, and consequently would localize more precisely the origin of the problem.

o load balancing of measures. Devices located on large links, typically at peering points between ISPs, may receive a large amount of requests. As in RSVP, SQM scales linearly with the number of (accepted) requests, which may be problematic.
A solution can be to distribute the measures between a group of probes that observe the same link, a measure being always selected by one unique probe of the group. For instance, a probe would silently refuse to install all measures for flows with an even port number, but would accept the ones with an odd port number.

o CPU and memory shortage. While load balancing solves the feasibility of a large number of measures, it does not decrease the amount of memory and CPU needed for a measure, and does not always prevent a CPU or memory shortage. In Intserv networks, a shortage of resources in a specific node simply prevents the whole chain of RSVP routers from establishing the resource reservation. On the contrary, measurement does not require an unbroken chain of active measurement points: a shortage of measurement resources in a specific device should not prevent the other devices from making a measure. Of course, a device declining the measure lowers the accuracy of the measure, by refusing to divide one path segment into two path segments. Anyway, a device reaching its maximum measurement capacity can silently decline to install measurement state, which means it acts as a non SQM capable node for this request.

4.3.2 State installation algorithms

As seen in the previous section, a SQM capable device does not unconditionally accept the measurement requests and install the metering and reporting processes/states. A SQM device may accept only a certain fraction of the metering and reporting state installations asked for by the SQM requests.

Locally to a probe, there are several algorithms that can be implemented in order to accept or not a measure. For instance, in a case of load balancing where a group of devices observes the same link, one device can select measures which are related to flows having destination IP addresses of a certain type, e.g. whose bits' sum is odd or even.
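The destination address rule above can be sketched as follows. This is a minimal illustration, not a specified algorithm: the function names are hypothetical, and the extension from an odd/even split to a group of more than two probes (bit sum modulo the group size) is an assumption made here.

```python
import ipaddress


def bit_sum(dst_ip: str) -> int:
    """Number of bits set to 1 in the destination address (IPv4 or IPv6)."""
    return bin(int(ipaddress.ip_address(dst_ip))).count("1")


def probe_accepts(dst_ip: str, probe_index: int, group_size: int = 2) -> bool:
    """Decide whether probe number probe_index takes the measure.

    With group_size == 2 this is the odd/even rule of the text: probe 0
    takes the even bit sums and probe 1 the odd ones. Larger groups are an
    assumption of this sketch.
    """
    return bit_sum(dst_ip) % group_size == probe_index
```

Because the rule depends only on the destination address carried in the request, every probe of the group reaches the same conclusion without any coordination, and each measure is taken by exactly one probe.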
A local default metering and reporting state installation acceptance ratio may also be set by configuration in a device, in order to randomly accept a certain portion of the requests. It is then expected that this installation acceptance ratio is maximum in probes located at the edge of the network, as these edge probes are crucial to determine the end-to-end QoS, and can support a load whose amount is more controlled. More complex local acceptance algorithms may be possible, but are not described in this document.

SQM must also support a user-defined state installation policy. Here user-defined means the SQM initiator can influence the state installation in the probes. This corresponds to the case where the end-to-end QoS is either acceptable, and does not require heavy monitoring, or bad, and thus requires a higher number of active observation points, in order to make an accurate diagnosis and define responsibilities.

To do so, the signalling request carries a dynamic state installation acceptance factor, which can increase or decrease the chance of the metering and reporting state's installation. The higher the state installation acceptance factor is, the higher the probability is for every probe to make the requested measure. This factor can be compared to a priority parameter in classical signaling, and can evolve during the flow lifetime. Indeed, a flow can first be monitored with a low dynamic state installation factor, and after a while subsequent SQM requests can have a higher factor value. Typically, in the first case, only the two devices next to the sink and source would be active. Then, if the end-to-end QoS is bad, SQM requests with a higher state installation factor value would make devices on the flow path active.
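One possible way to combine the local acceptance ratio and the acceptance factor carried in the request is sketched below. The draft does not define how the two values interact; the multiplication and the clamping to [0, 1] are assumptions of this sketch, as are the function and parameter names.

```python
import random


def acceptance_probability(local_ratio: float, request_factor: float) -> float:
    """Probability that a probe installs the state for one request.

    Assumption of this sketch: the locally configured ratio is scaled by the
    factor carried in the SQM request, then clamped to the range [0, 1].
    """
    return max(0.0, min(1.0, local_ratio * request_factor))


def accepts(local_ratio: float, request_factor: float, rng=random) -> bool:
    """Draw one random acceptance decision for an incoming request."""
    # rng.random() returns a float in [0.0, 1.0), so a probability of 1.0
    # always accepts and a probability of 0.0 never does.
    return rng.random() < acceptance_probability(local_ratio, request_factor)
```

With these assumptions, an edge probe configured with a ratio of 1.0 accepts every request with a factor of at least 1, while a loaded core probe with a ratio of 0.2 takes one request in five at factor 1 and every request once the initiator raises the factor to 5 or more.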
Associated with the local default state installation acceptance ratio mechanism, this priority mechanism allows the selection of the critical flows inside the loaded devices, in order to make loaded probes measure the really important flows.

4.4 SQM reports and SQM Collectors

Because SQM is targeted at getting QoS measures in real time, the reports sent to collectors must be emitted quite frequently. The report time interval is expected to be in the order of a couple of seconds. Too frequent reports would generate heavy traffic between devices and collectors, but a too long report period would be prejudicial, as QoS failures would be detected too late compared to the end user experience. So the value of the report frequency is critical, and the question whether this frequency is fixed and set by configuration, or dynamic and advertised in the signalling request, is still open.

The signalling request must specify where to send the reports, as there can be several instances of collectors for a network, and the place of the collector is not fixed: it could be in the flow emitter, or in a specific network QoS monitoring application.

The basic architecture of the SQM is illustrated below, where one collector receives the reports for one flow from the devices where measurement points are active.

+---------+<~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|         |<~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~             ~
|Collector|<~~~~~~~~~~~~~~~~~             ~             ~
+---------+<~~~             ~             ~             ~
              ~             ~             ~             ~
              ~             ~             ~             ~
          +-------+     +-------+     +-------+     +-------+
       -->|       |---->|       |---->|       |---->|       |--->
Source ==>|device1|====>|device2|====>|device3|====>|device4|===>Sink
          +-------+     +-------+     +-------+     +-------+

====>: flow
---->: SQM signalling
~~~~>: reports

The signalling should then contain a URL, or IP address/port number, where to send the reports.
A public key carried in the signalling request could also be used to encrypt flow records, and thus provide privacy of the measures.

In case of a multidomain architecture, the links between the collector and the different measurement domains should be secure, as they cross other domains. In particular, because the identity of the collector is advertised by signalling on different domains, it could become a target for attacks, such as DoS or forged measurement reports. On the other hand, networks which report measures may want to protect themselves by hiding their different devices/routers, and consequently prefer to have a limited set of communication channels with external collectors. That is why a first stage of proxy collectors is proposed, as illustrated in the next figure.

+---------+                               +-------------+
|         |<~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|domainC proxy|
|Collector|<~~~~~~~~~~~~~~~~~             |  collector  |
+---------+<~~~             ~             +-------------+
              ~             ~             ~             ~
              ~             ~             ~             ~
          +-------+     +-------+     +-------+     +-------+
       -->|device1|---->|device2|---->|device3|---->|device4|--->
Source ==>|domainA|====>|domainB|====>|domainC|====>|domainC|===>Sink
          +-------+     +-------+     +-------+     +-------+

====>: flow
---->: SQM signalling
~~~~>: reports

The proxy collector inside domain C is a proxy for its domain. It collects the several reports, and forwards a bundled report to the real collector. A proxy collector could just forward flow records to the collector, or already re-associate several measures concerning the same packet, in order to report aggregated flow records.

This architecture based on proxy collectors allows the measurement report flow from domain C toward the collector to be manageable and secured, as it is a unique pre-defined communication between only two servers.
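The re-association performed by a proxy collector can be sketched as follows. This is an illustration only: the record layout (the "collector", "packet_id", "device" and "value" keys) and the class name are hypothetical, and the sketch assumes that each record names the real collector it is destined to.

```python
from collections import defaultdict


class ProxyCollector:
    """Groups the records of one domain before forwarding them."""

    def __init__(self):
        # (real collector address, packet identifier) -> records received
        self._pending = defaultdict(list)

    def receive(self, record: dict) -> None:
        """Store one record exported by a device of the domain."""
        key = (record["collector"], record["packet_id"])
        self._pending[key].append(record)

    def flush(self) -> dict:
        """Build one bundle per real collector and forget pending records."""
        bundles = defaultdict(list)
        for (collector, packet_id), records in self._pending.items():
            bundles[collector].append({
                "packet_id": packet_id,
                "observations": [(r["device"], r["value"]) for r in records],
            })
        self._pending.clear()
        return dict(bundles)
```

Calling flush() at the report interval yields, for each real collector, a single bundle in which the measures made on the same packet by several devices of the domain are already grouped.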
In this case, the devices in domain C must transmit to their proxy collector the information that the flow records must be forwarded to the real collector. As a matter of fact, the reports exported by the device must contain the collector's address that was in the SQM signaling request.

5. Measurement methods of SQM

This section explains two examples of measures in the SQM architecture, in order to derive requirements on the signalling and the SQM devices. The first example is a solution to make a passive measure of delay, the second is a solution to measure packet loss. Other measurement methods are also possible, and should be identified in order to make sure the signalling would offer the full capabilities and flexibility to support them.

5.1 Passive delay measure in SQM

To measure the delay of a flow in the SQM, the idea is to report the arrival time, at each active observation point, of particular packets of the flow. A selection of packets to be observed must be done to limit the number of records to analyze. These selected packets should ideally be chosen, or sampled, not too frequently, but regularly inside the flow. Of course, the different devices must have synchronized clocks, e.g. based on the Global Positioning System.

The difficulty with passive measures is that exactly the same packet must be chosen/sampled, and identified, at every active observation point. The observation points and collectors must absolutely not mistake one of the sampled packets for other sampled packets, or for other packets of the flow. Otherwise, the measure would be incoherent in the best case, or erroneous in the worst case.

To sample and unambiguously identify one packet out of the flow in each active device, a sampling based on two hash functions would be performed. The first sampling uses a deterministic function of the invariant packet content (packet payload and fields which are not changed by routers), in order to select a fraction of the flow packets.
Because this function is deterministic and is activated in all active observation points, a packet which is selected by this sampling at one observation point is also selected at every other active observation point. This deterministic function, and the results of this function to be matched, must be chosen with care for two reasons:

o in order to have a reasonable packet selection ratio,

o the packets of the flow already have the same IP addresses and port numbers, so a more in-depth hashing function is required.

Once selected by the first deterministic sampling, the packet is time stamped, and must be identified unambiguously in order to be distinguishable from other sampled packets. The collector which receives this packet's report must correlate it only with reports of the same packet generated by other devices. That is why a second, identification hash function generates a packet identifier to be inserted with the timestamp in the report. This identifier must be unique during a relatively long period of the flow's measurement, in order to avoid duplicated packet identifications. The same identification hash function must of course be installed in all active observation points.

The following diagram summarizes the actions implemented in the SQM device:

+-------+ pack- +--------+   +--------+   +--------+   +-------+
|filter-| ets   |sampling|   |time-   |   |identi- |   |export-|
|ing    +------>| process+-->|stamping+-->|fication|-->|ing    |
|5-tuple|       | (hash) |   |        |   | (hash) |   |process|
+-------+       +--------+   +--------+   +--------+   +-------+

The timestamping could be performed after the two hash functions instead of between them; the difference is not important.

The collector then receives several reports and can calculate the delay of the packet to reach every observation point.
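
The device-side chain (sampling hash, timestamp, identification hash) and the collector-side delay calculation can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 and BLAKE2b stand in for the sampling and identification hash functions, which this document leaves to be defined; the 1-in-16 selection ratio, the function names and the tuple layout of a report are hypothetical; and the invariant packet content is taken to be an opaque byte string.

```python
import hashlib
from collections import defaultdict
from statistics import mean

SAMPLING_MODULUS = 16  # assumption: select roughly 1 packet in 16


def is_sampled(invariant: bytes, modulus: int = SAMPLING_MODULUS) -> bool:
    """First (sampling) hash: deterministic, so all observation points agree."""
    digest = hashlib.sha256(invariant).digest()
    return int.from_bytes(digest[:4], "big") % modulus == 0


def packet_id(invariant: bytes) -> str:
    """Second (identification) hash: a different function of the same content."""
    return hashlib.blake2b(invariant, digest_size=8).hexdigest()


def observe(point: str, invariant: bytes, arrival: float):
    """Return a (point, packet id, timestamp) report, or None if not sampled."""
    if not is_sampled(invariant):
        return None
    return (point, packet_id(invariant), arrival)


def average_delays(reports, reference: str) -> dict:
    """Collector side: average delay from `reference` to every other point."""
    by_packet = defaultdict(dict)
    for point, pid, timestamp in reports:
        by_packet[pid][point] = timestamp
    samples = defaultdict(list)
    for stamps in by_packet.values():
        if reference not in stamps:
            continue  # no delay can be computed without the reference timestamp
        for point, timestamp in stamps.items():
            if point != reference:
                samples[point].append(timestamp - stamps[reference])
    return {point: mean(values) for point, values in samples.items()}
```

Every observation point would run observe() on the packets that pass the flow filter, and the collector would call average_delays() on the reports it has gathered, using the first observation point of the path as the reference. The sketch relies on synchronized clocks, as required above, and makes no attempt to handle identifier collisions.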
The successive measures can then give an average delay for each active observation point.

The signalling request must then carry:

o the flow filter

o the first sampling function (which is a hash function of the invariant packet content). The sampling must be chosen knowing the desired sampling probability, the flow rate, and the "entropy" of the fields of the flow's packets.

o the indication that packets must be time stamped

o the second (identification) hash function, as the same function must be used in all devices to identify the packets unambiguously

5.2 Packet loss in SQM

This measure is based on the notification of sampled and unambiguously identified packets, as explained above. The difference is that, instead of a time stamp, the report contains the number of the packet given by a counter. The configuration of the device is the following:

+-------+ pack- +-------+   +--------+   +--------+   +-------+
|filter-| ets   |       |   |sampling|   |identi- |   |export-|
|ing    +------>|counter+-->|process +-->|fication|-->|ing    |
|5-tuple|       |       |   | (hash) |   | (hash) |   |process|
+-------+       +-------+   +--------+   +--------+   +-------+

Of course, packet reordering may introduce some imprecision in the reports, as a packet can be counted as lost or added when it has only been swapped in the flow with the sampled packet. Nevertheless, after several reports the collector should obtain a good estimate of the packet loss ratio, as this imprecision stays in the order of a few packets while the number of counted packets increases.

The signalling request must then carry:

o the flow filter

o an activation of the packet counter

o the first sampling function (which is a hash function of the invariant packet content). The sampling must be chosen knowing the desired sampling probability, the flow rate, and the "entropy" of the fields of the flow's packets.
o the second (identification) hash function, as the same function must be used in all devices to identify the packets unambiguously

The metering states that must be installed by an SQM initiator which requests both the delay and the packet loss of a flow follow naturally:

+------+   +-------+   +--------+   +------+   +--------+   +-------+
|filte-|   |       |   |sampling|   |time- |   |identi- |   |export-|
|ring  +-->|counter+-->|process +-->|stamp +-->|fication|-->|ing    |
|      |   |       |   | (hash) |   |ing   |   |process |   |       |
+------+   +-------+   +--------+   +------+   +--------+   +-------+

5.3 Other measures

The delay and packet loss measures are examples used to deduce the capabilities that the signaling and the observation points must support. Other examples may be presented in future versions, especially if they require new signaling capabilities. Measures collecting aggregate information or router information may also be possible, but are left for further study.

6. Formal Syntax for the SQM signaling

The following syntax specification uses the augmented Backus-Naur Form (BNF) as described in RFC-2234 [7].

= = = 1* = / / = [ < report_frequency > ] [ < Security_data > ] are tbd.

7. Remaining Work

The sampling/identification hash functions to be activated in devices by the SQM signalling must be defined, and expressed in the signalling description. This work could be common to PSAMP and NSIS.

The IPFIX WG aims at designing the reporting protocol between IPFIX devices and collectors. The protocol defined by IPFIX will therefore be a reference for the protocol between SQM devices and collectors, if not that protocol itself. The new requirement that SQM brings to the IPFIX protocol is the ability to carry information about the final collector.
Indeed, the SQM proxy collector must receive information from the device to know which collector to re-emit reports to, and how. This information comprises the MeasureID and COLLECTOR fields from the signalling.

NSIS is the working group which deals with signalling in the IETF. The SQM signalling is a candidate for a new NSIS signalling layer application.

8. Security issues

SQM identifies three interfaces: the signalling, the device to collector interface (the collector may be an intermediary collector), and the intermediary collector to collector interface. The latter two should preferably be realized by the same protocol. The security considerations of IPFIX should apply to the SQM collector interfaces. The NSIS security considerations apply to the SQM signalling.

[more to come]

8.1 Confidence in reports

The validity of a report received by a collector must be addressed, as a report can be used to establish the responsibility of a carrier. In the case of delay measurement, a report sent by an ingress router of a domain indicates the level of service that the upstream network provides: good or slow, depending on the arrival time of the packets at the ingress router. By reporting a false arrival time, later than the real one, a device might make the upstream network appear responsible for its own delay. The same might happen if a clock is not well synchronized.

To prevent erroneous interpretations of incorrect reports, the solution is to get several measures from active observation points. With several measures, the timestamps can be compared and checked against each other. For instance, an SQM capable egress router would report measures that can be compared with the same measures performed in the next router, which is the ingress router of the downstream domain. Inconsistencies between these two measures are easy to detect.

9. IANA Consideration

A port number needs to be assigned by IANA.

[more to be written]

10. References

[1] Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

[2] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[3] IP Flow Information Export (IPFIX), http://www.ietf.org/html.charters/ipfix-charter.html

[4] N. Duffield et al., "A framework for Passive Packet Measurement", (work in progress), Internet Draft, Internet Engineering Task Force, March 2003.

[5] S. Shalunov, B. Teitelbaum, M. J. Zekauskas, "A One-Way Active Measurement Protocol", (work in progress), Internet Draft, Internet Engineering Task Force, May 2003.

[6] N. G. Duffield and M. Grossglauser, "Trajectory Sampling for Direct Traffic Observation", IEEE/ACM Trans. on Networking, 9(3), pp. 280-292, June 2001.

[7] Crocker, D. and Overell, P. (Editors), "Augmented BNF for Syntax Specifications: ABNF", RFC 2234, Internet Mail Consortium and Demon Internet Ltd., November 1997.

Acknowledgments

The author thanks Mustapha Aissaoui and Stefan De Cnodder for their comments.

Author's Addresses

Alban Couturier
Alcatel
Email: Alban.Couturier@alcatel.fr