Internet Draft                                             Chang H. Kim
draft-kim-ipfix-ppr-00.txt                                 Taesang Choi
Expires: December 2003                                             ETRI
                                                              June 2003

      Supplementing IPFIX Flow Information with Per-packet Records

Status of this Memo

This document is an Internet-Draft and is subject to all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

Copyright Notice

Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

This document describes the extensions required to supplement IP flow information export with per-packet records. The extensions support more precise application-aware usage accounting, detailed traffic profiling, enhanced intrusion and attack detection, etc. Extensions to the information model, the Metering Process, the Exporting Process, and Configuration are described.

Kim, et al.              expires - December, 2003              [Page 1]
Internet Draft              Per-packet Records                June 2003

Table of Contents

1. Introduction
2. Applicability
   2.1. Usage-based Accounting
   2.2. Traffic Profiling
   2.3. Attack/Intrusion Detection
   2.4. Application Monitoring and Profiling
3. Methods for Adopting Per-packet Records
   3.1. Information Model Extension
      3.1.1. pktArrivalTimeOffset
         3.1.1.1. Type
         3.1.1.2. Field Id
         3.1.1.3. Reference
      3.1.2. pktLen
         3.1.2.1. Type
         3.1.2.2. Field Id
      3.1.3. pktId
         3.1.3.1. Type
         3.1.3.2. Field Id
      3.1.4. pktFlags
         3.1.4.1. Type
         3.1.4.2. Field Id
      3.1.5. pktTtl
         3.1.5.1. Type
         3.1.5.2. Field Id
      3.1.6. pktFragIndication
         3.1.6.1. Type
         3.1.6.2. Field Id
      3.1.7. tcpFlags
         3.1.7.1. Type
         3.1.7.2. Field Id
      3.1.8. tcpSeqNumber
         3.1.8.1. Type
         3.1.8.2. Field Id
      3.1.9. tcpAckNumber
         3.1.9.1. Type
         3.1.9.2. Field Id
      3.1.10. pktEntirePayload
         3.1.10.1. Type
         3.1.10.2. Field Id
         3.1.10.3. Reference
      3.1.11. pktPartialPayload
         3.1.11.1. Type
         3.1.11.2. Field Id
         3.1.11.3. Reference
   3.2. Metering Process Extension
      3.2.1. Metering Process Functions
      3.2.2. Pattern Matching Criteria
      3.2.3. Pattern Specification and Pattern Matching
   3.3. Configuration Extension
      3.3.1. Configuration Extension of the Metering Process
      3.3.2. Configuration Extension of the Exporting Process
   3.4. Data Export Extension
   3.5. Collector Extension
4. Security Considerations
5. References

1. Introduction

Internet traffic was dominated by client-server applications until the late 1990s. Starting with "Napster", various peer-to-peer applications became popular, and network-based Internet games have also grown rapidly. Traditional Internet applications could be easily identified by their port numbers because few applications forged those numbers, and they accounted for most of the traffic in the Internet. Since the late 1990s, however, a large portion of traffic is left unidentified when simple port number matching is used. This unknown volume started from a minimal percentage and has recently grown to more than 60%.
Both the recent explosion in the number of such applications and the common use of port range allocation give rise to frequent occurrences of the overlapped service ports problem. For example, users intentionally configure a new application to share a well-known service port, say port 80 of HTTP, for infiltration purposes. With overlapped service ports, port number matching cannot ensure a high degree of application recognition correctness, even when it is combined with the flow concept. To overcome this limitation, we need to investigate the contents of packets to search for each application's distinctive signature. The signature can be an ASCII string, a binary code, or a combination of them. Using the flow concept in conjunction with payload inspection yields a synergy effect: packets that do not contain any distinctive signature can also be classified to the application whose signature is found in other packets of the same flow.

Nevertheless, since the operational burden of the payload investigation method is much higher than that of the simple port-based one, its use should be limited according to the number and speed of the monitored links, the available computing resources, and other operational constraints. Configurability, thus, plays a principal role in designing and implementing a recognition system that is capable of payload investigation.

One conspicuous feature found in operational networks is the existence of a large number of dummy flows which are composed only of 40-byte packets. Nearly all of these are streams of control packets of TCP sessions: ACK, SYN, SYN-ACK, RST packets, etc. For this type of flow the payload investigation approach offers no benefit, because every packet in the flow comprises only IP and TCP headers. In this case, retrieving the corresponding flow found on the reverse link can provide a key to identifying the dummy flow.
However, there is no guarantee that all dummy flows can be identified by such an inclusive application recognition method alone, because Internet routing is asymmetric.

Another characteristic of operational networks is that fragmented packets make up a considerable portion of traffic even in backbone or mid-level networks [1]. TCP or UDP packet fragments other than the first one do not contain transport layer headers; no port numbers are given in such packets although they do belong to a port. This tendency may deepen as encapsulated services, such as IPv6 over IPv4 and IPsec, become more popular. In this sense, flow-based recognition with fragmentation treatment increases the overall application recognition ratio.

Also, for proper traffic profiling, various statistics are needed, such as flow duration, volume, time and burstiness. Besides these flow-related statistics, packet-specific statistics (e.g., packet size distribution, packet inter-arrival time, etc.) are also very useful for traffic profiling. The IPFIX architecture does not take this aspect into consideration. Additional packet-related information, thus, needs to be added as a part of the IPFIX information model if packet-related traffic profiling is required.

In summary, simple port-number-to-application mapping is no longer a precise or even reasonable means of usage accounting, owing to the behavior of current Internet applications. Additional per-packet information is needed for traffic profiling and other IPFIX applications. Thus we propose to add per-packet records that supplement IPFIX flow information in order to support IPFIX applications with more detailed and precise information.
We address these additional requirements for each application in section 2; the details of the information elements and methods that meet these requirements are provided in section 3.

2. Applicability

The IPFIX applicability draft [2] describes critical customer applications which utilize IPFIX data. The applications are accounting, peering agreements, traffic engineering, data warehousing and mining, and network monitoring. That draft explains that the IPFIX architecture can support all of these applications. As described in the previous section, however, additional information or mechanisms may be necessary. More details are provided on a per-application basis below.

2.1. Usage-based Accounting

The IPFIX applicability draft states that charging can be based on application usage, among others. The IPFIX architecture provides this information based on port numbers. But, as mentioned in the introduction, the mapping between applications and port numbers is no longer reliable, and application signature matching and other methods are required for precise application usage accounting.

Some applications can be identified simply by matching an application-specific signature on a per-packet basis. However, other applications require a cross check with internal sub-flows which may occur in the opposite flow direction or even on another link. In the former case, application signature information needs to be added to the information model. In the latter case, however, precise application recognition with signature matching on a single link is not possible. Instead, the flow record somehow has to keep the application signature information, and a collector which monitors multiple links correlates the results later for accurate recognition. For this purpose, we propose to add a packet record and keep an application signature in it. The collector can use these records later during the analysis phase.

2.2. Traffic Profiling

For proper traffic profiling, various statistics are needed, and many of them are listed in the requirement and applicability drafts, such as flow duration, volume, time and burstiness. Besides these flow-related statistics, packet-specific statistics (e.g., packet size distribution, packet inter-arrival time, etc.) are also very important for traffic profiling. The IPFIX architecture does not take this aspect into consideration. Additional packet-related information, thus, needs to be added as a part of the IPFIX information model if packet-related traffic profiling is required. Since this information is packet specific, the flow record is not a good place for it. We propose a packet record as the placeholder for such information.

2.3. Attack/Intrusion Detection

The IPFIX architecture allows packet content inspection for attack/intrusion detection. However, it does not define an information model for storing such information, nor the corresponding metering and exporting processes. As in the usage-based accounting case, virus or worm signature information can be captured in packet records for post-processing by the analysis server.

2.4. Application Monitoring and Profiling

The IPFIX architecture states that it enables content and service providers to view detailed, time-based, and application-based usage of a network. Simple port-based accounting per application does not faithfully measure its usage. QoS monitoring per application may also be a requirement for content and service providers. This requires more precise application profiling, as in the case of usage-based accounting.

3. Methods for Adopting Per-packet Records

The following extensions to the existing IP flow information export specifications [3][4][5] are required.

3.1.
Information Model Extension

As specified in section 6 of the IPFIX Information Model document [5], the existing IPFIX information model allows for extending the set of information items. This section, thus, defines new information elements for exporting per-packet information.

3.1.1. pktArrivalTimeOffset

The offset of the packet's arrival timestamp relative to the flowCreationTime of the flow.

3.1.1.1. Type

The pktArrivalTimeOffset element is of type ipdr:dateTimeUsec.

3.1.1.2. Field Id

The field id will be assigned by IANA.

3.1.1.3. Reference

Because an exporter terminates a long-lasting flow on a regular basis, the value of pktArrivalTimeOffset MUST be smaller than the active timeout period.

3.1.2. pktLen

The packet length given in the IP packet.

3.1.2.1. Type

The pktLen element is of type int.

3.1.2.2. Field Id

The field id will be assigned by IANA.

3.1.3. pktId

The identification value of the IP packet.

3.1.3.1. Type

The pktId element is of type short.

3.1.3.2. Field Id

The field id will be assigned by IANA.

3.1.4. pktFlags

The flags of the IP packet.

3.1.4.1. Type

The pktFlags element is of type byte.

3.1.4.2. Field Id

The field id will be assigned by IANA.

3.1.5. pktTtl

The TTL value of the IP packet.

3.1.5.1. Type

The pktTtl element is of type unsignedByte.

3.1.5.2. Field Id

The field id will be assigned by IANA.

3.1.6. pktFragIndication

The fragment field of the IP packet.

3.1.6.1. Type

The pktFragIndication element is of type byte.

3.1.6.2. Field Id

The field id will be assigned by IANA.

3.1.7. tcpFlags

The flags in the TCP header of the IP packet.

3.1.7.1. Type

The tcpFlags element is of type byte.

3.1.7.2. Field Id

The field id will be assigned by IANA.

3.1.8. tcpSeqNumber

The sequence number in the TCP header of the IP packet.

3.1.8.1. Type

The tcpSeqNumber element is of type int.

3.1.8.2. Field Id

The field id will be assigned by IANA.

3.1.9.
tcpAckNumber

The acknowledgement number in the TCP header of the IP packet.

3.1.9.1. Type

The tcpAckNumber element is of type int.

3.1.9.2. Field Id

The field id will be assigned by IANA.

3.1.10. pktEntirePayload

The payload of the IP packet.

3.1.10.1. Type

The pktEntirePayload element is of type string.

3.1.10.2. Field Id

The field id will be assigned by IANA.

3.1.10.3. Reference

When only a predefined number of bytes of an IP packet is captured, pktEntirePayload means the entire captured portion of the payload. The packet capture length can be configured during the initial communication between an exporter and a collector.

To avoid resource exhaustion and performance degradation, the use of the pktEntirePayload element should be strictly conservative. The exporter, therefore, does not necessarily include a pktEntirePayload element in every packet information record; when a packet's payload contains a specific pattern, the exporter may append the pktEntirePayload element.

3.1.11. pktPartialPayload

The partial payload of the IP packet.

3.1.11.1. Type

The pktPartialPayload element is of type string.

3.1.11.2. Field Id

The field id will be assigned by IANA.

3.1.11.3. Reference

During the initial communication between a collector and an exporter, operators can specify and assign patterns (signatures) of interest. When an exporter detects a specific pattern in a packet's payload, the exporter can append only the pattern in the pktPartialPayload field. The choice of appending an entire or a partial payload can also be configured.

The exporter does not necessarily include a pktPartialPayload element in every packet information record; when a packet's payload contains a specific pattern, the exporter may append the pktPartialPayload element.

3.2. Metering Process Extension

The metering process defined in the IPFIX architecture model [4] may be extended as specified in the following.
3.2.1. Metering Process Functions

A flow classification specification may contain a flag enabling or disabling per-packet information export. If the flag is off, packets within the flow are processed in the existing manner of the architecture model [4]. If the flag is on, the Metering Process needs to execute additional steps: the Metering Process must generate per-packet information.

When the flag is on and a specific pattern is assigned to the flow as well, the Metering Process also needs to perform pattern matching on the basis of the pattern(s). When no pattern or a wildcard pattern is specified for a flow, the pattern matching process is effectively a no-op. For a wildcard pattern specification, however, the pktEntirePayload or pktPartialPayload element must be generated and included in the per-packet records.

The figure below describes the extended architecture of the Metering Process.

              packet capturing
                     |
                timestamping
                     |
                     V
              +------+
              |      |
              |   sampling (1:1 in case of no sampling)
              |      |
              |   classifying -------------+
              |   (NULL when No criteria)  |
              |      |                     |
              +------+              pattern matching
                     |      (NULL for No pattern or Wildcard pattern)
                     |                     |
                     V                     V
               Flow Records          Packet Records

3.2.2. Pattern Matching Criteria

The measurement device may define rules so that only the information of packets within certain flows is exported. Such a rule selects packets that satisfy a function on the packet header fields, on fields obtained during packet processing, or on properties of the packet itself. The measurement device can include the rules in the Selection Criteria of the architecture model [4]. The flag enabling or disabling Per-Packet Information Export (PPIE) is called the PPIE flag.

Example: The per-packet information of flows with {Protocol == TCP, Destination Port == 80} is generated and exported.
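The behaviour described in sections 3.2.1 and 3.2.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the proposal: the class names SelectionCriteria, PacketRecord, FlowRecord and Meter, the dictionary-based packet representation, and the use of microsecond integer timestamps are all hypothetical. Only the PPIE flag, flowCreationTime, and the element names pktArrivalTimeOffset and pktLen are taken from this document.

```python
from dataclasses import dataclass, field

TCP = 6  # IANA protocol number for TCP


@dataclass
class SelectionCriteria:
    """Hypothetical flow rule carrying the PPIE flag of section 3.2.2."""
    protocol: int
    dst_port: int
    ppie: bool = False

    def matches(self, pkt: dict) -> bool:
        return (pkt["protocol"] == self.protocol
                and pkt["dst_port"] == self.dst_port)


@dataclass
class PacketRecord:
    """A subset of the per-packet elements of section 3.1."""
    pktArrivalTimeOffset: int  # microseconds since flowCreationTime
    pktLen: int


@dataclass
class FlowRecord:
    flowCreationTime: int  # microseconds (assumed unit)
    packets: int = 0
    octets: int = 0
    packet_records: list = field(default_factory=list)


class Meter:
    """Toy Metering Process: always updates the flow record, and adds a
    packet record only when a matching rule has its PPIE flag on."""

    def __init__(self, criteria):
        self.criteria = criteria
        self.flows = {}

    def observe(self, pkt: dict) -> None:
        key = (pkt["src"], pkt["dst"], pkt["protocol"],
               pkt["src_port"], pkt["dst_port"])
        flow = self.flows.setdefault(
            key, FlowRecord(flowCreationTime=pkt["timestamp"]))
        flow.packets += 1
        flow.octets += pkt["length"]
        if any(c.ppie and c.matches(pkt) for c in self.criteria):
            offset = pkt["timestamp"] - flow.flowCreationTime
            flow.packet_records.append(PacketRecord(offset, pkt["length"]))


# The example rule of section 3.2.2: {Protocol == TCP, Destination Port == 80}
meter = Meter([SelectionCriteria(protocol=TCP, dst_port=80, ppie=True)])
```

With this rule, packets to port 80 contribute both to their flow record and to its list of packet records, while packets of other flows update only the flow record, as in the PPIE-off case of section 3.2.1.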
3.2.3. Pattern Specification and Pattern Matching

A Pattern Specification is a pattern and its attributes. A Pattern Specification is composed as follows:

   Pattern Specification = <Pattern, Pattern Position>

The Pattern Position is composed of the packet sequence range and the byte range in which the pattern should be sought.

A Selection Criteria [4] whose PPIE flag is on may contain one or more Pattern Specifications; packets that satisfy the Selection Criteria are searched for the patterns in the Pattern Specifications. This mechanism, in conjunction with the Pattern Position, restricts excessive use of the pattern matching function, which may consume a serious amount of computing and network resources.

To implement the Pattern Matching process, the measurement process may use an efficient pattern matching algorithm. In the IPFIX context, the payload of a packet becomes the text on which pattern matching is performed. For fragmented packets, however, the pattern may be split across a number of consecutive packets, especially when the specified pattern is long. The pattern matching process, thus, should operate on a text obtained by reassembling the packet fragments.

3.3. Configuration Extension

The Configuration specified in the requirement document [6] may be extended as specified in the following.

3.3.1. Configuration Extension of the Metering Process

The Metering Process may provide a way of configuring the extended features in the Selection Criteria. The following parameters of the metering process MAY be configurable:

1. specifications of flows whose constituent packets' information will be generated and exported; this can be accomplished by adding the PPIE flag to the Selection Criteria

2. specifications of flows whose constituent packets' payload will be searched for patterns; this can be accomplished by adding the Pattern Specification to the Selection Criteria

3. Pattern Specifications

4. packet capture length; this should be uniformly applied to every packet in every flow

3.3.2. Configuration Extension of the Exporting Process

There is no extension required for the Exporting Process.

3.4. Data Export Extension

Since the Data Export mechanism specified in [3] is extensible and configurable, there is no special extension required for adopting per-packet information export. However, because the length of an export packet must not exceed the local MTU, the measurement device cannot combine the per-packet records in the Data Flowset for the same flow. Instead, the exporter can use other Flowsets. On the basis of suitable combinations of the per-packet information elements specified in section 3.1, the exporter can utilize a number of different Data Flowsets. Introducing new Data Flowsets can also be easily accomplished by adding new Templates in a Template Flowset.

3.5. Collector Extension

There is no extension required for the Collector.

4. Security Considerations

This document describes the requirements and the elements of the information model for supplementing IPFIX flow information with per-packet records. The security requirements for IPFIX flow information are addressed in the IPFIX requirement draft. These requirements must be considered for the extension proposed in this document as well. No further security threats are induced by this document.

5. References

[1] Colleen Shannon, David Moore, K Claffy, Characteristics of Fragmented IP Traffic on Internet Links, PAM 2001, 83 - 97, Nov. 2001.

[2] Tanja Zseby, et al., IPFIX Applicability, draft-ietf-ipfix-as-00.txt, June 2003.

[3] B. Claise, et al., IPFIX Protocol Specifications, draft-ietf-ipfix-protocol-00.txt, June 2003.

[4] G. Sadasivan, et al., Architecture Model for IP Flow Information Export, draft-ietf-ipfix-arch-00.txt, June 2003.

[5] P. Calato, et al., Information Model for IP Flow Information Export, draft-ietf-ipfix-info-00.txt, June 2003.

[6] J. Quittek, et al., Requirements for IP Flow Information Export, draft-ietf-ipfix-reqs-10.txt, June 2003.

Author's Addresses

Changhoon Kim
Engineering Staff, ETRI,
161 Gajeong-Dong, Yuseong-Gu, Daejon, 305-350, South Korea
kimch@etri.re.kr

Taesang Choi
Senior Engineering Staff, ETRI,
161 Gajeong-Dong, Yuseong-Gu, Daejon, 305-350, South Korea
choits@etri.re.kr
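As a closing illustration of section 3.2.3, the following Python sketch shows one possible reading of a Pattern Specification whose Pattern Position restricts both the packets and the payload bytes that are searched, applied to a payload reassembled from fragments. It is a sketch under stated assumptions, not the authors' method: the names PatternSpecification, reassemble and first_match are hypothetical, fragment offsets are taken to be plain byte offsets, and gaps or overlaps between fragments are ignored.

```python
from dataclasses import dataclass


@dataclass
class PatternSpecification:
    """Hypothetical encoding of a pattern and its Pattern Position."""
    pattern: bytes
    packet_range: range  # 0-based packet sequence numbers to inspect
    byte_range: range    # payload byte offsets in which to search


def reassemble(fragments):
    """Join (byte_offset, data) fragments in offset order.

    Simplification: assumes contiguous, non-overlapping fragments.
    """
    return b"".join(data for _, data in sorted(fragments))


def first_match(spec, payloads):
    """Return the sequence number of the first packet whose payload
    contains the pattern inside the allowed byte range, else None."""
    for seq, payload in enumerate(payloads):
        if seq not in spec.packet_range:
            continue  # outside the packet sequence range: not searched
        window = payload[spec.byte_range.start:spec.byte_range.stop]
        if spec.pattern in window:
            return seq
    return None


# A signature split across two fragments is only visible after reassembly.
fragments = [(6, b"rent protocol"), (0, b"BitTor")]
payloads = [b"hello", reassemble(fragments)]
spec = PatternSpecification(b"BitTorrent", range(0, 3), range(0, 32))
```

Narrowing packet_range or byte_range reduces the amount of payload that is searched, which is the resource-limiting role the text assigns to the Pattern Position.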