IPFIX Working Group                                          B. Trammell
Internet-Draft                                                CERT/NetSA
Intended status: Standards Track                               E. Boschi
Expires: January 10, 2008                                 Hitachi Europe
                                                                 L. Mark
                                                                T. Zseby
                                                        Fraunhofer FOKUS
                                                               A. Wagner
                                                              ETH Zurich
                                                            July 9, 2007


                       An IPFIX-Based File Format
                    draft-trammell-ipfix-file-04.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 10, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This document describes a file format for the storage of flow data
   based upon the IPFIX message format.  It proposes a set of


Trammell, et al.        Expires January 10, 2008                [Page 1]

Internet-Draft                 IPFIX Files                     July 2007


   requirements for flat-file, binary flow data file formats, evaluates
   flow storage systems presently in use for their conformance to these
   requirements, then applies the IPFIX message format to these
   requirements to build a new file format.  This IPFIX-based file
   format is designed to facilitate interoperability and reusability
   among a wide variety of flow storage, processing, and analysis tools.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  4
   3.  Motivation . . . . . . . . . . . . . . . . . . . . . . . . . .  5
   4.  Requirements . . . . . . . . . . . . . . . . . . . . . . . . .  7
     4.1.  Record Format Flexibility  . . . . . . . . . . . . . . . .  7
     4.2.  Self Description . . . . . . . . . . . . . . . . . . . . .  7
     4.3.  Data Compression . . . . . . . . . . . . . . . . . . . . .  8
     4.4.  Indexing and Searching . . . . . . . . . . . . . . . . . .  8
     4.5.  Data Integrity . . . . . . . . . . . . . . . . . . . . . .  9
     4.6.  Creator Authentication and Confidentiality . . . . . . . .  9
     4.7.  Anonymization and Obfuscation  . . . . . . . . . . . . . . 10
     4.8.  Performance Characteristics  . . . . . . . . . . . . . . . 10
   5.  Survey of Existing Flow and Trace File Formats . . . . . . . . 11
     5.1.  NetFlow V5/V7  . . . . . . . . . . . . . . . . . . . . . . 11
     5.2.  Argus 2  . . . . . . . . . . . . . . . . . . . . . . . . . 11
     5.3.  SiLK . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
     5.4.  libpcap dumpfile . . . . . . . . . . . . . . . . . . . . . 12
   6.  IPFIX File Format Description  . . . . . . . . . . . . . . . . 13
     6.1.  Recommended Information Elements for IPFIX Files . . . . . 15
       6.1.1.  collectionTimeMilliseconds . . . . . . . . . . . . . . 16
       6.1.2.  informationElementAnonymizationType  . . . . . . . . . 16
       6.1.3.  maxExportSeconds . . . . . . . . . . . . . . . . . . . 16
       6.1.4.  maxFlowEndSeconds  . . . . . . . . . . . . . . . . . . 17
       6.1.5.  messageMD5Checksum . . . . . . . . . . . . . . . . . . 17
       6.1.6.  messageScope . . . . . . . . . . . . . . . . . . . . . 17
       6.1.7.  minExportSeconds . . . . . . . . . . . . . . . . . . . 18
       6.1.8.  minFlowStartSeconds  . . . . . . . . . . . . . . . . . 18
       6.1.9.  sessionScope . . . . . . . . . . . . . . . . . . . . . 19
     6.2.  Recommended Options Templates for IPFIX Files  . . . . . . 19
       6.2.1.  Message Checksum Options Template  . . . . . . . . . . 19
       6.2.2.  Template Anonymization Options Template  . . . . . . . 20
       6.2.3.  File Time Window Options Template  . . . . . . . . . . 21
       6.2.4.  Export Session Details Options Template  . . . . . . . 22
       6.2.5.  Message Details Options Template . . . . . . . . . . . 23
     6.3.  Recommended Compression Error Resilience Strategy  . . . . 25
     6.4.  Recommended Encryption Error Resilience Strategy . . . . . 27
   7.  Applicability of IPFIX Files . . . . . . . . . . . . . . . . . 27
     7.1.  Testing IPFIX Collecting Processes . . . . . . . . . . . . 27
     7.2.  Storage of IPFIX-collected Flow Data . . . . . . . . . . . 28
   8.  Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
   9.  Security Considerations  . . . . . . . . . . . . . . . . . . . 29
   10. IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 29
   11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 30
   12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 30
     12.1. Normative References . . . . . . . . . . . . . . . . . . . 30
     12.2. Informative References . . . . . . . . . . . . . . . . . . 31
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 31
   Intellectual Property and Copyright Statements . . . . . . . . . . 34


1.  Introduction

   This document proposes a file format based upon IPFIX.  It begins by
   exploring the motivation for proposing a standardized flow file
   format, and using IPFIX as the basis for this new file format.  It
   then proposes a set of requirements for this file format, evaluates
   existing flow storage file formats for their conformance to these
   requirements, and describes either how the IPFIX message format meets
   each requirement, or how a file format based upon it could meet the
   requirement.  It closes by proposing an initial specification of the
   new file format and providing examples of IPFIX Files meeting this
   specification.  This format makes use of the IPFIX Options mechanism
   for additional file metadata, in order to avoid requiring any
   protocol or message format extensions.


2.  Terminology

   Terms used in this document that are defined in the Terminology
   section of the IPFIX Protocol [I-D.ietf-ipfix-protocol] document are
   to be interpreted as defined there.

   IPFIX File:   An IPFIX File is a serialized stream of IPFIX Messages
      stored on a filesystem.  Any IPFIX Message stream that would be
      considered valid when transported over one or more of the
      specified IPFIX transports (SCTP, TCP, or UDP) as defined in the
      IPFIX Protocol draft [I-D.ietf-ipfix-protocol] is considered an
      IPFIX File for purposes of this draft; however, this draft further
      restricts that definition with recommendations on the construction
      of IPFIX Files that meet the requirements identified herein.

   IPFIX File Reader:   An IPFIX File Reader is a Process which reads
      IPFIX Files from a filesystem, and is analogous to an IPFIX
      Collecting Process.  An IPFIX File Reader MUST behave as an IPFIX
      Collecting Process as outlined in the IPFIX Protocol draft
      [I-D.ietf-ipfix-protocol], except as modified by this document.

   IPFIX File Writer:   An IPFIX File Writer is a process which writes
      IPFIX Files to a filesystem, and is analogous to an IPFIX
      Exporting Process.  An IPFIX File Writer MUST behave as an IPFIX
      Exporting Process as outlined in the IPFIX Protocol draft
      [I-D.ietf-ipfix-protocol], except as modified by this document.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].


3.  Motivation

   There is a wide variety of applications for the file-based storage
   of IP flow data, across a continuum of time scales.  Tools used in
   the analysis of flow data and creation of analysis products often use
   files as a convenient unit of work, with an ephemeral lifetime.  A
   set of flows relevant to a security investigation may be stored in a
   file for the duration of that investigation, and further exchanged
   among incident handlers via email or within an external incident
   handling workflow application.  Sets of flow data relevant to
   Internet measurement research may be published as files, much as
   libpcap packet trace files are, to provide common data sets for the
   repeatability of research efforts; these files would have lifetimes
   measured in months or years.  Operational flow measurement systems
   also have a need for long-term, archival storage of flow data, either
   as a primary flow data repository, or as a backing tier for online
   storage in a relational database management system (RDBMS).

   The variety of applications of flow data, and the variety of
   presently deployed storage approaches, would seem to indicate the
   need for a standard approach to flow storage with applicability
   across the continuum of time scales over which flow data is stored.
   A storage format based around flat files would best address the
   variety of storage requirements.  While much work has been done on
   structured storage via RDBMS, relational database systems are not a
   good basis for format standardization owing to the fact that their
   internal data structures are generally private to a single
   implementation and subject to change for internal reasons.  Also,
   there is a wide variety of operations available on flat files, and
   external tools and standards can be leveraged to meet file-based flow
   storage requirements.  Further, flow data is often not very
   semantically complicated and is managed in very high volume; an
   RDBMS-based flow storage system would therefore not benefit much from
   the advantages of relational database technology.

   The simplest way to create a new file format is simply to serialize
   some internal data model to disk, with either textual or binary
   representation of data elements, and some framing strategy for
   delimiting fields and records.  "Ad-hoc" file formats such as this
   have several important disadvantages.  Chief among them, they impose
   the semantics of the data model from which they are derived on the
   file format; as such, they are difficult to extend, describe, and
   standardize.

   Over the past decade XML markup has emerged as a new "universal"
   representation format for structured data.  It is intended to be
   human-readable; indeed, that is one reason for its rapid adoption.
   However, XML has limited usefulness for representing network flow
   data.  Network flow data has a simple, repetitive, non-hierarchical
   structure that does not benefit much from XML.  An XML representation
   of flow data would be an essentially flat list of the attributes and
   their values for each flow record.  At the same time network flow
   data has well-defined semantics, required to do any meaningful
   processing; these semantics are not known to typical XML tools.

   The XML approach to data encoding is very heavyweight when compared
   to binary flow encoding.  While binary flow encodings use a small
   number of (or even just one) flat data structures that are entirely
   sufficient to encode flow data, XML uses start- and end-tags, and
   plain-text encoding of the actual values.  This leads to significant
   inefficiency in encoding size.  Typical network flow datasets can
   contain millions or billions of flows per hour of traffic
   represented.  Any increase in storage size per record can have
   dramatic impact on flow data storage and transfer sizes.  While data
   compression algorithms can partially remove the redundancy introduced
   by XML encoding, they introduce additional overhead of their own.
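
   As a rough illustration of this size difference, the following
   Python sketch encodes one flow record both as a fixed-length binary
   structure and as a flat XML element.  The record layout, the
   function names, and the sample values are assumptions made for this
   example only; the field names are borrowed from the IPFIX
   Information Model.

```python
import ipaddress
import struct
import xml.etree.ElementTree as ET

# Hypothetical sample record; the values are illustrative only.
FLOW = {
    "sourceIPv4Address": "192.0.2.1",
    "destinationIPv4Address": "198.51.100.7",
    "sourceTransportPort": 49152,
    "destinationTransportPort": 80,
    "protocolIdentifier": 6,
    "packetDeltaCount": 12,
    "octetDeltaCount": 4096,
}


def encode_binary(flow):
    """Pack the record as one flat, network-byte-order structure."""
    return struct.pack(
        "!4s4sHHBQQ",
        ipaddress.IPv4Address(flow["sourceIPv4Address"]).packed,
        ipaddress.IPv4Address(flow["destinationIPv4Address"]).packed,
        flow["sourceTransportPort"],
        flow["destinationTransportPort"],
        flow["protocolIdentifier"],
        flow["packetDeltaCount"],
        flow["octetDeltaCount"],
    )


def encode_xml(flow):
    """Encode the same record as a flat list of child elements."""
    root = ET.Element("flow")
    for name, value in flow.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root)


if __name__ == "__main__":
    print("binary octets:", len(encode_binary(FLOW)))
    print("XML octets:   ", len(encode_xml(FLOW)))
```

   Under these assumptions the binary form occupies 29 octets, and the
   XML form is several times larger because every value carries its
   own start- and end-tag.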

   A further problem is that XML processing tools require a full XML
   parser.  XML parsers are fully general and therefore complex,
   resource-intensive and relatively slow.  Since network flow datasets
   can be very large, XML parsing introduces significant processing time
   overhead.  In contrast, parsers for typical binary flow data
   encodings are simple in structure, since they need only parse a very
   small header to gain complete knowledge of all following fields for
   the particular flow.  These can then be read in a very efficient
   linear fashion without the need for any further decisions.  The
   overhead from encoding flow data with XML may well be prohibitive for
   processing steps that are easily done with standard binary flow
   encodings.  At the same time XML encoding offers no discernible
   advantage to the flow storage use case.

   This leads us to propose the IPFIX message format as the basis for a
   new flow data file format.  The IPFIX working group, in defining the
   IPFIX protocol, has already defined an information model and data
   formatting rules for representation of flow data.  Especially at
   shorter time scales, when a file is a unit of data interchange, the
   filesystem may be viewed as simply another IPFIX message transport
   between processes.  This format is especially well suited to
   representing flow data, as it was designed specifically for flow data
   export; it is easily extensible unlike ad-hoc serialization, and
   compact unlike XML.  In addition, IPFIX is an emerging standard for
   the export and collection of flow data; using a common format for
   storage and analysis at the collection side allows implementors to
   use substantially the same information model and data formatting
   implementation for transport as well as storage.


4.  Requirements

   In this section, we outline a proposed set of requirements
   [SAINT2007] for any persistent storage format for flow data.  First
   and foremost, a flow data file format should support storage across
   the continuum of time scales important to flow storage applications.
   Each of the requirements enumerated in the sections below is broadly
   applicable to flow storage applications, though each may be more
   important at certain time scales.  For each, we first identify the
   requirement, then explain how the IPFIX message format addresses it,
   or briefly outline the changes that must be made in order for an
   IPFIX-based file format to meet the requirement.

4.1.  Record Format Flexibility

   Due to the wide variety of flow attributes collected by different
   network flow attribute measurement systems, the ideal flow storage
   format will not impose a single data model or a specific record type
   on the flows it stores.  The file format must be flexible and
   extensible; that is, it must support multiple record types definable
   within the file itself, and must be able to support new field types
   for data within the records in a graceful way.

   IPFIX provides extensibility through the use of Templates to describe
   each Data Record, through the use of an IANA Registry to define its
   Information Elements, and through the use of enterprise-specific
   Information Elements.
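
   The way a Template makes a record format definable within the data
   stream can be sketched as follows.  This is a simplified model, not
   an IPFIX decoder: a Template is reduced to a list of (Information
   Element ID, length) pairs, every field is treated as an unsigned
   integer, and variable-length and enterprise-specific fields are
   omitted.  The function and variable names are hypothetical.

```python
# A Template is modelled here as an ordered list of
# (Information Element ID, field length in octets) pairs.
# The IDs below are the IANA-assigned ones for these four elements.
IE_NAMES = {
    1: "octetDeltaCount",
    4: "protocolIdentifier",
    8: "sourceIPv4Address",
    12: "destinationIPv4Address",
}


def decode_record(template, data, offset=0):
    """Decode one Data Record; return (record, offset of next record)."""
    record = {}
    for ie_id, length in template:
        raw = data[offset:offset + length]
        if len(raw) != length:
            raise ValueError("record is shorter than its template")
        # Simplification: every field is read as an unsigned
        # big-endian integer, whatever its real abstract data type.
        record[IE_NAMES.get(ie_id, ie_id)] = int.from_bytes(raw, "big")
        offset += length
    return record, offset


if __name__ == "__main__":
    # Two different record shapes decoded by the same routine.
    addresses = [(8, 4), (12, 4), (4, 1)]
    counters = [(4, 1), (1, 8)]
    print(decode_record(addresses, bytes([192, 0, 2, 1,
                                          198, 51, 100, 7, 6])))
    print(decode_record(counters,
                        bytes([17]) + (1500).to_bytes(8, "big")))
```

   Because the decoder is driven entirely by the Template, a new
   record type needs a new Template rather than new decoding code.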

4.2.  Self Description

   Archived data may be read at a time in the future when any external
   reference to the meaning of the data may be lost.  The ideal flow
   storage format should be self-describing; that is, a process reading
   flow data from storage should be able to properly interpret the
   stored flows without reference to anything other than standard
   sources (e.g., the standards document describing the file format) and
   the stored flow data itself.

   The IPFIX message format is partially self-describing; that is, IPFIX
   Templates containing only IANA-assigned Information Elements can be
   completely interpreted according to the IPFIX Information Model
   without additional external data.

   However, Templates containing private Information Elements lack
   detailed type and semantic information; a Collecting Process
   receiving data described by a Template containing private Information
   Elements it does not understand can only treat the data contained
   within those Information Elements as octet arrays.  To be fully
   self-describing, Enterprise-Specific Information Elements must be
   additionally described via IPFIX Options according to the Information
   Element Semantics Options Template defined in "Extended Type
   Information for IPFIX Enterprise-Specific Information Elements"
   [I-D.boschi-ipfix-extended-type].

4.3.  Data Compression

   Regardless of the representation format, flow data describing traffic
   on real networks tends to be highly compressible.  Compression tends
   to improve the scalability of flow collection systems, by reducing
   the disk storage and I/O bandwidth requirement for a given workload.
   The ideal flow storage format should support applications which wish
   to leverage this fact by supporting compression of stored data.

   The IPFIX message format has no support for data compression, as the
   IPFIX protocol was designed for speed and simplicity of export.  Of
   course, any flat file is readily compressible using a wide variety of
   external data compression tools, formats, and algorithms; therefore,
   this requirement can be met externally.

   However, a couple of simple optimizations can be made by File Writers
   to increase the integrity and usability of compressed IPFIX data;
   these are outlined in the Recommended Compression Error Resilience
   Strategy section, which appears below.

4.4.  Indexing and Searching

   Binary, record-stream-oriented file formats natively support only one
   form of searching: sequential scan in file order.  By choosing the
   order of records in a file carefully (e.g., by flow start or flow end
   time), a file can be indexed by a single key.

   Beyond this, properly addressing indexing is an application-specific
   problem, as it inherently involves tradeoffs between storage
   complexity and retrieval speed, and requirements vary widely based on
   time scales and the types of queries used from site to site.
   However, a generic standard flow storage format may provide limited
   direct support for indexing and searching.

   The ideal flow storage format will support a limited table of
   contents facility noting that the records in a file contain data
   relating only to certain keys or values of keys, in order to keep
   multi-file search implementations from having to scan a file for data
   it does not contain.

   The IPFIX message format has no direct support for indexing.
   However, its template mechanism and the technique described in
   "Reducing Redundancy in IPFIX and PSAMP Reports"
   [I-D.ietf-ipfix-reducing-redundancy] can be used to describe the
   contents of a file in a limited way.  Additionally, as flow data is
   often sorted and divided by time, the start and end time of the flows
   in a file may be declared using the File Time Window Options Record
   defined below.
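
   The benefit of declaring a time window per file can be sketched as
   follows.  The example assumes that the earliest flow start time and
   latest flow end time of each file have already been read from the
   file; the TimeWindow type, its field names, and the file names are
   hypothetical and do not reflect the record encoding.

```python
from typing import NamedTuple


class TimeWindow(NamedTuple):
    """Declared time bounds of one file, in seconds since the epoch."""
    path: str
    min_flow_start: int
    max_flow_end: int


def files_to_scan(windows, query_start, query_end):
    """Return only the files whose window overlaps the query range."""
    return [
        w.path
        for w in windows
        if w.min_flow_start <= query_end and w.max_flow_end >= query_start
    ]


if __name__ == "__main__":
    windows = [
        TimeWindow("flows-00.ipfix", 0, 3599),
        TimeWindow("flows-01.ipfix", 3600, 7199),
        TimeWindow("flows-02.ipfix", 7200, 10799),
    ]
    # Only files that can contain flows in the range are opened.
    print(files_to_scan(windows, 3000, 4000))
```

   A multi-file search built this way never opens a file whose
   declared window lies entirely outside the query range.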

4.5.  Data Integrity

   When storing flow data over long time scales, especially for archival
   purposes, it is important to ensure that hardware or software faults
   do not introduce errors into the data over time.  The ideal flow
   storage format will support the detection and correction of encoding-
   level errors in the data.

   Note that more advanced error correction is almost certainly best
   handled at a layer below that addressed by this document.  Error
   correction is a topic well addressed by the storage industry in
   general (e.g., by RAID and other technologies), and by specifying a
   flow storage format based upon files, we can leverage these features
   to meet this requirement.

   However, the ideal flow storage format will be resilient against
   errors, providing an internal facility for the detection of errors
   and the ability to isolate errors to as few data records as possible.

   Note that this requirement interacts with the choice of data
   compression or encryption algorithm.  The use of block compression
   algorithms can serve to isolate errors to a single compression block,
   unlike stream compressors, which may fail to resynchronize after a
   single bit error, invalidating the entire message stream.  Similarly,
   the use of a stream cipher can serve to isolate errors in the
   plaintext without amplifying them as, for example, a cipher in CBC
   mode can.  See the "Recommended Compression Error Resilience
   Strategy" and "Recommended Encryption Error Resilience Strategy"
   sections below for more on this interaction.
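
   The error isolation property of block compression can be sketched
   as follows.  The example uses zlib from the Python standard library
   purely for convenience, compresses small groups of messages as
   independent blocks, and then damages one block; the block size, the
   function names, and the choice of zlib are assumptions of this
   example rather than recommendations.

```python
import zlib


def compress_blocks(messages, per_block=2):
    """Compress each group of messages as an independent zlib stream."""
    return [
        zlib.compress(b"".join(messages[i:i + per_block]))
        for i in range(0, len(messages), per_block)
    ]


def recover(blocks):
    """Decompress every block; return (recovered data, damaged indexes)."""
    good, bad = [], []
    for index, block in enumerate(blocks):
        try:
            good.append(zlib.decompress(block))
        except zlib.error:
            # The damage is confined to this block; keep going.
            bad.append(index)
    return good, bad


if __name__ == "__main__":
    messages = [(b"message-%d " % i) * 20 for i in range(6)]
    blocks = compress_blocks(messages)

    # Damage the last octet of the second block.  That octet is part
    # of the zlib Adler-32 trailer, so the damage is always detected.
    damaged = bytearray(blocks[1])
    damaged[-1] ^= 0xFF
    blocks[1] = bytes(damaged)

    good, bad = recover(blocks)
    print("damaged blocks:", bad)
    print("recovered blocks:", len(good))
```

   Only the messages in the damaged block are lost; the blocks before
   and after it decompress normally.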

   The IPFIX message format does not support data integrity assurance.
   It is assumed that advanced error correction will be provided
   externally.  For simple error detection support, checksums may be
   attached to messages via IPFIX Options according to the Message
   Checksum Options Template defined below.
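
   Per-message error detection of this kind can be sketched as
   follows.  The example computes an MD5 digest over each serialized
   message and later uses the stored digests to locate damaged
   messages.  It does not show how the digest is carried in an Options
   Record, and the function names are hypothetical.

```python
import hashlib


def message_checksum(message: bytes) -> bytes:
    """Return the 16-octet MD5 digest of one serialized message."""
    return hashlib.md5(message).digest()


def find_damaged(messages, checksums):
    """Return indexes of messages whose digest no longer matches."""
    return [
        index
        for index, (message, expected) in enumerate(zip(messages, checksums))
        if message_checksum(message) != expected
    ]


if __name__ == "__main__":
    messages = [b"first message", b"second message", b"third message"]
    checksums = [message_checksum(m) for m in messages]

    # Simulate a storage fault in the second message only.
    messages[1] = b"second massage"
    print("damaged messages:", find_damaged(messages, checksums))
```

   A checksum detects damage and confines it to a single message; it
   does not correct the damage.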

4.6.  Creator Authentication and Confidentiality

   Storage of flow data across long time scales may also require
   assurance that no unauthorized entity can read or modify the stored
   data.  Asymmetric-key cryptography can be applied to this problem, by
   signing flow data with the private key of the creator, and encrypting
   it with the public keys of those authorized to read it.  The ideal
   flow storage format will support the encryption and signing of flow
   data.

   As with error correction, this problem has been addressed well at a
   layer below that addressed by this document.  Instead of specifying a
   particular choice of encryption technology, we can leverage the fact
   that existing cryptographic technologies work quite well on data
   stored in files to meet this requirement.

   Beyond support for the use of TLS for transport over TCP or DTLS for
   transport over SCTP or UDP, both of which provide transient
   authentication and confidentiality, the IPFIX protocol does not
   support this requirement directly.  It is assumed that this
   requirement will be met externally.

4.7.  Anonymization and Obfuscation

   To ensure the privacy of individuals and organizations at the
   endpoints of communications represented by flow records, it is often
   necessary to obfuscate or anonymize stored and exported flow data.
   The ideal flow storage format will provide for a notation that a
   given information element on a given record type represents
   anonymized, rather than real, data.

   The IPFIX message format presently has no support for anonymization
   notation.  It should be noted that anonymization is one of the
   requirements given for IPFIX in RFC 3917 [RFC3917].  The decision to
   qualify this requirement with 'MAY' and not 'MUST' in the
   requirements document, and its subsequent lack of specification in
   the current version of the IPFIX protocol, is due to the fact that
   anonymization algorithms are still a research issue, and that there
   currently exist no standardized methods for anonymization.

   Simple anonymization notation may be attached to templates via IPFIX
   Options according to the Template Anonymization Options Template
   defined below.

4.8.  Performance Characteristics

   The ideal standard flow storage format will not have a significant
   negative impact on the performance of the application implementing
   it.  This is a non-functional requirement, but it is important to
   note that a standard that implies a performance penalty is unlikely
   to be widely implemented and adopted.

   A static analysis of the IPFIX message format would seem to suggest
   that implementations of it are not particularly prone to slowness;
   indeed, a template-based data representation is more easily subject
   to optimization for common cases than representations that embed
   structural information directly in the data stream (e.g., XML).
   However, a full analysis of the impact of using IPFIX messages as a
   basis for flow data storage on read/write performance will require
   more implementation experience and performance measurement.


5.  Survey of Existing Flow and Trace File Formats

5.1.  NetFlow V5/V7

   One de facto standard for the storage of flow data collected via
   Cisco NetFlow V5 or V7 is to serialize a stream of "raw" NetFlow
   datagrams into files.  These NetFlow PDU files consist of a
   collection of header-prefixed blocks (corresponding to the datagrams
   as received on the wire) containing fixed-length binary flow records.
   NetFlow V5 and V7 data may be mixed within a given file, as the
   header on each datagram defines the NetFlow version of the records
   following; there is indeed very little difference between the two
   record formats.

   NetFlow V5/V7 PDU files are neither extensible nor self-describing;
   however, their status as a de facto standard means the definition of
   the data format is well-understood.  Indexing, compression, error
   detection and correction, authentication, and confidentiality must be
   handled externally.
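
   Reading such a PDU file can be sketched as follows.  The sketch
   assumes a 24-octet datagram header for both versions, beginning
   with a 16-bit version and a 16-bit record count, and record lengths
   of 48 octets (V5) and 52 octets (V7).  These sizes are assumptions
   of this example and should be checked against the export format
   documentation; the function name is hypothetical.

```python
import io
import struct

HEADER_LEN = 24                 # assumed header length, V5 and V7
RECORD_LEN = {5: 48, 7: 52}     # assumed record length per version


def iter_pdus(stream):
    """Yield (version, record count, raw records) for each datagram."""
    while True:
        header = stream.read(HEADER_LEN)
        if not header:
            return
        if len(header) < HEADER_LEN:
            raise ValueError("truncated datagram header")
        # The version in each header selects the record format.
        version, count = struct.unpack("!HH", header[:4])
        if version not in RECORD_LEN:
            raise ValueError("unsupported NetFlow version %d" % version)
        body_len = count * RECORD_LEN[version]
        body = stream.read(body_len)
        if len(body) < body_len:
            raise ValueError("truncated datagram body")
        yield version, count, body


if __name__ == "__main__":
    # Build a synthetic file mixing one V5 and one V7 datagram.
    v5 = struct.pack("!HH", 5, 2) + bytes(20) + bytes(2 * 48)
    v7 = struct.pack("!HH", 7, 1) + bytes(20) + bytes(1 * 52)
    for version, count, body in iter_pdus(io.BytesIO(v5 + v7)):
        print(version, count, len(body))
```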

5.2.  Argus 2

   QoSient's Argus (as of version 2.0.6) uses a file format based upon a
   stream of type-and-length prefixed records.  There are two general
   types of records in this stream, management records and flow records.
   Management records export flow collection statistics, much like the
   recommended scoped data records in the IPFIX protocol.  Flow records
   contain information about a single flow each, and are further typed
   based upon the protocol of the flow (e.g., IP, ICMP, ARP).  The Argus
   file format natively supports bidirectional flow export, as each flow
   record contains both forward and reverse counters.

   The Argus tools support a transport protocol that simply encapsulates
   a record stream over a TCP connection.  Transport is collector-
   initiated; that is, a collector establishes a connection to an
   exporter in order to read a record stream.

   Argus files are not self-describing; that is, only the Argus tools
   themselves encapsulate the definition of each of the record types.
   The Argus file format is not extensible without changing the Argus
   implementation.  Argus provides no indexing facility for its file
   format, though records are roughly sorted by record generation time.
   Compression, error correction, authentication, and confidentiality
   are handled externally to the format, and are available as with all
   files.  There is no special support for data obfuscation in the
   format.

5.3.  SiLK

   The CERT/NetSA SiLK tools use a set of fixed-length binary record
   formats.  Each file is prefixed with a header which denotes which
   record format the file is stored in.  These record formats are
   differentiated by the presence or absence of certain fields; in this
   way, each format identifier is essentially a short-hand identifier
   for a template describing the record.  This also implies that only
   one type of record may be stored in any given file.

   As with Argus, SiLK files are not self-describing and are not
   extensible.  SiLK provides no indexing facility, though files are
   generally stored in flow end time order; and when used for archival
   storage, information about sensors and flow times appearing in each
   file is stored in the file path name.  Compression is handled
   internally to the file format, and allows the storage of compressed
   data in a file with uncompressed headers, and a guarantee of
   compression block boundary alignment with record boundaries.  Error
   correction, authentication, and confidentiality can be handled
   externally.  There is no special support for data obfuscation in the
   SiLK file format.

5.4.  libpcap dumpfile

   The libpcap dumpfile format is a packet trace format rather than a
   flow file format, so it does not address any of the requirements
   outlined above.  However, it is used widely in a use case (data
   storage and distribution for network measurement research) similar to
   one addressed by the format proposed in this draft, so we include it
   here.

   libpcap dumpfiles consist of a file header containing information
   common to the whole file (most importantly, the datalink layer, for
   interpretation of the datalink headers on each frame), followed by a
   set of raw captured frame records each prefixed by a frame header
   containing timestamp and length information.  The format is not
   particularly flexible or self-describing, nor does it need to be:
   undecoded frames are about as semantically simple as network traffic
   data can get.

   However, the simplicity and ubiquity of the libpcap dumpfile format
   has led to its becoming a de facto standard for the distribution of
   packet trace data for Internet measurement applications.  We propose
   the file format described in this draft in part as an analogue to the
   libpcap dumpfile format for flow data.
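
   The dumpfile layout described earlier in this section (one file
   header followed by header-prefixed frame records) can be sketched
   as follows.  The sketch handles only the classic dumpfile layout
   with microsecond timestamps, and the function name is hypothetical.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4


def read_pcap(stream):
    """Return (datalink type, list of frames) from a classic dumpfile."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated file header")
    # The magic number also reveals the byte order of the writer.
    magic = struct.unpack("<I", header[:4])[0]
    if magic == PCAP_MAGIC:
        endian = "<"
    elif magic == 0xD4C3B2A1:
        endian = ">"
    else:
        raise ValueError("not a classic libpcap dumpfile")
    (_magic, _major, _minor, _zone, _sigfigs,
     _snaplen, linktype) = struct.unpack(endian + "IHHiIII", header)

    frames = []
    while True:
        record = stream.read(16)
        if not record:
            break
        if len(record) < 16:
            raise ValueError("truncated frame header")
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack(
            endian + "IIII", record)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            raise ValueError("truncated frame")
        frames.append((ts_sec, ts_usec, orig_len, data))
    return linktype, frames


if __name__ == "__main__":
    # Synthetic one-frame dumpfile; linktype 1 denotes Ethernet.
    sample = (
        struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)
        + struct.pack("<IIII", 1183939200, 0, 4, 4)
        + b"\xde\xad\xbe\xef"
    )
    print(read_pcap(io.BytesIO(sample)))
```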

   Note that libpcap dumpfiles could be used as a storage format for any
   unidirectional, datagram-oriented protocol such as IPFIX or NetFlow,
   simply by storing the captured export session.  However, this has
   several important drawbacks.  First, the additional per-packet
   headers provided by pcap are redundant in the case of IPFIX, as
   length and export time are already available in the IPFIX Message
   Header.  Second, the link, network, and transport layer headers are
   stored in a dumpfile; these are not necessary for the successful
   interpretation of an IPFIX Message, and add unnecessary decode
   overhead.  Third, a file created by capturing an export session may
   require additional processing to reassemble fragmented datagrams in
   the message stream.


6.  IPFIX File Format Description

   An IPFIX file, as defined by this draft and elaborated below, is at
   its core simply an IPFIX Message stream serialized to some
   filesystem.  Any valid serialized IPFIX Message stream MUST be
   accepted by a File Reader as a valid IPFIX file.  In this way, the
   filesystem is simply treated as another IPFIX Transport alongside
   SCTP, TCP, and UDP, although one with unusually high latency, as the
   File Reader and File Writer are not necessarily synchronized in time,
   unlike IPFIX Collecting and Exporting Processes.
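
   As an illustration of treating a file as a serialized Message
   stream, the following minimal sketch walks a file one IPFIX Message
   at a time using the 16-octet IPFIX Message Header.  The names
   read_messages and HEADER are hypothetical and are not defined by
   this draft; error handling is reduced to raising ValueError.

```python
import struct

# IPFIX Message Header: version, length, export time, sequence number,
# observation domain ID (16 octets, network byte order).
HEADER = struct.Struct("!HHIII")
IPFIX_VERSION = 10


def read_messages(stream):
    """Yield (export_time, sequence, domain_id, body) per IPFIX Message.

    `stream` is any binary file-like object.  Hypothetical helper, not
    part of the draft.
    """
    while True:
        raw = stream.read(HEADER.size)
        if not raw:
            return  # clean end of file
        if len(raw) < HEADER.size:
            raise ValueError("truncated IPFIX Message Header")
        version, length, export_time, sequence, domain_id = HEADER.unpack(raw)
        if version != IPFIX_VERSION or length < HEADER.size:
            raise ValueError("not an IPFIX Message")
        # The length field counts the header, so the body is the remainder.
        body = stream.read(length - HEADER.size)
        if len(body) != length - HEADER.size:
            raise ValueError("truncated IPFIX Message body")
        yield export_time, sequence, domain_id, body
```

   A caller would typically open the file in binary mode and pass the
   file object to read_messages, then hand each body to a Set decoder.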

   An IPFIX File Reader MUST accept as valid any IPFIX message stream
   that would be considered valid by one or more of the other defined
   IPFIX transport layers.  Practically, this means that the union of
   template management features supported by SCTP, TCP, and UDP MUST be
   supported in IPFIX Files.  The following requirements apply to IPFIX
   File Readers:

   o  File Readers MUST accept IPFIX Messages containing Template Sets,
      Options Template Sets, and Data Sets within the same message, as
      with IPFIX over TCP or UDP.

   o  File Readers MUST accept Template Sets that define templates
      already defined within the file, as may occur with template
      retransmission when using IPFIX over UDP as described in section
      10.3.6 of the IPFIX Protocol draft [I-D.ietf-ipfix-protocol].  In
      the event of a conflict between a resent definition and a previous
      definition, the File Reader MUST assume that the new template
      replaces the old, consistent with UDP template expiration and
      ID reuse.

   o  File Readers MUST accept Template Withdrawals as described in
      section 8 of the IPFIX Protocol draft [I-D.ietf-ipfix-protocol],
      provided that the Template to be withdrawn is defined, as is the
      case with IPFIX over TCP and SCTP.
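
   The File Reader requirements above amount to a small amount of
   template state.  The following sketch shows one possible shape for
   that state; the class TemplateTable and its method names are
   hypothetical, templates are represented as already-decoded lists of
   (Information Element ID, length) pairs, and raising KeyError for a
   withdrawal of an undefined Template is an assumption, since this
   draft does not specify that case.

```python
class TemplateTable:
    """Per-file template state for a File Reader (illustrative only)."""

    def __init__(self):
        # Keyed by (Observation Domain ID, Template ID).
        self._templates = {}

    def define(self, domain_id, template_id, fields):
        # A resent or conflicting definition replaces the previous one,
        # as with UDP template retransmission and ID reuse.
        self._templates[(domain_id, template_id)] = tuple(fields)

    def withdraw(self, domain_id, template_id):
        # Withdrawals are accepted when the Template is defined; the
        # KeyError for an undefined Template is an assumption.
        del self._templates[(domain_id, template_id)]

    def lookup(self, domain_id, template_id):
        # Returns None when no Template is currently defined.
        return self._templates.get((domain_id, template_id))
```

   Keying the table by Observation Domain ID as well as Template ID
   keeps templates from different domains within one file separate.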

   However, for representation simplicity and read performance, File
   Writers SHOULD use the following template and scope management
   strategy:

   o  File Writers SHOULD write Template Sets and Options Template Sets
      at the beginning of the file, before any Data Sets, to ensure
      that all Templates are available and can be inspected before any
      data is read.  If the set of Templates used within a File is not
      known when the File Writer starts writing the File, the File
      Writer MAY interleave Template Sets and Options Template Sets with
      Data Sets within the File, but SHOULD write each Template Set or
      Options Template Set before any Data Set described by that
      Template.

   o  File Writers SHOULD emit special Data Records described by Options
      Templates at the beginning of the file after Template Sets and
      Options Template Sets as above, but before any other Data Records,
      in the following order:

      *  Time window records described by the File Time Window
         Options Template as defined in section 6.2.3 below; followed by

      *  commonPropertiesId definitions as described in "Reducing
         Redundancy in IPFIX and PSAMP Reports"
         [I-D.ietf-ipfix-reducing-redundancy]; followed by

      *  Semantics records as described in "Extended Type Information
         for IPFIX Enterprise-Specific Information Elements"
         [I-D.boschi-ipfix-extended-type]; followed by

      *  Anonymization notation records described by the Template
         Anonymization Options Template as defined in section 6.2.2
         below.

   o  File Writers SHOULD write Data Records described by Options
      Templates before any Data Records that depend on the scopes
      defined by those options.

   o  File Writers SHOULD use Template Withdrawals to withdraw Templates
      if template IDs need to be reused.  In this case, the new
      Templates reusing those IDs SHOULD appear directly in the file
      after the Template Withdrawals making the IDs available for reuse.
      Template Withdrawals SHOULD NOT be used unless necessary to reuse
      template IDs.
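
   For the simple case in which all Templates are known before writing
   begins, the recommended layout can be sketched as a stable sort over
   the items to be written.  The Kind enumeration and the function
   order_for_writing are hypothetical names introduced only for this
   illustration; payloads are opaque placeholders.

```python
from enum import IntEnum


class Kind(IntEnum):
    """Recommended position of each kind of item (illustrative)."""
    TEMPLATE = 0           # Template Sets and Options Template Sets
    TIME_WINDOW = 1        # time window record
    COMMON_PROPERTIES = 2  # commonPropertiesId definitions
    SEMANTICS = 3          # extended type semantics records
    ANONYMIZATION = 4      # anonymization notation records
    DATA = 5               # all other Data Records


def order_for_writing(items):
    """Return (kind, payload) pairs in the recommended file order.

    sorted() is stable, so items of the same kind keep their
    original relative order.
    """
    return sorted(items, key=lambda item: item[0])
```

   A File Writer that discovers Templates while writing cannot use
   this approach directly and would instead interleave Template Sets
   with Data Sets as permitted above.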

   Each IPFIX File is generally synonymous with a single Transport
   Session.  File Writers SHOULD store the Templates and Options
   required to decode the data within the File in the File itself, and
   File Readers SHOULD NOT use Templates or Options defined in one file
   to decode or interpret Data Sets in another.

   However, some applications, particularly those storing large
   collections of data over long periods of time, may benefit from the
   ability to treat a collection of IPFIX Files as a single Transport
   Session.  A File Reader MAY be configurable to treat a collection of
   Files (e.g., all the files in a directory) as a single Transport
   Session.  However, a File Reader MUST NOT treat a single IPFIX File
   as containing multiple Transport Sessions.

   File Writers SHOULD write IPFIX Messages within an IPFIX File in
   ascending Export Time order.  If a File Writer is writing data
   collected from an IPFIX Collecting Process, the Export Time SHOULD be
   the export time as reported by the remote IPFIX Exporting Process;
   otherwise, the Export Time SHOULD be the time at which the message
   was written to the file.
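
   The Export Time rule above can be sketched as follows.  Both
   function names are hypothetical, Export Times are epoch seconds as
   carried in the IPFIX Message Header, and "ascending" is interpreted
   here as non-decreasing (an assumption, since several Messages may
   share the same second).

```python
import time


def export_time_for(remote_export_time=None, now=None):
    """Choose the Export Time to store for a Message.

    Use the time reported by the remote Exporting Process when the
    Message was collected; otherwise use the time of writing.
    """
    if remote_export_time is not None:
        return remote_export_time
    return int(time.time()) if now is None else now


def in_export_time_order(export_times):
    """Return True if Export Times never decrease through the file."""
    times = list(export_times)
    return all(a <= b for a, b in zip(times, times[1:]))
```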

   By default, File Writers MAY write records to an IPFIX File in any
   order.  However, File Writers that write flow records to an IPFIX
   File in flowStartTime or flowEndTime order SHOULD be consistent in
   this ordering within each File.

   If an IPFIX File uses the technique described in "Reducing Redundancy
   in IPFIX and PSAMP Reports" [I-D.ietf-ipfix-reducing-redundancy] AND
   all of the non-Options Templates in the File contain the
   commonPropertiesId Information Element, a File Reader MAY assume the
   set of commonPropertiesId definitions provides a complete table of
   contents for the file, for searching purposes.

6.1.  Recommended Information Elements for IPFIX Files

   The following information elements are used by the options templates
   below to allow IPFIX message streams to meet the requirements
   outlined above without extension to the message format or protocol.
   IPFIX File Readers and Writers SHOULD support these Information
   Elements as defined below.

   In addition, IPFIX File Readers and Writers SHOULD support the
   Information Elements defined in "Extended Type Information for IPFIX
   Enterprise-Specific Information Elements"
   [I-D.boschi-ipfix-extended-type] in order to support self-description
   of Enterprise-Specific Information Elements and anonymization
   notation.

6.1.1.  collectionTimeMilliseconds

   Description:   The absolute timestamp at which the data within the
      scope containing this IE was received by a Collecting Process.
      This IE SHOULD be bound to its containing IPFIX Message via an
      options record and the messageScope IE, as defined below.

   Abstract Data Type:   dateTimeMilliseconds

   ElementId:   TBD1

   Status:   Proposed

6.1.2.  informationElementAnonymizationType

   Description:   A description of the anonymization status of an IPFIX
      information element within a template.  If this field is FALSE,
      the corresponding IE is not anonymized; to the best ability of the
      Exporting Process to determine, it represents a real value.  If
      this field is TRUE, the corresponding IE is anonymized; to the
      best ability of the Exporting Process to determine, it represents
      a value that has been transformed to maintain privacy.  Note that
      if no informationElementAnonymizationType is specified for an
      information element, it is assumed to be FALSE, or not anonymized.

   Abstract Data Type:   boolean

   ElementId:   TBD2

   Status:   Proposed

6.1.3.  maxExportSeconds

   Description:   The absolute Export Time of the latest IPFIX message
      within the scope containing this IE.  This IE SHOULD be bound to
      its containing IPFIX Transport Session (i.e., File) via an options
      record and the sessionScope IE, as defined below, and SHOULD
      appear only once in a given IPFIX File.

   Abstract Data Type:   dateTimeSeconds


   ElementId:   TBD3

   Status:   Proposed

   Units:   seconds

6.1.4.  maxFlowEndSeconds

   Description:   The latest absolute timestamp of the last packet
      within any Flow within the scope containing this IE, rounded up to
      the second.  This IE SHOULD be bound to its containing IPFIX
      Transport Session (i.e., File) via an options record and the
      sessionScope IE, as defined below, and SHOULD appear only once in
      a given IPFIX File.

   Abstract Data Type:   dateTimeSeconds

   ElementId:   TBD4

   Status:   Proposed

   Units:   seconds

6.1.5.  messageMD5Checksum

   Description:   The MD5 checksum of the IPFIX Message containing this
      record.  This IE SHOULD be bound to its containing IPFIX Message
      via an options record and the messageScope IE, as defined below,
      and SHOULD appear only once in a given IPFIX Message.  To
      calculate the value of this IE, first buffer the containing IPFIX
      Message, setting the value of this IE to all zeroes.  Then
      calculate the MD5 checksum of the resulting buffer as defined in
      RFC 1321 [RFC1321], place the resulting value in this IE, and
      export the buffered message.

   Abstract Data Type:   octetArray (16 bytes)

   ElementId:   TBD5

   Status:   Proposed

   Reference:   RFC 1321, The MD5 Message-Digest Algorithm [RFC1321]
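
   The calculation described above can be sketched as follows.  The
   function names and the offset parameter (the position of the
   16-octet checksum field within the buffered Message) are
   hypothetical; locating that field requires decoding the Message,
   which is not shown.  The verification step is an assumption that
   mirrors the write procedure, as this draft describes only the
   calculation.

```python
import hashlib

MD5_LEN = 16  # size of the messageMD5Checksum field in octets


def stamp_md5(message: bytes, offset: int) -> bytes:
    """Return the Message with its checksum field filled in."""
    buf = bytearray(message)
    # Step 1: set the checksum field to all zeroes.
    buf[offset:offset + MD5_LEN] = bytes(MD5_LEN)
    # Step 2: compute MD5 (RFC 1321) over the whole buffered Message.
    digest = hashlib.md5(bytes(buf)).digest()
    # Step 3: place the digest in the field.
    buf[offset:offset + MD5_LEN] = digest
    return bytes(buf)


def verify_md5(message: bytes, offset: int) -> bool:
    """Recompute the checksum and compare it with the stored value."""
    stored = message[offset:offset + MD5_LEN]
    recomputed = stamp_md5(message, offset)[offset:offset + MD5_LEN]
    return stored == recomputed
```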

6.1.6.  messageScope


   Description:   The presence of this Information Element as scope in
      an Options Template signifies that the options described by the
      Template apply to the IPFIX Message that contains them.  It is
      defined for general purpose message scoping of options, and
      proposed specifically to allow the attachment of a checksum to a
      message via IPFIX Options.  The value of this Information Element
      MUST be written as 0 by the File Writer or Exporting Process.  The
      value of this Information Element MUST be ignored by the File
      Reader or the Collecting Process.

   Abstract Data Type:   octet

   ElementId:   TBD6

   Status:   Proposed

6.1.7.  minExportSeconds

   Description:   The absolute Export Time of the earliest IPFIX message
      within the scope containing this IE.  This IE SHOULD be bound to
      its containing IPFIX Transport Session (i.e., File) via an options
      record and the sessionScope IE, as defined below, and SHOULD
      appear only once in a given IPFIX File.

   Abstract Data Type:   dateTimeSeconds

   ElementId:   TBD7

   Status:   Proposed

   Units:   seconds

6.1.8.  minFlowStartSeconds

   Description:   The earliest absolute timestamp of the first packet
      within any Flow within the scope containing this IE, rounded down
      to the second.  This IE SHOULD be bound to its containing IPFIX
      Transport Session (i.e., File) via an options record and the
      sessionScope IE, as defined below, and SHOULD appear only once in
      a given IPFIX File.

   Abstract Data Type:   dateTimeSeconds

   ElementId:   TBD8


   Status:   Proposed

   Units:   seconds

6.1.9.  sessionScope

   Description:   The presence of this Information Element as scope in
      an Options Template signifies that the options described by the
      Template apply to the IPFIX Transport Session that contains them.
      Note that as all options are implicitly scoped to Transport
      Session and Observation Domain, this Information Element is
      equivalent to a "null" scope.  It is defined for general purpose
      session scoping of options, and proposed specifically to allow the
      attachment of a time window to a file via IPFIX Options.  The
      value of this Information Element MUST be written as 0 by the
      File Writer or Exporting Process.  The value of this Information
      Element MUST be ignored by the File Reader or the Collecting
      Process.

   Abstract Data Type:   octet

   ElementId:   TBD9

   Status:   Proposed

6.2.  Recommended Options Templates for IPFIX Files

   The following Options Templates allow IPFIX message streams to meet
   the requirements outlined above without extension to the message
   format or protocol.  They are defined in terms of existing
   Information Elements defined in the IPFIX Information Model
   [I-D.ietf-ipfix-info], the extended type Information Elements defined
   in "Extended Type Information for IPFIX Enterprise-Specific
   Information Elements" [I-D.boschi-ipfix-extended-type], as well as
   Information Elements defined in the section above.  IPFIX File
   Readers and Writers SHOULD support these options templates as defined
   below.

   In addition, IPFIX File Readers and Writers SHOULD support the
   Options Templates defined in "Extended Type Information for IPFIX
   Enterprise-Specific Information Elements"
   [I-D.boschi-ipfix-extended-type] in order to support self-description
   of enterprise-specific Information Elements.

6.2.1.  Message Checksum Options Template

   The Message Checksum Options Template specifies the structure of a
   Data Record for attaching an MD5 message checksum to an IPFIX
   Message.  An MD5 message checksum as described MAY be used if long-
   term data integrity is important to the application.  The described
   Data Record MUST appear only once per IPFIX Message.

   The template SHOULD contain the following Information Elements:

   +--------------------+----------------------------------------------+
   | IE                 | Description                                  |
   +--------------------+----------------------------------------------+
   | messageScope       | A marker denoting this Option applies to the |
   |                    | whole IPFIX message; content is ignored.     |
   |                    | This Information Element MUST be defined as  |
   |                    | a Scope Field.                               |
   | messageMD5Checksum | The MD5 checksum of the containing IPFIX     |
   |                    | Message.                                     |
   +--------------------+----------------------------------------------+

6.2.2.  Template Anonymization Options Template

   The Template Anonymization Options Template specifies the structure
   of a Data Record for attaching anonymization notation information to
   Information Elements in specified Template Records.  A Data Record
   described by this Template SHOULD appear for each Information Element
   within a Template known by the Exporting Process or File Writer to
   contain anonymized data.

   The template SHOULD contain the following Information Elements:

   +-------------------------------------+-----------------------------+
   | IE                                  | Description                 |
   +-------------------------------------+-----------------------------+
   | templateId                          | The Template ID of the      |
   |                                     | template this record        |
   |                                     | describes; it is assumed to |
   |                                     | be valid within the         |
   |                                     | Observation Domain ID of    |
   |                                     | the containing IPFIX        |
   |                                     | Message, and MUST identify  |
   |                                     | a Template that has already |
   |                                     | been exported.  This        |
   |                                     | Information Element MUST be |
   |                                     | defined as a Scope Field.   |
   | informationElementId                | The Information Element     |
   |                                     | identifier of the           |
   |                                     | Information Element within  |
   |                                     | the specified Template this |
   |                                     | record describes.  This     |
   |                                     | Information Element MUST be |
   |                                     | defined as a Scope Field.   |
   | privateEnterpriseNumber             | The Private Enterprise      |
   |                                     | number of the Information   |
   |                                     | Element within the          |
   |                                     | specified Template this     |
   |                                     | record describes.  May be 0 |
   |                                     | if this record describes a  |
   |                                     | public Information Element. |
   |                                     | This Information Element    |
   |                                     | MUST be defined as a Scope  |
   |                                     | Field.                      |
   | informationElementAnonymizationType | The anonymization type of   |
   |                                     | the specified Information   |
   |                                     | Element.                    |
   +-------------------------------------+-----------------------------+
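
   A File Reader might hold the records described by this Template in
   a lookup table keyed by the three Scope Fields.  The following
   sketch is illustrative only; the class and method names are
   hypothetical.  It applies the default stated in section 6.1.2: an
   Information Element with no record is treated as not anonymized.

```python
class AnonymizationMap:
    """Anonymization status per (template, enterprise, IE) triple."""

    def __init__(self):
        self._flags = {}

    def record(self, template_id, ie_id, pen, anonymized):
        # pen is the privateEnterpriseNumber; 0 denotes a public IE.
        self._flags[(template_id, pen, ie_id)] = bool(anonymized)

    def is_anonymized(self, template_id, ie_id, pen=0):
        # No record means FALSE, that is, not anonymized.
        return self._flags.get((template_id, pen, ie_id), False)
```

   For example, after recording that Information Element 8
   (sourceIPv4Address) in Template 256 is anonymized, a lookup for
   Information Element 12 (destinationIPv4Address) in the same
   Template still returns False.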

6.2.3.  File Time Window Options Template

   The File Time Window Options Template specifies the structure of a
   Data Record for attaching a time window to an IPFIX File; this Data
   Record is referred to as a time window record.  A time window record
   defines the earliest flow start time and the latest flow end time of
   the flow records within a File.  A time window record MAY appear
   within an IPFIX File if the time window information is available; a
   File Writer MUST NOT write more than one time window record to an
   IPFIX File.  A File Writer that writes a time window
   record to a File MUST NOT write any Flow with a start time before the
   beginning of the window or an end time after the end of the window to
   that File.

   The template SHOULD contain the following Information Elements:

   +---------------------+---------------------------------------------+
   | IE                  | Description                                 |
   +---------------------+---------------------------------------------+
   | sessionScope        | A marker denoting this Option applies to    |
   |                     | the whole IPFIX Transport Session (i.e.,    |
   |                     | IPFIX File); content is ignored.  This      |
   |                     | Information Element MUST be defined as a    |
   |                     | Scope Field.                                |
   | minFlowStartSeconds | The start time of the earliest flow in the  |
   |                     | Transport Session (i.e., File) in epoch     |
   |                     | seconds.                                    |
   | maxFlowEndSeconds   | The end time of the latest flow in the      |
   |                     | Transport Session (i.e., File) in epoch     |
   |                     | seconds.                                    |
   +---------------------+---------------------------------------------+
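
   The following sketch derives the two time values for a time window
   record from flow timestamps given in epoch milliseconds, rounding
   down and up to the second as described in sections 6.1.8 and 6.1.4,
   and then packs the fields of one Data Record.  The function names
   are hypothetical.  The encoding shown (one octet for sessionScope
   followed by two 32-bit unsigned dateTimeSeconds values, in network
   byte order) is an assumption about field lengths, which are chosen
   by the Template in practice.

```python
import struct


def time_window(flows_ms):
    """Return (minFlowStartSeconds, maxFlowEndSeconds).

    flows_ms is an iterable of (start_ms, end_ms) pairs.
    """
    flows = list(flows_ms)
    if not flows:
        raise ValueError("no flows: time window is undefined")
    # Earliest start, rounded down to the second.
    min_start = min(start for start, _ in flows) // 1000
    # Latest end, rounded up to the second (ceiling division).
    max_end = -(-max(end for _, end in flows) // 1000)
    return min_start, max_end


def pack_time_window(min_start, max_end):
    """Pack sessionScope (written as 0) and the two timestamps."""
    return struct.pack("!BII", 0, min_start, max_end)
```

   Because the window is only known once all flows have been seen, a
   File Writer that places this record at the beginning of the file
   needs the flow data (or its time bounds) before it starts writing.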

6.2.4.  Export Session Details Options Template

   The Export Session Details Options Template specifies the structure
   of a Data Record for recording the details of an IPFIX Transport
   Session in an IPFIX File.  It is intended for use in storing a single
   complete IPFIX Transport Session in a single IPFIX File.  The
   described Data Record SHOULD appear only once in a given IPFIX File.

   The template SHOULD contain the following Information Elements,
   subject to applicability as noted on each Information Element:

   +----------------------------+--------------------------------------+
   | IE                         | Description                          |
   +----------------------------+--------------------------------------+
   | sessionScope               | A marker denoting this Option        |
   |                            | applies to the whole IPFIX Transport |
   |                            | Session (i.e., IPFIX File); content  |
   |                            | is ignored.  This Information        |
   |                            | Element MUST be defined as a Scope   |
   |                            | Field.                               |
   | exporterIPv4Address        | IPv4 address of the IPFIX Exporting  |
   |                            | Process from which the Messages in   |
   |                            | this Transport Session were          |
   |                            | received.  Present only for          |
   |                            | Exporting Processes with an IPv4     |
   |                            | interface.  For multi-homed SCTP     |
   |                            | associations, this SHOULD be the     |
   |                            | primary path endpoint address of the |
   |                            | Exporting Process.                   |
   | exporterIPv6Address        | IPv6 address of the IPFIX Exporting  |
   |                            | Process from which the Messages in   |
   |                            | this Transport Session were          |
   |                            | received.  Present only for          |
   |                            | Exporting Processes with an IPv6     |
   |                            | interface.  For multi-homed SCTP     |
   |                            | associations, this SHOULD be the     |
   |                            | primary path endpoint address of the |
   |                            | Exporting Process.                   |
   | exporterTransportPort      | The source port from which the       |
   |                            | Messages in this Transport Session   |
   |                            | were received.                       |
   | collectorIPv4Address       | IPv4 address of the IPFIX Collecting |
   |                            | Process which received the Messages  |
   |                            | in this Transport Session.  Present  |
   |                            | only for Collecting Processes with   |
   |                            | an IPv4 interface.  For multi-homed  |
   |                            | SCTP associations, this SHOULD be    |
   |                            | the primary path endpoint address of |
   |                            | the Collecting Process.              |
   | collectorIPv6Address       | IPv6 address of the IPFIX Collecting |
   |                            | Process which received the Messages  |
   |                            | in this Transport Session.  Present  |
   |                            | only for Collecting Processes with   |
   |                            | an IPv6 interface.  For multi-homed  |
   |                            | SCTP associations, this SHOULD be    |
   |                            | the primary path endpoint address of |
   |                            | the Collecting Process.              |
   | collectorTransportPort     | The destination port on which the    |
   |                            | Messages in this Transport Session   |
   |                            | were received.                       |
   | collectorTransportProtocol | The IP Protocol Identifier of the    |
   |                            | transport protocol used to transport |
   |                            | Messages within this Transport       |
   |                            | Session.                             |
   | collectorProtocolVersion   | The version of the IPFIX Protocol    |
   |                            | used to transport Messages within    |
   |                            | this Transport Session.              |
   | minExportSeconds           | The Export Time of the first Message |
   |                            | in the Transport Session.            |
   | maxExportSeconds           | The Export Time of the last Message  |
   |                            | in the Transport Session.            |
   +----------------------------+--------------------------------------+

6.2.5.  Message Details Options Template

   The Message Details Options Template specifies the structure of a
   Data Record for attaching additional export details to an IPFIX
   Message.  These details include the time at which a message was
   received and information about the export and collection
   infrastructure used to transport the Message.

   The template SHOULD contain the following Information Elements,
   subject to applicability as noted for each Information Element.  Note
   that when used in conjunction with the Export Session Details Options
   Template, when storing a single complete IPFIX Transport Session in
   an IPFIX File, this template SHOULD contain only the messageScope and
   collectionTimeMilliseconds Information Elements.

   +----------------------------+--------------------------------------+
   | IE                         | Description                          |
   +----------------------------+--------------------------------------+
   | messageScope               | A marker denoting this Option        |
   |                            | applies to the whole IPFIX message;  |
   |                            | content is ignored.  This            |
   |                            | Information Element MUST be defined  |
   |                            | as a Scope Field.                    |
   | collectionTimeMilliseconds | The absolute time at which this      |
   |                            | Message was received by the IPFIX    |
   |                            | Collecting Process.                  |
   | exporterIPv4Address        | IPv4 address of the IPFIX Exporting  |
   |                            | Process from which the Messages in   |
   |                            | this Transport Session were          |
   |                            | received.  Present only for          |
   |                            | Exporting Processes with an IPv4     |
   |                            | interface, and if this information   |
   |                            | is not available via the Export      |
   |                            | Session Details Options Template.    |
   |                            | For multi-homed SCTP associations,   |
   |                            | this SHOULD be the primary path      |
   |                            | endpoint address of the Exporting    |
   |                            | Process.                             |
   | exporterIPv6Address        | IPv6 address of the IPFIX Exporting  |
   |                            | Process from which the Messages in   |
   |                            | this Transport Session were          |
   |                            | received.  Present only for          |
   |                            | Exporting Processes with an IPv6     |
   |                            | interface, and if this information   |
   |                            | is not available via the Export      |
   |                            | Session Details Options Template.    |
   |                            | For multi-homed SCTP associations,   |
   |                            | this SHOULD be the primary path      |
   |                            | endpoint address of the Exporting    |
   |                            | Process.                             |
   | exporterTransportPort      | The source port from which the       |
   |                            | Messages in this Transport Session   |
   |                            | were received.  Present only if this |
   |                            | information is not available via the |
   |                            | Export Session Details Options       |
   |                            | Template.                            |
   | collectorIPv4Address       | IPv4 address of the IPFIX Collecting |
   |                            | Process which received the Messages  |
   |                            | in this Transport Session.  Present  |
   |                            | only for Collecting Processes with   |
   |                            | an IPv4 interface, and if this       |
   |                            | information is not available via the |
   |                            | Export Session Details Options       |
   |                            | Template.  For multi-homed SCTP      |
   |                            | associations, this SHOULD be the     |
   |                            | primary path endpoint address of the |
   |                            | Collecting Process.                  |
   | collectorIPv6Address       | IPv6 address of the IPFIX Collecting |
   |                            | Process which received the Messages  |
   |                            | in this Transport Session.  Present  |
   |                            | only for Collecting Processes with   |
   |                            | an IPv6 interface, and if this       |
   |                            | information is not available via the |
   |                            | Export Session Details Options       |
   |                            | Template.  For multi-homed SCTP      |
   |                            | associations, this SHOULD be the     |
   |                            | primary path endpoint address of the |
   |                            | Collecting Process.                  |
   | collectorTransportPort     | The destination port on which the    |
   |                            | Messages in this Transport Session   |
   |                            | were received.  Present only if this |
   |                            | information is not available via the |
   |                            | Export Session Details Options       |
   |                            | Template.                            |
   | collectorTransportProtocol | The IP Protocol Identifier of the    |
   |                            | transport protocol used to transport |
   |                            | Messages within this Transport       |
   |                            | Session.  Present only if this       |
   |                            | information is not available via the |
   |                            | Export Session Details Options       |
   |                            | Template.                            |
   | collectorProtocolVersion   | The version of the IPFIX Protocol    |
   |                            | used to transport Messages within    |
   |                            | this Transport Session.  Present     |
   |                            | only if this information is not      |
   |                            | available via the Export Session     |
   |                            | Details Options Template.            |
   +----------------------------+--------------------------------------+

6.3.  Recommended Compression Error Resilience Strategy

   Note that, since any file may be compressed and decompressed with a
   variety of widely available tools implementing a variety of
   compression standards (both specified and de facto), compression of
   IPFIX File data can be accomplished externally.  However, compression
   at the file level is not particularly resilient to errors; in the
   worst case, a single bit error in a stream-compressed file may result
   in the loss of the entire file.

   To limit the impact of errors on the recoverability of compressed
   data, we recommend the use of block compression where possible.
   Ideally, the block compression algorithm should support the
   identification and isolation of blocks containing errors; bzip2 is an
   example of such a block compressor.
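
   As a non-normative illustration, a File Writer could apply bzip2
   to the Message stream as in the Python sketch below.  The function
   name compress_messages is an assumption made for this example.  In
   bzip2 the compression level selects the block size in units of
   100 kB, so a low level yields smaller blocks, which may limit the
   data lost per damaged block.  Recovery of the undamaged blocks of a
   damaged file is left to an external tool such as bzip2recover.

```python
import bz2


def compress_messages(messages, compresslevel=1):
    """Concatenate IPFIX Messages and compress them with bzip2.

    compresslevel=1 selects the smallest bzip2 block size (100 kB).
    """
    return bz2.compress(b"".join(messages), compresslevel=compresslevel)


# Round trip: decompression restores the original Message stream.
stream = [b"\x00\x0a\x00\x10" + b"\x00" * 12] * 3
restored = bz2.decompress(compress_messages(stream))
```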

   Since the block boundary of a block-compressed IPFIX File may fall in
   the middle of an IPFIX Message, resynchronization of an IPFIX Message
   stream by a File Reader after a compression error requires some care.
   The beginning of an IPFIX Message may be identified by its header
   signature (the Version field of the Message Header, 0x00 0x0A,
   followed by a 16-bit Message Length), but simply searching for the
   first occurrence of the Version field is insufficient, since these two
   bytes may occur in valid IPFIX Template or Data Sets.

   Therefore, we propose the following algorithm for File Readers to
   resynchronize an IPFIX Message Stream after skipping a compressed
   block containing errors:

   1.  Search after the error for the first occurrence of the byte string
       0x00, 0x0A (the IPFIX Message Header Version field).

   2.  Treat this field as the beginning of a candidate IPFIX Message.
       Read the two bytes following the Version field as a Message
       Length, and seek to that offset from the beginning of the
       candidate IPFIX Message.

   3.  If the first two bytes after the candidate IPFIX Message are
       0x00, 0x0A (i.e., the IPFIX Message Header Version field of the
       next message in the stream), or if the end of the file is reached
       precisely at the end of the candidate IPFIX Message, presume that
       the candidate IPFIX Message is valid, and begin reading the IPFIX
       File from the start of the candidate IPFIX Message.

   4.  If not, or if the seek reaches end-of-file or another block
       containing errors before finding the end of the candidate
       message, go back to step 1, starting the search two bytes from
       the start of the candidate IPFIX Message.

   The algorithm above will improperly identify a non-message as a
   message approximately 1 in 2^32 times, assuming random IPFIX data.
   It may be expanded to consider multiple candidate IPFIX Messages in
   order to increase reliability.
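
   The four steps above can be sketched in Python as follows.  This is
   a minimal illustration, not a reference implementation: it assumes
   that the decompressed bytes following the damaged block are already
   held in memory as a single buffer, the name resynchronize is
   hypothetical, and the extra check that the Message Length is at
   least the 16-octet Message Header size is an assumption that does
   not appear in the steps above.  Detection of a further damaged
   block during the seek (step 4) is not modelled.

```python
import struct

IPFIX_VERSION = b"\x00\x0a"  # Version field of the Message Header (10)
HEADER_LEN = 16              # fixed IPFIX Message Header size in octets


def resynchronize(buf, start=0):
    """Return the offset of the first plausible IPFIX Message at or
    after `start` in `buf`, or None if no candidate passes the check."""
    pos = start
    while True:
        # Step 1: find the next occurrence of the Version field.
        cand = buf.find(IPFIX_VERSION, pos)
        if cand < 0 or cand + 4 > len(buf):
            return None
        # Step 2: read the following two bytes as the Message Length.
        (length,) = struct.unpack_from("!H", buf, cand + 2)
        end = cand + length
        # Step 3: accept if the next Message starts right after the
        # candidate, or if the buffer ends exactly at its end.
        if length >= HEADER_LEN and (
            end == len(buf) or buf[end:end + 2] == IPFIX_VERSION
        ):
            return cand
        # Step 4: otherwise resume the search two bytes further on.
        pos = cand + 2


# A 20-octet Message, preceded by debris containing a decoy Version
# field whose "length" (32) does not lead to another Message.
message = b"\x00\x0a\x00\x14" + b"\x00" * 12 + b"\x01\x02\x03\x04"
debris = b"\xff\x00\x0a\x00\x20" + b"\xff" * 6
offset = resynchronize(debris + message + message)  # 11
```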


   In applications (e.g. archival storage) in which error resilience is
   very important, File Writers SHOULD use block compression algorithms,
   and MAY attempt to align IPFIX Messages within compression blocks to
   ease resynchronization after errors, if such is supported by the
   chosen block compressor.  File Readers SHOULD use the
   resynchronization algorithm above to minimize data loss due to
   compression errors.

6.4.  Recommended Encryption Error Resilience Strategy

   File-level encryption has error resilience issues similar to those of
   file-level compression.  Depending on the encryption method used, a
   single bit error in the encrypted data stream can render the entire
   remainder of the file unreadable.  CBC (Cipher Block Chaining) mode,
   which suffers from this low error resilience, is relatively common.

   In applications (e.g., archival storage) in which error resilience
   is very important, File Writers SHOULD encrypt with a stream cipher,
   for example a block cipher in OFB (Output Feedback) mode (often
   referred to as stream mode), instead of modes like CBC.  Stream
   ciphers do not amplify errors: a single-bit error in the ciphertext
   results in a single-bit error in the plaintext.  Alternatively, File
   Writers SHOULD use any other cipher which can resynchronize after bit
   errors.  An example is a block cipher in CBC mode that is
   reinitialized after a specific amount of data has been encrypted; the
   data lost per bit error then extends at most to the next
   reinitialization point.  In this case, File Writers SHOULD also use
   the Message Checksum Options Template to attach a checksum to each
   IPFIX Message in the IPFIX File, in order to support the recognition
   of errors in the decrypted data.
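
   The error behaviour of a stream cipher, and the use of a per-Message
   MD5 checksum to notice the damage, can be illustrated with the
   Python sketch below.  The keystream generator is a toy OFB-style
   construction built from SHA-256 purely for demonstration; it is an
   assumption of this example and not a vetted cipher, and all function
   names are hypothetical.  How the checksum is carried in a Message
   Checksum Options record is not shown.

```python
import hashlib


def keystream(key, iv, n):
    """Toy OFB-style keystream: each output block is the hash of the
    key and the previous output block.  Illustration only."""
    out, block = b"", iv
    while len(out) < n:
        block = hashlib.sha256(key + block).digest()
        out += block
    return out[:n]


def xor_crypt(key, iv, data):
    """Encrypt or decrypt by XOR with the keystream (same operation)."""
    ks = keystream(key, iv, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


def bit_errors(a, b):
    """Count the bit positions in which two equal-length strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


message = bytes(range(64))
digest = hashlib.md5(message).digest()      # stored alongside the Message

ciphertext = bytearray(xor_crypt(b"key", b"iv", message))
ciphertext[10] ^= 0x01                       # one bit flipped in storage
recovered = xor_crypt(b"key", b"iv", bytes(ciphertext))

damaged_bits = bit_errors(message, recovered)            # 1
checksum_ok = hashlib.md5(recovered).digest() == digest  # False
```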


7.  Applicability of IPFIX Files

   This section describes the specific applicability of IPFIX Files to
   various use cases.  IPFIX Files are particularly useful in a flow
   collection and processing infrastructure using IPFIX for flow export.
   We explore the applicability and provide guidelines for using IPFIX
   Files during the implementation and operation of IPFIX Collecting
   Processes.

7.1.  Testing IPFIX Collecting Processes

   IPFIX Files can be used to store IPFIX Messages for the testing of
   IPFIX Collecting Processes.  A variety of test cases may be stored in
   IPFIX Files.  First, IPFIX data sets collected in real network
   environments and stored in an IPFIX File can be used as input to
   check the behavior of new or extended implementations of IPFIX
   Collectors.  Furthermore, IPFIX Files could be used to validate the
   operation of a given IPFIX Collecting Process in a new environment,
   i.e., to test with recorded IPFIX data from the target network before
   installing the Collecting Process in the network.

   The IPFIX File format can also be used to store artificial,
   non-compliant reference messages for specific Collecting Process test
   cases.  Examples of such test cases are sets of IPFIX records with
   undefined Information Elements, Data Records described by missing
   Templates, or incorrectly framed messages or data sets.
   Representative error handling test cases are defined in "IPFIX
   Testing" [I-D.ietf-ipfix-testing].
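
   Two such reference messages could be generated as in the following
   Python sketch, which relies on the 16-octet Message Header and the
   4-octet Set Header of the IPFIX Protocol.  The helper names
   ipfix_set and ipfix_message, and the choice of Set ID 256 with eight
   zero octets of record data, are assumptions made for illustration.

```python
import struct


def ipfix_set(set_id, body):
    """Prefix `body` with a Set Header (Set ID, Length incl. header)."""
    return struct.pack("!HH", set_id, 4 + len(body)) + body


def ipfix_message(sets, export_time=0, sequence=0, domain=0, length=None):
    """Build an IPFIX Message; `length` overrides the correct Length
    field so that deliberately misframed messages can be produced."""
    body = b"".join(sets)
    if length is None:
        length = 16 + len(body)
    header = struct.pack("!HHIII", 10, length, export_time, sequence, domain)
    return header + body


# Data Set 256 although no Template 256 was ever written to the file.
orphan = ipfix_message([ipfix_set(256, b"\x00" * 8)])

# Same content, but the header Length (64) overstates the real size (28).
misframed = ipfix_message([ipfix_set(256, b"\x00" * 8)], length=64)
```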

   Furthermore, fast replay of IPFIX records stored in a file can be
   used for stress/load tests (e.g., high rate of incoming Data Records,
   large Templates with high Information Element counts), as described
   in "IPFIX Testing" [I-D.ietf-ipfix-testing].  Providing and using a
   common set of reference files simplifies testing and makes test
   results more comparable.

   Note that an extremely simple IPFIX Exporting Process may be crafted
   for testing purposes by reading an IPFIX File and transmitting it
   directly to a Collecting Process.  Similarly, an extremely simple
   Collecting Process may be crafted for testing purposes by accepting
   connections and/or IPFIX Messages from Exporting Processes and
   writing each session's message stream to an IPFIX File.
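
   A test Exporting Process of this kind might look like the Python
   sketch below.  It assumes TCP transport to a Collecting Process
   listening on the registered IPFIX port 4739; the function names
   iter_messages and replay are hypothetical, and no rate control or
   error recovery is attempted.

```python
import io
import socket
import struct

HEADER_LEN = 16  # fixed IPFIX Message Header size in octets


def iter_messages(stream):
    """Yield each IPFIX Message read from a binary file-like object."""
    while True:
        header = stream.read(HEADER_LEN)
        if not header:
            return  # clean end of file
        if len(header) < HEADER_LEN:
            raise ValueError("truncated IPFIX Message Header")
        version, length = struct.unpack_from("!HH", header)
        if version != 10 or length < HEADER_LEN:
            raise ValueError("not an IPFIX Message")
        body = stream.read(length - HEADER_LEN)
        if len(body) != length - HEADER_LEN:
            raise ValueError("truncated IPFIX Message")
        yield header + body


def replay(path, host, port=4739):
    """Send every Message in the IPFIX File at `path` over one TCP
    connection to the Collecting Process at (host, port)."""
    with open(path, "rb") as f, socket.create_connection((host, port)) as s:
        for message in iter_messages(f):
            s.sendall(message)


# Two back-to-back 20-octet Messages read from an in-memory "file".
sample = struct.pack("!HHIII", 10, 20, 0, 0, 0) + b"\x00" * 4
messages = list(iter_messages(io.BytesIO(sample * 2)))
```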

7.2.  Storage of IPFIX-collected Flow Data

   IPFIX Files can also, naturally, be used to store flow data collected
   by an IPFIX Collecting Process; indeed, this was one of the primary
   initial motivations behind the file format described within this
   document.  Using IPFIX Files as such allows IPFIX implementations to
   leverage substantially the same code for flow export and flow
   storage.  In addition, the storage of single Transport Sessions in
   IPFIX Files is particularly important for network measurement
   research, allowing repeatability of experiments by providing a format
   for the storage and exchange of IPFIX flow trace data much as the
   libpcap format is used for experiments on packet trace data.

   As noted in the section above, the simplest way for a Collecting
   Process to store the data collected in a single Transport Session is
   to write the incoming IPFIX Messages to a file as they are
   read.  However, while the resulting files are valid IPFIX Files, they
   lack information about the IPFIX Transport Session used to
   export them, such as the network addresses of the Exporting and
   Collecting Processes and the protocols used to transport them.  An
   IPFIX File Writer MAY store a single IPFIX Transport Session in an
   IPFIX File and record information about the Transport Session using
   the Export Session Details Options Template described above.

   Additional per-Message information MAY be recorded by the File Writer
   using the Message Details Options Template described above.  Per-
   message information includes the time at which each IPFIX Message was
   received at the Collecting Process, and can be used to resend IPFIX
   Messages while keeping the original measurement plane traffic
   profile.  This Options Template also allows the storage of the export
   session metainformation provided by the Export Session Details
   Options Template, for storing information from multiple Transport
   Sessions in the same IPFIX File.
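
   As an illustration of replaying Messages with their original timing,
   the Python sketch below derives the pause before each Message from
   recorded collectionTimeMilliseconds values.  It assumes these values
   have already been decoded from the Message Details Options records
   into a list, one per Message and in file order; the function name
   replay_delays and the speedup parameter are hypothetical.

```python
def replay_delays(collection_times_ms, speedup=1.0):
    """Return the delay in seconds to wait before sending each Message.

    The first Message is sent immediately; a timestamp that runs
    backwards is treated as a zero gap rather than a negative delay.
    """
    if not collection_times_ms:
        return []
    delays = [0.0]
    for prev, cur in zip(collection_times_ms, collection_times_ms[1:]):
        delays.append(max(cur - prev, 0) / 1000.0 / speedup)
    return delays


delays = replay_delays([1000, 1250, 1250, 2000])
# delays == [0.0, 0.25, 0.0, 0.75]
```

   A sender would sleep for each delay before transmitting the
   corresponding Message; a speedup greater than 1 compresses the
   profile for load testing.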


8.  Examples

   [TODO in revision -05 or later]


9.  Security Considerations

   The IPFIX-based file format itself does not directly introduce
   security issues.  Rather, it is used to store information which may
   be considered sensitive for privacy or business reasons.  The file
   format must therefore provide appropriate procedures to guarantee the
   integrity and confidentiality of the stored information.

   The underlying protocol used to exchange the information that will be
   stored using the format proposed in this document must also apply
   appropriate procedures to guarantee the integrity and confidentiality
   of the exported information.  Such issues are addressed in separate
   documents, specifically in the IPFIX Protocol
   [I-D.ietf-ipfix-protocol].


10.  IANA Considerations

   This document specifies the creation of several new IPFIX Information
   Elements in the IPFIX Information Element registry located at
   http://www.iana.org/assignments/ipfix, as defined in section 6.1
   above.  IANA has assigned the following Information Element numbers
   for their respective Information Elements as specified below:

   o  Information Element number TBD1 for the collectionTimeMilliseconds
      Information Element.

   o  Information Element number TBD2 for the
      informationElementAnonymizationType Information Element.

   o  Information Element number TBD3 for the maxExportSeconds
      Information Element.

   o  Information Element number TBD4 for the maxFlowEndSeconds
      Information Element.

   o  Information Element number TBD5 for the messageMD5Checksum
      Information Element.

   o  Information Element number TBD6 for the messageScope Information
      Element.

   o  Information Element number TBD7 for the minExportSeconds
      Information Element.

   o  Information Element number TBD8 for the minFlowStartSeconds
      Information Element.

   o  Information Element number TBD9 for the sessionScope Information
      Element.

   [NOTE for IANA: The text TBD1, TBD2, TBD3, TBD4, TBD5, TBD6, TBD7,
   TBD8, and TBD9 should be replaced with the respective assigned
   Information Element numbers where they appear in this document.]


11.  Acknowledgements

   Thanks to Maurizio Molina, Tom Kosnar, Andreas Kind, and Andrew
   Johnson for technical assistance with the requirements and their
   implementation within this specification.


12.  References

12.1.  Normative References

   [I-D.ietf-ipfix-protocol]
              Claise, B., "Specification of the IPFIX Protocol for the
              Exchange", draft-ietf-ipfix-protocol-24 (work in
              progress), November 2006.

   [I-D.ietf-ipfix-info]
              Quittek, J., "Information Model for IP Flow Information
              Export", draft-ietf-ipfix-info-15 (work in progress),
              February 2007.

   [I-D.ietf-ipfix-reducing-redundancy]
              Boschi, E., "Reducing Redundancy in IP Flow Information
              Export (IPFIX) and Packet Sampling (PSAMP) Reports",
              draft-ietf-ipfix-reducing-redundancy-04 (work in
              progress), May 2007.

   [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
              April 1992.

   [I-D.boschi-ipfix-extended-type]
              Boschi, E., Mark, L., Trammell, B., and T. Zseby,
              "Extended Type Information for IPFIX Enterprise-Specific
              Information Elements", draft-boschi-ipfix-ext-type-00
              (work in progress), June 2007.

12.2.  Informative References

   [I-D.ietf-ipfix-biflow]
              Trammell, B. and E. Boschi, "Bidirectional Flow Export
              using IPFIX", draft-ietf-ipfix-biflow-05 (work in
              progress), June 2007.

   [I-D.ietf-ipfix-testing]
              Schmoll, C. and P. Aitken, "IP Flow Information eXport
              (IPFIX) Testing", draft-ietf-ipfix-testing-01 (work in
              progress), June 2007.

   [RFC3917]  Quittek, J., Zseby, T., Claise, B., and S. Zander,
              "Requirements for IP Flow Information Export (IPFIX)",
              RFC 3917, October 2004.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [SAINT2007]
              Trammell, B., Boschi, E., Mark, L., and T. Zseby,
              "Requirements for a standardized flow storage solution",
              in Proceedings of the SAINT 2007 workshop on Internet
              Measurement Technology, Hiroshima, Japan, January 2007.


Authors' Addresses

   Brian H. Trammell
   CERT Network Situational Awareness
   Software Engineering Institute
   4500 Fifth Avenue
   Pittsburgh, Pennsylvania  15213
   United States

   Phone: +1 412 268 9748
   Email: bht@cert.org


   Elisa Boschi
   Hitachi Europe SAS
   Immeuble Le Theleme
   1503 Route les Dolines
   06560 Valbonne
   France

   Phone: +33 4 89874100
   Email: elisa.boschi@hitachi-eu.com


   Lutz Mark
   Fraunhofer Institute for Open Communication Systems
   Kaiserin-Augusta-Allee 31
   10589 Berlin
   Germany

   Phone: +49 30 3463 7306
   Email: lutz.mark@fokus.fraunhofer.de


   Tanja Zseby
   Fraunhofer Institute for Open Communication Systems
   Kaiserin-Augusta-Allee 31
   10589 Berlin
   Germany

   Phone: +49 30 3463 7153
   Email: tanja.zseby@fokus.fraunhofer.de


   Arno Wagner
   Swiss Federal Institute of Technology Zurich
   Gloriastrasse 35
   8092 Zurich
   Switzerland

   Phone: +41 44 632 70 04
   Email: arno@wagner.name


Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).


Trammell, et al.        Expires January 10, 2008               [Page 34]