TOC 
Network Working GroupS. Niccolini
Internet-DraftNEC
Intended status: InformationalB. Claise
Expires: August 13, 2010Cisco Systems Inc.
 B. Trammell
 Hitachi Europe
 H. Kaplan
 ACME Packet
 February 09, 2010


Advantages of using an IPFIX File Format for SIPCLF
draft-niccolini-sipclf-ipfix-00

Abstract

The SIPCLF WG is currently trying to identify file format(s) suitable for its scope as defined in the current charter. The group has so far received proposals for a text format and an indexed-text format that are going to be soon combined. Several people believe that IPFIX could be a valid alternative to these two proposals for several reasons detailed in this document.

This document discusses the main advantages related to the usage of IPFIX file format for SIPCLF. Additionally it gives preliminary indications on the basic set of information elements that would need to be defined by the SIPCLF WG.

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on August 13, 2010.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the BSD License.



Table of Contents

1.  Introduction
2.  Why IPFIX makes sense for SIPCLF
    2.1.  Thinking about the future
3.  IPFIX File Format
4.  Information Model and Information Elements for SIPCLF
5.  IPFIX file format for SIPCLF: an example visualized
6.  Summary
7.  Security Considerations
8.  IANA Considerations
9.  Acknowledgments
10.  Informative References
§  Authors' Addresses




 TOC 

1.  Introduction

The SIPCLF WG is chartered to produce a format suitable for logging at any SIP element. The charter explicitly addresses the need to search, merge and summarize log records from one or more possibly diverse elements as well as the need to correlate messages from multiple elements. An additional consideration to be taken into account when defining the file format is the extensibility of SIP. The SIPCLF WG is chartered to identify the fields to appear in a log record and provide one or more formats for encoding those fields. Specifying the mechanics of exchanging, transporting, and storing SIP Common Log Format records is explicitly out of scope.

This document tries to detail why a file format based on IPFIX would make sense for SIPCLF, trying to highlight the main advantages correlated to it. The intention of the authors is to consider both the needs of finding an easy solution today, as well as one that is also ready to accommodate future requirements asily.



 TOC 

2.  Why IPFIX makes sense for SIPCLF

These are the main advantages in favor of IPFIX:

These points suggest that, even if today file format has potentially not much overlap with the IPFIX one now, there are advantages in choosing IPFIX file format from the beginning of SIPCLF design. This solution will allow quicker adoption in the beginning (thanks to the tools parsing IPFIX that are already available and can be reused right away) and avoid sub-optimality (e.g., proxy translating formats between each other) in the future.



 TOC 

2.1.  Thinking about the future

Thinking about the future (even if the charter doesn’t address these points), there are high chances that:



 TOC 

3.  IPFIX File Format

A draft on using IPFIX for SIPCLF does not need to completely detail a file format - the "file format" is already defined in [RFC5655] (Trammel, B., Boschi, E., Mark, L., Zseby, T., and A. Wagner, “Specification of the IP Flow Information Export (IPFIX) File Format,” October 2009.) and the SIPCLF one would be based on that. The advantage of this file format is that it was already designed to facilitate interoperability and reusability among a wide variety of storage, processing, and analysis tools.



 TOC 

4.  Information Model and Information Elements for SIPCLF

As defined in [RFC3444] (Pras, A. and J. Schoenwaelder, “On the Difference between Information Models and Data Models,” January 2003.) the main purpose of an Information Model is to model managed objects at a conceptual level, independent of any specific implementations or protocols used to transport the data.

This draft does not yet attempt to formally define information elements for SIPCLF. Below you can find initial reasoning about the "standard set" needed for logging SIP events.

Main information for SIPCLF log record is largely already defined in other documents, e.g., in [RFC3261] (Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, “SIP: Session Initiation Protocol,” June 2002.).

There are fundamentally two options depending on how detailed the information elements needs to be. The first possible option is to keep the information elements coarse grained, e.g., for each SIP message that is logged, a Status-Line or Request-Line, an ordered list of pairs of where each pair has a header-name and a header-value, and finally zero or one message-bodies. Additional information could be about where the message was coming from and going to from the transport layer as well as a timestamp. The second option is to have fine grained definition of information elements, e.g., defining SIP to and from tags as elements themselves (the list could be extended to include Call-Id, CSeq, Max-Forwards, Via, Contact, etc.). This draft does not favor one option or another as they both have pros and cons; the authors just want to point out that the definition of such information elements is going to be quite straightforward once the SIPCLF WG has agreed on the level of detail and complexity that SIPCLF log record requires (an example of these fields is available here [I‑D.gurbani‑sipclf‑problem‑statement] (Gurbani, V., Burger, E., Anjali, T., Abdelnur, H., and O. Festor, “The Common Log Format (CLF) for the Session Initiation Protocol (SIP),” January 2010.).

The authors wish to point out that there have been already attempts to define IPFIX information elements for when using IPFIX for SIP previously. This has been discussed already in the IPFIX WG based on an expired draft (draft-huici-ipfix-sipfix-00), the paper that was linked with that draft is referenced here [IM.sipfix] (Anderson, S., Niccolini, S., and D. Hogrefe, “SIPFIX: A Scheme For Distributed SIP Monitoring,” .). This solution clearly goes beyond what is the current chartered scope of SIPCLF WG but it is an useful reference with concrete examples on how the information elements that could be defined (excluding the information elements for the media plane that are out of scope for the SIPCLF WG).



 TOC 

5.  IPFIX file format for SIPCLF: an example visualized

This section is meant to give a short example of how an IPFIX file format would look like in the case of SIPCLF in order to increase understanding of what IPFIx would mean for SIPCLF. The SIPCLF fields used for defining the information elements (IEs) are meant to be just an example and are taken from available references of the SIPCLF WG [I‑D.gurbani‑sipclf‑problem‑statement] (Gurbani, V., Burger, E., Anjali, T., Abdelnur, H., and O. Festor, “The Common Log Format (CLF) for the Session Initiation Protocol (SIP),” January 2010.).

The information elements used by this example are shown in the table below; new Information Elements necessary for SIPCLF that are not yet registered with IANA have placeholder numbers ("AAA", "BBB", etc.)

NameNumTypeDescription
observationTimeSeconds 322 dateTimeSeconds Log timestamp in seconds since the UNIX epoch.
sourceIPv4Address 8 ipv4Address IPv4 address of the upstream client.
sourceIPv6Address 27 ipv6Address IPv6 address of the upstream client.
sipAuthUsername AAA string The user name by which the user has been authenticated.
sipMethod BBB unsigned8 (identifier) A code representing the SIP method, taken from the IPFIX sipMethod registry, to be created.
sipRequestURI CCC string The Request URI, including any parameters.
sipFromURI DDD string The From URI, including the tag.
sipToURI EEE string The To URI, including the tag.
sipCallId FFF string The To URI, including the tag.
sipResponseStatus GGG unsigned8 The SIP response code returned upstream
sipServerTransaction HHH string The transaction identifier associated with the server transaction.
sipClientTransaction JJJ string The transaction identifier associated with the server transaction.

[OPEN ISSUES WITH THIS IE SET: sip(Request|From|To)URI duplicate an enormous amount of information; is there value in parsing these out? For example, do they not always share a scheme? What about the host part; to what extent are these shared by To and Request, and how often are From and the sourceIPv[46]address the same? Do hosts need to be stored as names or will addresses suffice? What is a "tag" (From and To), and should it be separated; what structure does it have? What, if any, structure is there to sip(Server|Client)Transaction: is this always just an opaque ASCII string? Same question for sipCallId. Even if there are not "always" answers, we can leverage templates here to significantly increase the efficiency of common cases while still handling technically legal but weird ones. I'll start the box are bytefield drawing once these questions are answered; for now, here are tables describing the records. Here I'm specifying separate request and response records, but parsing these and tracking them based on transaction associations seems somewhat cumbersome. Is it possible to also provide a unified request/response template for the common case? Are sip(Server|Client)Transaction always necessary, or just for forking and for associating requests to responses; can we leave them off otherwise?]

A request record is then described by the following templates:

NameNumLenPresent?
observationTimeSeconds 322 4 always
sourceIPv4Address 8 4 v4 only
sourceIPv6Address 27 16 v6 only
sipMethod BBB 1 always
sipAuthUsername AAA variable if authenticated
sipRequestURI CCC variable always
sipFromURI DDD variable always
sipToURI EEE variable always
sipCallId FFF variable always
sipServerTransaction HHH variable always
sipClientTransaction JJJ variable always

and a response record by the following template:

NameNumLen
observationTimeSeconds 322 4
sipMethod BBB 1
sipResponseStatus GGG 1
sipServerTransaction HHH variable
sipClientTransaction JJJ variable
sipToURI EEE variable

A SIPCLF IPFIX file then consists of an IPFIX Message containing the template definitions above, followed by multiple IPFIX Messages containing the records in Data Sets corresponding to those template definitions, as in the figure below:



          +=================================================+
          | IPFIX Message                       seq. 0      |
          | +---------------------------------------------+ |
          | | Message Header (16 bytes)                   | |
          | +---------------------------------------------+ |
          | | Template Set (ID 2)                  5 recs | |
          | |   SIPCLF IPv4 request tmpl          ID 256  | |
          | |   SIPCLF IPv4 auth request tmpl     ID 257  | |
          | |   SIPCLF IPv6 request tmpl          ID 258  | |
          | |   SIPCLF IPv6 auth request tmpl     ID 259  | |
          | |   SIPCLF response tmpl              ID 260  | |
          | +---------------------------------------------+ |
          +=================================================+
          | IPFIX Message                       seq. 0      |
          | +---------------------------------------------+ |
          | | Message Header (16 bytes)                   | |
          | +---------------------------------------------+ |
          | | Data Set (ID 256)                    1 rec  | |
          | |  contains an unauthenticated IPv4 request   | |
          | +---------------------------------------------+ |
          | | Data Set (ID 260)                    1 rec  | |
          | |  contains a response                        | |
          | +---------------------------------------------+ |
          | | Data Set (ID 259)                    2 recs | |
          | |  contains authenticated IPv6 requests       | |
          | +---------------------------------------------+ |
          | | Data Set (ID 260)                    1 rec  | |
          | |  contains a response                        | |
          | +---------------------------------------------+ |
          +=================================================+
          | IPFIX Message                       seq. 5      |
          |                    . . .                        |
 Figure 1: File Example Structure 

Within each data set is one or more records, packed and encoded according to the template. Fixed length fields appear in place without additional framing or length prefixing. Variable length fields use length prefixing. Values 0-254 bytes long can be represented using a single-byte length prefix, 0x00 - 0xFE. Values 0-65516 bytes long can be represented using a three-byte length prefix, 0xFF followed by the length in network byte order. All SIPCLF variable-length fields are strings, and all IPFIX strings are encoded using UTF8.

[Note that the placement of data set and message boundaries are implementation dependent; some implementations may group and emit records by template, some may have only one record per data set or data set per message. If SIPCLF requires strict ordering of records by time, this can be specified, but is not mandatory IPFIX or IPFIX File behavior.]



 TOC 

6.  Summary

Quoted from Dave Harrington on the mailing list: “IPFIX already provides a protocol and a data modeling language. In addition, [RFC5655] (Trammel, B., Boschi, E., Mark, L., Zseby, T., and A. Wagner, “Specification of the IP Flow Information Export (IPFIX) File Format,” October 2009.) specifies a file format for storing data that has been received in the ipfix file format. The IPFIX File format is designed to facilitate interoperability and reusability among a wide variety of flow storage, processing, and analysis tools.”, let us ask the question differently: why would we have yet another data modeling language?



 TOC 

7.  Security Considerations

To be worked out.



 TOC 

8.  IANA Considerations

None.



 TOC 

9.  Acknowledgments

Cullen Jennings and many others provided insightful discussions, specific comments and much needed corrections.



 TOC 

10. Informative References

[I-D.gurbani-sipclf-problem-statement] Gurbani, V., Burger, E., Anjali, T., Abdelnur, H., and O. Festor, “The Common Log Format (CLF) for the Session Initiation Protocol (SIP),” draft-gurbani-sipclf-problem-statement-01 (work in progress), January 2010 (TXT).
[IM.sipfix] Anderson, S., Niccolini, S., and D. Hogrefe, “SIPFIX: A Scheme For Distributed SIP Monitoring,”  Proceedings of IEEE IM 2009, June 2009.
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, “SIP: Session Initiation Protocol,” RFC 3261, June 2002 (TXT).
[RFC3444] Pras, A. and J. Schoenwaelder, “On the Difference between Information Models and Data Models,” RFC 3444, January 2003 (TXT).
[RFC3954] Claise, B., “Cisco Systems NetFlow Services Export Version 9,” RFC 3954, October 2004 (TXT).
[RFC5101] Claise, B., “Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of IP Traffic Flow Information,” RFC 5101, January 2008 (TXT).
[RFC5102] Quittek, J., Bryant, S., Claise, B., Aitken, P., and J. Meyer, “Information Model for IP Flow Information Export,” RFC 5102, January 2008 (TXT).
[RFC5476] Claise, B., Johnson, A., and J. Quittek, “Packet Sampling (PSAMP) Protocol Specifications,” RFC 5476, March 2009 (TXT).
[RFC5655] Trammel, B., Boschi, E., Mark, L., Zseby, T., and A. Wagner, “Specification of the IP Flow Information Export (IPFIX) File Format,” RFC 5655, October 2009 (TXT).


 TOC 

Authors' Addresses

  Saverio Niccolini
  NEC Laboratories Europe, NEC Europe Ltd.
  Kurfuersten-Anlage 36
  Heidelberg 69115
  Germany
Phone:  +49 (0) 6221 4342 118
Email:  niccolini@nw.neclab.eu
URI:  http://www.nw.neclab.eu
  
  Benoit Claise
  Cisco Systems Inc.
  De Kleetlaan 6a b1
  Diegem, 1813
  Belgium
Phone:  +32 2 704 5622
Fax: 
Email:  bclaise@cisco.com
URI: 
  
  Brian Trammell
  Hitachi Europe
  c/o ETH Zurich
  Gloriastrasse 35
  8092 Zurich
  Switzerland
Phone:  +41 44 632 70 13
Email:  brian.trammell@hitachi-eu.com
  
  Hadriel Kaplan
  ACME Packet
  71 Third Ave.
  Burlington, MA 01803
  USA
Phone: 
Email:  hkaplan@acmepacket.com