Intrusion Detection Working Group                               D. Curry
draft-ietf-idwg-idmef-xml-00.txt                                     ISS
Expires: September 14, 2000                               March 15, 2000
Intrusion Detection Message Exchange Format
Extensible Markup Language (XML) Document Type Definition
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026 [1].
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Distribution of this memo is unlimited.
This Internet-Draft expires September 14, 2000.
1. Abstract
The purpose of the Intrusion Detection Message Exchange Format
(IDMEF) is to define data formats and exchange procedures for sharing
information of interest to intrusion detection and response systems,
and to the management systems which may need to interact with them.
The goals and requirements of the IDMEF are described in [2].
This Internet-Draft describes a proposed implementation of the data
format component of the IDMEF, using the Extensible Markup Language
(XML) [3] to represent the class hierarchy defined by Debar, Huang
and Donahoo [4]. The rationale for choosing XML is explained, a
Document Type Definition (DTD) is developed, and examples are
provided.
An earlier version of this implementation was reviewed, along with
other proposed implementations, by the IDWG at its September, 1999
and February, 2000 meetings. At the February meeting, it was decided
that the XML solution was best at fulfilling the IDWG requirements.
Internet-Draft IDMEF XML DTD March 15, 2000
TABLE OF CONTENTS
1. Abstract
2. Conventions used in this document
3. Introduction
3.1 The Extensible Markup Language
3.2 Rationale for Implementing IDMEF in XML
3.3 The Debar/Huang/Donahoo IDMEF Class Hierarchy
4. Use of XML in the IDMEF
4.1 The IDMEF Document Prolog
4.1.1 XML Declaration
4.1.2 XML Document Type Definition (DTD)
4.1.3 IDMEF DTD Formal Public Identifier
4.1.4 IDMEF DTD Document Type Declaration
4.2 Character Data Processing in XML and IDMEF
4.2.1 Character Entity References
4.2.2 Character Code References
4.2.3 White Space Processing
4.3 Languages in XML and IDMEF
4.4 Unrecognized Tags in IDMEF Messages
5. IDMEF Data Types
6. Structure of an IDMEF Message
6.1 The IDMEF-Message Root Element
6.2 The Message Type Elements
6.2.1 Alert
6.2.1.1 CorrelationAlert
6.2.1.2 OverflowAlert
6.2.1.3 ToolAlert
6.2.2 Heartbeat
6.2.3 Query
6.2.4 Response
6.3 Time Elements
6.3.1 Time
6.3.2 DetectTime
6.3.3 AnalyzerTime
6.4 High-Level Entity Identification Elements
6.4.1 Analyzer
6.4.2 Source
6.4.3 Target
6.5 Low-Level Entity Identification Elements
6.5.1 Address
6.5.2 Name
6.5.3 Node
6.5.4 Process
6.5.5 Service
6.5.5.1 SNMPService
6.5.5.2 WebService
6.5.6 User
6.6 Simple Elements
6.6.1 alertid
6.6.2 Arguments
6.6.2.1 arg
6.6.3 buffer
6.6.4 cgi
6.6.5 command
6.6.6 community
6.6.7 date
6.6.8 dport
6.6.9 Environment
6.6.9.1 env
6.6.10 gid
6.6.11 group
6.6.12 location
6.6.13 method
6.6.14 name
6.6.15 oid
6.6.16 path
6.6.17 pid
6.6.18 portlist
6.6.19 program
6.6.20 protocol
6.6.21 reaction
6.6.22 serial
6.6.23 signature
6.6.24 size
6.6.25 sport
6.6.26 time
6.6.27 url
6.6.28 uid
6.7 Providing Additional Information
6.7.1 AdditionalData
7. Examples
7.1 Denial of Service Attacks
7.1.1 The "teardrop" Attack
7.1.2 The "ping of death" Attack
7.2 Port Scanning Attacks
7.2.1 Connection to a Disallowed Service
7.2.2 Simple Port Scanning
7.3 Local Attacks
7.3.1 The "loadmodule" Attack
7.3.2 The "phf" Attack
7.4 System Policy Violation
7.5 Correlated Alerts
7.6 Query and Response
7.6.1 Passing Data by Reference
7.6.2 Manager-to-Analyzer Query
7.6.3 Analyzer-to-Manager Response
7.7 Heartbeat
8. Extending the IDMEF
8.1 Extending an Existing Attribute
8.2 Adding an Attribute
8.3 Adding an Element
9. The IDMEF Document Type Definition
10. Security Considerations
11. References
12. Acknowledgments
13. Author's Address
2. Conventions used in this document
The key words "MUST," "MUST NOT," "REQUIRED," "SHALL," "SHALL NOT,"
"SHOULD," "SHOULD NOT," "RECOMMENDED," "MAY," and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [5].
An "IDMEF application" is a program or program component that reads
and/or writes messages in the format specified by this memo.
An "IDMEF document" is a message that adheres to the requirements
specified by this memo, and that is exchanged by two or more IDMEF
applications. An "IDMEF message" is another term for an "IDMEF
document."
3. Introduction
The Intrusion Detection Message Exchange Format (IDMEF) [2] is
intended to be a standard data format that automated intrusion
detection systems can use to report alerts about events that they
have deemed suspicious. The development of this standard format will
enable interoperability among commercial, open source, and research
systems, allowing users to mix-and-match the deployment of these
systems according to their strong and weak points to obtain an
optimal implementation.
The most obvious place to implement the IDMEF is in the data channel
between an intrusion detection "analyzer" (or "sensor") and the
"manager" (or "console") to which it sends alarms. But there are
other places where the IDMEF can be useful:
+ a single database system that could store the results from a
variety of intrusion detection products would make it possible for
data analysis and reporting activities to be performed on "the
whole picture" instead of just a part of it;
+ an event correlation system that could accept alerts from a
variety of intrusion detection products would be capable of
performing more sophisticated cross-correlation and cross-
confirmation calculations than one that is limited to a single
product;
+ a graphical user interface that could display alerts from a
variety of intrusion detection products would enable the user to
monitor all of the products from a single screen, and require him
or her to learn only one interface, instead of several; and
+ a common data exchange format would make it easier for different
organizations (users, vendors, response teams, law enforcement) to
not only exchange data, but also communicate about it.
The diversity of uses for the IDMEF needs to be considered when
selecting its method of implementation.
3.1 The Extensible Markup Language
The Extensible Markup Language (XML) [3] is a simplified version of
the Standard Generalized Markup Language (SGML), a text markup syntax
defined by the ISO 8879 standard. XML is gaining widespread
attention as a language for representing and exchanging documents and
data on the Internet, and as the solution to most of the problems
inherent in HyperText Markup Language (HTML). XML was published as a
recommendation by the World Wide Web Consortium (W3C) on February 10,
1998.
XML is a metalanguage -- a language for describing other languages --
that enables an application to define its own markup. XML allows the
definition of customized markup languages for different types of
documents and different applications. This differs from HTML, in
which there is a fixed set of tags with preset meanings that must be
"adapted" for specialized uses. Both XML and HTML use tags
(identifiers delimited by '<' and '>') and attributes (of the form
"name='value'"). But where "<p>" always means "paragraph" in HTML,
it may mean "paragraph," "person," "price," or "platypus" in XML, or
it might have no meaning at all, depending on the particular
application.
The publication of XML was followed by the publication of a second
recommendation [6] by the World Wide Web Consortium, defining the use
of namespaces in XML documents. An XML namespace is a collection of
names, identified by a Uniform Resource Identifier (URI). It
allows documents of different types, that use tags with the same
names, to be merged with no confusion. In anticipation of the
widespread use of XML namespaces, this memo includes the definition
of the URI to be used to identify the IDMEF namespace.
XML applications that conform to the requirements set forth in this
memo and also make use of namespaces MUST NOT include other non-IDMEF
namespaces in an IDMEF document.
3.2 Rationale for Implementing IDMEF in XML
XML-based applications are being used or developed for a wide variety
of uses, including electronic data interchange in a variety of
fields, financial data interchange, electronic business cards,
calendar and scheduling, enterprise software distribution, web "push"
technology, and markup languages for chemistry, mathematics, music,
molecular dynamics, astronomy, book and periodical publishing, web
publishing, weather observations, real estate transactions, and many
others.
XML's flexibility makes it a good choice for these applications; that
same flexibility makes it a good choice for implementing the IDMEF as
well. Other, more specific reasons for choosing XML to implement the
IDMEF are:
+ XML allows a custom language to be developed specifically for the
purpose of describing intrusion detection alerts. It also defines
a standard way to extend this language, either for later revisions
of this document ("standard" extensions), or for vendor-specific
use ("non-standard" extensions).
+ Software tools for processing XML documents are widely available,
in both commercial and open source forms. A variety of tools and
APIs for parsing and/or validating XML are available in a variety
of languages, including Java, C, C++, Tcl, Perl, Python, and GNU
Emacs Lisp. Widespread access to tools will make adoption of the
IDMEF by product developers easier, and hopefully, faster.
+ XML meets IDMEF Requirement 5.1, that message formats support full
internationalization and localization. The XML standard specifies
support for both the UTF-8 and UTF-16 encodings of ISO 10646
(Unicode), making IDMEF compatible with both one- and two-byte
character sets. XML also provides support for specifying, on a
per-element basis, the language in which the element's content is
written, making IDMEF easy to adapt to "Natural Language Support"
versions of a product.
+ XML meets IDMEF Requirement 5.2, that message formats must support
filtering and aggregation. XML's integration with XSL, a style
language, allows messages to be combined, discarded, and
rearranged.
+ Ongoing XML development projects, in the W3C and elsewhere, will
provide object-oriented extensions, database support, and other
useful features. If implemented in XML, the IDMEF immediately
gains these features as well.
+ XML is free, with no license, no license fees, and no royalties.
3.3 The Debar/Huang/Donahoo IDMEF Class Hierarchy
Debar, Huang and Donahoo [4] have proposed that intrusion detection
alerts in the IDMEF be represented by a class hierarchy. This
representation has several advantages:
+ it is flexible, and capable of describing alerts to arbitrary
levels of complexity;
+ it is compact, and allows applications to specify only the data
they know; and
+ it is easy to extend, and will allow vendors to provide additional
information about particular alerts.
This implementation follows the Debar/Huang/Donahoo model almost
exactly, with the following exceptions and restrictions:
+ XML tags have the names given to the various classes in the model,
with a few minor exceptions where changes were made to deal with
XML scoping rules or to increase consistency with the rest of the
implementation.
+ XML does not support "inheritance;" tags may only be used at the
level at which they are declared. Subclasses are implemented by
making the tags for those classes subtags of the tags for the
parent classes.
+ Several extensions have been made, represented by the following
elements: , , , ,
, , and .
These changes make little difference in the overall usefulness of the
Debar/Huang/Donahoo model, or XML as an implementation language.
4. Use of XML in the IDMEF
This section describes how some of XML's features and requirements
will impact the IDMEF.
4.1 The IDMEF Document Prolog
The "prolog" of an IDMEF document, the part that precedes everything
else, consists of the XML declaration and the document type
declaration.
4.1.1 XML Declaration
Every XML document (and therefore every IDMEF document) starts with
an XML declaration. The XML declaration specifies the version of XML
being used; it may also specify the character set being used.
The XML declaration looks like:
<?xml version="1.0"?>
If a character encoding is specified, the declaration looks like:
<?xml version="1.0" encoding="charset"?>
where "charset" is the name of the character set in use (see section
4.2). If no encoding is specified, UTF-8 is assumed.
IDMEF documents being exchanged between IDMEF applications MUST begin
with an XML declaration, and MUST specify the XML version in use.
Specification of the encoding in use is RECOMMENDED.
IDMEF applications MAY choose to omit the XML declaration internally
to conserve space, adding it only when the message is sent to another
destination (e.g., a web browser). This practice is NOT RECOMMENDED
unless it can be accomplished without loss of each message's version
and encoding information.
4.1.2 XML Document Type Definition (DTD)
The Document Type Definition (DTD) specifies the exact syntax of an
XML document. It defines the various tags that may be used in the
document, how the tags are related to each other, which tags are
mandatory and which are optional, and so forth.
The IDMEF Document Type Definition is listed in its entirety in
section 9.
It is expected that IDMEF applications will not normally include the
IDMEF DTD itself in their communications. Instead, the DTD will be
referenced in the document type declaration in the document entity
(see below). Such IDMEF documents will be well-formed and valid as
defined in [3].
Other IDMEF documents will be specified that do not include the
document prolog (e.g., entries in an IDMEF-format database). Such
IDMEF documents will be well-formed but not valid.
Generally, well-formedness implies that a document has a single
element that contains everything else (e.g., "<Book>"), and that all
the other elements nest nicely within each other without any
overlapping (e.g., a "chapter" does not start in the middle of
another "chapter").
Validity further implies that not only is the document well-formed,
but it also follows specific rules (contained in the Document Type
Definition) about which elements are "legal" in the document, how
those elements nest within other elements, and so on (e.g., a
"chapter" does not begin in the middle of a "title"). A document
cannot be valid unless it references a DTD (see Section 4.1.4).
XML processors are required to be able to parse any well-formed
document, valid or not. The purpose of validation is to make the
processing of that document (what's done with the data after it's
parsed) easier. Without validation, a document may contain elements
in nonsense order, elements "invented" by the author that the
processing application doesn't understand, and so forth.
IDMEF documents MUST be well-formed. IDMEF documents SHOULD be valid
whenever both possible and practical.
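The well-formedness requirement can be checked with any conforming
XML parser. The following is a minimal sketch in Python, assuming the
standard library's xml.etree.ElementTree module, which checks
well-formedness only and does not validate against a DTD. The
function name "is_well_formed" and the sample documents are
assumptions made for illustration and are not part of this memo.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Properly nested elements inside a single root element.
nested = "<Book><chapter><title>One</title></chapter></Book>"
# The "chapter" element is closed while "title" is still open.
overlapping = "<Book><chapter><title>One</chapter></title></Book>"

print(is_well_formed(nested))
print(is_well_formed(overlapping))
```

An application that also needs validity would require a validating
parser; the sketch above deliberately stops at the MUST-level
requirement.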
4.1.3 IDMEF DTD Formal Public Identifier
The formal public identifier (FPI) for the Document Type Definition
described in this memo is:
"-//IETF//DTD RFCxxxx IDMEF v1.1//EN"
NOTE: The "RFCxxxx" text in the FPI value will be replaced
with the actual RFC number, if this memo is published
as an RFC.
This FPI MUST be used in the document type declaration within an XML
document referencing the DTD defined by this memo, as shown in the
following section.
4.1.4 IDMEF DTD Document Type Declaration
The document type declaration for an XML document referencing the DTD
defined by this memo will usually be specified in one of the
following ways:
In the first form, the last component of the document type
declaration is the formal public identifier (FPI) specified in the
previous section.
In the second form, the last component of the document type
declaration is a URL that points to a copy of the Document Type
Definition.
To be valid (see above), an XML document must contain a document type
declaration. However, this represents significant overhead to an
IDMEF application, both in the bandwidth it consumes and in the
requirements it places on the XML parser (not only to parse the
declaration itself, but also to parse the DTD it references).
Implementors MAY decide, therefore, to have analyzers and managers
agree out-of-band on the particular document type definition they
will be using (the standard one as defined here, or one with
extensions), and then omit the document type declaration from IDMEF
messages. Great care must be taken in doing this, however, as the
manager may have to accept messages from analyzers using DTDs with
different sets of extensions.
4.2 Character Data Processing in XML and IDMEF
The XML standard requires that XML processors support the UTF-8 and
UTF-16 encodings of ISO 10646 (Unicode), making XML compatible with
both one- and two-byte character sets. While many XML processing
applications may support other character sets, only UTF-8 and UTF-16
can be relied upon from a portability viewpoint.
A document's XML declaration (see section 4.1.1) specifies the
character encoding to be used in the document, as follows:
<?xml version="1.0" encoding="charset"?>
where "charset" is the name of the character set, as registered with
the Internet Assigned Numbers Authority (IANA); see [7].
Consistent with the XML standard, if no encoding is specified for an
IDMEF message, UTF-8 SHALL be assumed.
IDMEF applications SHOULD NOT use, and IDMEF messages SHOULD NOT be
encoded in, character sets other than UTF-8 and UTF-16. Note that
since ASCII is a subset of UTF-8, it MAY be used to encode IDMEF
messages.
Per the XML standard, IDMEF documents encoded in UTF-16 MUST begin
with the Byte Order Mark described by ISO/IEC 10646 Annex E and
Unicode Appendix B (the "ZERO WIDTH NO-BREAK SPACE" character,
#xFEFF).
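As an illustration of the encoding rules above, the following Python
sketch serializes a message body in UTF-8 or UTF-16. It relies on
Python's standard "utf-16" codec, which prepends a Byte Order Mark
when encoding. The function name "encode_message" and the sample body
are assumptions made for illustration, not part of this
specification.

```python
import codecs


def encode_message(body: str, encoding: str = "UTF-8") -> bytes:
    """Prefix the body with an XML declaration and encode the result."""
    if encoding.upper() not in ("UTF-8", "UTF-16"):
        raise ValueError("only UTF-8 and UTF-16 are portable")
    declaration = f'<?xml version="1.0" encoding="{encoding}"?>'
    # Python's "utf-16" codec emits a BOM (#xFEFF) in native byte order.
    return (declaration + body).encode(encoding)


utf8_bytes = encode_message("<IDMEF-Message/>")
utf16_bytes = encode_message("<IDMEF-Message/>", "UTF-16")
has_bom = utf16_bytes.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
```

Rejecting other character sets outright is stricter than the SHOULD
NOT above; an implementation might instead log a warning.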
4.2.1 Character Entity References
Within XML documents, certain characters have special meanings in
some contexts. To include the actual character itself in one of
these contexts, a special escape sequence, called an entity
reference, must be used.
The characters that sometimes need to be escaped, and their entity
references, are:
Character   Entity Reference
---------------------------------
&           &amp;
<           &lt;
>           &gt;
"           &quot;
'           &apos;
It is RECOMMENDED that IDMEF applications use the entity reference
form whenever writing these characters in data, to avoid any
possibility of misinterpretation.
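One way to follow this recommendation is sketched below in Python,
assuming the standard library's xml.sax.saxutils.escape function,
which handles '&', '<', and '>' by default and accepts a dictionary
of additional replacements. The helper name "escape_idmef_text" and
the sample string are assumptions made for illustration.

```python
from xml.sax.saxutils import escape

# escape() always handles &, < and >; the quote characters are added here.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def escape_idmef_text(data: str) -> str:
    """Replace the five special characters with their entity references."""
    return escape(data, QUOTE_ENTITIES)


escaped = escape_idmef_text('GET /cgi-bin/phf?a=1&b="<x>"')
```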
4.2.2 Character Code References
Any character defined by the ISO/IEC 10646 standard may be included
in an XML document by the use of a character reference. A character
reference is started with the characters '&' and '#', and ended with
the character ';'. Between these characters, the character code for
the character is inserted.
If the character code is preceded by an 'x', it is interpreted in
hexadecimal (base 16); otherwise, it is interpreted in decimal (base
10). For instance, the ampersand (&) is encoded as &#38; or &#x0026;
and the less-than sign (<) is encoded as &#60; or &#x003C;.
Any one- or two-byte character specified in the Unicode standard can
be included in a document using this technique.
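The construction of a character reference can be sketched in Python
as follows. The function name "char_reference" is an assumption made
for illustration; the sketch emits the hexadecimal form without
leading zeros, which denotes the same character as a zero-padded
form.

```python
def char_reference(char: str, hexadecimal: bool = False) -> str:
    """Return the decimal or hexadecimal character reference for one character."""
    code = ord(char)
    return f"&#x{code:X};" if hexadecimal else f"&#{code};"


amp_dec = char_reference("&")
amp_hex = char_reference("&", hexadecimal=True)
lt_dec = char_reference("<")
lt_hex = char_reference("<", hexadecimal=True)
```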
4.2.3 White Space Processing
XML preserves white space by default. The XML processor passes all
white space characters to the application unchanged. This differs
markedly from HTML (and SGML), in which, although the space/no space
distinction is meaningful, the one space/many spaces distinction is
not.
XML allows tags to identify the importance of white space in their
content by using the "xml:space" attribute:
<tag xml:space="action">
where "action" is either "default" or "preserve."
If "action" is "preserve," the application MUST treat all white space
in the tag's content as significant. If "action" is "default," the
application is free to do whatever it normally would with white space
in the tag's content.
The intent declared with the "xml:space" attribute is considered to
apply to all attributes and content of the element where it is
specified, unless overridden with an instance of "xml:space" on
another element within that content.
All IDMEF tags support the "xml:space" attribute.
4.3 Languages in XML and IDMEF
XML allows tags to identify the language their content is written in
by using the "xml:lang" attribute:
<tag xml:lang="langcode">
where "langcode" is a language tag as described in RFC 1766 [8].
The intent declared with the "xml:lang" attribute is considered to
apply to all attributes and content of the element where it is
specified, unless overridden with an instance of "xml:lang" on
another element within that content.
IDMEF applications SHOULD specify the language in which their
contents are encoded; in general this can be done by specifying the
"xml:lang" attribute for the top-level tag.
If no language is specified for an IDMEF message, English SHALL be
assumed.
All IDMEF tags support the "xml:lang" attribute.
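A receiving application might determine the language of a message as
sketched below in Python, assuming the standard library's
xml.etree.ElementTree module, which reports the reserved "xml:"
prefix under the namespace URI shown. The function name
"message_language" and the use of the RFC 1766 tag "en" to stand for
the English default are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the reserved "xml:" prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def message_language(document: str, default: str = "en") -> str:
    """Return the xml:lang value of the top-level tag, or the default."""
    root = ET.fromstring(document)
    return root.get(XML_LANG, default)


french = message_language('<IDMEF-Message xml:lang="fr"/>')
unspecified = message_language("<IDMEF-Message/>")
```

The sketch inspects only the top-level tag; it does not resolve
overrides on nested elements.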
4.4 Unrecognized Tags in IDMEF Messages
On occasion, an IDMEF application may receive a well-formed, or even
well-formed and valid, IDMEF message containing tags that it does not
understand. The tags may be either:
+ Recognized as "legitimate" (a valid document), but the application
does not know the semantic meaning of the tag's content; or
+ Not recognized at all.
IDMEF applications MUST continue to process IDMEF messages that
contain unknown tags, provided that such messages meet the
well-formedness requirement of section 4.1.2. It is up to the
individual application to decide how to process any content from the
unknown tag(s).
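The behavior required above can be sketched in Python as follows,
assuming the standard library's xml.etree.ElementTree module. The
function name "split_known_unknown", the "VendorExtension" tag, and
the choice to treat only the message type elements of section 6.2 as
known are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Message type elements listed in section 6.2; treating them as the
# complete set of known tags is an assumption of this sketch.
KNOWN_TAGS = {"Alert", "Heartbeat", "Query", "Response"}


def split_known_unknown(document: str):
    """Parse a message and separate known child tags from unknown ones."""
    root = ET.fromstring(document)  # raises ParseError if not well-formed
    known = [child.tag for child in root if child.tag in KNOWN_TAGS]
    unknown = [child.tag for child in root if child.tag not in KNOWN_TAGS]
    return known, unknown


known, unknown = split_known_unknown(
    "<IDMEF-Message><Alert/><VendorExtension/></IDMEF-Message>"
)
```

Processing continues for the known tags, while the unknown tags are
collected so that the application can decide what to do with them.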
5. IDMEF Data Types
XML is a typeless language; everything is simply a stream of bytes,
and it is left to the application to extract meaning from them.
That being said, this specification makes the following rules:
1. Integer data MUST be encoded in either Base 10 or Base 16. Base
10 encoding uses the digits '0' through '9' and an optional
negative ('-') or positive ('+') sign. Base 16 encoding uses the
digits '0' through '9' and 'a' through 'f' (or their upper case
equivalents), and is preceded by the characters "0x". For
example, the number one hundred twenty-three would be encoded as
"123" in Base 10, or "0x7b" in Base 16.
2. Floating-point (real) data MUST be encoded in Base 10. For
example, the number one hundred twenty-three and forty-five
one-hundredths would be encoded as "123.45".
3. Character and character string data does not require quoting, as
the IDMEF tags provide that functionality.
3. Dates MUST be encoded as a four-digit year, two-digit month, and
two-digit day, separated by forward slashes. The two-digit day
and its corresponding forward slash MAY be omitted to represent
an entire month. For example, March 13, 2000 would be encoded as
"2000/03/13", and December, 1999 would be encoded as "1999/12".
4. Time of day MUST be encoded as a two-digit hour, two-digit
minutes, and two-digit seconds, separated by colons. The
two-digit seconds and corresponding colon MAY be omitted to
represent times with less precision. The seconds field MAY be
followed by a decimal point and fractional number of seconds, if
more precision is needed. All times MUST be specified on a
24-hour clock. For example, 6:00 P.M. would be encoded as
"18:00", 3:15:27 A.M. would be encoded as "03:15:27", and
two-tenths of a second past midnight would be encoded as
"00:00:00.2".
5. Port lists, as used in the <portlist> element, are encoded as a
comma-separated list of numbers (individual integers) and ranges
(N-M means ports N through M, inclusive). Any combination of
numbers and ranges may be used in a single list.
6. The identification strings used with the "id" attribute of the
, , , , ,