Internet-Draft                                                  Wood, M.
Internet Engineering Task Force                Internet Security Systems
Intrusion Detection Exchange Format Working Group             June, 1999
Category: Informational

Intrusion Detection Exchange Format Requirements

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/lid-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

Distribution of this memo is unlimited.

This Internet Draft expires December 22, 1999.

1. Abstract

The purpose of the Intrusion Detection Exchange Format is to define data formats and exchange procedures for sharing information of interest to intrusion detection and response systems, and to the management systems which may need to interact with them. This Internet-Draft describes the high-level requirements for such communication, including the rationale for those requirements. Scenarios are used to illustrate the requirements.

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [1].

Wood Informational - December 22, 1999 1

3. Introduction

This document defines requirements for the Intrusion Detection Exchange Format (IDEF), which is the intended work product of the Intrusion Detection Exchange Format Working Group (IDWG). IDEF is planned to be a standard format which automated Intrusion Detection Systems can use for reporting events which they have deemed to be suspicious.

3.1 Rationale

The reasons such a format should be useful are as follows:

1) A number of commercial and free Intrusion Detection Systems (IDS) are available and more are coming onto the market all the time. Some products are aimed at detecting intrusions on the network, others are aimed at host operating systems, while still others are aimed at applications. Even within a given category, the products have very different strengths and weaknesses. Hence it is likely that customers will buy more than a single product, and that they will want to observe the output of these products from one or more consoles. A standard format for reporting events will simplify this task greatly.

2) Intrusions frequently involve multiple organizations as victims, or multiple sites within the same organization. Typically, those sites will use different ID systems. It would be very helpful to correlate such distributed intrusions across multiple sites and administrative domains. Having reports from all sites in a common format would facilitate this task.

3) The existence of a common format should allow components from different ID systems to be integrated more readily. ID research should be able to be integrated into commercial products more easily.

4) We feel that, in addition to enabling communication from an ID analyzer to an ID manager, the IDEF notification system may also enable communications between a variety of IDS components. However, for the remainder of this document, we refer to the communications as going from an analyzer to a manager.

All of these reasons suggest that a common format for reporting suspicious events should help the IDS market to grow and innovate more successfully, and should result in IDS users obtaining better results from deployment of ID systems.

3.2 Intrusion Detection Terms

In order to make the rest of the requirements clearer, we define some terms about typical intrusion detection systems.

3.2.1 Activity: Instantiations of the data source that are identified by the analyzer as being of interest to the operator. Examples of this include (but are not limited to) network sessions, user activity, and application events. Activity can range from extremely serious occurrences (such as an unequivocally malicious attack) to less serious occurrences (such as unusual user activity that's worth a further look).

3.2.2 Administrator: The human with responsibility for the day-to-day maintenance and management of organizational security. This individual may or may not be the same person charged with the deployment of the intrusion detection system and may or may not be the same person who is actually monitoring the output of the IDS. In some organizations, the administrator is associated with the network or systems administration groups. In other organizations, it's an independent position.

3.2.3 Alert: A message from an analyzer to a manager that an event has been detected. An alert typically contains information about the unusual activity that was detected, as well as the specifics of the occurrence.

3.2.4 Analyzer: The ID component that analyzes the data collected by the sensor for signs of unauthorized or undesired activity or for events that might be of interest to the security administrator. In many existing ID systems, the sensor and the analyzer are part of the same component. In this document, the term analyzer is used generically to refer to the sender of the IDEF message.

3.2.5 Data Source: The raw information that an intrusion detection system uses to detect unauthorized or undesired activity. Common data sources include (but are not limited to) raw network packets, operating system audit logs, application audit logs, and system-generated checksum data.

3.2.6 Event: The occurrence in the data source that is detected by the analyzer and which may result in an IDEF alert being transmitted. For example, three failed logins in 10 seconds might indicate a brute-force login attack.

3.2.7 Manager: The ID component from which the security administrator manages the various components of the ID system. Management functions typically include (but are not limited to) sensor configuration, analyzer configuration, event notification management, data consolidation, and reporting.

3.2.8 Notification: The method by which the IDS manager makes the operator aware of the event occurrence. In many ID systems, this is done via the display of a colored icon on the IDS manager screen, the transmission of an e-mail or pager message, or the transmission of an SNMP trap, although other notification techniques are also used.

3.2.9 Operator: The human who is the primary user of the IDS manager. The operator often monitors the output of the ID system and initiates or recommends further action.

3.2.10 Response: The actions taken in response to an event. Responses may be undertaken automatically by some entity in the ID system architecture or may be initiated by a human. Sending a notification to the operator is a very common response. Other responses include (but are not limited to) logging the activity, recording the raw data (from the data source) that characterized the event, terminating a network, user, or application session, or altering network or system access controls.

3.2.11 Sensor: The ID component that collects data from the data source. The frequency of data collection will vary across IDS offerings.

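As a non-normative illustration of the event example in 3.2.6, the following Python sketch shows one way an analyzer might decide that three failed logins within 10 seconds constitute an event. The class name, the method name, and the per-account bookkeeping are assumptions made for this illustration and are not part of IDEF; timestamps are assumed to be seconds supplied in non-decreasing order by the sensor.

```python
from collections import deque

WINDOW_SECONDS = 10  # window taken from the example in 3.2.6
THRESHOLD = 3        # failed logins within the window


class FailedLoginDetector:
    """Hypothetical analyzer rule: THRESHOLD failures within WINDOW_SECONDS."""

    def __init__(self, threshold=THRESHOLD, window=WINDOW_SECONDS):
        self.threshold = threshold
        self.window = window
        self.failures = {}  # account name -> deque of failure timestamps

    def record_failure(self, account, timestamp):
        """Record one failed login; return True if an event is detected."""
        times = self.failures.setdefault(account, deque())
        times.append(timestamp)
        # Discard failures that are older than the window.
        while times and timestamp - times[0] > self.window:
            times.popleft()
        return len(times) >= self.threshold
```

With the defaults, failures at 0, 4, and 9 seconds for the same account cause the third call to return True, at which point an analyzer could transmit an alert to the manager.
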
3.2.12 Signature: A rule used by the analyzer to identify interesting activity to the security administrator. Signatures represent one of the mechanisms (though not necessarily the only mechanism) by which ID systems detect intrusions.

3.2.13 Security Policy: The predefined, formally documented statement which defines what services are allowed to be transported across the monitored segment of the network to support the business requirement. This includes, but is not limited to, which hosts are to be denied external network access.

3.3 Architectural Assumptions

In this document, as defined in the terms above, we assume that an analyzer determines somehow that a suspicious event has been seen by a sensor, and sends an alert to a management console. The format of that alert is what IDEF proposes to standardize.

For the purposes of this document, we assume that the analyzer and management console are separate components, and that they are communicating pairwise across a TCP/IP network. No other form of communication between these entities is contemplated in this document, and no other use of IDEF alerts is considered.

We try to make no further architectural assumptions than those just stated. For example, the following points should not matter:

* Whether the sensor and the analyzer are integrated or separate.
* Whether the analyzer and management console are isolated, or embedded in some large hierarchy or distributed mesh of components.
* Whether the management console actually notifies a human, takes action automatically, or just analyzes incoming alerts and correlates them.
* Whether a component acts as an analyzer with respect to one component, but as a management console with respect to another.

3.4 Organization of this document

Besides this requirements document, the IDWG should produce two other documents. The first should describe a data format or language for exchanging information about suspicious events.
In this document, we refer to that as the "data-format specification". The second document should identify existing IETF protocols that are best used for conveying the data so formatted, and explain how to package this data in those existing formats. We refer to this as the "communication specification".

Accordingly, the requirements here are partitioned into five sections:

* The first of these contains general requirements that apply to all aspects of the IDEF specification.
* The second section describes requirements on the formatting of IDEF messages.
* The third section outlines requirements on the transport of IDEF messages from the analyzer to the manager.
* The fourth section contains requirements on the content and semantics of the IDEF messages.
* The final section places requirements on IDEF event definitions and the event definition process.

For each requirement, we attempt to state the requirement as clearly as possible without imposing an idea of what a design solution should be. Then we give the rationale for why this requirement is important, and state whether this should be an essential feature of the specification, or is beneficial but could be lacking if it is difficult to fulfill. Finally, where it seems necessary, we give an illustrative scenario.

4. General Requirements

4.1 The IDEF shall reference and use previously published RFCs where possible.

4.1.1 Rationale: The IETF has already completed a great deal of research and work into the areas of networks and security. In the interest of time, it makes sense to build on already defined and accepted standards.

4.1.2 Scenario:

4.2 The IDEF must operate in environments that contain IPv6 implementations.

4.2.1 Rationale: Since pure IPv4, hybrid IPv6/IPv4, and pure IPv6 environments are expected to exist within the timeframe of IDEF implementations, the IDEF specification must support IPv6 environments.

4.2.2 Scenario:

5. Message Format

5.1 IDEF message formats shall support full internationalization and localization.

5.1.1 Rationale: Since network security and intrusion detection are areas that cross geographic, political, and cultural boundaries, the IDEF messages must be formatted such that they can be presented to an operator in a local language and adhering to local presentation customs.

5.1.2 Scenario:

5.2 The format of IDEF messages must support filtering and/or aggregation of data by the manager.

5.2.1 Rationale: Since it is anticipated that some managers may want to perform filtering and/or data aggregation functions on IDEF messages, the IDEF messages must be structured to facilitate these operations.

5.2.2 Scenario:

6. Transport Requirements

6.1 The IDEF must support reliable transmission of messages.

6.1.1 Rationale: IDS managers often rely on receipt of data from IDS analyzers to do their jobs effectively. Since IDS managers will rely on IDEF messages for this purpose, it is important that IDEF messages be delivered reliably.

6.1.2 Scenario: The IDEF system might rely upon TCP reliability mechanisms or might design its own reliable protocol for use with UDP.

6.2 The IDEF must support transmission of messages from analyzers outside firewalls to managers inside firewalls without requiring changes to the firewall configuration that weaken the security of the perimeter. The IDEF design must also be relatively easy to implement.

6.2.1 Rationale: Since IDEF analyzers are often placed outside firewalls and since it is expected that IDEF managers will most often be located inside firewalls, it is necessary that analyzers be able to send IDEF messages through a firewall. Setting up this communication must not require changes to the configuration of the intervening firewall(s) that weaken the security of the protected network.

6.2.2 Scenario: One possible scenario is the use of TCP to convey IDEF messages.
If the destination ports of this communications channel are user-programmable, it may be possible for the IDEF system to use existing firewall tunnels without change to the firewall configuration.

6.3 The IDEF must support peer-to-peer authentication of the analyzer to the manager.

6.3.1 Rationale: Since the alert messages are used by a manager to direct responses or further investigation related to the security of an enterprise network, it is important that the receiver have confidence in the identity of the sender. This is peer-to-peer authentication of the sender to the receiver. It must not be based on authentication of the underlying transport, for example, because of the risk that this authentication process may be subverted or misconfigured.

6.3.2 Scenario: Analyzer process authenticates itself to manager process via public key exchange or some other method.

6.4 The IDEF must maintain confidentiality of the message content. The selected design must be capable of supporting a variety of encryption algorithms and must be adaptable to a wide variety of environments.

6.4.1 Rationale: IDEF messages potentially contain extremely sensitive information (such as passwords) and would be of great interest to an intruder. Since it is likely some of these messages will be transmitted across uncontrolled network segments, it is important that the content be shielded. Furthermore, since the legal environment for encryption technologies is extremely varied and changes often, it is important that the design selected be capable of supporting a number of different encryption options and be adaptable by the user to a variety of environments.

6.4.2 Scenario: The IDEF system might offer two different encryption modules, one using 168-bit keys and another using 56-bit keys.

6.5 The IDEF must ensure the integrity of the message content.
The selected design must be capable of supporting a variety of integrity algorithms and must be adaptable to a wide variety of environments.

6.5.1 Rationale: IDEF messages are used by the manager to direct action related to the security of the protected enterprise network. It is vital for the manager to be certain that the content of the message has not been changed after transmission.

6.5.2 Scenario: An integrity hash, such as the MD5 algorithm, might be part of the IDEF design.

6.6 The IDEF should ensure non-repudiation of the message transmission.

6.6.1 Rationale:

6.6.2 Scenario:

6.7 The IDEF communications mechanism should resist protocol denial of service attacks.

6.7.1 Rationale:

6.7.2 Scenario:

6.8 The IDEF communications mechanism should resist malicious duplication of messages.

6.8.1 Rationale:

6.8.2 Scenario:

7. Message Content

7.1 The IDEF message must encompass all types of intrusion detection mechanisms.

7.1.1 Rationale: There are many types of intrusion detection systems that analyze a variety of data sources. Some are profile based and operate on log files, attack signatures, etc. Others are anomaly based and define normal behavior and detect deviations from the established baseline. Each of these systems reports different data that, in part, depends on its intrusion detection methodology. All must be supported by this standard.

7.1.2 Scenario: An attacker invents a new attack. The profile-based system does not detect it. An anomaly-based system detects the novel attack but it cannot provide an attack type in an alert message.

7.2 The IDEF must support reporting event creation date and time in each event. The IDEF may support reporting the attack detection date and time in addition to the event creation date and time. Time shall be reported as the local time and time zone offset on the system generating the message. [See RFC 1902 for guidelines on reporting time.]
(This supports reporting and correlation across multiple time zones.) The format for reporting the date must be compliant with all current standards for Year 2000 rollover, and it must have sufficient capability to continue reporting date values past the year 2038. Time granularity in event messages shall not be specified by the IDEF.

7.2.1 Rationale: Time is important from both a reporting and correlation point of view. Attack detection time may differ from the event creation time as it may take some time to actually generate the event message given that an attack has been detected. If the sensing element can determine the time the attack occurred it is strongly encouraged to place that information in the attack detection field. The IDEF cannot assume a certain clock granularity on sensing elements, and so cannot impose any requirements on the granularity of the event timestamps.

7.2.2 Scenario:

7.3 The IDEF message must provide information about the automatic actions taken by the analyzer in response to the event (if any).

7.3.1 Rationale: It is very important for the operator to know if there was an automated response and what that response was. This will help determine what further action to take, if any.

7.3.2 Scenario: The attacker launches the attack, the ID system detects the attack and disables the user account performing the suspicious activity. This suspension is for 10 minutes to allow the operator time to investigate the suspicious activity. The IDEF message contains this information.

7.4 The IDEF message must contain an indication of the potential impact of the attack, if it is known.

7.4.1 Rationale: Information concerning the possible impact of the attack on the target system provides an indication of what the attacker is attempting to do and is critical data for the operator to perform damage assessment.
Not all systems will be able to determine this, but it is important data to transmit for those systems that can.

7.4.2 Scenario: A buffer overflow attack is launched and detected by the ID analyzer. The IDEF message may contain information that this buffer overflow attack is an attempt to gain root or administrator privilege on the target system. The ID operator may use this data to increase the priority of the response.

7.5 The IDEF message must contain the identity of the vendor and the tool that detected the attack.

7.5.1 Rationale: Users may run multiple intrusion detection systems to protect their enterprise. This data will help the systems administrator determine which vendor and tool detected the attack.

7.5.2 Scenario: Tool X from vendor Y detects a potential intrusion. A message is sent reporting that it found a potential break-in with X and Y specified. The operator is therefore able to include the known capabilities or weaknesses of tool X in the decision regarding further action.

7.6 The IDEF message must support a vendor extension mechanism used to define vendor specific data. The use of this mechanism by the vendor is optional. This data contains vendor specific information determined by each vendor. The vendors must indicate how to interpret these extensions.

7.6.1 Rationale: Vendors may wish to supply extra data such as the version number of their product or other data that they believe provides value added due to the specific nature of their product.

7.6.2 Scenario: The vendor passes back detailed information specific to their product after it detects a potential attack.

7.7 The IDEF message may include a list of sensors that sensed the event. This sensor list must be designated by a list of IP addresses.

7.7.1 Rationale: Intrusion detection sensors and analysis engines are often not co-located. It is important to know where the event was sensed.

7.7.2 Scenario: A distributed intrusion detection system has sensors placed throughout the enterprise. Detection of a sophisticated attack might actually involve passing of data through many analyzers, each of which performs a specific piece of analysis. The ultimate reporting of this event to the ID manager might include the IP addresses of all the analyzers involved in the detection. This information might be used by the manager to localize the attack to a specific portion of the network.

7.8 The content of IDEF messages must contain the identified name of the attack if it is known. This name will be drawn from a standardized list of attacks or will be a vendor-specific name if the attack identity has not yet been standardized. It is not known how this list will be defined or updated, although requirements on the creation of this list are presented in the next section of this document. In addition, the message must contain the attack technique if it is known. The attack technique will also be selected from a list of attack technique identifiers.

7.8.1 Rationale: Given that this document presents requirements on standardizing ID message formats so that an ID manager may receive alerts from analyzers from multiple vendors, it is important that the manager understand the semantics of the reported events. There is, therefore, a need to identify known attacks and store information concerning their method and possible fixes to these attacks. Some attacks are well known and this recognition can help the operator. The operator can also benefit from knowing the attack technique (e.g. flooding).

7.8.2 Scenario: Attacker launches an attack that is detected by two different analyzers from two distinct vendors. Both report the same event identity to the ID manager, even though the algorithms used to detect the attack by each analyzer may have been different.

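The naming rule in 7.8 can be sketched as a lookup with a vendor-specific fallback. This is a minimal Python illustration, not a proposed design: the table contents, the vendor and signature identifiers, and the function name are all hypothetical, and this document does not define how the standardized list is built or distributed.

```python
# Hypothetical standardized list, keyed by (vendor, vendor signature id).
STANDARD_NAMES = {
    ("vendor-a", "sig-1001"): "dns-buffer-overflow",
    ("vendor-b", "BO_DNS_7"): "dns-buffer-overflow",
}


def attack_name(vendor, signature_id, vendor_name):
    """Return (name, is_standard) for the attack reported in an alert.

    A standardized name is used when one exists; otherwise the
    vendor-specific name is used, prefixed with the vendor identity.
    """
    standard = STANDARD_NAMES.get((vendor, signature_id))
    if standard is not None:
        return standard, True
    return f"{vendor}:{vendor_name}", False
```

In the 7.8.2 scenario, both analyzers would resolve their own signatures to the same standardized name, so the manager can recognize the two alerts as reports of one attack even though the detection algorithms differ.
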
7.9 The IDEF message must contain the identity of the source of the attack and the target component identifier if they are known. In the case of a network-based attack, this will be the source and destination IP address of the session used to launch the attack. Note that the identity of the source and target will vary for other types of attacks, such as those launched/detected at the operating system or application level.

7.9.1 Rationale: This will allow the operator to identify the source and target of the attack.

7.9.2 Scenario: Attacker launches a network attack against a DNS server using a buffer overflow attack. The IDEF alert message indicates the DNS server as the target and includes the source IP address used to launch the attack.

7.10 The IDEF message must contain the ability to reference additional detailed data related to this specific underlying event. It is optional for vendors to use this field.

7.10.1 Rationale: Operators may want more information on specifics of the attack. This field, if filled in by the analyzer, may point to additional or more detailed information about the intrusion.

7.10.2 Scenario: Attacker attacks host and is detected by ID system. IDEF message contains a pointer to a set of records that gives access to system audit data.

7.11 The semantics of the IDEF message must be well defined.

7.11.1 Rationale: Good semantics are key to understanding what the message is trying to convey so there are no errors due to confusion over exactly what the message means. Operators will decide what action to take based on these messages, so it is important that they can interpret them correctly.

7.11.2 Scenario: Without this requirement, the operator receives an IDEF message and interprets it one way. The vendor who constructed the message intended it to have a different meaning from the operator's interpretation. The resulting corrective action is, therefore, incorrect.

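To make the content requirements in 7.2 through 7.10 concrete, the following Python sketch gathers them into a single record. It is an illustration under stated assumptions rather than a proposed data format: every field name is hypothetical, optional fields model the "if it is known" wording, and the addresses come from ranges reserved for documentation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


@dataclass
class Alert:
    """Hypothetical alert record; field names are not defined by IDEF."""
    created: datetime                        # 7.2 event creation time, with offset
    vendor: str                              # 7.5 vendor identity
    tool: str                                # 7.5 tool identity
    detected: Optional[datetime] = None      # 7.2 attack detection time (optional)
    actions_taken: List[str] = field(default_factory=list)           # 7.3
    impact: Optional[str] = None             # 7.4 potential impact, if known
    vendor_extensions: Dict[str, str] = field(default_factory=dict)  # 7.6
    sensors: List[str] = field(default_factory=list)  # 7.7 sensor IP addresses
    attack_name: Optional[str] = None        # 7.8 name, if known
    attack_technique: Optional[str] = None   # 7.8 technique, if known
    source: Optional[str] = None             # 7.9 source identity, if known
    target: Optional[str] = None             # 7.9 target identity, if known
    detail_reference: Optional[str] = None   # 7.10 pointer to detailed data

    def __post_init__(self):
        # 7.2 asks for local time plus time zone offset, so reject naive times.
        if self.created.utcoffset() is None:
            raise ValueError("created must carry a time zone offset")


def example_alert():
    """Build an alert resembling the 7.9.2 scenario (values are made up)."""
    offset = timezone(timedelta(hours=-5))  # assumed local offset of the analyzer
    return Alert(
        created=datetime(2040, 1, 1, 12, 0, 0, tzinfo=offset),  # a date past 2038
        vendor="Y",
        tool="X",
        attack_technique="buffer overflow",
        source="198.51.100.7",
        target="192.0.2.53",
        sensors=["192.0.2.10"],
    )
```

Calling example_alert().created.isoformat() yields 2040-01-01T12:00:00-05:00, a local time with its zone offset and a date beyond 2038. Fields the analyzer cannot fill stay at None or empty, which mirrors the optional wording of the requirements.
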
7.12 The IDEF message must contain a field for an advisory from a noted authority such as the CERT. The vendor must fill in this field if the information is available and applicable to the event being reported.

7.12.1 Rationale: This information is used by administrators to report and fix problems.

7.12.2 Scenario: Attacker performs a well-known attack. CERT advisory number is included in IDEF message since the vendor has access to a list of CERT advisory numbers. Operator uses this information to initiate repairs on the vulnerable system.

7.13 The IDEF message must contain a field for data on the degree of penetration achieved by the attack. The vendors have the option of specifying this data.

7.13.1 Rationale: This information is valuable to determine the extent of damage to the system. Sometimes the attack fails and other times the attack is very successful and causes significant damage. Note that this information is not always available at the time or place the IDEF message is constructed.

7.13.2 Scenario: Attempt to break into host system fails. This is reported in the IDEF message with degree of penetration equal to zero. The operator reduces the priority of this event, since it appears no damage was done.

7.14 The IDEF message must contain a field for data on the degree of confidence of the report. The completion of this field by an analyzer is optional, as this data may not be available at all analyzers.

7.14.1 Rationale: Many ID systems contain thresholds to determine whether or not to generate an alert. This may influence the degree of confidence one has in the report or perhaps would indicate the likelihood of the report being a false alarm.

7.14.2 Scenario: The alarm threshold monitor is set at a low level to indicate that an organization wants reports on any suspicious activity, regardless of the probability of a real attack.
The degree of confidence measure is used to indicate if this is a low probability or high probability attack.

8. Event Definitions and the Event-Definition Process

8.1 The IDEF must be extensible. As new events are defined by the community and as new methods of detecting them are offered by the industry, the IDEF must be able to grow with the technology.

8.1.1 Rationale:

8.1.2 Scenario:

8.2 The standard event definitions must be extensible by vendors and users.

8.2.1 Rationale:

8.2.2 Scenario:

8.3 The process by which new events are defined and standardized must be vendor independent.

8.3.1 Rationale:

8.3.2 Scenario:

9. References

[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

Acknowledgements:

The following individuals contributed substantially to this document and should be recognized for their efforts. This document would not exist without their help:

Mark Crosbie, Hewlett-Packard
David Donahoo, Air Force Information Warfare Center
Mike Erlinger, Harvey Mudd College
Dipankar Gupta, Hewlett-Packard
Stuart Staniford-Chen, Silicon Defense
Maureen Stillman, Odyssey Research Associates

Editor's Address:

Mark Wood
Internet Security Systems, Inc.
6600 Peachtree-Dunwoody Road
300 Embassy Row
Atlanta, GA 30328
Phone: +1 (678) 443-6147
E-mail: mark1@iss.net

Intrusion Detection Exchange Format Working Group:

The Intrusion Detection Exchange Format Working Group can be contacted via the working group's mailing list (idwg-public@zurich.ibm.com) or through its chairs:

Stuart Staniford-Chen
stuart@SiliconDefense.com
Silicon Defense

Mike Erlinger
mike@cs.hmc.edu
Harvey Mudd College

Full Copyright Statement

Copyright (C) The Internet Society (1999). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed.