INTERNET-DRAFT                                              Yixian Yang
Expires: December 2003                                          Ning An
                                                           Yonggang Chu
                               Beijing University of Posts and Telecom.
                                                              June 2003

   A Framework for Large-scale Distributed Intrusion Detection System
                                (LDIDS)
                    draft-yang-ldids-framework-00.txt

Status of This Memo

   This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html.

   The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

   Distribution of this memo is unlimited.

Abstract

   Intrusion Detection Systems (IDSs) are designed to detect intrusions and protect the associated networks or hosts. As networks grow larger, Large-scale Distributed Intrusion Detection Systems, which are IDSs that work in such environments, are becoming the trend in IDS evolution. This document describes a hierarchical framework for Large-scale Distributed Intrusion Detection Systems, with which a Large-scale Distributed IDS can be flexibly deployed. Each node in this framework can be seen as a simple IDS. This document gives a four-layer structure for the simple IDS. This four-layer structure can also be the structure of an independent IDS.

Yixian Yang, et al.        Expires December, 2003               [Page 1]
INTERNET-DRAFT            framework for LDIDS                 June, 2003

Table of Contents

   Status of This Memo
   Abstract
   1. Introduction
   2. Glossary
   3. Conceptual Model
      3.1 Entire Design
      3.2 Feature
      3.3 Organization
         3.3.1 Registration
         3.3.2 Security Conversation
         3.3.3 Conversation Control
         3.3.4 Invalidation
   4. Four-Layer Structure
      4.1 Basic Functional Modules
      4.2 Layered Structure
      4.3 Collection Layer
         4.3.1 The Importance of Data Collection
         4.3.2 Data Collection Mechanism
         4.3.3 Log
         4.3.4 Network Datagram
         4.3.5 Other Information
      4.4 Analysis Layer
         4.4.1 Analysis Techniques
         4.4.2 Analysis Process
      4.5 Fusion Layer
         4.5.1 Clustering
         4.5.2 Merging
         4.5.3 Correlation
      4.6 Harmonization and Management Layer
         4.6.1 Decision-Making
         4.6.2 Harmonization
         4.6.3 Response
         4.6.4 Administrator Console
         4.6.5 Mutual Interface
   5. Acknowledgements
   6. Informative References
   7. Authors' Addresses

1. Introduction

   This document describes a hierarchical framework for Large-scale Distributed Intrusion Detection Systems. Large-scale distributed intrusions have some unique features, such as a large affected area, high speed and huge data streams, so an IDS needs countermeasures designed for distributed intrusions. A framework is needed to facilitate large-scale distributed intrusion detection. This framework provides a mechanism through which several distributed IDSs can cooperate in detection.

   This document also presents a new four-layer structure for an IDS. An IDS built to this structure can be part of a large-scale distributed intrusion detection system or can run as an independent IDS. An IDS contains different functional modules, and the layered structure shows how the functional modules cooperate to detect intrusions. Although the forms of IDSs are not always uniform, their operating mechanisms fit the four-layer structure. Within this structure, the functional modules coordinate the tasks of distributed intrusion detection.

2. Glossary

   This document uses terminology that is defined in [DSARCH]. There is also current work-in-progress on this terminology in the IETF, and some of the definitions provided here are taken from that work. Some of the terms from these other references are defined again here in order to provide additional detail, along with some new terms specific to this document.

   IDS
      Intrusion Detection System. A security system that monitors computer systems and network traffic and analyzes that traffic for possible hostile attacks originating from outside the organization, and also for system misuse or attacks originating from inside the organization.

   Distributed System
      A system that operates in a distributed manner, such as a system that adopts distributed analysis methods.

   Distributed Intrusion
      An intrusion that takes several steps and involves a large number of hosts.

   Functional Module
      A basic building block of the conceptual IDS. Typical elements are Collection, Analysis, Clustering, Merging, Correlation, Decision-Making, Harmonization, Response, Administrator Console and Mutual Interface.

   Layer
      A combination of functions that is composed of one or more functional modules. The layers are Collection, Analysis, Fusion, and Harmonization and Management.

3. Conceptual Model

   A hierarchy is a natural way to address the issues that arise in large-scale environments. It features a hierarchical decomposition of the protected organization and its networks.

3.1 Entire Design

   The framework is based on a hierarchy that is set up according to the network topology. The hierarchy consists of leaf nodes, branch nodes and a root node. Leaf nodes monitor local network activities, branch nodes monitor the networks of their child nodes, and the root node monitors the activities of the whole network. In the simplest case, the hierarchy has only three levels; that is, there is only one level of branch nodes.

                      +----------+
                      |   Root   |
                      +----------+
                        |      |
                 +--------+  +--------+
                 | Middle |  | Middle |
                 +--------+  +--------+
                   |    |         |
          +--------+ +--------+ +--------+
          |  Leaf  | |  Leaf  | |  Leaf  |
          +--------+ +--------+ +--------+

              Figure 1: A Sketch Map of the Hierarchy

   Networks can be decomposed into several departments.
   Each department represents a security organization and its network. In each department there is an IDS that is responsible for the security of the local networks. The combination of a department and its local IDS is defined as a leaf node of the hierarchy.

   Leaf nodes collect the data in the local department, including logs, network datagrams and alerts sent by other security components. All the data is analyzed using appropriate analysis methods. The local IDS then produces alerts for intrusion activities and responds locally to some of these alerts. At the same time, the local IDS sends the alerts and data that cannot be analyzed locally to its parent node. These alerts and data may be correlated with those of other departments.

   A branch node may have one or more child nodes, which can be branch nodes or leaf nodes. In effect, a branch department is the aggregation of all its child departments. The IDS of a branch node receives alerts and datagrams from the IDSs of its child nodes and determines whether there are intrusions in the child departments.

   The root node need not remain the root. When the network grows, the depth of the hierarchy also increases, and the former root node becomes a branch node in the updated hierarchy.

   Each node in the hierarchy is a unit consisting of given networks and the corresponding IDS. The IDS of each node is independent. Therefore, any part of the hierarchy is also a complete distributed IDS.

3.2 Feature

   First, the framework can scale with the size of the networks. There are four types of potential changes:

   o  The number of hosts in some department increases. These newly added hosts must register themselves. If the local IDS accepts them, they join this department.
      However, when the local IDS cannot accept their registrations, a new IDS is required to create a new department. The new department needs to be registered, too.

   o  After a new department is created, it must register itself. Once registration has completed, the new department and its IDS become a new node in the hierarchy.

   o  If a node finds that one of its child nodes is inactive, it invalidates the inactive child node.

   o  When a new firewall or another security component is to join, it has to register itself with the IDS of the same level. This type of change does not alter the number of nodes in the hierarchy, but it changes the data sources of the related IDS.

   Because the framework accommodates all of the changes above, it can scale freely and adapts well to large-scale networks.

   Registration and invalidation are the framework's two security mechanisms. They distinguish legitimate nodes from illegitimate ones. Because registration prevents intruders from damaging the hierarchy, an intruder who uses a forged identity to deceive a parent node will not succeed.

   Second, the use of a hierarchy that follows the network topology facilitates large-scale distributed intrusion detection. The reason is that nodes at upper levels deal with less data: each node passes upward only the alerts and data it cannot handle locally, so the nodes responsible for larger networks process less data. This makes it a reasonable way to detect distributed intrusions in large-scale networks.

   Finally, the hierarchy reflects the actual network topology. In most cases, it would not have more than five levels. For this reason, the response speed of the framework is not affected, and a response from the root node can reach its final destination in time.

3.3 Organization

3.3.1 Registration

   Registration is the technique by which a new legitimate node joins the framework.
   Registration guarantees that the hierarchy stays in a relatively secure state. It not only prevents impostor nodes from entering the hierarchy, but also prevents illegitimate communications. Because registration does not have strict efficiency or urgency requirements, a CA (Certification Authority) is a suitable way to accomplish registration. A CA issues certificates to subscribers (CA clients) so that such certificates can be verified by users. Thus, there are three main entities that can be distinguished in certification procedures:

   o  CA: a general designation for any entity that controls the authentication services and the management of certificates. It is the cornerstone of a trust community. It issues and manages certificates of end-users, service providers, applications and appliances.

   o  Subscriber: an entity that supplies to the CA the information that is to be included in the entity's own certificate, signed by the CA.

   o  User: any entity that relies upon a certificate issued by a CA in order to obtain information on the subscriber.

   When a leaf node is to be certified, the leaf node acts as the user and its parent node acts as the subscriber. The CA simplex (one-way) authentication protocol is used.

   When a branch node is to be certified, the branch node acts as the user and its child nodes act as the subscribers. The CA duplex (two-way) authentication protocol is used, because the parent node does not trust its child nodes, whereas a child node always trusts its parent node. Of course, the branch node must also use the simplex protocol to gain the trust of its own parent node, as in the case of a leaf node.

3.3.2 Security Conversation

   When registration has completed, the node becomes a part of the hierarchy. The node should establish a secure and high-speed conversation with its parent node and child nodes.
   With this conversation, any node in the hierarchy can transmit data, alerts and responses. IDXP is one option for the communication mechanism.

   In the conversations, two adjacent nodes act as peers. The node that is the other's parent is called the parent IDXP peer. The other node is called the child IDXP peer. If a pair of IDXP peers attempts to start a communication, they must first initiate a BEEP conversation (session). In general, it is recommended that the child peer (client) initiate the conversation. Once the BEEP conversation is established, all exchanges occur in the context of a BEEP channel. When an IDXP channel is created between a pair of IDXP peers, the two peers can deliver data on the channel in their respective roles (server or client).

3.3.3 Conversation Control

   In IDMEF, Heartbeat messages are used to indicate analyzers' current status to managers. Analogously, in the hierarchy every node except the root node should send Heartbeat messages to its parent node. Heartbeat messages are intended to be sent at regular intervals, say every ten minutes or every hour. The receipt of a Heartbeat message from a child node indicates to the parent node that the child is up and running. Lack of a Heartbeat message (or more likely, lack of some number of consecutive Heartbeat messages) indicates that the child node or its network connection has failed.

   Heartbeat messages keep a dedicated IDXP channel open between the child peer and the parent peer. As a result, the BEEP conversation is maintained as long as the child node is working, which assures the continuity of the conversation.

3.3.4 Invalidation

   There is no special invalidation message in the hierarchy. If a parent node stops receiving the regular Heartbeats from a child node, it automatically closes the BEEP conversation to invalidate the child node.
   The Heartbeat period is configurable. The parent node notifies the administrator or writes a security log entry when invalidation occurs. Automatic invalidation adds security to the hierarchy. An invalidated node must register again when it wants to communicate with its parent node, and must set up a new BEEP conversation.

4. Four-Layer Structure

   Different network topologies and data sources require different configurations of IDSs, and even different types of IDSs. However, not all IDSs can communicate with each other, and not every network and its IDS can form a node that can join the hierarchy. Consequently, a structure is needed to standardize the IDSs that can be members of the hierarchy. While allowing for flexibility, it must be guaranteed that any IDS designed according to the structure can be a part of the hierarchy, regardless of the type of the IDS. In addition, it is worth noting that any IDS that conforms to this structure can independently detect distributed intrusions in the networks it monitors.

4.1 Basic Functional Modules

   An IDS should have the following functional modules: Sensor, Analyzer, Manager, Database and Responser. Figure 2 illustrates the major functional modules of an IDS.

                 +-------------+
        +--------|   Manager   |--------+
        |        +-------------+        |
   +---------+          |          +----------+
   | Database|          |          | Responser|
   +---------+   +-------------+   +----------+
                 |  Analyzer   |
                 +-------------+
                        |
                        |
                 +-------------+
                 |  Collector  |
                 +-------------+

          Figure 2: Main Functional Modules of an IDS

   However, a large-scale distributed IDS should comprise the following functional modules:

   o  Collection Module: A module that collects the data to be analyzed by the IDS. For each type of data, there is a corresponding type of collection module.
      These modules are the Log Collection Module, the Datagram Collection Module and the Other Information Collection Module. Each collection module transmits its data to the corresponding analysis module.

   o  Analysis Module: A module that analyzes the data from collection modules. There are three types of analysis modules: log analysis modules, datagram analysis modules and other information analysis modules. Different data should be analyzed differently. The first two kinds mainly filter data, sort data and produce elementary alerts. The last kind deals with alerts from other IDSs and security components; accordingly, its main job is to convert all alerts to a uniform format.

   In most IDSs, a sensor consists of collection modules and analysis modules. Functionally, however, the tasks performed by data collection and by analysis are quite different.

   o  Clustering Module: A module that builds alert clusters from elementary alerts. The module groups alerts that are similar and creates alert clusters. In addition, the module is responsible for sending to the Decision-Making Module the alerts and data that cannot be handled locally.

   o  Merging Module: A module that builds intermediate alerts from the alert clusters. Some of the intermediate alerts are sent to the Decision-Making Module; the others are the input of the Correlation Module.

   o  Correlation Module: A module that is able to identify local distributed intrusions. It fuses all of the local intermediate alerts and issues high-level alerts of local distributed intrusions. All the alerts created by the Correlation Module are sent to the Decision-Making Module.

   The Clustering Module, Merging Module and Correlation Module are the modules directly responsible for large-scale distributed intrusion detection. With these modules, the system is an IDS that is able to detect distributed intrusions.
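
   The flow through the Clustering, Merging and Correlation modules described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the alert fields, the similarity rule (same attack class and target within a time window) and the correlation rule (several distinct sources) are hypothetical and are not specified by this document.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Elementary alert in a uniform format (field names are hypothetical)."""
    source_ip: str
    target_ip: str
    attack_class: str
    timestamp: float


def similar(a: Alert, b: Alert, window: float = 60.0) -> bool:
    """Assumed similarity rule: same attack class and target, close in time."""
    return (a.attack_class == b.attack_class
            and a.target_ip == b.target_ip
            and abs(a.timestamp - b.timestamp) <= window)


def cluster(alerts: list[Alert]) -> list[list[Alert]]:
    """Clustering: put each alert into the first cluster it resembles."""
    clusters: list[list[Alert]] = []
    for alert in sorted(alerts, key=lambda x: x.timestamp):
        for members in clusters:
            if any(similar(alert, m) for m in members):
                members.append(alert)
                break
        else:
            # No similar cluster exists, so the alert starts a new one.
            clusters.append([alert])
    return clusters


def merge(members: list[Alert]) -> dict:
    """Merging: summarize one alert cluster as an intermediate alert."""
    return {
        "attack_class": members[0].attack_class,
        "target_ip": members[0].target_ip,
        "sources": sorted({m.source_ip for m in members}),
        "first_seen": min(m.timestamp for m in members),
        "last_seen": max(m.timestamp for m in members),
        "count": len(members),
    }


def correlate(intermediate: list[dict], min_sources: int = 2) -> list[dict]:
    """Correlation (assumed rule): keep alerts that involve several sources."""
    return [ia for ia in intermediate if len(ia["sources"]) >= min_sources]


# Example run with made-up documentation addresses.
example = [
    Alert("10.0.0.1", "192.0.2.10", "syn-flood", 0.0),
    Alert("10.0.0.2", "192.0.2.10", "syn-flood", 5.0),
    Alert("10.0.0.3", "192.0.2.20", "port-scan", 7.0),
]
high_level = correlate([merge(c) for c in cluster(example)])
```

   In this sketch each stage consumes only the output of the previous one, which mirrors the separation of the three modules; a real similarity rule would be considerably richer.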

   o  Decision-Making Module: A module that decides whether alerts, commands and data should be transmitted to the Response Module or to the Mutual Interface. Alerts come from the three modules above, data comes from the Clustering Module and commands come from the Mutual Interface.

   o  Harmonization Module: In a large-scale distributed intrusion detection system, the Harmonization Module mainly performs two tasks. One is to take charge of the security of the local IDS. If a local IDS adopts a mobile agent mechanism, the module assigns the agents and controls the agent platforms. The other is to deal with the issues related to the hierarchy, including registration, invalidation and the transmission of control messages.

   o  Response Module: A module that automatically responds to the alerts transmitted by the Decision-Making Module and raises an alarm on the Administrator Console when it cannot respond by itself.

   o  Administrator Console: A platform through which the administrator communicates with the IDS. With this platform, the administrator can manually configure the local IDS, upgrade the signature library, respond to some alerts, make final decisions, define a new type of agent and so on.

   o  Mutual Interface: The interface through which the local IDS communicates with other IDSs, firewalls and other security components, mainly for data and control information. In the hierarchy, the information passing through the interface is the data and alerts exchanged between parent nodes and child nodes.

   o  Data Storage Module: Intrusion signatures, intrusion events and other data (such as the logs of the IDS) are stored in this module. The data stored in this module is important for later analysis.

   An IDS consisting of all the modules above, together with the networks it monitors, can be a node of the hierarchy. In an actual IDS, a physical component may implement more than one of the functional modules.
   Conversely, a functional module may be implemented in several ways within an IDS.

4.2 Layered Structure

   The layered structure shows how the functional modules cooperate to detect intrusions. Although the forms of different IDSs are not always uniform, their operating mechanisms fit the Four-Layer Structure. Within this structure, the functional modules coordinate the tasks of detection.

                      +-----------+---------+------------+
                      |Admin      |Harmoni- |            |
 Harmonization and    |Console    |zation   |Mutual      |
 Management Layer     +-----------+---------+            |
                      |Response   |Decision-|Interface   |
                      |           |Making   |            |
                      +-----------+---------+------------+
                                      ^
                                      |
                            +---------+------------+
                            |         |            |
                      +-----------+---------+------------+
 Fusion Layer         |Correlation| Merging | Clustering |
                      +-----------+---------+------------+
                                      ^
                                      |
                            +---------+-----------+
                            |         |           |
                      +-----------+---------+------------+
 Analysis Layer       |Datagram   |Log      |Else        |
                      |Analysis   |Analysis |Analysis    |
                      +-----------+---------+------------+
                            ^         ^           ^
                            |         |           |
                      +-----------+---------+------------+
 Collection Layer     |Datagram   |Log      |Else data   |
                      +-----------+---------+------------+

                    Figure 3: Four-Layer Structure

   A layer in this structure is defined as a combination of functions that is composed of one or more functional modules. The modules in a layer work together to accomplish a specific task.

   The Collection Layer, which consists of different kinds of collection modules, collects the data for analysis and receives the alerts for fusion.

   The Analysis Layer filters the raw data, classifies the data and produces elementary alerts. Corresponding to the different kinds of collection modules, there are different analysis modules in the layer.

   The Fusion Layer, which is a purely distributed analysis layer, performs further analysis on elementary alerts and issues high-level alerts. The Clustering Module, Merging Module and Correlation Module all belong to this layer.

   The Harmonization and Management Layer is responsible for the management of the IDS; its main function is to accept and send out control and coordination messages. The layer is composed of the Decision-Making Module, Harmonization Module, Response Module, Administrator Console and Mutual Interface.

   In most IDSs, the sensor is composed of the Collection Layer and the Analysis Layer, and the Harmonization and Management Layer constitutes the Manager. The Fusion Layer is aimed at large-scale distributed intrusion detection. Consequently, a common IDS that consists only of the Collection Layer, Analysis Layer and Harmonization and Management Layer can also run properly.

   The data transmitted between the layers are, from bottom to top, raw data, elementary alerts and high-level alerts. The data stream is illustrated in Figure 4.

           +------------------------------------+
           | Harmonization and Management Layer |
           +------------------------------------+
                             ^
                             | High-level alerts
           +------------------------------------+
           |            Fusion Layer            |
           +------------------------------------+
                             ^
                             | Elementary alerts
           +------------------------------------+
           |           Analysis Layer           |
           +------------------------------------+
                             ^
                             | Raw data
           +------------------------------------+
           |          Collection Layer          |
           +------------------------------------+

         Figure 4: The Data Stream in the Four-Layer Structure

   The details of each layer are discussed in the following sections.

4.3 Collection Layer

   As illustrated in Figure 3, the Collection Layer contains the Log Collection Module, the Datagram Collection Module and the Other Information Collection Module. Leaf nodes collect mostly raw data and a small number of alerts from other components, while nodes at upper levels mainly collect alerts from child nodes, plus some raw data sent by child nodes. Accordingly, collection at leaf nodes focuses on raw data.
   Collection at other nodes differs from that at leaf nodes, since it focuses on alerts.

4.3.1 The Importance of Data Collection

   Data collection is the first job that an IDS must complete. If collection is delayed, detection loses its effectiveness. In the worst case, if the collected data is wrong, possibly because the intruder has tampered with it, the IDS will fail to detect some intrusions.

   The number of collection points in a large-scale distributed intrusion detection system is large, so the design of each collection point should be simple. If mobile agent techniques are introduced, the flexibility of mobile agents strengthens the data collection ability, and simply designed collection points can be set up quickly.

4.3.2 Data Collection Mechanism

   IDS data collection mechanisms can be classified in several ways, as follows:

   o  Centralized Collection and Distributed Collection

      In Centralized Collection, the data collection points are fixed, and so is their number. In Distributed Collection, the number of collection points scales with the size of the networks. In large-scale distributed systems, the latter is preferred.

   o  Direct Monitor and Indirect Monitor

      The former obtains data directly from the monitored objects. The latter obtains data from special processes or components. Direct Monitor is faster than Indirect Monitor, but it is much more complicated to implement.

   o  Host-based Collection and Network-based Collection

      A host-based IDS does not monitor network traffic; instead it monitors what is happening on the actual target machines. It does this by monitoring security event logs or checking for changes to the system, for example, changes to critical system files or to the system's registry. A network-based IDS monitors packets on the network and attempts to discover an intruder.
      A typical example is to look for a large number of TCP connection requests (SYN) to many different ports on a target machine, which reveals that someone is attempting a TCP port scan. A network intrusion detection system sniffs all traffic on the network.

   In the hierarchy, host-based IDSs collect logs, network-based IDSs collect packets on networks, and alerts arrive through the Mutual Interface.

   The two major sources of data audited in the surveyed systems are network data (typically data that is read directly from a multicast network such as Ethernet) and host-based security logs. The host-based logs may include operating system kernel logs, application program logs, network equipment (such as router and firewall) logs, etc.

4.3.3 Log

   Logs are important records of system operation, and they reflect the system's operating status. However, logs grow so quickly that administrators are at a loss when facing thousands of entries, and unprocessed logs become garbage that occupies disk space. Thus, IDSs have special modules that deal with logs.

   The first thing intruders want to do is to delete the traces of the intrusion. If an intruder obtains root privilege and modifies the system logs, the IDS would be unable to discover the intrusion. So an IDS must also monitor the log files themselves.

   Logs are a significant data source. There are three types of logs, namely system logs, security software logs and application logs.

4.3.4 Network Datagram

   Multi-host and network-based intrusions are the most common attack modes. As a result, the majority of the data analyzed by IDSs are datagrams.

4.3.5 Other Information

   IDSs are not the only security components running in the networks. There are also other IDSs, firewalls, honeypots and other security components. IDSs must work together with these components to make wiser decisions.
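
   The TCP port-scan example given in Section 4.3.2 (many SYN requests to different ports on one target) can be sketched as below. This is only an illustration under assumptions: the packet tuple layout, the function name detect_port_scans, the threshold of 20 distinct ports and the 10-second window are hypothetical values chosen for the example, not values defined by this document.

```python
from collections import defaultdict


def detect_port_scans(syn_packets, port_threshold=20, window=10.0):
    """Return (source, target) pairs that sent SYNs to many distinct ports.

    syn_packets: iterable of (timestamp, src_ip, dst_ip, dst_port) tuples.
    port_threshold and window are illustrative values (assumptions).
    """
    recent = defaultdict(list)   # (src, dst) -> [(timestamp, port), ...]
    flagged = set()
    for ts, src, dst, port in sorted(syn_packets):
        key = (src, dst)
        recent[key].append((ts, port))
        # Keep only the SYNs that fall inside the sliding time window.
        recent[key] = [(t, p) for t, p in recent[key] if ts - t <= window]
        # Count distinct destination ports seen inside the window.
        if len({p for _, p in recent[key]}) >= port_threshold:
            flagged.add(key)
    return flagged
```

   A fast scan of 25 ports within a few seconds would be flagged, while repeated connections to a single port, or a scan spread over a much longer period than the window, would not. A production detector would also need to bound memory use per source.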

4.4 Analysis Layer

   The layer is composed of different analysis modules. These modules further classify the data, create events and analyze the events to issue alerts.

4.4.1 Analysis Techniques

   The task of an IDS is to find the traces of an intruder in an enormous amount of raw data. Detection technology is the core of an IDS, and the detection ability depends on the analysis methods. According to the analysis method employed, IDSs can be divided into Anomaly Detection and Misuse Detection.

   o  Anomaly Detection: Anomaly intrusion refers to abnormal activity or abnormal use of system resources. Anomaly detection detects illegal activities by comparing the collected data with stored data describing normal behavior. This method of detection is still in the early stages of its development, but it has received much attention as a complement to misuse detection.

   o  Misuse Detection: Misuse intrusion refers to well-defined technical attacks on known system vulnerabilities. Misuse detection compares observed activities with known attack patterns to determine whether an intrusion is occurring. This method of intrusion detection is both simple to implement and accurate, and it is widely used.

4.4.2 Analysis Process

   In IDSs, data can be divided into events and alerts. Events are data that have not been analyzed yet, and alerts are data that have been analyzed and that may indicate potential intrusions. The job of the analysis layer is to extract significant events from raw data and to analyze the events to produce alerts of potential intrusions.

   Each analysis method fits a particular kind of data. However, every analysis method further classifies the data, creates events and analyzes the events to issue alerts.

   An analysis module is made up of a Classification Submodule, an Event Creating Submodule and an Alert Making Submodule.
   They behave as follows:

   o  Classification

      The first job is filtering, which orders the data by time and discards repeated data. The data is then classified according to the rules of the IDS and sent to the appropriate event creating submodules.

   o  Event Creating

      The data to be analyzed are called events. An event can be either packets on the network or logs of hosts. A complete event should have attributes such as event type, event source and event ID.

   o  Alert Making

      All of the events are sent to the analyzer for analysis. When the analyzer finds an intrusion, it raises an alert to report it.

4.5 Fusion Layer

   In large-scale networks, the detection of distributed intrusions should adopt a real-time fusion mechanism to issue alerts for large-scale environments. At the same time, fusion can improve the detection ability of a single IDS, for example by reducing false positives so that more precise alarms are sent to the administrator. Figure 5 illustrates a simple version of the real-time fusion mechanism in a large-scale distributed IDS.

                                   ^
                                   | High-level Alerts
      +----------------------------+-------------------------------+
      |                            |        Fusion Component       |
      |                     +------+----------+                    |
      |                     |   Correlation   |                    |
      |                     +-----------------+                    |
      |                         ^         ^                        |
      | Intermediate Alerts     |         |                        |
      |                         |         |                        |
      |                +--------+--+     ++----------+             |
      |                |  Merging  | ... |  Merging  |             |
      |                +-----------+     +-----------+             |
      |                  ^       ^                  ^              |
      | Alert Clusters   |       |                  |              |
      |                  |       |                  |              |
      |         +--------+---+  ++-----------+     ++-----------+  |
      |         | Clustering |  | Clustering | ... | Clustering |  |
      |         +------------+  +------------+     +------------+  |
      |           ^                ^                      ^        |
      |           |                |                      |        |
      +-----------+----------------+----------------------+--------+
 Elementary Alerts|                |                      |
          +-------+-+-------+------+--+-------+---------+-+
          |         |       |         |       |         |
        +---+     +---+   +---+     +---+   +----+   +------+
        |IDS| ... |IDS|   |IDS| ... |IDS|   | FW |   | Else |
        +---+     +---+   +---+     +---+   +----+   +------+

 Figure 5: The Real-time Fusion Mechanism in a Large-scale Distributed IDS

4.5.1 Clustering

   Clustering groups alerts that have some similarities into an aggregation. The aggregation is called an alert cluster, and it gives rise to an intermediate alert. Clustering modules deal with alerts that come from different IDSs and other security components. A clustering module detects and groups the alerts of the same intrusion from all sources. In the hierarchy, the alerts sent to the clustering module all come from analysis modules, which guarantees that arriving alerts are in standard formats that the clustering module can recognize.

   When a new alert arrives, it is stored in the buffered database. The Clustering Module then traverses the buffered database and looks up the alerts that are similar to the new one. The core problem is how to define the similarity relation; this can be addressed with the probabilistic method proposed by Valdes and Skinner.

4.5.2 Merging

   An alert cluster is merged into an intermediate alert. The purpose of merging is to create new alerts that contain the typical information of alert clusters. When a new element, called Alert_i, is inserted, there are two possibilities:

   1) If there is no alert cluster similar to Alert_i, an intermediate alert is created. The newly created intermediate alert contains only Alert_i.

   2) If Alert_i can be inserted into an existing alert cluster, the alert cluster is modified and the corresponding intermediate alert is modified accordingly.

   When a new elementary alert is added, one or more alert clusters may be modified. For example, the new alert may belong to two known alert clusters.
When the new one having been inserted, the two alert clusters should be incorporated respectively, and two new intermediate alerts should be created. These two intermediate alerts may have some similarities, so that it is feasible that the intermediate alerts belong to the same alert clusters. In this status, it is possible that several alert clusters would be incorporated into one cluster. 4.5.3 Corelation There are two corelation methods: o Dominant Corelation: When the security administrator could distinguish the relation between alert events, IDS would adopt visualized corelation. The relation could be the logistic links of the knowledge based on different alerts, or be the route according as the topology of the information systems. o Recessive Corelation: When data analysis could find out the mapping relationship, recessive Corelation is used. The method is based on the teams of alerts observed and the recessive relations between these teams. 4.6 Harmonization and Management Layer Manager is an important role in an IDS, which runs above the layers of detection and treats with all alerts. An excellent manager could not only present all the instances in the networks, but respond to the alerts, filter noise, distinguish mistake alerts and obtain the information that surpasses the detection layers. In this layer, there are Decision-Making Module, Harmonization Module, Response Module, Administrator Console and Mutual Interface. These modules cooperate with each other, constitute the managers of most IDSs. Yixian Yang, et al. Expires December, 2003 [Page 18] INTERNET-DRAFT framework for LDIDS June, 2003 4.6.1 Decision-Making Decision-making is in charge of classifying the local alerts and sending different kinds of alerts to relevant components. The alerts that could be resolved locally would be sent to Response Module. Contrarily, the events and the alerts that must be analyzed by the parent node should be sent to Mutual Interface. 
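As an illustration only, the routing rule of Section 4.6.1 could be sketched in Python as follows. The names Alert, Destination, route, and dispatch, and the resolvable_locally flag, are hypothetical and are not defined by this framework. How an alert is judged to be locally resolvable is left to the implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Destination(Enum):
    RESPONSE_MODULE = "response"   # alert is handled on this node
    MUTUAL_INTERFACE = "mutual"    # alert is forwarded to the parent node


@dataclass
class Alert:
    alert_id: str
    alert_type: str
    # Hypothetical flag: the draft does not specify how this is decided.
    resolvable_locally: bool


def route(alert: Alert) -> Destination:
    """Pick the component that should receive the alert."""
    if alert.resolvable_locally:
        return Destination.RESPONSE_MODULE
    return Destination.MUTUAL_INTERFACE


def dispatch(alerts):
    """Group alerts by destination, preserving arrival order."""
    queues = {dest: [] for dest in Destination}
    for alert in alerts:
        queues[route(alert)].append(alert)
    return queues
```

In this sketch, the queue for Destination.MUTUAL_INTERFACE would hold whatever the node passes up to its parent, while the queue for Destination.RESPONSE_MODULE stays on the local node.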
4.6.2 Harmonization

The Harmonization Module must accomplish two tasks:

o Handling the security of the local IDS itself. Different IDSs address their own security problems in different ways. The security mechanism in use determines the tasks that the Harmonization Module must fulfill.

o Handling the processes between parent nodes and child nodes in the hierarchy. These mainly include registration, invalidation, and control message transmission.

4.6.3 Response

There are two response techniques: passive and active.

Passive systems respond by notifying the proper authority, the administrator, or some other security component. They do not themselves try to mitigate the damage done, and they do not actively seek to harm or hamper the attacker. For example, a passive system might send an alarm to a nearby firewall, which then cuts off a connection controlled by the intruder.

Active systems may be further divided into two classes:

o Those that exercise control over the compromised system, that is, they modify the state of the compromised system to thwart or mitigate the effects of the attack. Such control could take the form of terminating network connections, increasing security logging, killing errant processes, and so on.

o Those that exercise control over the attacking system, that is, they attempt to remove the attacker's platform of operation. Since this is difficult to defend in court, there is not much interest in this approach outside military or law enforcement circles.

4.6.4 Administrator Console

The Administrator Console is the only platform for communication between the administrator and the IDSs. It must perform two main jobs:

o Letting the administrator manage and configure the IDSs. Specifically, the administrator can examine the status of an IDS, access online help, and create reports. Furthermore, the administrator can define detection rules, create new agents, and pause some functions.

o Notifying the administrator when the Response Module cannot respond to an alert. The administrator then takes action to solve the problem.

4.6.5 Mutual Interface

The Mutual Interface is the only external interface of an IDS. All communication with the outside passes through this interface. The data that cross it are registration information, alerts, raw data that child nodes send to their parents, and response messages that parent nodes send to their child nodes.

5. Acknowledgement

The authors wish to thank Huiqin Lv, Yafei Yang, Zhansong Wei, Jie Zhang, and Ming Tao for their detailed input.

6. Informative References

[1] Ed Gerck, Ph.D., "Overview of Certification Systems: X.509, PKIX, CA, PGP & SKIP", E. Gerck 1997-2000 and THE BELL, 2000.

[2] Stefan Axelsson, "Intrusion Detection Systems: A Survey and Taxonomy", Department of Computer Engineering, Chalmers University of Technology, Göteborg, Sweden.

[3] Kumar Das, "Protocol Anomaly Detection for Network-based Intrusion Detection", GSEC Practical Assignment Version 1.2f, August 13, 2001.

[4] B. Feinstein, "The Intrusion Detection Exchange Protocol (IDXP)", draft-ietf-idwg-beep-idxp-05, June 17, 2002.

7. Authors' Addresses

Yixian Yang
Information Security Center,
Beijing University of Posts and Telecom. (BUPT),
Beijing, China, 100876
Phone: 8610-62283366
Email: yxyang@bupt.edu.cn