INTERNET-DRAFT                                   Yixian Yang
Expires: December 2003                           Ning An
                                                 Yonggang Chu
                                                 Beijing University of
                                                 Posts and Telecom.
                                                 June 2003


                  A Framework for Large-scale Distributed
                      Intrusion Detection System (LDIDS)
                    draft-yang-ldids-framework-00.txt


Status of This Memo

   This document is an Internet Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its
   areas, and its working groups.  Note that other groups may also
   distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-
   Drafts as reference material or to cite them other than as
   "work in progress."   

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   Distribution of this memo is unlimited.

Abstract

   Intrusion Detection Systems (IDSs) are designed to detect
   intrusions and to protect the networks or hosts they monitor.  As
   networks grow ever larger, IDSs are evolving toward Large-scale
   Distributed Intrusion Detection Systems, that is, IDSs designed to
   work in such environments.

   This document describes a hierarchical framework for Large-scale
   Distributed Intrusion Detection Systems, with which a Large-scale
   Distributed IDS can be flexibly deployed.  Each node in this
   framework can be seen as a simple IDS.  This document defines a
   four-layer structure for that simple IDS.  The same four-layer
   structure can also serve as the structure of an independent IDS.


Yixian Yang, et al.         Expires December, 2003              [Page 1]

INTERNET-DRAFT               framework for LDIDS              June, 2003


Table of Contents  

  Status of This Memo .........................................   1

  Abstract ....................................................   1

  1.Introduction...............................................   3

  2.Glossary ..................................................   4
 
  3.Conceptual Model ..........................................   5
    3.1 Overview...............................................   5
    3.2 Feature ...............................................   6
    3.3 Organization ..........................................   7
     3.3.1 Registration .......................................   7
     3.3.2 Security Conversation ..............................   8
     3.3.3 Conversation Control ...............................   8
     3.3.4 Invalidation .......................................   9

  4. Four-Layer Structure .....................................   9
    4.1 Basic Functional Modules ..............................   9
    4.2 Layered Structure .....................................  11
    4.3 Collection Layer ......................................  13
     4.3.1 The Importance of Data Collection ..................  13
     4.3.2 Data Collection Mechanism...........................  14
     4.3.3 Log ................................................  15
     4.3.4 Network Datagram ...................................  15
     4.3.5 Other Information...................................  15
    4.4 Analysis Layer ........................................  15
     4.4.1 Analysis Technique .................................  15
     4.4.2 Analysis Process ...................................  16
    4.5 Fusion Layer ..........................................  16
     4.5.1 Congregation .......................................  17
     4.5.2 Coalition ..........................................  18
     4.5.3 Association ........................................  18
    4.6 Harmonization and Management Layer ....................  18
     4.6.1 Decision-Making ....................................  19
     4.6.2 Harmonization ......................................  19
     4.6.3 Response ...........................................  19
     4.6.4 Administrator Console ..............................  19
     4.6.5 Mutual Interface ...................................  20

  5. Acknowledgements..........................................  20

  6. Informative References....................................  20

  7. Authors' Addresses........................................  20


1. Introduction
  
   This document describes a hierarchical framework for Large-scale
   Distributed Intrusion Detection Systems.  Large-scale distributed
   intrusions have distinctive features, such as a wide area of
   effect, high speed and very large data streams.  An IDS therefore
   needs countermeasures designed for distributed intrusions, and a
   framework is needed to facilitate large-scale distributed
   intrusion detection.  The framework described here provides a
   mechanism through which several distributed IDSs can cooperate in
   detection.

   This document also presents a new four-layer structure for an IDS.
   An IDS built to this structure can be part of a large-scale
   distributed intrusion detection system or can run as an
   independent IDS.  An IDS contains different functional modules,
   and the layered structure shows how those modules cooperate to
   detect intrusions.  Although IDSs take different forms, their
   operation can be described in terms of the four-layer structure.
   Following the structure, the functional modules coordinate the
   tasks of distributed intrusion detection.


2. Glossary

   This document uses terminology that is defined in [DSARCH].  There
   is also current work-in-progress on this terminology in the IETF 
   and some of the definitions provided here are taken from that work.
   Some of the terms from these other references are defined again 
   here in order to provide additional detail, along with some new 
   terms specific to this document.
 
   IDS          Intrusion Detection System: a security system that
                monitors computer systems and network traffic and
                analyzes that traffic for possible hostile attacks
                originating from outside the organization and also for
                system misuse or attacks originating from inside the
                organization.

   Distributed  A system that operates in a distributed manner, such
   System       as a system that adopts distributed analysis
                methods.

   Distributed  An intrusion that takes several steps and involves a
   Intrusion    large number of hosts.

   Functional   A basic building block of the conceptual IDS.
   Module       Typical elements are Collection, Analysis,
                Congregation, Coalition, Association, Decision-Making,
                Harmonization, Response, Administrator Console and
                Mutual Interface.

   Layer        A combination of functions that is composed of one or
                more functional modules.  The layers are Collection,
                Analysis, Fusion, and Harmonization and Management.
                
                
3. Conceptual Model
  
   A hierarchy is a natural way to address the issues that arise in
   large-scale environments.  The framework is built on a
   hierarchical decomposition of the protected organization and its
   networks.

3.1 Overview

   The framework is based on a hierarchy that is set up according to
   the network topology.  The hierarchy consists of leaf nodes,
   branch nodes and a root node.  Leaf nodes monitor local network
   activity, branch nodes monitor the networks of their child nodes,
   and the root node monitors the activity of the whole network.

   In the simplest case, the hierarchy has only three levels; that
   is, there is only one level of branch nodes.


                             +----------+
                             |   Root   |
                             +----------+
                             |          |
                      +--------+       +--------+
                      | Branch |       | Branch |
                      +--------+       +--------+
                       |      |             |
               +--------+   +--------+    +--------+
               |  Leaf  |   |  Leaf  |    |  Leaf  |
               +--------+   +--------+    +--------+

              Figure 1:  A Sketch Map of the Hierarchy

   A network can be decomposed into several departments.  Each
   department represents a security organization and its network.
   Each department has an IDS that is responsible for the security
   of the local network.  The combination of a department and its
   local IDS is defined as a leaf node of the hierarchy.

   A leaf node collects data in its local department, including logs,
   network datagrams and alerts sent by other security components.
   All of this data is analyzed using appropriate analysis methods.
   The local IDS then produces alerts for intrusion activity and
   responds locally to some of these alerts.  At the same time, the
   local IDS sends the alerts and data that cannot be analyzed
   locally to its parent node.  The alerts and data sent to the
   parent node may correlate with those of other departments.
   
   
   A branch node may have one or more child nodes, which can be
   branch nodes or leaf nodes.  A branch department is in effect the
   aggregation of all its child departments.  The IDS of a branch
   node receives alerts and data from the IDSs of its child nodes
   and determines whether there are intrusions in the child
   departments.

   A root node does not necessarily remain the root.  As the network
   grows, the depth of the hierarchy may also increase.  In that
   case, the former root node becomes a branch node in the updated
   hierarchy.

   Each node in the hierarchy is a unit made up of a given network
   and its corresponding IDS.  The IDS of each node operates
   independently.  Therefore, any part of the hierarchy is itself a
   complete distributed IDS.
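
   The node roles described above can be sketched in Python.  This
   is a minimal illustration under stated assumptions, not part of
   the framework: the Node class, its role property and the
   escalate method are hypothetical names introduced here, and an
   alert is represented as a plain string.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """One node of the hierarchy: a department plus its local IDS."""
    name: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)
    # Alerts and data escalated to this node by its children.
    received: List[str] = field(default_factory=list)

    def add_child(self, child: "Node") -> None:
        """Attach a child node (a registered department) to this node."""
        child.parent = self
        self.children.append(child)

    @property
    def role(self) -> str:
        """Derive the role from the node's position in the tree."""
        if self.parent is None:
            return "root"
        return "branch" if self.children else "leaf"

    def escalate(self, item: str) -> None:
        """Send an alert or unanalyzed data item to the parent, if any."""
        if self.parent is not None:
            self.parent.received.append(f"{self.name}: {item}")


# Hypothetical three-level hierarchy, as in Figure 1.
root = Node("root")
branch = Node("branch-1")
leaf = Node("leaf-1")
root.add_child(branch)
branch.add_child(leaf)
leaf.escalate("unresolved alert")
```

   Because the role is derived from position rather than stored,
   attaching the current root under a new node turns it into a
   branch node without any other change, which mirrors the growth
   case described above.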


3.2 Feature
     
   First, the framework can scale with the size of the network.
   Four types of change can occur:

   o   The number of hosts in a department increases.  The newly
       added hosts must register themselves.  If the local IDS
       accepts them, they join the department.  If the local IDS
       cannot accept their registrations, a new IDS is required and
       a new department is created.  The new department must be
       registered, too.

   o   After a new department is created, it must register itself.
       Once registration has completed, the new department and its
       IDS become a new node in the hierarchy.

   o   If a node finds that one of its child nodes is inactive, it
       invalidates the inactive child node.

   o   When a new firewall or other security component joins, it
       must register itself with the IDS at the same level.  This
       type of change does not alter the number of nodes in the
       hierarchy, but it changes the data sources of the related
       IDS.

   The framework accommodates all of the changes above, so it can
   grow and shrink freely and adapts well to large-scale networks.

   Registration and invalidation are the two mechanisms that secure
   the framework.  They distinguish legitimate nodes from
   illegitimate ones.  Because registration prevents intruders from
   damaging the hierarchy, an intruder who uses a false identity to
   deceive a parent node will not succeed.
   
   
   Second, basing the hierarchy on the network topology facilitates
   large-scale distributed intrusion detection.  Because each node
   analyzes its data locally and forwards only alerts and the data
   it cannot analyze, nodes at upper levels receive less data.  In
   this way, the nodes that cover larger networks have less data to
   process, which makes the hierarchy a reasonable way to detect
   distributed intrusions in large-scale networks.

   Finally, the hierarchy reflects the actual network topology.  In
   most cases it does not have more than five levels.  For this
   reason, the response speed of the framework is not significantly
   affected, and a response from the root node can reach its final
   destination in a timely manner.

3.3  Organization 

3.3.1 Registration 
         
   Registration is the means by which a new legitimate node joins the
   framework.  Registration helps keep the hierarchy in a relatively
   secure state.  It not only prevents counterfeit nodes from
   entering the hierarchy, but also prevents unauthorized
   communication.

   Because registration does not have strict efficiency or urgency
   requirements, a CA (Certification Authority) is a suitable way to
   accomplish it.  A CA issues certificates to subscribers (CA
   clients) so that those certificates can be verified by users.
   Thus, three main entities can be distinguished in certification
   procedures:

   o   CA: a general designation for any entity that controls the
       authentication services and the management of certificates.
       It is the cornerstone of a trust community.  It issues and
       manages certificates for end users, service providers,
       applications and appliances.

   o   Subscriber: an entity that supplies to the CA the information
       that is to be included in the entity's own certificate, signed
       by the CA.

   o   User: any entity which relies upon a certificate issued by a CA
       in order to obtain information on the subscriber.

   When a leaf node is certified, the leaf node acts as the user and
   its parent node acts as the subscriber.  The CA simplex (one-way)
   authentication protocol is used.
   

   When a branch node is certified, the branch node acts as the user
   and its child nodes act as subscribers.  The CA duplex (two-way)
   authentication protocol is used, because a parent node does not
   trust its child nodes whereas a child node always trusts its
   parent node.  The branch node must also use the simplex
   authentication protocol to gain the trust of its own parent node,
   as in the case of a leaf node.
  
3.3.2 Security Conversation

   When registration has completed, the node becomes part of the
   hierarchy.  The node should establish a secure, high-speed
   conversation with its parent node and its child nodes.  Over
   these conversations, any node in the hierarchy can transmit data,
   alerts and responses.

   IDXP is one choice for the communication mechanism.  In a
   conversation, two adjacent nodes act as peers.  The node that is
   the other's parent is called the parent IDXP peer; the other node
   is called the child IDXP peer.  Before a pair of IDXP peers can
   communicate, they must first establish a BEEP conversation.  In
   general, it is recommended that the child peer (client) initiate
   the conversation.

   Once the BEEP conversation is established, all exchanges occur in
   the context of a BEEP channel.  When an IDXP channel is created
   between a pair of IDXP peers, the two peers can deliver data on
   the channel in their respective roles (server or client).

3.3.3 Conversation Control

   IDMEF defines Heartbeat messages, which analyzers use to indicate
   their current status to managers.  Analogously, every node in the
   hierarchy except the root node should send Heartbeat messages to
   its parent node.

   Heartbeat messages are intended to be sent at regular intervals,
   say every ten minutes or every hour.  The receipt of a Heartbeat
   message from a child node indicates to the parent node that the
   child is up and running.  Lack of a Heartbeat message (or more
   likely, lack of some number of consecutive Heartbeat messages)
   indicates that the child node or its network connection has
   failed.

   Heartbeat messages keep a dedicated IDXP channel open between the
   child peer and the parent peer.  As a result, the BEEP
   conversation is maintained as long as the child node is working
   properly, which ensures the continuity of the conversation.
   
   
3.3.4 Invalidation

   There is no special invalidation message in the hierarchy.  If a
   parent node does not receive the regular Heartbeats from a child
   node, it automatically closes the BEEP conversation to invalidate
   the child node.  The Heartbeat period is configurable.  The
   parent node notifies the administrator or writes a security log
   entry when invalidation occurs.

   Automatic invalidation adds security to the hierarchy.  A node
   that has been invalidated must register again when it wants to
   communicate with its parent node, and must set up a new BEEP
   conversation.
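
   The Heartbeat-driven invalidation described in Sections 3.3.3 and
   3.3.4 can be sketched as follows.  This is an illustration under
   stated assumptions rather than a defined mechanism: the
   HeartbeatMonitor class, the max_missed threshold of three
   consecutive periods and the use of plain timestamps in seconds
   are all hypothetical, and no real BEEP or IDXP library is
   involved.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Parent-side view of which child nodes are still valid."""
    period: float = 600.0   # expected Heartbeat period (ten minutes)
    max_missed: int = 3     # assumed number of tolerated missed periods
    last_seen: Dict[str, float] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def register(self, child: str, now: float) -> None:
        """Accept a (re-)registration and start tracking the child."""
        self.last_seen[child] = now

    def heartbeat(self, child: str, now: float) -> bool:
        """Record a Heartbeat; return False if the child is not registered."""
        if child not in self.last_seen:
            return False
        self.last_seen[child] = now
        return True

    def sweep(self, now: float) -> List[str]:
        """Invalidate every child silent for more than max_missed periods."""
        deadline = self.period * self.max_missed
        expired = [c for c, t in self.last_seen.items() if now - t > deadline]
        for child in expired:
            del self.last_seen[child]   # stands in for closing the conversation
            self.log.append(f"invalidated {child} at t={now}")
        return expired
```

   After a sweep removes a child, a later Heartbeat from that child
   is rejected until it registers again, which corresponds to the
   re-registration requirement stated above.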
   
4. Four-Layer Structure

   Different network topologies and data sources require different
   IDS configurations, and even different types of IDS.  However,
   not all IDSs can communicate with each other, and not every
   network and its IDS can form a node that can join the hierarchy.
   Consequently, a structure is needed to standardize the IDSs that
   can be members of the hierarchy.  While allowing for flexibility,
   the structure must guarantee that any IDS designed according to
   it can be part of the hierarchy, regardless of the type of IDS.
   In addition, it is worth noting that any IDS conforming to this
   structure can independently detect distributed intrusions in the
   networks it monitors.
   
4.1 Basic Functional Modules

   An IDS should have the following functional modules: Collector,
   Analyzer, Manager, Database and Responser.  Figure 2 illustrates
   the major functional modules of an IDS.

                                +-------------+
                       +--------|   Manager   |--------+
                       |        +-------------+        |
                  +---------+         |           +----------+
                  | Database|         |           | Responser|
                  +---------+   +-------------+   +----------+
                                |   Analyzer  |
                                +-------------+
                                      |
                                      |
                                +-------------+
                                |  Collector  |
                                +-------------+
                                
                  Figure 2:  Main functional modules of an IDS


   However, a large-scale distributed IDS should comprise the following
   functional modules:

   o   Collection Module: A module that collects the data to be
       analyzed by the IDS.  For each type of data there is a
       corresponding type of collection module: the Log Collection
       Module, the Datagram Collection Module and the Other
       Information Collection Module.  Each collection module
       transmits its data to the corresponding analysis module.

   o   Analysis Module: A module that analyzes the data from the
       collection modules.  There are three types of analysis
       module: log analysis modules, datagram analysis modules and
       other information analysis modules.

       Different data must be analyzed differently.  The first two
       kinds of analysis module mainly filter data, sort data and
       produce elementary alerts.  The last kind deals with alerts
       from other IDSs and security components, so analysis modules
       of this kind primarily convert all alerts to a uniform
       format.

       In most IDSs, a sensor consists of both collection modules
       and analysis modules.  In terms of functionality, however,
       the tasks performed by data collection and by analysis are
       quite different from one another.

   o   Clustering Module: A module that builds alert clusters from
       elementary alerts.  The module groups alerts that are
       similar in some respect and creates alert clusters.  It is
       also responsible for sending to the Decision-Making Module
       the alerts and data that cannot be handled locally.

   o   Merging Module: A module that produces intermediate alerts
       from the alert clusters.  Some of the intermediate alerts
       are sent to the Decision-Making Module; the others are input
       to the Correlation Module.

   o   Correlation Module: A module that is able to identify local
       distributed intrusions.  It fuses all of the local
       intermediate alerts and issues high-level alerts of local
       distributed intrusions.  All alerts created by the
       Correlation Module are sent to the Decision-Making Module.

       The Clustering Module, Merging Module and Correlation Module
       are the modules directly responsible for large-scale
       distributed intrusion detection.  With these modules, the
       system is an IDS that is able to detect distributed
       intrusions.
       
       
   o   Decision-Making Module: A module that decides whether
       alerts, commands and data should be transmitted to the
       Response Module or to the Mutual Interface.  Alerts come
       from the three modules above, data comes from the Clustering
       Module and commands come from the Mutual Interface.

   o   Harmonization Module: In a large-scale distributed intrusion
       detection system, the Harmonization Module mainly performs
       two tasks.  One is to take charge of the security of the
       local IDS.  If the local IDS adopts a mobile agent
       mechanism, the module assigns the agents and controls the
       agent platforms.  The other is to deal with matters
       concerning the hierarchy, including registration,
       invalidation and the transmission of control messages.

   o   Response Module: A module that automatically responds to the
       alerts transmitted by the Decision-Making Module and raises
       an alarm on the Administrator Console when it cannot respond
       by itself.

   o   Administrator Console: A platform through which the
       administrator communicates with the IDS.  With this
       platform, the administrator can manually configure the local
       IDS, upgrade the signature library, respond to some alerts,
       make final decisions, define a new type of agent and so on.

   o   Mutual Interface: The interface through which the local IDS
       communicates with other IDSs, firewalls and other security
       components, mainly to exchange data and control information.
       In the hierarchy, the information passing through the
       interface is the data and alerts exchanged between parent
       nodes and child nodes.

   o   Data Storage Module: Intrusion signatures, intrusion events
       and other data (such as the logs of the IDS) are stored in
       this module.  The stored data is important for later
       analysis.

   An IDS consisting of all the modules above, together with the
   networks it monitors, can be a node of the hierarchy.  In an
   actual IDS, one physical component may implement more than one of
   the functional modules, and one functional module may be
   implemented in several different ways.
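
   The division of work among the three fusion-related modules in
   the list above can be illustrated with a short Python sketch.
   The document does not specify the similarity measure or the
   correlation rule, so everything concrete here is an assumption
   made for illustration: alerts are grouped by (source, target),
   each group becomes one intermediate alert, and a source seen
   against at least min_targets hosts is reported as a suspected
   distributed intrusion.  The Alert and Intermediate types and the
   function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set, Tuple


class Alert(NamedTuple):
    source: str      # suspected attacker address
    target: str      # monitored host the alert refers to
    signature: str   # name of the matched pattern


class Intermediate(NamedTuple):
    source: str
    target: str
    signatures: Tuple[str, ...]
    count: int


def cluster(alerts: List[Alert]) -> Dict[Tuple[str, str], List[Alert]]:
    """Clustering: group elementary alerts sharing source and target."""
    clusters: Dict[Tuple[str, str], List[Alert]] = defaultdict(list)
    for alert in alerts:
        clusters[(alert.source, alert.target)].append(alert)
    return dict(clusters)


def merge(clusters: Dict[Tuple[str, str], List[Alert]]) -> List[Intermediate]:
    """Merging: reduce each alert cluster to one intermediate alert."""
    return [
        Intermediate(src, tgt,
                     tuple(sorted({a.signature for a in group})),
                     len(group))
        for (src, tgt), group in clusters.items()
    ]


def correlate(items: List[Intermediate], min_targets: int = 2) -> List[str]:
    """Correlation: report sources that appear against several targets."""
    targets: Dict[str, Set[str]] = defaultdict(set)
    for item in items:
        targets[item.source].add(item.target)
    return [
        f"distributed intrusion suspected from {src} "
        f"against {len(tgts)} hosts"
        for src, tgts in sorted(targets.items())
        if len(tgts) >= min_targets
    ]
```

   A real Clustering Module would likely use a richer notion of
   similarity (time windows, alert classes); the grouping key here
   was chosen only to keep the three stages easy to tell apart.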
   
4.2 Layered Structure

   The layered structure shows how the functional modules cooperate
   to detect intrusions.  Although different IDSs take different
   forms, their operation can be described in terms of the
   four-layer structure.  Following the structure, the functional
   modules coordinate the tasks of detection.
   
   
                            +-----------+---------+------------+
                            |Admin      |Harmoni- |            |
          Harmonization and |Console    |zation   |Mutual      |
          Management Layer  +-----------+---------+            | 
                            |Response   |Decision-|Interface   |
                            |           |Making   |            |
                            +-----------+---------+------------+
                                            ^
                                            |
                                  +---------+------------+
                                  |         |            |
                            +-----------+---------+------------+
             Fusion Layer   |Correlation| Merging | Clustering |
                            +-----------+---------+------------+
                                                        ^ 
                                                        |
                                  +---------+-----------+
                                  |         |           |
                            +-----------+---------+------------+
            Analysis Layer  |Datagram   |Log      |Other       |
                            |Analysis   |Analysis |Analysis    |
                            +-----------+---------+------------+
                                 ^           ^          ^
                                 |           |          |
                            +-----------+---------+------------+
          Collection Layer  |Datagram   |Log      |Other data  |
                            +-----------+---------+------------+
                            
                            Figure 3:  Four-Layer Structure
                            
   A layer in this structure is defined as a combination of
   functions that is composed of one or more functional modules.
   The modules in a layer work together to accomplish a specific
   task.

   The Collection Layer, which consists of the different kinds of
   collection modules, collects the data for analysis and receives
   the alerts for fusion.

   The Analysis Layer filters the raw data, classifies it and
   produces elementary alerts.  For each kind of collection module
   there is a corresponding analysis module in this layer.

   The Fusion Layer, which is a purely distributed analysis layer,
   performs further analysis on the elementary alerts and issues
   high-level alerts.  The Clustering Module, Merging Module and
   Correlation Module all belong to this layer.

   The Harmonization and Management Layer is responsible for the
   management of the IDS; its main function is to accept and send
   out control and coordination messages.  The layer is composed of
   the Decision-Making Module, Harmonization Module, Response
   Module, Administrator Console and Mutual Interface.


   In most IDSs, the sensor is composed of the Collection Layer and
   the Analysis Layer, and the Harmonization and Management Layer
   constitutes the manager.  The Fusion Layer is aimed specifically
   at large-scale distributed intrusion detection.  Consequently, a
   common IDS that consists only of the Collection Layer, Analysis
   Layer and Harmonization and Management Layer can also run
   properly.

   The data passed between the layers are, in order, raw data,
   elementary alerts and high-level alerts.  The data stream is
   illustrated in Figure 4.
                      
                     +------------------------------------+
                     | Harmonization and Management Layer |
                     +------------------------------------+
                                      ^
                                      | High-level alerts
                     +------------------------------------+
                     |            Fusion Layer            |
                     +------------------------------------+
                                       ^                  
                                       | Elementary alerts
                     +------------------------------------+
                     |           Analysis Layer           |
                     +------------------------------------+
                                       ^
                                       | Raw data
                     +------------------------------------+
                     |           Collection Layer         |
                     +------------------------------------+

               Figure 4:  The data stream in Four-Layer Structure
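
   The data stream of Figure 4, and the observation that an IDS
   without a Fusion Layer can still run, can be sketched as a chain
   of functions.  This is a toy illustration only: the function
   names, the representation of records and alerts as strings, and
   the rule that a record containing "failed" is suspicious are all
   assumptions introduced here.

```python
from typing import Callable, List, Optional

Layer = Callable[[List[str]], List[str]]


def collection_layer(sources: List[str]) -> List[str]:
    """Collection: hand raw records on unchanged (stand-in collector)."""
    return list(sources)


def analysis_layer(raw: List[str]) -> List[str]:
    """Analysis: filter raw data, one elementary alert per suspect record."""
    return [f"elementary:{record}" for record in raw if "failed" in record]


def fusion_layer(elementary: List[str]) -> List[str]:
    """Fusion: collapse elementary alerts into one high-level alert."""
    if not elementary:
        return []
    return [f"high-level:{len(elementary)} related alerts"]


def run(sources: List[str],
        fusion: Optional[Layer] = fusion_layer) -> List[str]:
    """Return what reaches the Harmonization and Management Layer."""
    alerts = analysis_layer(collection_layer(sources))
    if fusion is not None:      # a common IDS may omit the Fusion Layer
        alerts = fusion(alerts)
    return alerts
```

   Calling run with fusion=None models the common IDS described
   above, in which elementary alerts pass directly to the
   Harmonization and Management Layer.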
               
   The details of each layer are discussed in the following sections.
                 
4.3 Collection Layer

   As illustrated in Figure 3, the Collection Layer contains the Log
   Collection Module, the Datagram Collection Module and the Other
   Information Collection Module.  The data that leaf nodes collect
   consists mostly of raw data, with a minority of alerts from other
   components, whereas nodes at upper levels mainly collect alerts
   from child nodes, along with some raw data sent by child nodes.
   Accordingly, collection at leaf nodes places particular emphasis
   on raw data, and collection at other nodes differs from that at
   leaf nodes.

4.3.1 The Importance of Data Collection

   Data collection is the first task that an IDS must complete.  If
   collection is delayed, detection loses its effectiveness.  In
   the worst case, if the collected data contains errors, possibly
   because an intruder has tampered with it, the IDS will fail to
   detect some intrusions.


   A large-scale distributed intrusion detection system has a large
   number of collection points, so the design of each collection
   point should be simple.  If mobile agent techniques are
   introduced, the flexibility of mobile agents strengthens the data
   collection capability, and simply designed collection points can
   be set up quickly.
   
4.3.2 Data Collection Mechanism

   Collection mechanisms for IDSs can be classified in several ways,
   as follows:

   o   Centralized Collection and Distributed Collection   In
       Centralized Collection, both the location and the number of
       data collection points are fixed.  In Distributed Collection,
       the number of collection points grows with the scale of the
       network.  In large-scale distributed systems, the latter is
       preferred.

   o   Direct Monitor and Indirect Monitor   The former obtains data
       directly from the monitored objects.  The latter obtains data
       indirectly, from special processes or components.  Direct
       Monitor is faster than Indirect Monitor, but it is much more
       complicated to implement.

   o   Host-based Collection and Network-based Collection
       A host-based IDS does not monitor network traffic; instead, it
       monitors what is happening on the actual target machines.
       It does this by monitoring security event logs or checking for
       changes to the system, for example, changes to critical system
       files or to the system's registry.  A network-based IDS monitors
       packets on the network and attempts to discover an intruder.
       A typical example is to look for a large number of TCP
       connection requests (SYN) to many different ports on a target
       machine, and thus to discover whether someone is attempting a
       TCP port scan.  To do this, a network-based IDS sniffs all of
       the traffic on the network it watches.
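
   The port-scan example above can be sketched in a few lines.  This
   is only an illustration and not part of the framework: the function
   name detect_port_scans, the (source, target, port) tuple format and
   the threshold of 20 distinct ports are all assumptions chosen for
   the example.

```python
from collections import defaultdict

def detect_port_scans(syn_packets, port_threshold=20):
    """Return the (source, target) pairs that sent SYNs to many distinct ports.

    syn_packets: iterable of (src_ip, dst_ip, dst_port) tuples, one per
    observed TCP SYN.  port_threshold is a hypothetical tuning value.
    """
    ports_seen = defaultdict(set)
    for src, dst, port in syn_packets:
        ports_seen[(src, dst)].add(port)
    return sorted(pair for pair, ports in ports_seen.items()
                  if len(ports) >= port_threshold)

# One host probing 30 ports on a target, plus ordinary traffic to port 80.
packets = [("10.0.0.9", "10.0.0.1", p) for p in range(1, 31)]
packets += [("10.0.0.5", "10.0.0.1", 80)] * 50
print(detect_port_scans(packets))  # [('10.0.0.9', '10.0.0.1')]
```

   Counting distinct ports rather than packets keeps a busy but
   legitimate client, such as the one repeatedly contacting port 80
   here, from being reported.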

   In the hierarchy, host-based IDSs collect logs, network-based IDSs
   collect packets on networks, and alerts arrive through the Mutual
   Interface.  The two major sources of data audited in the surveyed
   systems are network data (typically data read directly from a
   multicast network such as Ethernet) and host-based security logs.
   The host-based logs may include operating system kernel logs,
   application program logs, logs from network equipment (such as
   routers and firewalls), etc.


4.3.3 Log

   Logs are important records of system operation and reflect the
   operating status of the system.  However, logs grow so quickly that
   administrators are at a loss when facing thousands of entries, and
   unprocessed logs become garbage that merely occupies disk space.
   Thus, IDSs have special modules that deal with logs.

   The first thing intruders want to do is to delete the traces of
   their intrusion.  If an intruder obtains root privilege and modifies
   the system logs, the IDS would be unable to discover the intrusion.
   So IDSs must include detection of log file tampering.
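
   One simple way to notice that a log has been modified is sketched
   here.  It is an illustration only and rests on assumptions that
   this document does not state: the log is append-only, and the
   snapshot is kept somewhere the intruder cannot reach (for example,
   on the parent node).  The function names snapshot and is_tampered
   are hypothetical.

```python
import hashlib

def snapshot(log_bytes):
    """Record the length and SHA-256 digest of the log as seen so far."""
    return len(log_bytes), hashlib.sha256(log_bytes).hexdigest()

def is_tampered(snap, current_bytes):
    """True if the log shrank or its previously recorded prefix changed."""
    length, digest = snap
    if len(current_bytes) < length:
        return True  # entries were removed
    return hashlib.sha256(current_bytes[:length]).hexdigest() != digest

log = b"login root ok\nsu attempt failed\n"
snap = snapshot(log)
appended = log + b"cron job started\n"      # normal growth
edited = log.replace(b"failed", b"ok    ")  # same length, altered entry
print(is_tampered(snap, appended), is_tampered(snap, edited))  # False True
```

   Under these assumptions, legitimate log rotation would also be
   reported, so a real module would need to take a fresh snapshot
   whenever the log is rotated.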

   Logs are a significant data source.  There are three types of
   logs, namely system logs, security software logs and application
   logs.

4.3.4 Network Datagram 

   Intrusions involving multiple hosts and network-based intrusions
   are the most common attack modes.  As a result, the majority of
   data analyzed by IDSs are datagrams.

4.3.5 Other Information

   An IDS is not the only security component running in a network.
   There are also other IDSs, firewalls, Honeypots and other security
   components.  An IDS must work together with these components to
   make wiser decisions.

4.4 Analysis Layer

   This layer is composed of different analysis modules.  These
   modules further classify the data, create events and analyze the
   events to issue alerts.

4.4.1 Analysis Techniques

   The task of an IDS is to find traces of intruders in enormous
   amounts of raw data.  Detection technology is the core of an IDS,
   and the detection ability depends on the analysis methods.
   According to the analysis method employed, IDSs can be divided into
   Anomaly Detection and Misuse Detection.

   o   Anomaly Detection:  Anomaly intrusion refers to abnormal
       activity or use related to system resources.  Anomaly detection
       detects such activities by comparing the collected data with
       stored data that describe normal activity.  This method of
       detection is still in the early stages of its development,
       but it has received much attention as a complement to misuse
       detection.


   o   Misuse Detection:  Misuse intrusion refers to well-defined
       technical attacks on known system vulnerabilities.  Misuse
       detection compares observed activities with known attack
       patterns to determine whether an intrusion is indeed occurring.
       This method is not only simple to implement but also accurate,
       and it is widely used.
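
   The difference between the two techniques can be illustrated with a
   small sketch.  It is not a method prescribed by this document: the
   signature list, the use of logins per hour as the monitored value
   and the three-standard-deviation rule are all assumptions made for
   the example.

```python
from statistics import mean, stdev

# Hypothetical signature list for misuse detection.
SIGNATURES = ["/etc/passwd", "cmd.exe", "' OR '1'='1"]

def misuse_detect(request):
    """Return the known attack patterns that appear in the request."""
    return [sig for sig in SIGNATURES if sig in request]

def anomaly_detect(baseline, observed, k=3.0):
    """Flag a value more than k standard deviations from the stored baseline."""
    mu, sigma = mean(baseline), stdev(baseline)
    return abs(observed - mu) > k * sigma

print(misuse_detect("GET /../../etc/passwd HTTP/1.0"))  # ['/etc/passwd']
logins_per_hour = [4, 6, 5, 7, 5, 6, 4, 5]
print(anomaly_detect(logins_per_hour, 6),
      anomaly_detect(logins_per_hour, 40))  # False True
```

   Misuse detection can only report what is already in its pattern
   list, while anomaly detection can report unknown attacks at the
   cost of depending on how well the stored baseline reflects normal
   use.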
       
4.4.2 Analysis Process

   In IDSs, data can be divided into events and alerts.  Events are
   data that have not been analyzed yet, and alerts are data that have
   been analyzed and that indicate potential intrusions.  The analysis
   layer extracts significant events from raw data and analyzes the
   events to produce alerts of potential intrusions.

   Each analysis method suits a particular kind of data.  However,
   every analysis method further classifies the data, creates events
   and analyzes the events to issue alerts.  An analysis module is
   therefore made up of a Classification Submodule, an Event Creating
   Submodule and an Alert Making Submodule, which behave as follows:

   o   Classification   The first job is filtering, which orders the
       data by time and discards repeated data.  The data are then
       classified according to the rules of the IDS and sent to the
       corresponding event creating submodules.

   o   Event Creating   The data to be analyzed are called events.
       An event can be either a packet on the network or a log entry
       from a host.  A complete event should have attributes such as
       event type, event source and event ID.

   o   Alert Making   All events are sent to the analyzer for
       analysis.  When the analyzer finds an intrusion, it issues an
       alert that describes the case.
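
   The three submodules can be pictured as a small pipeline.  The
   sketch that follows is illustrative only: the record layout (time,
   kind, source, data), the Event class and the is_intrusion callback
   are hypothetical names that this document does not define.

```python
from dataclasses import dataclass
from itertools import count

@dataclass
class Event:
    event_id: int
    event_type: str   # "packet" or "log"
    source: str
    payload: str

def classify(records):
    """Order records by time, drop exact repeats, and group them by kind."""
    groups, seen = {}, set()
    for rec in sorted(records, key=lambda r: r["time"]):
        key = (rec["time"], rec["kind"], rec["source"], rec["data"])
        if key not in seen:
            seen.add(key)
            groups.setdefault(rec["kind"], []).append(rec)
    return groups

def create_events(groups):
    """Wrap each classified record in an Event with type, source and ID."""
    ids = count(1)
    return [Event(next(ids), kind, r["source"], r["data"])
            for kind, recs in groups.items() for r in recs]

def make_alerts(events, is_intrusion):
    """Run the analyzer over every event and report the suspicious ones."""
    return [f"alert: event {e.event_id} ({e.event_type}) from {e.source}"
            for e in events if is_intrusion(e)]

raw = [
    {"time": 2, "kind": "log", "source": "hostA", "data": "su root failed"},
    {"time": 1, "kind": "packet", "source": "10.0.0.9", "data": "SYN 22"},
    {"time": 2, "kind": "log", "source": "hostA", "data": "su root failed"},
]
events = create_events(classify(raw))
print(make_alerts(events, lambda e: "failed" in e.payload))
# ['alert: event 2 (log) from hostA']
```

   The duplicated log record is discarded during classification, so
   only two events reach the analyzer.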

4.5 Fusion Layer 

   In large-scale networks, the detection of distributed intrusions
   should adopt a real-time fusion mechanism in order to issue alerts
   that cover the large-scale environment.  At the same time, fusion
   can improve the detection ability of a single IDS, for example by
   reducing false positives so that more precise alarms are sent to
   the administrator.  Figure 5 illustrates a simple version of the
   real-time fusion mechanism in a large-scale distributed IDS.


                                    ^
                                    | High-level Alerts
       +----------------------------+-------------------------------+
       |                            |              Fusion Component |
       |                     +------+----------+                    |
       |                     |   Correlation   |                    |
       |                     +-----------------+                    |
       |                      ^               ^                     |
       | Intermediate Alerts  |               |                     |
       |                      |               |                     |
       |             +--------+--+           ++----------+          |
       |             |  Merging  |    ...    |  Merging  |          |
       |             +-----------+           +-----------+          |
       |               ^       ^                     ^              |
       | Alert Clusters|       |                     |              |
       |               |       |                     |              |
       |      +--------+---+  ++-----------+        ++-----------+  |
       |      | Clustering |  | Clustering |  ...   | Clustering |  |
       |      +------------+  +------------+        +------------+  |
       |           ^                ^                      ^        |
       |           |                |                      |        |
       +-----------+----------------+----------------------+--------+
  Elementary Alerts|                |                       |
            +------+--+---------+---+----------+--------+---+---+
            |         |         |              |        |       | 
          +---+     +---+     +---+          +---+   +----+  +------+
          |IDS| ... |IDS|     |IDS|  ...     |IDS|   | FW |  | Else |
          +---+     +---+     +---+          +---+   +----+  +------+

               Figure 5: The Real-time Fusion Mechanism
                         in Large-scale Distributed IDS

4.5.1 Clustering

   Clustering groups alerts that share some similarities into an
   aggregation.  The aggregation is called an alert cluster, and each
   alert cluster gives rise to an intermediate alert.

   Clustering modules deal with alerts that come from different IDSs
   and other security components.  A clustering module detects and
   gathers the alerts that relate to the same intrusion, wherever they
   originate.  In the hierarchy, all alerts sent to the clustering
   module come from analysis modules, which guarantees that arriving
   alerts are in standard formats that the clustering module can
   recognize.

   When a new alert arrives, it is stored in the buffered database.
   The clustering module then traverses the buffered database and
   looks up the alerts that are similar to the new one.  The core
   problem is how to define the similarity relation, which can be
   addressed with the probabilistic approach proposed by Valdes and
   Skinner.
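
   The insertion step can be sketched as follows.  The similarity
   function used here is a deliberately simple stand-in (the fraction
   of matching features) and is not the probabilistic measure of
   Valdes and Skinner.  The feature names, the threshold of 0.6 and
   the function names are assumptions made for the example.

```python
def similarity(a, b, features=("attack", "source", "target")):
    """Fraction of the listed features on which two alerts agree (0.0 to 1.0)."""
    return sum(a[f] == b[f] for f in features) / len(features)

def insert_alert(buffer, clusters, alert, threshold=0.6):
    """Store the alert and return the indices of the clusters it joins.

    buffer plays the role of the buffered database.  An alert joins every
    cluster holding at least one sufficiently similar alert; if there is
    none, it starts a new cluster.
    """
    buffer.append(alert)
    joined = [i for i, members in enumerate(clusters)
              if any(similarity(alert, m) >= threshold for m in members)]
    if not joined:
        clusters.append([])
        joined = [len(clusters) - 1]
    for i in joined:
        clusters[i].append(alert)
    return joined

buffer, clusters = [], []
a1 = {"attack": "portscan", "source": "10.0.0.9", "target": "10.0.0.1"}
a2 = {"attack": "portscan", "source": "10.0.0.9", "target": "10.0.0.2"}
a3 = {"attack": "bruteforce", "source": "10.0.0.7", "target": "10.0.0.3"}
for alert in (a1, a2, a3):
    print(insert_alert(buffer, clusters, alert))  # [0], then [0], then [1]
```

   The first two alerts agree on two of three features and share a
   cluster, while the third matches nothing and starts its own.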


4.5.2 Merging

   An alert cluster is merged into an intermediate alert.  The purpose
   of merging is to create new alerts that contain the typical
   information of alert clusters.

   When a new alert, called Alerti, is inserted, there are two
   possibilities: 1) If no alert cluster is similar to Alerti, a new
   intermediate alert is created, which contains only Alerti.
   2) If Alerti can be inserted into an existing alert cluster, that
   alert cluster is modified and the corresponding intermediate alert
   is modified accordingly.

   When a new elementary alert is added, one or more alert clusters
   may be modified.  For example, suppose the new alert belongs to two
   known alert clusters.  After the new alert has been inserted, each
   of the two alert clusters is merged again, and two new intermediate
   alerts are created.  These two intermediate alerts may themselves
   be similar, so they may belong to the same alert cluster.  In this
   situation, several alert clusters may be merged into one cluster.
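
   A possible shape for the merging step is sketched here.  The fields
   of the intermediate alert and the function names are assumptions.
   For brevity, the sketch combines all clusters that the new alert
   joins straight away, instead of first creating one intermediate
   alert per cluster and then comparing those alerts.

```python
def merge(cluster):
    """Summarize an alert cluster as one intermediate alert."""
    return {
        "attacks": sorted({a["attack"] for a in cluster}),
        "sources": sorted({a["source"] for a in cluster}),
        "targets": sorted({a["target"] for a in cluster}),
        "count": len(cluster),
    }

def add_and_merge(clusters, alert, joined):
    """Insert alert into the clusters listed in joined and re-merge.

    joined holds the indices of the clusters that the clustering step found
    similar to the alert.  An empty list starts a new cluster; several
    indices are combined into a single cluster.
    """
    combined = [a for i in joined for a in clusters[i]] + [alert]
    remaining = [c for i, c in enumerate(clusters) if i not in joined]
    remaining.append(combined)
    return remaining, merge(combined)

clusters = [
    [{"attack": "portscan", "source": "10.0.0.9", "target": "10.0.0.1"}],
    [{"attack": "portscan", "source": "10.0.0.8", "target": "10.0.0.2"}],
]
alert_i = {"attack": "portscan", "source": "10.0.0.9", "target": "10.0.0.2"}
clusters, intermediate = add_and_merge(clusters, alert_i, joined=[0, 1])
print(len(clusters), intermediate["count"], intermediate["sources"])
# 1 3 ['10.0.0.8', '10.0.0.9']
```

   Here the new alert shares a source with the first cluster and a
   target with the second, so the two clusters end up as one cluster
   with a single intermediate alert.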

4.5.3 Correlation

   There are two correlation methods:

   o   Dominant (explicit) Correlation: When the security
       administrator can express the relation between alert events,
       the IDS uses dominant correlation.  The relation can be a
       logical link based on knowledge of the different alerts, or a
       topological link that follows the structure of the information
       system.

   o   Recessive (implicit) Correlation: When data analysis can
       reveal a mapping between alerts, recessive correlation is
       used.  The method is based on observed groups of alerts and
       the implicit relations between these groups.

4.6 Harmonization and Management Layer

   The manager plays an important role in an IDS.  It runs above the
   detection layers and handles all alerts.  A good manager not only
   presents everything that is happening in the network, but also
   responds to alerts, filters noise, identifies false alerts and
   obtains information beyond what the detection layers provide.

   This layer contains the Decision-Making Module, the Harmonization
   Module, the Response Module, the Administrator Console and the
   Mutual Interface.  These modules cooperate with each other and
   together constitute the manager of most IDSs.


4.6.1 Decision-Making 

   The Decision-Making Module is in charge of classifying local alerts
   and sending the different kinds of alerts to the relevant
   components.  Alerts that can be resolved locally are sent to the
   Response Module.  Events and alerts that must be analyzed by the
   parent node are instead sent to the Mutual Interface.
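
   The routing decision can be sketched as a single function.  This
   document does not say how an IDS decides that an alert can be
   resolved locally, so the rule used here (every host named in the
   alert is monitored by this node) is purely an assumption, as are
   the names route_alert and local_scope.

```python
def route_alert(alert, local_scope):
    """Decide where an alert goes: 'response' or 'mutual_interface'.

    An alert is treated as locally resolvable when every host it involves
    is in local_scope; this rule is an assumption made for the example.
    """
    if set(alert["hosts"]) <= set(local_scope):
        return "response"
    return "mutual_interface"

local = ["hostA", "hostB"]
print(route_alert({"id": 1, "hosts": ["hostA"]}, local))           # response
print(route_alert({"id": 2, "hosts": ["hostA", "hostZ"]}, local))  # mutual_interface
```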

4.6.2 Harmonization 

   The Harmonization Module must accomplish two tasks.

   o   Dealing with the security problems of the local IDS.  Different
       IDSs handle their own security problems in different ways.
       The security mechanism in use determines the tasks that the
       Harmonization Module must fulfill.

   o   Handling the interactions between parent nodes and child nodes
       in the hierarchy.  These mainly include registration,
       invalidation, control message transmission and so on.

4.6.3 Response

   There are two response techniques: passive and active.  Passive
   systems respond by notifying the proper authority, the
   administrator or some other security Components.  They do not
   themselves try to mitigate the damage done, or actively seek to
   harm or hamper the attacker.  For example, a passive system may
   raise an alarm to a nearby firewall, which can then cut off a
   connection that is controlled by the intruder.

   Active systems may be further divided into two classes:

   o   Those that exercise control over the compromised system, i.e.
       they modify the state of the compromised system to thwart or
       mitigate the effects of the attack.  Such control could take
       the form of terminating network connections, increasing
       security logging, killing errant processes, etc.

   o   Those that exercise control over the attacking system, i.e.
       they attempt to remove the attacker's platform of operation.
       Since this is difficult to defend in court, there is not much
       interest in this approach outside military or law enforcement
       circles.

4.6.4 Administrator Console 

   This is the only platform for communication between the
   administrator and the IDSs.  The Administrator Console has two
   main jobs.


   o   The administrator can manage and configure the IDSs.  In
       detail, the administrator can examine the status of an IDS,
       access the online help and create reports.  Furthermore, the
       administrator can define detection rules, create new agents
       and pause some functions.

   o   When the Response Module cannot respond to some alerts, the IDS
       notifies the administrator, who then takes action to solve
       these problems.

4.6.5 Mutual Interface

   This is the sole external interface of an IDS.  All communication
   with the outside passes through this interface.  The data that
   cross this interface are registration information, alerts and raw
   data that child nodes send to their parents, and response messages
   that parent nodes send to their child nodes.
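
   The kinds of data listed above can be pictured as a small set of
   message types with a direction check.  The names Kind, UPWARD and
   allowed are hypothetical, and treating registration information as
   flowing from child to parent is an assumption, since this document
   does not state its direction explicitly.

```python
from enum import Enum

class Kind(Enum):
    REGISTRATION = "registration"
    ALERT = "alert"
    RAW_DATA = "raw_data"
    RESPONSE = "response"

# Kinds assumed to travel from a child node to its parent.
UPWARD = {Kind.REGISTRATION, Kind.ALERT, Kind.RAW_DATA}

def allowed(kind, sender_is_child):
    """Check that a message kind travels in the direction described above."""
    return (kind in UPWARD) == sender_is_child

print(allowed(Kind.ALERT, sender_is_child=True),
      allowed(Kind.RESPONSE, sender_is_child=True))  # True False
```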

5. Acknowledgement

   The authors wish to thank Huiqin Lv, Yafei Yang, Zhansong Wei, Jie 
   Zhang, and Ming Tao, for their detailed inputs. 

6. Informative References

   [1]  Ed Gerck, Ph.D., "Overview of Certification Systems: X.509,
        PKIX, CA, PGP & SKIP", E. Gerck 1997-2000 and THE BELL, 2000.

   [2]  Stefan Axelsson, "Intrusion Detection Systems: A Survey and
        Taxonomy", Department of Computer Engineering, Chalmers
        University of Technology, Göteborg, Sweden.

   [3]  Kumar Das, "Protocol Anomaly Detection for Network-based
        Intrusion Detection", GSEC Practical Assignment Version 1.2f,
        August 13, 2001.

   [4]  B. Feinstein, "The Intrusion Detection Exchange Protocol
        (IDXP)", draft-ietf-idwg-beep-idxp-05, June 17, 2002.

7. Authors' Addresses

   Yixian Yang
   Information Security Center,
   Beijing University of Posts and Telecom. (BUPT),
   Beijing, China, 100876
   Phone: 8610-62283366
   Email: yxyang@bupt.edu.cn
   
   