Internet DRAFT - draft-yang-ips

draft-yang-ips






INTERNET-DRAFT                                   Yixian Yang
Expires: April 2006                              Jian Li
                                                 Xiuling Zhu
                        	                 Beijing University of
                                                 Posts and Telecom.
                                                 October 2005


            Key Technologies of Intrusion Prevention Systems(IPSs) 
                      draft-yang-ips-00.txt



Intellectual Property Rights (IPR) statement:
By submitting this Internet-Draft, each author represents that any 
applicable patent or other IPR claims of which he or she is aware 
have been or will be disclosed, and any of which he or she becomes 
aware will be disclosed, in accordance with Section 6 of BCP 79.

Status of this Memo
   By submitting this Internet-Draft, I certify that any applicable
   patent or other IPR claims of which I am aware have been disclosed,
   and any of which I become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.
   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (2005). All Rights Reserved.
    
Abstract

   Intrusion prevention systems(IPSs) are a kind of systems that can 
   keep away and interdict intrusions actively.  This document puts 
   forward some relevant solutions to the existing problems in IPSs at 
   present.  Zero-copy technology is adopted to reduce the system 
   expenses, in order to optimize communication performance; Load 
   balancing technology guarantees the flow distributed at each IPSs
   engine even; Protocol analysis technology improves detection 
   efficiency to a certain degree.  The structure of detection engine 
Yixian Yang, et al.       Expires March, 2006                   [Page 1]
INTERNET-DRAFT           key technology of IPSs          September, 2005

   designed can be either a node of Large-scale Distributed Intrusion 
   Prevention Systems or an independent IPS.   
   


Table of Contents

  Status of This Memo .........................................   1

  Abstract ....................................................   1

  1.Introduction...............................................   3

  2.Conventions Used in This Document .........................   3

  3.Key Technology of IPS......................................   4
    3.1 Technology of Reducing System Expenses.................   4
     3.1.1 Grouping Expenses...................................¡¡ 4
     3.1.2 Byte Expenses.......................................   4
    3.2 Load Balancing Technology..............................   6
     3.2.1 Load Balancing Algorithm............................   6
     3.2.2 Load Balancing Strategy.............................   7
    3.3 Detection Technologies ................................  13
     3.3.1 Simple Potocol Analysis.............................  13
     3.3.2 State Protocol Anylysis Based on Markov Chain ......  15
    3.4 Design of Detection Engine.............................  16

  4. Acknowledgements..........................................  19

  5. Informative References....................................  19

  6. Authors' Addresses........................................  19























Yixian Yang, et al.       Expires March, 2006                   [Page 2]
INTERNET-DRAFT           key technology of IPSs          September, 2005


1. Introduction
  
   Intrusion prevention systems(IPSs), which evolve from intrusion 
   detection systems(IDSs), are placed in series in network,which 
   may easily make them become the bottleneck of network.  Low 
   detection efficiency, high false positive and false negative rate  
   limit the development of this new kind of security production.  How   
   to improve the performance of IPSs has become a problem that needs  
   to be sovled urgently.
    
   In this document, sevral kinds of technologies are presented.  They   
   can solve the bottleneck problems of IPSs to a certain extent.   
   Zero-copy technology that can reduce the system expenses is adopted 
   to optimize communication performance; Load balancing technology 
   guarantees the load distributed at each IPS engine even, so that all 
   engines can fully take effect; By replacing traditional pattern 
   matching with protocol analysis technology,detection efficiency is 
   improved to a certain extent.  Based on these technology, a structure  
   of detection engine is designed that can be either a node of 
   Large-scale Distributed Intrusion Prevention Systems or an 
   independent IPS.

2. Conventions Used in This Document

   IPS   The abbreviation of the term "Intrusion Prevention System" .
   
   EE    The abbreviation of "Enlisting Engine" used in load balancing 
         technology which means the detection engine that enlists for
         datagram from other engines.
   
   BE    The abbreviation of "Bidding Engine" used in load balancing
         technology which means the detection engine that bids to the 
         load table server.
       
   SPA   The abbreviation of "Simple Protocol Analysis", a detection 
         method used in this document.
       
   SPABM The abbreviation of"State Protocol Analysis Based on Markov 
         Chain", which is another method adopted in this document.
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
Yixian Yang, et al.       Expires March, 2006                   [Page 3]
INTERNET-DRAFT           key technology of IPSs          September, 2005
 

   
3. Key Technology of IPS   
   
   IPS is composed of detection engines and management platform.  It  
   works in series in the network.  The detection engine is disposed  
   behind the firewall and detects the attacks through analyzing the  
   datagram of network.  Once it discovers any intrusion, it will notice 
   the management platform.  The management platform filters the attack  
   flow with the alert-fusion technology etc.  and convey the final  
   result to the administrators.  The administrators can revise the  
   disposition of the management platform dynamically.
   
3.1 Technology of Reducing System Expenses

   With the realization of Gigabit Ethernet, the speed of network can
   reach giga or even higher, which greatly reduces the delay in network 
   transmission. Compared with the time spent in the sending terminal and 
   receiving terminal, the time when the data are transmitted in the 
   network can even be ignored.  So, in communication system based on  
   Gigabit Ethernet, the system expenses in the sending terminal and  
   the receiving terminal becomes a factor that influences  
   communication¡¡performance a lot.
   
3.1.1 Grouping Expenses
   
   Grouping expenses takes place during the assignment, release of the 
   system buffer and the implementation of TCP/IP protocol, the disposal  
   of equipment interrupts also causes grouping expenses.  For example, 
   at the sending terminal, when application programs send data to the    
   receiving terminal, OS distributes system buffer at first, and  
   preserves the data transmission.Then TCP/IP protocol is carried out  
   and the data are divided into TCP/IP groups.  After finishing    
   sending one group to the network interface, the network driver will 
   send equipment interrupt to notice OS. Then OS deals with the   
   interrupt and releases the system buffer.
   
   Interrupt coalescence: Generally, NIC(Network Internetface Card) will
   send one interrupt to OS whenever it finishes sending or receiving
   a group, which consumes system resources a lot.  In this document,
   some of these interrupts are delayed, and the NIC send one interrupt
   to interrupt controller after several groups' arrival.  Thus the 
   total number of interrupts becomes smaller.  As a result, grouping 
   expenses is reduced. 
   
3.1.2 Byte Expenses

   Byte expenses mainly comes from two aspects: transmitting and copying 
   data in the system and calculating the checksum.
   
   (1) In communication systems, data transmission usually adopts 
       DMA (Direct Memory Access ).  A typical data route is as follows:
              
       
                     
Yixian Yang, et al.       Expires March, 2006                   [Page 4]
INTERNET-DRAFT           key technology of IPSs          September, 2005
       
       
       Sending: First, application program calls sending operation, OS 
       copies the data from the buffer area of user's procedure to   
       system buffer in kernel space, then to the DMA buffer.  After 
       that, DMA is initialized and starts to work and the data are 
       transmitted into the network through net driver.
       
       Receiving: When the data arrive, buffer area is assigned for them 
       and DMA starts to transmit the data to DMA buffer.  Then the 
       application program calls receiving operation.  After that CPU 
       reads the data from the DMA buffer and writes them into system  
       buffer, and then they are copied into the user buffer.    
        
       In this course, data are copied for four times in the memory, and
       operation between the kernel and user space is implemented for 
       times, which takes up about fifty percent of whole CPU time.  It   
       is an important factor that causes the speed bottleneck.
   
       Zero-copy technology: Users and system kernel share their buffer 
       area.  The sharing mechanism merges the user buffer and system 
       buffer into a unified one.  Data are transmitted between the 
       sharing buffer and the network interface by DMA, and application
       programs can access the sharing buffer directly.  By this way, 
       data copy between users and system kernel is avoided.  Zero-copy 
       technology reduces the expenses caused by data copy,  so as to 
       increase the throughput of IPS.      
   
   
   (2) Because of various kinds of interfere, mistakes will occur when 
       data are transmitted in the network. In order to check these 
       mistakes, TCP/IP protocol adopts Internet checksum algorithm.  
       Total bytes to be transmitted are summed up and then the checksum  
       is sent to the receiving terminal.  At the receiving terminal,  
       the same numeration is carried out to the data received.  Then  
       the result will be compared with the check sum.  If any data  
       including the check sum itself are transmitted, mistakes will 
       be made, the result will mismatch.  Then the receiving terminal 
       knows that there must be mistakes during the transmission.  
       Usually, software is used to calculate the check sum.  At the 
       sending terminal and receiving terminal the check sum is  
       calculated by CPU.  That is to say, CPU download all the data  
       from memory, and then carry on a series of addition,  which   
       consequentially takes up a lot of system resources and time.
   
       Checksum by hardware: The check sum algorithm can be carried out  
       by hardware.  When the data are transmitted between memory and  
       NIC, the check sum can be calculated by the hardware of the DMA 
       interface. In this way, CPU resources is saved and time spending    
       in this process is also cut down.Consequently, byte expenses is   
       reduced.
   
   
   

              
Yixian Yang, et al.       Expires March, 2006                   [Page 5]
INTERNET-DRAFT           key technology of IPSs          September, 2005

   

3.2 Load Balancing Technology   

   With the appearance of Gigabit Ethernet, the number of data 
   transmitted in the network is becoming larger and larger.  It is 
   quite possible that IPS becomes the speed bottleneck for it is 
   placed in line in the network.  So load balancing technology is  
   adopted in LDIPS(Large-scale Distributed Intrusion Prevention 
   System).  In this way, the network traffic can be distributed 
   averagely on each IPS engine to make sure that every engine can  
   work fully. 
   
3.2.1 Load Balancing Algorithm

   The key of load balancing technology is dispatch algorithm,
   which is divided into two kinds as below:
   
   Static algorithm: In this algorithm, the concrete state of the host 
   computer is not considered.  The load balancing strategy is made 
   according to the information of task in existence.  Its shortcoming
   lies in that it can not reflect the network topology configuration 
   and also can not meet the change of the network structure.
   
   Dynamic algorithm: In this algorithm, the dispatch strategy is 
   made by exchanging the state information of all the hosts.  It is 
   more flexible.  But system expenses would be increased to a certain 
   extent, although this kind of expenses can usually be offsetted.
   
   Dynamic algorithm is adopted and process dispatch algorithm is 
   introduced in this document.  The advantages of both bidding  
   algorithm and drafting algorithm are combined together to carry out  
   load balancing for IPS.
   
   Bidding algorithm: It is a typical algorithm for traditional process
   migration.  Its basic idea is that whenever the processor produces a 
   new process, its load changes. Then it will broadcast the bid to all 
   the other processors in the distributed system.  The other processors 
   soon return their bid marks.  After all the bid marks are received,  
   it will choose the best one and transfer the process to the processor 
   that has won the bid. 
   
   
 
 
 
 
 
 
 
 
 
 
   
   
Yixian Yang, et al.       Expires March, 2006                   [Page 6]
INTERNET-DRAFT           key technology of IPSs          September, 2005



   There are certain defects in the bidding algorithm. Process may be 
   transferred from one heavy load host to another heavy load
   host, or from one light load host to another light load host.  Also, 
   it is possible that several hosts of heavy load transfer processes
   to one host of light load at the same time, which makes the host 
   overweighted at once.  For bidding algorithm, the processor which 
   produces the new process is positive, and the one to which the new
   process is transferred is passive.  So, the process received by the 
   latter one may not be transferred from the host whose load needs to
   be lightened  with the most urgency, which is unfair to other  
   processors of heavy load in the system.
   
   Drafting algorithm: In this algorithm, there are three possible grades 
   for each processor 'load: heavy, light or nomal.  The processor will 
   notify its adjoint ones When the load changes.  For a processor,
   whenever its load is "light", it will send out enlisting requests 
   for the purpose of trying to receive some processes.  Then according 
   to a certain strategy, it will choose a processor among the ones 
   whose load is "heavy".  In case that its load is still "light", it  
   will get other process from the processor choosed and carry it out. 

   Though drafting algorithm solves the problem of fairness, the 
   processor of "light" load can only enlist from its adjoint processors.
   So it has difficulty in making the best load distribution and is hard  
   to be expanded to the system of hundreds and thousands of crunodes 
   
   The concentration of bidding algorithm and the distribution of 
   drafting algorithm are combined together in IPS designed in this   
   document.  A host can be placed in the network as a server to 
   maintain the load table. It needn't to be connected into the network,
   so it won't influence the network traffic.  According to the bidding 
   algorithm, All IPS engines tender for the server by declaring their 
   own state(heavy, light, normal).  Meanwhile, each IPS engine has 
   process of enlisting.  When its load is "light", it will check the   
   load table in the server to find the engines of heavy load and enlist  
   a certain amount of connection. Thus enlistment can be realized among 
   the whole network instead of the confined area near some engine.
   
3.2.2 Load Balancing Strategy

   The realization of load balancing relies on such strategy:
   
   o  Location of the Load Balancing in the Network
   
      A server of load table is placed in parallel in the Intranet,  
      whose task is to keep the information of the IPS engines of heavy 
      load.  In the load table, the records are arranged from large to  
      small according to the load value.  Because of the mode that the   
      server is placed in the network, even if the server broke down,  
      the worst result is the disappearance of the function of load  
      balancing.  It would not influence the exchange of data in the
      network.
      
Yixian Yang, et al.       Expires March, 2006                   [Page 7]
INTERNET-DRAFT           key technology of IPSs          September, 2005

  
  
     
   o  Design of the Load Table 
   
      The load table is used for keeping the information of all the IPS
      engines of heavy load, including their ID, IP address, load value  
      and the network section they are in.
      
   o  Definition of Load Value
   
      The most important factor that can weigh the load of an IPS engine
      is the surplus capacity, because it determines the additional 
      connection amount that IPS engine can deal with. At the same time, 
      the IPS engine would carry on much analysis work.  The operation 
      ability of the engine should be considered as well.  So the 
      utilization ratio of CPU is another important parameter too.  The 
      load value is fixed by amalgamating these parameters according
      to different weight.
      
      Load value is calculated by each IPS engine itself and sent to  
      the server.
      
   o  Definition of the Load State
    
      There are three possible grades for the load of every engine:
      heavy, light and normal.   Only when its load is "heavy", the  
      engine bids to the server.  Only when it is "light", the 
      engine enlists to the server.
      
   o  Clock Signal
   
      The whole course of load balancing is composed of bid and 
      enlistment.  If each engine bids and enlists optionally, they may 
      fight for the critical resource, which will lead to larger system
      expenses and add to the complexity of the strategy.  So clock  
      signal is introduced to control the bid and enlistment.
      
      During the rising edge of the clock, enlistment is allowed while
      bid is forbidden.  During the falling edge, bid is allowed while 
      enlistment is forbidden.
      
   o  Bidding Strategy
   
   
      When the falling edge of the clock comes, each engine calculates  
      the average value of its load during the past several clock cycles   
      in order to avoid system shake caused by great traffic  
      accidentally.
    
    
    
    
    
    
Yixian Yang, et al.       Expires March, 2006                   [Page 8]
INTERNET-DRAFT           key technology of IPSs          September, 2005
    
      
      
      
      In bidding cycle, all the engines calculate their load value to 
      confirm the load state.  If the state is "heavy" in this cycle 
      while it is not "heavy" during the previous cycles, the engine  
      bids to the server.  If the state is "heavy" both in this cycle  
      and the previous ones, but the load value has changed, the engine  
      bids to the server renewedly.  If the state is not "heavy" in  
      this cycle while it is "heavy" in the previous cycles, the engine  
      asks the server to delete the record for it in the load table.
      

      Pseudocode below is the description of bidding strategy.
      
        if(clk ==down) // The falling edge of the clock  
        {
          engine _ load =check(engine); //Calculate the latest load  
                                        //value of the engine.
          engine _ state =evaluate(engin _ load); 
                                        //Get the newest load state 
                                        //according to the load value.
          if(engine _ exstate ==heavy &&engine _ state ==heavy&& 
                                       engine _ exload! =engine _ load) 
                   //The state is "heavy" both in this cycle and the 
                   //previous ones,but the load value has changed.
             bid(engine)                //Bid renewedly.
          elseif(engine_exstate==heavy && engine_state!=heav)
                   //The state is not "heavy" in this cycle but
                   //"heavy" during the previous cycles.
             delete(engine_bid)         //Delete the record for it 
                                        //in the load table.
          elseif(engine_exstate!=heavy && engine_state==heavy)                              
                   //The state is "heavy" in this cycle and not
                   //"heavy" during the previous cycles.
             bid(engine)                //Bid to the server.
        }
                                      
   o  Drafting  Stategy   
   
      When the rising edge of the clock comes, each engine calculates 
      its load value(averaged with the previous cycles).
       
      If the state is "light", the engine will enlist in the load table.
       
      The amount to enlist: An amount that can make the engine's state 
      close to "normal".  Certain resource is reserved to deal with the 
      possible impact.
       
      The being enlisted object: The engine whose load needs to be 
      lightened the most urgently.
      
            
      
      
Yixian Yang, et al.       Expires March, 2006                   [Page 9]
INTERNET-DRAFT           key technology of IPSs          September, 2005
       
     
      
      
      The abandoning mechanism: When the packet arrives the engine 
      which enlists, it may not be accepted for some reasons(Too much 
      packets arrive at a time suddenly and the buffer can not afford,
      or other reasons). Then the EE sends signal for failing to receive   
      to the BE and drops the packet. If the packet is received rightly,
      signal for success will be sent to the BE.  The BE drops the   
      relevant packet only when it receives this signal.  Thus,  
      overfreight for the EE and losing packet for the BE won't happen.
       
      Management for the competition between the EEs:
      (1) For the records in the load table are arranged from large   
          to small according to the load value of the BEs, the  
          EE arrived first enlists the BE whose load value is the  
          largest and locks the relevant record as critical resource. 
          Only after the whole course of enlisting is carried out, the  
          record can be released.  By this way, the phenomenon that a BE  
          is enlisted by different EEs is avoided. 
          
      (2) Sometimes, some of the EEs have no packet to enlist, then they 
          have to wait in queue.  When the first EE finishes enlisting,
          the first engine in the queue checks the load value of the 
          first record in the load table and starts its enlisting.
          This course will be implemented for times until the end of the 
          enlisting cycle.
      
          
      Pseudocode below is the description of enlisting strategy.
       
      Initializtion of enlistment:
        if(clk==up)                     // The rising edge of the clock.
	      {
		      engine_load=check(engine);    // Calculate the latest load  
                                        //value of the engine.
		      engine_state=evaluate(engine_load);  
		                                    //Get the newest load state 
                                        //according to the load value. 
		      engine_space=normal-engine_state-n; 
		                                    //Calculate  the resource left.
		                                    //n is the resource reserved. 
		      if(engine_state==light)       
			        {
				         draft(engin);          // Enlist in the load table.
			        }
		    }








Yixian Yang, et al.       Expires March, 2006                  [Page 10]
INTERNET-DRAFT           key technology of IPSs          September, 2005
     
     
     
     
     
      EE:
        for(i=1;i<=loadtab_tail;i++)    //Search for the BE.
        {
	        if(loadtab[i]_state==unlock)
		          {
			           darft(engine,loadtab[i]); 
                                // Enlist for load[i] which is selected.  
			           lock(loadtab[i]);      // Lock the relevant record.
			           engine_receive=start;  // Begin to receive.
			           while(engine_receive!=end)      
			                                  // Waiting for packets.
			              {
				              start(engine_wait)// Start timing.
				                if(loadtab[i]_load>engine_space)
				                  {  
				  	                sendfail(loadtab[i]);        
                                // No space for more packets.
                                // Send signal for failing to receive.
                    	      unlock(loadtab[i]);       
                    	                 // Release the locked record.
				                    sendtowait(engine);      
				                    engine_receive=end;
				                               // Finish receiving.    
				                  }
				                elseif         //Enough space to receive
				                  {
					                  receive(loadtab[i]); 
					                             // Receive the packets.
					                  senddone(loadtab[i]);    
					                             // Send signal for success. 
					                  engine_receive=end;    
					                             // Finish receiving,
				                  }		
				                elseif(engine_wait==overtime)) 
				                               // Overtime for waiting. 
				                  {
					                  sendfail(loadtab[i]);     
					                             // Send signal for failure.
				                    unlock(loadtab[i]);      
				                               // Release the locked record.
				                    engine_receive=end;       
				                               // Finish receiving. 
				                  }
			              }
		          }
        }
        
     
     
     
               
Yixian Yang, et al.       Expires March, 2006                  [Page 11]
INTERNET-DRAFT           key technology of IPSs          September, 2005
    
    
     
      BE:
                                       // EE is Received.
	      {
	        sendengine(load_copy);       // Send copy of the pocket.
	        while(engine_signal!=true)   // Waiting for the success signal.
	          { 
	  	        start(engine_wait)£»     // Start timing.
	  	        if(engin_send==done)     // Success signal is received.
	  		        {
	  			        delete(load);        // Native packets are dropped.
	  		          engine_signal=true£» // Finish.         
	  		        }
	  	        elseif(engin_send==fail)           
	  		        {
	  			        engine_signal=true£» // Fail to send packets.
	  		        }
	  	        elseif(engine_wait==overtime)           
	  	                                 // Overtime for waiting. 
	  		        {
	  			        engine_signal=true£» // Finish.
	  		        }
	           }    
	      }
	    
	 o  Choice of The Cycle  
	      
      The best cycle for bid and enlist can be found by training. 
      Enlisting cycle should be a bit longer than bidding cycle.
      
   o  Management for the Abnormity 
   
      (1) During the process of enlisting, clock may stop. 
          In this case, no matter which step of the process is going on,
          the packets must be dropped and EE should send failure signal 
          to BE.
      (2) During the enlisting process, the BE does not response.        
          For some reason, the BE may not respond for the 
          EE's request. Then the EE will enter the queue after trying 
          for times.
      (3) During the process of enlisting, no signal from the EE
          can be received by the BE.
          For some reason, after receiving the packets, the EE can not 
          send either success signal or failure signal.  The BE keeps
          waiting until overtime.  In this case, the native packets
          wouldn't be dropped, which may increase the total load of 
          the system. 
          
          The load balancing strategy designed in this document not 
          only meet the LDIPS, but also solve the bottle-neck 
          problem to a certain extent.
         
                            
          
Yixian Yang, et al.       Expires March, 2006                  [Page 12]
INTERNET-DRAFT           key technology of IPSs          September, 2005            
          
3.3 Detection Technologies  

   Pattern matching technology is usually used to detect intrusion in 
   traditional IDS.  Packets received are matched with the rules one by 
   one to check whether there exists any intrusion. This technology is  
   not flexible enough because it neglects the special structure of the 
   packets.
   
   Protocol analysis technology detects intrusion according to the 
   communication protocol.  Protocol analysis based on Markov chain 
   presented in this document integrates the ability of Markov chain
   in dealing with random process and the advantage of protocol analysis 
   technology, which can optimize the performance of IPS.
   
   
3.3.1 Simple Protocol Analysis
       
   In network, data transmission is based on one or more protocols. 
   Thus, only those characters relevant with the implemented protocol 
   will be matched, which can greatly reduce the amount of calculation
   and improve the efficiency.                 
                                   
   Taking Ethernet for example, below is the course of SPA.
      
   MAC layer: In this layer, packets are decoded in order to get the 
   source address and destination address.  At the same time, the 13th
   byte of each packet is checked. If 0800 was found, the packet is an IP 
   packet.  Also, 0806 for ARP packet, 0835 for RARP packet and so on. 
   The engine can directly jump to detect the right position for  
   corresponding protocol. 
   
   IP layer: For an IP packet, the 24th byte is the mark of the 4th 
   layer protocol.  06 is TCP protocol, 01 is ICMP protocol, and 02 is  
   IGMP protocol.  So the engine can ignore more than 10 bytes from the   
   13th byte to the 24th, which improves the efficiency a lot.
   
   Application layer: In this layer, port number is used to identify 
   application programs. For example, the 35th byte of a TCP packet is
   the port number(80 for http protocol, 23 for telnet protocol, 21 for
   FTP protocol and so on).  If it is a TCP packet, the 55th byte is the 
   beginning of the URL.  So the engine will read the 55th byte directly.  
   
   When data are transmitted in network, the max length of a packet or 
   the MTU permitted is not accordant for different protocol.  In 
   Ethernet, any IP packet that is longer than 1500 Bytes must be 
   fragmented.  The IP layer at the receiving host must accumulate these 
   fragments to completely reconstitute the original datagram. For this 
   reason, when the datagram is fragmented, certain rules must be 
   followed.  These rules have made the transmission normative, but also
   created opportunity for hackers.
   
        
   
   
   
Yixian Yang, et al.       Expires March, 2006                  [Page 13]
INTERNET-DRAFT           key technology of IPSs          September, 2005
  
   Some intrusions cause memory errors or even make the protocol stack 
   breakdown by sending lots of conjugated IP fragments whose offset are
   overlapped in a very short time.  These intrusions can not be 
   detected before the fragments are accumulated.  So when SPA is 
   implemented, the engine must accumulate the fragments first.
   
   The flow of SPA is described in Figure 1. 
      
                                    |         
                                    V network packet   
                         N    +------------+     
                     +--------| IP packet? |
                     |        +------------+ 
                     V              |
                +--------+          V
                |  drop  |    +------------+    N
                +--------+    | fragment?  |---------+
                              +------------+         |
                                    |                |
                                Y   V                |
                              +------------+         |
                              | coalition  |         | 
                              +------------+         |
                                    |                |
                                    V                |
                         Y    +------------+         |
                     +--------| intrusion? |         |
                     |        +------------+         |
                     V              |                |
                +--------+          V                |
                | alert  |    +------------+         |
                +--------+    |upper-level |  <------+
                              |  protocol  |
                              +------------+
                                    |
        +-------------------------------------------------------+
        |                   |                     |             |
    +-------+           +-------+             +-------+     +-------+
    | ICMP  |           |  TCP  |             |  UDP  |     |others |
    +-------+           +-------+             +-------+     +-------+
        |                   |                     |             |
        V        +-----------------------+        V             V
 +----------+    |           |            |    +---------+   +------+
 |ICMP agent|    V           V            V    |UDP agent|   | drop |
 +----------+ +------+    +------+   +------+  +---------+   +------+
              | http |    | ftp  |   |telnet| 
              +------+    +------+   +------+
                 |            |          |
                 V            V          V
          +----------+   +---------+  +------------+  
          |http agent|   |ftp agent|  |telnet agent|
          +----------+   +---------+  +------------+
          
            Figure1: flow of SPA

Yixian Yang, et al.       Expires March, 2006                  [Page 14]
INTERNET-DRAFT           key technology of IPSs          September, 2005

   
      
3.3.2 State Protocol Analysis Based on Markov Chain   

   The transmission of data in network is based on various protocols.
   At the different moment of the course that a protocol is implemented,
   the network is in different state.  This can be described with 
   state-switching map. Take the TCP connection for example, Figure 2
   illustrates the process.
   
     +----+  SYN   +----+ SYN,ack +----+  ack    +----+
     | q0 |------->| q1 |-------->| q2 |-------->| q3 |
     +----+        +----+         +----+         +----+
      
       Figure 2:the state-switching map of TCP connection

   Only the sequence {q0,q1,q2,q3} is a normal connection.  If other 
   sequences are detected, there probably is some intrusion.
   
   In fact, within period of time, there is certain reliance relation 
   among the states, namely the next state of the network relies on its
   previous one or even several states.  It means that high-order random 
   process can better describe the course that a protocol is 
   implemented.  But high-order random process module is too complicated 
   and the cost for calculation is too high, especially when it is used 
   to deal with large amount of data.  In order to simplify the problem,
   Markov chain module is adopted to describe the course.
   
   It is assumed that the probability of the state at the moment t+1 
   relies on the state at the moment only and this probability has 
   nothing to do with time. 
   
   During the course that a protocol is carried out, if a state sequence
   {X1,X2,...Xt} is observed, the probability of this sequence can be 
   calculated with the one-step transition probability using Formula (1)
   below.
            
              P(X1,X2,...Xt)=P(X1)*P(X1X2)*P(X2X3)...P(Xt-1Xt)    (1)
              
   If P(X1,X2,...Xt)>=C, the sequence is considered as an normal one.
   If P(X1,X2,...Xt)<C, the sequence is considered as an abnormal one.
   C above is a threshold value, which can be the smallest probability
   for the sequence in history or be adjusted by the administrator 
   according to experience.  
   
   For better description of the difference between normal sequence and
   abnormal sequence, parameter ¦Ë is introduced. It is defined as 
   follow.
         
               ¦Ë=P(X1,X2,...Xm)/P0(X10,X20,...Xn0)               (2)
               
   P0(X10,X20,...Xn0) is the probability of the normal sequence.If 
   ¦Ë=1, (X1,X2,...Xm) is a completely normal sequence.
   
   
Yixian Yang, et al.       Expires March, 2006                  [Page 15]
INTERNET-DRAFT           key technology of IPSs          September, 2005
   
 
   For those complex protocols, there are many states.  As the protocol 
   being implemented, P(X1,X2,...Xm) and P0(X10,X20,...Xn0) will both 
   become too small. In this case, the whole course can be separated into 
   several segments.  Then each segment can be detected using Formula(2).
   
   Markov chain module can not only detect intrusions already happened,
   but also can forecast intrusions that have not finished with n-step 
   transition probability. ¡¡
   
3.4 Design of Detection Engine  

   Based on the technologies described, a detection engine in Figure 3 
   is designed.   
    
          ¡¡¡¡¡¡¡¡¡¡¡¡ +------------------+
           ¡¡¡¡¡¡¡¡¡¡¡¡£ü¡¡control module |
          ¡¡¡¡¡¡¡¡¡¡¡¡ +------------------+
               ¡¡¡¡¡¡¡¡¡¡¡¡     ^
             ¡¡¡¡¡¡¡¡¡¡¡¡       |
                  ¡¡¡¡¡¡¡¡¡¡¡¡  V  
  ¡¡¡¡¡¡+--------------------------------------------------+                                                 
 ¡¡¡¡¡¡ |     +------------------+                         |     
 ¡¡¡¡¡¡ |     |responding module |                         |
¡¡¡¡¡¡  |     +------------------+                         |
  ¡¡¡¡¡¡|              ^                                   |
 ¡¡¡¡¡¡ |              |                                   |
 ¡¡¡¡¡¡ |     +------------------+                         |
 ¡¡¡¡¡¡ |     |      SPABM       |-----------+             |
 ¡¡¡¡¡¡ |     |      module      |           |             |
 ¡¡¡¡¡¡ |     +------------------+           V             |
¡¡¡¡¡¡  |              ^                +---------+        |
  ¡¡¡¡¡¡|              |                |   DB    |        |    
        |              |                +---------+        |                     
        |     +------------------+           ^             |
        |     |       SPA        |           |             |
        |     |      module      |-----------+             |
        |     +------------------+                         |     
        |              ^                                   |
        |              |                                   | 
        |     +------------------+                         | 
        |     |datagram capturing|                         |
        |     |     module       |                         |
        |     +------------------+                         | 
        +--------------------------------------------------+ 
                       ^
                       |          
              +-----------------+
              |network interface|
              +-----------------+     
              
              Figure 3: composing of the detection engine   
              
              
              
Yixian Yang, et al.       Expires March, 2006                  [Page 16]
INTERNET-DRAFT           key technology of IPSs          September, 2005                 
   
   Datagram capturing module: This module takes charge of capturing 
   network packets. Interrupt coalescence and Zero-copy technology are 
   adopted.  Also checksum is realized by hardware. 
   
   SPA module: Rock-bottom analysis to the datagram captured is carried 
   out by this module.
   
   SPABM module: This module mainly detects the intrusions based on the 
   course that protocols are implemented.
   
   Responding module: Positive measures are taken as the response to the 
   alerts made by the two analysis modules. 
   
   DB:The two analysis modules get detection rules here.
   
   Control module: This module harmonizes the work of other modules in
   order to complete intrusion detection and active prevention.
   
   Intrusion detection is completed by SPA and SPABM.  Figure 3 
   illustrates the flow of whole detection.
    
                               |         
                               V network packet 
                          +---------+ 
                          |parse the|
                          | packet  |
                          +---------+
                               |
                               V 
                         +-----------+
                         |    SPA    |
                         +-----------+
                               |
                               V 
                         +------------+      Y
                         | intrusion? |---------------+
                         +------------+               |
                               |                      |
                               V                      |
                         +------------+               |
                         |   SPABM    |               |
                         +------------+               |
                               |                      |
                               V                      |
                        +------------+      Y         |
                        | intrusion? |----------------|
                        +------------+                |
                               |                      |
                            N  V                      V
                           +--------+            +--------+
                           |  pass  |            | alert  |
                           +--------+            +--------+
                           
                    Figure 4: flow chart of intrusion detection
                                        
Yixian Yang, et al.       Expires March, 2006                  [Page 17]
INTERNET-DRAFT           key technology of IPSs          September, 2005                     
                       
   In order to improve detection efficiency, a port scanning module
   can be used.  It begins to work along with the system and scans all
   the hosts in the network regularly.  According to the results of 
   scanning,  detection rules can be updated to keep up with the changes
   of the system.  For example, if it has discovered that the ftp port
   of some host was closed, then all the ftp connection to this host will 
   be regarded as invalid requests.  At the same time, even if some 
   intrusion characters are detected in these data packets, intrusion 
   will not take place.  In this way, efficiency is improved and false 
   positive is reduced.
   
   Figure 4 illustrates the flow of Detection engine with port scanning 
   module.                        
                              
                        +--------------------+    
                        |   initialization    |
                        +--------------------+    
                                  |
                                  V 
                        +--------------------+     
         +------------->| intrusion detection|     
         |              +--------------------+ 
         |                        |
         |                        V 
         |                 +-------------+¡¡¡¡¡¡Y
         |                 |   finish?   |-------------+
         |                 +-------------+             |   
         |                        |                    |
         |                        V                    |  
         |     N         +------------------+          |
         +---------------|time for scanning?|          |
         |               +------------------+          |
         |                        |                    |
         |                        V                    |
         |               +------------------+          |
         |               |  scan the ports  |          |
         |               +------------------+          |      
         |                        |                    |
         |                        V                    | 
         |      N        +------------------+          |
         +---------------|  state changed?  |          |
         |               +------------------+          |
         |                        |                    |
         |                        V                    V
         |               +------------------+      +--------+    
         |               | modify the rules |      | finish |   
         |               +------------------+      +--------+       
         |                        |
         +------------------------+                                                            
       
         Figure 5: the flow of the whole detection engine  




Yixian Yang, et al.       Expires March, 2006                  [Page 18]
INTERNET-DRAFT           key technology of IPSs          September, 2005      




4. Informative References

   [1]  M.R Eskicioglu Design issues of process migration facilities in
        distributed system.IEEE New sletter TC on OS and Application
        Environments,Vol 4,No.2,Summer 1990
        
   [2]  Jerry Chu.Zero-copy TCP in Solaris[C].In:Proc of the USENIX 1996
        Annual Technical Conference,1996
      
   [3]  Lane T.,Brodley C.E..Temporal sequence learning and data 
        reduction for anomaly detection.In:Proceedings of the 5th ACM
        Conference on Computer and Communications Security,San 
        Francisco,California,1998,150~158
   
   [4]  Protocol Analysis and Command Parsing vs.Pattern Maching in
        Intrusion Detection System.http://www.networkice.com/products/
        documentation.html,2000 






5. Acknowledgement

   The authors wish to thank Ming Cao, Xu Zhu, Xinliu Wang, Liang Ren,  
   and Huayi Rao, Shuai Zeng for their detailed inputs. 






6. Authors' Addresses

   Yixian Yang
   Information Security Center,
   Beijing University of posts and telecom.(BUPT),
   Beijing, China,100876
   Phone:8610-62283366
   Email:yxyang@bupt.edu.cn











Yixian Yang, et al.       Expires March, 2006                  [Page 19]
INTERNET-DRAFT           key technology of IPSs          September, 2005
   

Full Copyright Statement

   Copyright (C) The Internet Society (2005).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at ietf-
   ipr@ietf.org.
   
   
    
    
    
        








            
Yixian Yang, et al.       Expires March, 2006                  [Page 20]