Network Working Group                                   Keith Gutfreund
Internet Draft                              AltaVista Internet Software
                                                           May 10, 1999
                                              Expires November 10, 1999


                   Internet Content Filtering Protocol
                 draft-gutfreund-content-filtering-protocol-00.txt


Status of This Memo

This document is an Internet-Draft and is in full conformance with all 
provisions of Section 10 of RFC 2026.

Internet-Drafts are working documents of the Internet Engineering Task 
Force (IETF), its areas, and its working groups.  Note that other 
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months 
and may be updated, replaced, or obsoleted by other documents at any 
time.  It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at 
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Internet Draft     draft-content-filtering-protocol-00.txt     May 1999


Abstract

      The Content Filtering Protocol (CFP) has been developed to 
      facilitate the connection of content filtering databases to 
      Internet firewall systems. CFP compliance allows content filters
      to be located "behind the firewall," where they are safe from 
      outside hostile attack.

      The CFP is a binary protocol used by firewall systems to 
      communicate over a private TCP/IP connection to the content 
      filtering database server.


Table of Contents

1. Introduction
1.1 Firewalls
1.2 Content Filtering Databases
1.3 Content Filtering Protocol
2. The Content Filtering Protocol
2.1 Overview
2.2 Binary Protocol
2.3 Response Request Architecture
2.4 Message Header Block
2.5 Messages
2.5.1 SERVER_STATUS_REQUEST - Command ID: 1
2.5.2 VERSION_REQUEST - Command ID: 2
2.5.3 FEATURES_REQUEST - Command ID: 3
2.5.4 URL_LOOKUP_REQUEST - Command ID: 4
2.5.5 SERVER_STATUS_RESPONSE - Command ID: 101
2.5.6 VERSION_RESPONSE - Command ID: 102
2.5.7 FEATURES_RESPONSE - Command ID: 103
2.5.8 URL_LOOKUP_RESPONSE - Command ID: 104
3. Character Encoding
4. Security Considerations
5. Acknowledgements
6. Author's Address


1. Introduction

The Content Filtering Protocol (CFP) is a protocol used to communicate 
between firewall systems and a content filtering database server. A 
typical configuration is shown below:

< Internet > ||  +==========+    |  < Firewall protected resources >
             ||  | Firewall |    |         .
             ||  | System   |    |         .
             ||  +==========+ < === > +===========+
             ||       .        (CFP)  | Content   |
             ||       .          .    | Filtering |
             ||       .          .    | Database  |
             ||  +==========+  (CFP)  | Server    |
             ||  | Firewall | < === > +===========+
             ||  | System   |    |         .
             ||  +==========+    |         .
             ||                  |         .

Figure 1, Typical firewall and content filtering database configuration


1.1 Firewalls

Firewall systems are computer systems used to authorize and secure 
network traffic between inside (secure) resources and outside 
(insecure) resources.  When used with a content filtering database, the
firewall can restrict access to undesirable content outside of the 
firewall.

Firewall systems typically play a dual role: 
a) Unauthorized users outside the firewall are prevented from accessing
resources behind the firewall.
b) Users behind the firewall are kept from accessing unauthorized 
resources outside the firewall.

Firewall systems are considered to be the dividing line between the 
internal (secure) resources and the external (insecure) world.

1.2 Content Filtering Databases

Content filtering databases are databases of network resource 
addresses.  In the context of firewalls, content filtering databases 
are used to identify requests originating behind the firewall for 
undesirable content outside of the firewall.  Once a request is 
identified as undesirable, the content filtering database notifies the 
firewall system that the request should be denied.  The firewall system
then takes the necessary steps to deny the request.  These steps may 
include, for example, logging and documenting the request or 
redirecting the request to a more desirable location.

1.3 Content Filtering Protocol

The Content Filtering Protocol (CFP) is a network protocol that 
describes how a CFP-compliant firewall communicates with a 
CFP-compliant content filtering database.

2. The Content Filtering Protocol

2.1 Overview

The protocol described herein defines the communications between a 
firewall system and a content filtering database.  Current 
implementations communicate over a TCP/IP network stack; this is, 
however, not a strict requirement of the protocol.

2.2 Binary Protocol

The Content Filtering Protocol (CFP) is a binary protocol with variable
length communication instances or messages.  All numeric values are 
transmitted in network byte order.  Character strings are terminated by
a null character and are additionally accompanied by their location 
within a message and their length in bytes.

Each message is composed of a fixed length header field (message header
block) followed by a variable length body (message body block).  The 
message body block may have zero (0) length.  Figure 2 shows the format 
of a message.

           +======================+
           | Message Header Block |
           |     ( 16 bytes)      |
           +======================+
           | Message Body Block   |
           |     ( variable )     |
           +======================+

Figure 2, Message format

2.3 Response Request Architecture

Communications between the firewall and the database follow a typical 
client-server, transactional architecture.  The firewall (the client) 
initiates all requests to the database (the server) and the 
architecture defines a precise response from the database back to the 
firewall for every request. 

Typically, a client will issue some "request" for information (say a 
URL lookup request) and the server will return a "response."  This 
request-response pair is called a "transaction".  For example, a client
will transmit a SERVER_STATUS_REQUEST and the server will respond with 
a SERVER_STATUS_RESPONSE.  The command field (described below) in the 
Message Header Block semantically identifies each request or response 
in a transaction.

Communications are straightforward.  The firewall clients and server 
communicate over a standard TCP/IP connection, on port 18311.  There 
may be more than one firewall client connected to the database server. 
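
The transaction described above can be sketched in Python as follows.
This is a minimal illustration, not a reference implementation.  The
helper names recv_exact and transact are hypothetical, and the sketch
assumes that the Length field (see section 2.4) is an unsigned 16-bit
integer occupying the first two bytes of the Message Header Block.

```python
import socket
import struct

CFP_PORT = 18311   # TCP port named in this section
HEADER_SIZE = 16   # fixed size of the Message Header Block


def recv_exact(sock, count):
    """Read exactly `count` bytes from the socket or raise ConnectionError."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def transact(sock, request):
    """Send one request and return the complete response message as bytes."""
    sock.sendall(request)
    header = recv_exact(sock, HEADER_SIZE)
    # Assumption: Length is an unsigned 2-byte value in network byte order
    # that counts the whole message, header included.
    (total_length,) = struct.unpack("!H", header[:2])
    if total_length < HEADER_SIZE:
        raise ValueError("length field is smaller than the header size")
    body = recv_exact(sock, total_length - HEADER_SIZE)
    return header + body
```

A firewall client would typically obtain the socket with
socket.create_connection((host, CFP_PORT)), where host is the address
of the content filtering database server.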


2.4 Message Header Block

The Message Header Block is used for both request and response 
messages.  Depending upon the request or response, it may be followed 
by a Message Body Block.

Field       Size      Field 
Name        (bytes)   Description
-----       -------   -----------------
Length      short(2)  Number of bytes in the entire message, 
                      including the Message Header Block.
-----       -------   -----------------
Version     byte(1)   The major version number of this protocol.
(Major)
-----       -------   -----------------
Version     byte(1)   The minor version number of this protocol.
(Minor)
-----       -------   -----------------
Command     short(2)  The command identifier.
-----       -------   -----------------
Reserved    short(2)  Reserved, must be 0.
-----       -------   -----------------
Reserved    long(4)   Reserved, must be 0.
-----       -------   -----------------
Transaction long(4)   A unique identifier generated by the requester
ID                    and returned in the corresponding response
                      message.
-----       -------   -----------------
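
The header layout above maps directly onto a fixed 16-byte structure.
The following Python sketch packs and unpacks it with the standard
struct module.  The names pack_header, unpack_header and Header are
hypothetical, all fields are assumed to be unsigned, and the default
version numbers (1 and 0) are placeholders, since this document does
not state the protocol version values.

```python
import struct
from collections import namedtuple

# Length, Version (Major), Version (Minor), Command, Reserved, Reserved,
# Transaction ID.  "!" selects network byte order with no padding.
HEADER_FORMAT = "!HBBHHII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 16 bytes

Header = namedtuple("Header", "length major minor command transaction_id")


def pack_header(command, transaction_id, body_length=0, major=1, minor=0):
    """Build a Message Header Block; both reserved fields are set to 0."""
    return struct.pack(
        HEADER_FORMAT,
        HEADER_SIZE + body_length,  # Length covers header plus body
        major,                      # placeholder version, an assumption
        minor,
        command,
        0,
        0,
        transaction_id,
    )


def unpack_header(data):
    """Decode the first 16 bytes of a message into a Header tuple."""
    length, major, minor, command, _reserved1, _reserved2, txid = struct.unpack(
        HEADER_FORMAT, data[:HEADER_SIZE]
    )
    return Header(length, major, minor, command, txid)
```

A requester can compare the transaction_id of a decoded response with
the value it sent in order to match the response to its request.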

2.5 Messages

Messages are composed of a fixed length message header block and a 
variable length message body block.  The message body block may have 
zero length.  Both requests from the firewall and responses from the 
database use this same message format.

There are two types of messages, requests and responses.  Requests are 
sent from the client to the server; responses are sent from the server 
to the client.  There is a one-to-one correspondence between requests 
and responses.

The command field in the message header block identifies the message.  
For example, the SERVER_STATUS_REQUEST message has a message header 
block containing the value of 1 in the Command ID field.  The 
SERVER_STATUS_RESPONSE message has a message header block containing 
the value of 101 in the Command ID field.  For ease of use, all 
response messages have Command ID values 100 greater than their 
corresponding request message ID values.
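
As an illustration of this numbering rule, the command identifiers
listed in this document can be written as a Python enumeration, with
the response identifier derived from the request identifier.  The names
Command and expected_response are hypothetical.

```python
from enum import IntEnum


class Command(IntEnum):
    """Command identifiers as listed in sections 2.5.1 through 2.5.8."""
    SERVER_STATUS_REQUEST = 1
    VERSION_REQUEST = 2
    FEATURES_REQUEST = 3
    URL_LOOKUP_REQUEST = 4
    SERVER_STATUS_RESPONSE = 101
    VERSION_RESPONSE = 102
    FEATURES_RESPONSE = 103
    URL_LOOKUP_RESPONSE = 104


RESPONSE_OFFSET = 100  # responses are numbered 100 above their requests


def expected_response(request):
    """Return the response command that corresponds to a request command."""
    request = Command(request)  # raises ValueError for unknown identifiers
    if request >= RESPONSE_OFFSET:
        raise ValueError("not a request command: %s" % request.name)
    return Command(request + RESPONSE_OFFSET)
```

A client can use such a check to reject a response whose Command field
does not correspond to the request it sent.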

2.5.1 SERVER_STATUS_REQUEST - Command ID: 1

The purpose of this message is for the firewall to request status from 
the content filtering database. The request is composed of a Message 
Header Block with a command identifier of SERVER_STATUS_REQUEST. This 
message has no message body block.  The database returns status 
information in a SERVER_STATUS_RESPONSE message.

2.5.2 VERSION_REQUEST - Command ID: 2

The purpose of this message is for the firewall to request version 
information from the content filtering database. The request is 
composed of a Message Header Block with a command identifier of 
VERSION_REQUEST. This message has no message body block. The database 
returns version information in a VERSION_RESPONSE message.

2.5.3 FEATURES_REQUEST - Command ID: 3

The purpose of this message is for the firewall to request information 
from the content filtering server about supported protocols, features, 
filtering categories, etc.  For example, the firewall could determine 
whether or not the server supports filtering for the FTP protocol.  
[This message is reserved for future use and is not currently 
implemented.]

2.5.4 URL_LOOKUP_REQUEST - Command ID: 4

The purpose of this message is for the firewall to determine if a URL 
should be filtered.  The URL_LOOKUP_REQUEST is composed of a Message 
Header Block with a command identifier of URL_LOOKUP_REQUEST, followed 
by a Message Body Block as shown below. The database responds with a 
URL_LOOKUP_RESPONSE message. 

URL_LOOKUP_REQUEST Message Body Block

Field       Size      Field 
Name        (bytes)   Description
-----       -------   -----------------
Protocol    short(2)  A value indicating the protocol type: HTTP,
                      FTP, NNTP, etc.  The value used is the port
                      number for the protocol from RFC 1700.
-----       -------   -----------------
URL string  short(2)  The length of the following URL string in
length                bytes.
-----       -------   -----------------
Source      long(4)   The IP address of the original client host
address               (not the Firewall) requesting the URL.                     
-----       -------   -----------------
URL         char(var) The requested URL.  This string is null
string                terminated.
-----       -------   -----------------
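
The body layout above could be assembled as in the following Python
sketch.  The function name build_url_lookup_body is hypothetical.  The
sketch assumes an IPv4 source address, unsigned fields, and a URL
string length that counts the URL bytes without the terminating null;
this document does not say whether the null is included in the length.

```python
import socket
import struct


def build_url_lookup_body(url, source_ip, protocol_port=80):
    """Build a URL_LOOKUP_REQUEST Message Body Block.

    protocol_port is the assigned port number of the protocol
    (80 for HTTP, 21 for FTP, 119 for NNTP).
    """
    url_bytes = url.encode("utf-8")
    fixed = struct.pack(
        "!HH4s",
        protocol_port,
        len(url_bytes),               # assumption: null not counted
        socket.inet_aton(source_ip),  # 4 bytes, already in network order
    )
    return fixed + url_bytes + b"\x00"
```

The complete message is this body preceded by a Message Header Block
whose Length field is 16 plus the length of the body.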

2.5.5 SERVER_STATUS_RESPONSE - Command ID: 101

The server status response is composed of the standard message header 
and the following message body:


Field       Size      Field 
Name        (bytes)   Description
-----       -------   -----------------
Status code short(2)  The server status code.
-----       -------   -----------------
Status msg  short(2)  The length of the server status message,
length                or 0 if no message.
-----       -------   -----------------
License     long(4)   The number of workstations licensed to use
count                 the database.
-----       -------   -----------------
Reserved    long(4)   Reserved, must be 0.
-----       -------   -----------------
Reserved    long(4)   Reserved, must be 0.
-----       -------   -----------------
Message     char(var) A status message string corresponding to the
String                status code.  The string is null terminated.
-----       -------   -----------------
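
A client might decode this body as in the following Python sketch.  The
names parse_server_status_body and ServerStatus are hypothetical, and
the fields are assumed to be unsigned.  Because this document does not
say whether the status message length includes the terminating null,
the sketch strips any trailing null bytes rather than relying on
either reading.

```python
import struct
from collections import namedtuple

# Status code, status message length, license count, two reserved fields.
STATUS_FORMAT = "!HHIII"
STATUS_FIXED_SIZE = struct.calcsize(STATUS_FORMAT)  # 16 bytes

ServerStatus = namedtuple("ServerStatus", "status_code license_count message")


def parse_server_status_body(body):
    """Decode a SERVER_STATUS_RESPONSE Message Body Block."""
    status_code, msg_length, license_count, _r1, _r2 = struct.unpack(
        STATUS_FORMAT, body[:STATUS_FIXED_SIZE]
    )
    raw = body[STATUS_FIXED_SIZE:STATUS_FIXED_SIZE + msg_length]
    # Tolerate a length that does or does not count the trailing null.
    message = raw.rstrip(b"\x00").decode("utf-8")
    return ServerStatus(status_code, license_count, message)
```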

2.5.6 VERSION_RESPONSE - Command ID: 102

The server sends this message in reply to a VERSION_REQUEST.  It is 
composed of a Message Header Block with a command identifier of 
VERSION_RESPONSE.  This message has no message body block.

2.5.7 FEATURES_RESPONSE - Command ID: 103

The server sends this message in reply to a FEATURES_REQUEST.  It is 
composed of a Message Header Block with a command identifier of 
FEATURES_RESPONSE.  This message has no message body block.  [This 
message is reserved for future use and is not currently implemented.]

2.5.8 URL_LOOKUP_RESPONSE - Command ID: 104

The URL lookup response is composed of a Message Header Block and the 
message body below. When a URL is to be blocked, the database server 
must return a lookup code greater than 0 and zero or more of the 
following items:
a. An optional text message describing why the URL was blocked, 
suitable for display on the client browser.
b. An optional "redirected" (replacement) URL.
c. An optional rating label or category.

Field       Size      Field 
Name        (bytes)   Description
-----       -------   -----------------
lookup code short(2)  Zero if the request should not be blocked.
                      Non-zero lookup codes indicate that a 
                      request should be blocked, the license has
                      been exceeded, or an error occurred.
-----       -------   -----------------
Rating      short(2)  Zero if no rating label or category found.
label                 Otherwise, length of the rating label and/
length                or category string in bytes.
-----       -------   -----------------
Rating      short(2)  Offset from the beginning of this structure
Label                 to the rating label and/or category string.
offset
-----       -------   -----------------
Message     short(2)  Zero if no rating message.  Otherwise,
label                 length of an HTML formatted message for
length                display on the client's browser.  This
                      string is null terminated.
-----       -------   -----------------
Message     short(2)  Offset from the beginning of this structure
label                 to the rating message string.
offset
-----       -------   -----------------
Redirect    short(2)  Zero if no redirected URL.  Otherwise,
URL length            length of a redirected URL string for
                      display on the client's browser.  This
                      string is null terminated.
-----       -------   -----------------
Redirect    short(2)  Offset from the beginning of this structure
URL offset            to the redirected URL string.
-----       -------   -----------------
Reserved    short(2)  Reserved, must be 0.
-----       -------   -----------------
Rating      char(var) The rating label and/or category for the
label                 requested URL, if available.  This string
string                is null-terminated.  This string is only
                      present if the rating label string length
                      is > 0.
-----       -------   -----------------
Message     char(var) For blocked URLs, HTML formatted text for
string                display on the client's browser.  This
                      string is null-terminated.  This string is
                      only present if the message string length
                      is > 0.
-----       -------   -----------------
Redirected  char(var) For blocked URLs, a redirected URL for the
URL string            client's browser.  This string does not
                      need to be null-terminated.  This string
                      is only present if the redirected URL
                      string length is > 0.
-----       -------   -----------------
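
The offset and length pairs above let a client locate each optional
string without scanning for null characters.  The following Python
sketch shows one way to decode the body.  The names
parse_url_lookup_body and LookupResult are hypothetical, the fields are
assumed to be unsigned, and offsets are taken relative to the first
byte of the message body, following the field descriptions above.

```python
import struct
from collections import namedtuple

# Eight 2-byte fields: lookup code, three (length, offset) pairs, reserved.
LOOKUP_FORMAT = "!8H"
LOOKUP_FIXED_SIZE = struct.calcsize(LOOKUP_FORMAT)  # 16 bytes

LookupResult = namedtuple("LookupResult", "lookup_code rating message redirect_url")


def _extract(body, offset, length):
    """Return the string at (offset, length), or None when length is 0."""
    if length == 0:
        return None
    raw = body[offset:offset + length]
    # Strip a trailing null whether or not the length counted it.
    return raw.rstrip(b"\x00").decode("utf-8")


def parse_url_lookup_body(body):
    """Decode a URL_LOOKUP_RESPONSE Message Body Block."""
    (code, rating_len, rating_off, msg_len, msg_off,
     redir_len, redir_off, _reserved) = struct.unpack(
        LOOKUP_FORMAT, body[:LOOKUP_FIXED_SIZE]
    )
    return LookupResult(
        code,
        _extract(body, rating_off, rating_len),
        _extract(body, msg_off, msg_len),
        _extract(body, redir_off, redir_len),
    )
```

A lookup_code of zero means the request is not blocked; in that case
the three optional strings are normally absent and decode to None.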

3. Character Encoding

All character strings in CFP are UTF-8 encoded, null terminated, and 
accompanied by their length (in bytes).
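
Because lengths are expressed in bytes rather than characters, the
length of a string must be measured after UTF-8 encoding.  The
following Python sketch illustrates the point.  The function name
encode_cfp_string is hypothetical, and the sketch assumes the reported
length does not count the terminating null.

```python
def encode_cfp_string(text):
    """Return (wire_bytes, byte_length) for a CFP character string."""
    encoded = text.encode("utf-8")
    if b"\x00" in encoded:
        # An embedded null would end the string early on the wire.
        raise ValueError("string must not contain a null character")
    # Assumption: the length excludes the terminating null byte.
    return encoded + b"\x00", len(encoded)
```

For example, the four-character string "café" occupies five bytes
before the terminating null, because "é" encodes to two bytes.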

4. Security Considerations

Using the content filtering protocol allows the content filtering 
database server to be safely located behind the firewall.  
Alternatively, the content filtering database server can also be 
located on the Internet side of the firewall, as shown below:

             < Internet >
                              ^^
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ || ~ ~ ~ ~ ~ ~ ~ ~
                              vv
                         +===========+
+==========+             | Content   |
| Firewall |  < ===== >  | Filtering |
| System   |             | Database  |
+==========+             | Server    |
    ^^                   +===========+
~ ~ || ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
    vv
       < Firewall protected resources >

Figure 3, Content filtering database located outside the firewall


In this configuration, the content filtering database works as a proxy 
server for all content requests sent to and retrieved from the 
Internet.  The principal advantage here is that many firewalls and 
content servers can communicate with each other without any special 
protocol.  The firewall treats the content server as a super-firewall 
(operating in a firewall-within-a-firewall mode) and the content server
accepts or rejects content according to its configuration.  This 
configuration is very easy to implement.

The principal disadvantage to the above configuration is that the 
content filtering database server is not protected by the firewall.  
One other disadvantage is that the firewall management system is not 
able to communicate with the database server, other than to send and 
retrieve content requests.

5. Acknowledgements

The author would like to thank the following people for their support 
and feedback on this specification:

The AltaVista Security Team, Compaq Computer Corporation
Steve Shannon, The Content Advisor
Chao Yu, Log-On Data Corporation
Myrna, Olivia, Sander and Maxine Gutfreund


6. Author's Address

Please address all comments to:

Keith Gutfreund
AltaVista Internet Software
Compaq Computer Corporation
550 King Street
Littleton, MA 01460
Email: keith.gutfreund@compaq.com
Phone: (978) 506-2147


Document Expiration Date: November 10, 1999

