Network Working Group                                    Keith Gutfreund
Internet Draft                               AltaVista Internet Software
                                                            May 10, 1999
Expires November 10, 1999


                   Internet Content Filtering Protocol
            draft-gutfreund-content-filtering-protocol-00.txt

Status of This Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Internet Draft   draft-content-filtering-protocol-00.txt       May 1999

Abstract

   The Content Filtering Protocol (CFP) has been developed to
   facilitate the connection of content filtering databases to Internet
   firewall systems. CFP compliance allows content filters to be
   located "behind the firewall," where they are safe from outside
   hostile attack. The CFP is a binary protocol used by firewall
   systems to communicate over a private TCP/IP connection to the
   content filtering database server.

Table of Contents

   1. Introduction
      1.1 Firewalls
      1.2 Content Filtering Databases
      1.3 Content Filtering Protocol
   2. The Content Filtering Protocol
      2.1 Overview
      2.2 Binary Protocol
      2.3 Response Request Architecture
      2.4 Message Header Block
      2.5 Messages
         2.5.1 SERVER_STATUS_REQUEST - Command ID: 1
         2.5.2 VERSION_REQUEST - Command ID: 2
         2.5.3 FEATURES_REQUEST - Command ID: 3
         2.5.4 URL_LOOKUP_REQUEST - Command ID: 4
         2.5.5 SERVER_STATUS_RESPONSE - Command ID: 101
         2.5.6 VERSION_RESPONSE - Command ID: 102
         2.5.7 FEATURES_RESPONSE - Command ID: 103
         2.5.8 URL_LOOKUP_RESPONSE - Command ID: 104
   3. Character Encoding
   4. Security Considerations
   5. Acknowledgements
   6. Author's Address

1. Introduction

   The Content Filtering Protocol (CFP) is a protocol used to
   communicate between firewall systems and a content filtering
   database server. A typical configuration is shown below:

 < Internet >
      ||   +==========+  |   < Firewall protected resources >
      ||   | Firewall |  |     .
      ||   |  System  |  |       .
      ||   +==========+ < === > +===========+
      ||        .        (CFP)  |  Content  |
      ||        .          .    | Filtering |
      ||        .          .    | Database  |
      ||   +==========+  (CFP)  |  Server   |
      ||   | Firewall | < === > +===========+
      ||   |  System  |  |       .
      ||   +==========+  |     .
      ||                 |   .

   Figure 1, Typical firewall and content filtering database
   configuration

1.1 Firewalls

   Firewall systems are computer systems used to authorize and secure
   network traffic between inside (secure) resources and outside
   (insecure) resources. When used with a content filtering database,
   the firewall can restrict access to undesirable content outside of
   the firewall. Firewall systems typically play a dual role:

   a) Unauthorized users outside the firewall are prevented from
      accessing resources behind the firewall.

   b) Users behind the firewall are kept from accessing unauthorized
      resources outside the firewall.

   Firewall systems are considered to be the dividing line between the
   internal (secure) resources and the external (insecure) world.

1.2 Content Filtering Databases

   Content filtering databases are databases of network resource
   addresses. In the context of firewalls, content filtering databases
   are used to identify requests originating behind the firewall for
   undesirable content outside of the firewall. Once a request is
   identified as undesirable, the content filtering database notifies
   the firewall system that the request should be denied. The firewall
   system then takes the necessary steps to deny the request. These
   steps may include, for example, logging and documenting the request
   or redirecting the request to a more desirable location.

1.3 Content Filtering Protocol

   The Content Filtering Protocol (CFP) is a network protocol that
   describes how a CFP-compliant firewall communicates with a
   CFP-compliant content filtering database.

2. The Content Filtering Protocol

2.1 Overview

   The protocol described herein governs the communications between a
   firewall system and a content filtering database. Current
   implementations are based on communications over a TCP/IP network
   stack; this is, however, not a strict requirement for the protocol.

2.2 Binary Protocol

   The Content Filtering Protocol (CFP) is a binary protocol with
   variable length communication instances or messages. All numeric
   values are transmitted in network byte order. Character strings are
   terminated by a null character and are additionally accompanied by
   their location within a message and their length in bytes.

   Each message is composed of a fixed length header field (message
   header block) followed by a variable length body (message body
   block). The message body block may have zero (0) length. Figure 2
   shows the format of a message.

   +======================+
   | Message Header Block |
   |      ( 16 bytes)     |
   +======================+
   |  Message Body Block  |
   |     ( variable )     |
   +======================+

   Figure 2, Message format

2.3 Response Request Architecture

   Communications between the firewall and the database follow a
   typical client-server, transactional architecture. The firewall (the
   client) initiates all requests to the database (the server) and the
   architecture defines a precise response from the database back to
   the firewall for every request.

   Typically, a client will issue some "request" for information (say a
   URL lookup request) and the server will return a "response." This
   request-response pair is called a "transaction". For example, a
   client will transmit a SERVER_STATUS_REQUEST and the server will
   respond with a SERVER_STATUS_RESPONSE. The command field (described
   below) in the Message Header Block semantically identifies each
   request or response in a transaction.

   Communications are straightforward. The firewall clients and server
   communicate over a standard TCP/IP connection, on port 18311. There
   may be more than one firewall client connected to the database
   server.

2.4 Message Header Block

   The Message Header Block is used for both request and response
   messages. Depending upon the request or response, it may be followed
   by a Message Body Block.

                    Field Size
   Field Name       (bytes)     Description
   ---------------  ----------  ------------------------------------
   Length           short(2)    Number of bytes in the entire
                                message, including the Message
                                Header Block.
   ---------------  ----------  ------------------------------------
   Version (Major)  byte(1)     The major version number of this
                                protocol.
   ---------------  ----------  ------------------------------------
   Version (Minor)  byte(1)     The minor version number of this
                                protocol.
   ---------------  ----------  ------------------------------------
   Command          short(2)    The command identifier.
   ---------------  ----------  ------------------------------------
   Reserved         short(2)    Reserved, must be 0.
   ---------------  ----------  ------------------------------------
   Reserved         long(4)     Reserved, must be 0.
   ---------------  ----------  ------------------------------------
   Transaction ID   long(4)     A unique identifier generated by the
                                requester and returned in the
                                corresponding response message.
   ---------------  ----------  ------------------------------------

2.5 Messages

   Messages are composed of a fixed length message header block and a
   variable length message body block. The message body block may have
   zero length. Both requests from the firewall and responses from the
   database use this same message format.

   There are two types of messages, requests and responses. Requests
   are sent from the client to the server; responses are sent from the
   server to the client. There is a one-to-one correspondence between
   requests and responses.

   The command field in the message header block identifies the
   message. For example, the SERVER_STATUS_REQUEST message has a
   message header block containing the value of 1 in the Command field.
   The SERVER_STATUS_RESPONSE message has a message header block
   containing the value of 101 in the Command field. For ease of use,
   all response messages have Command ID values 100 greater than their
   corresponding request message ID values.

2.5.1 SERVER_STATUS_REQUEST - Command ID: 1

   The purpose of this message is for the firewall to request status
   from the content filtering database. The request is composed of a
   Message Header Block with a command identifier of
   SERVER_STATUS_REQUEST. This message has no message body block. The
   database returns status information in a SERVER_STATUS_RESPONSE
   message.

2.5.2 VERSION_REQUEST - Command ID: 2

   The purpose of this message is for the firewall to request version
   information from the content filtering database. The request is
   composed of a Message Header Block with a command identifier of
   VERSION_REQUEST. This message has no message body block. The
   database returns version information in a VERSION_RESPONSE message.

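   As an illustration of Sections 2.4 through 2.5.2, the following
   Python sketch packs and unpacks the 16-byte Message Header Block and
   builds a body-less request. It is not part of the protocol
   definition. The helper names (pack_message, unpack_message,
   expected_response) and the default protocol version of 1.0 are
   assumptions made for this example; the draft does not state a
   version value.

```python
import struct

# Field order and sizes follow the Message Header Block table:
# Length(2) Major(1) Minor(1) Command(2) Reserved(2) Reserved(4)
# Transaction ID(4). ">" selects network byte order without padding.
HEADER = struct.Struct(">HBBHHII")

SERVER_STATUS_REQUEST = 1
VERSION_REQUEST = 2
RESPONSE_OFFSET = 100  # response Command ID = request Command ID + 100


def pack_message(command, transaction_id, body=b"", major=1, minor=0):
    """Build a message: header followed by an optional body.

    The version default (1.0) is an assumption for illustration only.
    """
    length = HEADER.size + len(body)  # Length covers header and body
    header = HEADER.pack(length, major, minor, command, 0, 0,
                         transaction_id)
    return header + body


def unpack_message(data):
    """Return (header fields, body bytes) for one received message."""
    length, major, minor, command, res1, res2, txn = \
        HEADER.unpack_from(data)
    if res1 != 0 or res2 != 0:
        raise ValueError("reserved header fields must be 0")
    fields = {
        "length": length,
        "version": (major, minor),
        "command": command,
        "transaction_id": txn,
    }
    return fields, data[HEADER.size:length]


def expected_response(request_command):
    """Command ID the server is expected to use in its response."""
    return request_command + RESPONSE_OFFSET


if __name__ == "__main__":
    message = pack_message(SERVER_STATUS_REQUEST, transaction_id=7)
    print(message.hex())
    print(unpack_message(message))
    print(expected_response(SERVER_STATUS_REQUEST))
```

   A client would write the packed bytes to its TCP connection and match
   the Transaction ID of the reply against the value it sent. Socket
   handling is omitted to keep the sketch short.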
2.5.3 FEATURES_REQUEST - Command ID: 3

   The purpose of this message is for the firewall to request
   information from the content filtering server about supported
   protocols, features, filtering categories, etc. For example, the
   firewall could determine whether or not the server supports
   filtering for the FTP protocol.

   [This message is reserved for future use and is not currently
   implemented.]

2.5.4 URL_LOOKUP_REQUEST - Command ID: 4

   The purpose of this message is for the firewall to determine if a
   URL should be filtered. The URL_LOOKUP_REQUEST is composed of a
   Message Header Block with a command identifier of
   URL_LOOKUP_REQUEST, followed by a Message Body Block as shown below.
   The database responds with a URL_LOOKUP_RESPONSE message.

   URL_LOOKUP_REQUEST Message Body Block

                          Field Size
   Field Name             (bytes)     Description
   ---------------------  ----------  -------------------------------
   Protocol               short(2)    A value indicating the protocol
                                      type: HTTP, FTP, NNTP, etc. The
                                      value used is the port number
                                      for the protocol from RFC 1700.
   ---------------------  ----------  -------------------------------
   URL string length      short(2)    The length of the following URL
                                      string in bytes.
   ---------------------  ----------  -------------------------------
   Source address         long(4)     The IP address of the original
                                      client host (not the Firewall)
                                      requesting the URL.
   ---------------------  ----------  -------------------------------
   URL string             char(var)   The requested URL. This string
                                      is null terminated.
   ---------------------  ----------  -------------------------------

2.5.5 SERVER_STATUS_RESPONSE - Command ID: 101

   The server status response is composed of the standard message
   header and the following message body:

                          Field Size
   Field Name             (bytes)     Description
   ---------------------  ----------  -------------------------------
   Status code            short(2)    The server status code.
   ---------------------  ----------  -------------------------------
   Status msg length      short(2)    The length of the server status
                                      message, or 0 if no message.
   ---------------------  ----------  -------------------------------
   License count          long(4)     The number of workstations
                                      licensed to use the database.
   ---------------------  ----------  -------------------------------
   Reserved               long(4)     Reserved, must be 0.
   ---------------------  ----------  -------------------------------
   Reserved               long(4)     Reserved, must be 0.
   ---------------------  ----------  -------------------------------
   Message String         char(var)   A status message string
                                      corresponding to the status
                                      code. The string is null
                                      terminated.
   ---------------------  ----------  -------------------------------

2.5.6 VERSION_RESPONSE - Command ID: 102

   This message has no message body block. It is the server's reply to
   a VERSION_REQUEST and is simply a Message Header Block with a
   command identifier of VERSION_RESPONSE.

2.5.7 FEATURES_RESPONSE - Command ID: 103

   This message has no message body block. It is the server's reply to
   a FEATURES_REQUEST and is simply a Message Header Block with a
   command identifier of FEATURES_RESPONSE.

   [This message is reserved for future use and is not currently
   implemented.]

2.5.8 URL_LOOKUP_RESPONSE - Command ID: 104

   The URL lookup response is composed of a Message Header Block and
   the message body below. When a URL is to be blocked, the database
   server must return a lookup code greater than 0 and zero or more of
   the following items:

   a. An optional text message describing why the URL was blocked,
      suitable for display on the client browser.

   b. An optional "redirected" (replacement) URL.

   c. An optional rating label or category.

                          Field Size
   Field Name             (bytes)     Description
   ---------------------  ----------  -------------------------------
   Lookup code            short(2)    Zero if the request should not
                                      be blocked. Non-zero lookup
                                      codes indicate that a request
                                      should be blocked, the license
                                      has been exceeded, or an error
                                      occurred.
   ---------------------  ----------  -------------------------------
   Rating label length    short(2)    Zero if no rating label or
                                      category found. Otherwise,
                                      length of the rating label
                                      and/or category string in
                                      bytes.
   ---------------------  ----------  -------------------------------
   Rating label offset    short(2)    Offset from the beginning of
                                      this structure to the rating
                                      label and/or category string.
   ---------------------  ----------  -------------------------------
   Message label length   short(2)    Zero if no rating message.
                                      Otherwise, length of an HTML
                                      formatted message for display
                                      on the client's browser. This
                                      string is null terminated.
   ---------------------  ----------  -------------------------------
   Message label offset   short(2)    Offset from the beginning of
                                      this structure to the rating
                                      message string.
   ---------------------  ----------  -------------------------------
   Redirect URL length    short(2)    Zero if no redirected URL.
                                      Otherwise, length of a
                                      redirected URL string for
                                      display on the client's
                                      browser. This string is null
                                      terminated.
   ---------------------  ----------  -------------------------------
   Redirect URL offset    short(2)    Offset from the beginning of
                                      this structure to the
                                      redirected URL string.
   ---------------------  ----------  -------------------------------
   Reserved               short(2)    Reserved, must be 0.
   ---------------------  ----------  -------------------------------
   Rating label string    char(var)   The rating label and/or
                                      category for the requested URL,
                                      if available. This string is
                                      null-terminated. This string is
                                      only present if the rating
                                      label string length is > 0.
   ---------------------  ----------  -------------------------------
   Message string         char(var)   For blocked URLs, HTML
                                      formatted text for display on
                                      the client's browser. This
                                      string is null-terminated. This
                                      string is only present if the
                                      message string length is > 0.
   ---------------------  ----------  -------------------------------
   Redirected URL string  char(var)   For blocked URLs, a redirected
                                      URL for the client's browser.
                                      This string does not need to be
                                      null-terminated. This string is
                                      only present if the redirected
                                      URL string length is > 0.
   ---------------------  ----------  -------------------------------

3. Character Encoding

   All character strings in CFP are UTF-8 encoded, null terminated, and
   accompanied by their length (in bytes).

4. Security Considerations

   Using the content filtering protocol allows the content filtering
   database server to be safely located behind the firewall.

   Alternatively, the content filtering database server can also be
   located on the Internet side of the firewall, as shown below:

   < Internet >
        ^^      ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
        ||            ~ ~ ~ ~ ~ ~ ~ ~
        vv                +===========+
   +==========+           |  Content  |
   | Firewall | < ===== > | Filtering |
   |  System  |           | Database  |
   +==========+           |  Server   |
        ^^                +===========+
   ~ ~  ||  ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
        vv
   < Firewall protected resources >

   Figure 3, Content filtering database located outside the firewall

   In this configuration, the content filtering database works as a
   proxy server for all content requests sent to and retrieved from the
   Internet. The principal advantage here is that many firewalls and
   content servers can communicate with each other without any special
   protocol. The firewall treats the content server as a super-firewall
   (operating in a firewall-within-a-firewall mode) and the content
   server accepts or rejects content as per its configuration. This
   configuration is very easy to implement.

   The principal disadvantage of the above configuration is that the
   content filtering database server is not protected by the firewall.
   One other disadvantage is that the firewall management system is not
   able to communicate with the database server, other than to send and
   retrieve content requests.

5. Acknowledgements

   The author would like to thank the following people for their
   support and feedback on this specification:

      The AltaVista Security Team, Compaq Computer Corporation
      Steve Shannon, The Content Advisor
      Chao Yu, Log-On Data Corporation
      Myrna, Olivia, Sander and Maxine Gutfreund

6. Author's Address

   Please address all comments to:

      Keith Gutfreund
      AltaVista Internet Software
      Compaq Computer Corporation
      550 King Street
      Littleton, MA 01460

      Email: keith.gutfreund@compaq.com
      Phone: (978) 506-2147

   Document Expiration Date: November 10, 1999