INTERNET DRAFT                                       Srinivas Mantripragada
Category: Informational                                        NetContinuum
Title: draft-srinivas-wat-00.txt                            Prasad Vellanki
Date: December 1, 2003                                         NetContinuum
Expires: June 1, 2004                                         Sridhar Raman
                                                               NetContinuum
                                                            Venkata Nambula
                                                               NetContinuum


                         Web Address Translation (WAT)


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering Task
   Force (IETF), its areas, and its working groups.  Note that other groups
   may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at
   any time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at:

      http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at:

      http://www.ietf.org/shadow.html.

   Copyright Notice

     Copyright (C) The Internet Society (2003).  All Rights Reserved.

Abstract

   This draft specifies a Web Address Translation (WAT) mechanism. The
   scheme allows a user to hide or rewrite backend (internal) domain
   addresses. It is based on a suite of URL translation schemes that do
   not require any of the backend application servers to be
   reconfigured, which makes it possible to host multiple applications
   on different virtual servers behind a single domain. WAT is analogous
   to NAT, except that it operates at the web application layer and
   moves web address translation up into the network rather than
   performing it only in the web servers at the end point.


Srinivas			Expires May 24, 2004		[Page 1]

Internet-Draft                          WAT                     Nov 2003


Table of Contents

   1.0  Introduction
   2.0  Terminology
   3.0  WAT Implementation
        3.1  Website Cloaking
             3.1.1 Status
             3.1.2 Suppress Return Code
             3.1.3 Filter Response Header
             3.1.4 Headers
        3.2  URL Translations
             3.2.1 Status
             3.2.2 External URL
             3.2.3 External Domain
             3.2.4 Internal URL
             3.2.5 Internal Domain
        3.3  URL Rewrite
             3.3.1 Status
             3.3.2 Matching Rule
             3.3.3 Sequence Number
             3.3.4 Action
                   3.3.4.1 Insert Header
                   3.3.4.2 Remove Header
                   3.3.4.3 Replace Header
                   3.3.4.4 Rewrite URL
                   3.3.4.5 Redirect URL
             3.3.5 Header
             3.3.6 Continue Processing other Rewrites
             3.3.7 New Value
   4.0  References
   5.0  Authors
   6.0  Full Copyright Statement

1.0  Introduction

   Enterprises are actively migrating business applications to web
   technologies to improve access and control costs. At the same
   time, the threat of attack is growing exponentially, with the
   majority of attacks now exploiting application-layer vulnerabilities.
   While traditional network firewalls address network access control,
   blocking unauthorized network-level requests, application firewalls
   address the application layer by enforcing security policies within
   application sessions. An application firewall specifically protects the
   Web application communication stream and all associated application 
   resources from attacks that happen via the Web protocol.

   The logical place to add this protection is at the corporate edge,
   where the traditional firewall currently sits. A major portion of web
   attacks works by tampering with protocol-compliant HTTP URLs and
   header fields. One of the prerequisites of a true web application
   firewall is therefore URL and header protection. In this draft, we
   propose a Web Address Translation (WAT) scheme that can be
   implemented as a standard mechanism to provide URL and header
   protection at the edge. The key highlights of the WAT implementation
   scheme include:

   (1) Ability to hide the internal structure of a company's web site.
   (2) Create a homogeneous and consistent URL layout over all WWW
       servers within an Intranet Web cluster.
   (3) Give the WWW namespace a consistent server-independent layout.
   (4) Provide a consistent URL translation mechanism by which:
       (4.1) Exported URLs do not need to bind to any physically
             correct target server.
       (4.2) Applications do not need to be altered to work outside
             the firewall.

   The authors regard WAT as a natural extension of NAT (RFC 1631), with
   a different goal in mind. NAT lets hosts with IP addresses in a
   public (or external) network communicate with hosts with IP addresses
   in a private (or internal) network, and vice versa. NAT works by
   using the several million private addresses that have been set aside
   by the Internet Engineering Task Force: traffic sent to a public IP
   address such as 192.156.136.22 is translated to a private address
   such as 10.0.0.4 for delivery to a user's PC. Private IP addresses
   cannot be "seen" from the Internet and may therefore be reused by
   different enterprise networks. WAT adopts a similar philosophy and
   proposes a series of techniques to translate URLs and headers that
   are globally visible in the WWW namespace into a private URL
   namespace that is not visible to the external world. The specifics of
   the WAT implementation are described in the following sections.


2.0  Terminology

   The following terms are used in the rest of the document.

   2.1 Network Address Translation (NAT)

   The term NAT in this document refers to translation of a private/internal
   IP address to a public/external IP address and vice-versa.

   2.2 Uniform Resource Identifier (URI)

   The W3C's codification of the name and address syntax of present and
   future objects on the Internet. In its most basic form, a URI consists 
   of a scheme name (such as file, http, ftp, news, mailto, gopher) 
   followed by a colon, followed by a path whose nature is determined by 
   the scheme that precedes it (see RFC 1630). URI is the umbrella term 
   for URNs, URLs, and all other Uniform Resource Identifiers.

   2.3 World Wide Web (WWW)

   The World Wide Web is a collection of information servers linked
   together through hypertext. Selecting a hypertext link on one page
   may take the reader to a different server halfway around the world.

   2.4 Uniform Resource Locator (URL)

   The World Wide Web address of a site on the Internet.

   2.5 Hypertext Reference (HREF)

   An HTML attribute that sets the URL of the object being referenced.
   It is used in many tags, most commonly the <A> tag.

3.0  WAT Implementation

   The proposed WAT implementation is split into three main techniques:
   (1) Website Cloaking
   (2) URL Translations
   (3) URL Rewrite (Request and Response)

   3.1 Website Cloaking

       Website cloaking is a method of concealing enterprise web
       resources from hackers and worms scanning for vulnerabilities.
       Almost every successful attack is preceded by probing websites
       for weaknesses. Readily available tools on the Internet such as
       Whisker, Nessus and Nikto make it easy for potential intruders to
       scan any website, determining exactly how applications were
       built, what kind of servers they are running on, and which URLs
       contain vulnerabilities. Worms such as Code Red automatically
       scan the Internet for specific server types with known
       vulnerabilities in order to launch an attack.

       In the proposed implementation, website cloaking hides URL return
       codes, HTTP headers and backend IP addresses. As a result, an
       outside observer cannot see which web servers, application
       servers, operating systems and patches are running on the
       protected web sites, or how their directories are structured.

       The implementation details follow:

       3.1.1 Status

       This parameter is used to enable or disable this policy.

       3.1.2 Suppress Return Code

       When enabled, this parameter blocks the return of HTTP error
       status codes in a response. A server returns these codes when
       there is a problem with the client request or with the Web server
       itself. Two classes of error codes are suppressed:

       . 4xx (client): These are "400-series" error codes. They are
         intended for cases where the client seems to have erred when
         attempting to access a Web page. For example, "404 Not Found".

       . 5xx (server): These are "500-series" error codes. They indicate
         that the Web server is aware that it has encountered a problem
         or that it is incapable of performing the request. For example,
         "500 Internal Server Error".

       With these codes suppressed, weaknesses in the infrastructure are
       also concealed, since an attacker cannot tell whether a problem
       lies with the client or with the Web server.
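
       As an illustration only, the suppression could be sketched as
       shown here. The text does not say what the client receives in
       place of the original code; this sketch assumes that every 4xx or
       5xx response is collapsed into one generic response. The function
       name and the GENERIC_RESPONSE value are hypothetical.

```python
# Assumption: one generic response replaces every error response.
GENERIC_RESPONSE = (404, b"Not Found")


def suppress_return_code(status, body, enabled=True):
    """Return the (status, body) pair that is sent on to the client.

    When the policy is enabled, any 4xx or 5xx response is replaced by
    GENERIC_RESPONSE, so the client cannot distinguish a client error
    from a server error. Other responses pass through unchanged.
    """
    if enabled and 400 <= status <= 599:
        return GENERIC_RESPONSE
    return status, body
```

       With this sketch, a 403 and a 500 from the backend look identical
       to the client, while a 200 response is left untouched.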

       3.1.3 Filter Response Header

       When enabled, this parameter filters a specific HTTP header out
       of a response. The header to be filtered is defined with the
       "Headers" parameter (3.1.4).

       3.1.4 Headers

       Expects a string input. This parameter defines the banner header
       that is to be filtered out of a response.
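
       A minimal sketch of header filtering, under the assumption that
       response headers are available as a list of (name, value) pairs.
       The function name and the sample header names in the usage note
       are illustrative and not part of this draft.

```python
def filter_response_headers(headers, headers_to_filter):
    """Drop every response header whose name is configured for filtering.

    headers: list of (name, value) pairs received from the server.
    headers_to_filter: names configured under "Headers" (3.1.4).
    HTTP header names are case-insensitive, so names are compared
    in lowercase.
    """
    blocked = {name.lower() for name in headers_to_filter}
    return [(name, value) for (name, value) in headers
            if name.lower() not in blocked]
```

       For example, configuring "Server" and "X-Powered-By" would strip
       the two banner headers that most commonly reveal server software
       while leaving headers such as Content-Type intact.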


   3.2 URL Translations

   When a Web site sends a page to a user, it typically includes a
   variety of embedded references to other objects on the site. If the
   references are relative, meaning that they do not include the name of
   the server (/content.html), rather than absolute
   (http://www.example.com/content.html), there is no problem.

   However, most Web sites do embed absolute links, and two problems
   arise. The first frequently occurs where a proxy is performing SSL
   acceleration. When links embedded in the document are prefixed with
   "http" instead of "https", user clicks go to the unencrypted pages,
   which are sometimes delivered without question and sometimes simply
   do not work. The URL translation mechanism should allow parsing the
   response and rewriting "http" to "https".
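
   The "http" to "https" rewrite can be sketched as follows. This is a
   simplified illustration that assumes the response body is text and
   that only links to a configured list of hosts are upgraded; the
   function name and the host list are hypothetical. Links that carry an
   explicit port are deliberately left alone.

```python
import re

# A host name must be followed by the end of the authority part;
# otherwise "www.example.com" would also match "www.example.com.evil.org".
HOST_END = r"""(?=[/"'\s>?#]|$)"""


def upgrade_links(body, hosts):
    """Rewrite absolute http:// links to https:// for the listed hosts."""
    if not hosts:
        return body
    alternatives = "|".join(re.escape(host) for host in hosts)
    pattern = re.compile(r"http://(" + alternatives + r")" + HOST_END,
                         re.IGNORECASE)
    return pattern.sub(r"https://\1", body)
```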

   The second problem occurs when a proxy's domain name is different from 
   the server's name - for example, a server named server.example.com and 
   a proxy called www.example.com. Applications that look to the host name 
   might end up embedding links such as 
   http://server.example.com/content.html when they should say
   http://www.example.com/content.html.


   JavaScript and HTTP cookies compound the problem. JavaScript-driven
   pages often assemble URLs dynamically on the client side, and HTTP
   cookies are set by the server such that the client only sends them
   back when communicating with the server directly, not through a
   proxy.

   In most cases, site administrators lack the resources to change the
   applications to fix the problem. What is needed instead is a
   rewriting/mapping of incorrect URLs to the correct form. The
   rewriting/mapping has to happen both for links sent from the server
   to the client and for HTTP requests from the client to the server.

   URL translation can rewrite URLs embedded within HTML, DHTML, XHTML,
   Cascading Style Sheets, JavaScript, HTTP cookies and Flash.

   A link that once appeared as

   http://intranet.company.com/content.html

   will now appear as
   https://proxy.company.com/prx/000/http/intranet.company.com/content.html

   The URL translation occurs such that everything is syntactically and
   semantically correct.
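
   The mapping in the example above can be sketched as follows. The
   meaning of the "/prx/000" path element is not defined in this draft,
   so the sketch treats it as an opaque marker copied from the example;
   the constant and function names are hypothetical.

```python
from urllib.parse import urlsplit

PROXY_BASE = "https://proxy.company.com"  # proxy origin from the example
PREFIX = "/prx/000"                       # opaque marker from the example


def to_external(internal_url):
    """Encode an internal absolute URL as a URL on the proxy.

    The original scheme and host become path elements behind PREFIX,
    so the proxy can later recover the internal URL from the request.
    """
    parts = urlsplit(internal_url)
    path = parts.path or "/"
    external = f"{PROXY_BASE}{PREFIX}/{parts.scheme}/{parts.netloc}{path}"
    if parts.query:
        external += "?" + parts.query
    if parts.fragment:
        external += "#" + parts.fragment
    return external
```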

   Step 1: The user requests https://www.example.com.

   Step 2: The server responds with content that includes the link:

   http://server.example.com/images/logo.jpg

   Step 3: URL translation rewrites the outgoing response and sends it
           to the user.

   Step 4: The user requests:

   https://www.example.com/prx/000/http/server.example.com/images/logo.jpg

   Step 5: URL translation rewrites the incoming request and the server
           receives:

   http://server.example.com/images/logo.jpg

   By performing the above operations, the server doesn't realize that the
   content was modified in any way. This helps provide application
   security.
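
   The request-side rewrite of the last step can be sketched as follows,
   assuming the "/prx/000/" path layout used in the earlier example. The
   allowed_hosts check is an addition of this sketch, not something the
   draft specifies: without it, the proxy would forward requests to any
   host named in the path. All names are hypothetical.

```python
PREFIX = "/prx/000/"  # opaque marker taken from the earlier example


def to_internal(request_target, allowed_hosts):
    """Recover the internal URL from a proxied request path.

    Returns None when the path is not in proxied form or names a host
    that is not in allowed_hosts (lowercase host names).
    """
    if not request_target.startswith(PREFIX):
        return None
    scheme, _, rest = request_target[len(PREFIX):].partition("/")
    host, _, path = rest.partition("/")
    if scheme not in ("http", "https") or host.lower() not in allowed_hosts:
        return None
    return f"{scheme}://{host}/{path}"
```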

   The implementation details follow:

     3.2.1 Status

     Expects a Boolean input [Yes/No]. The parameter is used to enable or
     disable this feature.

     3.2.2 External URL

     Expects a string input. The External URL should be a publicly
     exported URL in the WWW namespace and has to be unique. An empty
     value means that no translation is performed on this external URL.
     Requests coming from the client that match the input string are
     mapped to a unique URL translation rule. The domain part of the
     outgoing requests is rewritten back with the input value. The
     string "*" means rewrite all absolute URLs in the response data.
     The domain can be a suffix pattern or a simple string.

     3.2.3 External Domain

     Expects a string input. The External Domain should be the publicly
     exported domain in the WWW namespace and has to be unique, for
     example www.mysite.com or www.mydomain.com.

     3.2.4 Internal URL

     Expects a string input. Internal URL should always start with a '/'
     character and should be locally visible.

     3.2.5 Internal Domain

     Expects a string input. Internal Domain represents the local namespace
     server or IP address that is not visible (or exported) to the external
     user.

     3.2.6 Example

     The following example configuration translates a URL that is
     internally mounted as /bugzilla to the externally visible URL
     http://www.mydomain.com/bugs. As a result, the internally mounted
     URL is invisible to the external user.

     http://www.mydomain.com/bugs => /bugzilla

     Name:              bugs
     Status:            On/Off
     External URL:      /bugs
     External Domain:   www.mydomain.com
     Internal URL:      /
     Internal Domain:   bugzilla
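
     One way to read the configuration above is sketched below: a
     request whose host equals the External Domain and whose path starts
     with the External URL is forwarded to the Internal Domain, with the
     External URL prefix replaced by the Internal URL. The class, field
     and function names are hypothetical, and matching on path-segment
     boundaries is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class TranslationRule:
    name: str
    enabled: bool          # Status
    external_url: str      # for example "/bugs"
    external_domain: str   # for example "www.mydomain.com"
    internal_url: str      # for example "/"
    internal_domain: str   # for example "bugzilla"


def translate_request(rule, host, path):
    """Map an external (host, path) to an internal one, or return None."""
    if not rule.enabled or host.lower() != rule.external_domain.lower():
        return None
    prefix = rule.external_url.rstrip("/")
    # Match only on a segment boundary: /bugs and /bugs/x, but not /bugsy.
    if path != prefix and not path.startswith(prefix + "/"):
        return None
    remainder = path[len(prefix):].lstrip("/")
    internal_path = rule.internal_url.rstrip("/") + "/" + remainder
    return rule.internal_domain, internal_path
```

     With the values from the example, a request for
     http://www.mydomain.com/bugs/show_bug.cgi would be forwarded to the
     host "bugzilla" with the path /show_bug.cgi (the file name is made
     up for illustration).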


   3.3 URL Rewrite

   The WAT implementation proposes URL rewrite for both incoming requests
   and outgoing responses. The specific implementation details follow:

     3.3.1 Status
     This parameter is used to enable or disable this feature.


     3.3.2 Matching Rule

     Expects a string input. The rule can be a regular expression or a
     prefix-suffix pattern, and multiple rules can be specified. The
     pattern is used to match the URL or the Header, as determined by
     the Action field below.

     3.3.3 Sequence Number

     Expects a non-negative value. The number specifies the order in which
     the matching rules as specified in 3.3.2 need to be processed.

     3.3.4 Action

     The Action field specifies the operation to be performed once the
     rule is matched. Actions apply only to Header and URL field items
     and are listed below:

           3.3.4.1 Insert Header
                   The matching rule specified in 3.3.2 applies to the
                   Header field, for both incoming requests and
                   outgoing responses. If the rule matches, insert a
                   header field whose value is given in the "New Value"
                   field (3.3.7).

           3.3.4.2 Remove Header
                   The matching rule specified in 3.3.2 applies to the
                   Header field, for both incoming requests and
                   outgoing responses. If the rule matches, remove the
                   header field.

           3.3.4.3 Replace Header
                   The matching rule specified in 3.3.2 applies to the
                   Header field, for both incoming requests and
                   outgoing responses. If the rule matches, replace the
                   old header value with the new value specified in
                   3.3.7.

           3.3.4.4 Rewrite URL
                   The matching rule specified in 3.3.2 applies to the
                   URL field, for incoming requests only. If the rule
                   matches, rewrite the URL with the new URL specified
                   in 3.3.7.

           3.3.4.5 Redirect URL
                   The matching rule specified in 3.3.2 applies to the
                   URL field, for incoming requests only. If the rule
                   matches, redirect the request to a new location. The
                   new URL is specified in 3.3.7.
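
     A sketch of how four of these actions might be applied to a single
     message. It assumes that header rules are matched, as regular
     expressions, against the current value of the header named in the
     Header field (3.3.5), and that URL rules are matched against the
     request URL. Insert Header is left out because the text does not
     say what a rule is matched against when the header is absent. All
     function, parameter and action names are hypothetical.

```python
import re


def apply_action(action, pattern, header, new_value, url, headers):
    """Apply one rule and return (url, headers, redirect_to).

    headers is a dict of header name -> value; a modified copy is
    returned. redirect_to is None unless a redirect rule matched.
    """
    headers = dict(headers)
    redirect_to = None
    if action in ("remove_header", "replace_header"):
        value = headers.get(header)
        if value is not None and re.search(pattern, value):
            if action == "remove_header":
                del headers[header]
            else:
                headers[header] = new_value
    elif action in ("rewrite_url", "redirect_url"):
        if re.search(pattern, url):
            if action == "rewrite_url":
                url = new_value          # forwarded to the server
            else:
                redirect_to = new_value  # sent back to the client
    else:
        raise ValueError(f"unsupported action: {action}")
    return url, headers, redirect_to
```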

     3.3.5 Header

     Expects a string input. Specifies the header field that is to be
     matched and on which the action specified in 3.3.4 is to be taken.

     3.3.6 Continue Processing other Rewrites

     Expects a boolean input [Yes/No]. Provides an option for the rewrite
     engine to stop after the first match or continue processing all the 
     rules specified.
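
     The interaction between the Sequence Number (3.3.3) and this option
     can be sketched as follows, using URL rewrite rules only. The
     sketch assumes that lower sequence numbers are processed first and
     that the option is set per rule; neither is stated explicitly in
     this draft. The dictionary keys are hypothetical names.

```python
import re


def run_rewrites(url, rules):
    """Apply URL rewrite rules in sequence-number order.

    Each rule is a dict with the keys "sequence", "pattern",
    "new_value" and "continue_processing". Processing stops after the
    first matching rule whose "continue_processing" is False.
    """
    for rule in sorted(rules, key=lambda r: r["sequence"]):
        if re.search(rule["pattern"], url):
            url = rule["new_value"]
            if not rule["continue_processing"]:
                break
    return url
```

     Because later rules see the URL produced by earlier ones, allowing
     processing to continue lets rules be chained, while stopping at the
     first match keeps rules independent of each other.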

     3.3.7 New Value

     Expects a string input. This specifies the new value used by the
     action specified in 3.3.4.

4.0  References

[NAT]      Egevang, K. and P. Francis, "The IP Network Address
           Translator (NAT)", RFC 1631, May 1994.

[NAT-TERM] Srisuresh, P. and M. Holdrege, "IP Network Address
           Translator (NAT) Terminology and Considerations", RFC
           2663, August 1999.


5.0  Authors

Srinivas Mantripragada
1705 Wyatt Drive
Santa Clara, CA 95054 USA
Phone: 408-961-5600
Fax: 408-986-8997
Email: srinivas@netcontinuum.com


6.0  Full Copyright Statement

     Copyright (C) The Internet Society (2003). All Rights Reserved.

     This document and translations of it may be copied and furnished to
     others, and derivative works that comment on or otherwise explain it
     or assist in its implementation may be prepared, copied, published
     and distributed, in whole or in part, without restriction of any
     kind, provided that the above copyright notice and this paragraph are
     included on all such copies and derivative works. However, this
     document itself may not be modified in any way, such as by removing
     the copyright notice or references to the Internet Society or other
     Internet organizations, except as needed for the purpose of
     developing Internet standards in which case the procedures for
     copyrights defined in the Internet Standards process must be
     followed, or as required to translate it into languages other than
     English.

     The limited permissions granted above are perpetual and will not be
     revoked by the Internet Society or its successors or assignees.


     This document and the information contained herein is provided on an
     "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
     TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
     BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
     HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
     MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

