SIP                                                          C. Jennings
Internet-Draft                                             Cisco Systems
Expires: January 16, 2006                                  July 15, 2005


            Computational Puzzles for SPAM Reduction in SIP
                     draft-jennings-sip-hashcash-02

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 16, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   One of the techniques used in SPAM prevention and various solutions
   for denial of service attacks is to force the SIP client requesting a
   service to perform a calculation that limits the rate and increases
   the cost of the request.  This draft defines a way to allow a UAS to
   ask the UAC to compute a computationally expensive hash based
   function and present the result to the UAS.  Although the computation
   is expensive for the UAC to compute, it is cheap for the UAS to
   verify.  The solution also allows for proxies to compute and check
   the puzzle on behalf of the UAC or UAS.


Jennings                Expires January 16, 2006                [Page 1]

Internet-Draft          SIP Puzzles Against SPAM               July 2005


1.  Introduction

   The SPAM prevention problem is complex and will require many
   techniques working in combination to balance reducing SPAM to
   acceptable levels while still fostering efficient communication.  The
   overall problem and various approaches are in [7].  Clearly white
   lists are a critical part of dealing with SPAM.  Any system would
   first check whether an incoming request for communications was from
   someone on the white list.  The Identity [6] mechanisms are critical
   for understanding who the caller is and to check whether the caller
   is on the white list.  As well, there still needs to be a way for
   callers not on the white list to communicate with the user.  It is
   here that this specification becomes relevant.

   The problem is how to permit contacts from people with no previous
   relationship to us without receiving undesirable contacts.  This
   draft uses the idea that it may be possible to make undesirable
   contacts more expensive than desirable ones.

   Different undesirables are willing to spend different amounts of time
   and money on contacting their markets.  Founders of acquired startups
   are often contacted by random financial companies offering to help
   manage the new riches.  These companies will send people from New
   York to San Jose and spend hours talking to this very narrow target
   market.  Clothing retailers will mail glossy catalogues worth $1
   apiece to houses within the right demographic zip codes.  Emails
   advertising Viagra are sent to random email addresses.  As the costs
   go down, the volume of unsolicited contact goes up.

   Often people whose contact is desirable are willing to spend much
   less than some of the undesirables.  The student in Fiji who wants to
   ask about this draft will send an email but probably will not fly
   here to talk to me.  I would like to receive that email.

   Increasing the cost of contact will reduce both desirable and
   undesirable contact.  My assumption is that the cost should be set
   very low, so that even a person with a pathetic CPU could still make
   contact in, say, 10 seconds.  Key to this draft is that the receiver
   can set this cost.  This low cost will not stop the financial
   advisers or the telemarketers, but it might stop the Viagra ads.  It
   would also probably stop a single user from ringing every phone of
   some residential service provider in a five-second window, before any
   operator or system can react.  Deciding what cost to set constitutes
   a classic type I/type II error problem, and the receiver gets to
   choose how to balance these two errors.

   As is clearly stated in [7], white lists are the best defense.  After
   that, this is one of several other options that need consideration.

   In general there are two arguments about why the computation puzzles
   in this specification will not work.  The first is that the bad guys
   have the most powerful CPUs.  This issue was addressed above.  The
   other argument is that bad guys have infinite CPU time through using
   armies of zombie PCs.  The problem with this argument is that the
   goal is not to block particular bad guys but to reduce the overall
   number of undesirable messages.  This second argument is, however,
   more worrisome than the first.

   Assume that some percentage of the world's machines get owned and
   used as zombies each year.  Let's say that a given machine has a 1%
   chance of this happening to it in a year, that it sends zombie
   traffic for 24 hours before getting shut down, and that the mechanism
   described here limits it to ten messages per second: each machine on
   the internet would then receive an average of about one undesirable
   message per hour.  If you assume there are more users than machines,
   this looks appealing.  If message sending technology detects users
   that are sending lots of messages and shuts them down in less than 24
   hours, it gets better.  It gets better still if you hope for
   improvements in operating systems or for users to choose them more
   carefully.  The next assumption is hard to model statistically but it
   is true: the people with the best financial incentives to send
   undesirable messages do not want to be subject to the legal and
   reputation problems of using zombies to get their message across.
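
   The "about one message per hour" figure can be checked with a few
   lines of arithmetic.  The sketch below assumes, as the paragraph
   above does implicitly, that every message a zombie sends lands on
   some machine and that the load is spread evenly; the variable names
   are illustrative only.

```python
# Inputs taken from the paragraph above.
infection_probability = 0.01      # chance a machine becomes a zombie in a year
active_seconds = 24 * 60 * 60     # a zombie sends for 24 hours before shutdown
messages_per_second = 10          # rate limit imposed by the puzzle

hours_per_year = 365 * 24

# Average messages sent per machine per year, over the whole population.
sent_per_machine_per_year = (
    infection_probability * active_seconds * messages_per_second
)

# Assumption: messages sent equal messages received, spread evenly.
received_per_machine_per_hour = sent_per_machine_per_year / hours_per_year

print(round(received_per_machine_per_hour, 2))
```

   The result is 8640 messages per machine per year, which is slightly
   under one message per hour.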

   The zombie problem basically comes down to this.  If only a small
   percentage of the machines in the world are zombies, they do not
   render this computation puzzle approach useless.  If 10% of the
   machines in the world are zombies, this approach will be useless.
   This specification does not attempt to deal with how to make the
   world such that only a small percentage of computers are zombies -
   that is a problem for other work, and that work needs to happen for
   SPAM to be reduced to a reasonable level.  This specification does
   assume that the zombie problem is solved to the level where a small
   percentage of the world's computers are zombies.

   So in summary, white listing is the first and best defense.  But for
   dealing with messages from people with whom we have no previous
   direct or indirect relationship, another approach is necessary.
   Hashcash cannot stop all bad messages - that is not the goal - but it
   can raise the cost of messages and thus decrease the number of times
   it makes economic sense to send undesirable ones.  This approach does
   assume that bad guys will have more CPU power than good guys and that
   zombies will still send lots of messages.  This approach will simply
   reduce the number of undesirable messages by some amount that cannot
   be measured.


   No one knows if this approach would reduce SPAM noticeably.  Right
   now the only thing that limits the rate at which I can call every SIP
   phone in the world is proxies getting overloaded.  Most SIP phones
   are not connected to the public internet, and the SPAM problem is one
   reason why.  There are some other approaches outlined in [7].  They
   have different pros and cons, and it is probably necessary to use
   most of them to ensure SPAM stays at an acceptable level.

2.  Overview

   This specification extends RFC 3261 [3] and defines a mechanism for a
   proxy or UAS to request that a UAC compute the solution to a puzzle.
   The puzzle is based on finding a value called the pre-image that,
   when hashed with SHA1 [4], results in a specific value referred to as
   the image.  The goal is for the UAC to find a pre-image that will
   SHA1 hash to the correct image.  The UAS provides a partial pre-image
   with some of the low order bits set to zero, together with the number
   of bits in the pre-image that have been set to zero.

   The UAS provides the puzzle information using a 419 response, and the
   UAC resubmits the request along with the solution to the puzzle.  The
   high level flow of information is shown below.


     UAC                        UAS
      |  Request                 |
      |------------------------->|
      |                          |
      |          419 with Puzzle |
      |<-------------------------|
      |                          |
      |  Request with Solution   |
      |------------------------->|
      |                          |

   This specification defines the 419 response code along with a new
   header, called Puzzle, to carry the puzzle and solution.

3.  Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [2].

4.  Puzzles

   The normative definition of a puzzle is as follows.  A puzzle is four
   values: an integer number referred to as work, a pre-image string, an
   image string, and an integer number referred to as value.  There MUST
   exist a value X such that all but the "work" number of low order bits
   of X match the pre-image string, and the SHA1 hash of the string
   formed by the concatenation of "z9hG4bK" and X results in a value Y,
   where the "value" number of low order bits of Y are the same as those
   bits in the image string.  The SHA1 hash is computed as described in
   RFC 3174 [4].  The value X is the solution to the puzzle.  The 'work'
   number of low order bits of the pre-image MUST be zero.

   This can all be described more mathematically.  The notation low(v,x)
   returns the v low order bits of x, and zero(v,x) returns x with
   the low v bits set to zero.  The | operator signifies string
   concatenation.  The solution to the puzzle can be considered finding
   an X such that both the following are true:


    low( value, image ) = low( value, sha1( "z9hG4bK" | X ) )
        zero( work, X ) = zero( work, pre-image )

   The pre-image forms a constraint on X.  X is identical to the
   pre-image except in the low 'work' bits, which are set to zero in
   the pre-image.  The 'value' is the number of bits that match in the
   solution and is typically set to 160, which is the full size of the
   SHA1 hash result.
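
   The two conditions above can be sketched in Python as follows.  This
   is a non-normative illustration.  It assumes that X, the pre-image,
   and the image are 160-bit values handled as big-endian integers, and
   that X is concatenated to "z9hG4bK" as a raw 20-byte string; the
   draft does not spell out these encodings.  The helper names sha1_int
   and is_solution are invented for this sketch.

```python
import hashlib

MAGIC = b"z9hG4bK"  # constant prefix from the definition above


def low(v: int, x: int) -> int:
    """Return the v low order bits of x."""
    return x & ((1 << v) - 1)


def zero(v: int, x: int) -> int:
    """Return x with its v low order bits set to zero."""
    return x & ~((1 << v) - 1)


def sha1_int(x: int, nbytes: int = 20) -> int:
    """SHA1 of MAGIC | X as an integer (X assumed big-endian, 20 bytes)."""
    digest = hashlib.sha1(MAGIC + x.to_bytes(nbytes, "big")).digest()
    return int.from_bytes(digest, "big")


def is_solution(x: int, work: int, pre_image: int,
                image: int, value: int) -> bool:
    """Check both conditions from the text for a candidate X."""
    hash_matches = low(value, image) == low(value, sha1_int(x))
    constraint_holds = zero(work, x) == zero(work, pre_image)
    return hash_matches and constraint_holds
```

   Checking a candidate costs a single SHA1 computation, which is what
   makes verification cheap compared with the search for X.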

   The following is a non-normative way for a UAS or proxy to construct
   a puzzle.  The following strings are concatenated:

   1.  a secret that only this device knows.  This would typically be a
       crypto random string of bits;
   2.  the current time, rounded to the nearest minute;
   3.  the URI of the request, the Call-ID, the From tags, and the
       branch tag for a proxy or the To tag for a UAS.

   The string is hashed with SHA1 to form the pre-image.  The pre-image
   is appended to the string "z9hG4bK", and the SHA1 hash of this is
   computed to get the value of the image.  A value 'work' indicates how
   many bits of the pre-image are to be removed.  The value 'work' could
   be a configurable parameter, or it could be dynamically discovered by
   the software based on how long a hash should take and the speed of
   the computer it was running on.  In the latter case, the resulting
   software would automatically choose larger values of 'work' as
   computers get faster.  The low order 'work' bits of the pre-image are
   set to zero.  The puzzle consists of the chosen value of 'work', the
   pre-image (with the low order bits set to zero), the image, and the
   'value'.  The 'value' would typically be set to 160 as this is the
   size of the SHA1 hash.
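
   A minimal sketch of this construction is shown below.  It is only an
   illustration: the function name make_puzzle, the dictionary layout,
   the way the request fields are joined into one byte string, and the
   big-endian treatment of the pre-image bits are all assumptions made
   for the example, not part of this specification.

```python
import hashlib
import os
import time
from typing import Optional

MAGIC = b"z9hG4bK"


def make_puzzle(secret: bytes, request_fields: bytes, work: int,
                now: Optional[float] = None) -> dict:
    """Build a puzzle following the non-normative steps above."""
    if now is None:
        now = time.time()
    # Item 2: the current time, rounded to the nearest minute.
    minute = int(round(now / 60.0))
    # Items 1 to 3 concatenated, then hashed to form the full pre-image.
    seed = secret + str(minute).encode("ascii") + request_fields
    full = int.from_bytes(hashlib.sha1(seed).digest(), "big")
    # The image is the hash of "z9hG4bK" followed by the full pre-image.
    image = hashlib.sha1(MAGIC + full.to_bytes(20, "big")).digest()
    # Remove 'work' bits by zeroing the low order bits of the pre-image.
    masked = full & ~((1 << work) - 1)
    return {
        "work": work,
        "pre": masked.to_bytes(20, "big"),
        "image": image,
        "value": 160,  # full size of the SHA1 result
    }


if __name__ == "__main__":
    # Hypothetical inputs; a real device would use the actual request.
    secret = os.urandom(20)
    fields = b"sip:bob@example.com|call-id|from-tag|to-tag"
    print(make_puzzle(secret, fields, work=10))
```

   Because the pre-image is derived from the secret, the time, and the
   request itself, a device built this way could in principle recompute
   the puzzle when the request is resubmitted instead of storing it.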


5.  Semantics

5.1  UAS Creating Puzzle

   When a UAS wishes to challenge a request, it MAY create a puzzle,
   encode this puzzle in a Puzzle header field value, and return the
   puzzle in a 419 response.
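
   When the request comes back with a solution, checking it takes one
   hash.  The sketch below shows only that hash check; it assumes X and
   the image are raw byte strings compared as big-endian integers, and
   the name verify_solution is invented here.  How the UAS confirms
   that the image is one it actually issued is a separate matter that
   this sketch does not cover.

```python
import hashlib

MAGIC = b"z9hG4bK"


def verify_solution(x: bytes, image: bytes, value: int) -> bool:
    """Return True if the low 'value' bits of sha1("z9hG4bK" | X)
    equal the low 'value' bits of the image."""
    mask = (1 << value) - 1
    digest = int.from_bytes(hashlib.sha1(MAGIC + x).digest(), "big")
    return (digest & mask) == (int.from_bytes(image, "big") & mask)
```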

5.2  UAC Receiving Puzzle

   When a UAC receives a 419 response, it needs to look at the 'work'
   and 'value' requested and decide whether or not to try to solve this
   puzzle.  This decision can be made based on the programmed policy and
   possibly human input.  The UAC should not tackle a puzzle that will
   take longer than the age of the universe to solve.  If the UAC
   chooses to try to solve the puzzle then it proceeds along the
   following steps:

   1.  Check that the 'work' bottom bits of the pre-image are all zero.
       If they are not, this is an invalid puzzle and the 419 response
       MUST be considered an error response.
   2.  Set Y to low( value, image ).
   3.  Create a loop where X ranges from the value of the pre-image up
       to, but not including, the value of the pre-image plus 2 raised
       to the power of 'work'.
   4.  For each iteration through the loop, check if low( value, sha1(
       "z9hG4bK" | X )) equals Y.  If it does, a solution X has been
       found and the loop can terminate.

   If the loop terminates without a solution being found, the puzzle was
   bad and the 419 response MUST be considered as an error response.
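
   The steps above can be sketched as a brute force search.  This is an
   illustration under assumptions: the pre-image and image are raw byte
   strings interpreted as big-endian integers, and the max_work limit
   is a hypothetical local policy knob standing in for the decision
   about whether the puzzle is worth attempting.

```python
import hashlib
from typing import Optional

MAGIC = b"z9hG4bK"


def solve(work: int, pre: bytes, image: bytes, value: int,
          max_work: int = 24) -> Optional[bytes]:
    """Search for X; return it, or None if the puzzle has no solution."""
    if work > max_work:
        # Hypothetical policy: refuse puzzles that would take too long.
        raise ValueError("puzzle requires more work than policy allows")
    base = int.from_bytes(pre, "big")
    # Step 1: the low 'work' bits of the pre-image must all be zero.
    if base & ((1 << work) - 1):
        raise ValueError("invalid puzzle: low 'work' bits are not zero")
    # Step 2: Y is the low 'value' bits of the image.
    value_mask = (1 << value) - 1
    y = int.from_bytes(image, "big") & value_mask
    # Steps 3 and 4: try every setting of the low 'work' bits.
    for x in range(base, base + (1 << work)):
        candidate = x.to_bytes(len(pre), "big")
        digest = hashlib.sha1(MAGIC + candidate).digest()
        if (int.from_bytes(digest, "big") & value_mask) == y:
            return candidate
    # Loop ended without a match: treat the 419 as an error response.
    return None
```

   On average the loop examines about half of the 2^work candidates
   before it finds X, so each extra bit of 'work' doubles the expected
   cost to the UAC.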

   Once the solution to the puzzle, X, is found, a new request is formed
   by copying the old request and adding an additional Puzzle header
   field value.  The new Puzzle header field value MUST have the 'work'
   set to 0, the pre-image set to the value X, the image set to the
   value of the image in the original puzzle, and the value parameter
   set to the same as the value parameter in the original puzzle.  Note
   that if a request was challenged by one proxy and a new request was
   generated with a solution, and then this request was challenged by a
   second proxy, a third request would be generated that had two Puzzle
   header field values.  If a UAC, through some out of band mechanism,
   knows that it will be challenged and what the puzzle will be, it MAY
   include the appropriate Puzzle header field value in the initial
   request.

5.3  Proxy Behavior

   SIP allows proxies to act as UASs when generating 4xx responses.
   This same mechanism can be used to allow a proxy to generate the
   challenge on behalf of a UAS in its domain.

   Proxies may also act on behalf of the UAC and compute the solution to
   a puzzle on behalf of the UAC in either a request or a response that
   passes through the proxy.  Typically a proxy would only do this for a
   UAC that had authenticated to the proxy and for which the proxy had a
   service relationship.

6.  Example

   TBD

7.  Syntax

   The Puzzle header field carries the puzzle and solution information.
   It has a parameter called 'work' that has the number of bits of the
   pre-image that have been set to zero for this puzzle.  It has a
   parameter called 'pre' that carries the pre-image string base64
   encoded, and a parameter called 'image' that carries the image string
   base64 encoded.  In addition there is a parameter called 'value' that
   indicates how many bits of the resulting hash will match the 'image'
   string.  The base64 encoding is done as described in RFC 3548 [1].

   When the header field value is carrying a solution to a puzzle, the
   work parameter will be set to zero.

   Example:

       Puzzle: work=10; pre="XPokF1n0+NG6iwRcYzeXuETrtDo=";
               image="XPokF1n0+NG6iwRcYzeXuETrtDo="; value=160

   The ABNF for the header is:

    Puzzle       = "Puzzle" HCOLON puzzle-param *(COMMA puzzle-param)

    puzzle-param =  puzzle-work SEMI puzzle-pre SEMI puzzle-image
                    SEMI puzzle-value *( SEMI generic-param )

    puzzle-work  = "work=" 1*DIGIT
    puzzle-value = "value=" 1*DIGIT
    puzzle-pre   = "pre=" quoted-string
    puzzle-image = "image=" quoted-string
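
   As an illustration of this syntax, the sketch below formats and
   parses a single puzzle-param.  It is not a complete SIP header
   parser: it handles one value in the fixed parameter order given by
   the ABNF, ignores any generic-param, and the function names are
   invented for the example.

```python
import base64
import re

# One puzzle-param in the order work, pre, image, value.  Optional
# whitespace around ";" is tolerated; generic-params are ignored.
_PUZZLE_RE = re.compile(
    r'work=(\d+)\s*;\s*pre="([^"]*)"\s*;\s*image="([^"]*)"\s*;\s*value=(\d+)'
)


def format_puzzle(work: int, pre: bytes, image: bytes, value: int) -> str:
    """Render one Puzzle header line; 'pre' and 'image' are base64."""
    return 'Puzzle: work={}; pre="{}"; image="{}"; value={}'.format(
        work,
        base64.b64encode(pre).decode("ascii"),
        base64.b64encode(image).decode("ascii"),
        value,
    )


def parse_puzzle(header_value: str) -> dict:
    """Parse a single puzzle-param into its four fields."""
    match = _PUZZLE_RE.search(header_value)
    if match is None:
        raise ValueError("not a recognizable Puzzle header value")
    return {
        "work": int(match.group(1)),
        "pre": base64.b64decode(match.group(2)),
        "image": base64.b64decode(match.group(3)),
        "value": int(match.group(4)),
    }


def solution_header(puzzle: dict, x: bytes) -> str:
    """Solution form from Section 5.2: work=0, pre=X, and the image
    and value copied from the original puzzle."""
    return format_puzzle(0, x, puzzle["image"], puzzle["value"])
```

   Parsing the example header value shown earlier yields work 10, value
   160, and two 20-byte strings, which matches the 160-bit size of a
   SHA1 result.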

   This document updates the dreaded Table 2 of RFC 3261 to be:


    Header field         where   proxy   ACK  BYE  CAN  INV  OPT  REG
    ------------         -----   -----   ---  ---  ---  ---  ---  ---
    Puzzle                        amr     o    o    -    o    o    o

                                         SUB  NOT  REF  INF  UPD  PRA
                                         ---  ---  ---  ---  ---  ---
                                          o    o    o    o    o    o


8.  Open Issues and To Do Items

   The current mechanism has poor interaction with the HERFP forking
   problem.  If several endpoints sent a 419, the proxy would need to
   aggregate the results and add something like the realm to the
   challenges to keep them sorted out.  Need to add this in next
   revision.  In many cases the solution would work out better if the
   proxy that was doing the forking applied the policy and did the 419
   before forking.  This approach has the usual HERFP problem that if
   some UAs do a 419, and some UAs don't, the request will only reach
   the UAs that don't do the 419.

9.  Security Considerations

   Still TBD.

   The concatenation with "z9hG4bK" is done so that this mechanism
   cannot be used as a distributed computation to reverse arbitrary hash
   values, as that would present a security risk for other hash based
   security schemes.

   TODO - Advice on selecting the size of 'work'.

10.  IANA

   This specification registers a new header and a new response code.
   IANA is requested to make the following updates in the registry at:
   http://www.iana.org/assignments/sip-parameters

10.1  Puzzle Header

   Add the following entry to the header sub-registry.

     Header Name        compact    Reference
     -----------------  -------    ---------
     Puzzle                        [RFCXXXX]


10.2  419 Response

   Add the following entry to the response code sub-registry under the
   "Request Failure 4xx" heading.

       419  Puzzle Required                      [RFCXXXX]


11.  Acknowledgments

   This approach was motivated by [5].

12.  References

12.1  Normative References

   [1]  Josefsson, S., "The Base16, Base32, and Base64 Data Encodings",
        RFC 3548, July 2003.

   [2]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [3]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
        Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP:
        Session Initiation Protocol", RFC 3261, June 2002.

   [4]  Eastlake, D. and P. Jones, "US Secure Hash Algorithm 1 (SHA1)",
        RFC 3174, September 2001.

12.2  Informational References

   [5]  Back, A., "http://www.hashcash.org/", February 2005.

   [6]  Peterson, J. and C. Jennings, "Enhancements for Authenticated
        Identity Management in the Session Initiation Protocol (SIP)",
        draft-ietf-sip-identity-05 (work in progress), May 2005.

   [7]  Rosenberg, J., "The Session Initiation Protocol (SIP) and Spam",
        draft-ietf-sipping-spam-00 (work in progress), February 2005.


Author's Address

   Cullen Jennings
   Cisco Systems
   170 West Tasman Drive
   MS: SJC-21/2
   San Jose, CA  95134
   USA

   Phone: +1 408 421 9990
   Email: fluffy@cisco.com


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Copyright Statement

   Copyright (C) The Internet Society (2005).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.

