SIP Working Group                                                V. Hilt
Internet-Draft                                                I. Widjaja
Expires: January 7, 2008                        Bell Labs/Alcatel-Lucent
                                                            July 6, 2007


   Essential Correction to the Session Initiation Protocol (SIP) 503
                     (Service Unavailable) Response
                    draft-hilt-sip-correction-503-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 7, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   Overload occurs in the Session Initiation Protocol (SIP) when SIP
   servers have insufficient resources to process all SIP messages they
   receive.  SIP, as specified in RFC 3261, provides the 503 (Service
   Unavailable) response code as a remedy for servers under overload.
   However, the current definition of 503 (Service Unavailable) is
   problematic and can in fact amplify an overload condition.  This
   document proposes an essential correction to RFC


Hilt & Widjaja           Expires January 7, 2008                [Page 1]

Internet-Draft              Overload Control                   July 2007


   3261 that avoids these problems and helps SIP servers to better cope
   with overload.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  3
   3.  Reason for Change  . . . . . . . . . . . . . . . . . . . . . .  4
   4.  Summary of Change  . . . . . . . . . . . . . . . . . . . . . .  5
   5.  Consequences if not approved . . . . . . . . . . . . . . . . .  7
   6.  The Change (Alternative 1) . . . . . . . . . . . . . . . . . .  7
     6.1.  503 Service Unavailable  . . . . . . . . . . . . . . . . .  7
     6.2.  507 Server Overload  . . . . . . . . . . . . . . . . . . .  7
   7.  The Change (Alternative 2) . . . . . . . . . . . . . . . . . .  8
     7.1.  503 Service Unavailable  . . . . . . . . . . . . . . . . .  8
   8.  Security Considerations  . . . . . . . . . . . . . . . . . . .  9
   9.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . .  9
   Appendix A.  Acknowledgements  . . . . . . . . . . . . . . . . . .  9
   10. References . . . . . . . . . . . . . . . . . . . . . . . . . .  9
     10.1. Normative References . . . . . . . . . . . . . . . . . . .  9
     10.2. Informative References . . . . . . . . . . . . . . . . . .  9
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 10
   Intellectual Property and Copyright Statements . . . . . . . . . . 11


1.  Introduction

   As with any network element, a Session Initiation Protocol (SIP) [2]
   server can suffer from overload when the number of SIP messages it
   receives exceeds the number of SIP messages it can process.
   Generally, a SIP server is overloaded when it does not have
   sufficient resources to process all incoming SIP messages.

   RFC3261 [2] defines the 503 (Service Unavailable) response code to
   enable servers to handle temporary overload as follows:

      The server is temporarily unable to process the request due to a
      temporary overloading or maintenance of the server.  The server
      MAY indicate when the client should retry the request in a Retry-
      After header field.  If no Retry-After is given, the client MUST
      act as if it had received a 500 (Server Internal Error) response.

      A client (proxy or UAC) receiving a 503 (Service Unavailable)
      SHOULD attempt to forward the request to an alternate server.  It
      SHOULD NOT forward any other requests to that server for the
      duration specified in the Retry-After header field, if present.

      Servers MAY refuse the connection or drop the request instead of
      responding with 503 (Service Unavailable).

   Unfortunately, this mechanism has proven to be problematic in actual
   deployments.  Problems observed include load amplification, server
   underutilization, on/off semantics, and ambiguous usage [5].  These
   problems can eventually lead to congestion collapse, a condition in
   which the throughput of a server drops to a small fraction of its
   capacity.

   This specification proposes an essential correction to RFC3261 for
   the 503 (Service Unavailable) response following the process defined
   in [4].  The specification does not attempt to provide a complete
   solution for SIP overload control.  Such a solution is left for
   further study.


2.  Terminology

   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT
   RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as
   described in BCP 14, RFC 2119 [1] and indicate requirement levels for
   compliant implementations.


3.  Reason for Change

   The current specification of 503 (Service Unavailable) responses has
   proven to be problematic [5].  In many cases, the use of 503 (Service
   Unavailable) responses does not enable a server to successfully cope
   with overload and it may indeed worsen an overload condition.

   One of the problems of the 503 (Service Unavailable) response is that
   it covers temporary server unavailability due to overload as well as
   server maintenance.  However, the two cases differ in nature.  For
   maintenance, a server needs to stop all incoming traffic for a
   certain period of time.  Since it is unlikely that all servers of a
   domain will be taken down at the same time, it is useful for the
   receiver of a 503 (Service Unavailable) response to re-send the
   request to an alternate server.

   In an overload condition, a coarse-grained on/off semantic for
   controlling load is problematic [5].  It is beneficial if a server
   can control incoming load on a more fine-grained basis.  This enables
   the server to smoothly steer load towards the desired rate and to
   reduce load even before overload occurs.  Re-sending requests to
   alternate servers during overload is also problematic and may amplify
   the load on servers [5].  In particular, if multiple servers are load
   balanced, re-sending a request that was rejected by one server due to
   overload does not increase the chances of delivery, since the other
   servers are likely to be highly loaded as well.  Instead, it
   increases the number of requests these servers have to handle.
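
   The amplification effect described above can be illustrated with a
   rough model.  The following Python sketch is not part of RFC 3261 or
   of this proposal.  It assumes, purely for illustration, that every
   server in a load-balanced cluster rejects a request with the same
   fixed probability and that a client tries each server at most once.
   The function name and its parameters are hypothetical.

```python
def expected_attempts(new_requests: int, servers: int,
                      reject_prob: float, retry_alternates: bool) -> float:
    """Expected number of request attempts the cluster has to handle.

    Assumption (not from the draft): each server rejects a request
    independently with probability reject_prob, and a client tries at
    most `servers` servers in total for a single request.
    """
    if not retry_alternates:
        # Rejected requests go straight back to the UAC: one attempt each.
        return float(new_requests)
    # Attempt k+1 only happens if the previous k attempts were rejected.
    per_request = sum(reject_prob ** k for k in range(servers))
    return new_requests * per_request


if __name__ == "__main__":
    print(expected_attempts(1000, 4, 0.5, retry_alternates=False))
    print(expected_attempts(1000, 4, 0.5, retry_alternates=True))
```

   Under these assumptions, 1000 new requests sent to four servers that
   each reject half of their traffic turn into 1875 attempts when
   rejected requests are re-sent to alternate servers, compared to 1000
   attempts when they are not.  In a real overload the rejection
   probability would rise further as the extra attempts arrive, which
   this simple model does not capture.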

   The following specific mechanisms defined for 503 (Service
   Unavailable) responses contribute to the overall problem:

   Server maintenance vs. overload control:  The 503 (Service
      Unavailable) response is used for overload control as well as
      server maintenance.  Since client behavior may be different in
      these two cases, it is useful to clearly differentiate between
      them.
   Retry-After:  The Retry-After header in a 503 (Service Unavailable)
      response tells a client to stop sending traffic for the given
      period of time.  After this time is over, the client may start
      sending again.  This mechanism causes performance problems when
      used for overload control by a server that receives requests from
      a small number of clients (e.g. a SIP proxy that receives requests
      from a few other SIP proxies) [5].  However, it works well for
      server maintenance.
   Alternate server forwarding:  After receiving a 503 (Service
      Unavailable) response, a client can send the request to an
      alternate server.  This mechanism can amplify load if used for
      overload control [5] but it is beneficial for server maintenance.
   Dropping requests:  A server is allowed to drop requests or refuse
      connections instead of sending a 503 (Service Unavailable)
      response.  Requests that do not receive a response will eventually
      be retransmitted by the client, which again amplifies load during
      periods of overload.
   Blocking hostnames:  A client that has received a 503 (Service
      Unavailable) response with a Retry-After header may decide to stop
      forwarding traffic to this server based on the server's hostname.
      However, if this hostname represents a cluster of servers (e.g.,
      via a DNS mapping), the client would block traffic to all servers
      in the cluster.  The other servers may then be underutilized [5].
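
   The hostname problem in the last item can be made concrete with a
   small sketch.  The Python class below is a hypothetical client-side
   block list keyed by IP address rather than by hostname; the class
   name, its methods, and the example addresses (taken from the
   documentation range 192.0.2.0/24) are assumptions made for this
   illustration and do not come from RFC 3261.

```python
from typing import Dict, List


class RetryAfterBlockList:
    """Tracks servers a client should avoid, keyed by IP address."""

    def __init__(self) -> None:
        # Maps an IP address to the time (in seconds) its block expires.
        self._blocked_until: Dict[str, float] = {}

    def block(self, ip: str, retry_after: float, now: float) -> None:
        """Record a 503 with Retry-After received from this address."""
        self._blocked_until[ip] = now + retry_after

    def is_blocked(self, ip: str, now: float) -> bool:
        until = self._blocked_until.get(ip)
        return until is not None and now < until

    def usable(self, addresses: List[str], now: float) -> List[str]:
        """Filter the addresses a hostname resolved to."""
        return [ip for ip in addresses if not self.is_blocked(ip, now)]


if __name__ == "__main__":
    # Hypothetical: one hostname that resolves to two servers.
    cluster = ["192.0.2.1", "192.0.2.2"]
    blocks = RetryAfterBlockList()
    blocks.block("192.0.2.1", retry_after=30, now=100.0)
    print(blocks.usable(cluster, now=110.0))
    print(blocks.usable(cluster, now=130.0))
```

   Because the block is recorded per address, the second server behind
   the same hostname stays usable while the first one is blocked.  A
   block list keyed by hostname would have excluded both.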


4.  Summary of Change

   In the following, two alternative sets of changes are proposed.

   Alternative 1:

   1.  Introduce a new response code, 507 (Server Overload), for servers
       temporarily unavailable due to overload.  This response is
       similar to a 500 (Server Internal Error) response.  Its Retry-
       After header has the same semantics (i.e., it only affects the
       current request) and the response is forwarded all the way to
       the UAC.
   2.  A difference between a 500 (Server Internal Error) and a 507
       (Server Overload) response is that a 507 (Server Overload)
       response should not be re-tried at an alternate server.  Instead,
       it should be returned to the UAC.  This way, excess requests are
       quickly cleared from a network of SIP servers.  A new header,
       "Allow-Retry", may be used to explicitly allow proxies to re-try
       the request at an alternate server.
   3.  Deprecate the use of 503 (Service Unavailable) responses for
       temporary unavailability due to overload.
   4.  Change the requirement level for dropping requests or refusing
       the connection instead of sending a 503 (Service Unavailable)
       response from MAY to SHOULD NOT.
   5.  Recommend that clients block traffic by IP address, not by
       hostname, after receiving a 503 (Service Unavailable) response
       with a Retry-After header.

   Alternative 2:


   1.  Deprecate the use of Retry-After headers in 503 (Service
       Unavailable) responses for overload control by servers with a
       small client population (fewer than 20 clients).  The use of
       Retry-After remains unchanged for servers with a large number of
       clients, such as edge proxies (20 or more clients), and for
       server maintenance.  Proxies that create a 500 (Server Internal
       Error) response after receiving a 503 (Service Unavailable) may
       include a Retry-After header in the 500 (Server Internal Error)
       response to prevent the UAC from instantly retrying the request.
   2.  Introduce a new header, "Allow-Retry", for 503 (Service
       Unavailable) responses.  This header controls whether a client
       receiving a 503 (Service Unavailable) response should or should
       not forward the request to an alternate server.  The default
       value for this header is true.  A somewhat simplistic alternative
       to the introduction of a new header is to deprecate forwarding
       requests to alternate servers if the 503 (Service Unavailable)
       response does not contain a Retry-After header.  In this case, it
       can be assumed that it was created because of overload and not
       server maintenance.
   3.  Change the requirement level for dropping requests or refusing
       the connection instead of sending a 503 (Service Unavailable)
       response from MAY to SHOULD NOT.
   4.  Recommend that clients block traffic by IP address, not by
       hostname, after receiving a 503 (Service Unavailable) response
       with a Retry-After header.

   RFC 3261 [2] and RFC 3263 [3] define that transport failures
   (generally, due to fatal ICMP errors in UDP or connection failures in
   TCP) should be treated as a 503 (Service Unavailable) response.
   These cases should be treated like a 503 (Service Unavailable)
   response created due to server maintenance.

   OPEN ISSUE 1: Is Alternative 1 or Alternative 2 preferable?
   Alternative 1 seems cleaner, and a specific response code for
   overload seems useful.  It also has the advantage that it is
   independent of the number of clients a server has.  Alternative 1
   should be backwards compatible and work even if proxies and/or
   clients do not support it.  However, the introduction of a new
   response code is a more significant change than Alternative 2.

   OPEN ISSUE 2: Is 20 a good delineation for the use of 503 (Service
   Unavailable) in Alternative 2?

   OPEN ISSUE 3: Is a new response code in Alternative 1 a good way to
   differentiate between overload and server maintenance?  An
   alternative would be to use a Warning header code or a new header
   field in 503 (Service Unavailable) responses.


5.  Consequences if not approved

   Without these changes, networks of SIP servers are vulnerable to
   overload.  The performance of a network of SIP servers can be
   significantly impacted by overload due to the problems described
   above.

   While the proposed changes do not provide a full solution for
   overload control and cannot always prevent a congestion collapse,
   they avoid the problems described above and improve SIP server
   performance under overload.


6.  The Change (Alternative 1)

   The following two sections replace Section 21.5.4 in RFC3261 [2].

6.1.  503 Service Unavailable

   The server is temporarily unable to process the request due to a
   temporary maintenance of the server.  The server MAY indicate when
   the client should retry the request in a Retry-After header field.
   If no Retry-After is given, the client MUST act as if it had received
   a 500 (Server Internal Error) response.

   A client (proxy or UAC) receiving a 503 (Service Unavailable) SHOULD
   attempt to forward the request to an alternate server.  It SHOULD NOT
   forward any other requests to that server for the duration specified
   in the Retry-After header field.  The client SHOULD block traffic to
   a server based on the server's IP address and not the hostname, since
   hostnames can represent multiple servers.

   Servers SHOULD NOT refuse the connection or drop the request as a
   replacement for responding with 503 (Service Unavailable).

6.2.  507 Server Overload

   The server is temporarily unable to process the request due to a
   temporary overloading of the server.  The server SHOULD reject
   requests that exceed its capacity with a 507 (Server Overload)
   response.  It MAY indicate when the client should retry the current
   request in a Retry-After header field.  The Retry-After header has
   the same semantics as in a 500 (Server Internal Error) response.

   A client (proxy or UAC) receiving a 507 (Server Overload) response
   SHOULD NOT attempt to forward the request to an alternate server.
   Forwarding the request to alternate servers under overload would
   increase the load on all servers and thereby amplify the overload
   condition.
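
   The client behavior that Sections 6.1 and 6.2 describe can be
   summarized in a short decision function.  The Python sketch below is
   only one possible reading of Alternative 1, not a definitive
   implementation: the function and type names are hypothetical, and
   treating the "Allow-Retry" header from Section 4 as a boolean
   argument is an assumption, since this section does not define its
   syntax.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    try_alternate: bool                 # forward to an alternate server?
    block_server_seconds: Optional[int]  # stop sending to this server?


def handle_failure_alt1(status: int,
                        retry_after: Optional[int] = None,
                        allow_retry: bool = False) -> Decision:
    """Decide how a client (proxy or UAC) reacts under Alternative 1."""
    if status == 503:
        # Maintenance: try elsewhere, and avoid this server for the
        # Retry-After duration if one was given.
        return Decision(try_alternate=True,
                        block_server_seconds=retry_after)
    if status == 507:
        # Overload: do not re-try elsewhere unless explicitly allowed.
        # Retry-After only affects the current request (as with a 500),
        # so no other traffic to the server is blocked.
        return Decision(try_alternate=allow_retry,
                        block_server_seconds=None)
    raise ValueError(f"status {status} is not covered by this sketch")


if __name__ == "__main__":
    print(handle_failure_alt1(503, retry_after=60))
    print(handle_failure_alt1(507, retry_after=5))
```

   The asymmetry is the point of the change: a 503 (Service Unavailable)
   moves load to another server, whereas a 507 (Server Overload) lets
   the excess request leave the network of SIP servers.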


7.  The Change (Alternative 2)

   The following sentence is added to the end of the paragraph starting
   with "A proxy which receives..." at the top of page 110 (Section
   16.7, step 6) in RFC3261 [2]:

      It MAY indicate when the client should retry the request in a
      Retry-After header field added to the 500 (Server Internal Error)
      response.

   The following section replaces Section 21.5.4 in RFC3261 [2].

7.1.  503 Service Unavailable

   The server is temporarily unable to process the request due to a
   temporary overloading or maintenance of the server.

   If a server is temporarily unavailable due to maintenance, it MAY
   indicate when the client should retry the request in a Retry-After
   header field.  It MAY include an "Allow-Retry" header with the value
   "true" in this response to indicate that the client SHOULD re-try the
   request at an alternate server.

   A server that is temporarily unavailable due to overload SHOULD
   reject the requests that exceed its capacity with a 503 (Service
   Unavailable) response.  Servers with a large population of clients
   (proxies or UACs) MAY indicate when the client should retry the
   request in a Retry-After header field.  Servers that fall into this
   category typically receive traffic from 20 or more (often many more)
   clients.  An example of such a server is an edge proxy.  All other
   servers SHOULD NOT include a Retry-After header in a 503 (Service
   Unavailable) response.  If no Retry-After is given, a client MUST act
   as if it had received a 500 (Server Internal Error) response.  The
   server SHOULD include an "Allow-Retry" header with the value "false"
   in this response.  Re-trying the request at alternate servers under
   overload may increase the load on all servers and thereby amplify the
   overload condition.
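
   The server-side rules above can be sketched as a function that picks
   the headers for a 503 (Service Unavailable) response.  This Python
   sketch is illustrative only.  The function name, the "reason"
   argument, and the idea that a server knows its client count are
   assumptions; the threshold of 20 clients is the value proposed in
   this section and is still an open issue.

```python
from typing import Dict, Optional

# "20 or more" clients counts as a large population in this section.
LARGE_POPULATION_THRESHOLD = 20


def build_503_headers(reason: str, client_count: int,
                      retry_after: Optional[int] = None) -> Dict[str, str]:
    """Choose Retry-After / Allow-Retry for a 503 under Alternative 2."""
    headers: Dict[str, str] = {}
    if reason == "maintenance":
        if retry_after is not None:
            headers["Retry-After"] = str(retry_after)
        # Including Allow-Retry: true is optional (MAY); this sketch
        # always adds it to make the intent explicit.
        headers["Allow-Retry"] = "true"
    elif reason == "overload":
        # Only servers with a large client population use Retry-After.
        if (retry_after is not None
                and client_count >= LARGE_POPULATION_THRESHOLD):
            headers["Retry-After"] = str(retry_after)
        headers["Allow-Retry"] = "false"
    else:
        raise ValueError(f"unknown reason: {reason!r}")
    return headers


if __name__ == "__main__":
    print(build_503_headers("maintenance", client_count=3, retry_after=600))
    print(build_503_headers("overload", client_count=3, retry_after=10))
    print(build_503_headers("overload", client_count=500, retry_after=10))
```

   In this sketch a proxy that serves only three upstream proxies omits
   Retry-After under overload, while an edge proxy with hundreds of
   clients may still include it.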

   A client (proxy or UAC) receiving a 503 (Service Unavailable)
   response that contains an "Allow-Retry" header SHOULD attempt to
   forward the request to alternate servers if the header value is
   "true" and SHOULD NOT do so if the value is "false".

   If the Retry-After header field is present in a 503 (Service
   Unavailable) response, the client SHOULD NOT forward any other
   requests to that server for the duration specified in the Retry-After
   header field.  The client SHOULD block traffic to a server based on
   the server's IP address and not the hostname, since hostnames can
   represent multiple servers.

   Servers SHOULD NOT refuse the connection or drop the request as a
   replacement for responding with 503 (Service Unavailable).
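
   On the client side, the rules of this section reduce to two
   questions: may the request be re-tried at an alternate server, and
   for how long should the responding server be avoided?  The Python
   sketch below shows one way to answer both from the response headers.
   It is a minimal sketch under stated assumptions: the function names
   are hypothetical, headers are assumed to arrive in a mapping with
   canonical names, "Allow-Retry" is assumed to default to "true" as
   stated in Section 4, and the interplay with the rule to act as for a
   500 (Server Internal Error) when Retry-After is absent is not
   modelled.

```python
import re
from typing import Mapping, Optional, Tuple


def parse_retry_after(value: str) -> Optional[int]:
    """Extract the delta-seconds from a Retry-After value.

    The value may carry a comment or parameters after the number,
    e.g. "120 (maintenance)" or "18000;duration=3600".
    """
    token = value.split(";", 1)[0].split("(", 1)[0].strip()
    return int(token) if re.fullmatch(r"[0-9]+", token) else None


def handle_503_alt2(headers: Mapping[str, str]) -> Tuple[bool, Optional[int]]:
    """Return (try_alternate, block_server_seconds) for a 503 response."""
    # Assumption: anything other than an explicit "false" allows a retry.
    allow = headers.get("Allow-Retry", "true").strip().lower() != "false"
    raw = headers.get("Retry-After")
    block = parse_retry_after(raw) if raw is not None else None
    return allow, block


if __name__ == "__main__":
    print(handle_503_alt2({"Allow-Retry": "false"}))
    print(handle_503_alt2({"Retry-After": "120 (maintenance)",
                           "Allow-Retry": "true"}))
```

   The block duration returned here would be applied to the server's IP
   address rather than to its hostname, in line with the recommendation
   above.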


8.  Security Considerations

   The procedures introduced in this document have no security
   implications beyond what is already specified in RFC3261 [2].


9.  IANA Considerations

   None.


Appendix A.  Acknowledgements

   Many thanks to Jonathan Rosenberg and Keith Drage for their
   suggestions.  Many thanks also to Eric Noel, Carolyn Johnson, Ping
   Wu, Tadeusz Drwiega and the overload control design team for the
   simulation results.


10.  References

10.1.  Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [2]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
        Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP:
        Session Initiation Protocol", RFC 3261, June 2002.

   [3]  Rosenberg, J. and H. Schulzrinne, "Session Initiation Protocol
        (SIP): Locating SIP Servers", RFC 3263, June 2002.

10.2.  Informative References

   [4]  Drage, K., "A Process for Handling Essential Corrections to the
        Session Initiation Protocol (SIP)",
        draft-drage-sip-essential-correction-01 (work in progress),
        March 2007.


   [5]  Rosenberg, J., "Requirements for Management of Overload in the
        Session Initiation Protocol",
        draft-rosenberg-sipping-overload-reqs-02 (work in progress),
        October 2006.


Authors' Addresses

   Volker Hilt
   Bell Labs/Alcatel-Lucent
   791 Holmdel-Keyport Rd
   Holmdel, NJ  07733
   USA

   Email: volkerh@bell-labs.com


   Indra Widjaja
   Bell Labs/Alcatel-Lucent
   600-700 Mountain Avenue
   Murray Hill, NJ  07974
   USA

   Email: iwidjaja@alcatel-lucent.com


Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).

