SIP Working Group                                                V. Hilt
Internet-Draft                                                I. Widjaja
Expires: January 7, 2008                        Bell Labs/Alcatel-Lucent
                                                            July 6, 2007


   Essential Correction to the Session Initiation Protocol (SIP) 503
                     (Service Unavailable) Response
                    draft-hilt-sip-correction-503-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 7, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   Overload occurs in the Session Initiation Protocol (SIP) when SIP
   servers have insufficient resources to process all SIP messages they
   receive. The SIP protocol specified in RFC 3261 provides the 503
   (Service Unavailable) response code as a remedy for servers under
   overload. However, the current definition of 503 (Service
   Unavailable) has problems and can in fact amplify an overload
   condition. This document proposes an essential correction to RFC
   3261 that avoids these problems and helps SIP servers to better cope
   with overload.

Hilt & Widjaja           Expires January 7, 2008                [Page 1]

Internet-Draft              Overload Control                   July 2007

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Reason for Change
   4.  Summary of Change
   5.  Consequences if not approved
   6.  The Change (Alternative 1)
     6.1.  503 Service Unavailable
     6.2.  507 Server Overload
   7.  The Change (Alternative 2)
     7.1.  503 Service Unavailable
   8.  Security Considerations
   9.  IANA Considerations
   Appendix A.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   As with any network element, a Session Initiation Protocol (SIP) [2]
   server can suffer from overload when the number of SIP messages it
   receives exceeds the number of SIP messages it can process.
   Generally, a SIP server is overloaded when it does not have
   sufficient resources to process all incoming SIP messages.

   RFC3261 [2] defines the 503 (Service Unavailable) response code to
   enable servers to handle temporary overload as follows:

      The server is temporarily unable to process the request due to a
      temporary overloading or maintenance of the server. The server
      MAY indicate when the client should retry the request in a
      Retry-After header field.
      If no Retry-After is given, the client MUST act as if it had
      received a 500 (Server Internal Error) response.

      A client (proxy or UAC) receiving a 503 (Service Unavailable)
      SHOULD attempt to forward the request to an alternate server. It
      SHOULD NOT forward any other requests to that server for the
      duration specified in the Retry-After header field, if present.

      Servers MAY refuse the connection or drop the request instead of
      responding with 503 (Service Unavailable).

   Unfortunately, this mechanism has proven to be problematic in actual
   deployments. Problems observed include load amplification, server
   underutilization, off/on semantics and ambiguous usages [5], which
   can eventually lead to a congestion collapse, a condition in which
   the throughput of a server drops to a small fraction of its
   capacity.

   This specification proposes an essential correction to RFC3261 for
   the 503 (Service Unavailable) response following the process defined
   in [4]. The specification does not attempt to provide a complete
   solution for SIP overload control. Such a solution is left for
   further study.

2.  Terminology

   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT
   RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as
   described in BCP 14, RFC 2119 [1] and indicate requirement levels
   for compliant implementations.

3.  Reason for Change

   The current specification of 503 (Service Unavailable) responses has
   proven to be problematic [5]. In many cases, the use of 503 (Service
   Unavailable) responses does not enable a server to successfully cope
   with overload and it may indeed worsen an overload condition.

   One of the problems of the 503 (Service Unavailable) response is
   that it covers temporary server unavailability due to overload as
   well as server maintenance. However, the two cases are different in
   nature.
   For maintenance, a server needs to stop all incoming traffic for a
   certain period of time. Since it is likely that not all servers of a
   domain will be taken down at the same time, it is useful for the
   receiver of a 503 response to re-send the request to an alternate
   server. In an overload condition, a coarse-grained on/off semantic
   for controlling load is problematic [5]. It is beneficial if a
   server can control incoming load on a more fine-grained basis. This
   enables the server to smoothly steer load towards the desired rate
   and to reduce load even before overload occurs.

   Re-sending requests to alternate servers during overload is also
   problematic and may amplify the load on servers [5]. In particular,
   if multiple servers are load balanced, re-sending a request that was
   rejected by one server due to overload does not help to increase the
   chances of delivery since the other servers are likely to be highly
   loaded as well. Instead, it increases the number of requests these
   servers have to handle.

   The following specific mechanisms defined for 503 (Service
   Unavailable) responses contribute to the overall problem:

   Server maintenance vs. overload control: The 503 (Service
      Unavailable) response is used for overload control as well as
      server maintenance. Since client behavior may be different in
      these two cases, it is useful to clearly differentiate between
      them.

   Retry-After: The Retry-After header in a 503 (Service Unavailable)
      response tells a client to stop sending traffic for the given
      period of time. After this time is over, the client may start
      sending again. This mechanism causes performance problems when
      used for overload control by a server that receives requests from
      a small number of clients (e.g., a SIP proxy that receives
      requests from a few other SIP proxies) [5]. However, it works
      well for server maintenance.

   Alternate server forwarding: After receiving a 503 (Service
      Unavailable) response, a client can send the request to an
      alternate server. This mechanism can amplify load if used for
      overload control [5] but it is beneficial for server maintenance.

   Dropping requests: A server is allowed to drop requests or refuse
      connections instead of sending a 503 (Service Unavailable)
      response. Requests that do not receive a response will eventually
      be retransmitted by the client, which again amplifies load during
      periods of overload.

   Blocking hostnames: A client that has received a 503 (Service
      Unavailable) response with a Retry-After header may decide to
      stop forwarding traffic to this server based on the server's
      hostname. However, if this hostname represents a cluster of
      servers (e.g., via a DNS mapping), the client would block traffic
      to all servers. The other servers may then be underutilized [5].

4.  Summary of Change

   In the following, two alternative sets of changes are proposed.

   Alternative 1:

   1.  Introduce a new response code, 507 (Server Overload), for
       servers temporarily unavailable due to overload. This response
       is similar to a 500 (Server Internal Error) response. Its
       Retry-After header has the same semantics (i.e., it only affects
       the current request) and it is forwarded all the way to the UAC.

   2.  A difference between a 500 (Server Internal Error) and a 507
       (Server Overload) response is that a 507 (Server Overload)
       response should not be re-tried at an alternate server. Instead,
       it should be returned to the UAC. This way, excess requests are
       quickly cleared from a network of SIP servers. A new header,
       "Allow-Retry", may be used to explicitly allow proxies to re-try
       the request at an alternate server.

   3.  Deprecate the use of 503 (Service Unavailable) responses for
       temporary unavailability due to overload.

   4.  Change dropping requests or refusing the connection as a
       replacement for sending a 503 (Service Unavailable) response
       from MAY to SHOULD NOT.

   5.  Recommend the use of IP addresses for blocking traffic after
       receiving a 503 (Service Unavailable) with Retry-After and not
       the hostname.

   Alternative 2:

   1.  Deprecate the use of Retry-After headers in 503 (Service
       Unavailable) responses for overload control by servers with a
       small client population (fewer than 20 clients). The use of
       Retry-After remains unchanged for servers with a large number of
       clients (20 or more), such as edge proxies, and for server
       maintenance. Proxies that create a 500 (Server Internal Error)
       response after receiving a 503 (Service Unavailable) may include
       a Retry-After header in the 500 (Server Internal Error) response
       to prevent the UAC from instantly retrying the request.

   2.  Introduce a new header, "Allow-Retry", for 503 (Service
       Unavailable) responses. This header controls whether a client
       receiving a 503 (Service Unavailable) response should or should
       not forward the request to an alternate server. The default
       value for this header is true.

       A somewhat simplistic alternative to the introduction of a new
       header is to deprecate forwarding requests to alternate servers
       if the 503 (Service Unavailable) response does not contain a
       Retry-After header. In this case, it can be assumed that it was
       created because of overload and not server maintenance.

   3.  Change dropping requests or refusing the connection as a
       replacement for sending a 503 (Service Unavailable) response
       from MAY to SHOULD NOT.

   4.  Recommend the use of IP addresses for blocking traffic after
       receiving a 503 (Service Unavailable) with Retry-After and not
       the hostname.

   RFC 3261 [2] and RFC 3263 [3] define that transport failures
   (generally, due to fatal ICMP errors in UDP or connection failures
   in TCP) should be treated as a 503 (Service Unavailable) response.
   These cases should be treated as a 503 (Service Unavailable)
   response that would be created due to server maintenance.

   OPEN ISSUE 1: Is Alternative 1 or Alternative 2 preferable?
   Alternative 1 seems cleaner, and the introduction of a specific
   response code for overload seems useful. It also has the advantage
   that it is independent of the number of clients a server has.
   Alternative 1 should be backwards compatible and work even if
   proxies and/or clients do not support it. However, the introduction
   of a new response code is a more significant change than
   Alternative 2.

   OPEN ISSUE 2: Is 20 a good delineation for the use of 503 (Service
   Unavailable) in Alternative 2?

   OPEN ISSUE 3: Is a new response code in Alternative 1 a good way to
   differentiate between overload and server maintenance? An
   alternative would be to use a Warning header code or a new header
   field in 503 (Service Unavailable) responses.

5.  Consequences if not approved

   Without these changes, networks of SIP servers are vulnerable to
   overload. The performance of a network of SIP servers can be
   significantly impacted by overload due to the problems described
   above. While the proposed changes do not provide a full solution for
   overload control and cannot always prevent a congestion collapse,
   they avoid the problems described above and improve SIP server
   performance under overload.

6.  The Change (Alternative 1)

   The following two sections replace Section 21.5.4 in RFC3261 [2].

6.1.  503 Service Unavailable

   The server is temporarily unable to process the request due to a
   temporary maintenance of the server. The server MAY indicate when
   the client should retry the request in a Retry-After header field.
   If no Retry-After is given, the client MUST act as if it had
   received a 500 (Server Internal Error) response.

   A client (proxy or UAC) receiving a 503 (Service Unavailable) SHOULD
   attempt to forward the request to an alternate server.
   It SHOULD NOT forward any other requests to that server for the
   duration specified in the Retry-After header field. The client
   SHOULD block traffic to a server based on the server's IP address
   and not the hostname since hostnames can represent multiple servers.

   Servers SHOULD NOT refuse the connection or drop the request as a
   replacement for responding with 503 (Service Unavailable).

6.2.  507 Server Overload

   The server is temporarily unable to process the request due to a
   temporary overloading of the server. The server SHOULD reject
   requests that exceed its capacity with a 507 (Server Overload)
   response. It MAY indicate when the client should retry the current
   request in a Retry-After header field. The Retry-After header has
   the same semantics as in a 500 (Server Internal Error) response.

   A client (proxy or UAC) receiving a 507 (Server Overload) response
   SHOULD NOT attempt to forward the request to an alternate server.
   Forwarding the request to alternate servers under overload would
   increase the load on all servers and thereby amplify the overload
   condition.

7.  The Change (Alternative 2)

   The following sentence is added to the end of the paragraph starting
   with "A proxy which receives..." at the top of page 110 (Section
   16.7, step 6) in RFC3261 [2]:

      It MAY indicate when the client should retry the request in a
      Retry-After header field added to the 500 (Server Internal Error)
      response.

   The following section replaces Section 21.5.4 in RFC3261 [2].

7.1.  503 Service Unavailable

   The server is temporarily unable to process the request due to a
   temporary overloading or maintenance of the server.

   If a server is temporarily unavailable due to maintenance, it MAY
   indicate when the client should retry the request in a Retry-After
   header field.
   It MAY include an "Allow-Retry" header with the value "true" in this
   response to indicate that the client SHOULD re-try the request at an
   alternate server.

   A server that is temporarily unavailable due to overload SHOULD
   reject the requests that exceed its capacity with a 503 (Service
   Unavailable) response. Servers with a large population of clients
   (proxies or UACs) MAY indicate when the client should retry the
   request in a Retry-After header field. Servers that fall into this
   category typically receive traffic from 20 or more (often many more)
   clients. An example of such a server is an edge proxy. All other
   servers SHOULD NOT include a Retry-After header in a 503 (Service
   Unavailable) response. If no Retry-After is given, a client MUST act
   as if it had received a 500 (Server Internal Error) response. The
   server SHOULD include an "Allow-Retry" header with the value "false"
   in this response. Re-trying the request at alternate servers under
   overload may increase the load on all servers and thereby amplify
   the overload condition.

   A client (proxy or UAC) receiving a 503 (Service Unavailable)
   response that contains an "Allow-Retry" header SHOULD attempt to
   forward the request to alternate servers if the value of this header
   field is "true" and SHOULD NOT do so if the value is "false". If the
   Retry-After header field is present in a 503 (Service Unavailable)
   response, the client SHOULD NOT forward any other requests to that
   server for the duration specified in the Retry-After header field.
   The client SHOULD block traffic to a server based on the server's IP
   address and not the hostname since hostnames can represent multiple
   servers.

   Servers SHOULD NOT refuse the connection or drop the request as a
   replacement for responding with 503 (Service Unavailable).

8.  Security Considerations

   The procedures introduced in this document have no security
   implications beyond what is already specified in RFC3261 [2].

9.  IANA Considerations

   None.

Appendix A.  Acknowledgements

   Many thanks to Jonathan Rosenberg and Keith Drage for their
   suggestions. Many thanks also to Eric Noel, Carolyn Johnson, Ping
   Wu, Tadeusz Drwiega and the overload control design team for the
   simulation results.

10.  References

10.1.  Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [2]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
        Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP:
        Session Initiation Protocol", RFC 3261, June 2002.

   [3]  Rosenberg, J. and H. Schulzrinne, "Session Initiation Protocol
        (SIP): Locating SIP Servers", RFC 3263, June 2002.

10.2.  Informative References

   [4]  Drage, K., "A Process for Handling Essential Corrections to the
        Session Initiation Protocol (SIP)",
        draft-drage-sip-essential-correction-01 (work in progress),
        March 2007.

   [5]  Rosenberg, J., "Requirements for Management of Overload in the
        Session Initiation Protocol",
        draft-rosenberg-sipping-overload-reqs-02 (work in progress),
        October 2006.

Authors' Addresses

   Volker Hilt
   Bell Labs/Alcatel-Lucent
   791 Holmdel-Keyport Rd
   Holmdel, NJ 07733
   USA

   Email: volkerh@bell-labs.com

   Indra Widjaja
   Bell Labs/Alcatel-Lucent
   600-700 Mountain Avenue
   Murray Hill, NJ 07974
   USA

   Email: iwidjaja@alcatel-lucent.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).
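
Appendix B.  Non-Normative Sketch of Client Behavior

   The client behavior proposed in Sections 6 and 7 can be sketched in
   Python as follows. This is a minimal illustration under stated
   assumptions, not a definitive implementation. The names Response,
   Action, alternative_1, alternative_2 and BlockList are hypothetical
   and are defined neither in this document nor in RFC3261 [2]. The
   sketch assumes that "Allow-Retry" carries a boolean value, that a
   507 (Server Overload) response without "Allow-Retry" does not permit
   a retry at an alternate server, and that a missing Retry-After means
   the server is not blocked. The address 192.0.2.10 is a documentation
   address chosen only for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Response:
    """Hypothetical, simplified view of a SIP response (not a parser)."""
    status: int                          # e.g., 503 or 507
    retry_after: Optional[int] = None    # Retry-After value in seconds
    allow_retry: Optional[bool] = None   # proposed "Allow-Retry" header


@dataclass
class Action:
    """Hypothetical summary of what the client does next."""
    try_alternate: bool   # forward the request to an alternate server
    block_seconds: int    # how long to send nothing to this server


def alternative_1(resp: Response) -> Action:
    """Client behavior sketched from Sections 6.1 and 6.2."""
    if resp.status == 507:
        # Overload: no alternate server unless Allow-Retry permits it
        # (assumption: an absent header means "not allowed").
        # Retry-After affects the current request only, so the server
        # itself is not blocked.
        return Action(resp.allow_retry is True, 0)
    if resp.status == 503:
        # Maintenance: try an alternate server and block this server
        # for the Retry-After duration, if one is given.
        return Action(True, resp.retry_after or 0)
    raise ValueError("status code not covered by this sketch")


def alternative_2(resp: Response) -> Action:
    """Client behavior sketched from Section 7.1 (503 only)."""
    if resp.status != 503:
        raise ValueError("status code not covered by this sketch")
    # Allow-Retry defaults to true when the header is absent.
    allow = True if resp.allow_retry is None else resp.allow_retry
    return Action(allow, resp.retry_after or 0)


class BlockList:
    """Blocks servers by IP address rather than by hostname."""

    def __init__(self) -> None:
        self._until: Dict[str, float] = {}

    def block(self, ip: str, seconds: int, now: float) -> None:
        # Record the time until which this IP address is blocked.
        if seconds > 0:
            self._until[ip] = now + seconds

    def is_blocked(self, ip: str, now: float) -> bool:
        return now < self._until.get(ip, 0.0)


if __name__ == "__main__":
    blocked = BlockList()
    action = alternative_1(Response(status=503, retry_after=30))
    blocked.block("192.0.2.10", action.block_seconds, now=0.0)
    print(action, blocked.is_blocked("192.0.2.10", now=10.0))
```

   Keying the block list on the IP address means that other servers
   behind the same hostname remain reachable, which is the motivation
   given for this recommendation in Section 3.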