Network Working Group                                   Vivek Bansal
Internet Draft                                          R Ezhirpavai
Document: draft-vibansal-sctpsocket-congestion-00.txt
                                             Hughes Software Systems
Expires: June 2005                                     December 2004


         SCTP Sockets API Extension for Congestion Handling


Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of section 3 of RFC 3667.  By submitting this Internet-Draft, each
   author represents that any applicable patent or other IPR claims of
   which he or she is aware have been or will be disclosed, and any of
   which he or she become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on June 08, 2005.

Abstract

   This draft defines an extension to the SCTP Sockets API draft [1]
   for congestion handling.  SCTP has been widely accepted by the VoIP
   forum.  SIGTRAN protocols (M3UA, SUA, M2UA, M2PA, etc.) provide
   procedures for handling congestion notifications from the SCTP stack
   layer.  This draft provides a mechanism for SCTP applications to
   enable or disable the notification of congestion indications over
   the SCTP socket interface.  It suggests notifying the application at
   the onset and at the abatement of congestion.
Bansal                  Expires - June 2005                  [Page 1]

SCTP Socket API Extension for Congestion Handling       December 2004

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

Table of Contents

   1. Introduction
   2. Conventions
      2.1 Data Types
   3. SCTP Congestion Event and Notification
      3.1 Changes in SCTP Notification Structure
      3.2 SCTP_CONGESTION_EVENT
   4. SCTP Socket Options for Congestion
      4.1 Receive Congestion Parameters (SCTP_RECV_CONGINFO)
      4.2 Send Congestion Parameters (SCTP_SEND_CONGINFO)
      4.3 Error Codes
   5. Subscribing Congestion Notification
   6. Implementation Details
   7. Security Considerations
   8. References
   9. Author's Addresses
   10. Intellectual Property Statement

1. Introduction

   SCTP is the transport layer protocol for SIGTRAN and other VoIP
   protocols.  Many of these protocols define procedures to be followed
   on receiving a congestion notification from the SCTP layer.  This
   draft therefore proposes two interfaces: one for the application to
   enable congestion notification in the SCTP stack, and one for the
   SCTP stack to notify its application of send or receive buffer
   congestion.
   The interface provided for the application is in accordance with
   the IETF draft Sockets API Extensions for Stream Control
   Transmission Protocol [1].

2. Conventions

   This section lists the data types for the congestion extension to
   the socket draft.

2.1 Data Types

   Data types are in accordance with the IETF draft Sockets API
   Extensions for Stream Control Transmission Protocol [1].  Data types
   for congestion notifications are described in later sections of this
   draft.

3. SCTP Congestion Event and Notification

   SCTP applications need to understand and process the congestion
   event when notified by the SCTP stack.  The event is detected on a
   per-association basis.

   An application that wants to receive the congestion notification
   sets the appropriate socket option for SCTP congestion.  When a
   notification arrives, recvmsg() returns the notification in the
   application-supplied data buffer via msg_iov, and sets
   MSG_NOTIFICATION in msg_flags.  The delivery of notifications along
   with data is as per the IETF draft Sockets API Extensions for Stream
   Control Transmission Protocol [1].

3.1 Changes in SCTP Notification Structure

   The notification structure is as per the IETF draft Sockets API
   Extensions for Stream Control Transmission Protocol [1].  The
   element "sn_cong_event" shown below would be added to the union
   sctp_notification.

   union sctp_notification {
       struct {
           uint16_t sn_type;    /* Notification type. */
           uint16_t sn_flags;
           uint32_t sn_length;
       } sn_header;
       /* ... existing members unchanged ... */
       struct sctp_congestion_event sn_cong_event;
   };

   The sn_type field can additionally take the value
   SCTP_CONGESTION_EVENT.  This notification indicates that the SCTP
   stack layer has encountered, or has abated, send or receive buffer
   congestion.
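
   The check described above can be sketched as follows.  This is a
   minimal illustration, not part of the proposed API: the ex_ names
   and both numeric constants are placeholders invented here, because
   the draft assigns no value to SCTP_CONGESTION_EVENT and
   MSG_NOTIFICATION would normally come from the platform's SCTP
   headers.  The sketch only inspects the common notification header
   that recvmsg() places in the application buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder values, assumed for illustration only. */
#define EX_MSG_NOTIFICATION      0x8000
#define EX_SCTP_CONGESTION_EVENT 0x7001

/* Same three fields as sn_header in union sctp_notification. */
struct ex_sn_header {
    uint16_t sn_type;
    uint16_t sn_flags;
    uint32_t sn_length;
};

/*
 * Returns 1 if the buffer filled by recvmsg() holds a congestion
 * notification, 0 if it holds user data, another notification type,
 * or a truncated notification.
 */
static int ex_is_congestion_event(const void *buf, size_t len,
                                  int msg_flags)
{
    struct ex_sn_header hdr;

    assert(buf != NULL);

    /* Ordinary user data never carries the notification flag. */
    if (!(msg_flags & EX_MSG_NOTIFICATION))
        return 0;

    /* The buffer must hold at least the common header. */
    if (len < sizeof hdr)
        return 0;

    /* Copy the header out to avoid unaligned access into buf. */
    memcpy(&hdr, buf, sizeof hdr);

    /* sn_length covers the whole notification; reject short reads. */
    if (hdr.sn_length > len)
        return 0;

    return hdr.sn_type == EX_SCTP_CONGESTION_EVENT;
}
```

   An application would call such a helper with the iov_base, the
   byte count returned by recvmsg(), and the resulting msg_flags, and
   only then interpret the buffer as a struct sctp_congestion_event.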

   If the congestion is in the send buffers, the application can
   reduce the rate at which it transmits data or increase the number
   of transmit buffers.  If the congestion is in the receive buffers,
   the peer is sending messages faster than the application is reading
   data from the SCTP stack.

3.2 SCTP_CONGESTION_EVENT

   When the SCTP stack encounters buffer congestion for an
   association, this notification is given to the user.

   struct sctp_congestion_event {
       uint16_t     sce_type;
       uint16_t     sce_flags;
       uint32_t     sce_length;
       uint16_t     sce_buffer_type;
       uint16_t     sce_cong_levl;
       uint16_t     sce_buffer_levl;
       sctp_assoc_t sce_assoc_id;
   };

   sce_type: 16 bits (unsigned integer)

      It should be SCTP_CONGESTION_EVENT.

   sce_flags: 16 bits (unsigned integer)

      Currently unused.

   sce_length: 32 bits (unsigned integer)

      This field is the total length of the notification data,
      including the notification header.

   sce_buffer_type: 16 bits (unsigned integer)

      This field indicates the type of buffer that is experiencing a
      change in congestion level.  The buffer types are:

      Buffer Type         Description
      -----------------   -----------
      SCTP_RECV_BUFFERS   Indicates that the receive buffers have
                          encountered or abated congestion.
      SCTP_SEND_BUFFERS   Indicates that the send buffers have
                          encountered or abated congestion.

   sce_cong_levl: 16 bits (unsigned integer)

      This field indicates the congestion level encountered by the
      SCTP stack.  It can have the following values:

      Congestion Level         Description
      ----------------------   -----------
      SCTP_CONGESTION_LEVL_0   Indicates no congestion.
      SCTP_CONGESTION_LEVL_1   Indicates that congestion level 1 has
                               been reached.
      SCTP_CONGESTION_LEVL_2   Indicates that congestion level 2 has
                               been reached.
      SCTP_CONGESTION_LEVL_3   Indicates that congestion level 3 has
                               been reached.

   sce_buffer_levl: 16 bits (unsigned integer)

      This field indicates the buffer occupancy of the association
      corresponding to the congestion indication.

   sce_assoc_id: sizeof (sctp_assoc_t)

      The association id field holds the identifier for the
      association.  All notifications for a given association have the
      same association identifier.  For a one-to-one style socket,
      this field is ignored.

4. SCTP Socket Options for Congestion

   This section describes the socket options that the application must
   invoke to enable or disable congestion notification.

   For one-to-many style sockets, an sctp_assoc_t value (association
   ID) identifies the association instance that the operation affects,
   so it must be set when using this style.

   The socket options are:

      SCTP_RECV_CONGINFO
      SCTP_SEND_CONGINFO

   Both options can be read and written.

4.1 Receive Congestion Parameters (SCTP_RECV_CONGINFO)

   This option configures the congestion level thresholds for the
   receive buffers.  The following structure is used to access or
   modify the parameters:

   struct sctp_conginfo {
       sctp_assoc_t                sci_assoc_id;
       struct sctp_cong_level_info sci_level_1;
       struct sctp_cong_level_info sci_level_2;
       struct sctp_cong_level_info sci_level_3;
   };

   sci_level_1 - percentage of bytes utilized of the receive buffer
      for onset and abatement of congestion level 1.

   sci_level_2 - percentage of bytes utilized of the receive buffer
      for onset and abatement of congestion level 2.

   sci_level_3 - percentage of bytes utilized of the receive buffer
      for onset and abatement of congestion level 3.

   sci_assoc_id - (one-to-many style socket) This is filled in by the
      application, and identifies the association for this query.  If
      this parameter is '0' (on a one-to-many style socket), then the
      change affects the entire endpoint.
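
   One way to read the onset and abatement percentages above, together
   with the sctp_cong_level_info structure defined next, is as a small
   level-tracking rule with hysteresis.  The sketch below is only an
   illustration under stated assumptions: cong_next_level() is a
   hypothetical helper, not something the draft defines, and the exact
   comparisons (onset reached with ">=", abatement with "<") are
   assumptions, since the draft does not say whether the thresholds
   are inclusive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the draft's struct sctp_cong_level_info. */
struct sctp_cong_level_info {
    uint8_t scil_onset;  /* percent in use at which the level is entered */
    uint8_t scil_abate;  /* percent in use below which the level is left */
};

/*
 * Hypothetical helper (not defined by the draft).
 * lv[0], lv[1] and lv[2] hold the thresholds for levels 1, 2 and 3.
 * cur is the current level (0..3); used_pct is occupancy (0..100).
 * Returns the level the association should be in afterwards.
 */
static unsigned cong_next_level(unsigned cur,
                                const struct sctp_cong_level_info lv[3],
                                unsigned used_pct)
{
    unsigned level = cur > 3 ? 3 : cur;

    assert(lv != NULL);

    /* Onset: climb while occupancy has reached the next level's onset. */
    while (level < 3 && used_pct >= lv[level].scil_onset)
        level++;

    /*
     * Abatement: drop while occupancy is below the current level's
     * abate threshold.  Assumes scil_abate <= scil_onset for each
     * level, so a level entered above is not left in the same call.
     */
    while (level > 0 && used_pct < lv[level - 1].scil_abate)
        level--;

    return level;
}
```

   As an example, take onset values of 40, 60 and 80 (the defaults the
   draft suggests in Section 5) and abate values of 30, 50 and 70
   (assumed here).  An association at level 1 then stays at level 1
   while occupancy moves between 30% and 59%, which is the gap that
   keeps the level from flapping around a single threshold.  A stack
   would raise SCTP_CONGESTION_EVENT only when the returned level
   differs from the current one.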

   struct sctp_cong_level_info {
       uint8_t scil_onset;
       uint8_t scil_abate;
   };

   scil_onset - Indicates the onset threshold of the specified
      congestion region.  This is the percentage of bytes of the
      receive buffer that must be in use for the congestion level to
      be raised from a lower level to a higher level.

   scil_abate - Indicates the abatement threshold of the specified
      congestion region.  This is the percentage of bytes of the
      receive buffer in use at which the congestion level is reduced
      from the current level to the lower level.  It must be less than
      or equal to the percentage specified in scil_onset of the same
      level.  To avoid rapid oscillation between congestion levels, it
      is advisable to set scil_onset higher than scil_abate of the
      same level.

   All parameters are specified as a percentage of bytes used for an
   association.  When modifying the parameters, a value of 0 indicates
   that the current value should not be changed.  For each level
   specified, the SCTP stack gives an indication to the application
   when the indicated percentage of bytes is occupied.

   To access or modify these parameters, the application should call
   getsockopt() or setsockopt() respectively with the option name
   SCTP_RECV_CONGINFO.

4.2 Send Congestion Parameters (SCTP_SEND_CONGINFO)

   This option configures the congestion level thresholds for the send
   buffers.  The following structure is used to access or modify the
   parameters:

   struct sctp_conginfo {
       sctp_assoc_t                sci_assoc_id;
       struct sctp_cong_level_info sci_level_1;
       struct sctp_cong_level_info sci_level_2;
       struct sctp_cong_level_info sci_level_3;
   };

   sci_level_1 - percentage of bytes utilized of the send buffer for
      onset and abatement of congestion level 1.

   sci_level_2 - percentage of bytes utilized of the send buffer for
      onset and abatement of congestion level 2.

   sci_level_3 - percentage of bytes utilized of the send buffer for
      onset and abatement of congestion level 3.

   sci_assoc_id - (one-to-many style socket) This is filled in by the
      application, and identifies the association for this query.  If
      this parameter is '0' (on a one-to-many style socket), then the
      change affects the entire endpoint.

   struct sctp_cong_level_info {
       uint8_t scil_onset;
       uint8_t scil_abate;
   };

   scil_onset - Indicates the onset threshold of the specified
      congestion region.  This is the percentage of bytes that must be
      in use for the congestion level to be raised from a lower level
      to a higher level.

   scil_abate - Indicates the abatement threshold of the specified
      congestion region.  This is the percentage of bytes in use at
      which the congestion level is reduced from the current level to
      the lower level.  It must be less than or equal to the
      percentage specified in scil_onset of the same level.  To avoid
      rapid oscillation between congestion levels, it is advisable to
      set scil_onset higher than scil_abate of the same level.

   All parameters are specified as a percentage of bytes used for an
   association.  When modifying the parameters, a value of 0 indicates
   that the current value should not be changed.  For each level
   specified, the SCTP stack gives an indication to the application
   when the indicated percentage of bytes is occupied.

   To access or modify these parameters, the application should call
   getsockopt() or setsockopt() respectively with the option name
   SCTP_SEND_CONGINFO.

4.3 Error Codes

   The following error codes are returned if invalid values of the
   congestion levels are specified.

   1. EINVLVL1PARAMS - If the onset and abate values of congestion
      level 1 are invalid with respect to the other congestion levels.

   2. EINVLVL2PARAMS - If the onset and abate values of congestion
      level 2 are invalid with respect to the other congestion levels.

   3. EINVLVL3PARAMS - If the onset and abate values of congestion
      level 3 are invalid with respect to the other congestion levels.

5. Subscribing Congestion Notification

   An application can subscribe to the SCTP congestion notification by
   invoking the following options:

   1. SCTP_RECV_CONGESTION_NOTIFICATION (sctp_recv_congestion_event)

   2. SCTP_SEND_CONGESTION_NOTIFICATION (sctp_send_congestion_event)

   To enable these notifications, the application has to register for
   these events using the setsockopt() function.  The following change
   is required in the sctp_event_subscribe structure to subscribe to
   these notifications.

   struct sctp_event_subscribe {
       /* ... existing members unchanged ... */
       uint8_t sctp_recv_congestion_event;
       uint8_t sctp_send_congestion_event;
   };

   sctp_recv_congestion_event - Setting this flag to 1 enables the
      reception of receive buffer congestion event notifications.
      Setting the flag to 0 disables receive buffer congestion event
      notifications.

   sctp_send_congestion_event - Setting this flag to 1 enables the
      reception of transmit buffer congestion event notifications.
      Setting the flag to 0 disables transmit buffer congestion event
      notifications.

   When the application subscribes to the receive or send congestion
   event, the congestion levels are initialised with system-dependent
   default values.  An implementation can set the default values of
   congestion level 1, level 2 and level 3 to 40%, 60% and 80% of the
   send/receive buffer occupancy, respectively.  These values can be
   modified using the socket options explained in section 4.

6. Implementation Details

   The buffer utilisation for an association is divided into multiple
   congestion regions as shown in the figure below.

   100% +----------------------------------+ ---
        |                                  |  ^
        |        Congestion Level 3        |  |
        |                                  |  |
        +----- Onset Threshold Level 3 ----+  |
        |<~~~~ Abate Threshold Level 3     |  |
        |                                  |  |
        |        Congestion Level 2        |  |
        |                                  |  |
        +----- Onset Threshold Level 2 ----+  |
        |<~~~~ Abate Threshold Level 2     | Total Memory
        |                                  | Region
        |        Congestion Level 1        |  |
        |                                  |  |
        +----- Onset Threshold Level 1 ----+  |
        |<~~~~ Abate Threshold Level 1     |  |
        |                                  |  |
        |          No Congestion           |  |
        |       (Congestion Level 0)       |  |
        |                                  |  v
     0% +----------------------------------+ ---

   Level 0 is the no-congestion region.  A congestion indication with
   level 0 indicates that the system is not congested.

   Corresponding to every congestion region, there is an Onset
   threshold value and an Abate threshold value.  When the stack
   memory usage increases from a lower congestion region to a higher
   congestion region, the SCTP layer gives the congestion notification
   for the Onset threshold value reached for the higher congestion
   region.  When the stack memory usage decreases from a higher
   congestion region to a lower congestion region, the SCTP layer
   gives the congestion notification for the Abate threshold value
   reached for the lower congestion region.

7. Security Considerations

   Not applicable.

8. References

   [1] Stewart, R., Xie, Q., Yarroll, L., Wood, J., Poon, K. and M.
       Tuexen, "Sockets API Extensions for Stream Control Transmission
       Protocol", draft-ietf-tsvwg-sctpsocket-09.txt.

   [2] Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer,
       H., Taylor, T., Rytina, I., Kalla, M., Zhang, L. and V. Paxson,
       "Stream Control Transmission Protocol", RFC 2960, October 2000.

9. Author's Addresses

   Vivek Bansal
   Hughes Software Systems, Ltd.
   Gurgaon, Haryana,
   India. 122015.
   Phone: (91)-124-2346666.Ex-3080
   Email: vibansal@hssworld.com

   R Ezhirpavai
   Hughes Software Systems, Ltd.
   Gurgaon, Haryana,
   India. 122015.
   Phone: (91)-124-2346666.Ex-3279
   Email: rezhirpavai@hssworld.com

10. Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2004).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.