Network Working Group                      Tissa Senevirathne (Force10)
Internet Draft                               Neena Premmaraju (Force10)
Document: draft-tsenevir-ppp-flow-00.txt
Category: Informational                                      April 2001


      Generalized PPP Flow Control Mechanism for Packet Over SONET
                              (POS) Links


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026 [1].

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   For potential updates to the above required-text see:
   http://www.ietf.org/ietf/1id-guidelines.txt


Abstract

   This document presents flow control methods that may be used on
   Packet Over SONET (POS) links. The proposed methods may be
   especially useful for data communication equipment that provides
   high-speed interfaces.

Senevirathne            Informational - September 2001


1. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [2].

2. Introduction

   High-speed SONET links are gaining popularity as a low-cost,
   reliable transport for Wide Area Networks (WAN) and Metropolitan
   Area Networks (MAN). Packet Over SONET [3] is the most popular
   method used to adapt SONET circuits to transport packets.
   PPP is the link-level protocol used on Packet Over SONET (POS)
   links. The PPP protocol [4] depends on the lower layer to provide
   the required flow control. SONET has excellent error detection and
   recovery methods. However, SONET does not provide a mechanism to
   indicate congestion at the higher layers of the network stack, or
   resource depletion such as receive buffer overflow, at the end
   systems. This lack of flow control causes serious data loss in
   packet service networks over SONET. Without proper flow control
   methods in place, such losses may be more evident at higher speeds
   such as OC-192 and above.

   IEEE 802.3x [5] defines a set of methods that provide a flow control
   mechanism. IEEE 802.3x presents a means of avoiding packet loss due
   to resource depletion at the receive side, more specifically at the
   physical level (PHY). In general, the PHY level contains
   significantly less buffering space than the higher layers. Depletion
   of resources at the PHY level requires discarding incoming packets.
   Flow control methods, however, allow the receiver to notify the
   sending end of the link about the congestion. The transmitting side
   can then use its larger buffer space to hold packets for a short
   time interval, or until the congestion at the receiving end has
   cleared.

   The flow control methods presented in IEEE 802.3x pause all flows
   and do not provide a way to control flows based on a priority class
   such as Platinum vs. Gold. Most POS links are used as inter-site
   links, over which various classes of traffic may be transported. Any
   flow control method that does not distinguish class of service may
   affect all classes regardless of the priority of the service. It may
   therefore be attractive to define a more elaborate flow control
   mechanism that allows flows to be controlled based on local policy.

   In this document we propose two flow control mechanisms.
   In the first method, we simply encapsulate IEEE 802.3x flow control
   frames in PPP payloads, thus emulating an Ethernet-like flow control
   mechanism over POS links. In the second method, we propose a new PPP
   packet type that provides a more elaborate and extensive flow
   control method.

3.0 Placement of PPP Flow Control

      Switch Fabric                           Switch Fabric
            ^                                       ^
            |                                       |
            V                                       V
      ------------     -----        -----     ------------
     |            |   |     |      |     |   |            |
     | Classifier |<->| PHY |<=$=$>| PHY |<->| Classifier |
     |            |   |     |      |     |   |            |
      ------------     -----        -----     ------------
            ^         PPP Connection (POS)          ^
            |                                       |
            V                                       V
      ------------                            ------------
     |            |                          |            |
     |   Buffer   |                          |   Buffer   |
     |            |                          |            |
      ------------                            ------------

     <---------------------->      <---------------------->
              Node A                        Node B

        Fig: Interaction of two PPP nodes at the lower layer

   The above diagram depicts two PPP nodes that are connected via a
   Packet Over SONET link. Let us assume that Node A is sending packets
   to Node B, and that the PHY at Node B is nearly full and unable to
   receive any more packets. The proposed flow control method allows
   Node B to indicate to Node A that congestion has occurred at Node B.
   The PHY at Node A notifies the MAC/Classifier that a flow control
   indication has been received. The MAC/Classifier at Node A then
   stores the packets destined for Node B in its buffer space, either
   until a specified time has elapsed or until a ready-to-receive
   indication arrives from Node B.

   When implementing service-level flow control, the PHY is required to
   have different threshold levels at which flow control indications
   are generated for the appropriate service class. As an example,
   assume there are three service classes: Gold, Silver and Bronze.
   Assume also that three receive thresholds T1, T2 and T3 are
   available at the PHY, such that T1 < T2 < T3. The user may configure
   a local policy such that flow control packets are generated for
   service class Bronze when the receive PHY usage exceeds threshold
   T1.
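
   The threshold example above can be sketched as follows. This is a
   minimal illustration and not part of the proposal: only the pairing
   of Bronze with T1 comes from the text, while the numeric threshold
   values, the pairing of Silver with T2 and Gold with T3, and the name
   classes_to_pause are assumptions made for the example.

```python
# Hypothetical local policy: receive PHY usage thresholds in percent.
# Only the Bronze/T1 pairing comes from the text; the numeric values
# and the Silver/T2 and Gold/T3 pairings are assumed for illustration.
T1, T2, T3 = 50, 70, 85
assert T1 < T2 < T3

POLICY = [(T1, "Bronze"), (T2, "Silver"), (T3, "Gold")]


def classes_to_pause(usage_percent):
    """Return the service classes whose threshold the usage exceeds."""
    return [name for threshold, name in POLICY if usage_percent > threshold]


if __name__ == "__main__":
    for usage in (40, 60, 90):
        print(usage, classes_to_pause(usage))
```

   With these assumed values, a receive PHY that is 60 percent full
   would generate flow control for Bronze only, leaving Silver and Gold
   traffic unaffected.
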
   The user may also configure policies to specify the duration of the
   flow pause for each service class. Definitions of such policies are
   beyond the scope of this document.

4.0 PPP Flow Control Capability Negotiation

   Some products may not implement the proposed flow control methods,
   or a device may be configured not to use flow control. In either
   case it is important to negotiate the flow control capabilities
   during the Link Configuration phase. For the purpose of flow control
   negotiation we define a new LCP Configuration Option in addition to
   the PPP configuration options of [4]. The LCP Configuration Option
   type specified for Flow Control Capability requires IANA approval.

   Type
      Flow Control    9

   NOTE: See [4] for details of the LCP configuration options.

   Summary of the PPP LCP Configuration Option encoding

    0                   1                   2
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
   |     Type      |    Length     |    Data . . .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

   Type
      Flow-Control    9
      See [4] for details of other LCP configuration types.

   Length
      Length of this option in octets, including the Type, Length and
      Data fields.

   Data
      Contains the Flow Control options (see below for details).

4.1 Flow Control Options

   The flow control options in the PPP Flow Control capability
   negotiation allow the peers to negotiate between two different flow
   control methods and the appropriate parameters for the selected
   method. The flow control options are encoded in Type/Length/Value
   (TLV) format. Within each flow control option, several sub-TLVs are
   defined to negotiate specific parameters for that flow control
   method. A flow control method may be negotiated with default
   parameters by not including any sub-TLV. The end node that
   acknowledges the request must send the flow control option back to
   the requester in the LCP Configure-Ack.
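
   The LCP Configuration Option encoding summarized above can be
   sketched as follows. This is a minimal illustration under stated
   assumptions: the type value 9 is the value proposed in this document
   and is subject to IANA approval, the Data field is treated as opaque
   bytes, and the function names are hypothetical.

```python
FLOW_CONTROL_OPTION = 9  # proposed in this draft; subject to IANA approval


def build_lcp_option(opt_type, data):
    """Encode one LCP Configuration Option as Type, Length, Data."""
    length = 2 + len(data)  # Length covers the Type, Length and Data fields
    if length > 255:
        raise ValueError("option does not fit a one-octet Length field")
    return bytes([opt_type, length]) + data


def parse_lcp_option(buf):
    """Return (type, data, rest) for the first option found in buf."""
    if len(buf) < 2:
        raise ValueError("truncated option header")
    opt_type, length = buf[0], buf[1]
    if length < 2 or length > len(buf):
        raise ValueError("bad option length")
    return opt_type, buf[2:length], buf[length:]


if __name__ == "__main__":
    # Placeholder Data octets; the flow control options are opaque here.
    wire = build_lcp_option(FLOW_CONTROL_OPTION, b"\xaa\xbb")
    print(wire.hex(), parse_lcp_option(wire))
```

   Keeping the Data field opaque at this level lets the same helper
   carry either flow control method once its option TLV has been
   encoded separately.
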
   Absence of the Flow Control option in a Config-Req indicates that
   the peer is unable or unwilling to participate in flow control.
   Nodes that are not willing to participate in flow control MUST
   generate a Config-Nak. Implementations that do not support flow
   control may also generate a Config-Nak in response to a Config-Req
   that carries Flow Control options.

   Type
      Reserved                0
      Simple Flow Control     1
      Service Flow Control    2

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length     |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Sub-TLVs...                          |
   ~                                                               ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type
      Type of the Flow Control option.

   Length
      Length of this TLV in octets, including the Type, Length and
      Sub-TLV fields.

   Sub-TLV
      Variable-length options, based on the type of the flow control.

   Reserved
      Set to zero on transmission and ignored on reception.

4.2 Simple Flow Control

   Simple Flow Control indicates that the node is capable of providing
   802.3x [5] type flow control on the PPP link. A single sub-TLV is
   defined to specify the transmit recommence time (pause time out).
   The transmit recommence time indicates when to resume transmission
   if the subsequent flow enable packet is lost in transmission.

   Type
      Reserved          0
      Pause Time out    1

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length     |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Rsvd  |                    Pause Time out                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type
      Type of the parameter, which is Pause Time out.
   Length
      Length of this TLV in octets, including the Type, Length and
      Pause Time Out fields.

   Pause Time Out
      Number of byte times. The byte time is defined as 8 divided by
      the link speed in bits per second. This provides a normalized
      metric for the pause time out.

   Reserved/Rsvd
      Set to zero on transmission and ignored on reception.

4.3 Service Flow Control

   The Service Flow Control LCP option allows the peers to negotiate
   the service class IDs, the priority associated with each service ID,
   and the pause time out for each service ID. In practice, each
   service class maps to a threshold limit in the input FIFO of the
   receiving side. Given practical implementation constraints, the
   receiving side may not be able to support a large range of
   thresholds. Hence we propose to support 16 service classes.
   Implementations that support Service Flow Control MUST be capable of
   supporting 16 threshold levels.

   Thresholds in the receiving FIFO are numbered from 0 to 15. Zero (0)
   indicates the lowest limit of the FIFO and 15 indicates the highest
   limit of the FIFO. The Priority field indicates the service class ID
   to threshold mapping. More than one service class MAY be mapped to a
   given threshold. In other words, more than one service class may
   have the same priority.

   Type
      Reserved         0
      Service Class    1

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length     |       Reserved        |Priorit|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | S ID  |                    Pause Time out                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type
      Type of the parameter, which is Service Class.

   Length
      Length of this TLV in octets, including the Type, Length and
      Pause Time Out fields.

   Priority
      Priority of this service class.

   S ID
      Service ID of this service class.

   Pause Time Out
      Number of byte times for the pause time out of this service
      class.
      The byte time is defined as 8 divided by the link speed in bits
      per second. This provides a normalized metric for the pause time
      out.

   Reserved
      Set to zero on transmission and ignored on reception.

5.0 Encoding of Flow Control Packets

   In Simple Flow Control, the same payload as the IEEE 802.3x [5]
   PAUSE frame is carried. In Service Flow Control, the service IDs of
   the service classes that are required to be paused are carried in a
   single Flow Control packet. To keep these two implementations
   separate, we propose to define two new PPP protocol types.

   Flow control operates at the Link Level. As specified in [4], all
   protocols operating at the Link Level are required to have a
   protocol ID in the range c*** to f***. Hence the new flow control
   protocol values need to be in this range. Appropriate values for the
   Flow Control protocol IDs are sought from IANA. In this document we
   assume that the values 0xc4c1 and 0xc4c2 are assigned to Simple Flow
   Control and Service Flow Control respectively.

    -------------------------------------------------
   | Protocol |          Payload         | optional  |
   | Type     |                          | padding   |
    -------------------------------------------------

                  Fig: Packet format in PPP [4]

5.1 Simple Flow Control Payload

   The payload of the Flow Control packet has the same format as the
   IEEE 802.3x [5] MAC Control frame.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          MacControl           |           PauseTime           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                        40 byte padding                        ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        2 byte padding         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   MacControl
      MAC Control opcode: 0x0001 for a PAUSE frame; all other values
      are ignored.

   Pause Time
      Time to pause traffic. The time is specified in units of 512 bit
      times, as per the IEEE 802.3x specification.

   Padding
      42 bytes of padding (40 + 2 in the figure above), so that the
      payload is compliant with the IEEE 802.3x [5] specification.
      This padding also allows easy conversion of incoming PPP frames
      to the corresponding Ethernet format.

   It is also envisioned that Simple Flow Control mode may be used by
   implementations whose flow control logic is implemented according to
   IEEE 802.3x [5]. For such implementations, the ability to receive
   flow control frames in the same format as IEEE 802.3x [5] is
   important for proper operation.

5.2 Service Flow Control Payload

   The Service Flow Control payload facilitates selective control of
   different flows based on service class. The ability to perform
   selective flow control over WAN links is anticipated to have merit.
   The Service Flow Control packet is encoded in TLV format. A single
   packet may contain one or more service classes that require flow
   control.

   Type
      RESERVED         0
      Service PAUSE    1

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length     |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | S ID  |            Pause Time for Service Class m             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         S ID and Pause Time for Service Class n . . .         |
   ~                          (variable)                           ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type
      Type of this service flow control.

   Length
      Length of this Service Class Flow Control TLV. This includes the
      Type, Length, Reserved and Pause Time fields.

   S ID
      Service ID of the service class.

   Pause Time
      Pause time out for this service class in byte times. A value of
      zero (0) indicates ready to transmit.

   Reserved
      Set to zero on transmission and ignored on reception.

   NOTE: A given Service Flow Control packet may contain one or more
   service classes. Some of these service classes may have a zero pause
   time out.

6.0 Flow Control Generation

   Let us assume there are n thresholds such that
   t0 < t1 < .. < ti < ti+1 < .. < tn.
   Let us assume the corresponding service classes are S0, S1, .., Si,
   .., Sn and the corresponding PAUSE times are P0, P1, .., Pi, .., Pn.

   Assume the input FIFO occupancy has exceeded threshold ti.

   If Simple Flow Control is enabled, generate the Simple Flow Control
   packet if the Simple threshold T < ti. Otherwise continue.

   If Service Flow Control is enabled, generate Service Flow Control
   for the service range {S0,P0} .. {Si-1,Pi-1}.

7.0 Flow Control Processing

   Received flow control packets are passed to the MAC and Classifier
   for flow pausing. The action performed depends on whether the
   received packet is a Simple Flow Control packet or a Service Flow
   Control packet.

   If the flow control packet received is Simple Flow Control and
   Simple Flow Control is enabled, PAUSE all egress flows.

   If the flow control packet received is Service Flow Control and
   Service Flow Control is enabled, then:

      If the pause time is zero (0), clear the pause time and enable
      that flow.

      If the pause time is non-zero, update the local pause time
      counter and stop the flow for that service.

8.0 Flow Control Latency

   There is a finite time between the detection of congestion and the
   activation of flow control at the transmitting node.

                  Peer A                      Peer B
                     |                           |
        Congestion   |<-                         |
        detected     |  ^                        |
                     |  |                        |
        Flow Ctrl    |<-|  Flow Control          |
        Transmitted  |  |  Latency               |
                     |  |                        |
                     |  |                      ->| Flow Control
                     |  |                        | Received
                     |  |                        |
                     |  |                      ->| Flow Control
                     |  |                        | Activated
                     |  |                        |
                     |  V                      ->| Complete Current
                     |                           | Transmission
                     |                           |

9.0 Issues

   There is a finite time between the reception of the flow control
   packet and the activation of the flow control policy. The
   transmitter may continue to transmit during this slack time. This
   can amount to a large quantity of data, especially on very
   high-speed links, and can thereby create an ingress FIFO overflow at
   the receiving node. Hence, flow control thresholds should be
   carefully selected to avoid FIFO overflow and packet loss.
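
   The amount of data that can still arrive during the slack time can
   be estimated as the link rate multiplied by the latency. The sketch
   below is a rough illustration only: the 10 microsecond latency and
   the 64 KB FIFO size are assumed figures, the OC-192 value is the
   nominal line rate rather than the usable payload rate, and the
   function names are hypothetical.

```python
def slack_bytes(link_bps, latency_ns):
    """Bytes that can still arrive during the flow control latency.

    Integer arithmetic, rounded up. The result is also the latency
    expressed in byte times (one byte time = 8 / link speed).
    """
    scaled_bits = link_bps * latency_ns        # bits scaled by 10**9
    return -(-scaled_bits // (8 * 10**9))      # ceiling division


def highest_safe_threshold(fifo_bytes, link_bps, latency_ns):
    """Largest FIFO fill level that still leaves room for the slack."""
    return max(0, fifo_bytes - slack_bytes(link_bps, latency_ns))


if __name__ == "__main__":
    OC192_BPS = 9_953_280_000   # nominal OC-192 line rate
    LATENCY_NS = 10_000         # assumed 10 microsecond latency
    FIFO_BYTES = 65_536         # assumed ingress FIFO size
    print(slack_bytes(OC192_BPS, LATENCY_NS))
    print(highest_safe_threshold(FIFO_BYTES, OC192_BPS, LATENCY_NS))
```

   Under these assumed figures roughly 12 kilobytes can arrive after
   congestion is detected, so the highest pause threshold would need to
   sit at least that far below the top of the FIFO.
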
   In Service Flow Control, oversubscription by a higher service class
   may affect the flow of well-behaved lower service classes.

10.0 Security Considerations

   This document does not discuss the security implications of the new
   flow control frames. The possibility of a denial-of-service attack
   by generating bogus flow control frames is thought to be minimal on
   point-to-point links. This is especially true for optical
   infrastructures such as SONET.

11.0 References

   1  Bradner, S., "The Internet Standards Process -- Revision 3",
      BCP 9, RFC 2026, October 1996.

   2  Bradner, S., "Key words for use in RFCs to Indicate Requirement
      Levels", BCP 14, RFC 2119, March 1997.

   3  Malis, A. and Simpson, W., "PPP over SONET/SDH", RFC 2615,
      June 1999.

   4  Simpson, W., "The Point-to-Point Protocol (PPP)", RFC 1661,
      July 1994.

   5  IEEE Std 802.3x, Institute of Electrical and Electronic
      Engineers, 1997.

12.0 Acknowledgments

   We would like to extend our appreciation to Som Sikdar for inspiring
   us to write this document.

13.0 Author's Addresses

   Tissa Senevirathne
   Force10 Networks
   1440 McCarthy Blvd.
   Milpitas, CA 95035-7438
   Phone: 408-571-3500
   Email: tissa@force10networks.com

   Neena Premmaraju
   Force10 Networks
   1440 McCarthy Blvd.
   Milpitas, CA 95035-7438
   Phone: 408-571-3500
   Email: neena@force10networks.com

Full Copyright Statement

   "Copyright (C) The Internet Society (date). All Rights Reserved.
   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works.
   However, this document itself may not be modified in any way, such
   as by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the
   procedures for copyrights defined in the Internet Standards process
   must be followed, or as required to translate it into