Network Working Group                                             Y. Zha
Internet Draft                                       Huawei Technologies
Intended status: Informational                          October 31, 2016
Expires: April 2017


                 Deterministic Latency Network Framework
                       draft-zha-dln-framework-00

Abstract

   More and more real-time applications, such as VR gaming, high
   frequency trading, and the Internet of Vehicles, require an accurate
   latency guarantee or bound during service provisioning.  Providing
   an End-to-End latency guarantee across the Internet is increasingly
   attractive, yet challenging.  The main problem with latency
   guarantees on the Internet is that packet scheduling is still best-
   effort, and congestion, which further introduces latency
   uncertainty, cannot be avoided.  This document presents a way of
   describing latency information that mostly relies on congestion
   management such as queue scheduling, and a framework that uses such
   information to guarantee End-to-End latency.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html


Zha                     Expires April 30, 2017                  [Page 1]

Internet-Draft                DLN Framework                 October 2016


   This Internet-Draft will expire on April 30, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.
   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction .................................................2
   2. Conventions used in this document ............................3
   3. End-to-End Latency Guarantee Problem .........................3
      3.1. Current QoS framework ...................................4
      3.2. Need of Modeling of Latency .............................5
      3.3. Need of Measurement of Flow Latency .....................6
   4. Latency Information Modeling and Interface Framework .........7
      4.1. Latency Aware PHB .......................................7
      4.2. Latency Aware Interface .................................9
   5. Security Considerations ......................................9
   6. IANA Considerations ..........................................9
   7. Acknowledgments ..............................................9
   8. References ...................................................9
      8.1. Normative References ....................................9
      8.2. Informative References .................................10

1. Introduction

   Latency sensitive applications, such as VR gaming, high frequency
   trading, and the Internet of Vehicles, have become more and more
   popular.  These applications often require a guaranteed or bounded
   End-to-End packet transmission latency; for example, such
   applications may require End-to-End latency bounds as tight as 1
   millisecond to 5 milliseconds.
   Hence, a mechanism that achieves an End-to-End latency guarantee or
   bound for latency sensitive applications is highly desirable
   [I-D.liu-dln-use-cases].

   The well-known End-to-End QoS (Quality of Service) model is DiffServ
   (Differentiated Services) [RFC2474], which defines PHBs (Per Hop
   Behaviors) to provide different, relative service levels (or
   priorities) for different packets.  However, one key factor for
   latency during packet transmission is congestion, and congestion
   management such as packet queue scheduling further introduces
   latency uncertainty.  DiffServ, which only provides differentiated
   service based on packet priority, is not sufficient for applications
   that require guaranteed End-to-End latency.  Therefore, a mechanism
   is desirable that can model packet queue scheduling latency and then
   provide a latency guarantee or bound for a given packet flow at each
   network node.  It is also useful to define an interface to expose
   such latency information to the upper levels of the network, to
   facilitate End-to-End latency guaranteed or bounded service.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].  In this
   document, these words will appear with that interpretation only when
   in ALL CAPS.  Lower case uses of these words are not to be
   interpreted as carrying [RFC2119] significance.

3. End-to-End Latency Guarantee Problem

   End-to-End latency is one of the most important metrics of network
   service provisioning.  Latency sensitive services have become the
   main driving force of deterministic low latency networking, which
   requires an End-to-End latency guarantee or bound.
   Unlike relatively low latency transport, End-to-End guaranteed or
   bounded latency requires knowledge of the worst-case latency
   performance across the network, and the capability of latency
   guarantee at each network node.  Current packet networks are mainly
   best-effort, so an End-to-End latency guarantee or bound is still
   difficult to achieve.  The key issue is that a latency guarantee
   requires a description of the latency capability of the network
   device, in order to further provide a latency guarantee for a given
   traffic flow.  In this section, the challenge of End-to-End latency
   guarantee, as well as the need for a latency description of the
   network device, is presented.

3.1. Current QoS framework

   End-to-End QoS provisioning is the typical approach today to provide
   better performance, including latency, for latency sensitive
   services.  DiffServ [RFC2474] is a well-known QoS mechanism that
   provides differentiated service quality to different classes of
   traffic flows.  The DiffServ model is shown in Figure 1.

                     +-------+
                     |       |------------+
                 +-->| Meter |            |
                 |   |       |-+          |
                 |   +-------+ |          |
                 |             V          V
             +------------+  +--------+  +---------+  +-----------+
             |            |  |        |  | Shaper/ |  |           |
   Packets =>| Classifier |=>| Marker |=>| Dropper |=>| Scheduler |=>
             |            |  |        |  |         |  |           |
             +------------+  +--------+  +---------+  +-----------+

                       Figure 1 - DiffServ Model

   For classifiers, DiffServ uses code-points to describe the service
   level and drop level of packets.  The service level and drop level
   are essentially relative performance descriptions (or priorities)
   for different classes of traffic flows.  One key factor for latency
   is congestion, and congestion management such as packet queue
   scheduling further introduces latency uncertainty.  A deterministic
   latency performance (e.g. guarantee or bound) description is still
   missing, while necessary for latency sensitive traffic flows.
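   To make the gap concrete, consider how a DiffServ classifier uses a
   code-point.  The following Python fragment is a non-normative sketch
   (the table, values and function names are illustrative, not from any
   standard API): the mapping selects a relative class, but nothing in
   it states a quantitative latency bound.

```python
# Illustrative sketch of DSCP-based classification.  The mapping below
# follows common DSCP-to-PHB conventions, but the names and fallback
# rule are assumptions for exposition only.

DSCP_TO_CLASS = {
    46: "EF",    # Expedited Forwarding
    10: "AF11",  # Assured Forwarding, class 1, low drop precedence
    0:  "BE",    # Default / Best Effort
}

def classify(dscp: int) -> str:
    """Return the relative service class for a DSCP value; unknown
    code-points fall back to best effort."""
    return DSCP_TO_CLASS.get(dscp, "BE")

# Note what is missing: the result says "EF" outranks "BE", but not
# how fast an EF packet must leave the node -- no latency bound.
print(classify(46))  # EF
print(classify(7))   # BE
```

   The sketch shows that a code-point is purely a relative label; any
   deterministic latency description would have to be carried
   separately.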
   For traffic conditioning policies, the PHBs of DiffServ are applied,
   including BE (Best Effort), EF (Expedited Forwarding), CS (Class
   Selector) and AF (Assured Forwarding).  These PHBs are designed for
   only a few flow aggregates, which implies that they cannot provide
   differentiated services for a potentially large number of latency
   sensitive traffic flows with different latency requirements.

   Service provisioning (or packet scheduling) policies are decoupled
   from traffic conditioning policies and have not received much
   attention.  For example, when an EF packet arrives at a network
   device whose queue already contains other EF packets, the latency of
   scheduling the EF packet is impacted by the size of the queue.
   Moreover, the order in which different latency aware flows arrive at
   the EF queue has not been considered.  Hence the specific packet
   scheduling policy in a specific network device is important to the
   latency performance of the packet.

   In summary, the current DiffServ model is not sufficient for
   applications that require guaranteed End-to-End latency.  The
   problems are listed below.

   a) Current QoS mechanisms like DiffServ are not well defined to
      support deterministic latency performance.  For example, a
      relative service level or packet priority cannot address the
      congestion factor for latency, nor the congestion management
      (such as packet scheduling) that introduces latency uncertainty.
      Current PHBs can only support a few flow aggregates, which are
      not sufficient for different latency requirements.

   b) There is no user-/device-specific latency performance
      specification, and no control plane mechanism to assign user-
      /device-specific latency requirements to the network devices
      along the path.  As a result, a network device has no idea how
      fast a packet must be forwarded, and cannot adopt a suitable
      mechanism (e.g. queue scheduling) to guarantee latency.
   c) There is no mechanism supporting the measurement of flow latency
      inside a network device, especially given a certain PHB type and
      the code-points of the flow.  Such measurement would make End-to-
      End latency more visible, and is thus crucial for End-to-End
      latency oriented OAM.

   d) Service provisioning (or packet scheduling) policies are not
      specified.  Packet scheduling policy and queue status are also
      key factors of latency and its uncertainty.  Therefore packet
      scheduling policy must be considered to provide deterministic
      latency service for time sensitive flows.

   In a nutshell, how to interpret the QoS value, or how to make sure
   the QoS value can be used to guarantee latency performance, is not
   well defined yet.  Some extension to the current QoS model (e.g. a
   new PHB) could be useful to solve these problems.

3.2. Need of Modeling of Latency

   As mentioned in the problem section, a QoS value or packet priority
   cannot guarantee deterministic low latency.  In other words, the
   same QoS value or priority does not guarantee the same latency
   performance.  In a network device, various forwarding mechanisms and
   interfaces introduce different latencies that may be linked to the
   same priority code.  There is still a lack of latency performance
   information that can be used to provide a latency guarantee service.

   The principle of DiffServ is to provide and describe differentiated
   service for traffic flows, not deterministic latency.  Queuing and
   scheduling, a main source of latency and latency uncertainty, are
   outside DiffServ's major concern.  PHBs provide a standard way of
   modeling device forwarding behavior as well as how each traffic flow
   is handled.  However, the PHB description does not include queuing
   or scheduling information.
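   The dependence of latency on queuing state and link speed, rather
   than on priority alone, can be sketched numerically.  In the
   following illustrative Python fragment (all names and numbers are
   assumptions, not measurements), the same backlog at the same
   priority drains ten times faster on a faster link:

```python
# Illustrative only: worst-case time to drain the backlog ahead of a
# newly arrived packet in a FIFO class queue.  Real devices add
# processing and serialization effects not modeled here.

def worst_case_queue_delay(backlog_bytes: int, link_rate_bps: int) -> float:
    """Seconds needed to transmit backlog_bytes at link_rate_bps."""
    return backlog_bytes * 8 / link_rate_bps

# The same 125 KB backlog, at the same priority, on different links:
d_1ge  = worst_case_queue_delay(125_000, 1_000_000_000)   # 0.001 s on 1GE
d_10ge = worst_case_queue_delay(125_000, 10_000_000_000)  # 0.0001 s on 10GE
print(d_1ge, d_10ge)
```

   A priority code carries none of these inputs, which is why a
   separate latency model of the queuing and scheduling behavior is
   needed.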
   In reality, latency is dominated by the congestion control scheme,
   which is mainly the queuing and scheduling in the network device
   that handles multiple traffic flows arriving at the same port
   simultaneously.  Therefore, an extension to the current QoS model
   (e.g. a new PHB) is desirable as a standard way to describe the
   latency performance of a network device.  With such a standard
   latency model, the network device is enabled to better manage the
   forwarding mechanism (e.g. queue scheduling) for the packet, in
   order to guarantee End-to-End latency for the service.

3.3. Need of Measurement of Flow Latency

   Flow latency measurement is also crucial to make sure the latency
   bound is not violated, and is useful for an End-to-End latency aware
   OAM mechanism.  There is a need to support the measurement of flow
   latency inside a network device, especially given a certain PHB type
   and the code-points of the flow.

   Existing technologies such as OWAMP [RFC4656] and TWAMP [RFC5357]
   are focused on providing one-way and two-way IP performance metrics.
   Latency is one of the metrics that can be used for End-to-End
   deterministic latency provisioning.  Using the OWAMP/TWAMP
   protocols, or extensions to them, to support measurement of flow
   latency performance is feasible.

   Overlay based End-to-End latency measurement is another approach,
   commonly adopted by service providers such as CDN vendors.  Such an
   approach can be further enhanced by latency measurement inside the
   network device, for better service provisioning, e.g. traffic
   steering and path selection.

4. Latency Information Modeling and Interface Framework

   To provide a better End-to-End latency guarantee, an extension of
   the QoS framework is proposed to provide more accurate latency
   information about the network device.  Mechanisms for flow latency
   measurement and latency information exchange are also proposed.
   This work may introduce new interfaces or extensions to existing
   interfaces.

4.1.
Latency Aware PHB

   A latency aware PHB provides latency performance information for a
   network device, with a more accurate description of the latency
   bound of its queuing and scheduling mechanisms.  In addition to the
   classical PHB, which focuses on the classifier, marker and
   shaper/dropper, a latency aware PHB includes queuing and scheduling
   as well.

   The latency factor is mainly introduced by the queuing and
   scheduling algorithm that handles congestion among multiple traffic
   flows.  Congestion latency can be up to hundreds of milliseconds,
   depending on buffer size.  Implementations of various queuing and
   scheduling algorithms and QoS policies introduce latency
   uncertainty.  Moreover, different links cause different latencies;
   for example, 1GE and 10GE links will certainly yield different
   latencies.

   The queue scheduling information model is proposed to describe the
   latency based service capability and capacity of network nodes.  As
   an assigned deterministic latency bound must be guaranteed, the
   network needs to know what kinds of latency based services can be
   provided by a network node for a flow with a specific traffic
   profile.

   For latency based service capability, by analogy with the DiffServ
   design, a differentiated service model called the latency slicing
   model is defined as follows.  A network node provides multiple
   classes of latency-bounded services; if a flow is allocated to a
   service class, then all of its shaped packets can be sent out from
   the network node with a waiting time no more than the predefined
   latency class bound.  Such a latency-bounded service class is called
   a latency slice.  For example, a network node may have multiple
   latency slices with latency bounds of 50us, 100us, 250us, 1ms, etc.
   Notice that the actual maximum flow latency may vary due to the
   acceptance of new flows or the completion of existing flows, but it
   will never exceed the corresponding latency slice bound.
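   The latency slicing idea can be sketched in a few lines of Python.
   This fragment is a non-normative illustration: the slice bounds,
   helper names and the simple burst/rate admission rule are
   assumptions for exposition only, not part of any specified protocol.

```python
from typing import Optional

# Non-normative sketch of latency slices on one node.  Slice bounds,
# names and the admission rule below are illustrative assumptions.

SLICE_BOUNDS_US = [50, 100, 250, 1000]  # per-node latency slices (us)

def pick_slice(required_bound_us: float) -> Optional[int]:
    """Choose the loosest slice whose bound still satisfies the flow's
    requirement, preserving tighter slices for more demanding flows."""
    ok = [s for s in SLICE_BOUNDS_US if s <= required_bound_us]
    return max(ok) if ok else None

def admit(b: int, r: int, max_burst: int, max_rate: int) -> bool:
    """Admission check for a flow with burst b and rate r against a
    slice's remaining capacity (MaxBurst, MaxRate): b <= B and r <= R."""
    return b <= max_burst and r <= max_rate

print(pick_slice(300))  # 250: a 250us slice meets a 300us requirement
print(pick_slice(40))   # None: this node offers no slice under 50us
```

   Choosing the loosest sufficient slice is one plausible policy; a
   real node could equally pack flows by remaining capacity.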
   The maximum packet length cannot exceed a predefined value called
   MaxSDU; otherwise other flows' latency requirements may be violated.
   A latency aware flow with a specific traffic profile can be accepted
   by a latency slice only if accepting it does not cause the latency
   of any existing latency aware flow to violate the bound of its
   latency slice.  For example, when the traffic profile is defined by
   the token bucket model (b, r), two parameters, MaxBurst B and
   MaxRate R, representing the maximum allowable burst size and average
   rate, are introduced to formulate the latency slice service
   capacity.  In this case, a new flow can be accepted as long as
   b <= B and r <= R.  MaxBurst and MaxRate can be adjusted to
   accommodate more latency sensitive flows in a particular latency
   slice, at the cost of a service capacity reduction in other latency
   slices.

   Hiding queue scheduling implementation details, the latency slicing
   information model is shown in Figure 2.

                     +--------------+--------------+
                     | Name         | Elements     |
                     +--------------+--------------+
                     | NodeID       |              |
                     +--------------+--------------+
                     | SliceID      |              |
                     +--------------+--------------+
                     | LatencyBound |              |
                     +--------------+--------------+
                     | MaxSDU       |              |
                     +--------------+--------------+
                     | MaxRate      |              |
                     +--------------+--------------+
                     | MaxBurst     |              |
                     +--------------+--------------+

               Figure 2 - Latency Slice Information Model

   More detail of this latency slice information model is under
   discussion and will be available in the next version.

4.2. Latency Aware Interface

   The queuing and scheduling information is typically local to the
   network device.  In other words, the current hop has no knowledge of
   how the flow was scheduled at the last hop.  It is therefore hard to
   guarantee End-to-End latency if latency information only has local
   effect.
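   If each hop did expose its latency slice bound, an End-to-End check
   would reduce to summing per-hop bounds against the flow's budget.
   The following Python sketch is purely illustrative (the function and
   parameter names are assumptions):

```python
# Illustrative: verify an End-to-End latency budget from exposed
# per-hop worst-case bounds (propagation delay optional).

def path_meets_budget(per_hop_bounds_us, budget_us, propagation_us=0.0):
    """True if the sum of per-hop worst-case bounds (plus propagation,
    if known) does not exceed the End-to-End budget."""
    return sum(per_hop_bounds_us) + propagation_us <= budget_us

# Three hops offering 100us, 250us and 100us slices:
print(path_meets_budget([100, 250, 100], budget_us=1000))  # True
print(path_meets_budget([100, 250, 100], budget_us=400))   # False
```

   This is exactly the kind of computation that becomes possible once
   per-node latency information is no longer local-only.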
   Latency performance information, such as the queuing and scheduling
   information in the network device, needs to be exposed for End-to-
   End latency guaranteed service provisioning.  Basically, more
   information needs to be exchanged than the current DiffServ code-
   points.  An existing approach, [RFC7297], defines the CPP, which
   contains information on connection attributes, latency, loss, and
   so on.

   With the latency performance information defined in the previous
   section, an interface or mechanism to exchange such information is
   proposed.  A simple approach is an interface from the network device
   to a controller, which collects the latency performance information
   of each hop and then decides how to serve the flow at each hop.  In
   this case, the controller tells each hop how to serve the flow, with
   queuing and scheduling information that can be understood by each
   hop.  The detail of the latency aware interface is under discussion
   and will be available in the next version.

5. Security Considerations

   TBD

6. IANA Considerations

   This document has no actions for IANA.

7. Acknowledgments

   This document has benefited from reviews, suggestions, comments and
   proposed text provided by the following members, listed in
   alphabetical order: Jinchun Xu and Hengjun Zhu.

8. References

8.1. Normative References

   [RFC2119] S. Bradner, "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3393] C. Demichelis, "IP Packet Delay Variation Metric for IP
             Performance Metrics (IPPM)", RFC 3393, November 2002.

   [RFC2474] K. Nichols, "Definition of the Differentiated Services
             Field (DS Field) in the IPv4 and IPv6 Headers", RFC 2474,
             December 1998.

   [RFC4656] S. Shalunov, "A One-way Active Measurement Protocol
             (OWAMP)", RFC 4656, September 2006.

   [RFC5357] K. Hedayat, "A Two-Way Active Measurement Protocol
             (TWAMP)", RFC 5357, October 2008.

   [RFC7297] M.
             Boucadair, "IP Connectivity Provisioning Profile (CPP)",
             RFC 7297, July 2014.

8.2. Informative References

   [I-D.finn-detnet-problem-statement]
             Finn, N. and P. Thubert, "Deterministic Networking Problem
             Statement", draft-ietf-detnet-problem-statement-01 (work
             in progress), September 2016.

   [I-D.finn-detnet-architecture]
             Finn, N., Thubert, P., and M. Teener, "Deterministic
             Networking Architecture",
             draft-ietf-detnet-architecture-00 (work in progress),
             September 2016.

   [I-D.liu-dln-use-cases]
             Liu, X., "Deterministic Latency Network Use Cases",
             draft-liu-dln-use-cases-00 (work in progress), October
             2016.

Authors' Addresses

   Yiyong Zha
   Huawei Technologies

   Email: zhayiyong@huawei.com