RMCAT WG                                                    I. Johansson
Internet-Draft                                                 Z. Sarker
Intended status: Informational                               Ericsson AB
Expires: April 30, 2015                                 October 27, 2014

              Self-Clocked Rate Adaptation for Multimedia
                  draft-johansson-rmcat-scream-cc-03

Abstract

   This memo describes a rate adaptation framework for conversational
   video services.  The solution conforms to the packet conservation
   principle and uses a hybrid loss and delay based congestion control
   algorithm.  The framework is evaluated both over simulated bottleneck
   scenarios and in an LTE (Long Term Evolution) system simulator, and
   is shown to achieve both low latency and high video throughput in
   these scenarios.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 30, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Johansson & Sarker           Expires April 30, 2015
Internet-Draft                    SCReAM                    October 2014

Table of Contents

   1.  Introduction
     1.1.  Wireless (LTE) access properties
   2.  Terminology
   3.  The adaptation framework
     3.1.  Congestion control
     3.2.  Transmission scheduling
     3.3.  Media rate control
   4.  Detailed description
     4.1.  Network congestion control
       4.1.1.  Congestion window update
         4.1.1.1.  Initial steps
         4.1.1.2.  Loss event is detected
         4.1.1.3.  If in_exponential_start = true and no loss event
                   detected
         4.1.1.4.  If in_exponential_start = false and no loss event
                   detected
         4.1.1.5.  Fairness enforcement
         4.1.1.6.  Final CWND adjustment step
         4.1.1.7.  Competing flows compensation, adjustment of
                   owd_target
       4.1.2.  Transmission scheduling
         4.1.2.1.  Transmission decision
         4.1.2.2.  Next transmission attempt
     4.2.  Video rate control
       4.2.1.  Frame skipping
       4.2.2.  Rate change
         4.2.2.1.  Reduce rate
         4.2.2.2.  Increase rate
   5.  Conclusion
   6.  Open issues
   7.  Acknowledgements
   8.  IANA Considerations
   9.  Security Considerations
   10. Change history
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Introduction

   Rate adaptation is an important part of interactive realtime
   communication because the transmission channel bandwidth may vary
   over time.  Wireless access such as LTE (Long Term Evolution), which
   is an integral part of the current Internet, increases the importance
   of rate adaptation as the channel bandwidth of a default LTE bearer
   [QoS-3GPP] can change considerably in a very short time frame.  Thus
   a rate adaptation solution for interactive realtime media, such as
   WebRTC, in an LTE system must be both quick and able to operate over
   a large span of available channel bandwidth.  This memo describes a
   solution that borrows the self-clocking principle of TCP and combines
   it with a new delay based rate adaptation algorithm, LEDBAT
   [RFC6817].  Because neither TCP nor LEDBAT was designed for
   interactive realtime media, a few extra features are needed to make
   the concept work well within this context.  This memo describes these
   extra features.

1.1.  Wireless (LTE) access properties

   [I-D.draft-sarker-rmcat-cellular-eval-test-cases] introduces the
   complications that can be observed in wireless environments.
   Wireless access such as LTE typically cannot guarantee a given
   bandwidth; this is especially true for default bearers.  The network
   throughput may vary considerably, for instance when the wireless
   terminal is moving around.

   Unlike wireline bottlenecks with large statistical multiplexing, it
   is not possible to try to maintain a given bitrate when congestion is
   detected in the hope that other flows will yield, because there are
   generally few other flows competing for the same bottleneck.  Each
   user gets its own variable throughput bottleneck, where the
   throughput depends on factors like channel quality, load and
   historical throughput.  The bottom line is that if the throughput
   drops, the sender has no other option than to reduce the bitrate.  In
   addition, the grace time, i.e. the allowed reaction time from the
   time that congestion is detected until a rate reduction is effected,
   is generally very short, in the order of one RTT (Round Trip Time).

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC2119 [RFC2119].

3.  The adaptation framework

   The adaptation framework has similarities to concepts like TFWC
   [TFWC].  One important property is self-clocking and compliance with
   the packet conservation principle.  The packet conservation principle
   is described as an important key factor behind the protection of
   networks from congestion [FACK].

   The packet conservation principle is realized by including a vector
   of the sequence numbers of received packets in the feedback from the
   receiver back to the sender.  The sender keeps a list of transmitted
   packets and their respective sizes.  This information is then used to
   determine how many bytes can be transmitted.
   A congestion window puts an upper limit on how many bytes can be in
   flight, i.e. transmitted but not yet acknowledged.  The congestion
   window is determined in a way similar to LEDBAT [RFC6817].  This
   ensures that the e2e latency is kept low.  The basic functionality is
   quite simple; there are however a few steps to take to make the
   concept work with conversational media.  These are briefly described
   in Section 3.1 to Section 3.3.

   The feedback is sent over RTCP [RFC3550] and is based on [RFC4585].
   It is implemented as a transport layer feedback message, see the
   proposed example in Figure 1.  The feedback control information part
   (FCI) consists of the following elements.

   o  Timestamp: A timestamp value indicating when the last packet was
      received, which makes it possible to compute the one way (extra)
      delay (OWD).

   o  The ACK list (Highest received sequence number + ACK vector):
      Makes it possible to detect lost packets and determine the number
      of bytes in flight.

   o  Source quench bit (Q): Makes it possible to request the sender to
      reduce its congestion window.  This is useful if WebRTC media is
      received from many hosts and it becomes necessary to balance the
      bitrates between the streams.  The exact behavior and use for the
      source quench bit is T.B.D.

   o  ECN (Explicit Congestion Notification) echo: Makes it possible to
      indicate if packets are ECN-CE (ECN Congestion Experienced)
      marked.  The use for the ECN echo bits is T.B.D.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|   FMT   |       PT      |          length               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  SSRC of packet sender                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  SSRC of media source                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Timestamp (32bits)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Highest recv. seq. nr. (16b)  |   ECN echo    |Q|R|R|R|R|R|R|R|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        ACK vector (32b)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 1: Transport layer feedback message

   To make the feedback as frequent as possible, the feedback packets
   are transmitted as reduced size RTCP according to [RFC5506].

   The timestamp clock time base is typically set to the same time base
   as the media source in question but as the protocol described here is
   not dependent on the media it can be set to a fixed value defined in
   this specification.  The ACK vector is here a bit vector that
   indicates the reception of the last 1+32 = 33 RTP packets.

   Section 4 describes the main algorithm details and how the feedback
   is used.
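   As an illustration, the FCI part of the feedback message in Figure 1
   could be parsed as in the Python sketch below.  The field widths are
   read off the figure (32-bit timestamp, 16-bit highest received
   sequence number, 8-bit ECN echo, the Q bit followed by 7 reserved
   bits, 32-bit ACK vector).  The function name parse_fci and the
   mapping of ACK vector bit i (least significant bit first) to sequence
   number highest-1-i are assumptions made for this example; the memo
   does not define the bit order.

```python
import struct


def parse_fci(fci: bytes) -> dict:
    """Parse the 12-byte FCI of Figure 1 (network byte order assumed)."""
    timestamp, highest_sn, ecn_echo, flags, ack_vector = struct.unpack(
        "!IHBBI", fci[:12]
    )
    # Q is assumed to be the first (most significant) bit after ECN echo.
    source_quench = bool(flags & 0x80)

    # The highest received sequence number is always acknowledged.
    acked = [highest_sn]
    # Assumption: bit i of the ACK vector covers sequence number
    # highest_sn - 1 - i, with 16-bit wrap-around.
    for i in range(32):
        if ack_vector & (1 << i):
            acked.append((highest_sn - 1 - i) & 0xFFFF)

    return {
        "timestamp": timestamp,
        "highest_sn": highest_sn,
        "ecn_echo": ecn_echo,
        "source_quench": source_quench,
        "acked": acked,
    }
```

   With this bit mapping, a sequence number in the 33 packet window that
   is missing from the returned list is a candidate lost packet, subject
   to the reordering margin discussed in Section 4.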
       ----------------------------        ----------------------------
       |      Video encoder       |        |      Video encoder       |
       ----------------------------        ----------------------------
        ^                |       ^          ^                |       ^
     (1)|             (2)|    (3)|       (1)|             (2)|    (3)|
        |               RTP      |          |               RTP      |
        |                V       |          |                V       |
        |          ------------- |          |          ------------- |
   -----------     |           |--     -----------     |           |--
   | Rate    | (4) |   Queue   |       | Rate    | (4) |   Queue   |
   | control |<----|           |       | control |<----|           |
   |         |     |RTP packets|       |         |     |RTP packets|
   -----------     |           |       -----------     |           |
                   -------------                       -------------
                         |                                   |
                         ---------------        --------------
                                    (5)|        |(5)
                                      RTP      RTP
                                       |        |
                                       v        v
            --------------          ----------------
            |  Network   |   (8)    | Transmission |
            | congestion |<-------->|  scheduler   |
            |  control   |          |              |
            --------------          ----------------
                  ^                        |
                  |  (7)                   |(6)
                  ---------RTCP---------- RTP
                                 |         |
                                 |         v
                                -------------
                                |    UDP    |
                                |  socket   |
                                -------------

                  Figure 2: Rate adaptation framework

   Figure 2 shows the functional overview of the adaptation framework.
   Each media type or source implements rate control and a queue, where
   RTP packets containing encoded media frames are temporarily stored
   for transmission.  The figure shows the details for the case where
   two video sources are used.  Video frames are encoded and forwarded
   to the queue (2).  The media rate adaptation adapts to the age of the
   oldest RTP frame in the queue and controls the video bitrate (1).  It
   is also possible to make the video encoder skip frames and thus
   temporarily reduce the frame rate if the queue age exceeds a given
   threshold (3).  The RTP packets are picked from each queue based on
   some defined priority order or simply in a round robin fashion (5).
   A transmission scheduler takes care of the transmission of RTP
   packets, to be written to the UDP socket (6).
   In the general case all media must go through the packet scheduler
   and is allowed to be transmitted if the number of bytes in flight is
   less than the congestion window.  However, audio frames can be
   allowed to be transmitted regardless, as audio is typically low
   bitrate and thus contributes little to congestion; this is however
   left as an implementation choice.  RTCP packets are received (7) and
   the information about bytes in flight and congestion window is
   exchanged between the network congestion control and the transmission
   scheduler (8).

   The rate adaptation solution consists of three parts: congestion
   control, transmission scheduling and media rate adaptation.

3.1.  Congestion control

   The congestion control sets an upper limit on how much data can be in
   the network (bytes in flight); this limit is called CWND (congestion
   window) and is used in the transmission scheduling.

   A congestion control method, similar to LEDBAT [RFC6817], measures
   the OWD (one way delay).  The congestion window is allowed to
   increase if the OWD is below a predefined target, otherwise the
   congestion window decreases.  The delay target is typically set to
   50-100ms.  This ensures that the OWD is kept low on average.  The
   reaction to loss events is similar to that of loss based TCP, i.e. an
   instant reduction of CWND.

   LEDBAT is designed with file transfers as its main use case, which
   means that the algorithm must be modified somewhat to work with
   rate-limited sources such as video.  The modifications are

   o  Congestion window validation techniques.  These are similar in
      action to the method described in [I-D.ietf-tcpm-newcwv].

   o  Fast start for bitrate increase.  It makes the video bitrate ramp
      up within 5 to 10 seconds.  The behavior is similar to TCP
      slowstart.  The fast start is exited when congestion is detected.

   o  Adaptive delay target.  This helps the congestion control to
      compete with FTP traffic to some degree.
3.2.  Transmission scheduling

   Transmission scheduling limits the output of data, given by the
   relation between the number of bytes in flight and the congestion
   window, similar to TCP.  Packet pacing is used to mitigate issues
   with coalescing that may cause increased jitter in the media traffic.

3.3.  Media rate control

   The media rate control serves to adjust the media bitrate to ramp up
   quickly enough to get a fair share of the system resources when link
   throughput increases.

   The reaction to reduced throughput must be prompt in order to avoid
   getting too much data queued up in the sender frame queues.  The
   queuing delay is determined and the media bitrate is decreased if it
   exceeds a threshold.

   In cases where the sender frame queues increase rapidly, such as in
   the case of a RAT (Radio Access Technology) handover, it may be
   necessary to implement additional actions, such as discarding of
   encoded video frames or frame skipping, in order to ensure that the
   sender frame queues are drained quickly.  Frame skipping means that
   the frame rate is temporarily reduced.  Discarding of old video
   frames is a more efficient way to reduce media latency than frame
   skipping but it comes with a requirement to repair codec state; frame
   skipping is thus preferred as a first remedy.

4.  Detailed description

   This section describes the algorithm in more detail.  It is split
   between the network congestion control and the video rate adaptation.

4.1.  Network congestion control

   This section explains the network congestion control.  It contains
   two main functions

   o  Computation of congestion window: Gives an upper limit to the
      number of bytes in flight, i.e. how many bytes have been
      transmitted but not yet acknowledged.
   o  Transmission scheduling: RTP packets are transmitted if allowed by
      the relation between the number of bytes in flight and the
      congestion window.

   Unlike TCP, SCReAM is not a byte oriented protocol; rather, it is an
   RTP packet oriented protocol.  Thus it keeps a list of transmitted
   RTP packets and their respective sending times (wall-clock time).
   The feedback indicates the highest received RTP sequence number and a
   timestamp (wall-clock time) when it was received.  In addition, an
   ACK list is included to make it possible to determine lost packets.

4.1.1.  Congestion window update

   The actions taken when an acknowledgement is received are described
   below.

4.1.1.1.  Initial steps

   Bytes in flight (bytes_in_flight) is computed as the sum of the sizes
   of the RTP packets ranging from the RTP packet most recently
   transmitted up to but not including the acknowledged packet with the
   highest sequence number.  As an example: If the RTP packet with
   sequence number SN was transmitted and the last ACK indicated SN-5 as
   the highest received sequence number, then bytes in flight is
   computed as the sum of the sizes of the RTP packets with sequence
   numbers SN-4, SN-3, SN-2, SN-1 and SN.

   The congestion window is computed from the one way (extra) delay
   estimates (OWD) that are obtained from the send and receive
   timestamps of the RTP packets.  LEDBAT [RFC6817] explains the details
   of the computation of the OWD.  An OWD sample is obtained for each
   received acknowledgement.  No smoothing of the OWD samples occurs;
   however, some smoothing occurs naturally as the computation of the
   CWND is in itself a low pass filter function.

   A variable bytes_newly_acked depicts the number of bytes that were
   acknowledged with the last received acknowledgement.
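   The bytes in flight computation above can be sketched in Python as
   follows.  This is a minimal sketch: the function name, the dictionary
   that maps sequence number to packet size, and the handling of 16-bit
   sequence number wrap-around are assumptions made for the example.

```python
def bytes_in_flight(sent_sizes, last_sent_sn, highest_acked_sn):
    """Sum the sizes of packets sent after the highest acknowledged one.

    sent_sizes maps RTP sequence number -> packet size in bytes
    (a hypothetical representation of the sender's packet list).
    """
    # Number of packets after highest_acked_sn, modulo 2^16.
    count = (last_sent_sn - highest_acked_sn) & 0xFFFF
    total = 0
    for k in range(count):
        total += sent_sizes[(last_sent_sn - k) & 0xFFFF]
    return total
```

   With SN = 100, a highest acknowledged sequence number of 95 and a
   packet size of 1200 bytes, the function sums packets 96 to 100, which
   matches the example in the text (SN-4 to SN).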
   owd_mem is an EWMA (Exponentially Weighted Moving Average) filtered
   OWD

      owd_mem = max(owd_mem*0.5 + owd*0.5, owd_mem*0.9 + owd*0.1)

   The OWD fraction is computed as

      owd_fraction = owd/owd_target

   where owd_target is the target (extra) delay.  owd_target is
   typically set to owd_target_lo=0.1s but can in certain cases increase
   to owd_target_hi=0.4s.  The OWD fraction is sampled every 50ms and
   the last 20 samples are stored in a vector (owd_fraction_hist).  This
   vector is used in the computation of an OWD trend that gives a value
   between 0.0 and 1.0 depending on how close to congestion it gets.
   The OWD trend is calculated as follows.

   Let R(owd_fraction_hist,K) be the autocorrelation function of
   owd_fraction_hist at lag K.  The 1st order prediction coefficient is
   formulated as

      a = R(owd_fraction_hist,1)/R(owd_fraction_hist,0)

   The prediction coefficient a has positive values if OWD shows an
   increasing trend; thus one gets an indication of congestion before
   the OWD target is reached.  The prediction coefficient is further
   multiplied with owd_fraction to reduce sensitivity to increasing OWD
   when OWD is small.  The OWD trend is thus computed as

      owd_trend = max(0.0,min(1.0,a*owd_fraction))

   The owd_trend is utilized in the media rate control and to determine
   when to exit slow start.  An EWMA filtered version of owd_trend is
   computed

      owd_trend_ewma = max(owd_trend,
                           owd_trend_ewma*(1.0-alpha) + alpha*owd_trend)

      alpha = (t_now-t_cwnd_update_prev) / 5000.0

   t_now is the current wall clock time.

   owd_fraction_avg is a lowpass filtered version of owd_fraction

      owd_fraction_avg = 0.9*owd_fraction_avg + 0.1*owd_fraction

   An off target value is computed as

      off_target = (owd_target - owd) / owd_target

   CWND is updated differently depending on whether the congestion
   control is in fast start or not and if a loss event is detected.
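   The OWD trend computation can be illustrated with the Python sketch
   below.  The memo only names the autocorrelation function R; the
   sketch assumes the plain (unnormalized) sum of lagged products over
   the stored samples, without removing the mean, and returns 0.0 when
   the history is all zero.  The function names are hypothetical.

```python
def autocorr(x, lag):
    """Unnormalized autocorrelation of the sequence x at the given lag."""
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag))


def owd_trend(owd_fraction_hist, owd_fraction):
    """owd_trend = max(0.0, min(1.0, a*owd_fraction)), a = R(1)/R(0)."""
    r0 = autocorr(owd_fraction_hist, 0)
    if r0 == 0.0:
        # Assumption: no delay observed at all means no trend.
        return 0.0
    a = autocorr(owd_fraction_hist, 1) / r0
    return max(0.0, min(1.0, a * owd_fraction))
```

   The multiplication with owd_fraction is what keeps the trend small
   while the OWD is far below the target, and the clamping keeps the
   result in the range 0.0 to 1.0 used by the later steps.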
   A Boolean variable in_exponential_start (initialized to true)
   indicates if the congestion control is in fast start.

   A loss event indicates one or more lost RTP packets within an RTT.
   This is detected by inspecting the acknowledgements for holes in the
   sequence number space, with some margin for possible packet
   reordering in the network.

4.1.1.2.  Loss event is detected

   If a loss event is detected then in_exponential_start is set to false
   and CWND is updated according to

      cwnd = max(min_cwnd,cwnd*0.8)

   where min_cwnd = 2*mss.  Otherwise the CWND update continues.

4.1.1.3.  If in_exponential_start = true and no loss event detected

   in_exponential_start is set to false if owd_trend >= 0.2; otherwise
   CWND is updated according to

      cwnd = cwnd + bytes_newly_acked

4.1.1.4.  If in_exponential_start = false and no loss event detected

   Values of off_target > 0.0 indicate that the congestion window can be
   increased.  This is done according to the equations below (mss is the
   maximum RTP packet size).

      gain = gain_up*(1.0 + max(0.0, 1.0 - owd_trend/0.2))

      cwnd += gain * off_target * bytes_newly_acked * mss / cwnd

   Values of off_target <= 0.0 indicate congestion.  CWND is then
   updated according to the equation

      cwnd += gain_down*off_target*bytes_newly_acked*mss/cwnd

4.1.1.5.  Fairness enforcement

   Fairness enforcement is realized by reducing the congestion window by
   a fraction when a number of conditions are met.  They are

   o  owd_target < owd_target_lo*1.2 i.e. no competing flows are
      compensated for

   o  owd_trend > 0.1 i.e.
      congestion is detected

   o  more than t_delta since the congestion window was reduced the last
      time

   t_delta is computed as

      t_delta = 0.1*min(200.0, max(20.0, 50.0e6/max_paced_bitrate))

   The bitrate is taken into account in the sense that the lower the
   bitrate, the more sparse the reductions in congestion window get.

   If the above conditions are met then cwnd is adjusted according to

      cwnd *= 0.8

4.1.1.6.  Final CWND adjustment step

   The congestion window is limited by the maximum number of bytes in
   flight over the last 1.0 seconds according to

      cwnd = min(cwnd, max_bytes_in_flight*max_bytes_in_flight_head_room)

   where max_bytes_in_flight_head_room = 1.1.  This avoids possible
   over-estimation of the throughput after, for example, idle periods.

   Finally cwnd is set to ensure that it is at least min_cwnd

      cwnd = max(cwnd, min_cwnd)

4.1.1.7.  Competing flows compensation, adjustment of owd_target

   In certain cases it becomes necessary to increase owd_target.  One
   such case is where SCReAM competes with TCP based file transfer over
   a tail drop bottleneck link and the TCP congestion avoidance is loss
   based (for example Cubic or NewReno).

   The technique is to inhibit video long enough to make bytes in flight
   reach zero (no remaining RTP packets in flight) and then resume
   video.  For the unfortunate case that the last RTP packet was lost,
   it is necessary to force video to resume after 1.0s as bytes in
   flight will never reach zero in this case.  This interruption is
   typically in the order of one RTT.  Once video is resumed the average
   OWD (owd_avg_c_flow) is computed over the first 5 acknowledgements
   after video is resumed.  If no competing flows exist then this
   average should be close to zero, otherwise owd_avg_c_flow has a value
   that corresponds roughly to the queuing delay caused by the competing
   flow.  The owd_target is updated according to the value of
   owd_avg_c_flow.
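   The congestion window update steps of Section 4.1.1.2 to
   Section 4.1.1.4, followed by the final adjustment of Section 4.1.1.6,
   can be sketched in Python as shown below.  This is a minimal sketch
   under stated assumptions: the function name and argument order are
   hypothetical, gain_up and gain_down are left as caller-supplied
   parameters because this memo does not give their values, and the
   fairness enforcement of Section 4.1.1.5 is omitted for brevity.

```python
def update_cwnd(cwnd, mss, bytes_newly_acked, owd, owd_target, owd_trend,
                in_exponential_start, loss_event, gain_up, gain_down,
                max_bytes_in_flight, head_room=1.1):
    """Return the new (cwnd, in_exponential_start) after one ACK."""
    min_cwnd = 2 * mss

    if loss_event:
        # Section 4.1.1.2: leave fast start and back off.
        in_exponential_start = False
        cwnd = max(min_cwnd, cwnd * 0.8)
    elif in_exponential_start:
        # Section 4.1.1.3: exit fast start on a rising delay trend,
        # otherwise grow by the number of newly acknowledged bytes.
        if owd_trend >= 0.2:
            in_exponential_start = False
        else:
            cwnd = cwnd + bytes_newly_acked
    else:
        # Section 4.1.1.4: delay based increase or decrease.
        off_target = (owd_target - owd) / owd_target
        if off_target > 0.0:
            gain = gain_up * (1.0 + max(0.0, 1.0 - owd_trend / 0.2))
        else:
            gain = gain_down
        cwnd += gain * off_target * bytes_newly_acked * mss / cwnd

    # Section 4.1.1.6: cap by recent bytes in flight, then apply the floor.
    cwnd = min(cwnd, max_bytes_in_flight * head_room)
    cwnd = max(cwnd, min_cwnd)
    return cwnd, in_exponential_start
```

   The order of the two final steps matters: applying the floor last
   guarantees that cwnd never drops below min_cwnd, even after an idle
   period with very few bytes in flight.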
   The method above is executed if more than a given time (e.g. 20
   seconds) has passed since the last time video was inhibited and
   either of the two conditions below is fulfilled

   o  owd_mem > owd_target

   o  owd_target > owd_target_lo

   The first condition indicates that a competing flow is possibly
   driving higher queuing delays in the network.  The second condition
   indicates that the OWD target has been increased and that it should
   be determined if it can be lowered.

   Once owd_avg_c_flow is computed the owd_target is adjusted.  The
   adjustment action depends on the value of owd_avg_c_flow

   o  If owd_avg_c_flow > owd_target_lo/2: Adjust the owd_target upwards
      according to owd_target = min(owd_target_hi, max(owd_target,
      owd_avg_c_flow*3.0))

   o  If owd_avg_c_flow <= owd_target_lo/2: Adjust the owd_target
      downwards according to owd_target = 0.5*owd_target +
      0.5*max(owd_target_lo, owd_avg_c_flow).  Furthermore owd_target is
      set to owd_target_lo if it is less than owd_target_lo*1.2.

4.1.2.  Transmission scheduling

   An RTP packet transmission attempt is scheduled at intervals given by
   t_pace, which depends on the estimated throughput, the RTT and the
   size of the last transmitted RTP packet.  This provides packet
   pacing, which is in some cases necessary in order to break up
   coalescing tendencies which can otherwise cause unwanted extra jitter
   or packet loss.

4.1.2.1.  Transmission decision

   The principle is to allow transmission of an RTP packet only if the
   number of bytes in flight is less than the congestion window.  There
   are however two reasons why this strict rule will not work optimally

   o  Bitrate variations.
      The video frame size always varies to a larger or smaller extent.
      A strict rule such as the one given above makes it difficult for
      the video bitrate to increase, as the congestion window puts too
      hard a restriction on the video frame size variation.  This can
      further lead to occasional queuing of RTP packets in the RTP
      packet queue, which prevents bitrate increase because of the
      increased queuing delay.

   o  Reverse (feedback) path congestion.  Especially in transport over
      buffer-bloated networks, the one way delay in the reverse
      direction may jump due to congestion.  The effect of this is that
      the acknowledgements are delayed, with the result that the
      self-clocking is temporarily halted, even though the forward path
      is not congested.

   Transmission of an RTP packet of size rtp_size is thus allowed when
   any of the following conditions is met.

   o  If owd > owd_target: Transmission is allowed if bytes_in_flight +
      rtp_size <= cwnd.  This enforces a strict rule that helps to
      prevent further queue buildup.

   o  If owd <= owd_target: A helper variable
      x_cwnd=1.0+bytes_in_flight_slack*max(0.0,
      min(1.0,1.0-owd_trend/0.5))/100.0 is computed.  Transmission is
      allowed if bytes_in_flight+rtp_size <= max(cwnd*x_cwnd, cwnd+mss).
      This gives a slack that reduces as congestion increases.
      bytes_in_flight_slack is a maximum allowed slack in percent.  A
      large value such as 100% increases the robustness to bitrate
      variations in the source and congested feedback channel issues.
      The possible drawback is increased delay or packet loss when
      forward path congestion occurs.  Recommended values are 20 to 50%.

4.1.2.2.  Next transmission attempt

   The interval until the next transmission attempt (t_pace) is set to
   0.001s if no RTP packet was transmitted according to the decision in
   the previous section.
   Otherwise it is calculated as

      max_paced_bitrate = max(50000, cwnd*8/s_rtt)

      t_pace = rtp_size*8/max_paced_bitrate

4.2.  Video rate control

   The video rate control is based on the queuing delay in the RTP
   packet queue and loss events.  The video rate control function is
   executed for each video frame.  The actual video rate adjustment may
   however be less frequent.  The main reason is that there is typically
   a lag between the bitrate request and the actual bitrate from the
   video coder, and this lag can be as much as 1 second.  This makes it
   less efficient to try to react to congestion with prompt rate
   adjustments.  The solution is to complement the rate reduction with
   frame skipping in order to keep the RTP queuing delay limited.

   The queuing delay is sampled every frame period and the last N_a
   samples are stored in a vector age_vec.  An average queuing delay is
   computed as a weighted sum over the samples in age_vec.  age_avg at
   the current time instant n is computed as

      age_avg(n) = SUM age_vec(n-k)*w(k)

   The sum is computed over k=[0..N_a-1].  w(k) are weight factors
   arranged to give the most recent samples a higher weight.

   N_a, i.e. the number of samples that age_avg is computed over,
   depends on how slow the video encoder is to respond to video rate
   change requests.  With a slow video encoder N_a is suggested to be
   set to N_a = 1.0/frame_period where frame_period is the video frame
   interval; 1.0 corresponds roughly to the time constant in the video
   coder rate control loop (1.0s).  If the video encoder is quicker to
   react to bitrate changes, N_a can be set to a lower value such as
   N_a = 5.

   age_avg is used for rate adjustment instead of the current value; the
   reason is to avoid bitrate reduction because of temporary delay
   spikes.
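   The weighted average age_avg can be illustrated with the Python
   sketch below.  The memo does not specify the weight factors w(k); the
   linearly decreasing weights normalized to sum to 1.0 used here are
   purely an assumption for the example, as are the function names.

```python
def linear_weights(n_a):
    """Hypothetical weights w(k), k = 0..n_a-1, largest for k = 0."""
    total = n_a * (n_a + 1) / 2
    return [(n_a - k) / total for k in range(n_a)]


def age_avg(age_vec, weights):
    """age_avg(n) = SUM age_vec(n-k)*w(k) over k = 0..N_a-1.

    age_vec[-1] is the most recent queuing delay sample, age_vec(n).
    age_vec must hold at least len(weights) samples.
    """
    return sum(age_vec[-1 - k] * weights[k] for k in range(len(weights)))
```

   With N_a = 3 the assumed weights are 1/2, 1/3 and 1/6, so a single
   delay spike of 60ms in the most recent sample raises age_avg to only
   30ms.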
   The video rate control thus combines slower rate adjustments with
   adjustments of the temporal frame rate by means of raw frame skipping
   on a shorter time scale.  This is an adaptation to SCReAM as it works
   best when it has data to send because of its self-clocking
   properties.  The concept also avoids very large rate reductions due
   to isolated delay spikes.

   The change in age_avg is computed as

      age_d = (age_avg(n) - age_avg(n-1))/frame_period

4.2.1.  Frame skipping

   Frame skipping is controlled by a flag frame_skip which, if set to 1,
   dictates that the video coder should skip the next video frame.  The
   frame skipping intensity at the current time instant n is computed as

   o  If age_d > 0 and age_avg > frame_period: The frame skip intensity
      is computed as frame_skip_intensity = min(1.0,
      (age_vec(n)-frame_period)/(2*frame_period))

   o  Otherwise frame skip intensity is set to zero

   Note that the frame skipping intensity is computed based on the
   current value of the queuing delay.  Furthermore, frame skipping is
   enabled only if the average queue delay increases and is large
   enough.

   The frame_skip flag is set depending on three variables

   o  frame_skip_intensity

   o  since_last_frame_skip, i.e. the number of consecutive frames
      without frame skipping

   o  consecutive_frame_skips, i.e. the number of consecutive frame
      skips

   The flag frame_skip is set to 1 if any of the conditions below is
   met.

   o  age_vec(n) > 0.2 && consecutive_frame_skips < 5

   o  frame_skip_intensity < 0.5 && since_last_frame_skip >=
      1.0/frame_skip_intensity

   o  frame_skip_intensity >= 0.5 && consecutive_frame_skips <
      (frame_skip_intensity-0.5)*10

   The arrangement makes sure that no more than 4 frames are skipped in
   sequence.  The rationale is to ensure that the input to the video
   encoder does not change too much, something that may give poor
   prediction gain.

4.2.2.  Rate change

   A variable target_bitrate is adjusted depending on the congestion
   state.  The target bitrate can vary between a minimum value
   (target_bitrate_min) and a maximum value (target_bitrate_max).

   First of all, if a new loss event was indicated, the target_bitrate
   is updated as below and the rate change procedure is exited.

      target_bitrate = max(0.9*target_bitrate, target_bitrate_min)

   If no loss event was indicated then the rate change procedure
   continues.  Based on age_avg(n) and the time span since the last rate
   reduction, a rate reduction condition is determined.  This is
   evaluated differently depending on whether an ideal video coder is
   simulated for algorithm evaluation purposes or if the algorithm is
   executed in a real implementation with a video coder that lags behind
   in the rate adjustment.

   o  Ideal mode: reduce_rate = age_avg(n) > frame_period/2 &&
      t_now-t_last_rate_change >= rate_change_interval &&
      t_now-t_last_rate_reduction > 0.5

   o  Non-ideal mode: reduce_rate = age_avg(n) > frame_period*2 &&
      t_now-t_last_rate_change >= rate_change_interval &&
      t_now-t_last_rate_reduction > video_coder_time_constant

   rate_change_interval is set to 0.1s.  video_coder_time_constant is
   set to a value that approximates the lag in the video coder rate
   change.

4.2.2.1.  Reduce rate

   If reduce_rate evaluates to true then the bitrate is reduced.

   First an inflection point is determined for later rate increase

      target_bitrate_i = target_bitrate * 0.95

   In addition, a restore point is determined for the case that false
   congestion was detected, for instance as an effect of congestion in
   the feedback path.
      target_bitrate_restore_point = target_bitrate

   A few variables are updated for future use:

      t_last_rate_change = t_now
      max_owd_fraction = max(max_owd_fraction, owd_fraction_avg)

   A rate reduction factor is determined:

      alpha = min(0.5, max(0.0, 0.9*age_d))

   If alpha > 0.0, the target bitrate and t_last_rate_reduction are
   updated according to

      target_bitrate = max(target_bitrate_min,
         target_bitrate*(1.0-alpha))
      t_last_rate_reduction = t_now

4.2.2.2.  Increase rate

   A rate increase is allowed if two conditions are met:

   o  t_now-t_last_rate_change >= rate_change_interval

   o  age_avg(n) <= frame_period/2

   First, the target bitrate is restored if false congestion was
   detected. This restoration is allowed if more than 2.0s have passed
   since the last loss event and target_bitrate_restore_point > 0.0.
   Further, if the additional condition

      do_restore = max_owd_fraction < 0.4 && owd_trend_ewma < 0.2

   evaluates to true, then the target bitrate is restored as

      target_bitrate = max(target_bitrate,
         target_bitrate_restore_point)

   Regardless of whether do_restore evaluates to true or false,
   target_bitrate_restore_point is set to -1.0 and max_owd_fraction is
   set to 0.0.

   The target bitrate is then increased. The rate of increase depends
   on whether the algorithm is in slow start or not, as indicated by
   the variable in_exponential_start.

4.2.2.2.1.  If in_exponential_start = true

   The bitrate increment is computed as

      increment = target_bitrate_max*
         (rate_change_interval/ramp_up_time_fast)*
         (1.0-min(1.0, owd_trend/0.1))

      target_bitrate = min(target_bitrate_max,
         target_bitrate+increment)

   The increment is scaled by rate_change_interval/ramp_up_time_fast so
   that the target bitrate can reach the highest bitrate within
   ramp_up_time_fast seconds if no congestion is detected. A
   recommended value for ramp_up_time_fast is 10.0s.

4.2.2.2.2.
If in_exponential_start = false

   The maximum allowed increment of the target bitrate is computed as

      increment_max = target_bitrate*0.2

   A variable gain factor is computed in a number of steps. First, the
   gain factor is reduced if the target bitrate is close to the
   inflection point, i.e., the target bitrate when congestion was last
   detected:

      gain = max(0.2, min(1.0,
         abs((target_bitrate-target_bitrate_i)/target_bitrate_i)*4.0))

   Furthermore, the gain is reduced if near (or past) congestion is
   detected:

      gain *= min(1.0, max(0.0, (1.0-owd_trend_ewma)))

   The gain is increased if competing (potentially aggressive) flows
   are detected, which is indicated by owd_target/owd_target_lo > 1.0:

      gain *= owd_target/owd_target_lo

   A ramp-up time is computed that is adjusted depending on the
   estimated congestion level:

      ramp_up_time = ramp_up_time_fast+
         (ramp_up_time_slow-ramp_up_time_fast)*
         max(0.0, min(1.0, owd_trend_ewma/0.2))

   A recommended value for ramp_up_time_slow is 20.0s.

   The increment is computed and the target_bitrate is updated:

      increment = min(
         target_bitrate_max*gain*rate_change_interval/ramp_up_time,
         increment_max)

      target_bitrate = min(target_bitrate_max,
         target_bitrate+increment)

5.  Conclusion

   This memo describes a congestion control framework for RMCAT that is
   particularly good at handling the quickly changing conditions in
   wireless networks such as LTE. The solution conforms to the packet
   conservation principle and leverages novel congestion control
   algorithms and recent TCP research, together with a media bitrate
   determined by sender queuing delay and given delay thresholds. The
   solution has shown potential to meet the goals of high link
   utilization and prompt reaction to congestion. The solution is
   realized with a new RFC4585 transport layer feedback message.

6.  Open issues

   A list of open issues.
   o  Describe use of Q bit

   o  Describe how clock drift compensation is done

   o  RTCP AVPF mode: determine whether AVPF early or immediate mode is
      preferable

   o  Determine format and use of ECN echo field

7.  Acknowledgements

   We would like to thank the following persons for their comments,
   questions and support during the work that led to this memo: Markus
   Andersson, Bo Burman, Tomas Frankkila, Laurits Hamm, Hans Hannu,
   Nikolas Hermanns, Stefan Haekansson, Erlendur Karlsson, Mats
   Nordberg, Jonathan Samuelsson, Rickard Sjoeberg, Magnus Westerlund.

8.  IANA Considerations

   A new RFC4585 transport layer feedback message needs to be
   standardized.

9.  Security Considerations

   The feedback can be vulnerable to attacks similar to those that can
   affect TCP. It is therefore recommended that the RTCP feedback is at
   least integrity protected.

10.  Change history

   A list of changes:

   o  -02 to -03: Added algorithm description with equations, removed
      pseudo code and simulation results

   o  -01 to -02: Updated GCC simulation results

   o  -00 to -01: Fixed a few bugs in example code

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J.
              Rey, "Extended RTP Profile for Real-time Transport
              Control Protocol (RTCP)-Based Feedback (RTP/AVPF)",
              RFC 4585, July 2006.

   [RFC5506]  Johansson, I. and M. Westerlund, "Support for
              Reduced-Size Real-Time Transport Control Protocol
              (RTCP): Opportunities and Consequences", RFC 5506,
              April 2009.

   [RFC6817]  Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
              "Low Extra Delay Background Transport (LEDBAT)",
              RFC 6817, December 2012.

11.2.
Informative References

   [FACK]     "Forward Acknowledgement: Refining TCP Congestion
              Control", 2006.

   [I-D.alvestrand-rmcat-congestion]
              Holmer, S., Cicco, L., Mascolo, S., and H. Alvestrand,
              "A Google Congestion Control Algorithm for Real-Time
              Communication", draft-alvestrand-rmcat-congestion-02
              (work in progress), February 2014.

   [I-D.draft-sarker-rmcat-cellular-eval-test-cases]
              Sarker, Z., "Evaluation Test Cases for Interactive
              Real-Time Media over Cellular Networks".

   [I-D.ietf-tcpm-newcwv]
              Fairhurst, G., Sathiaseelan, A., and R. Secchi,
              "Updating TCP to support Rate-Limited Traffic",
              draft-ietf-tcpm-newcwv-07 (work in progress),
              September 2014.

   [QoS-3GPP] TS 23.203, 3GPP, "Policy and charging control
              architecture", June 2011.

   [TFWC]     University College London, "Fairer TCP-Friendly
              Congestion Control Protocol for Multimedia Streaming",
              December 2007.

Authors' Addresses

   Ingemar Johansson
   Ericsson AB
   Laboratoriegraend 11
   Luleae 977 53
   Sweden

   Phone: +46 730783289
   Email: ingemar.s.johansson@ericsson.com

   Zaheduzzaman Sarker
   Ericsson AB
   Laboratoriegraend 11
   Luleae 977 53
   Sweden

   Phone: +46 761153743
   Email: zaheduzzaman.sarker@ericsson.com
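
   As a non-normative illustration, the frame skipping rules of
   Section 4.2.1, the rate reduction of Section 4.2.2.1 and the rate
   increase of Section 4.2.2.2.2 might be sketched as below. This is a
   minimal sketch and not the authors' implementation. The function
   names and the module-level constants are hypothetical, and two
   details are assumptions that the text does not state: the frame skip
   intensity is clamped to be non-negative, and the second skip
   condition is only evaluated for a strictly positive intensity so
   that the division is well defined.

```python
RATE_CHANGE_INTERVAL = 0.1  # seconds, value given in Section 4.2.2
RAMP_UP_TIME_FAST = 10.0    # seconds, recommended value
RAMP_UP_TIME_SLOW = 20.0    # seconds, recommended value


def frame_skip_intensity(age_d, age_avg, age_now, frame_period):
    """Section 4.2.1: non-zero only if the queue delay grows and is large."""
    if age_d > 0 and age_avg > frame_period:
        raw = (age_now - frame_period) / (2 * frame_period)
        # Assumption: clamp at 0.0; the memo only gives the upper bound.
        return max(0.0, min(1.0, raw))
    return 0.0


def skip_frame(intensity, age_now, since_last_frame_skip,
               consecutive_frame_skips):
    """Section 4.2.1: True if any of the three listed conditions holds."""
    if age_now > 0.2 and consecutive_frame_skips < 5:
        return True
    # Assumption: require intensity > 0 to avoid a division by zero.
    if 0.0 < intensity < 0.5 and since_last_frame_skip >= 1.0 / intensity:
        return True
    return (intensity >= 0.5
            and consecutive_frame_skips < (intensity - 0.5) * 10)


def reduce_target_bitrate(target_bitrate, target_bitrate_min, age_d):
    """Section 4.2.2.1: scale the rate down by alpha in [0.0, 0.5]."""
    alpha = min(0.5, max(0.0, 0.9 * age_d))
    return max(target_bitrate_min, target_bitrate * (1.0 - alpha))


def increase_target_bitrate(target_bitrate, target_bitrate_i,
                            target_bitrate_max, owd_trend_ewma,
                            owd_target, owd_target_lo):
    """Section 4.2.2.2.2: increase when in_exponential_start is false."""
    increment_max = target_bitrate * 0.2
    # Lower gain close to the inflection point.
    distance = abs((target_bitrate - target_bitrate_i) / target_bitrate_i)
    gain = max(0.2, min(1.0, distance * 4.0))
    # Lower gain when near (or past) congestion.
    gain *= min(1.0, max(0.0, 1.0 - owd_trend_ewma))
    # Higher gain when competing flows have raised owd_target.
    gain *= owd_target / owd_target_lo
    ramp_up_time = RAMP_UP_TIME_FAST + (
        (RAMP_UP_TIME_SLOW - RAMP_UP_TIME_FAST)
        * max(0.0, min(1.0, owd_trend_ewma / 0.2)))
    increment = min(
        target_bitrate_max * gain * RATE_CHANGE_INTERVAL / ramp_up_time,
        increment_max)
    return min(target_bitrate_max, target_bitrate + increment)
```

   The functions are kept free of state so that each formula can be
   checked in isolation. A caller would still need to maintain
   since_last_frame_skip, consecutive_frame_skips and the timers
   t_last_rate_change and t_last_rate_reduction as described in
   Section 4.2.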