Internet DRAFT - draft-garcia-simulcast-and-layered-video-webrtc

draft-garcia-simulcast-and-layered-video-webrtc







RTCWEB Working Group                                           G. Garcia
Internet-Draft                                                    TokBox
Intended status: Informational                           August 05, 2013
Expires: February 06, 2014


          Simulcast and layered video coding support in WebRTC
           draft-garcia-simulcast-and-layered-video-webrtc-00

Abstract

   This document describes the use cases and requirements for simulcast
   and layered video coding support in WebRTC.  These techniques
   simplify the implementation of video stream adaptation to different
   participants in centralized conferencing solutions.  This document
   also includes a proposal to expose these capabilities in the existing
   PeerConnection API by defining new media constraint properties.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 06, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.












Garcia                  Expires February 06, 2014               [Page 1]

Internet-Draft     Simulcast and layered video coding        August 2013


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
     1.1.  Browser support status  . . . . . . . . . . . . . . . . .   3
     1.2.  Requirements Language . . . . . . . . . . . . . . . . . .   3
   2.  Use-cases . . . . . . . . . . . . . . . . . . . . . . . . . .   3
     2.1.  Adaptation to devices with different capabilities . . . .   3
     2.2.  Adaptation to participants with different network
           conditions  . . . . . . . . . . . . . . . . . . . . . . .   3
     2.3.  Recording . . . . . . . . . . . . . . . . . . . . . . . .   4
     2.4.  Increasing video quality for active speaker . . . . . . .   4
   3.  Requirements  . . . . . . . . . . . . . . . . . . . . . . . .   4
   4.  Proposed API  . . . . . . . . . . . . . . . . . . . . . . . .   5
     4.1.  Simulcasted streams . . . . . . . . . . . . . . . . . . .   5
     4.2.  Layered video coding  . . . . . . . . . . . . . . . . . .   5
     4.3.  Example . . . . . . . . . . . . . . . . . . . . . . . . .   6
   5.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .   7
   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   7
   7.  Security Considerations . . . . . . . . . . . . . . . . . . .   7
   8.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   7
     8.1.  Normative References  . . . . . . . . . . . . . . . . . .   7
     8.2.  Informative References  . . . . . . . . . . . . . . . . .   7
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   8

1.  Introduction

   Video conferencing using a central server is one of the typical use
   cases for real-time communication capabilities in browsers
   [I-D.ietf-rtcweb-use-cases-and-requirements].

   Most of today's multiparty videoconference solutions make use of
   centralized servers to reduce the bandwidth and CPU consumption in
   the endpoints.  Those servers receive streams from each participant
   and send the streams to rest of the participants, which usually have
   heterogeneous capabilities (screen size, CPU, bandwidth, etc.).  One
   of the biggest issues is how to perform the adaption to different
   participants' constraints with the minimum possible impact on video
   quality and server performance.



Garcia                  Expires February 06, 2014               [Page 2]

Internet-Draft     Simulcast and layered video coding        August 2013


   There are two approaches to adapt the streams to different
   destinations: one is transcoding (sometimes including mixing), and
   the other is switching between multiple streams or sub-streams
   received from the originator.  The first solution is computationally
   expensive and can degrade video quality.  The second solution makes a
   suboptimal use of network resources by sending redundant information,
   and in addition it is codec-specific.

   The requirements and proposed API in this document are based on
   existing JSEP API version and VP8 capabilities.  These are the
   technologies available in existing WebRTC browsers, but this proposal
   could be extended to other codecs or mapped to other APIs.

1.1.  Browser support status

   It is possible to use simulcast with existing WebRTC implementations.
   However, this requires the use of different PeerConnection objects,
   and all streams will have the same resolution.

   Multi-layer encoding is implemented and working in existing WebRTC
   browsers, and it has been tested in prototypes, but currently there
   is no way for developers to enable it.  In VP8 there is support for
   temporal scalability, while VP9 will include more advanced control
   and support for both temporal and spatial scalability.

1.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Use-cases

   The use cases envisioned for these new WebRTC capabilities are
   focused on centralized conferencing solutions.

2.1.  Adaptation to devices with different capabilities

   Some endpoints connected to a centralized conferencing server have
   small screens and do not need to receive high-resolution video, or
   the CPU power and battery consumption make it impossible to receive
   and decode high-resolution video in real-time.

   In this situation, it is desirable to send lower-resolution video to
   those endpoints.

2.2.  Adaptation to participants with different network conditions




Garcia                  Expires February 06, 2014               [Page 3]

Internet-Draft     Simulcast and layered video coding        August 2013


   Some endpoints connected to a centralized conferencing server do not
   have enough available bandwidth to receive high-quality video, while
   other endpoints have enough available bandwidth.

   In this situation is desirable to send lower-bitrate video to those
   endpoints.

2.3.  Recording

   A conferencing server implements recording and wants to record video
   in the highest quality possible, while forwarding it in lower quality
   to endpoints.

2.4.  Increasing video quality for active speaker

   A videoconference application shows the video of the active speaker
   in a larger size than videos of the other participants.

   It is desirable to increase the resolution and quality of that
   highlighted video stream, to maintain the perceived video quality.
   One possible implementation to increase the quality is to have a
   paused high-quality stream that resumes when voice activity is
   detected.

3.  Requirements

   This section contains the requirements for the API exposed in the
   browser, derived from the use-cases in Section 2.

   Requirements on how and when to enable scalable video coding:

   o  REQ-1.  It must be possible to enable and configure the scalable
      video coding before initiating a peer connection.

   o  REQ-2.  It must be possible to enable and configure the scalable
      video coding before answering a peer connection.

   o  REQ-3.  It must be possible to enable/disable and re-configure the
      scalable video coding to update a peer connection.

   Requirements on the parameters that needs to be configurable:

   o  REQ-5.  It must be possible to configure the number of simulcasted
      streams.

   o  REQ-6.  It must be possible to configure the minimum and maximum
      bitrate of each simulcasted stream.




Garcia                  Expires February 06, 2014               [Page 4]

Internet-Draft     Simulcast and layered video coding        August 2013


   o  REQ-7.  It must be possible to configure the resolution of each
      simulcasted stream.

   o  REQ-8.  It must be possible to configure the number of temporal
      layers (1 to 4).  This should be the only mandatory parameter when
      enabling temporal scalability.

   o  REQ-9.  It must be possible to configure the bitrate, frame rate
      decimation factor and membership of frames to layers for each
      temporal layer of the VP8 stream.

   Requirements regarding RTP usage:

   o  REQ-10.  Congestion control must be supported for all the
      simulcasted streams between the configured boundaries (min/max
      bitrate).

   o  REQ-11.  Transmission of simulcasted streams must be signaled and
      negotiated in the SDP and transmitted in RTP sessions, making use
      of existing standard attributes
      [I-D.westerlund-avtcore-multistream-and-simulcast].

   o  REQ-12.  Any endpoint should be prepared to receive VP8 multi-
      layered encoded video not requiring out of band negotiation in
      SDP.

   Non functional requirements:

   o  REQ-13.  The exposed API must be extensible to new codecs or new
      codec parameters.

4.  Proposed API

   The existing solution in the WebRTC API to modify settings of a
   PeerConnection is to use media constraints.  This section defines
   some new media constrains to enable and configure the usage of
   simulcasted and layered video streams.

4.1.  Simulcasted streams

   Simulcast capabilities are codec-agnostic and do not require new
   media constraints.  Existing media constrains for resolution, frame
   rate and bitrate can be reused, but the API needs to support
   receiving a list of them instead of just one.

4.2.  Layered video coding





Garcia                  Expires February 06, 2014               [Page 5]

Internet-Draft     Simulcast and layered video coding        August 2013


   Multi-layer capabilities are codec-dependent.  For VP8, these are the
   configuration parameters exposed in the codec, and that needs to be
   translated to media constraints (the descriptions are taken from VP8
   source code):

   o  tsNumberLayers: This value specifies the number of coding layers
      to be used.

   o  tsTargetBitrate: These values specify the target coding bitrate
      for each coding layer.

   o  tsRateDecimator: These values specify the frame rate decimation
      factors to apply to each layer.

   o  tsPeriodicity: This value specifies the length of the sequence
      that defines the membership of frames to layers.  For example, if
      tsPeriodicity=8 then frames are assigned to coding layers with a
      repeated sequence of length 8.

   o  tsLayerId: This array defines the membership of frames to coding
      layers.  For a 2-layer encoding that assigns even numbered frames
      to one layer (0) and odd numbered frames to a second layer (1)
      with tsPeriodicity=8, then tsLayerId = (0,1,0,1,0,1,0,1).

4.3.  Example

   Example of media constraints to request two simulcasted streams, the
   first one with four temporal layers and default bitrate and the
   second one with a single layer and fixed bitrate.

      {
        video: [{
            width: 640,
            height: 480,
            codecs: {
              vp8: { tsNumberLayers: 4 }
            }
          },
          {
            width: 320,
            height: 240,
            bitrate: { min: 100000, max: 100000 }
          }]
        }
      }






Garcia                  Expires February 06, 2014               [Page 6]

Internet-Draft     Simulcast and layered video coding        August 2013


5.  Acknowledgements

6.  IANA Considerations

   This memo includes no request to IANA.

7.  Security Considerations

   No security implications foreseen.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

8.2.  Informative References

   [I-D.ietf-rtcweb-use-cases-and-requirements]
              Holmberg, C., Hakansson, S., and G. Eriksson, "Web Real-
              Time Communication Use-cases and Requirements", draft-
              ietf-rtcweb-use-cases-and-requirements-11 (work in
              progress), June 2013.

   [I-D.narten-iana-considerations-rfc2434bis]
              Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", draft-narten-iana-
              considerations-rfc2434bis-09 (work in progress), March
              2008.

   [I-D.westerlund-avtcore-multistream-and-simulcast]
              Westerlund, M. and B. Burman, "RTP Multiple Stream
              Sessions and Simulcast", draft-westerlund-avtcore-
              multistream-and-simulcast-00 (work in progress), July
              2011.

   [RFC2629]  Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629,
              June 1999.

   [VP8]      The WebM Project, "VP8 source code", 2013,
              <http://www.webmproject.org/docs/vp8-sdk/>.









Garcia                  Expires February 06, 2014               [Page 7]

Internet-Draft     Simulcast and layered video coding        August 2013


Author's Address

   Gustavo Garcia
   TokBox
   115 Stillman Street
   San Francisco, CA
   US

   Email: gustavo@tokbox.com










































Garcia                  Expires February 06, 2014               [Page 8]