MMUSIC Working Group                                         T. Schierl
Internet Draft
Document: draft-schierl-mmusic-layered-codec-00
Expires: December 2006
                                                              June 2006





                    Signaling media decoding dependency
                    in Session Description Protocol (SDP)

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 18, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2006).


Abstract

This memo defines semantics that allow for signaling the decoding
dependency of different media descriptions with the same media type in
the Session Description Protocol (SDP).  This is required, for example,
if media data resulting from a layered media coding process is
separated and carried in different transport streams.

INTERNET-DRAFT    draft-schierl-mmusic-layered-codec-00       June 2006

To this end, a new grouping type "DDP" (decoding dependency) is defined
for use with the Grouping of Media Lines in the Session Description
Protocol (RFC 3388).  In addition, an attribute is specified that
describes the relationship between the media streams of a "DDP" group,
and attributes for describing media properties are defined.

Schierl                     Standards Track                   [page 2]


Table of Contents

   1.   Introduction.................................................4
   2.   Terminology..................................................4
   3.   Motivation for media dependency signaling....................4
   4.   Generic signaling in SDP for media dependency................5
   4.1.   Design Principles..........................................6
   4.2.   Definitions................................................6
   4.3.   Semantics..................................................7
   4.3.1.  SDP grouping semantics for decoding dependency............7
   4.3.2.  Attribute for dependency signaling per media-stream.......7
   4.3.3.  Attributes for media/operation point description..........8
   5.   Usage of new semantics in SDP................................9
   5.1.1.  General...................................................9
   5.1.2.  Usage with the SDP Offer/Answer Model.....................9
   5.1.3.  Network elements not supporting dependency signaling......9
   5.2.   Examples...................................................9
   6.   Security Considerations.....................................11
   7.   IANA Consideration..........................................11
   8.   Acknowledgements............................................11
   9.   References..................................................11
   9.1.   Normative References......................................11
   9.2.   Informative References....................................12
   10.  Author's Addresses..........................................12
   11.  Intellectual Property Statement.............................12
   12.  Disclaimer of Validity......................................12
   13.  Copyright Statement.........................................13
   14.  RFC Editor Considerations...................................13
   15.  Open Issues.................................................13
   16.  Changes Log.................................................13


1. Introduction

   An SDP session description may contain several media descriptions,
   each identifying one media stream.  A media description is identified
   by one "m=" line.  If more than one "m=" line exists for the same
   media type, a receiver or network element may be unable to identify
   a relationship between those "m=" lines.  This is certainly the case
   if the receiver or network element is not aware of the media-specific
   information that may be carried in the "fmtp:" attribute.
   Relationships such as dependencies between media streams may exist
   for different reasons, for example when bitstream partitions of a
   hierarchical media coding process (also known as a layered media
   coding process) or of a multi description coding (MDC) process are
   transported in different transport streams.  SDP does not allow for
   signaling such relationships.


2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119
   [RFC2119].


3. Motivation for media dependency signaling

   Media descriptions with the same media type may depend on each other
   for various reasons.  The basic idea in all cases is to separate the
   partitions of a media bitstream, for purposes such as increasing
   transport efficiency or allowing scalability in network elements.

   Two types of dependency are explained in more detail in the
   following:

   o Layered/Hierarchical decoding dependency:

   One or more partitions of a layered media bitstream, also known as
   media layers, may be transported in different network streams.  Such
   a scheme is used, for example, for layered multicast transmission,
   where the receiver can select a certain combination of those media
   streams to receive a certain level of quality or bit-rate.  [SDPnew]
   only allows for signaling a range of transport addresses or ports
   for a certain media description, but does not take into account that
   different combinations of the layers of a media bitstream result in
   different operation points (each represented by a layer or a
   combination of layers) of the media bitstream.  These operation
   points may require different media codec specific signaling in the
   "fmtp:" attribute.  Furthermore, the operation points in different
   transport streams may belong to different payload types.
   This may in particular be the case when using the Scalable Video
   Coding (SVC) extensions of H.264/MPEG-4 AVC (payload format
   [SVCpayld]).  The base layer of a media stream of this layered media
   coding standard is plain H.264/MPEG-4 AVC for compatibility with
   legacy receivers; thus the base layer is transported using the
   native payload format of H.264 [RFC3984], whereas enhancement layers
   of such a media stream may be transported using the SVC payload
   format [SVCpayld].

   SVC is used here as an example of a layered media coding standard in
   general.  Audio coding standards can be of a layered nature as well.
   Since layered media coding standards may in general save cost in
   infrastructure for content generation and delivery as well as in
   transmission bandwidth, it is foreseeable that future coding
   standards will also follow the layered design.


   o Equivalent decoding dependency:

   Dependencies between media streams do not necessarily have to be of
   a hierarchical nature, as is the case for layered media.  Media
   streams may also be related with equal relevance.  Either all
   partitions of the media bitstream are required, or at minimum one
   partition is required for successful decoding, i.e. for a valid
   media bitstream to be (re-)constructed at the receiver for decoding.

   An example of partitions of a media bitstream that are of equal
   importance and carried in different transport streams, where all
   partitions must be available for successful decoding: a video coding
   standard that allows for differentiated, high quality coding of the
   color components.  In such a case it may be helpful to transport the
   color components in different transport streams.

   An example of partitions of a media bitstream that are of equal
   importance and carried in different transport streams, where at
   minimum only one partition must be available for successful
   decoding: a multi description coding (MDC) process.  Such a process
   generates equal partitions of a bitstream that can be decoded
   independently, i.e. each partition is a valid media bitstream.  Each
   additionally received partition of the MDC stream may enhance the
   quality of the media.


4. Generic signaling in SDP for media dependency


4.1. Design Principles

   For the separated transport of a media bitstream in different
   transport streams, the media description of SDP is assumed to be the
   only multiplexing point for the transport protocol, i.e. dependency
   signaling is only feasible between media descriptions that are each
   described by an "m=" line and have an assigned media identification
   attribute ("mid") as defined in [RFC3388].
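
   As a non-normative illustration of this principle, the following
   Python sketch reports the identification-tags of a group that do not
   resolve to a media description carrying an "a=mid:" attribute.  The
   function name and the list-based inputs are assumptions made for
   this example only; they are not defined by this specification.

```python
def unresolved_group_tags(group_tags, media_mids):
    """Return the group tags that no media description declares.

    group_tags: identification-tags listed in an "a=group:" line.
    media_mids: one entry per "m=" section, holding its "a=mid:" value
                or None if the section carries no "mid" attribute.
    """
    known = {mid for mid in media_mids if mid is not None}
    return [tag for tag in group_tags if tag not in known]
```

   An empty result indicates that every tag of the group refers to a
   media description that can take part in dependency signaling.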


4.2. Definitions

   Media stream:
   As used in [SDPnew].

   Media bitstream:
   A valid, decodable stream of binary data produced by a media encoder.

   Decoding dependency:
   Partitions of a media bitstream may be separated for transport or
   scalability purposes and must be re-interleaved at the receiver.
   The result of the re-interleaving process is a valid media bitstream.

   Equivalent decoding dependency:
   A media bitstream may be separated into partitions for transport
   purposes, where all partitions of the bitstream are required for
   reconstructing a valid media bitstream.

   Hierarchical/layered coding dependency:
   Partitions of a layered media bitstream can be removed or added for
   scaling the quality of the media and the bit-rate of the stream.
   These partitions of the bitstream have a hierarchical dependency.  A
   partition may depend on one or more other partitions.  The
   dependencies between the layered bitstream partitions form a
   directed graph.

   Operation point:
   A subset of a layered media bitstream, including all partitions
   required for reconstructing a valid media bitstream.  This subset of
   the media represents a certain level or point of quality.  Layers of
   the media bitstream that are not required for decoding the operation
   point do not belong to it.

   Multi description coding (MDC) dependency:
   A bitstream resulting from a multi description coding process can be
   separated into sub-bitstreams, where each of the sub-bitstreams can
   be decoded independently, i.e. each sub-bitstream represents a valid
   media bitstream.  Decoding a larger number of these sub-bitstreams
   in combination may result in higher quality than decoding a smaller
   number of sub-bitstreams.


   Fine Granularity Scalability (FGS):
   This capability of a media bitstream allows for truncation of media
   frames or packets by cutting bytes from the end of a media frame or
   packet for bit-rate and quality reduction/adaptation.  An example
   definition of this feature for video is given in [SVCpayld], but the
   feature is supported by media coding standards for audio as well.
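
   The directed graph of a hierarchical/layered coding dependency and
   the notion of an operation point can be illustrated with a short,
   non-normative Python sketch.  It assumes that the dependencies are
   available as a mapping from each partition to the partitions it
   directly depends on; the function name and the partition labels are
   hypothetical and not defined by this specification.

```python
def operation_point(target, depends_on):
    """Return all partitions needed to decode `target`, itself included.

    depends_on maps a partition label to the labels of the partitions
    it directly depends on (the edges of the directed graph).
    """
    needed = set()
    stack = [target]
    while stack:
        node = stack.pop()
        if node in needed:
            continue  # already visited; also guards against cycles
        needed.add(node)
        stack.extend(depends_on.get(node, ()))
    return needed
```

   With the hypothetical layers "base", "enh1" (depending on "base")
   and "enh2" (depending on "enh1"), the operation point of "enh2"
   consists of all three partitions, while the operation point of
   "base" consists of "base" only.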


4.3. Semantics

4.3.1.    SDP grouping semantics for decoding dependency

   This specification defines the new grouping semantics
   Decoding Dependency "DDP":

   All media streams of a DDP group have the same type of coding
   dependency (as signaled by the attribute defined in 4.3.2) and
   belong to the same media, i.e. one, more, or all media streams of a
   DDP group may be required for reconstructing a valid media
   bitstream.  This group type informs a receiver or a middlebox that
   the streams of the group need to be treated in a similar or the same
   way.  For detailed knowledge about how to treat the streams, the new
   media-level attribute "depend", defined in 4.3.2, SHALL be used.
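
   The following non-normative Python sketch shows one way a receiver
   might locate DDP groups in a session description.  It assumes the
   "a=group:" syntax of [RFC3388] and that the session description is
   given as a list of lines; the function names are hypothetical.

```python
def parse_group_line(line):
    """Split an "a=group:" line into (semantics, [identification-tags])."""
    prefix = "a=group:"
    if not line.startswith(prefix):
        raise ValueError("not a group attribute: %r" % line)
    fields = line[len(prefix):].split()
    if not fields:
        raise ValueError("group attribute without semantics")
    return fields[0], fields[1:]


def ddp_groups(sdp_lines):
    """Return the tag lists of all session-level DDP groups."""
    groups = []
    for raw in sdp_lines:
        line = raw.strip()
        if line.startswith("m="):
            break  # the session-level part ends at the first "m=" line
        if line.startswith("a=group:"):
            semantics, tags = parse_group_line(line)
            if semantics == "DDP":
                groups.append(tags)
    return groups
```

   Groups with other semantics (for example "LS" or "FID") are skipped,
   so the sketch can be applied to session descriptions that mix
   grouping types.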


4.3.2.    Attribute for dependency signaling per media-stream

   This specification defines a new media-level value attribute,
   "depend".  Its formatting in SDP is described by the following ABNF
   [RFC2234].  The "identification-tag" is defined in [RFC3388]:

          depend-attribute     = "a=depend:" dependency-type-tag
                                  *(space identification-tag)
          dependency-type-tag  = dependency
          dependency           = "lay" / "eql" / "mdc"
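
   As a non-normative illustration, the grammar above can be parsed
   with a few lines of Python.  The sketch assumes that "space" stands
   for a single space character and that an identification-tag contains
   no spaces; the function and constant names are hypothetical.

```python
DEPENDENCY_TYPES = ("lay", "eql", "mdc")


def parse_depend(line):
    """Parse an "a=depend:" line into (dependency-type, [tags])."""
    prefix = "a=depend:"
    if not line.startswith(prefix):
        raise ValueError("not a depend attribute: %r" % line)
    fields = line[len(prefix):].split(" ")
    dep_type, tags = fields[0], fields[1:]
    if dep_type not in DEPENDENCY_TYPES:
        raise ValueError("unknown dependency type: %r" % dep_type)
    if any(tag == "" for tag in tags):
        raise ValueError("empty identification-tag in %r" % line)
    return dep_type, tags
```

   The sketch checks only the syntax.  Whether identification-tags are
   required for a given dependency type is a semantic rule and is not
   checked here.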


   The "depend" attribute describes the decoding dependency.  It may be
   followed by a sequence of identification-tag(s) expressing the
   directly related media streams.  The following types of dependency
   are defined:

   o lay:  Layered decoding dependency - signals that the described
   media stream is a partition of a layered media bitstream and MUST
   have the streams identified by the following identification-tag(s)
   available for re-interleaving or re-constructing the valid media
   bitstream.  The identification-tag(s) MUST be present for this type
   of dependency.  Furthermore, the described media stream represents
   one operation point of the layered media bitstream, i.e. all other
   media streams that belong to the same dependency group but are not
   identified by an identification-tag MAY be left out, for scalability
   or transport purposes, for the operation point given by this media
   description.

   o eql:  Equal decoding dependency - signals that the described media
   stream is a partition of a media bitstream and has the same
   importance for decoding as the remaining media streams in the group,
   i.e. all media streams of the group MUST be available for
   re-interleaving or re-constructing a valid media bitstream.  This
   type of dependency does not require signaling of the media streams
   depended upon.

   o mdc:  Multi description decoding dependency - signals that the
   described media stream is a partition of a multi description coding
   (MDC) media bitstream, i.e. at minimum one stream of the group MUST
   be received to allow decoding of the media.  Receiving more than one
   stream of the group may enhance the decodable quality of the media
   bitstream.  This type of dependency does not require signaling of
   the media streams depended upon.
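
   The three dependency types lead to three different availability
   rules, which the following non-normative Python sketch summarizes.
   It assumes that the group membership and the set of received streams
   are known by their identification-tags; the function name and its
   parameters are hypothetical.

```python
def decodable(dep_type, group, received, required=()):
    """Tell whether a valid media bitstream can be re-constructed.

    group:    identification-tags of all streams in the DDP group.
    received: identification-tags of the streams actually received.
    required: for "lay" only, the tag of the described stream plus the
              tags listed in its "depend" attribute.
    """
    available = set(received) & set(group)
    if dep_type == "lay":
        # every stream of the chosen operation point must be available
        return bool(required) and set(required) <= available
    if dep_type == "eql":
        # all streams of the group must be available
        return set(group) <= available
    if dep_type == "mdc":
        # a single stream of the group is sufficient
        return len(available) >= 1
    raise ValueError("unknown dependency type: %r" % dep_type)
```

   For a layered group with the tags 1 to 4, a receiver that has joined
   streams 1 and 2 can decode the operation point described by stream 2
   (which lists stream 1), but not the one described by stream 3.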


4.3.3.    Attributes for media/operation point description

   Currently two media-level attributes are defined for describing the
   media.  These attributes define a property of the media itself or,
   if the media is transported separately, a property of the
   re-constructed operation point of the media description (after
   combining the media streams depended upon into a valid media
   bitstream):

   o "a=resolution:<width> <height>"
   This media-level attribute gives the maximum presentation size of
   the signaled video in pixels.  If the color space components are of
   different resolution, the resolution of the luminance component is
   indicated.  <width> gives the maximum horizontal size and <height>
   gives the maximum vertical size of the video, both in pixels.

   o "a=fgscapability"
   If present, this media-level attribute indicates the so-called 'Fine
   Granularity Scalability (FGS)' capability, i.e. the capability of
   truncating network packets for bit-rate and quality reduction of a
   media stream.  The minimal achievable network packet size SHALL be
   derived from the transport parameters.
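
   A receiver might read these two attributes as in the following
   non-normative Python sketch.  It assumes that <width> and <height>
   are positive decimal integers separated by whitespace, which this
   specification does not state explicitly; the function names are
   hypothetical.

```python
def parse_resolution(line):
    """Parse "a=resolution:<width> <height>" into (width, height)."""
    prefix = "a=resolution:"
    if not line.startswith(prefix):
        raise ValueError("not a resolution attribute: %r" % line)
    parts = line[len(prefix):].split()
    if len(parts) != 2:
        raise ValueError("expected <width> <height> in %r" % line)
    width, height = int(parts[0]), int(parts[1])  # int() may raise ValueError
    if width <= 0 or height <= 0:
        raise ValueError("non-positive size in %r" % line)
    return width, height


def has_fgs_capability(media_lines):
    """Tell whether a media description carries "a=fgscapability"."""
    return any(line.strip() == "a=fgscapability" for line in media_lines)
```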



5. Usage of new semantics in SDP

5.1.1.    General

   A sender and a receiver that use the feature of separating a media
   stream for transport SHALL support the signaling defined in this
   specification.  Using the information about the decoding dependency
   may give a network element more options in treating the media streams
   of a session.  For this purpose, the network element does not need to
   know details about the media (e.g. about the media format
   description), but SHALL use the information defined in this
   specification for treating the media streams.


5.1.2.    Usage with the SDP Offer/Answer Model

   If an Answerer does not understand the decoding dependency signaling,
   it may detect only the 'base' media of a layered media session or
   only one media stream of an MDC media session.  Thus, in both cases,
   an Answerer may not understand the full media description, but may
   be able to request a valid subset of the offered media.  In the
   Equal decoding dependency case, an Answerer may not correctly
   understand the session description.

   If an Offerer is not able to interpret the decoding dependency
   signaling, the Offerer SHALL NOT offer the feature of separating
   media into different transport sessions.


5.1.3.    Network elements not supporting dependency signaling

   Network elements that do not understand the new grouping type, but
   understand grouping in general, MAY detect a general requirement of
   treating the media streams of the group in a certain way.  Network
   elements that do not understand the decoding dependency signaling MAY
   treat all media streams of a session in the same way or MAY use their
   knowledge about the media format description for the treatment of
   media streams, if such knowledge exists.  Receivers that do not
   understand the signaling defined in this specification may detect
   only a subset of the separated media.  Such a receiver may thus not
   understand the full media description, but may be able to understand
   and/or request a subset of the media.


5.2. Examples

   a.)  Example for signaling transport of operation points of a layered
        video bitstream in different transport streams:


          v=0
          o=svcsrv 289083124 289083124 IN IP4 host.example.com
          s=LAYERED VIDEO SIGNALING Seminar
          c=IN IP4 224.2.17.12/127
          t=0 0

          a=group:DDP 1 2 3 4

          m=video 40000 RTP/AVP 94
          b=AS:96
          a=framerate:15
          a=resolution:176 144
          a=rtpmap:94 h264/90000
          a=mid:1
          a=depend:lay

          m=video 40002 RTP/AVP 95
          b=AS:64
          a=framerate:15
          a=resolution:320 240
          a=rtpmap:95 svc1/90000
          a=mid:2
          a=depend:lay 1

          m=video 40004 RTP/AVP 96
          b=AS:128
          a=framerate:30
          a=resolution:320 240
          a=fgscapability
          a=rtpmap:96 svc1/90000
          a=mid:3
          a=depend:lay 1 2

          m=video 40004 RTP/AVP 100
          c=IN IP4 224.2.17.13/127
          b=AS:256
          a=framerate:60
          a=resolution:640 480
          a=rtpmap:100 svc1/90000
          a=mid:4
          a=depend:lay 1 2 3
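
   The following non-normative Python sketch shows how a receiver might
   derive, from example a.), the media streams it has to receive for a
   chosen operation point.  Only the "m=", "a=mid:" and "a=depend:"
   lines of the example are reproduced; the function names and the
   dictionary layout are assumptions made for this illustration.

```python
EXAMPLE_A = """\
m=video 40000 RTP/AVP 94
a=mid:1
a=depend:lay
m=video 40002 RTP/AVP 95
a=mid:2
a=depend:lay 1
m=video 40004 RTP/AVP 96
a=mid:3
a=depend:lay 1 2
m=video 40004 RTP/AVP 100
a=mid:4
a=depend:lay 1 2 3
"""


def media_sections(sdp_text):
    """Collect "mid" and "depend" for every media description."""
    sections = []
    current = None
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            current = {"m": line, "mid": None, "depend": None}
            sections.append(current)
        elif current is None:
            continue  # session-level line, not needed for this sketch
        elif line.startswith("a=mid:"):
            current["mid"] = line[len("a=mid:"):]
        elif line.startswith("a=depend:"):
            fields = line[len("a=depend:"):].split()
            if fields:
                current["depend"] = (fields[0], fields[1:])
    return sections


def streams_to_receive(sections, mid):
    """Return, in SDP order, the mids needed for the stream `mid`."""
    by_mid = {section["mid"]: section for section in sections}
    dep_type, tags = by_mid[mid]["depend"]
    if dep_type != "lay":
        raise ValueError("only layered dependency is handled here")
    wanted = set(tags) | {mid}
    return [s["mid"] for s in sections if s["mid"] in wanted]
```

   For the operation point described by the media stream with mid 3,
   streams_to_receive(media_sections(EXAMPLE_A), "3") yields the mids
   1, 2 and 3; the stream with mid 4 may be left out.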


   b.)  Example for signaling transport of streams of a multi
        description coding (MDC) video bitstream in different transport
        streams.  An example for signaling Equal decoding dependency
        ("eql") is very similar and is omitted for that reason:

          v=0
          o=mdcsrv 289083124 289083124 IN IP4 host.example.com
          s=MULTI DESCRIPTION VIDEO SIGNALING Seminar
          c=IN IP4 224.2.17.12/127
          t=0 0

          a=group:DDP 1 2 3
          m=video 40000 RTP/AVP 94
          a=mid:1
          a=depend:mdc

          m=video 40002 RTP/AVP 95
          a=mid:2
          a=depend:mdc

          m=video 40004 RTP/AVP 96
          c=IN IP4 224.2.17.13/127
          a=mid:3
          a=depend:mdc


6. Security Considerations


7. IANA Consideration


8. Acknowledgements

   Funding for the RFC Editor function is currently provided by the
   Internet Society.


9. References

9.1. Normative References

[SDPnew]     M. Handley, V. Jacobson, and C. Perkins, "SDP: Session
             Description Protocol", IETF work in progress, January
             2006.
[RFC3388]    G. Camarillo, J. Holler, and H. Schulzrinne, "Grouping of
             Media Lines in the Session Description Protocol (SDP)",
             RFC 3388, December 2002.
[RFC2119]    Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2234]    Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
             Specifications: ABNF", RFC 2234, November 1997.





9.2. Informative References

[SVCpayld]   Wenger, S., Wang, Y.-K., Schierl, T., "RTP Payload Format
             for SVC Video", draft-wenger-avt-rtp-svc-02.txt,
             June 2006.
[RFC3984]    Wenger, S., Hannuksela, M., Stockhammer, T.,
             Westerlund, M., Singer, D., "RTP Payload Format for H.264
             Video", RFC 3984, February 2005.


10.  Author's Addresses

   Thomas Schierl                       Phone: +49-30-31002-227
   Fraunhofer HHI                       Email: schierl@hhi.fhg.de
   Einsteinufer 37
   D-10587 Berlin
   Germany


11.  Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


12.  Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


13.  Copyright Statement

   Copyright (C) The Internet Society (2006).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

14.  RFC Editor Considerations

   none

15.  Open Issues

- This draft is written on the assumption that the media description
("m=" line) is the one and only multiplexing point.  If payload type
multiplexing (as used in draft-ietf-avt-rtp-retransmission-12) is to be
used with the signaling defined in this draft, a completely different
approach, not based on the grouping semantics, may be needed.
- A reference to a layered audio codec is missing.
- A more detailed description of MDC is needed.  Examples for MDC are
missing, since no standard is available?  MDC audio codecs?
- FGS capability signaling may be extended.
- Description of media operation points may be extended.
- An example for "Equal decoding dependency" is missing.


16.  Changes Log