Internet Engineering Task Force                Audio Visual Transport WG
INTERNET-DRAFT               C.Guillemot, P.Christ, S.Wesner, A. Klemets
draft-guillemot-genrtp-01.txt  INRIA / Univ. Stuttgart - RUS / Microsoft
                                                           June 25, 1999
                                              Expires: December 24, 1999

RTP Payload Format for MPEG-4 with Scaleable & Flexible Error Resiliency

STATUS OF THIS MEMO

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

This document describes a payload format which can be used for the
transport of both MPEG-4 Elementary Streams (ES) and MPEG-4 Sync Layer
packet streams in RTP [1] packets. The payload format allows for
protection against loss in a generic way, through fragmentation,
grouping and extension data mechanisms, which can dynamically adapt to
network conditions. These mechanisms can operate both on full and
partial MPEG-4 Access Units, such as Sync Layer packets or typed
"segments". These mechanisms can cover a broad range of protection
schemes and avoid extra connection management complexity - e.g. for
separate FEC channels - in MPEG-4 applications with a high number of
streams.

Table of Contents

1   Introduction
2   MPEG-4 overview
3   Design Considerations
4   Payload Format specification
4.1 RTP Header Usage
4.2 Payload Header
5   Examples of payload headers
5.1 The payload contains Extension data followed by one object
    containing AU data
5.2 The payload contains Extension data followed by 2 AUs
6   Usage of Extension data field for redundant data
7   Extension data field for FEC data
7.1 Extension data field for Parity Codes
7.2 Extension data field for RS-based Unequal Error Protection
    of typed segments
8   Multiplexing
9   Security Considerations
10  Authors Addresses
11  References

List of Figures

Figure 1  Architecture
Figure 2  Example of ESI
Figure 3  Sample RTP payload, using the payload format
Figure 4  Portrait of the unified approach for transport of ES and
          SL packetized streams
Figure 5  Sample RTP payload for SL-PDU transport
Figure 6  RTP Payload Example 1
Figure 7  RTP Payload Example 2
Figure 8  FEC Header for Parity Codes
Figure 9  Simplified FEC Header for Parity Codes (with default masks)
Figure 10 FEC Header for Reed-Solomon Codes
Figure 11 Example of Interleaving (for P=7)
Figure 12 Example of data organization for RS-based UEP

1 Introduction

This document is motivated by the large variety of MPEG-4 compressed
streams, and by the large variety of error control mechanisms that can
be applied to them. In addition to having a unique payload format for
both MPEG-4 Elementary Streams (ES) and Synchronization Layer packet
streams (SL-PDU Streams), another motivation is flexibility in
associating error control mechanisms with the compressed media streams.
The error control mechanisms can be dynamically adapted to network
characteristics and to different types of stream segments. They can
evolve without having to define a new payload format.

The design of this payload format has been inspired by previous
proposals for generic payload formats [2-3]. Additionally, it attempts
to federate different error control approaches under a single protocol
support mechanism.

The rationale for this payload format consists of:

- Generality - a unified approach for both MPEG-4 ES and MPEG-4 sync
  layer packet streams - with simple fragmentation and grouping
  mechanisms.

- Protection against packet loss with a generic protocol support. The
  mechanism could also be used for adding protection against bit
  errors, in the case of IP over wireless. If used for protection
  against packet loss, this in-band mechanism avoids the extra
  connection management complexity possibly brought by separate FEC
  channels. Indeed, in MPEG-4 applications, the number of streams can
  potentially be high.
- Flexible support of a range of error control mechanisms, from no
  protection to FEC and redundant data, which can be adapted and
  applied to typed segments and to network characteristics. Typed
  segments are partial Access Units (AUs), i.e. segments that are - in
  terms of the encoding syntax - syntactically and semantically
  meaningful parts of an AU (cf. [4], 7.2.3: "Such partial AUs may have
  significance for improved error resilience"). Access Units are the
  smallest entities in the bitstream that can be attributed individual
  timestamps. Redundant data, in the sense of [5] or of [6-8] (e.g. in
  the form of repeated picture headers, or of the HEC field of the
  MPEG-4 video syntax [9]), can be supported by a single mechanism.

- A common solution for "live" and "VOD" (or "pre-recorded") content.

The list of all the protection schemes supported will be announced via
out-of-band signaling at the beginning of the session, using for
example SDP [10]. The protection scheme used at a specific instant
during the session will be signaled via the extension type (XT) field
in the payload header.

2 MPEG-4 overview

An MPEG-4 scene is composed of media objects. The MPEG-4 dynamic-scene
description framework, which defines the spatio-temporal relation of
the media objects as well as their contents, is inspired by VRML. The
compressed binary representation of the scene description is called
BIFS (Binary Format for Scenes) [4]. The compressed scene description
is conveyed through one or more Elementary Streams (ES).

A compression layer produces the compressed representations of the
audio-visual objects that will be inserted into the scene. These
compressed representations are organized into Elementary Streams (ES).
Elementary Stream Descriptors provide information relative to the
stream, such as the compression scheme used.
Elementary stream data is partitioned into Access Units. The
delineation of an Access Unit is completely determined by the entity -
the compression layer - that generates the elementary stream. An Access
Unit is the smallest data entity to which timing information can be
attributed. Two Access Units shall never refer to the same point in
time.

Natural and animated synthetic objects may refer to an Object
Descriptor (OD), which points to one or more Elementary Streams that
carry the coded representation of the object or its animation data. An
OD serves as a grouping of one or more Elementary Stream Descriptors
that refer to a single media object. The OD also defines the
hierarchical relations and properties of the Elementary Stream
Descriptors.

A complete set of ODs can be seen as an MPEG-4 resource or session
description. The Object Descriptors are conveyed through one or more
Elementary Streams. By conveying the session (or resource) description
as well as the scene description through their own Elementary Streams,
it becomes possible to change portions of scenes and/or properties of
media streams separately and dynamically at well-known instants of
time.

The MPEG-4 Systems specification [4] also defines a packetization of ES
data into access units or parts thereof. The packets are called SL
packets, or SL-PDUs. The resulting sequence of SL packets is called the
SL-Packetized Stream (SPS). Access Units are the only semantic entities
at this layer and their content is opaque. Packetization information
has to be exchanged between the entity that generates an elementary
stream and the sync layer. This relation is best described by a
conceptual interface between both layers, termed the Elementary Stream
Interface (ESI).

An SL packet (SL-PDU) consists of an SL packet header and an SL packet
payload. The SL packet header provides means for continuity checking in
case of data loss and carries the coded representation of the time
stamps and associated information. This syntax is configurable to adapt
to the needs of different types of elementary streams and is defined in
the SLConfigDescriptor (as defined in [4]).

An SL-PDU does not contain an indication of its length. Therefore, SL
packets must be framed by a suitable lower layer protocol.
Consequently, an SL-PDU stream is not a self-contained data stream that
can be stored or decoded without such framing.

3 Design Considerations

The design goals of this RTP payload format are to provide the
following:

- a unified solution, with error protection easily adaptable to varying
  network conditions, for both "live" and "pre-recorded" contents.

- a unified solution for the transport of SL packet streams - with a
  possible 1-to-N mapping - and for the transport of robust ES data.

Figure 1 shows the adopted model. It relies on an optional network
adaptation layer, which supports protection mechanisms. Ideally, this
network adaptation layer is both media and network aware.

The compression layer organizes the ESs in Access Units (AU). The AUs
are the smallest entities that can be attributed individual timestamps.
The timestamps may be obtained directly, through the ESI, with syntax
as specified by the SLConfigDescriptor. If the SLConfigDescriptor
indicates that timestamps are absent, the timestamps may be obtained
indirectly, for example by using the frame rate.

The compression layer passes full or partial Access Units (i.e. typed
"segments"), together with indications of AU boundaries, random access
points and desired timing information as described by the
SLConfigDescriptor, directly to the network adaptation layer or
indirectly via the sync layer.
It is however preferable, for implementation efficiency, to pass the ES
data directly to the network adaptation layer, i.e. to avoid producing
the full SL packets. Partial AUs or typed segments are - in terms of
the encoding syntax - syntactically and semantically meaningful parts
of an AU (cf. [4], 7.2.3: "Such partial AUs may have significance for
improved error resilience").

[Figure 1: block diagram. A media-aware Compression Layer passes the ES
Type, RAP Flag and QoS information of the ES Descriptor across the ESI
to a network-aware Network Adaptation Layer (redundancy, FEC). The
SLConfigDescriptor spans both layers, and a QoS monitoring block is
attached to the Network Adaptation Layer. The output consists of the
RTP header, the extension data (e.g. FEC), the "SL" header and the
media data.]

Figure 1 Architecture

Figure 2 lists parameters that should be passed along with the ES data.
The SLConfigDescriptor indicates the presence or absence of each
parameter. When any of these parameters are present, the adaptation
layer will directly produce the "stripped down" SL header to be
inserted in the payload of the RTP packet.

Note that the normative behavior is assured by the SLConfigDescriptor,
which is visible in the compression layer.

   DTS: Decoding Time Stamp
   CTS: Composition Time Stamp
   OCR: Object Clock Reference
   IdleFlag
   loop(randomAccessFlag
        AUStartFlag
        AUEndFlag
        Esdata
        dataLength
        degradationPriority
        segmentType
   )

Figure 2 Example of ESI.
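
As a rough illustration, the parameters of Figure 2 could be carried by
a record such as the following Python sketch. The class names EsiUnit
and EsiSegment, the snake_case field names and the is_complete_au
helper are hypothetical; only the parameter list comes from Figure 2.
Representing a parameter that the SLConfigDescriptor marks as absent by
None is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EsiSegment:
    """One iteration of the loop in Figure 2: a full AU or a typed segment."""
    random_access_flag: bool
    au_start_flag: bool
    au_end_flag: bool
    es_data: bytes
    degradation_priority: Optional[int] = None  # None: absent (assumption)
    segment_type: Optional[int] = None          # None: absent (assumption)

    @property
    def data_length(self) -> int:
        # dataLength is derived from the data rather than stored separately.
        return len(self.es_data)


@dataclass
class EsiUnit:
    """Parameters handed over the ESI together with the ES data."""
    dts: Optional[int] = None   # Decoding Time Stamp
    cts: Optional[int] = None   # Composition Time Stamp
    ocr: Optional[int] = None   # Object Clock Reference
    idle_flag: bool = False
    segments: List[EsiSegment] = field(default_factory=list)

    def is_complete_au(self) -> bool:
        # Hypothetical helper: the segments span a whole AU when the first
        # one starts the AU and the last one ends it.
        return (bool(self.segments)
                and self.segments[0].au_start_flag
                and self.segments[-1].au_end_flag)
```

A unit built from two typed segments, the first flagged as AU start and
the second as AU end, would report is_complete_au() as true.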
The payload format also specifies a mechanism for grouping an AU, a
partial AU or an SL-PDU together with protection data (FEC, redundant
data). This mechanism makes it possible to adapt the protection of the
different typed segments, or SL-PDUs, to varying network conditions
during the session, as well as to a degradation priority indicated by
the SLConfigDescriptor. The grouping mechanism can be used for grouping
SL-PDUs with different SL header parameters (CTS, DTS, etc.). This
mechanism also allows several AUs to be grouped, with possibly
non-monotonically increasing time stamps, in a single packet. The
grouping mechanism can also be used for grouping low bit rate data
streams with low delay requirements, such as facial animation
parameters. The mechanism can also be used for interleaving data in
order to increase the error resiliency.

Consecutive segments (e.g. video packets [9]) of the same type will be
packed consecutively in the same RTP payload without using the grouping
mechanism. The grouping mechanism will be used to group partial AUs (or
typed segments) of different types only if UEP - Unequal Error
Protection - is used (see section 7.2).

The payload format also supports a fragmentation mechanism where the
full AUs or the partial AUs passed by the compression layer are
fragmented at arbitrary boundaries. This may result in fragments that
are not independently decodable. This kind of fragmentation may be used
in situations when the RTP packets are not allowed to exceed the
path-MTU size. However, this media-unaware fragmentation is not
recommended. It is preferable that the compression layer provides
partial AUs, in the form of typed segments, of a size small enough so
that the resulting RTP packet can fit the MTU size.
Note that passing partial AUs of small size will also facilitate
congestion and rate control based on the real output buffer management.
RTP packets that transport fragments belonging to the same AU will have
their RTP timestamp set to the same value.

The protocol support for fragmentation and grouping is inspired by
[2-3], with an attempt at simplification.

4 Payload Format specification

The packet will consist of an RTP header followed by possibly multiple
payloads.

4.1 RTP Header Usage

Each RTP packet starts with a fixed RTP header. The following fields of
the fixed RTP header are used:

- Marker bit (M bit): The marker bit of the RTP header is set to 1 when
  the current packet carries the end of an access unit (AU), or the
  last fragment of an AU.

- Payload Type (PT): The payload type shall be set to a value assigned
  to this format, or a payload type in the dynamic range should be
  chosen.

- Timestamp: The RTP timestamp encodes the presentation time of the
  first AU contained in the packet. The RTP timestamp may be the same
  on successive packets if an AU occupies more than one packet. If the
  packet contains only "extension" data objects (see below), then the
  RTP timestamp is set to the presentation time of the AU to which the
  first extension data object (e.g. FEC or redundant data) applies.

  The RTP timestamp is set to the composition timestamp (CTS) if its
  presence is indicated by the SLConfigDescriptor and if its length is
  not more than 32 bits. Otherwise, the RTP timestamp should be set to
  the sampling instant of the first AU contained in the packet.

- SSRC: A mapping between the ES identifiers and the SSRCs should be
  provided via out-of-band signaling (e.g. SDP).
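
The marker bit and timestamp rules above, together with the rule that
all fragments of one AU share the same RTP timestamp, can be sketched
in Python as follows. This is only an illustration under stated
assumptions: the function names choose_rtp_timestamp and fragment_au
are hypothetical, a CTS whose presence is not indicated is modelled as
None, and the fragment size limit stands in for whatever the path MTU
leaves after the headers.

```python
from typing import Iterator, Optional, Tuple


def choose_rtp_timestamp(cts: Optional[int], cts_length_bits: int,
                         sampling_instant: int) -> int:
    """Use the CTS when it is present and at most 32 bits long,
    otherwise fall back to the sampling instant of the first AU."""
    if cts is not None and cts_length_bits <= 32:
        return cts & 0xFFFFFFFF
    return sampling_instant & 0xFFFFFFFF


def fragment_au(au: bytes, timestamp: int,
                max_fragment: int) -> Iterator[Tuple[int, int, int, bytes]]:
    """Yield (rtp_timestamp, marker_bit, byte_offset, fragment) tuples.

    Every fragment carries the same timestamp; the marker bit is 1 only
    on the fragment that ends the AU.
    """
    if max_fragment <= 0:
        raise ValueError("max_fragment must be positive")
    for offset in range(0, len(au), max_fragment):
        chunk = au[offset:offset + max_fragment]
        is_last = offset + max_fragment >= len(au)
        yield timestamp, int(is_last), offset, chunk
```

For a 10-byte AU and a 4-byte limit this yields three fragments at byte
offsets 0, 4 and 8, with the marker bit set only on the third.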
4.2 Payload Header

The payload header is always present, with a variable length, and is
defined as follows:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|G|E|    XT     |            LENGTH             |   TSOFFSET    .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.TSOFFSET(cnt'd)|                Extension Data                 .
+-+-+-+-+-+-+-+-+                               +-+-+-+-+-+-+-+-+
.          Extension Data (continued)           |G|E|F|   res   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            LENGTH             |            FOFFSET            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               .
.                        Media Payload                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 3 Sample RTP payload, using the payload format.

G (Group) (1 bit): If this field is 1, it indicates that the object
associated with the current header is followed by another object.

E (Extension) (1 bit): If its value is 1, then the next object contains
Extension data. If its value is 0, then the next object contains AU
data (full AU or partial AU, i.e. typed segment).

F (Fragmentation) (1 bit): This field is only present when the E field
is 0. If its value is 1, then the next object is a fragment of a typed
segment. If this field is 0, then the next object is a complete typed
segment or complete AU.

res (Reserved) (5 bits): This field is only present if the E field is
0, so that the first header byte is always either {G,E=1,XT} or
{G,E=0,F,res}.

XT (Extension type) (6 bits): This field is only present if E is set to
1. It then specifies the type of extension data. Examples of types are
FEC data with the specification of the FEC coding scheme (parity codes,
block codes such as Reed-Solomon codes, ...), redundant data with the
specification of the redundant data encoding scheme, duplicated high
priority data (e.g. headers), etc.
LENGTH (16 bits): This field specifies the length in bytes of the next
object. If the object is the last object of the payload (G=0), then
this field is not present.

FOFFSET (16 bits): This field is present only when the F field is
present and F=1. It contains the byte offset of the first byte of the
fragment of the typed segment from the beginning of the typed segment.
This field should rarely be present.

TSOFFSET (Time Stamp OFFSET) (16 bits): The value of the field is an
unsigned 16 bit integer. The default value is 0.

If the E field is "1", then the next object carries extension data, and
the TSOFFSET added to the value of the RTP timestamp yields the
presentation time of the AU to which the extension data apply. The
TSOFFSET is, in this case, set to the difference between the media TS
and the TS of the media to which the extension data apply.

If the E field is "0", then the next object contains AU data. If this
object is not the first object in the payload containing AU data, then
the TSOFFSET added to the value of the RTP timestamp yields the
presentation time of the following AU data. If this object is the first
in the payload containing AU data (even if it has been preceded by
extension data), then this field is not present. Note that the TSOFFSET
is also useful for grouping AUs with non-monotonically increasing Time
Stamps, as well as for data interleaving.

Media payload: If the presence of the DTS - Decoding Time Stamp - is
indicated by the SLConfigDescriptor, then the DTS value is placed as
the first field of the media payload, the length of the field being
provided by the SLConfigDescriptor.

If the presence of the OCR - Object Clock Reference - is indicated by
the SLConfigDescriptor, then the OCR value is placed as the second
field of the media payload, the length of the field being provided by
the SLConfigDescriptor.
If the payload format is used to accommodate SL-packet streams, the
sequence number (SN), if present, can be placed as the third field of
the media payload. Corresponding length values are provided by the
SLConfigDescriptor.

If the resulting optional parameters consume a non-integer number of
bytes, zero padding bits must be inserted at the end of these
parameters to byte-align the rest of the payload.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Payload Header | Optional Extension| Opt. parameters | Media   |
|               | data              | as indicated by |.........|
|               |                   | SLConfigDesc    | payload |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 4 Portrait of the unified approach for transport of ES and SL
packetized streams.

In scenarios where the sync layer is used without a need for further
protection, the payload will be as illustrated in Figure 5.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|G|E|F|   res   | optional SL header parameters as indicated by .
+-+-+-+-+-+-+-+-+            the SLConfigDescriptor             .
|                                                               .
.                        Media payload                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 5 Sample RTP payload for SL-PDU transport.

5 Examples of payload headers

5.1 The payload contains Extension data followed by one object
    containing AU data

First payload header: G=1, E=1, so F not present, FOFFSET not present.

Second payload header: G=0, E=0, F=0, XT not present, res present,
FOFFSET not present (F=0). This is the last object (G=0) containing AU
data in the payload, so the LENGTH field is not present.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|G|E|    XT     |            LENGTH             |   TSOFFSET    .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.TSOFFSET(cnt'd)|                Extension Data                 .
+-+-+-+-+-+-+-+-+                                               .
.                                                               .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|G|E|F|   res   |                                               .
+-+-+-+-+-+-+-+-+                                               .
.                                                               .
.                           AU data                             .
.                                                               .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 6 RTP Payload Example 1.

5.2 The payload contains Extension data followed by 2 AUs

First payload header: G=1, E=1, so F field not present.

Second payload header: G=1, E=0, F=0, XT not present, res present. This
is the first object containing AU data in the payload, so TSOFFSET is
not present.

Third payload header: G=0, E=0, F=0, XT field not present. This is the
last object in the payload, so the LENGTH field is not present.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|G|E|    XT     |            LENGTH             |   TSOFFSET    .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.TSOFFSET(cnt'd)|                Extension Data                 .
+-+-+-+-+-+-+-+-+                                               .
.                                                               .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|G|E|F|   res   |            LENGTH             |               .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               .
.                                                               .
.                           AU data                             .
.                                                               .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|G|E|F|   res   |                                               .
+-+-+-+-+-+-+-+-+                                               .
.                                                               .
.                           AU data                             .
.                                                               .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 7 RTP Payload Example 2

6 Usage of Extension data field for redundant data

All AU-level decoder configuration information can be considered as
information of high priority, since, if lost, the whole AU is lost. In
addition, it does not tolerate increased latency.
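
The header layouts of sections 4.2 and 5 can be exercised with a short
Python sketch that writes and reads a payload made of several objects.
It is an illustration under assumptions, not a normative parser: the
names PayloadObject, pack_objects and parse_objects are hypothetical,
and the text does not fix the relative order of FOFFSET and TSOFFSET in
an AU data header, so the order LENGTH, FOFFSET, TSOFFSET is assumed
here. The sketch also follows the text of section 4.2 in writing a
TSOFFSET for every AU data object after the first one.

```python
import struct
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PayloadObject:
    is_extension: bool               # E bit
    data: bytes
    xt: Optional[int] = None         # 6-bit extension type, E=1 only
    fragment: bool = False           # F bit, E=0 only
    foffset: Optional[int] = None    # present only when F=1
    tsoffset: Optional[int] = None   # E=1, or AU object after the first


def pack_objects(objects: List[PayloadObject]) -> bytes:
    out = bytearray()
    seen_au = False
    for index, obj in enumerate(objects):
        g = 1 if index < len(objects) - 1 else 0   # G=0 on the last object
        if obj.is_extension:
            out.append((g << 7) | 0x40 | ((obj.xt or 0) & 0x3F))
        else:
            out.append((g << 7) | (int(obj.fragment) << 5))  # res bits = 0
        if g:
            out += struct.pack("!H", len(obj.data))          # LENGTH
        if obj.is_extension:
            out += struct.pack("!H", obj.tsoffset or 0)      # TSOFFSET
        else:
            if obj.fragment:
                out += struct.pack("!H", obj.foffset or 0)   # FOFFSET
            if seen_au:
                out += struct.pack("!H", obj.tsoffset or 0)  # assumed order
            seen_au = True
        out += obj.data
    return bytes(out)


def parse_objects(payload: bytes) -> List[PayloadObject]:
    objects, pos, seen_au, more = [], 0, False, True
    while more:
        first = payload[pos]
        pos += 1
        more = bool(first & 0x80)
        is_ext = bool(first & 0x40)
        xt = (first & 0x3F) if is_ext else None
        frag = bool(first & 0x20) if not is_ext else False
        length = foffset = tsoffset = None
        if more:
            (length,) = struct.unpack_from("!H", payload, pos)
            pos += 2
        if is_ext:
            (tsoffset,) = struct.unpack_from("!H", payload, pos)
            pos += 2
        else:
            if frag:
                (foffset,) = struct.unpack_from("!H", payload, pos)
                pos += 2
            if seen_au:
                (tsoffset,) = struct.unpack_from("!H", payload, pos)
                pos += 2
            seen_au = True
        end = pos + length if length is not None else len(payload)
        objects.append(PayloadObject(is_ext, payload[pos:end], xt, frag,
                                     foffset, tsoffset))
        pos = end
    return objects
```

For the case of section 5.1 (one extension object followed by one AU
data object), the first header byte carries G=1, E=1 and the XT value,
followed by LENGTH and TSOFFSET, while the AU data header is the single
byte 0x00.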
The extension data field may hence contain duplicated data (e.g.
duplicated headers) in "n" consecutive packets. The parameter "n" may
be chosen so that the probability that "n" consecutive packets are lost
is below a given threshold. These decision mechanisms are, however,
outside the scope of this document.

As a special type of FEC, it has been proposed in [5] to use a lower
rate, secondary encoding of the media data to be protected. The
mechanism described above is directly useable for the transport of
secondary compressed streams along with primary compressed data. Note
that the secondary compressed stream can also be a lower layer (with a
lower rate) of a scaleable compression scheme, such as specified in [9]
and [11] for video and audio respectively.

7 Extension data field for FEC data

7.1 Extension data field for Parity Codes

The Extension data field can be used for transporting FEC (parity
codes) data in the spirit of [12]. The XT field is set to the type
associated with the FEC mechanism (parity codes) used. The XT field
semantic, with all the FEC mechanisms supported, is announced via a
non-RTP out-of-band signaling, such as SDP [10], with appropriate
extensions. The FEC mechanisms can then, during the session, and
depending on the segment type and on the network characteristics, be
adapted with a simple in-band signaling.

The FEC operation, as defined in [12], acts on a stream of media
packets without extension data, and generates a stream of FEC packets.
The media payload of the above media packets is then encapsulated in
the object containing the AU data. The FEC header and FEC data are
encapsulated in the extension data field. The extension data length
field is set to the length of the FEC header plus FEC payload.

The FEC header in the case of parity codes is given in Figure 8.
It is inspired by the header specified in [12], with the following
modifications:

1) The PT recovery field is not used, since the payload type of the
   packets transported in a given channel is supposed to be known,
   namely to be of the type corresponding to this proposed payload.

2) An R bit has been added in order to protect the marker bit of the
   media packets.

3) In order for the FEC header to be byte-aligned, it is also proposed
   to reduce the mask length by 2 bits (22 bits instead of 24). This
   should be acceptable, since 24 bits induces a very high delay.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            SN Base            |        length recovery        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|E|R|                   Mask                    |               .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.                  TS Recovery                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 8 FEC Header for Parity Codes

On the receiver side, the FEC packets will be reconstructed as defined
in [12], by copying the sequence number, SSRC, CC field, RTP version
and extension bit from the RTP header of the packets received.

The fields SN base, E, Mask and TS recovery of the FEC header are
defined as in [12].

The bit R is the Marker recovery bit. It is computed by applying the
protection operation to the marker bits (M) of the RTP media packets.

The Length Recovery field determines the length of the recovered
packets. It is computed here via the protection operation applied to
the 16 bit natural binary representation of the lengths (in bytes) of
the media payload, CSRC list, extension and padding of the media
packets associated with this FEC data, plus the marker bit. The length
recovery field makes it possible to apply the procedure to media
packets that are not of the same length.
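
The role of the R bit and of the Length Recovery field can be
illustrated with a small Python sketch. It assumes that the protection
operation is a bitwise XOR, as is usual for parity codes; the names
protect, recover and xor_bytes are hypothetical, the marker bit is kept
in a separate value rather than folded into the length, and a media
packet is reduced to a (marker bit, payload) pair, so the CSRC list,
extension and padding are left out.

```python
from functools import reduce
from typing import List, Tuple

Packet = Tuple[int, bytes]          # (marker bit M, payload bytes)
FecData = Tuple[int, int, bytes]    # (R bit, length recovery, parity)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, zero-padding the shorter one."""
    size = max(len(a), len(b))
    a = a.ljust(size, b"\x00")
    b = b.ljust(size, b"\x00")
    return bytes(x ^ y for x, y in zip(a, b))


def protect(packets: List[Packet]) -> FecData:
    """Compute the R bit, the length recovery value and the parity payload."""
    r_bit = reduce(lambda acc, p: acc ^ p[0], packets, 0)
    length_recovery = reduce(lambda acc, p: acc ^ len(p[1]), packets, 0)
    parity = reduce(xor_bytes, (p[1] for p in packets), b"")
    return r_bit, length_recovery & 0xFFFF, parity


def recover(received: List[Packet], fec: FecData) -> Packet:
    """Rebuild the single missing packet of a protected group."""
    marker, length, payload = fec
    for m, data in received:
        marker ^= m
        length ^= len(data)
        payload = xor_bytes(payload, data)
    # The recovered length tells how much of the padded parity is real data.
    return marker, payload[:length]
```

With three protected packets of 3, 5 and 2 bytes, losing any one of
them and calling recover with the two others returns the missing
payload at its original length, which is what the length recovery
field is for.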
When the extension data carries this type of FEC, then the TSOFFSET of
the extension data header is not used and should be set to zero.

The protection also applies to sync layer parameters when present in
the payload of the media packets.

The advantage of the approach - with respect to having separate FEC
packets - is a reduced overhead for sending the FEC data.

It is also proposed to allocate 3 Extension Types to parity codes with
3 different default masks, in order to reduce the overhead of the FEC
header, which would therefore become as in Figure 9 below:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            SN Base            |        length recovery        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|E|R|    res    |                  TS Recovery                  .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.               |
+-+-+-+-+-+-+-+-+

Figure 9 Simplified FEC Header for Parity Codes (with default masks)

The Extension data field can be used for transporting FEC
(Reed-Solomon codes) data in the spirit of [13]. The XT field is set to
the type associated with the FEC mechanism (Reed-Solomon codes) used.
The XT field semantic, with all the FEC mechanisms supported, is
announced via a non-RTP out-of-band signaling, such as SDP [10], with
appropriate extensions.

The FEC operation, as defined in [13], acts on a stream of media
packets without extension data, generating a stream of FEC packets. The
media payload of the above media packets is then encapsulated in the
object containing the AU data. The FEC header and FEC data are
encapsulated in the extension data field. The extension data length
field is set to the length of the FEC header plus FEC payload.

The FEC header for Reed-Solomon codes is provided in Figure 10.
It is inspired by the header specified in [13], with the following
modifications:

1) The PT recovery field is not used, since the payload type of the
   packets transported in a given channel is supposed to be known,
   namely to be of the type corresponding to this proposed payload.

2) An R bit has been added in order to protect the marker bit of the
   media packets.

3) In order for the FEC header to be byte-aligned, it is also proposed
   to reduce the length of the K field to 6 bits instead of 8 bits.
   Indeed, 8 bits would allow 256 media packets to be processed,
   inducing a very high delay. The length of the N field is also
   reduced to 7 bits (corresponding to the maximum code rate of 1/2)
   instead of 8 bits, and the length of the i field is accordingly
   reduced from 8 to 6 bits, since the i field indicates the position
   of the packet within the N-K FEC packets.

4) A P field has been added, allowing for interleaving in order to
   create a FEC code capable of correcting longer bursts of packet
   losses. The P field defines the interleaving periodicity minus 1, as
   illustrated in Figure 11 below for the special case of P=7.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            SN Base            |        length recovery        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|E|R|      N      |     k     |     i     |  P  |  TS Recovery  .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.             TS Recovery (cnt'd)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 10 FEC Header for Reed-Solomon Codes

When the extension data carries this type of FEC, then the TSOFFSET of
the extension data header is not used and should be set to zero.

The advantage of the approach - with respect to having separate FEC
packets - is a reduced overhead for sending the FEC data.
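
As an illustration, the header of Figure 10 can be packed and unpacked
with the following Python sketch. The field widths for N, K and i come
from the text; the 3-bit width of P is not stated and is inferred here
from the byte alignment of the figure (32 bits of TS Recovery split as
8 + 24), so it should be read as an assumption. The function and
dictionary key names are hypothetical. The interleaved_rows helper
shows one reading of Figure 11, in which each row gathers every
period-th packet; the period is passed directly to avoid taking a
position on the "minus 1" encoding of the P field.

```python
from typing import Dict, List

# (name, width in bits) in transmission order, following Figure 10.
# The width of "p" (3 bits) is inferred from byte alignment, not stated.
RS_FEC_FIELDS = [
    ("sn_base", 16), ("length_recovery", 16),
    ("e", 1), ("r", 1), ("n", 7), ("k", 6), ("i", 6), ("p", 3),
    ("ts_recovery", 32),
]
HEADER_BITS = sum(width for _, width in RS_FEC_FIELDS)  # 88 bits, 11 bytes


def pack_rs_fec_header(values: Dict[str, int]) -> bytes:
    """Concatenate the fields most significant bit first."""
    word = 0
    for name, width in RS_FEC_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word.to_bytes(HEADER_BITS // 8, "big")


def unpack_rs_fec_header(data: bytes) -> Dict[str, int]:
    """Inverse of pack_rs_fec_header, reading the first 11 bytes."""
    word = int.from_bytes(data[:HEADER_BITS // 8], "big")
    values = {}
    for name, width in reversed(RS_FEC_FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


def interleaved_rows(num_packets: int, period: int) -> List[List[int]]:
    """Rows of Figure 11: row r holds packets r, r+period, r+2*period, ..."""
    return [list(range(row, num_packets + 1, period))
            for row in range(1, period + 1)]
```

With a period of 7 and 28 packets, the first row is 1, 8, 15, 22 and
the last row is 7, 14, 21, 28, so a burst of up to 7 consecutive losses
removes at most one packet from each row.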
   +-----+-----+-----+-----+-------+-----+
   |  1  |  8  | 15  | 22  |  ...  |mn-6 |
   +-----+-----+-----+-----+-------+-----+
   |  2  |  9  | 16  | 23  |  ...  |mn-5 |
   +-----+-----+-----+-----+-------+-----+
   .                                     .
   +-----+-----+-----+-----+-------+-----+
   |  7  | 14  | 21  | 28  |  ...  | mn  |
   +-----+-----+-----+-----+-------+-----+

   Figure 11 Example of Interleaving (for P=7)

7.2 Extension data field for RS-based Unequal Error Protection of
    typed segments

The separation of data with different priority levels into separate
packets, in order to apply different levels of protection, is not
always feasible. Indeed, in most situations, data of different
priority levels are not independently decodable. In this case, the
typed segments corresponding to different degradation priorities
should be grouped into one packet, as shown in figure 12. The
grouping mechanism provided by the payload format makes this
possible. The scheme assumes a fixed pattern in terms of the number
of objects carrying AU data in one packet, e.g. 3 in the example of
figure 12 below.

The protection operation, as described above for Reed-Solomon,
applies to objects in consecutive packets transporting partial AUs of
the same type (typed segments of the same priority).

   +-----------------+-----------------+-----------------+
   |AU data object   | AU data object  | AU data object  |RTP packet 1
   +-----------------+-----------------+-----------------+------
   |Typed segment 1  | Typed segment 2 | Typed segment 3 |RTP packet 2
   +-----------------------------------------------------+------
   .                 .                 .                 .RTP packet 3
   +-----------------+-----------------+-----------------+------
   .                 .                 .                 ...
   +-----------------+-----------------+-----------------+------
   |       R-S       |                 |                 .RTP packet i
   ------------------+-----------------+-----------------+------
   |                 |                 |                 |
   ------------------------------------------------------+------
   |      K1/N       |       R-S       |                 |
   ------------------------------------+-----------------+------
   |                 |      K2/N       |       R-S       |
   -------------------------------------------------------------
   |                 |                 |      K3/N       |RTP packet n
   +-----------------+-----------------+-----------------+

   Figure 12: Example of data organization for RS-based UEP.

8 Multiplexing

MPEG-4 applications can involve a large number of ESs, and thus also
a large number of RTP sessions. A multiplexing scheme allowing
selective bundling of ESs may therefore be necessary for some
applications. The multiplexing problem is outside the scope of this
payload format and can be solved by using a generic solution as
defined in [14].

9 Security Considerations

RTP packets transporting information with the proposed payload format
are subject to the security considerations discussed in the RTP
specification [1]. This implies that confidentiality of the media
streams is achieved by encryption.

If the entire stream (extension data and AU data) is to be secured
and all the participants are expected to have the keys to decode the
entire stream, then the encryption is performed in the usual manner,
and there is no conflict between the two operations (encapsulation
and encryption).

If a portion of the stream (e.g. the extension data) needs to be
encrypted with a different key, or left unencrypted, application
level signaling protocols would have to be aware of the usage of the
XT field, exchange keys, and negotiate their usage on the media and
extension data separately.

10 Authors Addresses

Christine Guillemot
INRIA
Campus Universitaire de Beaulieu
35042 RENNES Cedex, FRANCE
email: Christine.Guillemot@irisa.fr

Paul Christ
Computer Center - RUS
University of Stuttgart
Allmandring 30
D70550 Stuttgart, Germany.
email: Paul.Christ@rus.uni-stuttgart.de

Stefan Wesner
Computer Center - RUS
University of Stuttgart
Allmandring 30
D70550 Stuttgart, Germany.
email: wesner@rus.uni-stuttgart.de

Anders Klemets
1 Microsoft Way
Redmond, WA 98052-6399
USA.
E-mail: anderskl@microsoft.com

11 References

[1]  H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson, "RTP: A
     Transport Protocol for Real-Time Applications", RFC 1889,
     Internet Engineering Task Force, January 1996.

[2]  A. Klemets, "Common Generic RTP Payload Format",
     draft-klemets-generic-rtp-00, March 13, 1998.

[3]  A. Periyannan, D. Singer, M. Speer, "Delivering Media
     Generically over RTP", draft-periyannan-generic-rtp-00,
     March 13, 1998.

[4]  ISO/IEC 14496-1 FDIS MPEG-4 Systems, November 1998.

[5]  C. Perkins, I. Kouvelas, O. Hodson, V. Hardman, M. Handley,
     J. Bolot, A. Vega-Garcia, S. Fosse-Parisis, "RTP Payload for
     Redundant Audio Data",
     draft-ietf-avt-redundancy-revised-00.txt, 10-Aug-98.

[6]  C. Zhu, "RTP payload format for H.263 Video Streams", RFC 2190.

[7]  C. Bormann, L. Cline, G. Deisher, T. Gardos, C. Maciocco,
     D. Newell, J. Ott, S. Wenger, C. Zhu, "RTP payload format for
     the 1998 version of ITU-T Rec. H.263 video (H.263+)",
     draft-ietf-avt-rtp-h263-video-02.txt, 7-May-98.

[8]  D. Hoffman, G. Fernando, V. Goyal, M. Civanlar, "RTP Payload
     format for MPEG1/MPEG2 video", RFC 2250, January 1998.

[9]  ISO/IEC 14496-2 FDIS MPEG-4 Visual, November 1998.

[10] M. Handley, V. Jacobson, "SDP: Session Description Protocol",
     draft-ietf-mmusic-sdp-07.txt, 2nd Apr 1998.
[11] ISO/IEC 14496-3 FDIS MPEG-4 Audio, November 1998.

[12] J. Rosenberg, H. Schulzrinne, "An RTP Payload format for Generic
     Forward Error Correction", draft-ietf-avt-fec-05.txt,
     26 Feb. 1999.

[13] J. Rosenberg, H. Schulzrinne, "An RTP Payload format for Reed
     Solomon Codes", draft-ietf-avt-reedsolomon-00.txt,
     3 November 1998.

[14] M. Handley, "GeRM: Generic RTP Multiplexing", work in progress,
     draft-ietf-avt-germ-00.txt, November 1998.

[15] S. Bradner, "Key words for use in RFCs to Indicate Requirement
     Levels", RFC 2119, March 1997.