HTTP/1.1 200 OK Date: Tue, 09 Apr 2002 01:02:18 GMT Server: Apache/1.3.20 (Unix) Last-Modified: Tue, 07 Apr 1998 05:43:37 GMT ETag: "2e7a1b-7cdc-3529bd09" Accept-Ranges: bytes Content-Length: 31964 Connection: close Content-Type: text/plain Internet Engineering Task Force Audio Video Transport WG Internet Draft J.Rosenberg,H.Schulzrinne draft-ietf-avt-fec-02.txt Bell Laboratories,Columbia U. March 9, 1998 Expires: September 9, 1998 An RTP Payload Format for Generic Forward Error Correction STATUS OF THIS MEMO This document is an Internet-Draft. Internet-Drafts are working docu- ments of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute work- ing documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference mate- rial or to cite them other than as ``work in progress''. To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast). Distribution of this document is unlimited. 1 Abstract This document specifies a payload format for generic forward error correction of media encapsulated in RTP. It is engineered for FEC algorithms based on the exclusive or (parity) operation, although it can be used with other techniques. The payload format allows end sys- tems to transmit using arbitrary block lengths and parity schemes. It also allows for the recovery of both the payload and critical RTP header fields. Since FEC is sent as a separate stream, it is back- wards compatible with non-FEC capable hosts, so that receivers which do not wish to implement FEC can just ignore the extensions. 2 Background The quality of packet voice on the Internet has been mediocre due, in part, to high packet loss rates. This is especially true on wide-area connections. Unfortunately, the strict delay requirements of real- J.Rosenberg,H.Schulzrinne [Page 1] Internet Draft Generic FEC March 9, 1998 time multimedia usually eliminate the possibility of retransmissions. It is for this reason that forward error correction (FEC) has been proposed to compensate for packet loss in the Internet [1] [2]. In particular, the use of traditional error correcting codes, such as parity, Reed-Solomon, and Hamming codes, has attracted attention. To support these mechanisms, protocol support is required. Budge, McKenzie, Mills, Diss, and Long have proposed a payload format for RTP which allows for the encapsulation of FEC-protected media on top of RTP [3]. We briefly summarize their proposal, and urge the reader to consult their draft for more details. They define a new RTP payload type which identifies the packet contents as FEC-protected media. The RTP payload format in their proposal consists of two ele- ments, the media-correction header and the payload. The media- correction header is 24 bits, and consists of three fields. The first is called the scheme, the second the mode, and the third, the length. The scheme identifies the particular error correction scheme in use. In particular, it defines the set of data packets over which the FEC is applied, and the order in which the packets (data and FEC) are sent. The mode identifies which packet in a group of data and FEC packets (typically called a block) this particular one corresponds to. For packets that contain just data (and not FEC), the length field contains the length of the payload. For packets which contain FEC, the length field contains the xor of the length fields of the packets which are covered by the FEC. Since packets must be padded out with zeroes (to be equal lengths) in order to perform the xor operation, the length field allows recovery of the actual length of the pre-padded packets. 3 Motivation The payload format proposed in [3] works quite well, but has a number of drawbacks: oIt does not indicate the media type of the actual data being pro- tected. This is because the RTP PT field always indicates that the payload format is "FEC-protected media". Since many applications will need to change media payload types mid-stream (for example, sending DTMF tones in-band), the presence of this field is impor- tant. oThe RTP timestamp field and marker bit are not covered by FEC. When a packet is lost and then reconstructed, the timestamp and marker bits are copied from another packet. Correct recovery of these fields is important. oIt defines four very specific schemes (one of which is no error correction), and assigns a value for the scheme field in the header to each. New schemes must be registered with IANA, the details written up, and receivers and senders alike must be upgraded to recognize and support them. This makes backwards J.Rosenberg,H.Schulzrinne [Page 2] Internet Draft Generic FEC March 9, 1998 compatibility difficult, requiring capabilities negotiation. It also means that transmitters are restricted to using the schemes defined thus far. The three non-null schemes defined in [3] use heavy forward error correction. These schemes are not appropriate for all loss conditions. It is our aim to generalize the payload format for forward error protec- tion. To do this, the details of the scheme are transmitted inside the data packets with minimal overhead. This allows sender-based adaptation of the FEC schemes. This adaptation can be static or dynamic, and based on any information available at the sender. Changing schemes mid-stream is then trivially supported, whereas special protocol support is required in [3]. Capability exchanges are avoided, simplifying the pro- tocol and eliminating compatibility problems. 4 Protocol Overview Before discussing the protocol, we define a few terms for clarity. A media payload is a piece of raw, un-protected user data. A media header is the RTP header for the media payload. The combination of a media payload and media header is called a media packet. The forward error correction algorithms at the transmitter take the media packets as an input. They output both the media packets that they are passed, and new packets called FEC packets. The FEC packet contains an FEC header and FEC payload. Each FEC packet is said to be associated with one or more media packets when those media packets are used to gener- ate the FEC packet (by use of the exclusive or operation, for exam- ple). The FEC packets are sent as a separate stream from the media packets. This implies that the FEC packets have their own sequence number and timestamp space. The media packet stream is essentially unaffected by the use of FEC. This allows the two to be sent on a separate multi- cast group, so that non-FEC receivers can ignore the FEC and just receive the original media. The separation also allows for coherent values for the sequence numbers and timestamps. This document does not prescibe the definition of separate streams, but leaves this to applications and higher level protocols to define. For multicast, the separate stream may be implemented by separate multicast groups, different ports in the same group, or by a differ- ent SSRC within the same group/port. For unicast, different ports or different SSRC may be used. Each of these approaches has drawbacks and benefits which depend on the application. At the receiver, arriving FEC and media packets are used to generate a stream of media packets for direct use by the application. This results in a clean separation of error protection from the J.Rosenberg,H.Schulzrinne [Page 3] Internet Draft Generic FEC March 9, 1998 application. The protocol operates by assuming that the error correction algorithm works by applying some function f to one or more media payloads, which are specified as the arguments to f. The result of this func- tion is an FEC payload. When the function is applied to just a single media payload, the result is that media payload (f(a) = a). When the function is applied to multiple media payloads, the result is some combination of those payloads (the exclusive or would be defined as: f(a,b,c) = a xor b xor c). We assume f can combine any number of pay- loads, each with arbitrary lengths. Recovery is possible if a suffi- cient number of FEC and media packets are received. Sufficiency depends on the reception of N packets (media or FEC) which contain linearly independent combinations of at most N media packets. For example, consider the case where f is the exclusive or. Media packets w,x,y, and z, with media payloads a,b,c and d are to be transmitted. Pairs of media payloads will be xor'ed together to gen- erate the FEC payloads. We would denote the resulting network packet stream as: a, b, f(a,b), c, d, f(c,d) In this example, the error correction scheme introduces a 50% over- head. But if b is lost, a and f(a,b) can be used to recover b. The way in which the various schemes differ is in the set of media payloads over which the exclusive-or (or more generally, f(.)) is applied, and the order in which the resulting packets are sent. For example, Budge et. al. describe four schemes, 0, 1, 2, and 3 which take media payloads a,b,c,d, etc., and generate FEC payloads as fol- lows: Scheme 0 -------- This scheme is null, and has no error correction. The scheme is formally defined as: a,b,c,d, ... -> a, b, c, d, .... Scheme 1 -------- J.Rosenberg,H.Schulzrinne [Page 4] Internet Draft Generic FEC March 9, 1998 This scheme is the similar to the one in the example above. The switching of the positions of f(b) and f(a,b) allow some bursts of two consecutive packet losses to be recovered. It is defined as: a,b,c,d,e,f -> a, f(a,b), b, f(b,c), c, f(c,d), d, ... Scheme 2 -------- This scheme allows for recovery of all single packet losses and some consecutive packet losses, but with less overhead than scheme 1: a,b,c,d,e,f,g -> f(a,b),f(a,c),f(a,b,c),f(c,d),f(c,e),f(c,d,e)... Scheme 3 -------- This scheme requires 4 packet delays to recover the original media payloads, but it can recover from 1,2, or 3 consecutive packet losses: a,b,c,d,e,f -> f(a),f(b),f(a,b,c),f(c),f(a,c,d),f(a,b,d),f(d), ... In order to decode the FEC payloads to media payloads, all that is necessary is for the receiver to know the function being applied, and the set of media payloads in each FEC payload to which it is applied. This is exactly the information provided by the payload format. To determine the function f() being used, the payload type for the FEC packets only is set to indicate this function. This payload type may either be static or dynamic, just like any other payload type. To determine which packets are associated with the FEC packet, a field is present in the FEC header, called the offset mask. Assume this mask is M bits. If bit k in the mask is set to 1, then the media packet with sequence number N + k is associated with this FEC packet, where N is the sequence number base in the FEC packet header. This effectively allows an FEC packet to be associated with any of the M packets before and after it. The offset mask and payload type are sufficient to signal arbitary forward error correction schemes with little overhead. 5 Protocol Specifics The following section fills in the details based on the general dis- cussion above. J.Rosenberg,H.Schulzrinne [Page 5] Internet Draft Generic FEC March 9, 1998 5.1 RTP Media Packet Structure The media packets and FEC packets are sent as separate streams. The media packets are unaffected by FEC, and are sent in the same fashion they would be sent if there were no FEC. This lends to a very efficient encoding. When little (or no) FEC is used, there are mostly media packets being sent. This means that the overhead (present in FEC packets only) tracks the amount of FEC in use. 5.2 RTP FEC Packet Structure When a packet is to be transmitted which contains FEC data (i.e., its payload is derived from one or more media payloads), the RTP header is followed by an FEC header. We first discuss the semantics of the RTP header fields. The version field is set to 2. The padding bit is computed via the protection operation, defined below. The extension bit is also com- puted via the protection operation. The SSRC value should generally be the same as the SSRC value of the media stream it protects. It may be different if the FEC stream is being demultiplexed via the SSRC value. The CC value is computed via the protection operation. The CSRC list is never present, independent of the value of the CC field. The extension is never present, independent of the value of the X bit. The marker bit is computed via the protection operation. The sequence number has the standard definition: it is one higher than the sequence number in the previously transmitted FEC packet. The timestamp is set to be the minimum of all the timestamps of the media packets associated with the FEC packet. This allows the times- tamps of sequential FEC packets to usually increment, although its behavior for different FEC schemes will vary. The payload type for the FEC packet is set as described above. End systems which cannot recognize a payload type must discard it anyway. This provides backwards compatibility. The FEC mechanisms can then be used in a multicast group with mixed FEC-capable and FEC-uncapable receivers. Following the RTP header is the FEC header. This header is nominally 64 bits, and may be optionally extended to 96 bits. The format of the header is as follows: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | length recovery |E| PT recovery | TS recovery | J.Rosenberg,H.Schulzrinne [Page 6] Internet Draft Generic FEC March 9, 1998 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | SN Base | mask | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ | Additional Offset Mask | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ The length recovery field is used to determine the length of any recovered packets. It is computed via the protection operation applied to the 16 bit natural binary representation of the lengths (in bytes) of the media payload, CSRC list, extension and padding of media packets associated with this FEC packet (in other words, the CSRC list, extension, and padding, if present, are counted as part of the payload). This allows for the FEC procedure to be applied even when the lengths of the media packets are not identical. For example, assume an FEC packet is being generated by xor'ing two media packets together. The length of the two media packets are 3 (0b011) and 5 (0b101) bytes, respectively. The length recovery field is then encoded as 0b011 xor 0b101 = 0b110. The E bit indicates a header extension. When set to 1, it indicates that an additional 32 bits of header follow. The PT recovery field is obtained via the protection operation applied to the payload types of the media packets associated with the FEC packet. The mask field is either 16 bits or 48 bits, depending on the value of E. If bit k in the mask is set to 1, then the media packet with sequence number N + k is associated with this FEC packet, where N is the SN Base field in the FEC packet header. The SN base field is always set to the minimum sequence number of those media packets pro- tected by FEC. This allows for the FEC operation to extend over any string of at most 16 packets, whose range in sequence numbers is no more than 16. The TS recovery field is computed via the protection operation applied to the least significant 8 bits of the timestamps of the media packets associated with this FEC packet. This allows the high frequency component of the timestamp to be recovered via FEC, while the low frequency component can be determined via interpolation. The payload of the FEC packet is the protection operation applied to the CSRC List plus payloads of the media packets associated with the FEC packet. 5.3 Protection Operation J.Rosenberg,H.Schulzrinne [Page 7] Internet Draft Generic FEC March 9, 1998 The protection operation involves taking a variety of fields from the various headers, adding the payloads, appending the whole thing together, padding zeroes, and then computing the f() operator across the resulting binary array. The result is then placed into the FEC packet. For each media packet to be protected, a binary array is generated by appending the following fields together: oPadding Bit (1 bit) oExtension Bit (1 bit) oCC bits (3 bits) oMarker bit (1 bit) oPayload Type (7 bits) oLeast Significant 8 bits of Timestamp (8 bits) oNatural binary representation of the length of the CSRC List plus padding plus payload plus extension of the media packet (16 bits) oCSRC List (if CC is 1), else null (variable) oHeader Extension (if X is 1), else null (variable) othe payload (variable) oPadding, if present (variable) If the lengths of the binary arrays are not equal, they are padded with zeroes to be the length of the longest binary array. The f() operator is then applied across the binary arrays. The result is the binary array of the FEC packet. The first bit in the FEC packet binary array is written into the Padding Bit of the FEC packet. The second bit in the FEC packet binary array is written into the Extension bit of the FEC packet. The next three bits of the FEC packet binary array are written into the CC field of the FEC packet. The next bit of the FEC packet binary array is written into the marker bit of the FEC packet. The next 7 bits of the FEC packet binary array are written into the PT recovery field in the FEC packet header. The next 8 bits of the FEC packet binary array are written into the TS recovery field in the packet header. The next 16 bits are written into the Length Recovery field in the FEC packet header. The remaining bits J.Rosenberg,H.Schulzrinne [Page 8] Internet Draft Generic FEC March 9, 1998 are set to be the payload of the FEC packet. 5.4 Recovery Procedures The FEC packets allow end systems to recover from the loss of media packets. All of the header fields of the missing packets, including CSRC lists, extensions, padding bits, marker and payload type, are recoverable. This section describes the procedure for performing this recovery. Recovery requires two distinct operations. The first determines which packets (media and FEC) must be combined in order to recover a miss- ing packet. Once this is done, the second step is to actually recon- struct the data. The second step must be performed as described below. The first step can be based on any algorithm chosen by the implementor. Different algorithms result in a tradeoff between com- plexity and the ability to recover missing packets if at all possi- ble. 5.4.1 Reconstruction Let T be the list of packets (FEC and media) which can be combined to recover some media packet xi. xi is recovered by applying the inverse of f to the other received packets in T. The procedure is as follows: 1. For the media packets in T, compute the binary array as described in the previous section. 2. For the FEC packets in T, compute the binary array in the same fashion, except always set the CSRC list, extension, and pad- ding to null. 3. If the resulting binary arrays are not of equal length, pad them with zeroes to be the length of the longest binary array. 4. Apply the f() inverse operator to the binary arrays, resulting in a recovery array. 5. Create a new packet with the standard 12 byte header and no payload. 6. Set the version of the new packet to 2. 7. Set the Padding bit in the new packet to the first bit in the recovery array. 8. Set the Extension bit in the new packet to the second bit in the recovery array. J.Rosenberg,H.Schulzrinne [Page 9] Internet Draft Generic FEC March 9, 1998 9. Set the CC field to the next three bits in the recovery array. 10. Set the marker bit in the new packet to the next bit in the recovery array. 11. Set the payload type in the new packet to the next 7 bits in the recovery array. 12. Set the SN field in the new packet to xi. 13. Set the most significant 24 bits of the timestamp in the new packet to the most significant 24 bits of a nearby packet (preferably the previous). Set the least significant 8 bits of the timestamp to the next 8 bits in the array. If the differ- ence in timestamps between the new packet and the previous (or several packets ago) is negative, there was a rollover, so increment the most significant 24 bits in the timestamp of the new packet. 14. Take the next 16 bits of the recovery array. Whatever the nat- ural binary number this corresponds to, take that many bytes from the recovery array and append them to the new packet. This represents the CSRC list, extension, payload, and pad- ding. 15. Set the SSRC of the new packet to the SSRC of the media stream its protecting This procedure will completely recover both the header and payload of an RTP packet with high probability (the TS value may be incorrect when large timestamp increments are present). 5.4.2 Determination of when to recover The previous section discussed how to recover a media packet with sequence number xi when all of the packets needed to recover it were available. The decision about whether to attempt recovery of some media packet, and how to determine if sufficient data is available to recover it, is left to the implementor. However, this section pro- vides a simple algorithm which may be used for this purpose. The algorithm is based on maintaining an array A of Z bits, and stor- ing up to Z packets. This array is used as a ring buffer to hold past packets. The receiver must also maintain a few state variables. SNmin represents the sequence number of the oldest media packet in the array, and SNmax represents the largest. Pmin is an index into the array which indicates the position of the packet with sequence number SNmin. As media packets arrive, they are checked off in the array. J.Rosenberg,H.Schulzrinne [Page 10] Internet Draft Generic FEC March 9, 1998 When an FEC packet arrives, its mask is checked against the array. This allows the receiver to make the determination about whether a packet can be recovered with the FEC. Initially, all Z bits are set to zero. Assume 1 media packet has been received, so that SNmin is set to its sequence number, as is SNmax. Pmin is set to zero, and A[Pmin] is set to 1. In the discussion that follows, we assume that the ring buffer is actually an infinite array. A real implementation will need to worry about wrap-arounds, but we will ignore this for clarity. When a media packet arrives, its sequence number B is examined. If it is less than SNmin, no operation is taken, as the packet falls below the window. If it is more than SNMin + Z - 1, delta (a local vari- able) is set to B - SNMin - Z + 1. SNmax and SNMin are incremented by delta. All of the bits in the array between the new and old SNMax are set to zero. If B is between SNMin and SNmax, position B is set to 1, indicating that this packet is present. The packet is also stored at the receiver. When an FEC packet arrives, the SN base field is retrieved. This is used as an index into the array. From that point in the array, a bit by bit comparison is made with the mask. If each 1 in the mask aligns with a 1 in the array, no operation is taken. If each 1 in the mask aligns with a 1 in the array, except for a single 1 in the mask which aligns with a zero in the array (assume this is in the array at posi- tion k), recovery of the packet with sequence number k can take place as described above. Otherwise, no action is taken. This simple recovery algorithm works efficiently for schemes which send a single FEC packet, and it will operate at lower effectiveness for schemes which send more. Clearly, storage of previously received FEC packets, and reexamination of them when later recoveries are made, can vastly improve performance. Such procedures are left to implementors. 5.5 Example Consider 2 media packets to be sent, x and y. We wish to protect them by sending one FEC packet which is derived from x and y. The f opera- tor is implemented using xor. The three packets are: Media Packet x -------------- Version: 2 (10) Padding: 0 (0) J.Rosenberg,H.Schulzrinne [Page 11] Internet Draft Generic FEC March 9, 1998 Extension: 0 (0) Marker: 0 (0) PTI: 11 (01011) SN: 8 (1000) TS: 3 (011) SSRC: 2 (10) The payload length is 10 bytes. Media Packet y -------------- Version: 2 (10) Padding: 0 (0) Extension: 0 (0) Marker: 1 (1) PTI: 18 (10010) SN: 9 (1001) TS: 5 (101) SSRC: 2 (10) The payload length is 11 bytes. The FEC packet is then: FEC Packet (contains a xor b) ----------------------------- Version: 2 (10) Padding: 0 (0) Extension: 0 (0) Marker: 1 (1) (NOTE: 0 xor 1 = 1) PTI: 191 (NOTE: Assume PTI 191 implies XOR FEC) SN: 1 TS: 3 SSRC: 2 (10) len. rec.: 1 (1) (NOTE: 10 xor 11 = 1010 xor 1011 = 0001) PTI rec.: 24 (11001) SN base: 8 TS rec.: 6 (110) (3 xor 5 = 6) E: 0 (0) mask: 3 (0000000000000011) J.Rosenberg,H.Schulzrinne [Page 12] Internet Draft Generic FEC March 9, 1998 The payload length is 11 bytes. 6 Use with Redundant Encodings One can consider an FEC packet as a redundant coding of the media. Because of this, the payload format for encoding of redundant audio data [4] can be used to carry the FEC data along with the media. The procedure for this is simple. In some media packet, the payload type is set to the value for redundant encodings. The secondary coder is then set to be the FEC data. This means that the PTI of the secondary coder is set to the PTI value which indicates FEC. The block length of the secondary coder is set to the length of the FEC header and payload. The timestamp offset is set to the difference between the media timestamp and the timestamp from the FEC packet. The secondary coder payload includes the FEC header and FEC payload. This procedure only works if an FEC packet is sent after at least one of the media packets it is associated with has been sent. Otherwise, the timestamp offset would be negative, which is not allowed. Using the redundant encodings payload format also implies that the marker bit cannot be recovered. An advantage of this approach is a reduction in the overhead for sending FEC packets. 7 Open Issues There are a number of open issues to be resolved. The change in defi- nition of the RTP header fields will affect many of the parameters sent in RTCP packets. For example, jitter computations may have to exclude FEC packets. Octet counts and number of transmitted packets probably should include FEC, however. 8 Conclusion This draft has presented a new payload format which allows for for- ward error correction of audio visual media. It is generic, allowing any sender defined error correction schemes to be used which meets the required criteria (any xor based strategy meets the criteria). It is also backwards compatible with existing implementations. Receivers which cannot understand FEC can discard the FEC packets, and still receive the media packets. 9 Security Considerations J.Rosenberg,H.Schulzrinne [Page 13] Internet Draft Generic FEC March 9, 1998 There are no security considerations beyond those discussed in [5] and [6]. 10 Author's Addresses Jonathan Rosenberg Lucent Technologies, Bell Laboratories 101 Crawfords Corner Rd. Holmdel, NJ 07733 Rm. 4C-526 email: jdrosen@bell-labs.com Henning Schulzrinne Columbia University M/S 0401 1214 Amsterdam Ave. New York, NY 10027-7003 email: schulzrinne@cs.columbia.edu 11 Bibliography [1] J.-C. Bolot and A. Garcia, The case for fec-based error control for packet audio in the internet, Multimedia Systems , 1997. [2] C. Perkins, Options for repair of streaming media, Internet Draft, Internet Engineering Task Force, Aug. 1997. Work in progress. [3] D. Budge, R. McKenzie, W. Mills, and P. Long, Media-independent error correction using rtp, (internet draft), Internet Engineering Task Force, May 1996. Work in Progress. [4] C. Perkins, I. Kouvelas, V. Hardman, M. Handley, J.-C. Bolot, A. Vega-Garcia, and S. Fosse-Parisis, RTP payload for redundant audio data, Internet Draft, Internet Engineering Task Force, Mar. 1997. Work in progress. [5] H. Schulzrinne, RTP profile for audio and video conferences with minimal control, Request for Comments (Proposed Standard) 1890, Internet Engineering Task Force, Jan. 1996. [6] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, RTP: a transport protocol for real-time applications, Request for Comments (Proposed Standard) 1889, Internet Engineering Task Force, Jan. 1996. J.Rosenberg,H.Schulzrinne [Page 14] Internet Draft Generic FEC March 9, 1998 J.Rosenberg,H.Schulzrinne [Page 15]