Internet Engineering Task Force                 Audio Video Transport WG
Internet Draft                                 J.Rosenberg,H.Schulzrinne
draft-ietf-avt-reedsolomon-00.txt          Bell Laboratories,Columbia U.
November 3, 1998
Expires: May 2, 1999


                An RTP Payload Format for Reed Solomon Codes

STATUS OF THIS MEMO

   This document is an Internet-Draft. Internet-Drafts are working docu-
   ments of the Internet Engineering Task Force (IETF), its areas, and
   its working groups.  Note that other groups may also distribute work-
   ing documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as ``work in progress''.

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

   Distribution of this document is unlimited.


                                 ABSTRACT


         This document specifies a payload format for forward
         error correction of media encapsulated in RTP using Reed
         Solomon codes. The payload format allows end systems to
         transmit using arbitrary block lengths and codes. It also
         allows for the recovery of both the payload and critical
         RTP header fields. Since FEC is sent as a separate
         stream, it is backwards compatible with non-FEC capable
         hosts, so that receivers which do not wish to implement
         FEC can just ignore the extensions.

1 Introduction


J.Rosenberg,H.Schulzrinne                                     [Page 1]

Internet Draft                 RS Coding                November 3, 1998


   The quality of packet voice on the Internet has been mediocre due, in
   part, to high packet loss rates. This is especially true on wide-area
   connections. Unfortunately, the strict delay requirements of real-
   time multimedia usually eliminate the possibility of retransmissions.
   It is for this reason that forward error correction (FEC) has been
   proposed to compensate for packet loss in the Internet [1] [2]. In
   particular, the use of traditional error correcting codes, such as
   parity, Reed-Solomon, and Hamming codes, has attracted attention. To
   support these mechanisms, protocol support is required.

   This document defines a payload format for RTP which allows for for-
   ward error correction of media based on Reed Solomon (RS) codes. It
   is similar in design to [3], which specifies forward error correction
   using exclusive or based parity codes. As with that format, the Reed
   Solomon format described here is generic. This means that the proto-
   col is (1) independent of the nature of the media being protected, be
   it audio, video, or otherwise, (2) flexible enough to support a wide
   variety of RS codes, (3) designed for adaptivity so that the FEC
   technique can be modified easily without out of band signaling, and
   (4) supportive of a number of different mechanisms for transporting
   the FEC packets.

2 Terminology

   The following terms are used throughout this document:

        1.   Media Payload: is a piece of raw, un-protected user data
             which is to be transmitted from the sender. The media pay-
             load would is placed inside of an RTP packet.

        2.   Media Header: is the RTP header for the packet containing
             the media payload.

        3.   Media Packet: The combination of a media payload and media
             header is called a media packet.

        4.   FEC Packet: The forward error correction algorithms at the
             transmitter take the media packets as an input. They output
             both the media packets that they are passed, and new pack-
             ets called FEC packets. The FEC packets are formatted
             according to the rules specified in this document.

        5.   FEC Header: The FEC header is the header information con-
             tained in an FEC packet.

        6.   FEC Payload: The FEC payload is the payload in an FEC
             packet.


J.Rosenberg,H.Schulzrinne                                     [Page 2]

Internet Draft                 RS Coding                November 3, 1998


        7.   Media Block: The media block is a set of K consecutive
             media packets which are used to generate the FEC packets.

        8.   Coding Block: The coding block is a set of N packets, con-
             sisting of the K packets from a single media block, plus
             N-K additional FEC packets generated from the media block.

        9.   K: K is a variable which represents the number of media
             packets in a media block.

        10.  N: N is a variable which represents the total number of
             packets in a coding block.

        11.  Reed Solomon Code: A Reed Solomon code is uniquely deter-
             mined by the values of N and K, as defined above, and the
             symbol length.

3 Basic Operation

   The media packets are broken into blocks of K. K can vary from block
   to block. The Reed Solomon coding is used to obtain N-K FEC packets
   which protect the K media packets.

   The FEC packets are sent as a separate stream from the media packets.
   This implies that the FEC packets have their own sequence number
   space. Although the timestamps for the FEC packets are derived from
   the media packets, they increment monotonically. Combined together,
   FEC packet streams work well with RTP header compression. The media
   packet stream is unaffected by the use of FEC. This allows the two to
   be sent on a separate multicast group, so that non-FEC receivers can
   ignore the FEC and just receive the original media. The separation
   also allows for coherent values for the sequence numbers and times-
   tamps.

   This document does not prescibe the definition of "separate streams",
   but leaves this to applications and higher level protocols to define.
   For multicast, the separate stream may be implemented by separate
   multicast groups, different ports in the same group, or by a dif-
   ferent SSRC within the same group/port. For unicast, different ports
   or different SSRC may be used. Each of these approaches has drawbacks
   and benefits which depend on the application.

   At the receiver, arriving FEC and media packets are used to generate
   a stream of media packets for direct use by the application. This
   results in a clean separation of error protection from the applica-
   tion.

   FEC packets encoded according to this document are indicated through


J.Rosenberg,H.Schulzrinne                                     [Page 3]

Internet Draft                 RS Coding                November 3, 1998


   the payload type in the packet header. These payload types are sig-
   naled dynamically.

4 Reed Solomon Codes

   The detailed operation and theory behind Reed Solomon codes is not
   important here. A Reed Solomon code takes a group of K data blocks
   and generates N - K FEC blocks. A receiver needs to receive any K of
   the N data or FEC blocks in order to recover the K data blocks.
   Besides K and N, the only parameters of the algorithm are the symbol
   length, which is the number of bits per symbol used in the algorithm.
   If the symbol length is l, the size of the data block must be a mul-
   tiple of l. For more information of Reed Solomon codes, the reader is
   referred to [4].

   Reed Solomon codes are systematic. This means that the K data blocks
   are not modified as a result of the FEC operation.

5 RTP Media Packet Structure

   The media packets and FEC packets are sent as separate streams. The
   media packets are unaffected by FEC, and are sent in the same fashion
   they would be sent if there were no FEC.

   This lends to a very efficient encoding. When little (or no) FEC is
   used, there are mostly media packets being sent. This means that the
   overhead (present in FEC packets only) tracks the amount of FEC in
   use.

6 FEC Packet Structure

   When a packet is to be transmitted which contains FEC data, the RTP
   header is followed by an FEC header. We first discuss the semantics
   of the RTP header fields.

   The version field is set to 2. The padding bit is computed via the
   protection operation, defined below. The extension bit is also com-
   puted via the protection operation. The SSRC value should generally
   be the same as the SSRC value of the media stream it protects. It MAY
   be different if the FEC stream is being demultiplexed via the SSRC
   value. The CC value is computed via the protection operation. The
   CSRC list is never present, independent of the value of the CC field.
   The extension is never present, independent of the value of the X
   bit. The marker bit is computed via the protection operation.

   The sequence number has the standard definition: it is one higher
   than the sequence number in the previously transmitted FEC packet.
   The timestamp is set in the following fashion. When the FEC packet is


J.Rosenberg,H.Schulzrinne                                     [Page 4]

Internet Draft                 RS Coding                November 3, 1998


   sent, the value of the media RTP timestamp is examined. This value is
   used as the timestamp of the FEC packet. This results in the TS value
   in FEC packets to be monotonically increasing, independent of the FEC
   scheme.

   The payload type is obtained through out of band signaling. The sig-
   naling protocol MUST establish a symbol length to be associated with
   the payload type value. End systems which cannot recognize a payload
   type must discard it. This provides backwards compatibility. The FEC
   mechanisms can then be used in a multicast group with mixed FEC-
   capable and FEC-uncapable receivers.

   Following the RTP header is the Reed Solomon header.

6.1 Reed Solomon Header

   The format of the Reed-Solomon FEC header is shown below.


      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |      SN Base                  |        length recovery        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |E| PT Recovery |       N       |        K      |       i       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          TS Recovery                          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   The length recovery field is used to determine the length of any
   recovered packets. It is computed via the protection operation
   applied to the 16 bit natural binary representation of the lengths
   (in bytes) of the media payload, CSRC list, extension and padding of
   media packets in the media block (in other words, the CSRC list,
   extension, and padding, if present, are "counted" as part of the pay-
   load). This allows for the FEC procedure to be applied even when the
   lengths of the media packets are not identical.

   The E bit indicates a header extension. Implementations conforming to
   this version of the specification MUST set this bit to zero.

   The PT recovery field is obtained via the protection operation
   applied to the payload type values of the media packets in the media
   block.

   The TS recovery field is computed via the protection operation


J.Rosenberg,H.Schulzrinne                                     [Page 5]

Internet Draft                 RS Coding                November 3, 1998


   applied to the timestamps of the media packets in the media block.
   This allows the timestamp to be completely recovered.

   The SN base is the sequence number of the first media packet in the
   media block protected by FEC. The K field is the number of media
   packets in the media block, minus 1. The N field is the total number
   of FEC plus data packets in the coding block, minus 1. The i field
   indicates that this packet is the i+1th FEC packet of the N-K FEC
   packets in the code. The symbol length of the code is indicated by
   means of the payload type field in the RTP header.

   The payload of the FEC packet is the protection operation applied to
   the CSRC List plus payloads plus extension plus padding of the media
   packets in the media block.

7 Protection Operation

   The protection operation involves taking a variety of fields from the
   various headers, adding the payloads, appending the whole thing
   together, padding zeroes, and then computing the FEC across the
   resulting binary block. The result is then placed into the FEC
   packet.

   For each media block consisting of K media packets, K binary arrays
   are generated (one for each packet) by appending the following fields
   from the header and payload together:

        o Padding Bit (1 bit)

        o Extension Bit (1 bit)

        o CC bits (3 bits)

        o Marker bit (1 bit)

        o Payload Type (7 bits)

        o Timestamp (32 bits)

        o Natural binary representation of the length of the CSRC List
          plus padding plus payload plus extension of the media packet
          (16 bits)

        o CSRC List (if CC is 1), else null (variable)

        o Header Extension (if X is 1), else null (variable)

        o the payload (variable)


J.Rosenberg,H.Schulzrinne                                     [Page 6]

Internet Draft                 RS Coding                November 3, 1998


        o Padding, if present (variable)

   If the lengths of the binary arrays are not equal, they are padded
   with zeroes to be the length of the longest binary array. If the
   resulting binary arrays have a length which is not a multiple of the
   symbol length, they are all padded further until they are a multiple
   of the symbol length.

   The Reed Solomon encoding operation is then applied to the K binary
   arrays, generating N-K FEC arrays. Each FEC array is used to generate
   a single FEC packet.

   The first bit in the FEC packet binary array is written into the Pad-
   ding Bit of the FEC packet. The second bit in the FEC packet binary
   array is written into the Extension bit of the FEC packet. The next
   three bits of the FEC packet binary array are written into the CC
   field of the FEC packet. The next bit of the FEC packet binary array
   is written into the marker bit of the FEC packet. The next 7 bits of
   the FEC packet binary array are written into the PT recovery field in
   the FEC packet header. The next 32 bits of the FEC packet binary
   array are written into the TS recovery field in the packet header.
   The next 16 bits are written into the Length Recovery field in the
   FEC packet header. The remaining bits are set to be the payload of
   the FEC packet.

8 Recovery Procedures

   The FEC packets allow end systems to recover from the loss of media
   packets. All of the header fields of the missing packets, including
   CSRC lists, extensions, padding bits, marker and payload type, are
   recoverable.  This section describes the procedure for performing
   this recovery.

   Recovery requires two distinct operations. The first determines when
   recovery should be attempted, based on the packets which have
   arrived. The second is to actually perform the reconstruction.

   The determination of when to recover is straightforward. For each
   coding block, once K of the N packets in the block have arrived,
   recovery can be attempted. Each FEC packet contains an FEC base, and
   the value of K an N. Using these, it is easy to determine when K of N
   packets have been received.

   One approach for an implementation of this logic is to use linked
   lists of packets. Each list is associated with two numbers: an SN
   base and a value of K. When an FEC packet arrives, if there is no
   list associated with its SN base field, a new list is created. Other-
   wise, the FEC packet is placed in the list associated with that SN


J.Rosenberg,H.Schulzrinne                                     [Page 7]

Internet Draft                 RS Coding                November 3, 1998


   base field. When a media packet arrives, its sequence number is
   checked against each list. If its SN is between the SN base and SN
   base plus K, the media packet is also placed in the list. Once a list
   has K packets in it, recovery can be peformed.

8.1 Reconstruction

   Reconstruction is possible when any K of the packets (media or FEC)
   from the coding block have arrived. Let T be the list of packets (FEC
   and media) which can be combined to recover some media packet xi. The
   procedure is as follows:

        1.   For each of the media packets in T, compute the binary
             array as described in the previous section.

        2.   For the FEC packets in T, compute the binary array in the
             same fashion, except always set the CSRC list, extension,
             and padding to null.

        3.   If the resulting K binary arrays are not of equal length,
             pad them with zeroes to be the length of the longest binary
             array.

        4.   If the lengths are not a multiple of the symbol length, pad
             them until they are a multiple.

        5.   Apply the Reed Solomon operation to the K binary arrays.
             This will result in N binary arrays, one of which is the
             recovery array corresponding to the packet to be recovered.

        6.   Create a new packet with the standard 12 byte header and no
             payload.

        7.   Set the version of the new packet to 2.

        8.   Set the Padding bit in the new packet to the first bit in
             the recovery array.

        9.   Set the Extension bit in the new packet to the second bit
             in the recovery array.

        10.  Set the CC field to the next three bits in the recovery
             array.

        11.  Set the marker bit in the new packet to the next bit in the
             recovery array.

        12.  Set the payload type in the new packet to the next 7 bits


J.Rosenberg,H.Schulzrinne                                     [Page 8]

Internet Draft                 RS Coding                November 3, 1998


             in the recovery array.

        13.  Set the SN field in the new packet to xi.

        14.  Set the TS field in the new packet to the next 32 bits in
             the recovery array.

        15.  Take the next 16 bits of the recovery array. Whatever the
             natural binary number this corresponds to, take that many
             bytes from the recovery array and append them to the new
             packet. This represents the CSRC list, extension, payload,
             and padding.

        16.  Set the SSRC of the new packet to the SSRC of the media
             stream its protecting

   This procedure will completely recover both the header and payload of
   an RTP packet.

9 Use with Redundant Encodings

   One can consider an FEC packet as a "redundant coding" of the media.
   Because of this, the payload format for encoding of redundant audio
   data [5] can be used to carry the FEC data along with the media. The
   procedure for this is simple. In some media packet, the payload type
   is set to the value for redundant encodings. The secondary coder is
   then set to be the FEC data. This means that the PTI of the secondary
   coder is set to the PTI value which indicates FEC. The block length
   of the secondary coder is set to the length of the FEC header and
   payload. The timestamp offset is set to the difference between the
   media timestamp and the timestamp from the FEC packet. The secondary
   coder payload includes the FEC header and FEC payload.

   This procedure only works if an FEC packet is sent after the last of
   the media packets in the media block has been sent. Otherwise, the
   timestamp offset would be negative, which is not allowed.

   Using the redundant encodings payload format also implies that the
   marker bit cannot be recovered.

   An advantage of this approach is a reduction in the overhead for
   sending FEC packets.

10 Conclusion

   This document has presented a new RTP payload format which allows for
   Reed Solomon based forward error correction of audio visual media. It
   is flexible, allowing nearly any Reed Solomon Code to be used. It is


J.Rosenberg,H.Schulzrinne                                     [Page 9]

Internet Draft                 RS Coding                November 3, 1998


   also backwards compatible with existing RTP implementations.
   Receivers which cannot understand FEC can discard the FEC packets,
   and still receive the media packets.

11 Security Considerations

   The use of FEC has implications on the usage and changing of keys for
   encryption. As the FEC packets do consist of a separate stream, there
   are a number of permutations on the usage of encryption. In particu-
   lar:

        o The FEC stream may be encrypted, while the media stream is
          not.

        o The media stream may be encrypted, while the FEC stream is
          not.

        o The media stream and FEC stream are both encrypted, but using
          different keys.

        o The media stream and FEC stream are both encrypted, but using
          the same key.

   The first three of these would require any application level signal-
   ing protocols to be aware of the usage of FEC, and to thus exchange
   keys for it and negotiate its usage on the media and FEC streams
   separately. In the final case, no such additional mechanisms are
   needed. Applications utilizing encryption SHOULD encrypt both
   streams, however. Encrypting just one may make certain known-
   plaintext attacks possible.

   However, the changing of keys becomes problematic. For example, if
   two packets a and b are sent, and FEC packet f(a,b) is sent, and the
   keys used for a and b are different, which key should be used to
   decode f(a,b)? In general, old keys will likely need to be cached, so
   that when the keys change for the media stream, the old key is kept,
   and used, until it is determined that the key has changed on the FEC
   packets as well.

   Another issue with the use of FEC is its impact on network conges-
   tion. Adding FEC in the face of increasing network losses is a bad
   idea, as it can lead to increased congestion and eventual congestion
   collapse if done on a widespread basis. As a result, implementors
   MUST NOT substantially increase the amount of FEC in use as network
   losses increase.

12 Full Copyright Statement


J.Rosenberg,H.Schulzrinne                                    [Page 10]

Internet Draft                 RS Coding                November 3, 1998


   Copyright (C) The Internet Society (1998). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implmentation may be prepared, copied, published and
   distributed, in whole or in part, without restriction of any kind,
   provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.

   However, this document itself may not be modified in any way, such as
   by removing the copyright notice or references to the Internet Soci-
   ety or other Internet organizations, except as needed for the purpose
   of developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be fol-
   lowed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MER-
   CHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."

13 Author's Addresses


   Jonathan Rosenberg
   Lucent Technologies, Bell Laboratories
   101 Crawfords Corner Rd.
   Holmdel, NJ 07733
   Rm. 4C-526
   email: jdrosen@bell-labs.com

   Henning Schulzrinne
   Columbia University
   M/S 0401
   1214 Amsterdam Ave.
   New York, NY 10027-7003
   email: schulzrinne@cs.columbia.edu


14 Bibliography


J.Rosenberg,H.Schulzrinne                                    [Page 11]

Internet Draft                 RS Coding                November 3, 1998


   [1] J.-C. Bolot and A. Garcia, "The case for fec-based error control
   for packet audio in the internet," Multimedia Systems , 1997.

   [2] C. Perkins and C. Perkins, "Options for repair of streaming
   media," Request for Comments (Informational) 2354, Internet Engineer-
   ing Task Force, June 1998.

   [3] J. Rosenberg and H. Schulzrinne, "An RTP payload format for gen-
   eric forward error correction," Internet Draft, Internet Engineering
   Task Force, Aug.  1998.  Work in progress.

   [4] L. Rizzo, "Erasure codes for computer communication protocols,"
   technical report, Universita di Pisa, Pisa, Italy, Jan. 1997.

   [5] C. Perkins, I. Kouvelas, V. Hardman, M. Handley, and J. Bolot,
   "RTP payload for redundant audio data," Request for Comments (Pro-
   posed Standard) 2198, Internet Engineering Task Force, Sept. 1997.


J.Rosenberg,H.Schulzrinne                                    [Page 12]