Internet Engineering Task Force Johan Sjoberg, Ericsson Audio Video Transport WG Magnus Westerlund, Ericsson INTERNET-DRAFT Ari Lakaniemi, Nokia March 30, 2001 Petri Koskelainen, Nokia Expires: September 30, 2001 Bernhard Wimmer, Siemens Tim Fingscheidt, Siemens Qiaobing Xie, Motorola Sanjay Gupta, Motorola RTP payload format and file storage format for AMR and AMR-WB audio Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or cite them other than as "work in progress". The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/lid-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html This document is an individual submission to the IETF. Comments should be directed to the authors. Abstract This document specifies a real-time transport protocol (RTP) payload format to be used for AMR and AMR-WB speech encoded signals. The payload format is designed to be able to interoperate with existing AMR and AMR-WB transport formats. Furthermore, a file format for storage of AMR and AMR-WB speech data is specified. Two separate MIME type registrations, one for AMR and one for AMR-WB, describing both RTP payload format and storage format are included. Sjoberg et al. [Page 1] INTERNET-DRAFT RTP Payload Format for AMR and AMR-WB March 30, 2001 1. 
Introduction This payload description applies to the packetization of data from two different codecs, the Adaptive Multi-Rate (AMR) codec and the Adaptive Multi-Rate Wideband (AMR-WB) codec. It is important to remember that these are different codecs and they MUST always be handled as different payload types in RTP. 1.1. The Adaptive Multi-Rate speech codec The adaptive multi-rate (AMR) speech codec [1] was developed by the European Telecommunications Standards institute (ETSI). The AMR codec is standardized for GSM, and is also chosen by the Third Generation Partnership Project (3GPP) as the mandatory codec for third generation systems. The AMR codec will be widely used in cellular systems. The AMR codec is a multi-mode codec with 8 narrow band speech modes with bit rates between 4.75 and 12.2 kbps. The sampling frequency is 8000 Hz and processing is done on 20 ms frames, i.e. 160 samples per frame. The AMR modes are closely related to each other and use the same coding framework. Three of the AMR modes are already adopted standards of their own, the 6.7 kbps mode as PDC-EFR [10], the 7.4 kbps mode as IS-641 codec in TDMA [9], and the 12.2 kbps mode as GSM- EFR [8]. 1.2. The Adaptive Multi-Rate Wideband speech codec The Adaptive Multi-Rate Wideband (AMR-WB) speech codec [3] was originally developed by 3GPP to be used in GSM and 3G systems. The AMR-WB codec will be widely used in cellular systems. The AMR-WB codec is a multi-mode speech codec with 9 wideband speech coding modes with bit-rates between 6.6 and 23.85 kbps. The sampling frequency is 16000 Hz and processing is performed on 20 ms frames, i.e. 320 speech samples per frame. The AMR-WB modes are closely related to each other and employ the same coding framework. 1.3. Common Characteristics for AMR and AMR-WB The multi-mode feature is used to preserve high speech quality under a wide range of transmission conditions. In mobile radio systems (e.g. 
GSM) mode adaptation allows the system to adapt the balance between speech coding and error protection to enable the best possible speech quality in prevailing transmission conditions. Mode adaptation can also be used to adapt to varying available transmission bandwidth. Codec implementations must support all specified speech coding modes, and mode switching can occur to any mode at any time. The mode information must therefore be transmitted together with the speech encoded bits, to indicate the mode. To realize rate adaptation the decoder needs to signal the mode it prefers to receive to the encoder.

Both codecs include voice activity detection (VAD) and generation of comfort noise (CN) parameters during silence periods. Hence, the codecs can reduce the number of transmitted bits and packets during silence periods to a minimum. The operation of sending CN parameters at regular intervals during silence periods is usually called discontinuous transmission (DTX) or source controlled rate (SCR) operation. The frames containing CN parameters are called Silence Indicator (SID) frames.

Due to their flexibility and robustness, these codecs are also suitable for purposes other than circuit switched cellular systems, such as real-time services over packet switched networks.

The payload format should be designed for robustness against both bit errors and packet loss. The speech encoded bits have different perceptual sensitivity to bit errors and cellular systems exploit this by using unequal error protection and detection (UEP and UED). The UEP/UED mechanisms focus the correction and detection of corrupted bits on the perceptually most sensitive bits. A speech frame is only declared damaged if there are bit errors in the most sensitive bits, i.e. the class A bits, see [2] and [4].
It is acceptable to have some bit errors in the other bits, i.e. class B and C. Also a damaged frame is still useful for error concealment in the decoding, which uses some of the less sensitive bits. This improves the speech quality compared to discarding the data. Today there exist some link layers that do not discard packets with bit errors, e.g. SLIP and some wireless links. With the Internet traffic pattern shifting towards a more media-centric one, more link layers of such nature may emerge in the future. With transport layer support for partial checksums, for example those supported by UDP- Lite [13] (work in progress), bit error tolerant AMR and AMR-WB traffic could achieve better performance over these types of links. There are at least two basic approaches for carrying AMR and AMR-WB traffic over bit error tolerant networks: 1) Utilizing a partial checksum to cover headers and the most important speech bits of the payload. It is recommended that at least all class A bits are covered by the checksum. 2) Utilizing a partial checksum to only cover headers, but a frame CRC to cover the class A bits of each speech frame in the payload. Sjoberg et al. [Page 3] INTERNET-DRAFT RTP Payload Format for AMR and AMR-WB March 30, 2001 In either approach, at least part of the class B/C bits are left without error-check and thus bit error tolerance is achieved. It is still important that the network designer pay attention to the class B and C residual bit error rate. Though less sensitive to errors than class A bits, class B bits are not insignificant and undetected errors in these bits cause degradation in speech quality. An example of residual error rates considered acceptable for AMR in UMTS can be found in [21] and for AMR-WB in [22]. 
Approach 1 is a bit efficient, flexible and simple way, but comes with two disadvantages, namely, a) bit errors in protected speech bits will cause the payload to be discarded, and b) when transporting multiple frames in a payload there is the possibility that a single bit error in protected bits gets all the frames discarded. These disadvantages can be avoided if needed, with some overhead in the form of a frame-wise CRC (Approach 2). In problem a), the CRC makes it possible to detect bit errors in class A bits and use the frame for error concealment, which gives a small improvement in speech quality. Secondly (b), when transporting multiple frames in a payload the CRC's remove the possibility that a single bit error in a class A bit gets all the frames discarded. Avoiding that gives an improvement in speech quality when transporting multiple frames and subject to bit errors. The choice between the two approaches must be made based on the available bandwidth, and desired tolerance to bit errors. Neither solution is appropriate to all cases. The payload format supports several means to increase robustness against packet loss. The simple scheme of repetition of previously sent data is one possibility. Another possible scheme which is more bandwidth efficient is to use payload external FEC, e.g. RFC2733 [20], which generates extra packets containing repair data. The whole payload can also be sorted in sensitivity order to support external FEC schemes using UEP. There is work in progress on a generic version of such a scheme [19]. Several frames can be encapsulated into a single RTP packet to decrease protocol overhead. One of the drawbacks of such approach is that in case of packet loss this means loss of several consecutive speech frames, which usually causes clearly audible distortion in reconstructed speech. Interleaving of frames can improve the speech quality in such cases by distributing the consecutive losses into series of single frame losses. 
However, interleaving and bundling several frames per payload will also increase end-to-end delay and is therefore not applicable to all types of applications. However, streaming applications are likely to be able to exploit interleaving to improve speech quality in lossy transmission conditions. Sjoberg et al. [Page 4] INTERNET-DRAFT RTP Payload Format for AMR and AMR-WB March 30, 2001 2. Payload format The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119 [5]. The AMR and AMR-WB payload format supports transmission of multiple frames per payload, the use of fast codec mode adaptation, and robustness against packet loss and bit errors. The payload format consists of one payload header with an optional interleaving extension, a table of contents, optionally one CRC per payload frame and zero or more payload frames. The payload format is either bandwidth efficient or octet aligned, which mode of operation to use has to be signalled at session establishment. Only the octet aligned format has the possibility to use the robust sorting, interleaving and CRC to make it robust to packet loss and bit errors. In the octet aligned format the payload header, table of contents entries and the payload frames are individually octet aligned to make implementations efficient, but in the bandwidth efficient format only the full payload is octet aligned. If the option to transmit a robust sorted payload is enabled and employed, the full payload SHALL finally be ordered in descending bit error sensitivity order to be prepared for unequal error protection or unequal error detection schemes. The encoded bit streams are defined in sensitivity order in Annex B of [2] and [4], the original order as delivered from the speech encoder is defined in [1] and [3]. 
Octet alignment of a field or payload means that the last octet MUST be padded with zeroes at the end to fill the octet.

The AMR frame types, or modes, are defined in [2] and the corresponding description for AMR-WB is found in [4]. Frame type 14, SPEECH_LOST (only available for AMR-WB), and frame type 15, NO_DATA, are needed to indicate frames that were lost or not transmitted. NO_DATA can mean either that the speech encoder produced no data for this frame or that no data for this frame is transmitted in this payload, i.e. valid data for this frame could be sent in an earlier or a following packet. This occurs, for example, when multiple frames are sent in each payload and comfort noise starts: a frame type sequence in a payload with 8 speech frames using AMR mode 7 that is interrupted by DTX operation in the fifth frame looks like {7,7,7,7,8,15,15,8}. The AMR SCR/DTX is described in [6] and AMR-WB SCR/DTX in [7].

Robustness against packet loss can be accomplished by retransmitting previously transmitted frames together with the current frame or frames. Another approach is to use interleaving to reduce the speech quality effect of packet losses. The speech quality in case of packet losses when transmitting several frames per packet can be improved by using OPTIONAL frame interleaving. Interleaving improves perceived speech quality since it introduces single frame errors instead of several consecutive frame errors. Note that interleaving can be applied only if the receiver has signaled support for it in its capability description.

The AMR performance over error tolerant links can be improved by also delivering speech frames with bit errors. Unequal error detection is needed since bit errors SHOULD only be allowed in the least error sensitive bits. This payload format provides two alternative methods to implement unequal error detection:

A.
CRC calculation over the class A speech bits

The optional CRC MAY be used to protect the class A speech bits. The number of class A bits for AMR is specified only as informative in [2] and is therefore copied into Table 1 as normative for this payload format. The number of class A bits for AMR-WB is specified as normative in table 2 in [4] and these numbers MUST also be used for this payload format. Speech frames with errors in class A bits MUST be marked with SPEECH_BAD for corrupted speech frames (FT=0..7 for AMR and FT=0..8 for AMR-WB) or SID_BAD for corrupted SID frames (FT=8 for AMR and FT=9 for AMR-WB) and be sent to the speech decoder, see [6] and [7]. In this case the RTP header, payload header and table of contents should be covered by a transport layer checksum, e.g. UDP-lite [13]. Packets MUST be discarded if the transport layer checksum detects errors.

B. Robust sorting of payload bits

Robust behavior can also be accomplished by robust sorting of the payload. This enables the use of UED (e.g. UDP-lite) and UEP (e.g. ULP [19]). The UED and/or UEP is recommended to cover at least the RTP header, payload header, table of contents and class A bits.

Support for unequal error detection is OPTIONAL. If either scheme is to be used, it MUST be signaled out of band (see section 7).

                             Class A   total speech
   Index   Mode               bits        bits
   -------------------------------------------------
     0     AMR 4.75            42          95
     1     AMR 5.15            49         103
     2     AMR 5.9             55         118
     3     AMR 6.7             58         134
     4     AMR 7.4             61         148
     5     AMR 7.95            75         159
     6     AMR 10.2            65         204
     7     AMR 12.2            81         244
     8     AMR SID             39          39

   Table 1. The number of class A bits for the AMR codec.

A frame quality indicator is included for interoperability with the ATM payload format described in ITU-T I.366.2, the UMTS Iu interface [17] and other transport formats. The speech quality is increased if damaged frames are forwarded to the speech decoder error concealment unit and not dropped.
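As a rough illustration of how Table 1 might be used, the sketch below estimates how many leading payload octets an unequal error detection scheme would need to cover so that the payload header, the ToC entry and all class A bits are protected. It assumes the simplest case only: one AMR frame per payload, octet aligned operation (one header octet and one ToC octet), no interleaving extension and no CRC. The function name and the table name are hypothetical and not part of this specification.

```c
#include <assert.h>

/* Class A bits per AMR frame type (FT 0..8), copied from Table 1. */
static const int amr_class_a_bits[9] = {42, 49, 55, 58, 61, 75, 65, 81, 39};

/* Hypothetical helper: number of leading payload octets that cover the
 * payload header, the ToC entry and the class A bits.
 * Assumes a single frame, octet aligned operation, no interleaving
 * extension and no CRC field. */
int amr_oa_min_protected_octets(int ft)
{
    assert(ft >= 0 && ft <= 8);
    int header_octets = 1;                                /* CMR + padding */
    int toc_octets = 1;                                   /* one ToC entry */
    int class_a_octets = (amr_class_a_bits[ft] + 7) / 8;  /* round up      */
    return header_octets + toc_octets + class_a_octets;
}
```

The RTP header and any transport header would have to be added to this figure; how much of the payload to protect beyond the class A bits remains a network and application decision.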
In many communication scenarios the AMR encoded bits will be transmitted from one IP/UDP/RTP terminal to a terminal in a system with another transport format and/or vice versa. The transport format transcoding will be done in a gateway. A second likely scenario is that IP/UDP/RTP is used as transport between other systems, i.e. IP is originated and terminated in gateways on both sides of the IP transport.

   AMR or AMR-WB
   over
   I.366.{2,3} or +------+                        +----------+
   3G Iu or       |      |     IP/UDP/RTP/AMR     |          |
   -------------->|  GW  |----------------------->| TERMINAL |
   GSM Abis       |      |                        |          |
   etc.           +------+                        +----------+

   Figure 1: GW to VoIP terminal scenario

   AMR or AMR-WB                                          AMR or AMR-WB
   over                                                   over
   I.366.{2,3} or +------+                     +------+   I.366.{2,3} or
   3G Iu or       |      | IP/UDP/RTP/AMR or   |      |   3G Iu or
   -------------->|  GW  |-------------------->|  GW  |--------------->
   GSM Abis       |      | IP/UDP/RTP/AMR-WB   |      |   GSM Abis
   etc.           +------+                     +------+   etc.

   Figure 2. GW to GW scenario

2.1. The payload header

The length of the payload header is either 4 or 8 bits plus optionally an 8 bit interleaving header. The bits in the header are specified as follows:

CMR (4 bits): Indicates Codec Mode Requested for the other communication direction. It is only allowed to request one of the speech modes of the used codec, frame type index 0..7 for AMR, see Table 1a in [2] or frame type index 0..8 for AMR-WB, see Table 1a in [4]. CMR value 15 indicates that no mode request is present, other values are for future use.

P: Is a padding bit, always set to zero.

    0
    0 1 2 3
   +-+-+-+-+
   |  CMR  |
   +-+-+-+-+

   Figure 3: Payload header for bandwidth efficient operation.

    0
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |  CMR  |P|P|P|P|
   +-+-+-+-+-+-+-+-+

   Figure 4: Payload header for octet aligned operation.
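The CMR field of Figure 4 could be read on the receiving side roughly as in the following sketch. It assumes octet aligned operation and that bit 0 of the figure is the most significant bit of the octet, as stated for the octet mapping in section 2.4. The function name, the `is_wb` flag and the choice to reject reserved CMR values are assumptions made for illustration; the draft itself only says such values are for future use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AMR_CMR_NO_REQUEST 15   /* CMR value meaning "no mode request" */

/* Extracts the CMR from the first octet of an octet aligned payload.
 * is_wb selects AMR-WB (modes 0..8) instead of AMR (modes 0..7).
 * Returns 0 and stores the CMR on success, -1 for an empty payload or
 * for a CMR value reserved for future use (an assumed policy). */
int amr_oa_read_cmr(const uint8_t *payload, size_t len, int is_wb, int *cmr)
{
    assert(payload != NULL && cmr != NULL);
    if (len < 1) {
        return -1;
    }
    int value = payload[0] >> 4;        /* header bits 0..3, bit 0 is MSB */
    int highest_mode = is_wb ? 8 : 7;
    if (value > highest_mode && value != AMR_CMR_NO_REQUEST) {
        return -1;                      /* reserved value */
    }
    *cmr = value;
    return 0;
}
```

A receiver that prefers to be lenient could instead treat reserved values as "no mode request"; which policy is better depends on the deployment.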
If octet aligned operation and the use of interleaving are both signaled out of band at session set up, the payload header is extended with two 4 bit fields, ILL and ILP, used to describe the interleaving scheme.

ILL (4 bits): OPTIONAL field that is present only if interleaving is signaled. The value of this field specifies the interleaving length used for frames in this payload.

ILP (4 bits): OPTIONAL field that is present only if interleaving is signaled. The value of this field indicates the interleaving index for frames in this payload. The value of ILP MUST be smaller than or equal to the value of ILL. An erroneous value of ILP SHOULD cause the payload to be discarded.

The value of the ILL field defines the length of an interleave group: ILL=L implies that frames in (L+1)-frame intervals are picked into the same interleaved payload, and the interleave group consists of L+1 payloads. The size of the interleave group is N*(L+1) frames, where N is the number of frames per payload. The value of ILP=p in payloads belonging to the same group runs from 0 to L. Interleaving is meaningful only when the number of frames per payload N is greater than or equal to 2. Thus, when N frames are transmitted in each payload of a group, the interleave group consists of payloads with sequence numbers s...s+L, and the frames encapsulated into these payloads are f...f+N*(L+1)-1.

To put this in the form of an equation, assume that the first frame of an interleave group is n, the first payload of the group is s, the number of frames per payload is N, ILL=L and ILP=p (p in range 0...L). The frames contained in payload s+p are then n + p + k*(L+1), where k runs from 0 to N-1. I.e.

   The first packet of an interleave group: ILL=L, ILP=0
   Payload: s
   Frames: n, n+(L+1), n+2*(L+1), ..., n+(N-1)*(L+1)

   The second packet of an interleave group: ILL=L, ILP=1
   Payload: s+1
   Frames: n+1, n+1+(L+1), n+1+2*(L+1), ..., n+1+(N-1)*(L+1)

   ...

   The last packet of an interleave group: ILL=L, ILP=L
   Payload: s+L
   Frames: n+L, n+L+(L+1), n+L+2*(L+1), ..., n+L+(N-1)*(L+1)

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  CMR  |P|P|P|P|  ILL  |  ILP  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 5: Octet aligned operation payload header with interleaving extension.

2.2. The payload table of contents and CRCs

The table of contents (ToC) consists of one entry for each speech frame in the payload. A table of contents entry includes several specified fields as follows:

F (1 bit): Indicates if this frame is followed by further frames. F=1 further frames follow, F=0 last frame.

FT (4 bits): Frame type indicator, indicating the AMR speech coding mode or comfort noise (SID) mode. The mapping of existing AMR modes to FT is given in Table 1a in [2] for AMR and in Table 1a in [4] for AMR-WB. If FT=14 (speech lost, available only in AMR-WB) or FT=15 (No transmission/no reception) no CRC or payload frame is present.

Q (1 bit): The payload quality bit indicates, if not set, that the payload is severely damaged and the receiver should set the RX_TYPE, see [6], to SPEECH_BAD or SID_BAD depending on the frame type (FT).

P: Is a padding bit, always set to zero.

    0
    0 1 2 3 4 5
   +-+-+-+-+-+-+
   |F|  FT   |Q|
   +-+-+-+-+-+-+

   Figure 6: Table of contents entry field for bandwidth efficient operation.

    0
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |F|  FT   |Q|P|P|
   +-+-+-+-+-+-+-+-+

   Figure 7: Table of contents entry field for octet aligned operation.

CRC (8 bits): OPTIONAL field, exists if the use of CRC is signaled at session set up. The 8 bit CRC is used for error detection.
The algorithm to generate these 8 parity bits is defined in section 4.1.4 in [2].

    0
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |      CRC      |
   +-+-+-+-+-+-+-+-+

   Figure 8: CRC field

The ToC and CRCs are arranged with all table of contents entry fields first, followed by all CRC fields. The ToC starts with the frame data belonging to the oldest speech frame.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|  FT   |Q|P|P|F|  FT   |Q|P|P|F|  FT   |Q|P|P|      CRC      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      CRC      |      CRC      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 9: The ToC and CRCs for a payload with three speech frames

2.3. Speech frame

A speech frame represents one frame encoded with the mode according to the ToC field FT. The length of this field is implicitly defined by the AMR mode in the FT field. The bits SHALL be sorted according to Appendix B of [2] for AMR and Appendix B of [4] for AMR-WB. If octet aligned operation is used, the last octet of each speech frame MUST be padded with zeroes at the end if not all bits are used.

2.4. Compound payload

The compound payload consists of one AMR payload header, the table of contents and one or more speech frames, see section 2.1, 2.2 and 2.3. These elements SHALL be put together to form a payload with either simple or robust sorting. If bandwidth efficient operation is used, simple sorting MUST be used.
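Because the length of every speech frame follows from its FT value, a receiver can walk the ToC and work out how long a payload ought to be before touching any speech data. The sketch below does this for AMR (narrowband) in octet aligned operation without the interleaving extension. The frame sizes are the total speech bits from Table 1; the function name, the `crc_used` flag and the decision to reject FT values 9..14 are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Total speech bits per AMR frame type (FT 0..8), from Table 1. */
static const int amr_frame_bits[9] = {95, 103, 118, 134, 148, 159, 204, 244, 39};

#define AMR_FT_NO_DATA 15

/* Walks the ToC of an octet aligned AMR payload (one header octet, no
 * interleaving extension) and returns the payload length in octets that
 * the ToC implies, or -1 if the ToC is truncated or holds an FT value
 * this sketch does not handle. */
long amr_oa_expected_length(const uint8_t *p, size_t len, int crc_used)
{
    assert(p != NULL);
    size_t pos = 1;               /* skip the payload header octet      */
    long body = 0;                /* octets taken by CRCs and frames    */
    int more = 1;
    while (more) {
        if (pos >= len) {
            return -1;            /* ToC runs past the received data    */
        }
        uint8_t entry = p[pos++];
        more = (entry >> 7) & 1;          /* F: further entries follow  */
        int ft = (entry >> 3) & 0x0F;     /* FT: bits 1..4 of the entry */
        if (ft == AMR_FT_NO_DATA) {
            continue;             /* no CRC and no frame for NO_DATA    */
        }
        if (ft > 8) {
            return -1;            /* not an AMR speech or SID frame     */
        }
        body += (amr_frame_bits[ft] + 7) / 8;   /* frame padded to octets */
        if (crc_used) {
            body += 1;            /* one 8 bit CRC per frame with data  */
        }
    }
    return (long)pos + body;
}
```

A caller would compare the returned value with the number of octets actually received and drop the payload on a mismatch. For example, a payload carrying two 7.95 kbps frames (ToC octets 0xAC and 0x2C) is expected to be 43 octets long without CRCs and 45 with them.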
Definitions for describing the compound AMR payload:

   b(m)   - bit m of the compound AMR payload, octet aligned
   o(n,m) - bit m of octet n in the octet description of the
            compound AMR payload, bit 0 is MSB
   t(n,m) - bit m in the table of contents entry for speech frame n
   p(n,m) - bit m in the CRC for speech frame n
   f(n,m) - bit m in speech frame n
   F(n)   - number of bits in speech frame n, defined by FT
   h(m)   - bit m of payload header
   C(n)   - number of CRC bits for speech frame n, 0 or 8 bits
   N      - number of payload frames in the payload
   S      - number of unused bits

Payload frames f(n,m) are ordered in consecutive order, where frame n=1 is preceding frame n=2. Within one payload all frames between the oldest and most recent MUST be present, unless interleaving is used, in which case the interleaving rules defined in section 2.1 apply. If speech data is missing for one or more frames in the sequence of frames in the payload, e.g. due to DTX, the NO_DATA frame type is sent for these frames. This does not mean that all frames must be sent, only that the sequence of frames in one payload MUST indicate missing frames.

The compound AMR payload, b, is mapped into octets, o, where bit 0 is MSB.

2.4.1. Simple payload sorting

If multiple new frames are encapsulated into the payload and robust payload sorting is not used, the payload is formed by concatenating the payload header, the ToC, optional CRC fields and the speech frames in the payload. However, the bits inside a frame are ordered into sensitivity order as defined in [2] for AMR and [4] for AMR-WB.

2.4.1.1.
Simple payload sorting for bandwidth efficient operation

The simple payload sorting algorithm is defined in C-style as:

   /* payload header */
   k=0; H=4;
   for (i = 0; i < H; i++){
     b(k++) = h(i);
   }
   /* table of contents */
   T=6;
   for (j = 0; j < N; j++){
     for (i = 0; i < T; i++){
       b(k++) = t(j,i);
     }
   }
   /* payload frames */
   for (j = 0; j < N; j++){
     for (i = 0; i < F(j); i++){
       b(k++) = f(j,i);
     }
   }
   /* padding of the full payload to an octet boundary */
   S = (k%8 == 0) ? 0 : 8 - k%8;
   for (i = 0; i < S; i++){
     b(k++) = 0;
   }
   /* map into octets */
   for (i = 0; i < k; i++){
     o(i/8,i%8)=b(i);
   }

2.4.1.2. Simple payload sorting for octet aligned operation

In octet aligned operation the simple payload sorting algorithm is defined in C-style as:

   /* payload header */
   k=0; H=8;
   if (interleaving){
     H+=8; /* Interleaving extension */
   }
   for (i = 0; i < H; i++){
     b(k++) = h(i);
   }
   /* table of contents */
   T=8;
   for (j = 0; j < N; j++){
     for (i = 0; i < T; i++){
       b(k++) = t(j,i);
     }
   }
   /* CRCs, only if signaled */
   if (crc) {
     for (j = 0; j < N; j++){
       for (i = 0; i < C(j); i++){
         b(k++) = p(j,i);
       }
     }
   }
   /* payload frames */
   for (j = 0; j < N; j++){
     for (i = 0; i < F(j); i++){
       b(k++) = f(j,i);
     }
     /* padding of each speech frame */
     S = (k%8 == 0) ? 0 : 8 - k%8;
     for (i = 0; i < S; i++){
       b(k++) = 0;
     }
   }
   /* map into octets */
   for (i = 0; i < k; i++){
     o(i/8,i%8)=b(i);
   }

2.4.2. Robust payload sorting

Robust payload sorting is only supported in octet aligned operation and must be signaled at session set up. A bit error in a more sensitive bit is subjectively more annoying than in a less sensitive bit. Therefore, to be able to protect only the most sensitive bits in a payload packet with a forward error detection or correction code, e.g. a checksum outside RTP or ULP [19], the bits inside a frame are ordered into sensitivity order.
The protection SHOULD cover an appropriate number of octets from the beginning of the payload, covering at least the AMR payload header, ToC and class A bits (see [2]). If CRCs are used together with robust sorting only the payload header and the ToC should be covered by the transport checksum. Exactly how many octets need protection depends on the network and application. To maintain sensitivity ordering inside the AMR payload, when more than one speech frame is transmitted in one payload, reordering of the data is needed.

When robust sorting mode is used, the reordering to maintain the sensitivity ordered AMR payload SHALL be performed on octet level. The AMR payload header, ToC and CRCs SHALL still be placed unchanged in the beginning of the payload. Thereafter, the payload frames are sorted with one octet alternating from each payload frame.

The robust payload sorting algorithm is defined in C-style as:

   /* payload header */
   k=0; H=8;
   if (interleaving){
     H += 8; /* interleaving extension */
   }
   for (i = 0; i < H; i++){
     b(k++) = h(i);
   }
   /* table of contents */
   for (j = 0; j < N; j++){
     for (i = 0; i < 8; i++){
       b(k++) = t(j,i);
     }
   }
   /* CRCs */
   if (crc){
     for (j = 0; j < N; j++){
       for (i = 0; i < C(j); i++){
         b(k++) = p(j,i);
       }
     }
   }
   /* payload frames */
   for (j = 0; j < N; j++){
     /* number of padding bits needed to octet align frame j */
     P(j) = F(j)%8 == 0 ? 0 : 8 - F(j)%8;
   }
   /* number of octets in the longest frame */
   maxoct = (max(F(0),..,F(N-1))-1)/8 + 1;
   for (i = 0; i < maxoct; i++){      /* octet index within a frame */
     for (j = 0; j < N; j++){         /* one octet from each frame  */
       for (l = 0; l < 8; l++){
         m = i*8 + l;                 /* bit index within frame j   */
         if (m < F(j)+P(j)){
           if (m < F(j)){
             b(k++) = f(j,m);
           }else{
             b(k++) = 0;              /* padding bit */
           }
         }
       }
     }
   }
   /* map into octets */
   for (i = 0; i < k; i++){
     o(i/8,i%8)=b(i);
   }

2.5.
Decoding security consideration

If the payload length calculated from the signaling information and the F and FT fields does not match the size of the payload actually received, the payload should be dropped. Decoding a packet that has errors in length indicator bits could severely degrade the speech quality.

2.6. Implementation considerations

Implementations SHOULD include both bandwidth efficient and octet aligned operation to give a high possibility of interoperability. The implementation of robust sorting, interleaving and CRCs is OPTIONAL.

3. RTP header usage

The RTP header marker bit (M) is used to mark (M=1) the packets containing as their first frame the first speech frame after a comfort noise period in DTX operation. For all other packets the marker bit is set to 0 (M=0).

The timestamp corresponds to the sampling instant of the first sample encoded for the first frame in the packet. A frame can be either encoded speech, comfort noise parameters, NO_DATA, or SPEECH_LOST (only for AMR-WB). The timestamp unit is in samples. The duration of one speech frame is 20 ms. The sampling frequency is 8 kHz for AMR, corresponding to 160 encoded speech samples per frame, and 16 kHz for AMR-WB, corresponding to 320 samples per frame. Thus, the timestamp is increased by 160 for AMR and 320 for AMR-WB for each consecutive frame.

All frames in a packet MUST be successive 20 ms frames as delivered by the speech encoder, except if interleaving is employed, in which case the frames encapsulated into a payload MUST be picked as defined in section 2.1.

4. Congestion Control

The need for congestion control for data transported with RTP has to be considered. AMR and AMR-WB speech data have some elastic properties due to the different bandwidth demand for each mode. Another parameter that can reduce the bandwidth demand for AMR and AMR-WB is the number of frames of speech data encapsulated in each payload.
Encapsulating more frames per payload reduces the number of packets and the overhead from IP/UDP/RTP headers. If forward error correction (FEC) is used, the amount of FEC also needs to be regulated so that the FEC itself does not worsen the problem. Therefore, it is RECOMMENDED that applications using this payload implement congestion control. The actual mechanism for congestion control is not specified but should be suitable for real-time flows, e.g. "Equation-Based Congestion Control for Unicast Applications" [18].

5. Security Considerations

RTP packets using the payload format defined in this specification are subject to the security considerations discussed in the RTP specification [11]. This implies that confidentiality of the media streams is achieved by encryption. Because the payload format is arranged end-to-end, encryption MAY be performed after encapsulation so there is no conflict between the two operations.

This payload type does not exhibit any significant non-uniformity in the receiver side computational complexity for packet processing to cause a potential denial-of-service threat.

As this format transports encoded speech, the main security issues are decoding security (see section 2.5), confidentiality and authentication of the speech itself. The payload format itself does not have any support for security. These issues have to be solved by a payload external mechanism.

5.1. Confidentiality

To achieve confidentiality of the encoded speech all speech data bits must be encrypted. There is less need to encrypt the payload header or the table of contents as they only carry information about the requested speech mode, frame type and frame quality. This information could be useful to some third party, e.g. quality monitoring.

The type of encryption used can affect not only confidentiality but also error robustness.
There will be no robustness against bit errors unless an encryption method without error propagation is used, e.g. a stream cipher. This is only an issue when using UEP/UED, when bit errors can be accepted in some part of the payload.

5.2. Authentication

To authenticate the sender of the speech an external mechanism has to be added. It is RECOMMENDED that such a mechanism protects all the speech data bits. Note that the use of UED/UEP is difficult to combine with authentication. To prevent a man in the middle from tampering with the packetization of the speech data, some extra data SHOULD be protected. The data is: the payload header, ToC, CRCs, RTP timestamp, RTP sequence number, and the RTP marker bit. Tampering could result in erroneous depacketization/decoding that could lower speech quality. Tampering with the AMR mode request field can result in the sender of the request receiving speech in a different quality than desired.

6. Examples

6.1. Bandwidth efficient examples

6.1.1. Single frame example

The bandwidth efficient single frame per payload example is employing AMR, no valid Codec Mode Request CMR is sent (CMR=15), the payload was not damaged at IP origin (Q=1). The mode is AMR 7.4 kbps (FT=4). The speech encoded bits are put into f(0) to f(147) in descending sensitivity order according to [2].

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  CMR  |F|  FT   |Q|f(0)                                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                     f(147)|P|P|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 10: One frame per packet example.

6.1.2.
Multi frame example The bandwidth efficient multiple frame per payload example is employing AMR-WB, a Codec Mode Request CMR for the AMR-WB 8.85 kbps mode is sent (CMR=1), the payloads were not damaged at IP origin (Q=1). The mode is AMR-WB 6.6 kbps (FT=0) for the first frame, f(0) to f(131), and AMR-WB 8.85 kbps (FT=1) for the second frame, g(0) to g(176). The speech encoded bits are put into f(0) to f(131) and g(0) to g(176) in descending sensitivity order according to [4]. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | CMR |F| FT |Q|F| FT |Q|f(0) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | Sjoberg et al. [Page 17] INTERNET-DRAFT RTP Payload Format for AMR and AMR-WB March 30, 2001 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | f(131)|g(0) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | g(176)|P|P|P| +-+-+-+-+-+-+-+-+ Figure 11: Two frame per packet example. 6.2. Octet aligned operation examples In this example octet aligned operation of the payload format is used. Two AMR frames with 7.95 kbps mode (FT=5) are sent in the payload. A mode request is sent, requesting the 10.2 kbps mode for the other link(CMR=6). CRC is used. Interleaving is used with depth ILL=1 and index ILP=0. The first frame is frame 1, f1(0..158), and the second frame in the payload is is frame 3 due to interleaving, f3(0..158). 
For each payload frame a CRC is calculated: CRC1(0..7) for frame 1 and
CRC3(0..7) for frame 3. Robust payload sorting is used.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | CMR   |P|P|P|P|  ILL  |  ILP  |F|  FT   |Q|P|P|F|  FT   |Q|P|P|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     CRC1      |     CRC3      |   f1(0..7)    |   f3(0..7)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   f1(8..15)   |   f3(8..15)   |  f1(16..23)   |  f3(16..23)   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                              ...                              :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |f1(152..158) |P|f3(152..158) |P|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 12: Example with CRCs, interleaving and robust sorting.

7. MIME type registration

This chapter defines the MIME types for the Adaptive Multi-Rate (AMR)
and Adaptive Multi-Rate Wideband (AMR-WB) speech codecs, [1] and [3],
respectively. To distinguish between the two codecs, and to emphasize
that seamless switching is possible only within each of them, the MIME
types are kept separate although they are very similar. The data
format and parameters are specified for both real-time transport and
storage type applications (e.g. e-mail attachment, multimedia
messaging). The former is referred to as RTP mode and the latter as
storage mode.

Implementations according to [1] and [3] MUST support all eight coding
modes for AMR and all nine coding modes for AMR-WB. The mode change
within each codec can occur at any time during operation, and
therefore the mode information is transmitted in-band together with
the speech bits to allow mode changes without any additional
signaling.

In addition to the speech codec, the AMR and AMR-WB specifications
also include Discontinuous Transmission / comfort noise (DTX/CN)
functionality [14] and [15].
The DTX/CN functionality switches the transmission off during silent
parts of the speech, and only CN parameter updates (SID frames) are
sent at regular intervals.

7.1. RTP mode

The decoder may want to receive a certain speech mode or a subset of
modes, due to link limitations in some cellular systems, e.g. the GSM
radio link can only use a subset of at most four modes. A GSM subset
can consist of any combination of the 8 AMR modes or 9 AMR-WB modes.
Therefore, it is possible to request a specific set of speech modes in
the capability description and the encoder MUST abide by this request.
If no mode set is requested, any mode may be used or requested.

The codec can in principle perform a mode change at any time between
any two modes. To support interoperability with GSM through a gateway
it is possible to set limitations for mode changes. The decoder can
define the minimum number of frames between mode changes and can limit
mode changes to transitions into neighboring modes only.

It is also possible to limit the number of speech frames encapsulated
into one RTP packet. This is an optional feature and if no parameter
is given in the capability description, the transmitter MAY
encapsulate any number of speech frames into one RTP packet.

The payload CRC for UED MUST only be used if the receiver has signaled
support for this functionality in the capability description.

To support unequal error protection and/or detection the payload
format supports robust payload sorting. Robust payload sorting is an
OPTIONAL feature and MUST only be used if the receiver has signaled
support for this functionality in the capability description.

The speech quality in case of packet losses when transmitting several
speech frames per packet can be improved by using the OPTIONAL frame
level interleaving.
Interleaving improves perceived speech quality since it introduces a
series of single frame errors instead of several consecutive frame
errors. Interleaving MUST only be applied if the receiver has signaled
support for it, and if used, the interleaving length MUST NOT exceed
the limitation given in the capability description.

Note that the receiver can use the MIME parameters to limit the
increased buffering requirements caused by the interleaving. For
example, interleaving=I limits the maximum size of an interleave group
to I=N*(L+1) (see section 2.1 for details on interleaving).

7.2. Storage mode

The storage mode is used for storing speech frames, e.g. as a file or
e-mail attachment. The first octet of the file is the storage header;
its first bit indicates whether the file contains AMR or AMR-WB
speech. The rest of the header octet is reserved for future use.

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |C|Reserved     |
   +-+-+-+-+-+-+-+-+

Figure 13: Storage header, C=0 indicates AMR and C=1 AMR-WB.

The speech frames are stored in consecutive order in an octet aligned
manner. This implies that the first octet after the last octet of
frame n must be the first octet of frame n+1.

The first octet of each stored speech frame contains a 4-bit FT field
(see definition in section 2.2) and a Q bit. The positions of the
fields correspond to the positions of the corresponding fields of an
octet aligned table of contents entry, see figure 7. Following this
first octet come the encoded speech frame bits (see section 2.3). The
last octet of each frame is padded with zeroes, if needed, to achieve
octet alignment. An example is given in figure 14.
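As a rough illustration of the storage layout described above, the
following Python sketch packs the one-octet storage header and a
single stored frame. It is only a sketch and not part of the format
definition. It assumes that bit 0 in the figures is the most
significant bit of an octet and that the reserved header bits are
written as zero (the text only says they are reserved). The function
names pack_storage_header and pack_stored_frame are hypothetical.

```python
from typing import Iterable


def pack_storage_header(wideband: bool) -> bytes:
    """Build the one-octet storage header (C=0 for AMR, C=1 for AMR-WB)."""
    # Assumption: C is the most significant bit; reserved bits are zero.
    return bytes([0x80 if wideband else 0x00])


def pack_stored_frame(ft: int, q: int, speech_bits: Iterable[int]) -> bytes:
    """Pack one stored frame: a P|FT|Q|P|P octet followed by speech bits."""
    if not 0 <= ft <= 15:
        raise ValueError("FT must fit in 4 bits")
    if q not in (0, 1):
        raise ValueError("Q must be 0 or 1")

    # First octet: padding bit, 4-bit FT, Q bit, two padding bits (all P = 0).
    out = bytearray([(ft << 3) | (q << 2)])

    acc = 0      # bits collected for the octet being built
    count = 0    # number of bits currently held in acc
    for bit in speech_bits:
        acc = (acc << 1) | (bit & 1)
        count += 1
        if count == 8:
            out.append(acc)
            acc = 0
            count = 0

    # Pad the last octet with zero bits to reach octet alignment.
    if count:
        out.append(acc << (8 - count))
    return bytes(out)
```

For an AMR 5.9 kbit/s frame (FT=2, 118 speech bits), pack_stored_frame
returns 16 octets: the FT/Q octet followed by 15 octets of speech
bits, the last of which ends in two zero padding bits.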
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |P|  FT   |Q|P|P|                                               |
   +-+-+-+-+-+-+-+-+                                               +
   |                                                               |
   +                    Speech bits for frame n                    +
   |                                                               |
   +                                                           +-+-+
   |                                                           |P|P|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 14: An example of storage format with one AMR 5.9 kbit/s frame
(118 speech bits). Note that bits marked with P ("padding") MUST be
set to zero.

Speech frames lost in transmission and non-received frames between SID
updates during a non-speech period MUST be stored as NO_DATA frames
(frame type 15, see definition in [2] and [4]) or SPEECH_LOST (only
available for AMR-WB) to keep synchronization with the original media.

7.3. AMR MIME Registration

The MIME name for the AMR codec is allocated from the IETF tree since
AMR is expected to be a widely used speech codec in VoIP applications.
Some parts of this chapter will distinguish between RTP and storage
modes.

Media Type name:     audio

Media subtype name:  AMR

Required parameters: none

Optional parameters for RTP mode:

   octet-align: If present, octet aligned operation SHALL be used. If
      not present, bandwidth efficient operation is employed.

   mode-set: Requested AMR mode set. Restricts the active codec mode
      set to a subset of all modes. Possible values are a comma
      separated list of modes from the set 0,...,7 (see Table 1a in
      [2]; an example is given in section 7.5). If not present, all
      speech modes are available.

   mode-change-period: Defines a number N which restricts the mode
      changes in such a way that mode changes are only allowed on
      multiples of N; the initial state of the phase is arbitrary. If
      this parameter is not present, mode change can happen at any
      time.

   mode-change-neighbor: If present, mode changes SHALL only be made
      to neighboring modes in the active codec mode set. Neighboring
      modes are the ones closest in bit rate to the
      current mode, both higher and lower rate included. If not
      present, change between any two modes in the active codec mode
      set is allowed.

   maxframes: Maximum number of speech frames in one RTP packet. The
      receiver may set this parameter in order to limit the buffering
      requirements or delay.

   crc: If present, CRCs SHALL be included in the payload, otherwise
      not. Also requires the octet-align parameter to be sent.

   robust-sorting: If present, the payload SHALL employ robust payload
      sorting. If not present, simple payload sorting SHALL be used.
      Also requires the octet-align parameter to be sent.

   interleaving: Indicates that frame level interleaving SHALL be used
      and its value defines a maximum number of frames in the
      interleaving group (see section 2.1). If this parameter is not
      present, interleaving SHALL NOT be used. Also requires the
      octet-align parameter to be sent.

Optional parameters for storage mode: none

Encoding considerations for RTP mode: See chapter 2.

Encoding considerations for storage mode: See section 7.2.

Security considerations: see chapter 5 "Security Considerations".

Public specification: please refer to chapter 8 "References".

Additional information for storage mode:

   Magic number:                 none
   File extensions:              amr, AMR
   Macintosh file type code:     none
   Object identifier or OID:     none

Person & email address to contact for further information:

   johan.sjoberg@ericsson.com
   ari.lakaniemi@nokia.com

Intended usage: COMMON. It is expected that many VoIP applications (as
   well as mobile applications) will use this type.

Author/Change controller:

   johan.sjoberg@ericsson.com
   ari.lakaniemi@nokia.com

7.4. AMR-WB MIME Registration

The MIME name for the AMR-WB codec is allocated from the IETF tree
since AMR-WB is expected to be a widely used speech codec in VoIP
applications.
Some parts of this chapter will distinguish between RTP and storage
modes.

Media Type name:     audio

Media subtype name:  AMR-WB

Required parameters: none

Optional parameters for RTP mode:

   octet-align: If present, octet aligned operation SHALL be used. If
      not present, bandwidth efficient operation is employed.

   mode-set: Requested AMR-WB mode set. Restricts the active codec
      mode set to a subset of all modes. Possible values are a comma
      separated list of modes from the set 0,...,8 (see Table 1a in
      [4]). If not present, all speech modes are available.

   mode-change-period: Defines a number N which restricts the mode
      changes in such a way that mode changes are only allowed on
      multiples of N; the initial state of the phase is arbitrary. If
      this parameter is not present, mode change can happen at any
      time.

   mode-change-neighbor: If present, mode changes SHALL only be made
      to neighboring modes in the active codec mode set. Neighboring
      modes are the ones closest in bit rate to the current mode, both
      higher and lower rate included. If not present, change between
      any two modes in the active codec mode set is allowed.

   maxframes: Maximum number of speech frames in one RTP packet. The
      receiver may set this parameter in order to limit the buffering
      requirements or delay.

   crc: If present, CRCs SHALL be included in the payload, otherwise
      not. Also requires the octet-align parameter to be sent.

   robust-sorting: If present, the payload SHALL employ robust payload
      sorting. If not present, simple payload sorting SHALL be used.
      Also requires the octet-align parameter to be sent.

   interleaving: Indicates that frame level interleaving SHALL be used
      and its value defines a maximum number of frames in the
      interleaving group (see section 2.1). If this parameter is not
      present, interleaving SHALL NOT be used. Also requires the
      octet-align parameter to be sent.
Optional parameters for storage mode: none

Encoding considerations for RTP mode: See chapter 2.

Encoding considerations for storage mode: See section 7.2.

Security considerations: see chapter 5 "Security Considerations".

Public specification: please refer to chapter 8 "References".

Additional information for storage mode:

   Magic number:                 none
   File extensions:              amr, AMR
   Macintosh file type code:     none
   Object identifier or OID:     none

Person & email address to contact for further information:

   johan.sjoberg@ericsson.com
   ari.lakaniemi@nokia.com

Intended usage: COMMON. It is expected that many VoIP applications (as
   well as mobile applications) will use this type.

Author/Change controller:

   johan.sjoberg@ericsson.com
   ari.lakaniemi@nokia.com

7.5. Mapping to SDP Parameters

Please note that this chapter applies to the RTP mode.

Example of usage of AMR in SDP [16], possible GSM gateway scenario:

   m=audio 49120 RTP/AVP 97
   a=rtpmap:97 AMR/8000
   a=fmtp:97 mode-set=0,2,5,7; mode-change-period=2; mode-change-neighbor; maxframes=1

Example of usage of AMR-WB in SDP [16], possible VoIP scenario:

   m=audio 49120 RTP/AVP 98
   a=rtpmap:98 AMR-WB/16000
   a=fmtp:98 octet-align

Example of usage of AMR-WB in SDP [16], possible streaming scenario:

   m=audio 49120 RTP/AVP 99
   a=rtpmap:99 AMR-WB/16000
   a=fmtp:99 octet-align; maxframes=3; interleaving=15

8. References

[1]  3G TS 26.090, "Adaptive Multi-Rate (AMR) speech transcoding".

[2]  3G TS 26.101, "AMR Speech Codec Frame Structure".

[3]  3GPP TS 26.190, "AMR Wideband speech codec; Transcoding
     functions".

[4]  3GPP TS 26.201, "AMR Wideband speech codec; Frame Structure".

[5]  IETF RFC 2119, "Key words for use in RFCs to Indicate Requirement
     Levels".

[6]  3G TS 26.093, "AMR Speech Codec; Source Controlled Rate
     operation".
[7]  3GPP TS 26.193, "AMR Wideband Speech Codec; Source Controlled
     Rate operation".

[8]  GSM 06.60, "Enhanced Full Rate (EFR) speech transcoding".

[9]  TIA/EIA-136-Rev.A, part 410, "TDMA Cellular/PCS - Radio
     Interface, Enhanced Full Rate Voice Codec (ACELP)", formerly
     IS-641, TIA published standard, 1998.

[10] ARIB, RCR STD-27H, "Personal Digital Cellular Telecommunication
     System RCR Standard".

[11] IETF RFC 1889, "RTP: A Transport Protocol for Real-Time
     Applications".

[12] IETF draft-westberg-realtime-cellular-01.txt, "Realtime Traffic
     over Cellular Access Networks".

[13] IETF draft-larzon-udplite-04.txt, "The UDP Lite Protocol".

[14] GSM 06.92, "Comfort noise aspects for Adaptive Multi-Rate (AMR)
     speech traffic channels".

[15] 3GPP TS 26.192, "AMR Wideband speech codec; Comfort Noise
     aspects".

[16] M. Handley and V. Jacobson, "SDP: Session Description Protocol",
     RFC 2327, April 1998.

[17] 3G TS 25.415, "UTRAN Iu Interface User Plane Protocols".

[18] S. Floyd, M. Handley, J. Padhye, J. Widmer, "Equation-Based
     Congestion Control for Unicast Applications", ACM SIGCOMM 2000,
     Stockholm, Sweden.

[19] IETF draft-ietf-avt-ulp-00.txt, "An RTP Payload Format for
     Generic FEC with Uneven Level Protection".

[20] IETF RFC 2733, "An RTP Payload Format for Generic Forward Error
     Correction".

[21] 3G TS 26.102, "AMR speech codec interface to Iu and Uu".

[22] 3GPP TS 26.202, "AMR Wideband speech codec; Interface to Iu and
     Uu".

9. Authors' addresses

Johan Sjoberg                   Tel:   +46 8 50878230
Ericsson Research               EMail: Johan.Sjoberg@ericsson.com
Ericsson Radio Systems AB
Torshamnsgatan 23
SE-164 80 Stockholm, SWEDEN

Magnus Westerlund               Tel:   +46 8 4048287
Ericsson Research               EMail: Magnus.Westerlund@ericsson.com
Ericsson Radio Systems AB
Torshamnsgatan 23
SE-164 80 Stockholm, SWEDEN

Ari Lakaniemi                   Tel:   +358 40 5276440
Nokia Research Center           EMail: ari.lakaniemi@nokia.com
P.O.Box 407
FIN-00045 Nokia Group, FINLAND

Petri Koskelainen
Nokia Research Center           EMail: petri.koskelainen@nokia.com
P.O.Box 100
FIN-33721 Tampere, FINLAND

Tim Fingscheidt                 Tel:   +49 89 722 57658
Siemens AG, ICP CD              Fax:   +49 89 722 46489
Grillparzerstrasse 10-18        EMail: Tim.Fingscheidt@mch.siemens.de
D - 81675 Munich, GERMANY

Bernhard Wimmer                 Tel:   +49 89 722 23247
Siemens AG, ICP CD              Fax:   +49 89 722 46489
Grillparzerstrasse 10-18        EMail: Bernhard.Wimmer@mch.siemens.de
D - 81675 Munich, GERMANY

Qiaobing Xie                    Tel:   +1-847-632-3028
Motorola, Inc.                  EMail: qxie1@email.mot.com
1501 W. Shure Drive, #2309
Arlington Heights, IL 60004, USA

Sanjay Gupta                    Tel:   +1-847-435-0306
Motorola, Inc.                  EMail: QA4496@email.mot.com
1501 W. Shure Drive, #3205
Arlington Heights, IL 60004, USA

This Internet-Draft expires September 30, 2001.