Internet Engineering Task Force                                B. Foster
Internet Draft                                                  R. Kumar
Document:                                                   F. Andreasen
Category: Informational                                    Cisco Systems
Expires: September 1, 2002                                 March 1, 2002


                     Voice-Band Data Media Format

Status of this Document

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

1. Abstract

   Voice-band data (fax and modem) traffic often requires different
   processing from voice, so the ability to specify a distinct payload
   type when passing this type of traffic is important. This document
   defines a MIME type, audio/vbd, for voiceband data media, and a
   specific "fmtp" parameter for specifying the underlying encoding.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119.

3. Introduction

   There are a number of ways of passing modem and fax traffic over an
   IP network. One approach is to simply pass it in-band. Other
   approaches involve terminating the fax/modem at each end and
   relaying the data in some fashion. Either approach may be valid
   depending on the processing capability of the gateway, the
   characteristics of the network, etc.

   This document is specifically concerned with the approach of passing
   modem and fax traffic in-band. Because voice-band data has
   distinctly different characteristics from voice, it is often
   important to be able to signal this difference by indicating an
   associated media format. This allows the receiver of the media to
   process the packets differently.

4. Rationale for distinct Voiceband Data payload types

   The rationale for distinguishing between a payload type associated
   with voice and a payload type associated with voiceband data is
   twofold:

   * At the receiver, voiceband data traffic is found to work best with
     fixed-size jitter buffers, while adaptive jitter buffers are
     optimal for voice.

   * Packet loss concealment algorithms at the receiver are suitable
     for voice, but not for voiceband data.

   To discriminate between voice and voiceband data, and to allow
   different processing at a receiver, separate payload types must be
   used even if the underlying encoding is the same (e.g., PCMU for
   both voice and voiceband data).

   To this end, a new RTP audio encoding name, to be registered as the
   MIME type audio/vbd, is defined. For a session, this encoding name
   could be dynamically mapped into one or more payload types; this is
   true for any encoding. Each payload type associated with the
   encoding "vbd" can have a separate format, specified through an
   "fmtp" attribute, indicating a different underlying base encoding
   (e.g., PCMU, PCMA, G726-32, G726-40).

   This document proposes the use of dynamic payload types for
   voiceband data that are distinct from the payload types, static or
   dynamic, for voice even if the underlying encoding algorithms are
   the same. This is to enable different, voiceband data-specific
   receiver processing. For a given encoding algorithm, a receiver may
   include both the voice and the voiceband data payload types in the
   media (m=) line in SDP. If it intends to support the encoding
   algorithm for voiceband data but not for voice, it should not
   include the applicable voice payload type in the "m=" line.

5. Proposed representation in SDP

   The encoding name, "vbd", may be dynamically associated with one or
   more RTP payload types. Using the "fmtp" SDP attribute, each "vbd"
   payload type is associated with an underlying encoding. Thus,

      a=rtpmap:<dynamic payload type> vbd/<clock rate>
      a=fmtp:<dynamic payload type> <payload type of underlying encoding>

   indicates a dynamic payload type to be associated with the codec
   "vbd". The fmtp attribute indicates the underlying audio encoding
   associated with the "vbd" codec. The audio encoding used by the
   "vbd" codec may be represented by either a static or dynamic payload
   type. Note that it is possible to specify multiple "vbd" payload
   types, each with a different "fmtp" value and, therefore, a
   different audio encoding.

   An example media description in SDP might be:

      m=audio 3456 RTP/AVP 15 98 99
      a=rtpmap:98 vbd/8000
      a=fmtp:98 0
      a=rtpmap:99 vbd/8000
      a=fmtp:99 8

   This specifies dynamic RTP payload types 98 and 99 as being "vbd"
   codecs. Further, it specifies that the vbd codec associated with
   payload type 98 uses an underlying PCMU codec format (indicated by
   the static payload type 0). It also specifies that payload type 99
   has an underlying format of PCMA (indicated by the static payload
   type 8). Note that the payload types 0 (PCMU) and 8 (PCMA) do not
   appear in the media line in this case. The only permitted voice
   encoding is G728 (payload type 15).

   The audio encoding underlying the voiceband data might also be
   represented by a dynamic payload type, as in the following segment:

      m=audio 3456 RTP/AVP 15 98
      a=rtpmap:96 G726-40/8000
      a=rtpmap:98 vbd/8000
      a=fmtp:98 96

   Again, the dynamic payload type of 96 does not appear in the media
   line in this case. However, it is used to bind G726-40 as the
   underlying encoding algorithm for the payload type of 98, used in
   voiceband data packets.
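
   The binding described above can be illustrated with a short Python
   sketch that reads a media description and resolves each "vbd"
   payload type to the name of its underlying encoding. This is only a
   minimal illustration, not part of the proposal: the function name
   resolve_vbd, the constant names, and the simplified line-oriented
   parsing are assumptions made for this example, and the static
   payload table lists only the three types mentioned in this document.

```python
# Static payload types named in this document (a subset of the RTP/AVP table).
STATIC_PAYLOAD_TYPES = {0: "PCMU", 8: "PCMA", 15: "G728"}


def resolve_vbd(sdp_text):
    """Map each "vbd" payload type to the name of its underlying encoding.

    Hypothetical helper for illustration; it handles only the a=rtpmap
    and a=fmtp lines used in the examples of this document.
    """
    rtpmap = {}  # payload type -> encoding name, from a=rtpmap lines
    fmtp = {}    # payload type -> raw format parameters, from a=fmtp lines
    for line in sdp_text.splitlines():
        line = line.strip()
        if line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            rtpmap[int(pt)] = encoding.split("/")[0]
        elif line.startswith("a=fmtp:"):
            pt, params = line[len("a=fmtp:"):].split(None, 1)
            fmtp[int(pt)] = params.strip()

    resolved = {}
    for pt, name in rtpmap.items():
        if name.lower() != "vbd" or pt not in fmtp:
            continue
        base_pt = int(fmtp[pt])
        # The underlying encoding is either a dynamic payload type bound
        # by its own a=rtpmap line or a static payload type.
        resolved[pt] = rtpmap.get(base_pt) or STATIC_PAYLOAD_TYPES.get(
            base_pt, "unknown"
        )
    return resolved


EXAMPLE_STATIC = """\
m=audio 3456 RTP/AVP 15 98 99
a=rtpmap:98 vbd/8000
a=fmtp:98 0
a=rtpmap:99 vbd/8000
a=fmtp:99 8
"""

EXAMPLE_DYNAMIC = """\
m=audio 3456 RTP/AVP 15 98
a=rtpmap:96 G726-40/8000
a=rtpmap:98 vbd/8000
a=fmtp:98 96
"""

print(resolve_vbd(EXAMPLE_STATIC))   # {98: 'PCMU', 99: 'PCMA'}
print(resolve_vbd(EXAMPLE_DYNAMIC))  # {98: 'G726-40'}
```

   Looking up the dynamic bindings before the static table mirrors the
   two examples: payload type 96 is resolved through its own rtpmap
   line even though it is absent from the media line, while 0 and 8
   fall back to their static assignments.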

   When both voice and voiceband data payload types are distinctly
   earmarked for a session at session establishment, a transmitter may
   switch from a voice payload type (15 in the example above) to a
   voiceband data payload type (98 in the example above) when it
   detects an appropriate event such as an ANS or ANSAM as defined in
   V.25 [1] and V.8 [2] respectively. When the receiving gateway or
   endpoint sees a voiceband data payload type (98 in the example
   above), it recognizes this as a voiceband data codec (with G726-40
   encoding) and adjusts the jitter buffer accordingly.

   The packet format defined in RFC 2198 can be used with a voiceband
   data codec for greater reliability by virtue of redundant
   transmission. A dynamic payload type is defined for the encoding
   name "red". The encapsulated voiceband data packets are, in this
   case, staggered in time (earlier and later packets combined in an
   RFC 2198 composite packet). In the following example media
   description:

      m=audio 3456 RTP/AVP 15 98 100
      a=rtpmap:98 vbd/8000
      a=fmtp:98 0
      a=rtpmap:100 red/8000
      a=fmtp:100 98/98

   a dynamic payload type of 100 is associated with RFC 2198 packets.
   The "fmtp" line for payload type 100 indicates that these RFC 2198
   packets encapsulate two voiceband data payloads, each with payload
   type 98.

   A "vbd" payload type is negotiated like any other codec type. For
   symmetric connections that can be transitioned to a specific
   voiceband data payload type, both ends must declare support for that
   payload type. For backward compatibility, if this codec type ("vbd")
   is not bound to a connection, then suitable voice payload types may
   be used for voiceband data.

6. Other Characteristics of Voiceband Data Sessions

   This section is informational and is intended to elaborate on other
   differences between voice and voiceband data traffic.

   * Silence suppression can be used with voice, but not with voiceband
     data, which requires a continuous carrier signal.

   * Since voiceband data has a much lower distortion tolerance, it
     requires an audio encoding algorithm in which DC removal filters
     are absent. Examples of suitable schemes are PCM (ITU G.711) and
     32 kbps/40 kbps ADPCM (ITU G.726). By contrast, many more encoding
     algorithms are available for voice traffic. Note: this document
     does not intend to list all encoding algorithms suitable for
     voiceband data.

7. Proposed Registration of MIME media type audio/vbd

   MIME media type name: audio

   MIME subtype name: vbd

   Required parameters:

      rate: The RTP timestamp clock rate, which is equal to the
      sampling rate. The typical rate is 8000, but other rates may be
      specified.

      baseAlgorithm: The encoding scheme used, such as PCMU, PCMA,
      G726-32, G726-40, etc.

      No MIME parameters are inherited.

   Optional parameters: channels, ptime, maxptime (see [7]).

   Encoding considerations: This type is only defined for transfer via
   RTP.

   Security considerations: See Section 5 of [7].

   Interoperability considerations: none

   Published specification: The RFC that will evolve out of this
   document.

   Applications which use this media type: Audio and video streaming
   and conferencing tools.

   Intended usage: Modulated facsimile and modem signals that benefit
   from special handling, e.g., jitter buffer adjustment at a receiver.

   Additional information:

      1. Magic number(s): N/A
      2. File extension(s): N/A
      3. Macintosh file type code: N/A

   Author/Change controller:

      Bill Foster, Rajesh Kumar and Flemming Andreasen
      Cisco Systems
      170 W. Tasman Drive
      San Jose, CA 95134-1706
      bfoster@cisco.com, rkumar@cisco.com, fandreas@cisco.com

8. References

   [1] ITU-T, V.25 Specification.

   [2] ITU-T, V.8 Specification.

   [3] M. Handley, V. Jacobson, SDP: Session Description Protocol,
       RFC 2327.

   [4] H. Schulzrinne, RTP Profile for Audio and Video Conferences with
       Minimal Control, RFC 1890.

   [5] http://www.iana.org/assignments/rtp-parameters.

   [6] C. Perkins et al, RTP payload for redundant audio data,
       RFC 2198.

   [7] S. Casner, P. Hoschka, the RFC that will come out of
       draft-ietf-avt-rtp-mime-06.txt.

9. Authors' Addresses

   Flemming Andreasen
   Cisco Systems
   499 Thornall Street, 8th Floor
   Edison, NJ 08837
   Phone: +1 732 452 1667
   Email: fandreas@cisco.com

   Bill Foster
   Cisco Systems
   Phone: +1 250 758-9418
   Email: bfoster@cisco.com

   Rajesh Kumar
   Cisco Systems
   170 West Tasman Dr
   San Jose, CA
   Phone: +1 408 527 0811
   Email: rkumar@cisco.com

10. Full Copyright Statement

   Copyright (C) The Internet Society (2001). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.