INTERNET-DRAFT                                        Katsushi Kobayashi
draft-kobayashi-dv-audio12-00.txt      Communication Research Laboratory
                                                          Akimichi Ogawa
                                                         Keio University
                                                          Stephen Casner
                                                           Cisco Systems
                                                         Carsten Bormann
                                                 Universitaet Bremen TZI
                                                           June 25, 1999
                                                   Expires December 1999


          RTP Payload Format for nonlinear 12 bits Audio on DV


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

1. Abstract

   This document specifies the packetization scheme for encapsulating
   the 12-bit nonlinear audio data streams used in "DV" video into a
   payload of the Real-Time Transport Protocol (RTP).

2. Introduction

   This document describes the 12-bit nonlinear audio used in the DV
   format and specifies its encapsulation in the Real-time Transport
   Protocol (RTP), version 2 [2].  It specifies only the parts that
   differ from the 16-bit linear audio format, L16 [3,4].  Readers
   should consult the L16 documents together with this one.

Kobayashi, et al          Expires December 1999                 [Page 1]

Internet Draft                                             June 25, 1999

3. The need for RTP encapsulation of 12 bits nonlinear DV audio

   The HD Digital VCR Conference has published a digital video
   specification set entitled "Specification of Consumer-Use Digital
   VCRs using 6.3mm magnetic tape" [1].
   The digital video format defined by that specification is commonly
   known as the "DV" format.

   The original DV format treats all of the data, audio and video, as a
   single bundled stream.  RTP, on the other hand, recommends that
   different media be carried in separate RTP streams, even when both
   streams come from the same source.  The RTP encapsulation of a DV
   stream therefore also recommends that audio and video be carried in
   separate RTP streams, each using its corresponding RTP payload
   format.

   In the DV standard, audio is encoded as PCM, and three encoding
   formats are defined: 16-bit linear, 20-bit linear, and 12-bit
   nonlinear.  (20-bit linear has not been used yet.)  The previously
   published RTP encapsulation format for audio supports 16-bit linear
   audio only [3,4].

   The format of 12-bit nonlinear DV audio is identical to that of
   16-bit linear audio except for the format of a single sample.  Each
   12-bit nonlinear sample is derived from a single 16-bit linear
   sample.  It would not be difficult to convert 12-bit nonlinear audio
   to 16-bit linear at the sender and send it as the previously defined
   L16 audio.  However, 16-bit samples take 33% more data than 12-bit
   samples (16/12 = 4/3), and the additional bits carry no information,
   so network bandwidth would be wasted.

4. 12 bits nonlinear audio format in DV (DV12)

   Each 12-bit nonlinear DV audio sample is derived from a single
   sample of the 16-bit linear audio format.  The details of the
   conversion between 16 and 12 bits are shown in the Table.

   The DV specification defines three sampling frequencies: 32 kHz,
   44.1 kHz, and 48 kHz.  All of these values are among the sampling
   rates listed in the L16 documents.  The other parameters, the
   encapsulation format, and the MIME description are discussed in the
   L16 documents.

   When 12-bit samples are packed into the payload, the most
   significant bit MUST be encoded first.  Sample code for packing
   12-bit DV audio into an RTP payload is shown in Appendix A.
   A 12-bit sample does not align with the 8-bit byte boundaries of the
   RTP payload.  When the payload contains an odd number of samples,
   the four least significant bits of the last byte are unused.

   16 bits linear (X)                            12 bits nonlinear (Y)
   --------------------------------------------------------------
    32,767 (7FFFh)   Y = INT(X/64) + (600h)          2,047 (7FFh)
    16,384 (4000h)                                   1,792 (700h)
   --------------------------------------------------------------
    16,383 (3FFFh)   Y = INT(X/32) + (500h)          1,791 (6FFh)
     8,192 (2000h)                                   1,536 (600h)
   --------------------------------------------------------------
     8,191 (1FFFh)   Y = INT(X/16) + (400h)          1,535 (5FFh)
     4,096 (1000h)                                   1,280 (500h)
   --------------------------------------------------------------
     4,095 (0FFFh)   Y = INT(X/8) + (300h)           1,279 (4FFh)
     2,048 (0800h)                                   1,024 (400h)
   --------------------------------------------------------------
     2,047 (07FFh)   Y = INT(X/4) + (200h)           1,023 (3FFh)
     1,024 (0400h)                                     768 (300h)
   --------------------------------------------------------------
     1,023 (03FFh)   Y = INT(X/2) + (100h)             767 (2FFh)
       512 (0200h)                                     512 (200h)
   --------------------------------------------------------------
       511 (01FFh)   Y = X                             511 (1FFh)
         0 (0000h)                                       0 (000h)
   --------------------------------------------------------------
        -1 (FFFFh)   Y = X                              -1 (FFFh)
      -512 (FE00h)                                    -512 (E00h)
   --------------------------------------------------------------
      -513 (FDFFh)   Y = INT((X + 1)/2) - (101h)      -513 (DFFh)
    -1,024 (FC00h)                                    -768 (D00h)
   --------------------------------------------------------------
    -1,025 (FBFFh)   Y = INT((X + 1)/4) - (201h)      -769 (CFFh)
    -2,048 (F800h)                                  -1,024 (C00h)
   --------------------------------------------------------------
    -2,049 (F7FFh)   Y = INT((X + 1)/8) - (301h)    -1,025 (BFFh)
    -4,096 (F000h)                                  -1,280 (B00h)
   --------------------------------------------------------------
    -4,097 (EFFFh)   Y = INT((X + 1)/16) - (401h)   -1,281 (AFFh)
    -8,192 (E000h)                                  -1,536 (A00h)
   --------------------------------------------------------------
    -8,193 (DFFFh)   Y = INT((X + 1)/32) - (501h)   -1,537 (9FFh)
   -16,384 (C000h)                                  -1,792 (900h)
   --------------------------------------------------------------
   -16,385 (BFFFh)   Y = INT((X + 1)/64) - (601h)   -1,793 (8FFh)
   -32,768 (8000h)                                  -2,048 (800h)
   --------------------------------------------------------------

           Table. Conversion from 16 bits to 12 bits [1]

   In the Table, INT() truncates toward zero.

6. Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the security considerations discussed in the RTP
   specification [2], and any appropriate RTP profile.  This implies
   that confidentiality of the media streams is achieved by encryption.
   Because the data compression used with this payload format is
   applied end-to-end, encryption may be performed after compression so
   there is no conflict between the two operations.

   A potential denial-of-service threat exists for data encodings using
   compression techniques that have non-uniform receiver-end
   computational load.  The attacker can inject pathological datagrams
   into the stream which are complex to decode and cause the receiver
   to be overloaded.  However, this encoding does not exhibit any
   significant non-uniformity.

   As with any IP-based protocol, in some circumstances a receiver may
   be overloaded simply by the receipt of too many packets, either
   desired or undesired.  Network-layer authentication may be used to
   discard packets from undesired sources, but the processing cost of
   the authentication itself may be too high.  In a multicast
   environment, pruning of specific sources may be implemented in
   future versions of IGMP [5] and in multicast routing protocols to
   allow a receiver to select which sources are allowed to reach it.

7. Full Copyright Statement

   Copyright (C) The Internet Society (1999).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works.

   However, this document itself may not be modified in any way, such
   as by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the
   procedures for copyrights defined in the Internet Standards process
   must be followed, or as required to translate it into languages
   other than English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

8. Authors' Addresses

   Katsushi Kobayashi
   Communication Research Laboratory
   4-2-1 Nukii-kita machi, Koganei
   Tokyo 184-8795
   JAPAN
   EMail: ikob@koganei.wide.ad.jp

   Akimichi Ogawa
   Keio University
   5322 Endo, Fujisawa
   Kanagawa 252
   JAPAN
   EMail: akimichi@sfc.wide.ad.jp

   Stephen L. Casner
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, CA 95134-1706
   United States
   EMail: casner@cisco.com

   Carsten Bormann
   Universitaet Bremen FB3 TZI
   Postfach 330440
   D-28334 Bremen, GERMANY
   Phone: +49.421.218-7024
   Fax:   +49.421.218-7000
   EMail: cabo@tzi.org

10. Bibliography

   [1] IEC 61834, "Helical-scan digital video cassette recording system
       using 6,35 mm magnetic tape for consumer use (525-60, 625-50,
       1125-60 and 1250-50 systems)".

   [2] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
       "RTP: A Transport Protocol for Real-Time Applications",
       RFC 1889, January 1996.

   [3] Schulzrinne, H., "RTP Profile for Audio and Video Conferences
       with Minimal Control", RFC 1890, January 1996.

   [4] Salsman, J., "The Audio/L16 MIME content type", RFC 2586,
       May 1999.

   [5] Deering, S., "Host Extensions for IP Multicasting", STD 5,
       RFC 1112, August 1989.

Appendix A. Sample code for packing and unpacking

   /*
    * Pack n 12-bit samples from s[] into b1[], most significant bit
    * first.  Only the low 12 bits of each element of s[] are used.
    * Returns the number of bytes written, (3 * n + 1) / 2.
    */
   int pack12(const short s[], unsigned char b1[], int n)
   {
       unsigned char *b = b1;

       while (n >= 2) {
           int s1 = *s++ & 0xFFF;
           int s2 = *s++ & 0xFFF;

           n -= 2;
           *b++ = (unsigned char)(s1 >> 4);
           *b++ = (unsigned char)(((s1 & 0xF) << 4) | (s2 >> 8));
           *b++ = (unsigned char)(s2 & 0xFF);
       }
       if (n == 1) {
           int s1 = *s & 0xFFF;

           *b++ = (unsigned char)(s1 >> 4);
           /* the four LSBs of the last byte are unused (zero) */
           *b++ = (unsigned char)((s1 & 0xF) << 4);
       }
       return (int)(b - b1);
   }

   /*
    * Unpack n payload bytes from b[] into s1[] as sign-extended
    * 12-bit samples.  Returns the number of samples written, or -1
    * if a single dangling byte remains (alignment error).
    */
   int unpack12(const unsigned char b[], short s1[], int n)
   {
       short *s = s1;
       int v;

       while (n >= 3) {
           n -= 3;
           v = (b[0] << 4) | (b[1] >> 4);
           *s++ = (short)((v ^ 0x800) - 0x800);
           v = ((b[1] & 0xF) << 8) | b[2];
           *s++ = (short)((v ^ 0x800) - 0x800);
           b += 3;
       }
       if (n == 2) {
           v = (b[0] << 4) | (b[1] >> 4);
           *s++ = (short)((v ^ 0x800) - 0x800);
       } else if (n == 1) {
           return -1;      /* alignment error */
       }
       return (int)(s - s1);
   }
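
   The conversion given in the Table of Section 4 can also be expressed
   in code.  The following is a minimal, non-normative sketch in C.
   The function names dv12_encode() and dv12_decode() are hypothetical
   and do not come from the DV specification.  The Table defines only
   the 16-bit to 12-bit direction; the decoder shown here is an
   assumption that returns the value of smallest magnitude in each
   quantization step, and a real decoder might choose a different
   reconstruction point.  The sketch also assumes that int is wider
   than 16 bits.

```c
/*
 * Hypothetical helper: convert a 16-bit linear sample X to a 12-bit
 * nonlinear code Y as in the Table.  Out-of-range input is clamped.
 * Returns a value in [-2048, 2047].
 */
int dv12_encode(int x)
{
    int k;

    if (x > 32767)
        x = 32767;
    if (x < -32768)
        x = -32768;

    if (x >= -512 && x <= 511)
        return x;                               /* Y = X */

    if (x > 0) {
        /* find the band 256*2^k <= X < 512*2^k, k = 1..6 */
        for (k = 1; x >= (512 << k); k++)
            ;
        return x / (1 << k) + (k << 8);         /* INT(X/2^k) + k*100h */
    }

    /* negative bands: -512*2^k <= X < -256*2^k, k = 1..6 */
    for (k = 1; x < -(512 << k); k++)
        ;
    /* C99 integer division truncates toward zero, like INT() */
    return (x + 1) / (1 << k) - ((k << 8) + 1);
}

/*
 * Hypothetical helper (assumption, not defined by the Table): map a
 * 12-bit code Y back to the smallest-magnitude 16-bit value X for
 * which dv12_encode(X) == Y.
 */
int dv12_decode(int y)
{
    int neg = (y < 0);
    int p = neg ? -1 - y : y;       /* fold negative codes onto 0..2047 */
    int seg = p >> 8;               /* segment number, 0..7 */
    int x;

    if (seg < 2) {
        x = p;                      /* linear region, Y = X */
    } else {
        int k = seg - 1;            /* shift used by the encoder */
        x = (p - (k << 8)) << k;
    }
    return neg ? -1 - x : x;
}
```

   Because several 16-bit values map to the same 12-bit code, decoding
   does not recover the original sample exactly; re-encoding a decoded
   value does, however, yield the same 12-bit code.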