AVT Working Group
Internet Draft                                              G. Hellstrom
                                                              Omnitor AB
Expires: February 2004                                          P. Jones
                                                     Cisco Systems, Inc.
                                                             August 2003


                   RTP Payload for Text Conversation

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This memo describes how to carry text conversation session contents
   in RTP packets.  Text conversation session contents are specified in
   ITU-T Recommendation T.140 [1].

   Text conversation is used alone or in connection with other
   conversational facilities such as video and voice, to form multimedia
   conversation services.

   This RTP payload description contains an optional possibility to
   include redundant text from already transmitted packets in order to
   reduce the risk of text loss caused by packet loss.  The redundancy
   coding follows RFC 2198.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

1. Introduction

   This document defines a payload type for carrying text conversation
   session contents in RTP packets.
   Text conversation session contents are specified in ITU-T
   Recommendation T.140 [1].  Text conversation is used alone or in
   connection with other conversational facilities such as video and
   voice, to form multimedia conversation services.

   Text in text conversation sessions is sent as soon as it is
   available, or with a small delay for buffering.

   The text is supposed to be entered by human users from a keyboard,
   handwriting recognition, voice recognition or any other input method.
   The rate of character entry is usually at a level of a few characters
   per second or less.  Therefore, the expected number of characters to
   transmit is low.  Only one or a few new characters are expected to be
   transmitted with each packet.

   T.140 specifies that text and other T.140 elements MUST be
   transmitted in ISO 10646-1 code with UTF-8 transformation.  That
   makes it easy to implement internationally useful applications, and
   to handle the text in modern information technology environments.
   The payload of an RTP packet following this specification consists of
   text encoded according to T.140 without any additional framing.  A
   common case will be a single ISO 10646 character, UTF-8 encoded.

   T.140 requires the transport channel to provide characters without
   duplication and in original order.  Text conversation users expect
   that text will be delivered with no or a low level of lost
   information.  If lost information can be indicated, the willingness
   to accept loss is expected to be higher.

   Therefore a mechanism based on RTP is specified here.  It gives text
   arrival in correct order, without duplications, and with detection
   and indication of losses.  It also includes an optional possibility
   to repeat data for redundancy to lower the risk of loss.  Since
   packet overhead is usually much larger than the T.140 contents, the
   increase in channel load by the redundancy scheme is minimal.

2. Usage of RTP

   When transport of T.140 text session data in RTP is desired, the
   payload as described in this specification SHOULD be used.

   A text conversation RTP packet as specified by this payload format
   consists of an RTP header as defined in RFC 3550 [2] followed
   immediately by a block of T.140 data, defined here to be a
   "T140block".  There is no additional header specific to this payload
   format.  The T140block contains one or more T.140 code elements as
   specified in [1].  Most T.140 code elements are single ISO 10646 [5]
   characters, but some are multiple character sequences.  Each
   character is UTF-8 encoded [6] into one or more octets.  This implies
   that each block MUST contain an integral number of UTF-8 encoded
   characters regardless of the number of octets per character.  It also
   implies that any composite character sequence (CCS) SHOULD be placed
   within one block.

   The T140blocks MAY be transmitted redundantly according to the
   payload format defined in RFC 2198 [3].  In that case, the RTP header
   is followed by one or more redundant data block headers, the same
   number of redundant data fields carrying T140blocks from previous
   packets, and finally the new (primary) T140block for this packet.

   Usually, each medium in a session utilizes a separate RTP stream.  If
   synchronization of the text and other media packets is important, the
   streams MUST be associated when the sessions are established and the
   streams MUST share the reference clock (refer to the description of
   the timestamp field as it relates to synchronization in section 5.1
   of RFC 3550).  Association of RTP streams is dependent on the
   particular session application and is outside the scope of this
   document.
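
   The requirement earlier in this section that each T140block contain
   an integral number of UTF-8 encoded characters can be illustrated
   with a minimal Python sketch.  The function name
   split_t140_blocks and the max_octets parameter are hypothetical and
   not part of this specification.  The sketch assumes valid UTF-8
   input with at most 4 octets per character, and it does not attempt
   to keep composite character sequences together.

```python
def split_t140_blocks(data: bytes, max_octets: int) -> list[bytes]:
    """Split UTF-8 text into blocks without cutting any character.

    Hypothetical helper: assumes `data` is valid UTF-8 whose characters
    occupy at most 4 octets each.
    """
    if max_octets < 4:
        raise ValueError("max_octets must be at least 4")
    blocks = []
    start = 0
    while start < len(data):
        end = min(start + max_octets, len(data))
        # A UTF-8 continuation octet has the bit pattern 10xxxxxx.
        # If the cut would land on one, back up to the lead octet.
        while end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        if end == start:
            raise ValueError("input is not valid UTF-8")
        blocks.append(data[start:end])
        start = end
    return blocks
```

   Each returned block can be decoded on its own, so a receiver never
   sees a partial character at a block boundary.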
   When interleaving text and other media, such as when a PSTN gateway
   is relaying PSTN textphone protocols and audio over an IP network to
   another PSTN gateway, text packets and other media packets MAY be
   carried within the same RTP stream and distinguished by the Payload
   Type value.  To ensure that text packet loss can be detected in such
   scenarios, the text packets MUST utilize a unique SSRC value and
   unique sequence number space.

2.1 RTP packet header

   Each RTP packet starts with a fixed RTP header.  The following fields
   of the RTP fixed header are used for T.140 text streams:

   Payload Type (PT): The assignment of an RTP payload type is specific
      to the RTP profile under which this payload format is used.  For
      profiles that use dynamic payload type number assignment, this
      payload format is identified by the name "T140" (see section 7).
      If redundancy is used per RFC 2198, the Payload Type MUST indicate
      that payload format ("RED").

   Sequence number: The Sequence Number MUST be increased by one for
      each new transmitted packet.  It is used for detection of packet
      loss and packets out of order, and can be used in the process of
      retrieval of redundant text, reordering of text and marking
      missing text.

   Timestamp: The RTP Timestamp encodes the approximate instant of
      entry of the primary text in the packet.  A clock frequency of
      1000 Hz MUST be used.  Sequential packets MUST NOT use the same
      timestamp.  Since packets do not represent any constant duration,
      the timestamp cannot be used to directly infer packet losses.

   SSRC: The Synchronization Source value used to transmit a text
      stream must differ from every other SSRC value used within an RTP
      session, even if the text media and other media are received from
      the same source (e.g., a PSTN gateway that is encoding audio and
      text packets from a single PSTN trunk).  The reason is that each
      SSRC has its own sequence number space, which is important for
      proper detection of lost text packets.

2.2 Additional Headers

   There are no additional headers defined specific to this payload
   format.

   When redundant transmission of the data according to RFC 2198 is
   desired, the RTP header is followed by one or more redundant data
   block headers, one for each redundant data block to be included.
   Each of these headers provides the timestamp offset and length of the
   corresponding data block plus a payload type number indicating this
   payload format ("T140").

   Redundant data older than 16383 divided by the clock frequency MUST
   NOT be transmitted.  The value 16383 is the largest offset that the
   14-bit timestamp offset field of RFC 2198 can carry, which
   corresponds to about 16.4 seconds at 1000 Hz.

2.3 T.140 Text Structure

   T.140 text is UTF-8 coded as specified in T.140 with no extra
   framing.  When using the format with redundant data, the transmitter
   MAY select a number of T140block generations to retransmit in each
   packet.  A higher number introduces better protection against loss of
   text but increases the data rate.

   Since packets are not generated at regular intervals, the timestamp
   is not sufficient to identify a packet in the presence of loss unless
   extra information is provided.  Since sequence numbers are not
   provided in the redundant header, some additional rules must be
   followed to allow the redundant data corresponding to missing primary
   data to be merged properly into the stream of primary data
   T140blocks:

   - Each redundant data block MUST contain the same data as a
     T140block previously transmitted as primary data, and be
     identified with a timestamp offset equating to the original
     timestamp for that T140block.

   - The redundant data MUST be placed in age order with most recent
     redundant T140block last in the redundancy area.

   - All T140blocks from the oldest desired generation up through the
     generation immediately preceding the new (primary) T140block MUST
     be included.
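
   The redundant payload layout described in sections 2.2 and 2.3 can
   be sketched in Python as follows.  This is only an illustration
   under stated assumptions: the function name build_red_payload and
   its parameters are hypothetical, the header bit layout (1-bit F
   flag, 7-bit block payload type, 14-bit timestamp offset, 10-bit
   block length) is taken from RFC 2198, and the RTP fixed header
   itself is not generated.

```python
import struct

MAX_TS_OFFSET = 0x3FFF  # 14-bit timestamp offset field (RFC 2198)
MAX_BLOCK_LEN = 0x3FF   # 10-bit block length field (RFC 2198)


def build_red_payload(primary: bytes, primary_ts: int,
                      redundant: list[tuple[int, bytes]],
                      t140_pt: int) -> bytes:
    """Build the payload that follows the RTP header of a RED packet.

    `redundant` holds (original timestamp, T140block) pairs in age
    order, oldest first, with no generation skipped.
    """
    headers = bytearray()
    body = bytearray()
    for ts, block in redundant:
        offset = (primary_ts - ts) & 0xFFFFFFFF  # RTP timestamps wrap
        if offset > MAX_TS_OFFSET:
            raise ValueError("redundant block too old to be described")
        if len(block) > MAX_BLOCK_LEN:
            raise ValueError("redundant block too long")
        # F=1 | block PT | timestamp offset | block length
        word = (1 << 31) | ((t140_pt & 0x7F) << 24) | (offset << 10) | len(block)
        headers += struct.pack("!I", word)
        body += block
    # Final one-octet header for the primary block: F=0 | block PT
    headers.append(t140_pt & 0x7F)
    return bytes(headers) + bytes(body) + primary
```

   For example, with payload type 98, a primary block "c" at timestamp
   1600 and redundant blocks "a" and "b" entered at timestamps 1000 and
   1300, the two redundant headers carry the offsets 600 and 300.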
   These rules allow the sequence numbers for the redundant T140blocks
   to be inferred by counting backwards from the sequence number in the
   RTP header.  The result will be that all the text in the payload will
   be contiguous and in order.

3. Recommended Procedure

   This section contains RECOMMENDED procedures for usage of the payload
   format.  Based on the information in the received packets, the
   receiver can:

   - reorder text received out of order.
   - mark where text is missing because of packet loss.
   - compensate for lost packets by using redundant data.

3.1 Recommended Basic Procedure

   Packets are transmitted only when there is valid T.140 data to
   transmit.  The sequence number is used for sequencing of T.140 data.

   T.140 specifies that T.140 data MAY be buffered before transmission
   for a short moment.  A maximum buffering time of 500 ms is specified.
   In order to keep the maximum bit rate usage for text at a reasonable
   level, it is RECOMMENDED to buffer T.140 data for transmission in 300
   ms intervals.  This time is selected so that text users will still
   perceive a real time text flow.

   On reception, the RTP sequence number is compared with the sequence
   number of the last correctly received packet.  If they are
   consecutive, the (only or primary) T140block is retrieved from the
   packet.

3.2 Recommended Procedure for Compensation for Lost Packets

   For reduction of data loss in case of packet loss, redundant data MAY
   be included in the packets following the procedures in RFC 2198.  If
   network conditions are not known, it is RECOMMENDED to use three
   redundant T140blocks in each packet.

   If there is a gap in the RTP sequence numbers, and redundant
   T140blocks are available in a subsequent packet, the sequence numbers
   for the redundant T140blocks should be inferred by counting backwards
   from the sequence number in the RTP header for that packet.  If there
   are redundant T140blocks with sequence numbers matching those that
   are missing, the redundant T140blocks may be substituted for the
   missing T140blocks.

   Whether or not redundancy is used, missing data SHOULD be marked by
   insertion of a missing text marker in the received stream for each
   missing T140block, as specified in ITU-T T.140 Addendum 1 [1].

3.3 Recommended Procedure for Compensation for Packets Out of Order

   For protection against packets arriving out of order, the following
   procedure MAY be implemented in the receiver.  If analysis of a
   received packet reveals a gap in the sequence and no redundant data
   is available to fill that gap, the received packet SHOULD be kept in
   a buffer to allow time for the missing packet(s) to arrive.  It is
   RECOMMENDED that the waiting time be limited to 0.5 seconds.

   If a packet with a T140block belonging to the gap arrives before the
   waiting time expires, this T140block is inserted into the gap and
   then consecutive T140blocks from the leading edge of the gap may be
   consumed.  Any T140block which does not arrive before the time limit
   expires should be treated as lost.

3.4 Transmission During "Silent Periods" when Redundancy is Used

   When using the redundancy transmission scheme, and there is redundant
   data, but no new T.140 data to transmit after the transmit buffering
   interval described in section 3.1 has passed, a packet MUST be
   transmitted containing a zero-length primary T140block and the
   properly positioned redundant data.

   Any zero-length T140blocks that are sent as primary data MUST be
   included as redundant T140blocks on subsequent packets, just as
   normal text T140blocks would be, so that sequence number inference
   for the redundant T140blocks will be correct, as explained in section
   2.3.
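
   The receiver-side inference and substitution described in sections
   2.3, 3.2 and 3.4 can be sketched in Python as follows.  The function
   names are hypothetical, and the use of U+FFFD as the missing text
   marker is an assumption made for this illustration; the normative
   marker is the one defined in T.140 Addendum 1 [1].  The sketch also
   assumes that the packet is newer than the last delivered one, so
   reordering (section 3.3) is not handled.

```python
MISSING_TEXT_MARKER = "\ufffd"  # assumed marker, see T.140 Addendum 1


def infer_sequence_numbers(primary_seq: int, redundant_count: int) -> list[int]:
    """Sequence numbers of the redundant blocks, oldest first.

    Counts backwards from the RTP header sequence number, modulo 2**16.
    """
    return [(primary_seq - k) & 0xFFFF for k in range(redundant_count, 0, -1)]


def recover_text(last_seq: int, packet_seq: int,
                 redundant: list[bytes], primary: bytes) -> str:
    """Return the text to deliver for one received packet.

    `last_seq` is the sequence number of the last delivered packet and
    `redundant` holds the redundant T140blocks in age order.
    """
    gap = (packet_seq - last_seq) & 0xFFFF  # 1 means nothing was lost
    seqs = infer_sequence_numbers(packet_seq, len(redundant))
    by_seq = dict(zip(seqs, redundant))
    out = []
    for k in range(1, gap):
        block = by_seq.get((last_seq + k) & 0xFFFF)
        if block is None:
            out.append(MISSING_TEXT_MARKER)  # loss that cannot be repaired
        else:
            out.append(block.decode("utf-8"))  # zero-length blocks add nothing
    out.append(primary.decode("utf-8"))
    return "".join(out)
```

   Because zero-length primary blocks are carried forward as redundant
   blocks, counting backwards stays aligned during silent periods: an
   empty block simply fills its sequence number without adding text.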
   Redundancy for the last T140block SHOULD NOT be implemented by
   repeatedly transmitting the same packet (with the same sequence
   number) because this will cause the packet loss count, as reported in
   RTCP, to decrement.

4. SDP Attribute for Flow Control

   In some cases, it is necessary to limit the rate at which characters
   are transmitted.  While the "b=" SDP attribute could be used to limit
   the rate of the RTP session, it may be that only the text stream in
   an interleaved audio/text session needs special handling.  For
   example, when a PSTN gateway is interworking between an IP device
   (not necessarily a textphone) and a PSTN textphone, it may be
   necessary to limit the character rate from the IP device in order to
   avoid throwing away characters at the PSTN gateway.  At the same
   time, no explicit bit rate restriction is necessarily applied to the
   audio stream.

   To provide for flow control, the "gpmd" attribute [7] is used with
   the following syntax:

      a=gpmd: cps=

   The field following "a=gpmd:" is populated with the payload type
   that is used for text.  The field following "cps=" contains an
   integer representing the maximum number of characters per second
   that may be received.  Devices in receipt of this parameter MUST
   adhere to the request to apply flow control to the text communication
   by transmitting characters at a rate at or below the specified value.

5. Examples

5.1 RTP Packetization Examples

   This is an example of a T140 RTP packet without redundancy.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X| CC=0  |M|   T140 PT   |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      timestamp (1000Hz)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   +                      T.140 encoded data                       +
   |                                                               |
   +                                               +---------------+
   |                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   This is an example of an RTP packet with one redundant T140block.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X| CC=0  |M|  "RED" PT   |   sequence number of primary  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               timestamp of primary encoding "P"               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|   T140 PT   |  timestamp offset of "R"  | "R" block length  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|   T140 PT   |                                               |
   +-+-+-+-+-+-+-+-+                                               +
   |                                                               |
   +               "R" T.140 encoded redundant data                +
   |                                                               |
   +                                               +---------------+
   |                                               |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |                "P" T.140 encoded primary data                 |
   +                                                               +
   +                                               +---------------+
   |                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

5.2 SDP Examples

   Below is an example of SDP describing RTP text transport on port
   11000:

      m=text 11000 RTP/AVP 98
      a=rtpmap:98 t140/1000

   Below is an example of SDP similar to the above example, but also
   utilizing RFC 2198 to provide redundancy for the text packets:

      m=text 11000 RTP/AVP 98 100
      a=rtpmap:98 t140/1000
      a=rtpmap:100 red/1000
      a=fmtp:100 98/98

   Below is an example of SDP describing RTP text interleaved with
   G.711 audio packets within the same RTP session from port 7200 and at
   a maximum text rate of 6 characters per second:

      m=audio 7200 RTP/AVP 0 98
      a=rtpmap:98 t140/1000
      a=gpmd:98 cps=6

   Below is an example using RFC 2198 to provide redundancy to just the
   text packets in an RTP session with interleaving text and G.711 at a
   text rate no faster than 6 characters per second:

      m=audio 7200 RTP/AVP 0 98 100
      a=rtpmap:98 t140/1000
      a=gpmd:98 cps=6
      a=rtpmap:100 red/1000
      a=fmtp:100 98/98

6. Security Considerations

   Since the intention of the described payload format is to carry text
   in a text conversation, security measures in the form of encryption
   are of importance.  The amount of data in a text conversation session
   is low and therefore any encryption method MAY be selected and
   applied to T.140 session contents or to the whole RTP packets.  When
   redundant data is included, the same security considerations as for
   RFC 2198 apply.

7. MIME Media Type Registrations

   This document defines an RTP payload named "t140" and two associated
   MIME types, "text/t140" and "audio/t140".  Additionally, the MIME
   type "text/RED" is defined to allow RFC 2198 to be used to carry
   redundant text payloads.

7.1 Registration of MIME Media Type text/t140

   MIME media type name: text

   MIME subtype name: t140

   Required parameters:
      rate: The RTP timestamp clock rate, which is equal to the
      sampling rate.  The only valid value is 1000.

   Optional parameters: None

   Encoding considerations: T.140 text can be transmitted with RTP as
      specified in RFC .

   Security considerations: None

   Interoperability considerations: None

   Published specification: ITU-T T.140 Recommendation.
      RFC .

   Applications which use this media type:
      Text communication terminals and text conferencing tools.

   Additional information: None

   Magic number(s): None

   File extension(s): None

   Macintosh File Type Code(s): None

   Person & email address to contact for further information:
      Gunnar Hellstrom
      E-mail: gunnar.hellstrom@omnitor.se

   Intended usage: COMMON

   Author / Change controller:
      Gunnar Hellstrom             | IETF avt WG
      gunnar.hellstrom@omnitor.se  | c/o Steve Casner casner@cisco.com

7.2 Registration of MIME Media Type audio/t140

   MIME media type name: audio

   MIME subtype name: t140

   Required parameters:
      rate: The RTP timestamp clock rate, which is equal to the
      sampling rate.  The only valid value is 1000.

   Optional parameters: None

   Encoding considerations: T.140 text can be transmitted with RTP as
      specified in RFC .

   Security considerations: None

   Interoperability considerations: None

   Published specification: ITU-T T.140 Recommendation.
      RFC .

   Applications which use this media type:
      Text communication systems and text conferencing tools that
      transmit text associated with audio and within the same RTP
      session as the audio, such as PSTN gateways that transmit audio
      and text signals between two PSTN textphone users over an IP
      network.

   Additional information: None

   Magic number(s): None

   File extension(s): None

   Macintosh File Type Code(s): None

   Person & email address to contact for further information:
      Paul E. Jones
      E-mail: paulej@packetizer.com

   Intended usage: COMMON

   Author / Change controller:
      Paul E. Jones                | IETF avt WG
      paulej@packetizer.com        | c/o Steve Casner casner@cisco.com

7.3 Registration of MIME Media Type text/RED

   MIME media type name: text

   MIME subtype name: RED

   Required parameters:
      pt: a comma-separated list of RTP payload types.  Because comma is
      a special character, the list must be a quoted-string (enclosed in
      double quotes).  For static payload types, each list element is
      simply the type number.
      For dynamic payload types, each list element is a mapping of the
      dynamic payload type number to an embedded MIME content-type
      specification for the payload format corresponding to the dynamic
      payload type.  The format of the mapping is:

         dynamic-payload-type "=" content-type

      If the content-type string includes a comma, then the content-type
      string MUST be a quoted-string.  If the content-type string does
      not include a comma, it MAY still be quoted.  Since it is part of
      the list which must itself be a quoted-string, that means the
      quotation marks MUST be quoted with backslash quoting as specified
      in RFC 2045.  If the content-type string itself contains a
      quoted-string, then the requirement for backslash quoting is
      recursively applied.  To specify the text/RED payload format in
      SDP, the pt parameter is mapped to an a=fmtp attribute by
      eliminating the parameter name (pt) and changing the commas to
      slashes.  For example, 'pt="101,102"' maps to 'a=fmtp:99 101/102'.

   Optional parameters: ptime, maxptime

   Encoding considerations: This type is only defined for transfer via
      RTP [2].

   Security considerations: None

   Interoperability considerations: none

   Published specification: RFC 2198

   Applications which use this media type:
      Text streaming and conferencing tools.

   Additional information: none

   Person & email address to contact for further information:
      Paul E. Jones
      E-mail: paulej@packetizer.com

   Intended usage: COMMON

   Author / Change controller:
      Paul E. Jones                | IETF avt WG
      paulej@packetizer.com        | c/o Steve Casner casner@cisco.com

8. Authors' Addresses

   Gunnar Hellstrom
   Omnitor AB
   Alsnogatan 7, 4 tr
   SE-116 41 Stockholm
   Sweden
   Phone: +46 708 204 288 / +46 8 556 002 03
   Fax:   +46 8 556 002 06
   E-mail: gunnar.hellstrom@omnitor.se

   Paul E. Jones
   Cisco Systems, Inc.
   7025 Kit Creek Rd.
   Research Triangle Park, NC 27709
   Phone: +1 919 392 6948
   E-mail: paulej@packetizer.com

9. Acknowledgements

   The authors want to thank Stephen Casner and Colin Perkins for
   valuable support with reviews and advice on creation of this
   document, Mickey Nasiri at Ericsson Mobile Communication for
   providing the development environment, and Michele Mizarro for
   verification of the usability of the payload format for its intended
   purpose.

10. References

   [1] ITU-T Recommendation T.140 (1998) - Text conversation protocol
       for multimedia application, with amendment 1, (2000).

   [2] Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP:
       A Transport Protocol for Real-Time Applications", RFC 3550, July
       2003.

   [3] Perkins, C., Kouvelas, I., Hardman, V., Handley, M. and J. Bolot,
       "RTP Payload for Redundant Audio Data", RFC 2198, September 1997.

   [4] Bradner, S., "Key words for use in RFCs to Indicate Requirement
       Levels", BCP 14, RFC 2119, March 1997.

   [5] ISO/IEC 10646-1: (1993), Universal Multiple Octet Coded Character
       Set.

   [6] Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC
       2279, January 1998.

   [7] Kumar, R., Andreasen, F., "SDP attribute for Qualifying Media
       Formats with Generic Parameters",
       draft-rajeshkumar-mmusic-gpmd-03.txt, Work In Progress.