Network Working Group                                       G. Hellstrom
Internet Draft                                                Omnitor AB
                                                                P. Jones
Expires: October 2004                                Cisco Systems, Inc.
                                                              April 2004


                   RTP Payload for Text Conversation

Status of this Memo

This document is an Internet-Draft and is subject to all provisions of Section 10 of RFC 2026.

By submitting this Internet-Draft, we certify that any applicable patent or other IPR claims of which we are aware have been disclosed, and any of which we become aware will be disclosed, in accordance with RFC 3668 (BCP 79).

By submitting this Internet-Draft, we accept the provisions of Section 3 of RFC 3667 (BCP 78).

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or cite them other than as "work in progress".

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

This document is a submission of the IETF AVT WG. Comments should be directed to the AVT WG mailing list, avt@ietf.org.

Abstract

This memo describes how to carry real time text conversation session contents in RTP packets. Text conversation session contents are specified in ITU-T Recommendation T.140.

Two payload formats are described: one for transmitting text on a separate RTP session dedicated to the transmission of text, and one for transmitting audio and text data within a single RTP session.
This RTP payload description recommends a method to include redundant text from already transmitted packets in order to reduce the risk of text loss caused by packet loss.

Table of Contents

1. Introduction
2. Conventions used in this document
3. Usage of RTP
   3.1 Payload Format for Transmission of text/t140 Data
   3.2 Payload Format for Transmission of audio/t140 Data
   3.3 The "T140block"
   3.4 Synchronization of Text with Other Media
   3.5 Synchronization considerations for the audio/t140 format
   3.6 RTP packet header
4. Protection against loss of data
   4.1 Payload Format when using Redundancy
   4.2 Using redundancy with the text/t140 format
   4.3 Using redundancy with the audio/t140 format
5. Recommended Procedure
   5.1 Recommended Basic Procedure
   5.2 Detection of Lost Text Packets
   5.3 Compensation for Packets Out of Order
   5.4 Transmission During "Silent Periods" with Redundancy
6. Parameter for Character Transmission Rate
7. Examples
   7.1 RTP Packetization Examples for the text/t140 format
   7.2 RTP Packetization Examples for the audio/t140 format
   7.3 SDP Examples
8. Security Considerations
   8.1 Confidentiality
   8.2 Integrity
   8.3 Source authentication
9. Congestion Considerations
10. IANA considerations
   10.1 Registration of MIME Media Type text/t140
   10.2 Registration of MIME Media Type audio/t140
   10.3 SDP mapping of MIME parameters
   10.4 Offer/Answer Consideration
11. Authors' Addresses
12. Acknowledgements
13. Normative References
14. Informative References
15. Intellectual Property Statement
16. Copyright Statement

[Notes to RFC Editor:

1. All references to RFC XXXX are to be replaced by references to the RFC number of this memo, when published.

2. All references to RFC YYYY [9] are to be replaced by references to the document that registers the text/red MIME type.]

1. Introduction

This document defines two payload formats for carrying text conversation session contents in RTP [2] packets. Text conversation session contents are specified in ITU-T Recommendation T.140 [1]. Text conversation is used alone or in connection with other conversational facilities such as video and voice, to form multimedia conversation services. Text in multimedia conversation sessions is sent character-by-character as soon as it is available, or with a small delay for buffering.

The text is expected to be entered by human users from a keyboard, handwriting recognition, voice recognition or any other input method. The rate of character entry is usually at a level of a few characters per second or less. In general, only one or a few new characters are expected to be transmitted with each packet.
Small blocks of text may be prepared by the user and pasted into the user interface for transmission during the conversation, occasionally causing packets to carry more payload.

T.140 specifies that text and other T.140 elements must be transmitted in ISO 10646-1 [5] code with UTF-8 [6] transformation. That makes it easy to implement internationally useful applications and to handle the text in modern information technology environments. The payload of an RTP packet following this specification consists of text encoded according to T.140 without any additional framing. A common case will be a single ISO 10646 character, UTF-8 encoded.

T.140 requires the transport channel to provide characters without duplication and in original order. Text conversation users expect that text will be delivered with no or a low level of lost information. If lost information can be indicated, the willingness to accept loss is expected to be higher.

Therefore, a mechanism based on RTP is specified here. It gives text arrival in correct order, without duplication, and with detection and indication of loss. It also includes an optional possibility to repeat data for redundancy to lower the risk of loss. Since packet overhead is usually much larger than the T.140 contents, the increase in bandwidth with the use of redundancy is minimal.

By using RTP for text transmission in a multimedia conversation application, uniform handling of text and other media can be achieved in, as examples, conferencing systems, firewalls, and network translation devices. This, in turn, eases the design and increases the possibility for prompt and proper media delivery.

This document obsoletes RFC 2793 [15].
It clarifies ambiguities in RFC 2793, refines specific implementation requirements based on development experience, gives explicit usage examples, and introduces a method of transporting text interleaved with voice within the same RTP session.

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [4].

3. Usage of RTP

Two payload formats for real-time text transmission with RTP [2] are described in this memo, one for general text conversation use and another for use between PSTN gateways.

3.1 Payload Format for Transmission of text/t140 Data

The text/t140 format is primarily used when text is transmitted on a separate RTP session dedicated to the transmission of text and not shared with other media, such as audio or DTMF. IP textphone devices, IP multimedia conversation devices and network elements involved in communication with such devices most commonly use this format.

A text/t140 conversation RTP payload format consists of one and only one block of T.140 data, referred to as a "T140block" (see section 3.3). There are no additional headers specific to this payload format. The fields in the RTP header are set as defined in section 3.6.

3.2 Payload Format for Transmission of audio/t140 Data

The primary purpose of the audio/t140 payload specification is to allow gateways that are interconnecting two PSTN networks to interleave, through a single RTP session, audio and text data received on the PSTN circuit. This is comparable to the way in which DTMF is extracted and transmitted within an RTP session [14].
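As an informal illustration of the text/t140 payload described in section 3.1, the following Python sketch places a UTF-8 encoded T140block directly after the 12-octet fixed RTP header, with no additional framing. It is not part of the specification. The function name and the payload type value 98 are assumptions made for the example (98 is the dynamic value also used in the SDP examples of section 7.3), and the sequence number, timestamp and SSRC are supplied by the caller.

```python
import struct


def build_text_t140_packet(text, seq, timestamp, ssrc, payload_type=98):
    """Build one text/t140 RTP packet: fixed RTP header + one T140block.

    payload_type is a hypothetical dynamic value; the real value is
    agreed out of band (for example through SDP).
    """
    if not 0 <= payload_type <= 127:
        raise ValueError("payload type must fit in 7 bits")

    # The T140block is the UTF-8 encoding of the T.140 data, with no
    # extra framing or payload-specific header.
    t140block = text.encode("utf-8")

    first_octet = 2 << 6                 # V=2, P=0, X=0, CC=0
    second_octet = payload_type & 0x7F   # M=0, followed by the 7-bit PT

    header = struct.pack(
        "!BBHII",                        # network byte order, 12 octets
        first_octet,
        second_octet,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    return header + t140block
```

Because the payload is simply the encoded text, the packet length is always 12 octets plus the number of UTF-8 octets in the T140block.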
Note that the audio/t140 format does not allow simultaneous audio and text transmission, because the expectation is that at each moment, only one payload type is selected for play-out.

An audio/t140 conversation RTP payload format consists of a 16-bit "T140block counter" (with the most significant bit transmitted first) followed by one and only one "T140block" (see section 3.3). There are no additional headers specific to this payload format. The fields in the RTP header are set as defined in section 3.6.

The T140block counter MUST be initialized to zero the first time that a packet containing a T140block is transmitted and MUST be incremented by 1 each time that a new block is transmitted. Once the counter reaches the value 0xFFFF, the counter is reset to 0 the next time the counter is incremented. This T140block counter is used to detect lost blocks and to avoid duplication of blocks.

For the purposes of readability, the remainder of this document only refers to the T140block without making explicit reference to the T140block counter. Readers should understand that when using the audio/t140 format, the T140block counter MUST always precede the actual T140block, including redundant data transmissions.

3.3 The "T140block"

T.140 text is UTF-8 coded as specified in T.140 with no extra framing. The T140block contains one or more T.140 code elements as specified in [1]. Most T.140 code elements are single ISO 10646 [5] characters, but some are multiple character sequences. Each character is UTF-8 encoded [6] into one or more octets. Each block MUST contain an integral number of UTF-8 encoded characters regardless of the number of octets per character. Any composite character sequence (CCS) SHOULD be placed within one block.

3.4 Synchronization of Text with Other Media

Usually, each medium in a session utilizes a separate RTP stream.
As such, if synchronization of the text and other media packets is important, the streams MUST be associated when the sessions are established and the streams MUST share the same reference clock (refer to the description of the timestamp field as it relates to synchronization in section 5.1 of RFC 3550). Association of RTP streams is dependent on the particular application and is outside the scope of this document.

3.5 Synchronization considerations for the audio/t140 format

When audio/t140 is used, it is generally transmitted as interleaved packets between voice packets or other kinds of audio packets with the intention to create one common audio signal in the receiving equipment to be used for alternating between text and voice. The audio/t140 payload is then used to play out audio signals according to a PSTN textphone coding method (usually a modem).

One should observe the RTP timestamps of the voice, text, or other audio packets in order to reproduce the stream correctly when playing out the audio. Note, also, that incoming text from a PSTN circuit might be at a higher bit-rate than can be played out on an egress PSTN circuit. As such, it is possible that, on the egress side, a gateway may not complete the play out of the text packets before it is time to play the next voice packet.

Given that this application is primarily for the benefit of users of PSTN textphone devices, it is strongly RECOMMENDED that all received text packets be properly reproduced on the egress gateway before considering any other subsequent audio packets. If necessary, voice and other audio packets should be discarded in order to properly reproduce the text signals on the PSTN circuit, even if the text packets arrive late.
PSTN textphone users commonly use turn-taking indicators in the text stream, so it can be expected that as long as text is transmitted, it is valid text and should be given priority over voice.

3.6 RTP packet header

Each RTP packet starts with a fixed RTP header. The following fields of the RTP fixed header are specified for T.140 text streams:

Payload Type (PT): The assignment of an RTP payload type is specific to the RTP profile under which this payload format is used. For profiles that use dynamic payload type number assignment, this payload format can be identified by the MIME types "text/t140" and "audio/t140" (see section 10). If redundancy is used per RFC 2198, another payload type number needs to be provided for the redundancy format. MIME types for identifying RFC 2198 are available in RFC 3555 and RFC YYYY [9].

Sequence number: The definition of sequence numbers is available in RFC 3550 [2]. When transmitting text using the payload format for text/t140, it is used for detection of packet loss and packets out of order, and can be used in the process of retrieval of redundant text, reordering of text and marking missing text. Character loss is detected through the T140block counter when using the audio/t140 payload format.

Timestamp: The RTP Timestamp encodes the approximate instant of entry of the primary text in the packet. A clock frequency of 1000 Hz MUST be used for text/t140. For audio/t140, the clock frequency MAY be set to any value, and SHOULD be set to the same value as for any audio packets in the same RTP stream in order to avoid RTP timestamp rate switching. The value SHOULD be set by out of band mechanisms. Sequential packets MUST NOT use the same timestamp. Since packets do not represent any constant duration, the timestamp cannot be used to directly infer packet loss.
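The timestamp rules above for text/t140 can be sketched as follows. This is only an illustration, not a normative procedure: the class name, the method name and the choice to advance a colliding timestamp by one tick are assumptions of the example. The sketch assumes that the sender knows the entry time of the primary text in whole milliseconds, so that one millisecond corresponds to one tick of the 1000 Hz clock.

```python
class TextTimestampClock:
    """Hypothetical helper producing 32-bit RTP timestamps at 1000 Hz."""

    MASK = 0xFFFFFFFF

    def __init__(self, initial_offset):
        # RTP timestamps start from an arbitrary offset.
        self.offset = initial_offset & self.MASK
        self.last = None

    def stamp(self, entry_time_ms):
        """Return the timestamp for text entered at entry_time_ms."""
        ts = (self.offset + entry_time_ms) & self.MASK
        if self.last is not None:
            # Distance from the previous timestamp, modulo 2**32.
            ahead = (ts - self.last) & self.MASK
            if ahead == 0 or ahead >= 0x80000000:
                # Sequential packets must not share a timestamp, so
                # move one tick past the previous value (an assumption
                # of this sketch, not a rule of the specification).
                ts = (self.last + 1) & self.MASK
        self.last = ts
        return ts
```

With an initial offset of 0, three packets whose text was entered in the same millisecond receive the timestamps 0, 1 and 2, and a packet entered 300 ms later receives 300. The modular comparison keeps the behaviour correct when the 32-bit value wraps around.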
M-bit: The M-bit MUST be included, but has no defined meaning for t140 text streams and MUST be set to 0.

4. Protection against loss of data

For reduction of data loss in case of packet loss, redundant data SHOULD be included in the packets following the procedures in RFC 2198 [3]. This method MUST be used, transmitting the original text and two redundant generations, if the application or the end-to-end network conditions do not call for other methods or other levels of redundancy to be used.

As an alternative (or in addition) to redundancy, Forward Error Correction mechanisms MAY be used when transmitting text, as per RFC 2733 [8] or any other mechanism with the purpose of increasing the reliability of text transmission. There are also other mechanisms for increasing robustness of transmission that MAY be applied.

4.1 Payload Format when using Redundancy

When redundancy according to RFC 2198 [3] is used, the RTP header is followed by one or more redundant data block headers, the same number of redundant data fields carrying T140blocks from previous packets, and finally the new (primary) T140block for this packet. The exact payload format is slightly different for the text/t140 format and for the audio/t140 format.

4.2 Using redundancy with the text/t140 format

When redundant transmission of the data according to RFC 2198 is desired, the RTP header is followed by one or more redundant data block headers, one for each redundant data block to be included. Each of these headers provides the timestamp offset and length of the corresponding data block plus a payload type number indicating this payload format ("text/t140").

Redundant data older than 16383 divided by the clock frequency MUST NOT be transmitted, because 16383 is the largest value that the 14-bit timestamp offset field of RFC 2198 can carry (about 16.4 seconds at the 1000 Hz clock of text/t140).

When using the format with redundant data, the transmitter may select a number of T140block generations to retransmit in each packet.
A higher number introduces better protection against loss of text but marginally increases the data rate.

Since text is transmitted only when there is text to transmit, the timestamp is not sufficient to identify a packet in the case of loss. Extra information must be provided. Since sequence numbers are not provided in the redundant header, some additional rules must be followed to allow the redundant data corresponding to missing primary data to be merged properly into the stream of primary data T140blocks when using the text/t140 payload format. They are:

- Each redundant data block MUST contain the same data as a T140block previously transmitted as primary data, and be identified with a timestamp offset equating to the original timestamp for that T140block.

- The redundant data MUST be placed in age order with most recent redundant T140block last in the redundancy area.

- All T140blocks from the oldest desired generation up through the generation immediately preceding the new (primary) T140block MUST be included.

For the text/t140 payload format, these rules allow the sequence numbers for the redundant T140blocks to be inferred by counting backwards from the sequence number in the RTP header. The result will be that all the text in the payload will be contiguous and in order.

If there is a gap in the RTP sequence numbers for text/t140, and redundant T140blocks are available in a subsequent packet, the sequence numbers for the redundant T140blocks should be inferred by counting backwards from the sequence number in the RTP header for that packet. If there are redundant T140blocks with sequence numbers matching those that are missing, the redundant T140blocks may be substituted for the missing T140blocks.

4.3 Using redundancy with the audio/t140 format

When redundant transmission of the data according to RFC 2198 is used, the RTP header is followed by one or more redundant data block headers, one for each redundant data block to be included.
Each of these headers provides the timestamp offset and length of the corresponding data block plus a payload type number indicating this payload format ("audio/t140").

When using the format with redundant data, the transmitter may select a number of T140block generations to retransmit in each packet. A higher number introduces better protection against loss of text but marginally increases the data rate.

The timestamp is not sufficient to identify a packet in the case of loss. Extra information must be provided. Since sequence numbers are not provided in the redundant header and since the sequence number space is shared by all audio payload types within an RTP session, a sequence number in the form of a T140block counter is added to the T140block for transmission. This allows the redundant data corresponding to missing primary data to be merged properly into the stream of primary data T140blocks when using the audio/t140 payload format.

Each redundant data block MUST contain the same data as a T140block previously transmitted as primary data, and be identified with a T140block counter equating to the original T140block counter for that T140block.

For the audio/t140 payload format, this rule allows the T140block counters for the redundant T140blocks to be retrieved. The T140block counter preceding the text in each T140block enables the receiver to order the blocks.

If there is a gap in the T140block counter value of received audio/t140 packets, and if there are redundant T140blocks with T140block counters matching those that are missing, the redundant T140blocks may be substituted for the missing T140blocks.

5. Recommended Procedure

This section contains RECOMMENDED procedures for usage of the payload format. Based on the information in the received packets, the receiver can:

- reorder text received out of order.
- mark where text is missing because of packet loss.
- compensate for lost packets by using redundant data.

5.1 Recommended Basic Procedure

Packets are transmitted only when there is valid T.140 data to transmit. The sequence number is used for sequencing of T.140 data.

T.140 specifies that T.140 data MAY be buffered before transmission with a maximum buffering time of 500 ms. In order to keep the maximum bit rate usage for text at a reasonable level, it is RECOMMENDED to buffer T.140 data for transmission in 300 ms intervals. This time is selected so that text users will still perceive a real time text flow.

On reception of text/t140 data, the RTP sequence number is compared with the sequence number of the last received packet. On reception of audio/t140 data, the T140block counter is compared with the T140block counter of the last received text packet.

5.2 Detection of Lost Text Packets

Packet loss for text/t140 packets MAY be detected by observing gaps in the sequence numbers of RTP packets received by the receiver. With audio/t140, however, packets following a text packet might be audio packets of a format other than audio/t140, so the same rule does not apply. Rather, receivers detect the loss of an audio/t140 packet by observing the value of the T140block counter in a subsequent audio/t140 packet.

With both text/t140 and audio/t140, the loss of the last packet of a sequence of packets cannot be detected until the next text packet is received.

Missing data SHOULD be marked by insertion of a missing text marker in the received stream for each missing T140block, as specified in ITU-T T.140 Addendum 1 [1].

5.3 Compensation for Packets Out of Order

For protection against packets arriving out of order, the following procedure MAY be implemented in the receiver.
If analysis of a received packet reveals a gap in the sequence and no redundant data is available to fill that gap, the received packet SHOULD be kept in a buffer to allow time for the missing packet(s) to arrive. It is RECOMMENDED that the waiting time be limited to 0.5 seconds.

If a packet with a T140block belonging to the gap arrives before the waiting time expires, this T140block is inserted into the gap and then consecutive T140blocks from the leading edge of the gap may be consumed. Any T140block which does not arrive before the time limit expires should be treated as lost and a missing text marker inserted (see section 5.2).

5.4 Transmission During "Silent Periods" with Redundancy

When using the redundancy transmission scheme, and there is redundant data, but no new T.140 data to transmit after the transmit buffering interval described in section 5.1 has passed, a packet MUST be transmitted containing a zero-length primary T140block and the properly positioned redundant data. When using the audio/t140 payload format with an empty T140block, the T140block counter MUST also be absent (as there is no actual T140block).

When using the text/t140 payload format, any zero-length T140blocks that are sent as primary data MUST be included as redundant T140blocks on subsequent packets just as normal text T140blocks would be so that sequence number inference for the redundant T140blocks will be correct, as explained in section 4.2.

When using the audio/t140 payload format, empty T140blocks sent as primary data SHOULD NOT be included as redundant T140blocks, as it would simply be a waste of bandwidth to send them.

6. Parameter for Character Transmission Rate

In some cases, it is necessary to limit the rate at which characters are transmitted.
For example, when a PSTN gateway is interworking between an IP device and a PSTN textphone, it may be necessary to limit the character rate from the IP device in order to avoid throwing away characters in case of buffer overflow at the PSTN gateway.

To control the character transmission rate, the MIME parameter "cps" in the "fmtp" attribute [7] is defined (see section 10). It is used in SDP with the following syntax:

   a=fmtp:<format> cps=<integer>

The <format> field is populated with the payload type that is used for text. The <integer> field contains an integer representing the maximum number of characters that may be received per second. The value shall be used as a mean value over any 10-second interval. The default value is 30.

Examples of use in SDP are found in section 7.3.

On receipt of this parameter, devices MUST adhere to the request by transmitting characters at a rate at or below the specified value. Note that this parameter was not defined in RFC 2793 [15]. Therefore implementations of the text/t140 format may be in use that do not recognize and act according to this parameter. Receivers of text/t140 SHALL therefore be designed so that they can handle temporary reception of characters at a higher rate than this parameter specifies, so that malfunction because of buffer overflow is avoided for text conversation with human input.

7. Examples

7.1 RTP Packetization Examples for the text/t140 format

Below is an example of a text/t140 RTP packet without redundancy.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X| CC=0  |M|   T140 PT   |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      timestamp (1000Hz)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   +                      T.140 encoded data                       +
   |                                                               |
   +                                               +---------------+
   |                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Below is an example of a text/t140 RTP packet with one redundant T140block.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X| CC=0  |M|  "RED" PT   |   sequence number of primary  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               timestamp of primary encoding "P"               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|   T140 PT   |  timestamp offset of "R"  | "R" block length  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|   T140 PT   |                                               |
   +-+-+-+-+-+-+-+-+                                               +
   |                                                               |
   +               "R" T.140 encoded redundant data                +
   |                                                               |
   +                                               +---------------+
   |                                               |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |                "P" T.140 encoded primary data                 |
   +                                               +---------------+
   |                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Below is an example of an RTP packet with one redundant T140block using text/t140 payload format. The primary data block is empty, which is the case when transmitting a packet for the sole purpose of forcing the redundant data to be transmitted in the absence of any new data.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X| CC=0  |M|  "RED" PT   |   sequence number of primary  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               timestamp of primary encoding "P"               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|   T140 PT   |  timestamp offset of "R"  | "R" block length  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|   T140 PT   |                                               |
   +-+-+-+-+-+-+-+-+                                               +
   |                                                               |
   +               "R" T.140 encoded redundant data                +
   |                                                               |
   +                                               +---------------+
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

As a follow-on to the previous example, the example below shows the next RTP packet in the sequence which does contain a real T140block when using the text/t140 payload format. Note that the empty block is present in the redundant transmissions of the text/t140 payload format. This example shows 2 levels of redundancy and one primary data block. The value of the "R2 block length" would be set to zero in order to represent the empty T140block.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X| CC=0  |M|  "RED" PT   |   sequence number of primary  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               timestamp of primary encoding "P"               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|   T140 PT   | timestamp offset of "R1"  | "R1" block length |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|   T140 PT   | timestamp offset of "R2"  | "R2" block length |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|   T140 PT   |                                               |
   +-+-+-+-+-+-+-+-+                                               +
   |                                                               |
   +               "R1" T.140 encoded redundant data               +
   |                                                               |
   +                                               +---------------+
   |                                               |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |                "P" T.140 encoded primary data                 |
   +                                                               +
   +                                               +---------------+
   |                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

7.2 RTP Packetization Examples for the audio/t140 format

Below is an example of an audio/t140 RTP packet without redundancy.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X| CC=0  |M|   T140 PT   |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      timestamp (8000Hz)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       T140block counter       |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   +                      T.140 encoded data                       +
   |                                                               |
   +                                               +---------------+
   |                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Below is an example of an RTP packet with one redundant T140block using audio/t140 payload format. The primary data block is empty, which is the case when transmitting a packet for the sole purpose of forcing the redundant data to be transmitted in the absence of any new data. Note that since this is the audio/t140 payload format, the redundant block of T.140 data is immediately preceded by a T140block counter.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X| CC=0  |M|  "RED" PT   |   sequence number of primary  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               timestamp of primary encoding "P"               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|   T140 PT   |  timestamp offset of "R"  | "R" block length  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|   T140 PT   |     "R" T140block counter     |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |                                                               |
   +               "R" T.140 encoded redundant data                +
   |                                                               |
   +                                               +---------------+
   |                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

As a follow-on to the previous example, the example below shows the next RTP packet in the sequence which does contain a new real T140block when using the audio/t140 payload format. This example has 2 levels of redundancy and one primary data block. Since the previous primary block was empty, no redundant data is included for that block. This is because when using the audio/t140 payload format, any previously transmitted "empty" T140blocks are NOT included as redundant data in subsequent packets.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X| CC=0  |M|  "RED" PT   |   sequence number of primary  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               timestamp of primary encoding "P"               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|   T140 PT   | timestamp offset of "R1"  | "R1" block length |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|   T140 PT   |    "R1" T140block counter     |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |                                                               |
   +               "R1" T.140 encoded redundant data               +
   |                                                               |
   +                                               +---------------+
   |                                               | "P" T140block |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    counter    |        "P" T.140 encoded primary data         |
   +-+-+-+-+-+-+-+-+                                               +
   |                                                               |
   +                                               +---------------+
   |                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

7.3 SDP Examples

   Below is an example of SDP describing RTP text transport on port
   11000:

      m=text 11000 RTP/AVP 98
      a=rtpmap:98 t140/1000

   Below is an example of SDP similar to the above example, but also
   utilizing RFC 2198 to provide the recommended two levels of
   redundancy for the text packets:

      m=text 11000 RTP/AVP 98 100
      a=rtpmap:98 t140/1000
      a=rtpmap:100 red/1000
      a=fmtp:100 98/98/98

   Below is an example of SDP describing RTP text interleaved with
   G.711 audio packets within the same RTP session from port 7200 and
   at a maximum text rate of 6 characters per second:

      m=audio 7200 RTP/AVP 0 98
      a=rtpmap:98 t140/8000
      a=fmtp:98 cps=6

   Below is an example using RFC 2198 to provide the recommended two
   levels of redundancy to the text packets in an RTP session with
   interleaving text and
   G.711 at a text rate no faster than 20 characters per second:

      m=audio 7200 RTP/AVP 0 98 100
      a=rtpmap:98 t140/8000
      a=fmtp:98 cps=20
      a=rtpmap:100 red/8000
      a=fmtp:100 98/98/98

   Note - While these examples utilize the RTP/AVP profile, this is not
   intended to limit the scope of this memo to use with only that
   profile. Rather, any appropriate profile may be used in conjunction
   with this memo.

8. Security Considerations

   All of the security considerations from section 14 of RFC 3550 [2]
   apply.

8.1 Confidentiality

   Since the intention of the described payload format is to carry text
   in a text conversation, security measures in the form of encryption
   are of importance. The amount of data in a text conversation session
   is low and therefore any encryption method MAY be selected and
   applied to T.140 session contents or to the whole RTP packets. SRTP
   [13] provides a suitable method for ensuring confidentiality.

8.2 Integrity

   It may be desirable to protect the text contents of an RTP stream
   against manipulation. SRTP [13] provides methods for providing
   integrity that MAY be applied.

8.3 Source authentication

   Measures to make sure that the source of text is the intended one
   can be accomplished by a combination of methods.

   Text streams are usually used in a multimedia control environment.
   Security measures for authentication are available and SHOULD be
   applied in the registration and session establishment procedures, so
   that the identity of the sender of the text stream is reliably
   associated with the person or device setting up the session. Once
   established, SRTP [13] mechanisms MAY be applied to ascertain that
   the source remains the same during the session.

9. Congestion Considerations

   The congestion considerations from section 10 of RFC 3550 [2],
   section 6 of RFC 2198 [3] and the section about congestion in
   chapter 2 of RFC 3551 [11] apply with the following application
   specific considerations.

   Automated systems MUST NOT use this format to send large amounts of
   text at a rate significantly above that which a human user could
   enter.

   Even if the network load from users of text conversation is usually
   very low, for best-effort networks an application MUST monitor the
   packet loss rate and take appropriate actions to reduce its sending
   rate if this application sends at a higher rate than what TCP would
   achieve over the same path. The reason is that this application, due
   to its recommended usage of two or more redundancy levels, is very
   robust against packet loss. At the same time, due to the low
   bit-rate of text conversations, if one considers the discussion in
   RFC 3714 [12], this application will experience very high packet
   loss rates before it needs to perform any reduction in the sending
   rate.

   If the application needs to reduce its sending rate, it SHOULD NOT
   reduce the number of redundancy levels below the default amount
   specified in section 4. Instead the following actions are
   RECOMMENDED in order of priority:

   - Increase the shortest time between transmissions described in
     section 5.1 from the recommended 300 ms to 500 ms, which is the
     highest value allowable according to T.140.

   - Limit the maximum rate of characters transmitted.

   - Increase the shortest time between transmissions to a higher
     value, not higher than 5 seconds. This will cause unpleasant
     delays in transmission, beyond what is allowed according to T.140,
     but text will still be conveyed in the session with some
     usability.

   - Exclude participants from the session.

   Please note that if the reduction in bit-rate achieved through the
   above measures is not sufficient, the only remaining action is to
   terminate the session.
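
   As an illustration only, and not part of the payload format, the
   order of actions listed above could be sketched in Python as shown
   here. The names TextSendSettings and next_settings are hypothetical,
   the reduced character rate of 10 is an assumed value, and the jump
   straight to 5000 ms is a simplification: the memo only bounds that
   step at 5 seconds. The number of redundancy levels is deliberately
   never lowered.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class TextSendSettings:
    """Hypothetical sender settings; field names are assumptions."""
    buffering_ms: int = 300      # shortest time between transmissions
    max_cps: int = 30            # cap on characters sent per second
    redundancy_levels: int = 2   # never reduced by this sketch


def next_settings(s: TextSendSettings,
                  reduced_cps: int = 10) -> Optional[TextSendSettings]:
    """Return the next, more conservative settings in priority order.

    Returns None when no sender-side setting is left to change; the
    remaining actions (excluding participants, ending the session)
    are session-level decisions outside this sketch.
    """
    # Step 1: raise the buffering time to the T.140 maximum of 500 ms.
    if s.buffering_ms < 500:
        return replace(s, buffering_ms=500)
    # Step 2: limit the maximum character rate (assumed target value).
    if s.max_cps > reduced_cps:
        return replace(s, max_cps=reduced_cps)
    # Step 3: go beyond T.140, but never above 5 seconds.
    if s.buffering_ms < 5000:
        return replace(s, buffering_ms=5000)
    return None


if __name__ == "__main__":
    settings: Optional[TextSendSettings] = TextSendSettings()
    while settings is not None:
        print(settings)
        settings = next_settings(settings)
```

   Each call is meant to be made only after the application has
   observed that the previous step did not reduce the load enough;
   how that is measured is left to the implementation.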

   As guidance, some load figures are provided here.

   - Experience shows that a common mean character transmission rate
     during a complete PSTN text conversation session in reality is
     around 2 characters per second.

   - A maximum performance of 20 characters per second is enough even
     for voice to text applications.

   - With the (unusually high) load of 20 characters per second, in a
     language that makes use of three-octet UTF-8 characters, no header
     compression, two redundant levels and 300 ms between
     transmissions, the maximum load of this application is 3300
     bits/s.

   - When the restrictions mentioned above are applied, limiting
     transmission to 10 characters per second, using 5 s between
     transmissions, the maximum load of this application in a language
     that uses one octet per UTF-8 character is 300 bits/s.

   Note also that this payload can be used in a congested situation as
   a last resort to maintain some contact when audio and video media
   need to be stopped. The availability of one low bit-rate stream for
   text in such adverse situations may be crucial for maintaining some
   communication in a critical situation.

10. IANA considerations

   This document defines an RTP payload named "t140" and two associated
   MIME types, "text/t140" and "audio/t140", to be registered by IANA.

10.1 Registration of MIME Media Type text/t140

   MIME media type name: text

   MIME subtype name: t140

   Required parameters:
      rate: The RTP timestamp clock rate, which is equal to the
      sampling rate. The only valid value is 1000.

   Optional parameters:
      cps: The maximum number of characters that may be received per
      second. The default value is 30.

   Encoding considerations: T.140 text can be transmitted with RTP as
      specified in RFC XXXX.

   Security considerations: See section 8 of RFC XXXX.

   Interoperability considerations: This format is the same as
      specified in RFC2793. For RFC2793 the "cps=" parameter was not
      defined. Therefore there may be implementations that do not
      consider this parameter. Receivers need to take that into
      account.

   Published specification: ITU-T T.140 Recommendation. RFC XXXX.

   Applications which use this media type: Text communication terminals
      and text conferencing tools.

   Additional information: This type is only defined for transfer via
      RTP.

   Magic number(s): None
   File extension(s): None
   Macintosh File Type Code(s): None

   Person & email address to contact for further information:
      Gunnar Hellstrom
      E-mail: gunnar.hellstrom@omnitor.se

   Intended usage: COMMON

   Author / Change controller:
      Gunnar Hellstrom             | IETF avt WG
      gunnar.hellstrom@omnitor.se  |

10.2 Registration of MIME Media Type audio/t140

   MIME media type name: audio

   MIME subtype name: t140

   Required parameters:
      rate: The RTP timestamp clock rate, which is equal to the
      sampling rate. This parameter SHOULD have the same value as for
      any audio codec packets interleaved in the same RTP stream.

   Optional parameters:
      cps: The maximum number of characters that may be received per
      second. The default value is 30.

   Encoding considerations: T.140 text can be transmitted with RTP as
      specified in RFC XXXX.

   Security considerations: See section 8 of RFC XXXX.

   Interoperability considerations: None

   Published specification: ITU-T T.140 Recommendation. RFC XXXX.

   Applications which use this media type: Text communication systems
      and text conferencing tools that transmit text associated with
      audio and within the same RTP session as the audio, such as PSTN
      gateways that transmit audio and text signals between two PSTN
      textphone users over an IP network.

   Additional information: This type is only defined for transfer via
      RTP.

   Magic number(s): None
   File extension(s): None
   Macintosh File Type Code(s): None

   Person & email address to contact for further information:
      Paul E. Jones
      E-mail: paulej@packetizer.com

   Intended usage: COMMON

   Author / Change controller:
      Paul E. Jones                | IETF avt WG
      paulej@packetizer.com        |

10.3 SDP mapping of MIME parameters

   The information carried in the MIME media type specification has a
   specific mapping to fields in the Session Description Protocol (SDP)
   [7], which is commonly used to describe RTP sessions. When SDP is
   used to specify sessions employing the text/t140 or audio/t140
   format, the mapping is as follows:

   - The MIME type ("text" or "audio") goes in SDP "m=" as the media
     name.

   - The MIME subtype (payload format name) goes in SDP "a=rtpmap" as
     the encoding name. The RTP clock rate in "a=rtpmap" MUST be 1000
     for text/t140. For audio/t140, the clock rate MAY be set to any
     value, and SHOULD be set to the same value as for any audio
     packets in the same RTP stream.

   - The parameter "cps" goes in the SDP "a=fmtp" attribute.

   - When the payload type is used with redundancy according to RFC
     2198, the level of redundancy is shown by the number of elements
     in the slash-separated payload type list in the "fmtp" parameter
     of the redundancy declaration as defined in RFC YYYY [9] and RFC
     2198 [3].

10.4 Offer/Answer Consideration

   In order to achieve interoperability within the framework of the
   offer/answer model [10], the following consideration should be made:

   - The "cps" parameter is declarative. Both sides may provide a
     value, which is independent of the other side.

11. Authors' Addresses

   Gunnar Hellstrom
   Omnitor AB
   Renathvagen 2
   SE-121 37 Johanneshov
   Sweden
   Phone: +46 708 204 288 / +46 8 556 002 03
   Fax:   +46 8 556 002 06
   E-mail: gunnar.hellstrom@omnitor.se

   Paul E. Jones
   Cisco Systems, Inc.
   7025 Kit Creek Rd.
   Research Triangle Park, NC 27709
   USA
   Phone: +1 919 392 6948
   E-mail: paulej@packetizer.com

12. Acknowledgements

   The authors want to thank Stephen Casner, Magnus Westerlund and
   Colin Perkins for valuable support with reviews and advice on
   creation of this document, Mickey Nasiri at Ericsson Mobile
   Communication for providing the development environment, Michele
   Mizarro for verification of the usability of the payload format for
   its intended purpose, and Andreas Piirimets for editing support.

13. Normative References

   [1]  ITU-T Recommendation T.140 (1998) - Text conversation protocol
        for multimedia application, with amendment 1, (2000).

   [2]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson,
        "RTP: A Transport Protocol for Real-Time Applications", RFC
        3550, July 2003.

   [3]  Perkins, C., Kouvelas, I., Hardman, V., Handley, M. and J.
        Bolot, "RTP Payload for Redundant Audio Data", RFC 2198,
        September 1997.

   [4]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [5]  ISO/IEC 10646-1: (1993), Universal Multiple Octet Coded
        Character Set.

   [6]  Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC
        3629, December 2003.

   [7]  Handley, M., Jacobson, V., "SDP: Session Description Protocol",
        RFC 2327, April 1998.

   [8]  Rosenberg, J., Schulzrinne, H., "An RTP Payload Format for
        Generic Forward Error Correction", RFC 2733, December 1999.

   [9]  Jones, P., "Registration of the text/red MIME Sub-Type",
        draft-ietf-avt-text-red, RFC YYYY, 2004.

   [10] Rosenberg, J., Schulzrinne, H., "An Offer/Answer Model with the
        Session Description Protocol (SDP)", RFC 3264, June 2002.

   [11] Schulzrinne, H., Casner, S., "RTP Profile for Audio and Video
        Conferences with Minimal Control", RFC 3551, July 2003.

14. Informative References

   [12] Floyd, S., Kempf, J., "IAB Concerns Regarding Congestion
        Control for Voice Traffic in the Internet", RFC 3714, March
        2004.

   [13] Baugher, McGrew, Carrara, Naslund, Norrman, "The Secure
        Real-Time Transport Protocol (SRTP)", RFC 3711, March 2004.

   [14] Schulzrinne, H., Petrack, S., "RTP Payload for DTMF Digits,
        Telephony Tones and Telephony Signals", RFC 2833, May 2000.

   [15] Hellstrom, G., "RTP Payload for text conversation", RFC 2793,
        2000.

15. Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the IETF's procedures with respect to rights in IETF
   Documents can be found in RFC 3667 (BCP 78) and RFC 3668 (BCP 79).

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

16. Copyright Statement

   Copyright (C) The Internet Society (2004). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Disclaimer of Validity

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.