AVT                                                          B. VerSteeg
Internet-Draft                                                  A. Begen
Intended status: Standards Track                           Cisco Systems
Expires: May 7, 2009                                      T. VanCaenegem
                                                     Alcatel-Lucent Bell
                                                        November 3, 2008

     Unicast-Based Rapid Synchronization with RTP Multicast Sessions
           draft-versteeg-avt-rapid-synchronization-for-rtp-01

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on May 7, 2009.

Copyright Notice

Copyright (C) The IETF Trust (2008).

Abstract

When a receiver joins a multicast session, it may need to acquire and parse certain key information before it can process any data sent in the multicast session. Depending on the join time, length of the key information repetition interval, size of the key information as well as the application and transport properties, the time lag before a receiver can usefully consume the multicast data, which we refer to as the synchronization delay, varies and may be large.

VerSteeg, et al.           Expires May 7, 2009                  [Page 1]

Internet-Draft    Rapid Synchronization for RTP Flows      November 2008

This is an undesirable phenomenon for receivers that frequently switch among different multicast sessions, such as video broadcasts.

In this document, we describe a method using existing RTP and RTCP protocol machinery that reduces the synchronization delay. In this method, an auxiliary unicast RTP session carrying the key information to the receiver precedes/accompanies the multicast flow. This unicast flow may be transmitted at a faster than natural rate to further accelerate the synchronization.

The motivating use case for this capability is multicast applications that carry real-time compressed audio and video. However, the proposed method can also be used in other types of multicast applications where the synchronization delay is long enough to be a problem.

Table of Contents

   1.  Introduction
   2.  Requirements Notation
   3.  Acronyms
   4.  Elements of Delay in Multicast Streams
   5.  Elements of Delay in Video Systems
     5.1.  Overview of MPEG-2 Transport Streams
     5.2.  Key Information Latency in Video Applications
       5.2.1.  PSI (PAT/CAT/PMT) Acquisition Delay
       5.2.2.  Random Access Point Acquisition Delay
     5.3.  Buffering Delays in Video Applications
       5.3.1.  Network-Related Buffering Delays
       5.3.2.  Application-Related Buffering Delays
     5.4.  Breakdown of Typical Synchronization Delays in IPTV
   6.  Rapid Multicast Synchronization
     6.1.  Overview
     6.2.  Message Flows and State Machines
     6.3.  Shaping the Unicast Burst
     6.4.  Failure Cases
   7.  Encoding of the Signaling Protocol in RTCP
     7.1.  Transport-Layer Feedback Messages
       7.1.1.  RMS Request
       7.1.2.  RMS Information
       7.1.3.  RMS Termination
     7.2.  Payload-Specific Feedback Messages
       7.2.1.  MPEG2-TS TSRAP
     7.3.  Multicast Join Report Block
       7.3.1.  Report Block Format
       7.3.2.  SDP Signaling
   8.  SDP Definitions and Examples
     8.1.  Definitions
     8.2.  Examples
   9.  NAT Considerations
   10. Open Source RTP Receiver Implementation
   11. Open Issues
   12. Security Considerations
   13. IANA Considerations
     13.1.  Registration of SDP Attribute Values
     13.2.  Registration of FMT Values
     13.3.  Registration of RTCP XR Block Types
   14. Acknowledgments
   15. Change Log
     15.1.  draft-versteeg-avt-rapid-synchronization-for-rtp-01
   16. References
     16.1.  Normative References
     16.2.  Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

Most multicast flows carry a stream of inter-related data. Certain information must first be acquired by the receivers to start processing any data sent in the multicast session. This document refers to this information as Key Information. The key information is conventionally sent periodically in the multicast session and usually consists of items such as a description of the schema for the rest of the data, references to which data to process for the receivers, encryption information including keys, as well as any other information required to process the data in the multicast flow.

Real-time multicast applications require the receivers to buffer data. The receiver may have to buffer data to smooth out the network jitter, to allow loss-repair methods such as Forward Error Correction and retransmission to recover the missing packets, and to satisfy the data processing requirements of the application layer.

When a receiver joins a multicast session, it has no control over what point in the flow is currently being transmitted. Sometimes the receiver may join the session right before the key information is sent in the session. In this case, the required waiting time is usually minimal. Similarly, the receiver may also join the session right after the key information has been transmitted. In this case the receiver has to wait for the key information to appear again in the stream before it can start processing any multicast data. In some other cases, the key information is not contiguous in the flow but dispersed over a large period, which forces the receiver to wait for all of the key information to arrive before starting to process the rest of the data.
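
As a rough illustration of how the waiting time depends on the join time, the following Python sketch computes the wait until the next key information transmission. It assumes, purely for illustration, that the key information is sent as one contiguous unit at a strictly periodic interval; the function name key_info_wait and its parameters are hypothetical and are not defined by this document.

```python
def key_info_wait(join_time: float, interval: float, first_tx: float = 0.0) -> float:
    """Return the time from joining until the next key information
    transmission starts.

    Illustrative model only: transmissions are assumed to start at
    first_tx, first_tx + interval, first_tx + 2 * interval, and so on.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    # Time elapsed since the most recent transmission started.
    elapsed = (join_time - first_tx) % interval
    # A receiver that joins exactly at a transmission instant does not wait.
    return 0.0 if elapsed == 0 else interval - elapsed


if __name__ == "__main__":
    interval = 1.0  # hypothetical repetition interval, in seconds
    print("join just before:", key_info_wait(0.75, interval))
    print("join just after: ", key_info_wait(0.25, interval))
```

Under this simplified model, a receiver whose join time is uniformly distributed over the repetition interval waits half the interval on average.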
The net effect of waiting for the key information and waiting for various buffers to fill up is that the receivers may experience significant delays in data processing. In this document, we refer to the difference between the time a receiver joins the multicast session and the time the receiver acquires all the necessary key information as the Synchronization Delay. The synchronization delay may not be the same for different receivers; it usually varies depending on the join time, length of the key information repetition interval, size of the key information as well as the application and transport properties. The varying nature of the synchronization delay adversely affects the receivers that frequently switch among multicast sessions.

In this specification, we address this problem for RTP-based multicast applications and describe a method that uses the fundamental tools offered by the existing RTP and RTCP protocols [RFC3550]. In this method, either the multicast source (or the distribution source in a single-source multicast (SSM) session) retains key information for a period after transmission, or an intermediary network element joins the multicast session, continuously caches the key information as it is sent in the session, and acts as a feedback target (see [I-D.ietf-avt-rtcpssm]) for the session. When a receiver wishes to join the same multicast session, instead of simply issuing an Internet Group Management Protocol (IGMP) [RFC3376] Join message, it sends a request to the feedback target address for the session asking for the key information. The feedback target starts a unicast retransmission RTP session and sends the key information to the receiver over that session. If there is spare bandwidth, the feedback target may also burst the key information at a faster than natural rate.
As soon as the receiver acquires the key information, it can join the multicast group and start processing the multicast data. This method potentially reduces the synchronization delay. We refer to this method as Unicast-based Rapid Synchronization with RTP Multicast Sessions. A simplified network diagram showing the rapid synchronization method through an intermediary network element is depicted in Figure 1.

                         +-----------------+
                    +--->|  Intermediary   |
                    | ...| Network Element |
                    | :  |(Feedback Target)|
                    | :  +-----------------+
                    | v
    +--------+     +------+         +--------+
    |  RTP   |---->|Router|........>|Joining |
    | Sender |     |      |-------->|  RTP   |
    +--------+     +------+         |Receiver|
                       |            +--------+
                       |
                       |            +--------+
                       +----------->|Existing|
                                    |  RTP   |
                                    |Receiver|
                                    +--------+

    ---> Multicast RTP Flow
    ...> Unicast RTP Flow

Figure 1: Rapid synchronization through an intermediary network element

A primary design goal in this solution is to use the existing tools in the RTP protocol family. This improves the versatility of the existing implementations, and promotes faster deployment and better interoperability. To this effect, we use the unicast retransmission support of RTP [RFC4588] and the capabilities of RTCP to handle the signaling needed to accomplish the synchronization.

The packet(s) carrying the key information are sent by the feedback target in the auxiliary unicast session for rapid synchronization. These are constructed as retransmission packets that would have been sent in a unicast RTP session to recover the missing packets at a receiver that has never received any packet. In fact, there is a single RTP session used for both rapid synchronization and retransmission-based loss repair. The conventional RTCP feedback message that requests the retransmission of the missing packets [RFC4585] indicates their sequence numbers.
However, upon joining a new session, the receiver has never received a packet and thus does not know the sequence numbers. Instead, the receiver sends a newly defined RTCP feedback message to request the key information needed to rapidly synchronize with the main multicast session. It is also worth noting that in order to issue the initial RTCP message to the feedback target, the SSRC of the session to be joined must be known prior to any packet reception, and hence needs to be signaled out-of-band (or in-band). In a Session Description Protocol (SDP) description, the SSRC MUST be signaled through the 'ssrc' attribute [I-D.ietf-avt-rtcpssm].

The rest of this specification is organized as follows. In Section 4, we describe the delay components in generic multicast applications. In Section 5, we introduce the delay components that are specific to video systems. We provide the protocol details of the rapid multicast synchronization method in Section 6 and Section 7. Section 8 and Section 9 discuss the SDP signaling issues with examples and NAT-related issues, respectively. Finally, in Section 10 we provide a pointer to open source RTP receiver code that implements the functionalities introduced in this document. Note that Section 3 provides a list of the acronyms frequently used in this document.

It should be noted that while this document primarily focuses on multicast applications that carry compressed audio and video, the core of the described method is payload-independent and can also be used in multicast applications that carry other types of data.

2.  Requirements Notation

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3.  Acronyms

This document uses the following acronyms frequently:

   CAT:  Conditional access table.
   DTS:  Decoding timestamp.
   ECM:  Entitlement control message.
   EMM:  Entitlement management message.
   ES:  Elementary stream.
   GoP:  Group of pictures.
   IDR:  Instantaneous decoding refresh.
   MPEG2-TS:  MPEG2 transport stream.
   MPTS:  Multi program transport stream.
   PAT:  Program association table.
   PCR:  Program clock reference.
   PMT:  Program map table.
   PSI:  Program specific information.
   PTS:  Presentation timestamp.
   RAP:  Random access point.
   SPTS:  Single program transport stream.
   TSRAP:  Transport stream random access point.

4.  Elements of Delay in Multicast Streams

In an any-source (ASM) or a single-source (SSM) multicast delivery system, there are three major elements that contribute to the overall synchronization delay when a receiver switches from one multicast session to another one. These are:

   o  Multicast switching delay
   o  Key information latency
   o  Buffering delays

Multicast switching delay is the delay incurred in leaving the current multicast session (if any) and joining the new multicast session. In typical systems, the multicast join and leave operations are handled by a group management protocol. For example, the receivers and routers participating in a multicast session may use the Internet Group Management Protocol (IGMP) [RFC3376]. In [RFC3376], when a receiver wants to join a multicast session, it sends an IGMP Join message to its upstream router and the routing infrastructure sets up the multicast forwarding state to deliver the packets of the multicast session to the new receiver. Depending on the proximity of the upstream router, the current state of the multicast tree, the load on the system and the protocol implementation, the join times vary. Current systems usually provide join latencies of less than 200 milliseconds (ms).
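
The join and leave operations described above are normally triggered by the receiving application through the socket interface, and the host IP stack then emits the corresponding IGMP messages. The following Python sketch shows one way a receiver might do this with the standard socket module. It is a minimal illustration rather than part of this specification: the helper names, the group address 233.252.0.1 and the port 5004 are assumptions chosen for the example.

```python
import socket


def build_membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack an ip_mreq structure: the group address followed by the
    address of the local interface (0.0.0.0 lets the stack choose)."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def join_group(sock: socket.socket, group: str, interface: str = "0.0.0.0") -> None:
    # Asks the host IP stack to join the group; the stack, not the
    # application, sends the IGMP membership report upstream.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group, interface))


def leave_group(sock: socket.socket, group: str, interface: str = "0.0.0.0") -> None:
    # Asks the host IP stack to leave the group.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    build_membership_request(group, interface))


if __name__ == "__main__":
    GROUP = "233.252.0.1"  # hypothetical group address for illustration
    PORT = 5004            # hypothetical RTP port for illustration
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    join_group(sock, GROUP)
    # ... receive RTP packets with sock.recvfrom() here ...
    leave_group(sock, GROUP)
    sock.close()
```

The time between the join call and the arrival of the first multicast packet corresponds to the join part of the multicast switching delay discussed above.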
If the receiver had been participating in another multicast session before joining the new session, it needs to send an IGMP Leave message to its upstream router to leave the session. In IGMP version 3 [RFC3376], the leave times are usually smaller than the join times. However, the Leave and Join messages may get lost, in which case the multicast switching delay inevitably increases.

Key information latency is the time it takes the receiver to acquire the key information. It is highly dependent on the proximity of the actual time the receiver joined the session to the next time the key information will be sent to the receivers in the session, whether the key information is sent contiguously or not, and the size of the key information. For some multicast flows, there is little or no interdependency in the data, in which case the key information latency will be nil or negligible. For other multicast flows, there is a high degree of interdependency. One example of interest is the multicast flows that carry compressed audio/video. For these flows, the key information latency may become quite large and be a major contributor to the overall delay. We describe the interdependency associated with audio/video flows in detail in Section 5.

The buffering component of the overall synchronization delay is driven by the way the application layer processes the payload. In many multicast applications, an unreliable transport protocol such as UDP [RFC0768] is often used to transmit the data packets, and the reliability, if needed, is usually addressed through other means such as Forward Error Correction and retransmission [I-D.ietf-rmt-pi-norm-revised]. These loss-repair methods require buffering at the receiver side to function properly.
In many applications, it is also often necessary to de-jitter the incoming data packets before feeding them to the application. The de-jittering process also increases the buffering delays. Besides these network-related buffering delays, there are also specific buffering needs that are required by the individual applications. For example, MPEG decoders require a significant amount of content to be available in the decoder buffers prior to starting to decode the content. We describe these buffering requirements for audio/video applications in detail in Section 5.

5.  Elements of Delay in Video Systems

For typical multicast-based video delivery systems, the multicast switching delay (time required to leave the previous multicast session and join the new session) is not the primary contributor to the overall synchronization delay. The multicast flows are typically already present at the edge or deep in the network, the propagation delays for join operations are modest, and the multicast routers can typically process the Join and Leave messages quickly. Even if the edge multicast router is not currently a member of the requested multicast session, the multicast routing control messages propagate through the network rapidly and trees are built without experiencing large delays. Even in cases where a number of tree branches need to be built to the edge multicast router, this cost is frequently amortized over a large number of receivers such that only the first receiver joining the group experiences the increased delay. Further, this delay can be eliminated at the cost of extra bandwidth in the network core by having the edge routers do static joins for the set of sessions they expect receivers to be interested in. These techniques usually provide a well-bounded multicast switching delay.
Once the join operation completes and a receiver starts receiving media content for the first time in a multicast session, it often experiences a considerable amount of key information latency and buffering delays. In the following subsections, we discuss the details of these delay elements, using MPEG2 Transport Streams as the motivating use case.

5.1.  Overview of MPEG-2 Transport Streams

MPEG2 Transport Stream (MPEG2-TS) [MPEG2TS] is an encapsulation method and transport that multiplexes digital video and audio content, together with ancillary metadata, and produces a synchronized multiplexed stream that is tailored for transport over packet or cell-oriented networks. MPEG2-TS is ubiquitous in broadcast applications over both terrestrial and satellite networks. Both Advanced Television Systems Committee (ATSC) in North America and Digital Video Broadcasting (DVB) in Europe use MPEG2-TS in their standards. MPEG2-TS has been standardized by both ISO and ITU [MPEG2TS]. While MPEG2-TS was originally limited to carry MPEG-2 encoded content, the specification was later extended to cover MPEG-4/AVC audio/video encoding standards as well.

MPEG2-TS is a container format that describes the schema of the audio and video content and the in-band control information. Prior to multiplexing, an audio and a video encoder output audio and video Elementary Streams (ES), respectively. The ES streams are then packetized to form the Packetized Elementary Streams (PES). The resulting elements are called PES packets. A transport stream (TS) encapsulates several PES streams and other data, and carries them in TS packets. The RTP payload format for carrying TS packets in an RTP stream is specified in [RFC2250].

In addition to the audio and video ES streams, there are ES streams that carry control data.
Program Specific Information (PSI) consists of metadata carried in the transport stream. PSI includes Program Association Table (PAT), Conditional Access Table (CAT) and Program Map Table (PMT). A PAT has information about all the programs carried in the transport stream. It lists the 13-bit Packet Identifiers (PIDs) of all the PMTs, associating them with the individual programs. All TS packets that carry a given ES of a particular program have the same PID value. This way, a decoder at the receiving side can extract the desired TS packets from the transport stream by checking their PID values. If the transport stream is not a Multi-Program Transport Stream (MPTS), but rather it is a Single-Program Transport Stream (SPTS), all the ES streams in the transport stream correspond to a single program.

CAT defines the type of the scrambling used (either at the PES or TS level), and identifies all the PID values of the TS packets that contain the Entitlement Management Messages (EMM). In addition to containing the PID values of each ES stream associated with a particular program, the PMT also includes private data associated with the program such as the PID value of the packet containing the Entitlement Control Messages (ECM). The data contained in the EMM and ECM messages are vital in descrambling encrypted content. Note that PSI is carried in the clear and is never scrambled so that a receiver which just started receiving the transport stream can process the PSI. The PAT, CAT and PMT tables must be parsed by the decoder in order to find the ES streams, private data as well as the encryption information for a given program.

MPEG2-TS produces output that is synchronized to a common clock across all the ESs in the multiplex.
To assist the audio and video decoders, programs periodically provide a Program Clock Reference (PCR) value in the transport stream. PCR values are embedded in the TS adaptation field headers and are inserted by the encoder at least every 100 ms. A PCR timestamp represents the value of the encoder's system clock when it was sampled. The system clock is driven by a local 27 MHz oscillator. PCR is extremely important in native MPEG-2 transport to keep the decoders synchronized. For example, the periodically sent Decoding Timestamps (DTS) and Presentation Timestamps (PTS) are specified relative to the PCR value and the decoders use the PCR value as the basis for a master clock during decoding and playout.

5.2.  Key Information Latency in Video Applications

We classify the key information latency into two categories.

5.2.1.  PSI (PAT/CAT/PMT) Acquisition Delay

As we discussed in Section 5.1, the video (and the audio as well) in an MPEG2-TS is self-describing, and the receiver must parse certain control information in the PAT, CAT and PMT tables (i.e., PSI) contained in the transport stream in order to know how to parse the rest of the stream (i.e., to find the audio and video elementary streams, private data and the encryption information for a given program). Many video services employ content encryption and the encryption keys must be parsed as well for decrypting the content.

In order to enable various system elements to process video effectively, certain portions of the stream are left unencrypted. The PAT/PMT tables are always in the clear. The structure of the ECMs is also in the clear, although the ECM content which contains keying material is encrypted. The PSI information is repeated periodically in the transport stream; thus, when a receiver joins the multicast session, it needs to wait until the next time PSI is sent in the transport stream.

5.2.2.  Random Access Point Acquisition Delay

Conventional MPEG2 video encoders encode the video content in Groups of Pictures (GoP). Each GoP is encoded independently from other GoPs and starts with an intra-coded frame (I-frame) that does not have any reference to other frames in the same GoP, i.e., an I-frame contains the representation of an entire picture and can be decoded independently. Thus, the start of an I-frame is said to be a Random Access Point (RAP).

On the other hand, due to the temporal compression, the rest of the frames in the same GoP may have references to the I-frame or to other frames in the same GoP. Due to this interdependency among the frames, one generally has to receive certain elements of the GoP prior to decoding or rendering any part of the GoP. For example, the decoder can decode a frame that is dependent on two other frames only after these two frames are decoded. Usually, GoP durations are between 500 ms and one second. However, more advanced codecs may use longer GoPs to improve the encoding efficiency.

When a receiver joins the multicast session, it needs to wait until the next RAP shows up in the multicast stream before it can start decoding. Since the receiver's join time is independent of which frame is currently being multicast, the average time a receiver waits for RAP, i.e., the average RAP acquisition delay, is approximately equal to half of the GoP duration. Hence, for longer GoPs, the RAP acquisition delay is proportionally longer.

Advanced Video Coding (AVC) (also called MPEG4 part 10) compression is very similar to MPEG2 compression. It has a few more compression tools available, including Hierarchical GoPs. In a hierarchical GoP, the dependent frames of a GoP may reference the key frame at the start of this GoP or the key frame at the start of the next GoP.
This additional dependency causes a longer RAP acquisition delay, as the decoder must receive two I-frames (spread between two logical GoPs) before decoding can commence. In an open GoP, a frame in one GoP may refer to a frame in a previous GoP. AVC also has the ability to insert Instantaneous Decoding Refresh (IDR) frames. Frames that follow an IDR frame cannot reference frames that precede that IDR frame. IDR frames are useful for editing AVC streams, but typically do not appear often enough in streaming video to be useful in a stream synchronization context.

Note that in order for an intermediary network element such as a retransmission server to find the random access points in the video stream (e.g., I-frames), the necessary structural information must be in the clear if the intermediary is not in possession of the necessary keys.

5.3.  Buffering Delays in Video Applications

We classify the buffering delays into two categories.

5.3.1.  Network-Related Buffering Delays

In general, multicast-based video applications use an unreliable underlying transport protocol such as UDP [RFC0768] to distribute the content to a large number of receivers. This is largely due to the fact that these applications are one-way in nature and providing closed-loop reliability does not scale well when the number of receivers is large or the acceptable playout delay is small, or both. Rather, if there is a need for reliability, the applications may employ one or more loss-repair methods to recover the packets missing at each receiver (The Reliable Multicast Transport Working Group has several standardized solutions for this problem. Refer to [I-D.ietf-rmt-pi-norm-revised] for details).
For example, Forward Error Correction (FEC) may be used proactively and/or on-demand to provide reliable transmission to a potentially very large multicast group in a scalable manner [I-D.ietf-fecframe-framework]. Similarly, retransmissions may be used in RTP-based multicast sessions where the retransmissions can be handled by local repair servers rather than the source itself [I-D.ietf-avt-rtcpssm]. However, regardless of the type of the loss-repair method(s) adopted by an application, loss-recovery operations always require additional buffering at the receiver side. The amount of buffering increases with the FEC block size when FEC is used, and with the round-trip time between the receiver and the local repair server when retransmission is used.

Audio and video decoders demand almost jitter-free content. If any jitter is introduced during the transmission in the network or due to the loss-repair methods, the jitter has to be smoothed out before the content is fed to the decoder. This is called de-jittering and it usually adds to the buffering delay.

5.3.2.  Application-Related Buffering Delays

The application buffering requirements for MPEG-encoded content are quite rigorous, particularly for MPEG-based video applications. Video compression devices apply more bits to represent certain scenes than they do for other scenes. A very complex scene (individual picture) requires considerably more information than a simple scene. Furthermore, pictures that are entirely intra-coded, e.g., I-frames, consume more bits compared to pictures that make use of predictive coding. Each scene is shown by the decoder at a certain fixed frame rate (e.g., 24 fps or 30 fps). Since some scenes comprise more bits than other scenes, the output rate of the decoder buffer is usually variable. However, the network flow is typically Constant Bit Rate (CBR) or Capped Variable Bit Rate (Capped VBR).
The net effect is that the input rate to the decoder buffer is close to constant, but the output rate is highly variable. The video encoders keep track of the decoder buffer size, and use this information to regulate the temporal compression. This forces the decoder buffer to "breathe." In order to avoid underflow, the decoder buffer must fill up to a certain level prior to starting to decode and play the content. The decoder buffer size required to avoid underflow is dependent on the encoder, and the encoder signals the decoder buffering requirements in-band.

Typical decoder buffer requirements for MPEG2 content range from a few hundred milliseconds to a few seconds. However, AVC/MPEG4 part 10 encoders usually tend to use more temporal compression, and thus require a larger buffer at the decoder side. This consequently increases the buffering delays.

5.4.  Breakdown of Typical Synchronization Delays in IPTV

Editor's note: This section may provide typical ranges for each of the delay components based on the observations in the field. This section is for educational and illustrative purposes.

6.  Rapid Multicast Synchronization

The video systems we consider as a use case in this document encapsulate multicast video flows in an IP/UDP/RTP/MPEG2-TS format. Typical codecs supported by MPEG2-TS include MPEG2 and AVC/MPEG4 part 10, although other codecs are also in use. For more details on RTP encapsulation of TS packets, refer to [RFC2250].

Note that many legacy video deployments today do not use RTP encapsulation. However, in order to use the rapid synchronization method specified in this document, RTP encapsulation is required.

We start this section with an overview of the rapid (multicast) synchronization method.

6.1.  Overview

RTP [RFC3550], together with [RFC4588] and [RFC4585], provides the mechanisms needed to encapsulate a data flow and use RTCP feedback to implement a robust loss-recovery mechanism for that flow. When packets are lost, a receiver may use RTCP NACK messages to request retransmission of those packets using RTP retransmission packets.

In order to achieve rapid synchronization, the receiver can use this retransmission mechanism to request all of the data from the most recent RAP. We therefore employ a new RTP/RTCP-based mechanism that boils down to "I do not have synchronization with the stream. Send me a repair burst that will cover all of the required data from a recent RAP to the point the stream has reached in the multicast session."

Since the receiver does not actually know the starting RTP sequence number of the RAP data, it cannot use the conventional RTCP NACK message. Instead, we define a new RTCP transport-layer feedback message. Note that the rapid synchronization flow and the normal retransmission flow share the same RTP session.

In order to reduce the synchronization delay associated with processing a new RTP encapsulated multicast flow, the media source may retain the stream state, and/or a Retransmission Server (RS) may cache that flow near the egress edge and provide an accelerated unicast burst to the requesting receiver. Which element performs these functions depends on the desired scale of the system. The protocol machinery is agnostic to the difference, as both use the RTCP feedback target address as the way to identify the element performing these functions.

Where a Retransmission Server (RS) is employed, it semi-permanently joins the multicast session and receives the RTP streams it wishes to cache so that it can perform the retransmission functions.
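
As an illustration only, the caching role described above can be sketched in a few lines of Python. This is not part of the protocol and not how any particular RS is implemented. The names CachedPacket and BurstCache, the is_rap flag, and the fixed-capacity policy are all assumptions made for this sketch. It shows one way a cache could select "all of the required data from a recent RAP to the point the stream has reached."

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class CachedPacket:
    seq: int        # 16-bit RTP sequence number as seen on the multicast flow
    payload: bytes  # opaque RTP payload
    is_rap: bool    # hypothetical flag: this packet starts a random access point


class BurstCache:
    """Hypothetical bounded cache of the most recent multicast RTP packets."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen silently evicts the oldest packet when full.
        self._packets = deque(maxlen=capacity)

    def add(self, packet: CachedPacket) -> None:
        self._packets.append(packet)

    def burst_from_latest_rap(self) -> list:
        """Return the packets from the most recent RAP up to the newest one.

        An empty list means no RAP is currently cached; in that case this
        sketch assumes the request could not be served from the cache.
        """
        packets = list(self._packets)
        # Scan backwards so that the most recent RAP is found first.
        for index in range(len(packets) - 1, -1, -1):
            if packets[index].is_rap:
                return packets[index:]
        return []
```

Selecting by arrival order rather than by comparing sequence numbers keeps the sketch independent of 16-bit sequence number wrap-around.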
For the rapid synchronization support, RS parses the incoming flows, looking for the key information. The key information may be segregated by RS into two components: the key information that occurs in sequence with the RTP data and that which does not occur in RTP sequence order.

When acquiring an RTP session, the RTP Receiver (RR) sends an RTCP message to the feedback target requesting an accelerated unicast burst of data from the multicast session, including any key information that occurs in sequence with the RTP data plus any optional key information that does not occur in RTP sequence order.

In the case of IP/UDP/RTP/MPEG2-TS encapsulated video streams, the key information that may not occur in the RTP sequence order may consist of the information required to demultiplex and decode the MPEG2-TS. We refer to this information as the Transport Stream Random Access Point (TSRAP). The TSRAP information includes the PAT, PMT, PCR and optionally the ECMs. The TSRAP information is sent from the feedback target to RR in a newly defined RTCP payload-specific feedback message.

The sequential key information (i.e., the PES header, I-frame and subsequent data), on the other hand, is always sent in the unicast RTP flow, using the RTP retransmission payload format. This has several advantages, including easy correlation between the packets in the RTP session carrying the unicast burst and those of the multicast session, and the ability of the receiver to use the same algorithms for receiving and processing the key information as it does for dealing with RTP retransmissions during the normal session operation.

The RTP data is sent from the feedback target to the receiver starting with the sequential key information via unicast as described above. The MPEG data sent in the unicast burst starts with an Elementary Stream Random Access Point (ESRAP).
This includes a PES header containing a PTS, followed by a sequence header, followed by an I-frame. The unicast burst continues at a higher than natural rate until the unicast burst catches up with the real-time multicast flow.

6.2.  Message Flows and State Machines

The flow diagram for unicast-based rapid synchronization is sketched in Figure 2. In this figure, we have an RTP Sender and an RTP Receiver (RR). The rapid synchronization support is provided in this scenario by a Retransmission Server (RS), although the message flows are identical to the case where the rapid synchronization is performed by a feedback target co-located with the media source. Note that [I-D.ietf-avt-rtcpssm] permits the feedback target to be a retransmission server, since it is a logical function to which RRs send their unicast feedback.

                +--------+
                |  RTP   |
                | Sender |
                +--------+
                    |
                    v
             +--------------+
             |    Router    |
             |              |<----- (4) IGMP Join ------+
             +--------------+                           |
                  |   |                                 |
          +-------+   +--- (5) New Multicast Flow ---+  |
          |                                          |  |
          v                                          v  |
   +--------------+                              +----------+
   |              | <. (1) Rapid Synch Request . |          |
   |              |   & Additional Requirements  |          |
   |Retransmission| .. (2) Burst Description ..> |   RTP    |
   |    Server    | .... (3) Unicast Burst ....> | Receiver |
   |     (RS)     |   & Payload-specific Info    |   (RR)   |
   |              | <..... (6) Termination ..... |          |
   |              |   & Rapid Synch Report(s)    |          |
   +--------------+                              +----------+

   ---> Multicast Flows and IGMP Messages
   ...> Unicast Flows

Figure 2: Flow diagram for unicast-based rapid synchronization

The following steps describe rapid synchronization in detail:

1.  RR sends a rapid synchronization request for the new multicast RTP session to the feedback target address of that session. This request is sent in an RTCP transport-layer feedback message [RFC4585], which is defined in Section 7.1.1. The RTCP message contains the SSRC of RR and the SSRC of the media source.
    Note that since no RTP packets have been received yet for this session, the SSRC must be obtained out-of-band or in-band. For sessions described using SDP [RFC4566], the SSRC MUST be signaled using the 'ssrc' attribute of the media description (the 'cname' source attribute MUST also be present [I-D.ietf-mmusic-sdp-source-attributes]). In the RTCP message, RR MAY also specify additional requirements. This information is used by RS to prepare a rapid synchronization burst that conforms to RR's requirements.

2.  RS receives the rapid synchronization request, and decides whether to accept it or not. If RS accepts the request, it sends an RTCP message to RR with fields that describe the burst RS will generate and send, possibly including an indication of when RR should switch to the multicast stream. This new RTCP transport-layer feedback message is defined in Section 7.1.2. For redundancy purposes, this message MAY be sent multiple times towards RR. Note that RS learns the IP address and port information for RR from the rapid synchronization request it received. (This description glosses over the NAT details. Refer to Section 9 for a discussion of NAT-related issues.)

    If RS cannot provide a unicast rapid synchronization, RS rejects the request and informs RR immediately via the RTCP feedback message. If RR receives a message indicating that its rapid synchronization request has been denied, it abandons the rapid synchronization attempt and SHOULD immediately join the RTP multicast session by sending an IGMP Join message [RFC3376] to its upstream multicast router for the new multicast session.

3.  If RS accepts the rapid synchronization request, it transmits (in addition to the RTCP feedback message describing the burst) the unicast RTP burst data and any additional payload-specific RTCP message(s) that may contain information that is not provided in the unicast RTP burst. Note that these RTCP messages are usually sent in a compound RTCP packet (see Section 6.1 of [RFC4585]), which is sent either prior to the data burst or during the data burst.

4.  At the appropriate moment (as indicated or computed from the burst parameters specified in the burst description), RR sends an IGMP Join message [RFC3376] to its upstream multicast router for the new multicast session.

5.  RR starts receiving the multicast RTP flow.

6.  RS may know when it needs to stop the unicast burst based on the burst parameters. Or, RR may explicitly let RS know the sequence number of the first RTP packet it received from the multicast session via a new RTCP transport-layer feedback message (defined in Section 7.1.3). With this information, RS can decide when to terminate the unicast burst. To guard against packet loss, this message MAY be sent multiple times towards RS as long as RR follows the RTCP timer rules of [RFC4585].

Note that one or more of the data packets may have been dropped by the network during bursting. To recover them, RR MAY explicitly request a retransmission for those packets after the burst completes.

It is possible that RR may decide to switch to a new multicast session while an earlier rapid synchronization request is still pending or active. In that case, RR SHOULD cancel the pending/active rapid synchronization operation before it sends a new request for the new multicast session.

It is also possible that the rapid synchronization request sent by RR may get lost.
Or, RS may have received and accepted the request, but the payload-specific RTCP message(s) sent by RS may be dropped in the network. In such cases, RR either would not receive any RTP burst data, or it would receive the RTP burst data but would not be able to benefit from it without the payload-specific information. Thus, when RR infers that rapid synchronization has failed, it SHOULD first cancel the pending/active rapid synchronization operation. Then, if staying on the same session, RR SHOULD join the multicast session; if switching to a new session, RR MAY send a new rapid synchronization request for the new session. See Section 6.4 for more details.

To cancel the pending request and stop any existing rapid synchronization operation, RR SHOULD send an RTCP BYE message for the unicast retransmission session. Upon receiving an RTCP BYE message, RS MUST terminate the rapid synchronization operation, and cease transmitting any further packets of the associated unicast burst. Section 6.1 of [RFC3550] mandates that the RTCP BYE message always be sent with a sender or receiver report in a compound RTCP packet (if no data has been received, an empty receiver report MUST be included). With the information contained in the receiver report, RS can also figure out how many duplicate RTP packets have been delivered to RR (note that this will be an upper-bound estimate as one or more packets might have been lost during the burst transmission). The impact of duplicate packets and measures that can be taken to minimize the impact of receiving duplicate packets will be addressed in Section 6.3.

Note that sending an RTCP BYE message for the unicast session would not prevent RR from asking for a retransmission from RS at a later time since the next time RR sends a retransmission request to RS, the logical retransmission session will be re-instantiated.
Thus, if RR is already seeing an overlap in the data received from the unicast and multicast sessions, RR MAY also send an RTCP BYE message to make RS stop the unicast burst.

Also note that if RR decides to switch to a new multicast session after it has already joined a multicast session following a rapid synchronization request, RR MUST also send an RTCP BYE message for the session associated with the current multicast source stream.

Whether RR completes the rapid synchronization successfully or not, it is a good practice to gather detailed information about RR's rapid synchronization experience. For this purpose, in Section 7.3, we define a new RTCP extended report (XR) block type, which we refer to as the Multicast Join Report. This report is designed to be payload-independent; thus, it can be used by any multicast application that supports rapid synchronization.

6.3.  Shaping the Unicast Burst

Editor's note: This section may discuss sizing of the buffers, output buffer overload protection, output bandwidth management, adjustment of burst rate and duration. TBC.

6.4.  Failure Cases

Editor's note: This section may discuss what happens if the request for rapid synchronization gets lost, or rapid synchronization fails, or when there are no available resources, etc. TBC.

7.  Encoding of the Signaling Protocol in RTCP

This section defines the formats of the RTCP transport-layer feedback messages that are exchanged between the Retransmission Server (RS) and RTP Receiver (RR) during rapid synchronization. These messages are payload-independent and SHOULD be used by all RTP-based multicast applications that support rapid synchronization regardless of the payload they carry.

This section also defines the format of the RTCP payload-specific feedback message that is used for rapid synchronization by the RTP applications carrying MPEG2-TS.
Other RTCP payload-specific feedback messages MAY similarly be defined in other documents for different payload types. Finally, a new RTCP extended report block type to be used with the rapid synchronization method is also defined in this section.

The common packet format for the RTCP feedback messages is defined in Section 6.1 of [RFC4585]. Each feedback message has fixed-length fields for version, padding, feedback message type (FMT), payload type (PT), length, SSRC of packet sender and SSRC of media source, as well as a variable-length field for feedback control information (FCI).

In the transport-layer feedback messages, the PT field is set to RTPFB (205), whereas in the payload-specific feedback messages, the PT field is set to PSFB (206).

7.1.  Transport-Layer Feedback Messages

In this section, we define three transport-layer feedback messages for rapid multicast synchronization (RMS).

7.1.1.  RMS Request

The RMS Request message is identified by PT=RTPFB and FMT=2. The FCI field MUST contain only one RMS Request.

The RMS Request is used by RR to request rapid synchronization for a new multicast RTP session.

The FCI field has the structure depicted in Figure 3.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Min RMS Buffer Fill Requirement                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Max RMS Buffer Fill Requirement                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Max Receive Bitrate                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 3: FCI field syntax for the RMS Request message

Min RMS Buffer Fill Requirement (32 bits): The minimum amount of data (in ms) required by RR after the burst completes. A zero value means it is not specified.
If specified, the amount of backfill provided by the unicast burst SHOULD NOT be smaller than this value, since a smaller backfill would not build up the desired buffer level at RR and may cause buffer underruns.

Max RMS Buffer Fill Requirement (32 bits): The maximum amount of data (in ms) that can be received by RR after the burst completes. A zero value means it is not specified. If specified, the amount of backfill provided by the unicast burst SHOULD NOT be larger than this value, since a larger backfill may cause buffer overflows at RR.

Max Receive Bitrate (32 bits): The maximum bitrate (in bits per second) that RR can receive. A zero value means it is not specified. If specified, the unicast burst bitrate SHOULD NOT be larger than this value, since a higher bitrate may cause congestion and packet loss.

The length of the feedback message MUST be set to 5.

The semantics of this feedback message is independent of the payload type.

7.1.2.  RMS Information

The RMS Information message is identified by PT=RTPFB and FMT=3. The FCI field MUST contain only one RMS Information.

The RMS Information is used to describe the unicast burst that will be sent for rapid synchronization. It also includes other useful information for RR.

There are currently two proposals for RMS Information:

7.1.2.1.  Encoding Option A

The FCI field has the structure depicted in Figure 4.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             RTP Seqnum of the First Burst Packet              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Earliest IGMP Join Time                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Join-Time Rapid Synchronization Fill              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Rapid Synchronization Duration                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Rapid Synchronization Fill                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 4: FCI field syntax for the RMS Information message

RTP Seqnum of the First Burst Packet (32 bits): The extended RTP sequence number of the first packet that will be sent as part of the rapid synchronization. This allows RR to know if one or more packets are dropped at the beginning of rapid synchronization. The 32-bit extended RTP sequence number is constructed by putting the 16-bit RTP sequence number in the lower two bytes and zero octets in the higher two bytes.

Earliest IGMP Join Time (32 bits): Time difference between the arrival of the first burst packet and the earliest time instant when RR could join the new multicast session (in RTP clock ticks).

Join-Time Rapid Synchronization Fill (32 bits): The amount of backfill (in ms) that RR can expect in its buffer as a result of the rapid synchronization at the time of the earliest IGMP Join.

Rapid Synchronization Duration (32 bits): Time difference between the timestamps of the first and last RTP packets in the unicast burst (in RTP clock ticks).

Rapid Synchronization Fill (32 bits): The amount of backfill (in ms) that RR can expect in its buffer as the result of the rapid synchronization.

If RS sets the second and fourth fields to zero, it means RS has rejected the request.
In that case, RR SHOULD join the multicast session immediately.

The length of the feedback message MUST be set to 7.

The semantics of this feedback message is independent of the payload type.

7.1.2.2.  Encoding Option B

The FCI field has the structure depicted in Figure 5.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Join-Now    |   Response    | First RTP Seqnum of the Burst |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Earliest IGMP Join Time                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Rapid Synchronization End Time                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 5: FCI field syntax for the RMS Information message

Join-Now (8 bits): A value of 1 instructs RR to join the multicast session immediately (in this case, the Earliest IGMP Join Time and Rapid Synchronization End Time fields MUST be set to zero). A value of 0 indicates that the earliest time instant when RR could join the new multicast session is given in the Earliest IGMP Join Time field. Other values MUST NOT be used.

Response (8 bits): A value of 0 indicates that the rapid synchronization request has been rejected. A value of 1 indicates that the rapid synchronization request has been accepted and RR MUST NOT send an explicit notification to RS when the multicast join has been completed. A value of 2 indicates that the rapid synchronization request has been accepted and RR MUST send an explicit notification (see Section 7.1.3) to RS when the multicast join has been completed. Other values MUST NOT be used.

First RTP Seqnum of the Burst (16 bits): The RTP sequence number of the first packet that will be sent as part of the rapid synchronization. This allows RR to know if one or more packets are dropped at the beginning of rapid synchronization.
Earliest IGMP Join Time (32 bits): Time difference between the arrival of the first burst packet and the earliest time instant when RR could join the new multicast session (in RTP clock ticks).

Rapid Synchronization End Time (32 bits)

The length of the feedback message MUST be set to 5.

The semantics of this feedback message is independent of the payload type.

Editor's note: The RMS Information message MAY be sent multiple times at the start of, or prior to, or during the RTP unicast burst.

7.1.3.  RMS Termination

The RMS Termination message is identified by PT=RTPFB and FMT=4. The FCI field MUST contain only one RMS Termination.

The RMS Termination is used by RR to let RS know the sequence number of the first RTP packet received from the multicast session. With this information, RS can decide when to terminate the unicast burst.

The FCI field has the structure depicted in Figure 6.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       RTP Seqnum of the First Received Multicast Packet       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 6: FCI field syntax for the RMS Termination message

RTP Seqnum of the First Received Multicast Packet (32 bits): The extended RTP sequence number of the first packet received from the multicast session. The 32-bit extended RTP sequence number is constructed by putting the 16-bit RTP sequence number in the lower two bytes and zero octets in the higher two bytes.

The length of the feedback message MUST be set to 3.

The semantics of this feedback message is independent of the payload type.

7.2.  Payload-Specific Feedback Messages

We define the format of the RTCP payload-specific feedback message for the RTP applications carrying MPEG2-TS.

7.2.1.  MPEG2-TS TSRAP

The MPEG2-TS TSRAP message is identified by PT=PSFB and FMT=4.
The FCI field MUST contain only one MPEG2-TS TSRAP.

The MPEG2-TS TSRAP is used for carrying the key information that consists of the information required to demultiplex and decode the MPEG2-TS. It usually includes the PAT, PMT, PCR and optionally the ECMs.

The FCI field has the structure depicted in Figure 7.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                              TBC                              :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 7: FCI field syntax for the MPEG2-TS TSRAP message

Editor's note: The format of this message is TBD.

The length of the feedback message MUST be set to TBD.

7.3.  Multicast Join Report Block

In this section, we define a new report block type within the framework of RTP Control Protocol (RTCP) Extended Reports (XR) [RFC3611]. This report is used to carry useful information about the first RTP packet received from a multicast session. This information includes the sequence number of the RTP packet and the time it took to receive it after the IGMP Join message was issued. In addition to providing useful diagnostic information, this report may also serve as an indication that RR has successfully completed the multicast join.

7.3.1.  Report Block Format

The report format is shown in Figure 8.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      BT       | rsvd. | Status|         Block Length          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 SSRC of the Multicast Session                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       RTP Seqnum of the First Received Multicast Packet       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        IGMP Join Time                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 8: Format for the Multicast Join Report Block

BT (8 bits): Block type that identifies the block format. The Multicast Join Report Block is identified by the constant TBD.

rsvd. (4 bits): This field is reserved for future definition. In the absence of such definition, the bits in this field MUST be set to zero and MUST be ignored by the receiver.

Status (4 bits): TBD.

Block Length (16 bits): The length of this report block, including the header, in 32-bit words minus one. It MUST be set to 3.

SSRC of the Multicast Session (32 bits): The SSRC of the multicast RTP session that RR has joined.

RTP Seqnum of the First Received Multicast Packet (32 bits): The extended RTP sequence number of the first packet received from the multicast session. The 32-bit extended RTP sequence number is constructed by putting the 16-bit RTP sequence number in the lower two bytes and zero octets in the higher two bytes.

IGMP Join Time (32 bits): Time difference (in ms) between the instant the IGMP Join message was sent and the instant the first RTP packet was received from the multicast session.

The semantics of this report block is independent of the payload type.

Editor's note: More fields can be defined in this XR report to give more details about the rapid synchronization experience of RRs. They are TBD.

7.3.2.  SDP Signaling

A new parameter is defined for the Multicast Join Report Block to be used with Session Description Protocol (SDP) [RFC4566].
It has the following syntax within the 'rtcp-xr' attribute:

   rtcp-xr-attrib = "a=rtcp-xr:" [xr-format *(SP xr-format)] CRLF

   xr-format = "multicast-join"

   CRLF = %d13.10

Figure 9

Refer to Section 5.1 of [RFC3611] for a detailed description and the full syntax of the 'rtcp-xr' attribute.

8.  SDP Definitions and Examples

8.1.  Definitions

The syntax of the 'rtcp-fb' attribute has been defined in [RFC4585]. Here we add the following syntax to the 'rtcp-fb' attribute (the feedback type and optional parameters are all case sensitive):

(In the following ABNF [RFC5234], fmt, SP and CRLF are used as defined in [RFC4566].)

   rtcp-fb-syntax = "a=rtcp-fb:" rtcp-fb-pt SP rtcp-fb-val CRLF

   rtcp-fb-pt = "*"   ; wildcard: applies to all formats
              / fmt   ; as defined in SDP spec

   rtcp-fb-val = "nack" SP "ssli"

The following parameter is defined in this document for use with 'nack':

o  'ssli' stands for Stream Synchronization Loss Indication and indicates the use of RMS Request feedback as defined in Section 7.1.1.

8.2.  Examples

This section provides an SDP [RFC4566] example for enabling rapid synchronization with multicast RTP sessions. The following example uses the SDP grouping semantics [RFC3388], the RTP/AVPF profile [RFC4585], the RTP retransmissions [RFC4588], the RTCP extensions for SSM sessions with unicast feedback [I-D.ietf-avt-rtcpssm] and the source-specific media attributes [I-D.ietf-mmusic-sdp-source-attributes].

In the example below, we have two primary source streams and a unicast retransmission stream for each of these source streams. The source streams are multicast from a distribution source (with a source IP address of 8.166.1.1) in different multicast groups.
For each source stream, a feedback target address (9.30.30.1) is also specified with the 'rtcp' attribute. The receiver(s) can report missing packets on the source stream to the feedback target and request retransmissions. The parameter 'rtx-time' specifies the time in milliseconds (measured from the time a packet was first sent) that the sender (in this case the feedback target) keeps an RTP packet in its buffers available for retransmission.

For the first source stream, only the conventional retransmission support is enabled. For the second source stream, both the conventional retransmission and rapid synchronization support are enabled. This is achieved by the "a=rtcp-fb:98 nack ssli" line.

When a receiver requires rapid synchronization for a new multicast session it wants to join, it sends an RMS-R message to the feedback target. This feedback message has to carry the SSRC of the primary source session for which rapid synchronization is requested. However, since this receiver has not received any RTP packets from this primary source session yet, the receiver MUST learn the SSRC value from the 'ssrc' attribute of the media description [I-D.ietf-avt-rtcpssm]. In addition to the SSRC value, the 'cname' source attribute MUST also be present in the SDP description [I-D.ietf-mmusic-sdp-source-attributes].

Note that listing the SSRC values for the primary source sessions in the SDP file does not create a problem in SSM sessions when an SSRC collision occurs. This is because in SSM sessions, a receiver that observes an SSRC collision with a media source MUST change its own SSRC [I-D.ietf-avt-rtcpssm] by following the rules defined in [RFC3550].

A feedback target that receives an RMS-R feedback message becomes aware that the prediction chain at the receiver side has been broken or does not exist any more.
If the necessary conditions are satisfied (as outlined in Section 7 of [RFC4585]) and available resources exist, the feedback target MAY react to the RMS-R message by sending the payload-specific feedback message(s) and starting the unicast burst.

   v=0
   o=ali 1122334455 1122334466 IN IP4 rtp.rocks.com
   s=Rapid Synchronization Examples
   t=0 0
   a=group:FID 1 2
   a=group:FID 3 4
   a=rtcp-unicast:rsi
   m=video 40000 RTP/AVPF 96
   i=Primary Source Stream #1
   c=IN IP4 224.1.1.1/255
   a=source-filter: incl IN IP4 224.1.1.1 8.166.1.1
   a=recvonly
   a=rtpmap:96 MP2T/90000
   a=rtcp:40001 IN IP4 9.30.30.1
   a=rtcp-fb:96 nack
   a=mid:1
   m=video 40002 RTP/AVPF 97
   i=Unicast Retransmission Stream #1 (Ret. Support Only)
   c=IN IP4 9.30.30.1
   a=recvonly
   a=rtpmap:97 rtx/90000
   a=rtcp:40003
   a=fmtp:97 apt=96
   a=fmtp:97 rtx-time=3000
   a=mid:2
   m=video 41000 RTP/AVPF 98
   i=Primary Source Stream #2
   c=IN IP4 224.1.1.2/255
   a=source-filter: incl IN IP4 224.1.1.2 8.166.1.1
   a=recvonly
   a=rtpmap:98 MP2T/90000
   a=rtcp:41001 IN IP4 9.30.30.1
   a=rtcp-fb:98 nack
   a=rtcp-fb:98 nack ssli
   a=ssrc:123321 cname:iptv-ch32@rms.example.com
   a=rtcp-xr:multicast-join
   a=mid:3
   m=video 41002 RTP/AVPF 99
   i=Unicast Retransmission Stream #2 (Ret. and Rapid Synch. Support)
   c=IN IP4 9.30.30.1
   a=recvonly
   a=rtpmap:99 rtx/90000
   a=rtcp:41003
   a=fmtp:99 apt=98
   a=fmtp:99 rtx-time=5000
   a=mid:4

9.  NAT Considerations

TBC.

10.  Open Source RTP Receiver Implementation

Open source RTP Receiver code that implements the functionality introduced in this document is available at the following URL:

   http://www.cisco.com/en/US/docs/video/cds/cda/vqe/2_0/user/guide/ch1_over.html

The code is also available at:

   ftp://ftpeng.cisco.com/ftp/vqec/

Note that this code is under development and may be based on an earlier version of this document.
As we make progress in the draft, the source code will also be updated to reflect the changes.

11. Open Issues

o Finalizing the RTCP payload-independent message formats.

o Format of the MPEG2-TS TSRAP message.

o Capability of extending the feedback message formats in the future.

o Determining other useful information to report in the Multicast Join XR Report.

12. Security Considerations

TBC.

13. IANA Considerations

This document registers new SDP values and RTCP packets.

The following contact information shall be used for all registrations in this document:

   Ali Begen
   abegen@cisco.com
   170 West Tasman Drive
   San Jose, CA 95134 USA

13.1. Registration of SDP Attribute Values

This document registers a new value for the 'nack' attribute to be used with the 'rtcp-fb' attribute in SDP. For more information about 'rtcp-fb', refer to [RFC4585].

   Value name:   ssli
   Long name:    Stream Synchronization Loss Indication
   Usable with:  nack
   Reference:    This document

13.2. Registration of FMT Values

Within the RTPFB range, the following three format (FMT) values are registered:

   Name:       RMS-R
   Long name:  Rapid Multicast Synchronization Request
   Value:      2
   Reference:  This document

   Name:       RMS-I
   Long name:  Rapid Multicast Synchronization Information
   Value:      3
   Reference:  This document

   Name:       RMS-T
   Long name:  Rapid Multicast Synchronization Termination
   Value:      4
   Reference:  This document

Within the PSFB range, the following format (FMT) value is registered:

   Name:       MPEG2-TS TSRAP
   Long name:  MPEG2-TS Transport Stream Random Access Point
   Value:      4
   Reference:  This document

13.3. Registration of RTCP XR Block Types

New block types for RTCP XR are subject to IANA registration. For general guidelines on IANA considerations for RTCP XR, refer to [RFC3611].
This document (provisionally) assigns the block type value TBD in the RTCP XR Block Type Registry to "Multicast Join Report Block." This document also registers the SDP [RFC4566] parameter 'multicast-join' for the 'rtcp-xr' attribute in the RTCP XR SDP Parameters Registry.

14. Acknowledgments

TBC.

15. Change Log

15.1. draft-versteeg-avt-rapid-synchronization-for-rtp-01

The following are the major changes compared to version 00:

o The core of the rapid synchronization method is now payload-independent. However, the draft still defines the payload-specific messages that are required for enabling rapid synchronization for RTP flows carrying MPEG2-TS.

o RTCP APP packets have been removed, and new RTCP transport-layer and payload-specific feedback messages have been defined.

o The step for leaving the current multicast session has been removed from Section 6.2.

o A new RTCP XR (Multicast Join) report has been defined.

o The IANA Considerations section has been updated.

o Editorial changes to clarify several points.

16. References

16.1. Normative References

[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

[RFC3388] Camarillo, G., Eriksson, G., Holler, J., and H. Schulzrinne, "Grouping of Media Lines in the Session Description Protocol (SDP)", RFC 3388, December 2002.

[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, "Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July 2006.

[RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
Hakenberg, "RTP Retransmission Payload Format", RFC 4588, July 2006.

[I-D.ietf-avt-rtcpssm] Ott, J., "RTCP Extensions for Single-Source Multicast Sessions with Unicast Feedback", draft-ietf-avt-rtcpssm-17 (work in progress), January 2008.

[I-D.ietf-mmusic-sdp-source-attributes] Lennox, J., Ott, J., and T. Schierl, "Source-Specific Media Attributes in the Session Description Protocol (SDP)", draft-ietf-mmusic-sdp-source-attributes-02 (work in progress), October 2008.

[RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008.

16.2. Informative References

[MPEG2TS] ITU-T H.222.0, "Generic Coding of Moving Pictures and Associated Audio Information: Systems", May 2006.

[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, August 1980.

[I-D.ietf-rmt-pi-norm-revised] Adamson, B., Bormann, C., Handley, M., and J. Macker, "NACK-Oriented Reliable Multicast Protocol", draft-ietf-rmt-pi-norm-revised-07 (work in progress), October 2008.

[I-D.ietf-fecframe-framework] Watson, M., "Forward Error Correction (FEC) Framework", draft-ietf-fecframe-framework-03 (work in progress), October 2008.

[RFC3376] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A. Thyagarajan, "Internet Group Management Protocol, Version 3", RFC 3376, October 2002.

[RFC2250] Hoffman, D., Fernando, G., Goyal, V., and M. Civanlar, "RTP Payload Format for MPEG1/MPEG2 Video", RFC 2250, January 1998.

[RFC3611] Friedman, T., Caceres, R., and A. Clark, "RTP Control Protocol Extended Reports (RTCP XR)", RFC 3611, November 2003.

Authors' Addresses

Bill VerSteeg
Cisco Systems
5030 Sugarloaf Parkway
Lawrenceville, GA 30044
USA

Email: billvs@cisco.com

Ali Begen
Cisco Systems
170 West Tasman Drive
San Jose, CA 95134
USA

Email: abegen@cisco.com

Tom VanCaenegem
Alcatel-Lucent Bell
Copernicuslaan 50
Antwerpen, 2018
Belgium

Email: Tom.Van_Caenegem@alcatel-lucent.be

Full Copyright Statement

Copyright (C) The IETF Trust (2008).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgment

Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).