Internet Engineering Task Force                            Ron Frederick
Internet Draft                                                Jay Geagan
Document: draft-periyannan-rtsp-caching-01.txt              Mike Kellner
March 10, 2000                                          Alagu Periyannan
Expires: September 10, 2000                                 Entera, Inc.


          Caching Support in Standards-based RTSP/RTP Servers


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This document presents the issues facing streaming media caching. It
   proposes a set of mechanisms to enable streaming media caching
   between standards-based RTSP/RTP servers and proxies. Streaming
   media caching refers to the process through which streaming content
   is dynamically replicated closer to users so as to provide a better
   viewing experience. A list of RTSP enhancements and open issues is
   presented.

   This document is intended to be a starting point for discussion
   between various parties interested in standardizing the mechanism
   used by RTSP/RTP servers to enable streaming media caching.


Frederick, Geagan, Kellner, Periyannan                               [1]

draft-periyannan-rtsp-caching-01.txt                      March 10, 2000


1. Introduction

   This document presents the issues facing streaming media caching. It
   also proposes a set of mechanisms to enable streaming media caching
   in standards-based streaming servers that use the RTSP [1] and RTP
   [2,3] protocols.

   Streaming media caching refers to the process through which
   streaming content is dynamically replicated closer to users so as to
   provide a better viewing experience. The replication of the content
   is done with the co-operation of the origin RTSP/RTP streaming
   server and a streaming media caching RTSP/RTP proxy that is deployed
   close to the users.

   A streaming caching proxy communicates with the origin server to
   pull the media data into local storage. The communication between
   the proxy and the origin server is achieved via a few enhancements
   to the RTSP protocol. Once the media data is in local storage, the
   proxy serves subsequent clients directly via RTSP/RTP with minimal
   communication with the origin server.

   The next section presents the issues faced when solving the
   streaming media caching problem. The section that follows introduces
   multiple approaches to streaming media caching. The subsequent
   sections present a streaming caching proposal and discuss the open
   issues.

2. Streaming Caching Issues

   A streaming caching architecture should try to solve the following
   issues:

   - Transfer Loss
     A method for caching proxies to create a loss-less copy of the
     media from the origin server. This is a challenge since the media
     is normally carried over UDP.

   - Transformation Loss
     A method for caching proxies to re-create lost information present
     in the original media. This is an issue since the media is
     transformed into RTP packets that may not carry all of the
     information required to originate the media stream.

   - Cache Coherency
     A method for caching proxies to know when the media that is
     replicated locally is stale, i.e. when newer media is available at
     the origin server.

   - Access Accounting
     A robust mechanism for the origin servers to know about hit counts
     and hit durations from the caching proxies.

   - Authorization
     A method for origin servers to authorize caching proxies to serve
     content from cache. The authorization can be on a per-viewing
     basis or on a periodic (say hourly) basis.

   - Copy Protection
     Sufficient safeguards to prevent rogue proxies from making
     unauthorized copies of media content from origin servers. Ideally,
     the cached content must be in a form that is only usable by
     proxies that participate in the access accounting and
     authorization mechanisms. (This is not much of an issue in web
     caching since web pages are not perceived as being as valuable as
     media content.)

   When solving the above issues it must be noted that the
   administrative/ownership domains of the origin servers, proxies and
   clients will not be the same in many cases. Content providers or
   hosting companies will usually own the origin servers; Internet
   service providers will usually own the caching proxies; and
   individual users will own the streaming clients.

   It is also worth noting that not all issues mentioned above need to
   be solved to enable streaming media caching. Some content providers
   may not care about authorization issues. Others may not care about
   copy protection issues. Many of the above issues need not be solved
   in cases where the origin servers and the proxies share the same
   administrative domain.

3. Streaming Caching Approaches

   The transfer of media data between the origin server and proxy can
   happen using various approaches. A few of the possible approaches
   are presented below. Each has its advantages and disadvantages.

   - File Transfer
     The original media file is transferred from the origin server to
     the proxy. Proxies have complete information about the media with
     this approach. However, proxies need to be able to parse the
     various file formats supported by origin servers. Solving the copy
     protection issue is harder in this approach since the original
     media file is transferred to the proxy.

   - Packet Recording
     RTP packets are recorded as they pass between a client and the
     origin server. With this approach there is information loss when
     transforming the media file to packets. There could be additional
     loss through packet loss since RTP is normally carried over UDP.

   - Packet Transfer
     The original media file is not transferred from the origin server
     to the proxy. Instead, the RTP packets generated from the original
     media are transferred over a loss-less protocol. In addition to
     the RTP packets, some "meta" information present in the media file
     but not present in the RTP packets is also transferred. Thus,
     proxies have complete information about the media with this
     approach.

   This document proposes a scheme based on the Packet Transfer
   approach. The scheme provides a solution to many of the issues
   presented in section 2.

4. Proposed Mechanisms for Streaming Caching

4.1 Description

   The proxy contacts the origin server using RTSP and requests that
   the media be streamed via in-band RTP over TCP. The RTP packets are
   sent via TCP in a loss-less fashion, i.e. without skipping packets.

   A new "metachannel" is used to send additional information missing
   within RTP packets, such as transmission time (not presentation
   time), key frame flag, etc. Although some of this information may be
   garnered from within an RTP packet, it will be easier for a proxy to
   process if it is available via the metachannel in an RTP
   payload-format independent manner.

   Copy protection fears are alleviated to some extent since the
   original media file is not transferred to the proxy. A rogue proxy
   or client can already record packets given the current RTSP and RTP
   specifications. The metachannel only adds a few additional pieces of
   information used for serving the media to clients. The packets sent
   to the proxy are not in a form that can be easily manipulated or
   re-authored.

   Once the media is in the proxy's cache, the proxy can serve the
   stream from local cache after a few brief RTSP transactions with the
   origin server. These transactions provide a mechanism for cache
   coherency, access accounting and authorization.

4.2 Caching Procedure

   The following steps describe the interaction between client, proxy
   and server:

   a. Client's DESCRIBE Transaction
      - Client sends a DESCRIBE for an RTSP URL.
      - Proxy passes on the DESCRIBE to the origin server.
      - Origin server replies with the SDP description [4].
      - Proxy sends the DESCRIBE response to the client.

   b. Packet Transfer
      - Proxy compares the Last-Modified header with a previously
        cached copy (if any) to decide whether to re-fetch the media.
      - Proxy sends a SETUP on each RTSP stream with a special field in
        the Transport header requesting a metachannel (described
        below).
      - Proxy sends a PLAY on each RTSP stream and starts transferring
        the RTP packets and metachannel data into local cache.

   c. Client's SETUP Transaction
      - Client sends a SETUP for each RTSP stream.
      - Proxy sends a SETUP to the origin server with a special
        transport type of "fromcache".

   d. Stream Serving from Proxy
      - Proxy acts like a server and interacts with the client to
        stream the media.
      - While the packet transfer operation is proceeding, the proxy
        may decide to serve media to the client in classic non-caching
        pass-through mode.

   e. Client's TEARDOWN Transaction
      - Client sends a TEARDOWN.
      - Proxy sends a TEARDOWN to the origin server.

4.3 Metachannel Format

   The metachannel is used to send additional information that is
   missing within RTP packets. This information is used in conjunction
   with the RTP packet to serve clients from the proxy.

   The RTP/RTCP packets and metachannel packets are interleaved in-band
   within the RTSP TCP connection. A metachannel packet precedes a set
   of RTP packets and carries meta information for those RTP packets.
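
   As an illustration only, the interleaving described above can be
   sketched in Python. The sketch relies on the interleaved binary data
   framing of RTSP [1] (a "$" byte, a one-byte channel identifier, and
   a two-byte length in network byte order). The function names are
   hypothetical, and the channel numbers (0 for RTP, 2 for the
   metachannel) are assumptions taken from the example SETUP message in
   this section. This is not a definitive implementation.

```python
import struct

def split_interleaved(buf: bytes):
    """Split a buffer of RTSP interleaved frames into (channel, payload) pairs.

    Assumes the buffer holds only complete '$'-framed binary frames and no
    RTSP text messages; a real proxy would have to handle both.
    """
    frames = []
    pos = 0
    while pos < len(buf):
        if buf[pos] != 0x24:  # ASCII '$' marks an interleaved frame
            raise ValueError("expected '$' at offset %d" % pos)
        if pos + 4 > len(buf):
            raise ValueError("truncated frame header")
        # One-byte channel id, then two-byte payload length (network order).
        channel, length = struct.unpack_from("!BH", buf, pos + 1)
        payload = buf[pos + 4 : pos + 4 + length]
        if len(payload) != length:
            raise ValueError("truncated frame payload")
        frames.append((channel, payload))
        pos += 4 + length
    return frames

def group_by_meta(frames, rtp_channel=0, meta_channel=2):
    """Pair each RTP packet with the metachannel packet that preceded it.

    Channel numbers are assumptions; they are negotiated in SETUP.
    Packets on other channels (e.g. RTCP) are ignored here.
    """
    groups = []
    current_meta = None
    for channel, payload in frames:
        if channel == meta_channel:
            current_meta = payload
        elif channel == rtp_channel:
            groups.append((current_meta, payload))
    return groups
```

   Because ordering on the single TCP connection is preserved, a proxy
   can associate meta information with RTP packets purely by position
   in the byte stream, which is why the sketch keeps no timestamps or
   sequence numbers for the pairing.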

   The following example SETUP message is used to request RTP over TCP
   with a metachannel (and RTP over UDP as the alternate transport):

      SETUP rtsp://foo.com/bar.mov/trackID=3
      Transport: rtp/avp/tcp;interleaved=0-1;x-metachannel=2,
                 rtp/avp;unicast;client_port=7000-7001

   The RTP and RTCP packets are sent in-band on RTSP channel IDs 0 and
   1, respectively. The "x-metachannel=2" parameter specifies that the
   metachannel is sent in-band on RTSP channel ID 2.

   A metachannel packet contains an aggregation of stackable subpackets
   similar to what is done in RTCP. Like RTCP, these stackable
   subpackets contain a fixed header with a type and length followed by
   a variable amount of data depending on the type. Also like RTCP,
   subpackets always end on a 32-bit boundary.

   The information in a metachannel packet pertains to all RTP packets
   that are sent following the metachannel packet. The packets on the
   metachannel must be interleaved appropriately on channel 2 between
   associated RTP packets on channel 0, i.e. correct ordering and
   placement of packets on channels 0 and 2 is required.

   The fixed header of each metachannel subpacket looks as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     type      |    length     |            data...            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The fields are:

   - type: 8 bits
     One of the metachannel types described below, indicating what
     information is contained in this subpacket.

   - length: 8 bits
     The length of this subpacket in 32-bit words minus one, just like
     the length in RTCP. The offset of one makes zero a valid length
     and avoids a possible infinite loop in scanning a metachannel
     packet, while counting 32-bit words avoids a validity check for a
     multiple of 4.
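
   The scanning of stacked subpackets described above can be sketched
   as follows. This is a minimal Python illustration, not a reference
   parser. The function name is hypothetical, and the sketch assumes
   that the length field counts the two-byte fixed header as part of
   the subpacket, as the RTCP length field does.

```python
import struct

def parse_subpackets(packet: bytes):
    """Walk one metachannel packet and return a list of (type, data) pairs.

    'data' is everything in the subpacket after the two-byte fixed header.
    Assumption: the length field covers the whole subpacket, header included.
    """
    subpackets = []
    pos = 0
    while pos < len(packet):
        if pos + 2 > len(packet):
            raise ValueError("truncated subpacket header")
        sub_type, length_field = struct.unpack_from("!BB", packet, pos)
        # Length is in 32-bit words minus one, so the size is always a
        # non-zero multiple of 4 and the scan always advances.
        size = (length_field + 1) * 4
        if pos + size > len(packet):
            raise ValueError("subpacket runs past end of packet")
        subpackets.append((sub_type, packet[pos + 2 : pos + size]))
        pos += size
    return subpackets
```

   Since a length field of zero still yields a four-byte subpacket, the
   loop cannot stall on malformed input, which is the property the
   "minus one" encoding is meant to provide.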

   The following types are currently defined for metachannel
   subpackets:

   Type 1: Sample/Frame Info (8 bytes)

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    type=1     |   length=1    |        MBZ        |R|C|B|P|I|T|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   transmission time offset                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   This information describes the sample/frame contained in the RTP
   packets that follow. The fields here are as follows:

   required (R): 1 bit
      This frame is a "required" frame. It is required to decode all
      frames after this frame (until the next required frame). This
      flag is useful at proxies when implementing trick modes.

   copy (C): 1 bit
      This frame is an exact copy of a previously sent frame. This flag
      is useful at proxies to reduce storage.

   B-frame (B): 1 bit
      This is a B-frame, i.e. an inter-coded frame with forward and
      reverse frame-differencing. This information could also be
      obtained by parsing the RTP data. However, that would require
      RTP payload-specific knowledge at the proxy.

   P-frame (P): 1 bit
      This is a P-frame, i.e. an inter-coded frame with reverse
      frame-differencing only.

   I-frame (I): 1 bit
      This is an I-frame, i.e. a key frame or intra-coded frame.

   transmission time valid (T): 1 bit
      This indicates that the transmission time offset field following
      this bit is valid. If this bit is zero, the transmission time
      offset value should be ignored and caching servers should use
      their own pacing/smoothing algorithms (if any) for scheduling the
      transmission time of packets.

   transmission time offset: 32 bits
      This is a signed 32-bit number holding the transmission time of
      each following RTP packet relative to its RTP timestamp. The
      timescale of this time is the same as the timescale of the RTP
      timestamp.
      It can be used by caching servers to match the origin server's
      transmission schedule of the packets without having to analyze
      the content.

   Type 2: Stream Info

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    type=2     |   length=0    |           MBZ           |A|xpt|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   This information describes the attributes of the stream. It is sent
   at the beginning of the stream prior to sending any RTP packets or
   other metachannel packets. The fields are as follows:

   send-all (A): 1 bit
      This bit specifies that a caching server MUST send all the
      packets in the stream for a receiver to be able to decode it
      correctly. The caching server should not perform any kind of
      "thinning" operation on the stream, even if that means it is
      unable to meet real-time delivery requirements.

   transport preference (xpt): 2 bits
      These bits specify the preferred transport for this stream.
      Possible values are:

      00: No preference. Use whichever transport clients prefer most.

      01: Prefer real-time. Prefer a transport that allows for smooth
          real-time playback at the possible expense of reliability.
          For instance, given the option of RTP over TCP and RTP over
          UDP, choose UDP.

      10: Prefer reliable. Prefer a transport that provides reliability
          at the possible expense of real-time delivery. For instance,
          given the same choices as above, choose TCP.

      11: Require a reliable transport. If the client only offers
          unreliable transport options, refuse to serve the stream.

   Bits marked MBZ are reserved for future use. They MUST be set to
   zero by the sender and ignored by the receiver.

5. RTSP Enhancements Required

   The following changes need to be made to RTSP servers:

   - Last-Modified header usage in DESCRIBE
     This header is required for cache coherency checking and is
     already part of the RTSP specification. It is mentioned in this
     section because it is not a required header in an RTSP-compliant
     implementation.

   - SETUP method with a special transport type of "fromcache"
     This is required for access accounting and authorization.

   - New metachannel
     This is required to get around "transformation loss".

6. Open Issues

   - Copy protection
     Copy protection fears are alleviated to some extent but not
     totally solved. One could argue that this scheme provides a "hook"
     in origin servers to create perfect copies of media content.
     However, the media sent to the proxy is not in a form that can be
     easily manipulated or re-authored.

   - Enforcement of access accounting and authorization
     Access accounting and authorization with the origin server is not
     enforced, i.e. proxies can serve content to clients without
     communicating with the origin servers (except the first time).

7. Security Considerations

   See open issues.

8. Intellectual Property Notice

   The IETF and its members are hereby notified that Entera Corporation
   claims certain intellectual property rights, including patent
   rights, in regard to some or all of the specifications contained in
   this document. Should this specification be adopted by the IETF,
   Entera will agree to grant licenses to these rights to interested
   parties on reasonable and nondiscriminatory terms. For more
   information contact Entera Corporation.

9. References

   [1] Schulzrinne, H., Rao, A., Lanphier, R., "Real Time Streaming
       Protocol (RTSP)," RFC 2326, April 1998.

   [2] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
       "RTP: A transport protocol for real-time applications," RFC
       1889, January 1996.

   [3] Schulzrinne, H., "RTP Profile for Audio and Video Conferences
       with Minimal Control," RFC 1890, January 1996.

   [4] Handley, M., Jacobson, V., "SDP: Session Description Protocol,"
       RFC 2327, April 1998.

10. Authors' Addresses

   Ron Frederick
   Entera, Inc.
   Email: ronf@entera.com

   Jay Geagan
   Entera, Inc.
   Email: j@entera.com

   Mike Kellner
   Entera, Inc.
   Email: m@entera.com

   Alagu Periyannan
   Entera, Inc.
   Email: alagu@entera.com

   Entera, Inc.
   40971 Encyclopedia Circle
   Fremont CA 94538
   Phone: +1 510 770 5200
   URL: http://www.entera.com