INTERNET DRAFT                                             12 July 2000


Expires: 12 January 2001

                                                            Sean Sheedy
                                                      nCUBE Corporation


                             RTSP Extensions:
            Additional Transports and Performance Enhancements


                   draft-sheedy-mmusic-rtsp-ext-01.txt


                           Status of this memo

          This document is an Internet-Draft and is in full
          conformance with all provisions of Section 10 of
          RFC2026.

          Internet-Drafts are working documents of the
          Internet Engineering Task Force (IETF), its areas,
          and its working groups.  Note that other groups may
          also distribute working documents as Internet-
          Drafts.

          Internet-Drafts are draft documents valid for a
          maximum of six months and may be updated, replaced,
          or obsoleted by other documents at any time.  It is
          inappropriate to use Internet-Drafts as reference
          material or to cite them other than as "work in
          progress."

          The list of current Internet-Drafts can be accessed
          at http://www.ietf.org/ietf/1id-abstracts.txt

          The list of Internet-Draft Shadow Directories can
          be accessed at http://www.ietf.org/shadow.html.


                                 Abstract

          This document proposes enhancements to the RTSP
          protocol for broadcast quality non-IP based video-
          on-demand applications.  Additional transports for
          non-IP delivery of media streams are proposed,
          along with control extensions to reduce latency.
          These proposals are based on nCUBE Corporation's
          and Oracle Corporation's experience with their
          existing media servers.


1  Introduction

   nCUBE Corporation has developed a media server using the RTSP
   standard [1] for its video-on-demand (VOD) platform, the MediaCUBE
   4.  The platform is designed for large-scale deployments of
   broadcast quality interactive video.  It is being used currently in
   several commercial deployments worldwide, with many more deployments
   scheduled in the near future.

   nCUBE's experience to date with the RTSP protocol has been positive.
   The basic protocol is flexible enough to work in a large-scale,
   high-bandwidth environment.  The HTTP-like syntax has proven easy
   for client developers to implement.  The flexibility provided by the
   syntax and the facilities for extensions have proven invaluable in
   deploying RTSP in an environment somewhat different from that for
   which it was originally designed.

   A typical broadcast quality environment differs from the Internet
   environment in several ways:

      - Transports and lower transports

        Although IP protocols are often used for control connections,
        broadcast quality video-on-demand installations often do not
        use IP protocols (such as UDP and RTP) for the actual delivery
        of a presentation (which is usually an aggregation of media
        streams).  Typically, MPEG-2 transport streams are carried on a
        lower transport that natively supports MPEG-2, such as AAL5
        (over ATM) or QAM.

      - Multiplexing RTSP clients

        Supporting non-RTSP clients (e.g., many currently available set
        top boxes) requires a bridging server that speaks the client's
        native protocol and, in turn, acts as an RTSP client of the
        media server.  Such bridging servers typically make transport
        address and bandwidth assignments for the clients, and often
        need to coordinate these decisions with external hardware
        devices such as QAM modulators and up converters.

      - Limited capability clients

        Video-on-demand clients in the home (such as specialized set
        top boxes) are under tremendous price pressures.  Consequently,
        their capabilities are often much more limited than even low-
        end general-purpose computers.  Memory is typically very
        limited (on the order of 8 megabytes), and the media streams
        are discarded immediately once they have been decoded.  Many
        hardware decoders are sensitive to timing jitter and
        discontinuities in video or audio elementary streams.


      - Latency

        Low latency, particularly for stream control requests such as
        pause or fast forward, is critical to the satisfaction of many
        home users of video-on-demand services.  End users of a home
        VOD service are a cross section of cable television customers,
        and are often not computer savvy.  Their standard of comparison
        for the responsiveness of a media server is their VCR or DVD
        player, not a computer web browser accessing the Internet.

   nCUBE Corporation has added some extensions to the RTSP standard
   that address the requirements of this different environment [2].
   These enhancements were developed in collaboration with Oracle
   Corporation, which has also incorporated similar features in its
   RTSP server [3].  The remainder of this document proposes a set of
   enhancements to the RTSP standard based on both of these
   implementations.

   This document does not address extensions to the SDP standard [4]
   that may also be required.

2  Transports and Lower Transports

   Other media delivery mechanisms besides RTP are used in many
   commercial video-on-demand deployments.  To support this, new
   transports and lower transports in addition to the current standard
   RTP and UDP are needed.

   The names and address syntax for these transports and lower
   transports are intended to match those in proposed extensions to SDP
   syntax [5].

2.1  Transports

   MPEG-2 is used extensively for high bandwidth video [6].  An
   enhancement to the "transport-protocol" field in the "Transport"
   header to support this is:

        MP2T   MPEG-2 Transport

   The new syntax of "transport-protocol" would then be:

       transport-protocol = "RTP" | "MP2T"

2.2  Lower Transports

   Similarly, TCP and UDP are not always used as lower transports.
   Enhancements to the "lower-transport" field are:

        AAL5   ATM Adaptation Layer 5


        ASI    DVB Asynchronous Serial Interface

        QAM    Quadrature Amplitude Modulation

   The new syntax of "lower-transport" would then be:

       lower-transport = "TCP" | "UDP" | "AAL5" | "ASI" | "QAM"

   (Note that end user RTSP clients typically don't request a DVB-ASI
   lower transport.  This is primarily used by bridging servers that
   are also controlling external hardware such as QAM modulators.)
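
   As an illustration only, the extended "transport-protocol" and
   "lower-transport" values could be checked as in the following
   Python sketch.  The function name is hypothetical.  Falling back to
   UDP when no lower transport is given follows the RFC 2326 default
   for RTP/AVP; applying that default to every transport is an
   assumption of this sketch, not something this document specifies.

```python
# Hypothetical helper; the value sets mirror the proposed syntax above.
TRANSPORT_PROTOCOLS = {"RTP", "MP2T"}
LOWER_TRANSPORTS = {"TCP", "UDP", "AAL5", "ASI", "QAM"}


def parse_transport_spec(spec):
    """Split 'protocol/profile[/lower-transport]' into a 3-tuple."""
    parts = spec.split("/")
    if len(parts) not in (2, 3):
        raise ValueError("expected protocol/profile[/lower-transport]")
    protocol, profile = parts[0], parts[1]
    # Assumption: default to UDP when the lower transport is omitted.
    lower = parts[2] if len(parts) == 3 else "UDP"
    if protocol not in TRANSPORT_PROTOCOLS:
        raise ValueError("unknown transport protocol: " + protocol)
    if lower not in LOWER_TRANSPORTS:
        raise ValueError("unknown lower transport: " + lower)
    return protocol, profile, lower


print(parse_transport_spec("MP2T/AVP/QAM"))
```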

2.3  Destinations

   To handle the additional lower transport types, the syntax of the
   "destination" transport parameter needs to be enhanced.  In
   particular, many lower transports described in section 2.2 use
   addresses that are not globally unique, but are unique only within a
   particular physical channel.  The destination "address" field should
   contain an optional identifier string at the beginning to allow
   sufficiently intelligent clients (such as bridging servers) to
   disambiguate between physical channels.

   The formal syntax for the expanded "destination" addresses is:

       address      = [ id-string ":" ] type-address

       type-address = host
                    | atm-address
                    | qam-address

       id-string    = 1*( ALPHA | DIGIT | "_" )

   (Note that QAM and DVB-ASI addressing are identical, and both are
   covered by the "qam-address" rule.)

   The AVP profile is used with all of these lower transports.  The
   behavior specified by the AVP profile is defined by existing
   standards, and depends upon the lower transport type:

        AAL5   The ITU H.222.1 standard for MPEG delivery over
               ATM [7]

        ASI    The Digital Video Broadcasting - Cable standard [8]

        QAM    The Digital Video Broadcasting - Cable standard [8]

2.3.1  ATM Address

   The destination address for ATM may be either an ATM permanent
   virtual circuit address or an ATM switched virtual circuit address:


       atm-address = atm-pvc-address
                   | atm-svc-address


2.3.1.1 ATM PVC Address

   The destination address for an ATM permanent virtual circuit is the
   VPI and VCI of the client, and the port number on the server of the
   physical trunk that connects to the client.  Numbers may be
   specified in either decimal or hexadecimal (preceded by "0x"):

       atm-pvc-address = atm-port "/" vpi "/" vci

       atm-port        = 1*5(DIGIT)
                       | "0x" 1*4(HEX)
       vpi             = 1*5(DIGIT)
                       | "0x" 1*4(HEX)
       vci             = 1*5(DIGIT)
                       | "0x" 1*4(HEX)

   For example, to use ATM permanent virtual circuits a client may
   specify a Transport header like the following:

       Transport: MP2T/AVP/AAL5;unicast;destination=0/0/40
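
   A receiver might decode such a PVC address along the lines of the
   following Python sketch.  The function names are hypothetical, and
   the sketch accepts only a lowercase "0x" prefix, which is an
   assumption about how strictly the prefix is matched.

```python
import re

# Each field is either "0x" plus 1-4 hex digits or 1-5 decimal digits.
_NUMBER = r"(0x[0-9A-Fa-f]{1,4}|[0-9]{1,5})"
_PVC = re.compile(f"{_NUMBER}/{_NUMBER}/{_NUMBER}")


def _to_int(text):
    """Convert one field, honouring the "0x" hexadecimal prefix."""
    if text.startswith("0x"):
        return int(text[2:], 16)
    return int(text, 10)


def parse_atm_pvc(address):
    """Return (atm_port, vpi, vci) as integers."""
    match = _PVC.fullmatch(address)
    if match is None:
        raise ValueError("not an ATM PVC address: " + address)
    port, vpi, vci = (_to_int(field) for field in match.groups())
    return port, vpi, vci


print(parse_atm_pvc("0/0/40"))
```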

2.3.1.2 ATM SVC Address

   The destination address for an ATM switched virtual circuit is the
   20-byte network service access point (NSAP) address, specified as 40
   hex characters without a "0x" prefix.  Optionally, dots may be
   included after 16-bit fields, with the first dot following an 8-bit
   field:

       atm-svc-address = 40*40(HEX)
                       | 2*2(HEX) 9*9( "." 4*4(HEX) ) "." 2*2(HEX)

   For example, to use ATM switched virtual circuits a client may
   specify a Transport header like the following:

       Transport: MP2T/AVP/AAL5;unicast;
                  destination=47000580ffe1000000f21a360b00204821490f01
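
   Both spellings of an SVC address describe the same 20 bytes (40
   hex characters, per the prose above), so a receiver could
   normalize them as in this Python sketch.  The function name is
   hypothetical.

```python
import re

_HEX = "[0-9A-Fa-f]"
# Plain form: 40 hex characters with no separators.
_SVC_PLAIN = re.compile(_HEX + "{40}")
# Dotted form: 8-bit field, nine 16-bit fields, then a final 8-bit field.
_SVC_DOTTED = re.compile(
    _HEX + r"{2}(?:\." + _HEX + r"{4}){9}\." + _HEX + "{2}"
)


def parse_atm_svc(address):
    """Return the 20-byte NSAP address as a bytes object."""
    if _SVC_PLAIN.fullmatch(address) or _SVC_DOTTED.fullmatch(address):
        return bytes.fromhex(address.replace(".", ""))
    raise ValueError("not an ATM SVC address: " + address)


plain = parse_atm_svc("47000580ffe1000000f21a360b00204821490f01")
dotted = parse_atm_svc("47.0005.80ff.e100.0000.f21a.360b.0020.4821.490f.01")
print(len(plain), plain == dotted)
```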

2.3.2  QAM and DVB-ASI Addresses

   The destination address for QAM or DVB-ASI is a server-specific
   channel number (note that this is not the RF channel number) and
   MPEG-2 program number specified in decimal:

       qam-address    = channel-number "." program-number

       channel-number = 1*3(DIGIT)
       program-number = 1*5(DIGIT)


   For example, to use QAM a client may specify a Transport header like
   the following:

       Transport: MP2T/AVP/QAM;unicast;destination=cim00:0.75

   For example, to use DVB-ASI a client may specify a Transport header
   like the following:

       Transport: MP2T/AVP/ASI;unicast;destination=dac00:0.75
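
   The optional identifier string and the "qam-address" rule can be
   matched together.  The following Python sketch is one way to do
   so; the function name is hypothetical.

```python
import re

# [ id-string ":" ] channel-number "." program-number
_QAM = re.compile(
    r"(?:(?P<id>[A-Za-z0-9_]+):)?"
    r"(?P<channel>[0-9]{1,3})\.(?P<program>[0-9]{1,5})"
)


def parse_qam_destination(address):
    """Return (id_string or None, channel_number, program_number)."""
    match = _QAM.fullmatch(address)
    if match is None:
        raise ValueError("not a QAM/DVB-ASI address: " + address)
    return (
        match.group("id"),
        int(match.group("channel")),
        int(match.group("program")),
    )


print(parse_qam_destination("cim00:0.75"))
```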

2.4  Client Identification

   For most video-on-demand environments, clients cannot be allowed to
   specify a transport destination address.  In non-IP delivery
   environments, they typically do not have sufficient knowledge of the
   network topology to properly specify an address.  In all
   environments, allowing clients to choose an address presents
   security problems.  Further, in many non-IP delivery environments
   (such as cable systems using QAM and DOCSIS), valid transport
   addresses cannot be derived from the IP address of the client.

   To resolve these problems, the client must be able to identify
   itself to the media server.  This can be accomplished by adding a
   new Transport parameter, "client".  The argument to "client" is a
   deployment-specific string that uniquely identifies a client.
   Identity information included in the string may be, for example, a
   smart card ID for a set top box and the optical node to which the
   set top box is connected.

   The formal syntax for the "client" parameter is:

       client    = "client" "=" client-id
       client-id = token
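
   Because "client-id" is a token, a client can check its identifier
   before composing the Transport header.  The Python sketch below
   shows one way to do this.  The function name and the identifier
   "stb0042.node17" are hypothetical; the character class follows the
   RFC 2326 definition of "token" (any CHAR except CTLs and
   tspecials).

```python
import re

# Printable ASCII minus the tspecials: ( ) < > @ , ; : \ " / [ ] ? = { }
_TOKEN = re.compile(r"[!#$%&'*+\-.0-9A-Z^_`a-z|~]+")


def client_param(client_id):
    """Format the proposed "client" Transport parameter."""
    if _TOKEN.fullmatch(client_id) is None:
        raise ValueError("client-id must be a token: " + repr(client_id))
    return "client=" + client_id


# Hypothetical deployment-specific identifier.
header = "Transport: " + ";".join(
    ["MP2T/AVP/QAM", "unicast", client_param("stb0042.node17")]
)
print(header)
```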

3  Reuse of Transports

   In some environments, such as commercial ATM or QAM deployments,
   transport properties do not change from presentation to
   presentation, and setting up sessions is expensive, in terms of both
   server loading and client perceived latency.  The Web browsing model
   of creating a transport for each presentation works well in many
   Internet delivery environments.  In a non-IP delivery environment
   with dedicated media delivery bandwidth, however, using a single
   transport for several sequential presentations provides a better end
   user experience.

   Allowing a single transport to handle multiple sequential
   presentations requires extensions in the following areas:

      - Presentation URIs

      - Transport parameters


3.1  Presentation URI Enhancements

   Reusing an existing transport for different presentations requires a
   mechanism to change the presentation URI, and URI wildcarding to
   allow clients to control playout in cases when the currently active
   presentation on the server is not known precisely.

3.1.1  Changing Presentation URI

   To play a different presentation on an existing transport, the
   client may specify a different presentation URI on a PLAY method
   request than was used in the initial SETUP request.  If a PLAY is
   requested with a different presentation URI than that most recently
   used in the session, the presentation specified by the new URI will
   be played over the existing session's transport.

   For a PLAY request with a new presentation URI to succeed,
   sufficient bandwidth must already be available in the existing
   transport.  This can be reserved with an extension transport
   parameter on the initial SETUP of the session ("bandwidth",
   described in section 3.2), or can be allocated with a new SETUP
   request.

   Changing a presentation URI is only allowed if the server supports
   aggregate control of both the current presentation and the new
   presentation.  If this is not the case, the server must respond with
   error 459, "Aggregate Operation Not Allowed".

   If the new URI requested by a client is not a presentation URI, the
   server must respond with error 460, "Only Aggregate Operation
   Allowed".

3.1.2  Wildcard Presentation URI

   If a client uses queued PLAY requests with different URIs, it may
   not be able to determine which presentation is active at any
   particular time.  To handle this case, an asterisk (*) for the URI
   matches whatever presentation URI, if any, is currently active.
   Such a wildcard asterisk is legal only for the following methods:

       PLAY
       PAUSE
       TEARDOWN
       GET_PARAMETER
       SET_PARAMETER

   Wildcard URIs may not be used with SETUP requests.  If a client
   uses a wildcard URI in a SETUP request, the server must respond with
   an error 404, "Not Found".
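
   A server-side check for wildcard URIs might look like the
   following Python sketch.  The function name is hypothetical.  This
   document names a status code only for SETUP; the 400 returned for
   other methods is an assumption of the sketch.

```python
WILDCARD_METHODS = frozenset(
    {"PLAY", "PAUSE", "TEARDOWN", "GET_PARAMETER", "SET_PARAMETER"}
)


def resolve_request_uri(method, request_uri, active_uri):
    """Return (status_code, effective_uri) for a session request.

    active_uri is the presentation URI currently active in the
    session, or None if nothing is active.
    """
    if request_uri != "*":
        return 200, request_uri
    if method == "SETUP":
        return 404, None  # "Not Found", as required above
    if method not in WILDCARD_METHODS:
        # Assumption: no status is specified for this case.
        # (RFC 2326 separately permits "*" on OPTIONS; not handled here.)
        return 400, None
    return 200, active_uri


print(resolve_request_uri("PAUSE", "*", "rtsp://example.com/movie"))
```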


3.1.2.1 URI Response Header

   In a response to a request containing a wildcard URI, the server
   must include a URI header in its response to indicate the actual
   presentation URI affected by the request.  The syntax of the URI
   response header is:

       URI-Response = "URI" ":" absolute_URI

3.2  bandwidth Transport Parameter

   A client may use the "bandwidth" Transport parameter to reserve
   bandwidth for a transport.  Its argument is a decimal number
   specifying the bandwidth to reserve in bits per second.  If no
   bandwidth parameter is given, the media server uses the bit rate
   of the presentation specified in the SETUP request's URI as the
   bandwidth of the transport.

   The formal syntax for the "bandwidth" parameter is:

       bandwidth = "bandwidth" "=" 1*DIGIT
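
   The fallback rule can be expressed as in the following Python
   sketch.  The function names and the bit rates used are
   hypothetical, and the parameter splitting is deliberately
   simplified (it does not handle quoted values).

```python
def parse_transport_params(header_value):
    """Split a Transport header value into (spec, {name: value})."""
    parts = [part.strip() for part in header_value.split(";")]
    spec, params = parts[0], {}
    for part in parts[1:]:
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Parameters without "=" (such as "unicast") map to True.
        params[name] = value if sep else True
    return spec, params


def reserved_bandwidth(params, presentation_bit_rate):
    """Return the bits per second to reserve for the transport."""
    text = params.get("bandwidth")
    if text is None:
        # No "bandwidth" parameter: use the presentation's bit rate.
        return presentation_bit_rate
    if not (isinstance(text, str) and text.isascii() and text.isdigit()):
        raise ValueError("bandwidth must be 1*DIGIT")
    return int(text)


spec, params = parse_transport_params("MP2T/AVP/QAM;unicast;bandwidth=6000000")
print(reserved_bandwidth(params, 3750000))
```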

4  PLAY Queue Enhancements

   Requiring all new PLAY requests to be queued when another PLAY
   request is active makes low-latency implementation of fast forward
   and rewind difficult; it requires multiple requests to the media
   server to stop the current PLAY and start the new one.  Further, it
   makes seamless transitions between normal and scaled play
   impossible, since the current PLAY must be stopped, resulting in a
   gap in the media delivery, before the new PLAY can be started.

   Similarly, requiring all PAUSE requests to flush the queue of PLAY
   requests is awkward.  This forces a client to remember and reissue
   all previously queued PLAY requests when it restarts a stream after
   a PAUSE.

   These problems can be resolved by allowing clients to specify the
   type of queuing behavior they desire on each request.  The proposed
   mechanism uses a new header "Queue-Control" with two options for
   specifying queuing behavior.  The syntax is:

       Queue-Control   = "Queue-Control" ":" queue-directive
                         *(";" queue-directive)
       queue-directive = "play-now"
                       | "no-flush"

4.1  play-now Directive

   A client may use the play-now directive with either a SETUP or PLAY
   method.


4.1.1  play-now with PLAY

   When used in a PLAY request, the play-now directive indicates that
   the PLAY operation should be performed immediately rather than
   queuing it.  Using play-now in a PLAY request causes any queued PLAY
   requests to be discarded unless the no-flush directive is also
   included.

4.1.2  play-now with SETUP

   When added to a SETUP request, the play-now directive indicates that
   the client wants streaming to begin immediately (i.e., possibly even
   before the SETUP response is sent to the client). This allows the
   client to avoid waiting for the response from SETUP and then issuing
   a PLAY command, but has some practical limitations.

   The play-now directive with SETUP is not useful in those
   environments where the client requires information contained in the
   SETUP response before it can start decoding the media stream.  For
   example, if a set top box needs the SETUP response to know which
   channel to tune to, it will typically need to issue a separate PLAY
   command after it has tuned to the proper channel.

   If the play-now directive is included in a SETUP request, Range,
   Scale and Speed headers may also be included.

4.2  no-flush Directive

   A client may use the no-flush directive with either a PAUSE or PLAY
   method.  When added to either request, it prevents queued PLAY
   requests from being discarded.
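
   The interaction of the two directives can be modeled with a small
   queue, as in the following Python sketch.  The class and function
   names are hypothetical, and the model tracks only which request is
   current and which are pending; it is not a description of any
   particular server.

```python
from collections import deque

KNOWN_DIRECTIVES = {"play-now", "no-flush"}


def parse_queue_control(value):
    """Parse a Queue-Control header value into a set of directives."""
    directives = {item.strip() for item in value.split(";")}
    unknown = directives - KNOWN_DIRECTIVES
    if unknown:
        raise ValueError("unknown directive(s): " + ", ".join(sorted(unknown)))
    return directives


class PlayQueue:
    def __init__(self):
        self.current = None     # PLAY request being performed
        self.pending = deque()  # queued PLAY requests

    def play(self, request, directives=frozenset()):
        if "play-now" in directives:
            # Perform immediately; flush the queue unless told not to.
            if "no-flush" not in directives:
                self.pending.clear()
            self.current = request
        elif self.current is None:
            self.current = request
        else:
            self.pending.append(request)

    def pause(self, directives=frozenset()):
        if "no-flush" not in directives:
            self.pending.clear()
        self.current = None


queue = PlayQueue()
queue.play("normal")
queue.play("next-title")
queue.play("fast-forward", parse_queue_control("play-now;no-flush"))
print(queue.current, list(queue.pending))
```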

5  Server State Changes

   Most clients need to track the state of the media server while the
   server is streaming.  The most critical state change to clients
   occurs when the media server encounters the end of a presentation
   (or the beginning when rewinding), and stops streaming.  There are
   currently no standard mechanisms for detecting this in the RTSP
   specification.  Problems clients encounter in the current
   architecture include:

      - Polling for the current media server state wastes network
        bandwidth, and introduces unacceptable latencies in detecting
        state transitions.

      - In non-IP delivery environments, the transport typically
        remains allocated even if no media is being delivered.  This
        means that a client cannot watch for the server to close the
        transport to signal the end of media delivery.


      - Watching for the incoming media to stop is unreliable.  Short
        timeouts can trigger a false end of media detection if the
        media flow is temporarily delayed.  Long timeouts introduce
        unacceptable latencies.  Clients are unable to distinguish
        between a normal end of media and an error condition that
        resulted in the media delivery stopping.

   These problems can be remedied by a client callback mechanism.  The
   proposed mechanism uses the ANNOUNCE method sent from the server to
   the client, along with a new header which contains the details of
   media server state transitions.

5.1  ANNOUNCE Callbacks

   If desired by the client, an ANNOUNCE request can be sent
   asynchronously from the server to the client to notify it of any
   changes in a session state.  ANNOUNCE requests are only sent to a
   client if the client used the May-Notify header in its SETUP request
   for the session (section 5.2).  The nature and time of the event
   causing the stream state change are contained in the Notice header
   (section 5.3).

   An ANNOUNCE request will only be sent if the session is currently
   associated with an open persistent connection to the client.  If the
   session is not associated with a connection to the client, the state
   change notification will be returned in the next GET_PARAMETER
   response for the session.

5.2  May-Notify Header

   The May-Notify header may be included in a SETUP request.

   If a client includes the May-Notify header in a SETUP request, the
   server will notify the client asynchronously of any stream state
   changes by sending it an ANNOUNCE request (section 5.1).   If this
   header is not included, state changes are returned to the client as
   part of a GET_PARAMETER response.  In both cases, the state change
   is reported with a Notice header (section 5.3).

5.3  Notice Header

   The Notice header contains media server state change information for
   a session, such as errors encountered during play or reaching the
   end of the stream.  It may only originate from a media server, and
   is not recognized in client requests.  The Notice header is sent
   from the server to a client via either an ANNOUNCE request or a
   GET_PARAMETER response (section 5.1).

   The formal syntax for the Notice header is:

       Notice       = "Notice" ":" notify *("," notify)


       notify       = event-code SP <"> event-phrase <"> SP
                      "event-date" "=" utc-time

       event-code   = 4DIGIT

       event-phrase = *<TEXT, excluding CR, LF, ">

   Event codes and phrases which may be returned by the server are:

        Code   Message

        1103   Stream Stalled

        1104   Stream Resumed

        2101   End-of-Stream Reached

        2103   Transition

        2104   Start-of-Stream Reached

        2306   Continuous Feed Terminated

        4401   Error Reading Media Data

        5201   Server Resources Unavailable

        5401   Stream Failure

        5402   Session Terminated by Server

        5403   Server Shutting Down

        5501   Internal Server Error
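
   A client could extract the events from a Notice header as in the
   following Python sketch.  The function name and the sample
   timestamp are hypothetical.  The sketch assumes the RFC 2326
   "utc-time" form (YYYYMMDD "T" HHMMSS [ "." fraction ] "Z"), drops
   any fractional seconds, and silently skips malformed entries.

```python
import re
from datetime import datetime, timezone

# event-code SP <"> event-phrase <"> SP "event-date" "=" utc-time
_NOTIFY = re.compile(
    r'([0-9]{4}) "([^"\r\n]*)" '
    r"event-date=([0-9]{8}T[0-9]{6}(?:\.[0-9]+)?Z)"
)


def parse_notice(value):
    """Return a list of (event_code, event_phrase, event_time)."""
    events = []
    for code, phrase, stamp in _NOTIFY.findall(value):
        # The first 15 characters are YYYYMMDD "T" HHMMSS.
        when = datetime.strptime(stamp[:15], "%Y%m%dT%H%M%S")
        events.append((int(code), phrase, when.replace(tzinfo=timezone.utc)))
    return events


# Hypothetical header value.
sample = '2101 "End-of-Stream Reached" event-date=20000712T103000Z'
for event in parse_notice(sample):
    print(event)
```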

6  Miscellaneous

6.1  Reason Header

   A client may wish to inform the server why it has chosen to tear
   down a session.  This is often useful in diagnosing server or
   network problems.  This is accomplished with the Reason header.  The
   Reason header is only valid in TEARDOWN requests.

   How much of the Reason header message is saved by the media server,
   or whether the message is saved at all, is up to the discretion of
   the media server.  Implementers of media servers should place limits
   on the message length and message frequency to prevent the Reason
   header from being used in denial-of-service attacks.

   The formal syntax of the Reason header is:


       Reason        = "Reason" ":" reason-phrase
       reason-phrase = *<TEXT, excluding CR, LF, ">

6.2  Looping Ranges

   Continuous looping play of a presentation is a frequent requirement
   in commercial environments.  This is typically used for movie
   trailers, etc.

   To support this, the Range header can be enhanced to allow clients
   to ask the media server to continuously loop a presentation.  The
   formal syntax of the extended Range header is:

       Range        = "Range" ":" 1#ranges-specifier *(range-option)
       range-option = ";" "time" "=" utc-time
                    | ";" "loop" [ "=" loop-count ]
       loop-count   = 1*DIGIT

   Adding the "loop" option to a Range header causes the specified
   range within the media to loop for "loop-count" iterations, or
   forever if no "loop-count" is specified.

   A PAUSE request or another PLAY request for the session will stop
   the looping.  A PAUSE request will terminate the loop immediately.
   A queued PLAY request (without the play-now directive, section 4.1)
   will terminate the loop at the end of the current iteration.  A PLAY
   request with the play-now directive will terminate the loop
   immediately.
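
   The "loop" option could be read as in the following Python sketch.
   The function name and the sample range are hypothetical, and the
   sketch treats everything before the first ";" as a single opaque
   ranges-specifier.  Reporting one iteration when "loop" is absent
   is an assumption made for convenience.

```python
import math


def parse_range(value):
    """Parse a Range header value with the proposed range options."""
    parts = [part.strip() for part in value.split(";")]
    result = {"range": parts[0], "time": None, "iterations": 1}
    for option in parts[1:]:
        name, sep, arg = option.partition("=")
        if name == "loop" and not sep:
            # No loop-count: loop forever.
            result["iterations"] = math.inf
        elif name == "loop" and arg.isascii() and arg.isdigit():
            result["iterations"] = int(arg)
        elif name == "time" and sep:
            result["time"] = arg
        else:
            raise ValueError("bad range-option: " + option)
    return result


print(parse_range("npt=0-30;loop=3"))
```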

6.3  Additional Status Codes

   The following two standard status codes should be added:

        Code   Message

        463    Destination Required

        464    Unable to Visual Scan

   Code 463 indicates that the media server was unable to select an
   appropriate transport destination address for the client, and that
   the client must supply one explicitly.  It may only be returned in
   SETUP responses.

   Code 464 may only be returned in response to a PLAY request that
   includes a Scale value other than 1.  It indicates that the server
   is unable to stream the media at a rate other than normal speed
   forward.  This may be a temporary condition caused, for example, by
   unusually heavy loading on the media server.  It may also be a
   permanent condition due, for example, to media encoding limitations
   or media server policy.


6.4  Stream Parameters

   Standard parameters need to be defined for the GET_PARAMETER method
   to be generally useful.  Proposed standard parameters are:

        state      The current server protocol state.  Possible
                   returned values are:

                        playing
                        ready

        position   The current stream position. The position is
                   the number of seconds from the beginning of
                   the media, in npt format.

        scale      The current stream scale.

Appendix A: Author's Address

   Sean Sheedy
   nCUBE Corporation
   1825 NW 167th Place
   Beaverton, OR  97006
   USA

   E-mail: seans@ncube.com

References

1. Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming
   Protocol (RTSP)", RFC 2326, April 1998.

2. nCUBE Corporation, "nCUBE RTSP Implementation and Extensions",
   January 2000.

3. Oracle Corporation, "Custom Video Client Developer's Guide, Release
   3.2", September 1999.

4. Handley, M., and V. Jacobson, "SDP: Session Description Protocol",
   RFC 2327, April 1998.

5. Kumar, R. and M. Mostafa, "Conventions for the use of the Session
   Description Protocol (SDP) for ATM Bearer Connections", draft-
   rajeshkumar-mmusic-sdp-atm-02.txt, July 2000.

6. International Telecommunication Union, "Generic Coding of Moving
   Pictures and Associated Audio Information: Systems", H.222.0, July
   1995.

7. International Telecommunication Union, "Multimedia Multiplex and
   Synchronization for Audiovisual Communication in ATM Environments",
   H.222.1, March 1996.


8. European Telecommunications Standards Institute, "Digital Video
   Broadcasting: Framing Structure, Channel Coding and Modulation For
   Cable Systems", EN 300 429, October 1997.

