INTERNET-DRAFT                                                J. Lazzaro
July 7, 2005                                                J. Wawrzynek
Expires: January 7, 2006                                     UC Berkeley


        Requirements for a Stage and Studio Multimedia Framework

        <draft-lazzaro-mmusic-stage-studio-requirements-00.txt>


Status of this Memo

By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware have
been or will be disclosed, and any of which he or she becomes aware
will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups.  Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

This Internet-Draft will expire on January 7, 2006.

Copyright Notice

Copyright (C) The Internet Society (2005).  All Rights Reserved.














Lazzaro/Wawrzynek                                               [Page 1]

INTERNET-DRAFT                                               7 July 2005


                                Abstract

     Is the IETF multimedia stack appropriate for use in the digital
     audio equipment found in recording studios and concert halls?
     To help answer this question, this memo lists the requirements
     for a session management framework for stage and studio
     devices.




                           Table of Contents


1. Introduction  . . . . . . . . . . . . . . . . . . . . . . . . . .   3
2. Discovery . . . . . . . . . . . . . . . . . . . . . . . . . . . .   3
3. Bidirectional Heterogeneous Media Flows . . . . . . . . . . . . .   4
4. Fine-Grained Media Selection and Control  . . . . . . . . . . . .   4
5. Presentation and Capture Timing . . . . . . . . . . . . . . . . .   5
6. Sample Accurate Signal Processing . . . . . . . . . . . . . . . .   5
7. Session Chaining  . . . . . . . . . . . . . . . . . . . . . . . .   6
8. Multicast Support . . . . . . . . . . . . . . . . . . . . . . . .   7
9. Discussion  . . . . . . . . . . . . . . . . . . . . . . . . . . .   7
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . .   7
11. Security Considerations  . . . . . . . . . . . . . . . . . . . .   7
12. IANA Considerations  . . . . . . . . . . . . . . . . . . . . . .   7
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . .   8
     13.1 Normative References . . . . . . . . . . . . . . . . . . .   8
     13.2 Informative References . . . . . . . . . . . . . . . . . .   8
14. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . .   8
15. Intellectual Property Rights Statement . . . . . . . . . . . . .   9
16. Full Copyright Statement . . . . . . . . . . . . . . . . . . . .   9
17. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . .  10


















Lazzaro/Wawrzynek                                               [Page 2]

INTERNET-DRAFT                                               7 July 2005


1.  Introduction

Digital technology has made a deep impact on how contemporary music is
performed and how all audio content is produced.  Microprocessors are
ubiquitous on stage and in the recording studio: a few in personal
computers, but most in embedded systems.

However, Internet technologies have not yet truly hit the stage and
studio world.  Digital media flows between computers largely occur via
USB, Firewire, and specialized digital transports (S/PDIF and AES
synchronous protocols, and customized versions of Ethernet).

Why hasn't the IETF content-streaming protocol suite (RTP [1] and its
payload formats, SDP [2], and RTSP [4]) found a home in this world?
This memo is an attempt to start a discussion on this topic.  We list
the requirements for a framework for stage and studio applications.

To highlight the challenges of the requirements, an ASSESSMENT heading
in each section discusses how a strawman IETF architecture would handle
the requirement.  Reference [3] describes the strawman architecture in
detail.

In the strawman architecture, stage and studio devices are Real Time
Streaming Protocol (RTSP) servers, and operating systems access devices
via RTSP clients.  RTSP is in widespread use as a session manager for
audio and video content-streaming on the Internet [4].


2.  Discovery

Today, adding a new digital device to a recording studio is (usually)
easy.  Most devices use USB, and are USB class-compliant for audio and
MIDI.  A user connects the new device to a computer using a USB cable,
and the operating system detects its presence.  Applications display
audio and MIDI sources and sinks for all attached devices upon user
request.

This level of automatic discovery is a requirement for a stage and
studio Internet framework.  Users should be able to add a device to a
wired or wireless LAN, and shortly thereafter the audio and MIDI inputs
and outputs of the device should be accessible to applications.  Users
should not have to do anything (install drivers, manually set up
network sessions, etc.) to make this happen.

ASSESSMENT

Automatic discovery at the application level should not be a challenge
for our strawman architecture.  The DNS-SD and Multicast DNS protocols
[5] [6] may be used to advertise services over link-local multicast.
The advertisement will



Lazzaro/Wawrzynek                                               [Page 3]

INTERNET-DRAFT                                               7 July 2005


point to the network address and port for the RTSP server.  The
framework should specify a set of normative RTSP URLs that clients may
access with the DESCRIBE method to discover what the device does and how
to access it.  Depending on the design, the body returned by DESCRIBE
may use the Session Description Protocol, or may use some other new or
existing protocol.
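
As a purely illustrative sketch (the instance name, host name, port,
TXT key, and URL path below are placeholders, and the framework would
need to standardize the actual conventions), a device might advertise
its RTSP server with DNS-SD resource records of the form:

     _rtsp._tcp.local.               PTR  breakout._rtsp._tcp.local.
     breakout._rtsp._tcp.local.      SRV  0 0 554 breakout.local.
     breakout._rtsp._tcp.local.      TXT  "path=/device"

An application could then retrieve a description of the device with a
request such as:

     DESCRIBE rtsp://breakout.local/device RTSP/1.0
     CSeq: 1
     Accept: application/sdp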



3.  Bidirectional Heterogeneous Media Flows

Many stage and studio devices support input and output of several types
of media.  For example, a breakout box might send 8 channels of audio
input onto the network (originating from 8 analog audio input jacks on
the box), receive 8 channels of audio output from the network (which it
would send to 8 analog audio output jacks on the box), along with
several pairs of MIDI input and output jacks.

Support for bidirectional heterogeneous media flows is a requirement
for a stage and studio Internet framework.  Thus, the framework
concerns sessions (bundles of "wires"), not individual "wires" (as
would a framework that provided a "virtual patchbay" service).

ASSESSMENT

Heterogeneous flows are simple in our strawman architecture, as session
descriptions support the synchronized transport of audio, MIDI, and
other media types.

Bidirectional flows are a challenge for our strawman architecture,
because RTSP's SDP session descriptions are recvonly by convention, and
the semantics of control URLs deeply reflect this assumption.
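
For illustration only (the addresses, ports, and payload type numbers
below are placeholders, and the MIDI payload name follows [3]), a
description of the breakout box above might resemble:

     v=0
     o=box 2890844526 2890842807 IN IP4 192.0.2.10
     s=8-in/8-out breakout box
     c=IN IP4 192.0.2.10
     t=0 0
     m=audio 5004 RTP/AVP 96
     a=rtpmap:96 L16/48000/8
     a=sendonly
     m=audio 5006 RTP/AVP 96
     a=rtpmap:96 L16/48000/8
     a=recvonly
     m=audio 5008 RTP/AVP 97
     a=rtpmap:97 rtp-midi/48000
     a=sendrecv

The heterogeneous part of the description is straightforward; the
perspective from which the direction attributes should be interpreted
is precisely the difficulty noted above.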


4.  Fine-Grained Media Selection and Control

Stage and studio devices that use USB or Firewire permit fine-grained
control and selection of media.  For example, an 8-input/8-output USB
breakout box sends descriptive names for each channel for presentation
to the user.  Users may dynamically select which I/O channels should
send or receive media flows (and sample rates and bit-depths) via
applications or operating-system utilities.

Fine-grained media selection and control is a requirement for a stage
and studio Internet framework.






Lazzaro/Wawrzynek                                               [Page 4]

INTERNET-DRAFT                                               7 July 2005


ASSESSMENT

Fine-grained media selection and control are challenges for our strawman
architecture.  RTSP's SDP conventions do provide some tools: clients can
choose to use a subset of offered control URLs, and clients sending an
audio stream can signal the use of one of a small set of sample rate and
bit-depth combinations via the payload type.  However, these tools do
not cover all situations, and the use of these tools for devices with a
large number of inputs and outputs is unwieldy.
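
As a small sketch of the existing tool (payload type numbers are
placeholders), a device could offer a handful of sample rate and
bit-depth combinations for one audio stream as alternative payload
types:

     m=audio 5004 RTP/AVP 96 97 98
     a=rtpmap:96 L16/44100/2
     a=rtpmap:97 L16/48000/2
     a=rtpmap:98 L24/48000/2

However, per-channel descriptive names and the dynamic selection of
individual channels on a many-channel device have no counterpart in
this mechanism, which is the gap noted above.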


5.  Presentation and Capture Timing

Stage and studio users expect to control the presentation time of audio
outputs and the capture time of audio inputs.

More specifically, users expect all audio inputs and outputs from a
breakout box to use the same sample clock.  To support the use of
multiple breakout boxes, users expect a breakout box to accept a remote
sample clock input, and to generate an output sample clock for other
devices to use.  Users expect the input and output latencies of breakout
boxes to be deterministic (within the limits of clock jitter), and
knowable (via signalling, a user manual, or measurement).

Presentation and capture timing control is a requirement for a stage and
studio Internet framework.

ASSESSMENT

These issues are a challenge for our strawman architecture.
Traditionally, RTP considers sender and receiver behaviors in these
respects to be
outside its domain.  RTSP and SDP do not offer tools for signalling this
sort of timing information during session setup.


6.  Sample Accurate Signal Processing

In a breakout box, the input and output flows are independent.  However,
not all stage and studio devices fit the breakout box model.  In some
stage and studio devices, the output flows are produced in response to
input flows.  We refer to these devices as "signal processors".

For example, a reverberation unit is a signal processor: it accepts a
"dry" audio input sample stream and generates a "wet" audio output
sample stream from its input.  As a second example, a music synthesizer
is a signal processor: it accepts a MIDI input stream, and generates an
audio output sample stream from its MIDI input.




Lazzaro/Wawrzynek                                               [Page 5]

INTERNET-DRAFT                                               7 July 2005


With few exceptions, current implementations of signal processor
hardware devices are not "sample-accurate".  In other words, for a
reverb unit, there is no way to know at the transport level that a
particular output audio sample corresponds to a particular input audio
sample.

In current practice, audio engineers work around the lack of sample
accurate transport by estimating a nominal latency for the device.
However, transport-level sample-accurate signal processing is desired by
"change agents" in the stage and studio community, as it would bring
exact repeatability to the studio workflow.

Sample-accurate signal processing is a requirement for a stage and
studio Internet framework.

ASSESSMENT

Signal processors in general, and sample-accurate signal processing in
particular, are challenges for our strawman architecture.  SDP does not
have the semantics for expressing that an output flow depends on an
input flow.  An assessment of RTP's present capability for sample-
accurate signal processing is controversial: some would say that the NTP
timestamps in RTCP packets are sufficient, others would argue for the
need to label particular RTP timestamps with a timecode value (via RTP
or RTCP).
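
To make the two positions concrete, consider a reverb unit whose input
and output streams both carry RTCP sender reports referenced to the
same wallclock (a simplified sketch; the 48 kHz rate and the latency
value D are arbitrary).  If an input sender report maps NTP time T to
RTP timestamp R_in, an output sender report maps the same NTP time T
to RTP timestamp R_out, and the unit's processing latency is exactly
D seconds, then the output sample with RTP timestamp R_out + k was
computed from the input sample with RTP timestamp R_in + k - 48000*D.
The correspondence is only as exact as the knowledge of D, which is
why others argue for an explicit timecode label on RTP timestamps.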


7.  Session Chaining

Stage and studio devices are often connected in serial and parallel
configurations.  For audio devices that support S/PDIF and similar
protocols, devices may be interconnected manually using cables, or
electronically using digital patchbays.  For devices that support USB or
Firewire, a personal computer program usually simulates a patchbay.

In an Internet framework, series and parallel interconnections could be
expressed within the description of a single session.  This
functionality (which we call "session chaining") is a requirement for a
stage and studio Internet framework.

ASSESSMENT

Simple forms of session chaining would not be difficult to add to the
strawman architecture.  Simple session chaining is within the expressive
power of SDP, as the connection information is specified on a per-RTP-
session basis.
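
As a rough sketch of this expressive power (the addresses are
placeholders, and session-level lines are omitted), one description
could direct a first audio stream to a reverb unit and a second audio
stream onward from the reverb unit to a recorder by giving each media
section its own connection line:

     m=audio 5004 RTP/AVP 96
     c=IN IP4 192.0.2.20
     a=rtpmap:96 L16/48000/2
     m=audio 5006 RTP/AVP 96
     c=IN IP4 192.0.2.30
     a=rtpmap:96 L16/48000/2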





Lazzaro/Wawrzynek                                               [Page 6]

INTERNET-DRAFT                                               7 July 2005


8.  Multicast Support

Stage and studio devices that network over shared wires sometimes use
the shared physical fabric for multicasting.  For example, a powered
speaker may have a feed-through port for its audio input, so that a set
of speakers may be driven by a single daisy-chained wire.

The potential use of multicast for media flows is a requirement for a
stage and studio framework.

ASSESSMENT

Multicast is supported by our strawman architecture, as multicast is
within the expressive power of SDP.
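
As a minimal sketch (the multicast address, TTL, and payload type are
placeholders), the daisy-chained speaker example above maps onto a
single session whose connection line names a multicast group that each
speaker joins:

     v=0
     o=mixer 2890844526 2890842807 IN IP4 192.0.2.40
     s=speaker feed
     c=IN IP4 233.252.0.1/127
     t=0 0
     m=audio 5004 RTP/AVP 96
     a=rtpmap:96 L16/48000/2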


9.  Discussion

As an early draft of an individual submission, the requirements listed
above reflect the views of the authors.  One purpose to be served by
making this I-D a working-group item is to elicit feedback from the
stage and studio community, so that the document evolves to represent a
community consensus.

At that point, serious evaluation can begin on the appropriateness of
using the IETF protocol stack (in its original form or an augmented
form) to meet the requirements, and on the level of interest within the
working group in supporting that work.


10.  Acknowledgements

This work is a spin-off of the RTP MIDI work in AVT.  We thank the RTP
MIDI community for insights into the problem domain.


11.  Security Considerations

Security requirements for stage and studio protocols will be added to
later versions of this I-D.


12.  IANA Considerations

None.


13.  References



Lazzaro/Wawrzynek                                               [Page 7]

INTERNET-DRAFT                                               7 July 2005


13.1 Normative References

None.

13.2 Informative References

[1] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
"RTP: A Transport Protocol for Real-Time Applications", RFC 3550, July
2003.

[2] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", draft-ietf-mmusic-sdp-new-22.txt, work in
progress.

[3] Lazzaro, J. and J. Wawrzynek, "An RTP Payload Format for MIDI",
expired version draft-ietf-avt-rtp-midi-format-08.txt, linked at
http://ietfreport.isoc.org/all-ids/draft-ietf-avt-rtp-midi-format-08.txt
The strawman architecture appears in Appendix C.6.2 of that version.
Note that this section (and the strawman architecture) has been removed
from the current version of the RTP MIDI draft.

[4] Schulzrinne, H., Rao, A., and R. Lanphier. "Real Time Streaming
Protocol (RTSP)", RFC 2326, April 1998.

[5] Cheshire, S. and M. Krochmal, "DNS-based Service Discovery",
draft-cheshire-dnsext-dns-sd-03.txt, work in progress, June 2005.

[6] Cheshire, S. and M. Krochmal, "Multicast DNS",
draft-cheshire-dnsext-multicastdns-05.txt, work in progress, June 2005.


14.  Authors' Addresses

John Lazzaro (corresponding author)
UC Berkeley
CS Division
315 Soda Hall
Berkeley CA 94720-1776
Email: lazzaro@cs.berkeley.edu

John Wawrzynek
UC Berkeley
CS Division
631 Soda Hall
Berkeley CA 94720-1776
Email: johnw@cs.berkeley.edu






Lazzaro/Wawrzynek                                               [Page 8]

INTERNET-DRAFT                                               7 July 2005


15.  Intellectual Property Rights Statement

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in this
document or the extent to which any license under such rights might or
might not be available; nor does it represent that it has made any
independent effort to identify any such rights.  Information on the
procedures with respect to rights in RFC documents can be found in BCP
78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an attempt
made to obtain a general license or permission for the use of such
proprietary rights by implementers or users of this specification can be
obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary rights
that may cover technology that may be required to implement this
standard.  Please address the information to the IETF at ietf-
ipr@ietf.org.


16.  Full Copyright Statement

Copyright (C) The Internet Society (2005).  This document is subject to
the rights, licenses and restrictions contained in BCP 78, and except as
set forth therein, the authors retain all their rights.

This document and the information contained herein are provided
on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR
ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the
Internet Society.







Lazzaro/Wawrzynek                                               [Page 9]

INTERNET-DRAFT                                               7 July 2005


17.  Change Log

[Note to RFC Editors: this Appendix, and its Table of Contents listing,
should be removed from the final version of the memo]

Initial release.













































Lazzaro/Wawrzynek                                              [Page 10]