Internet DRAFT - draft-wenger-clue-definitions


CLUE WG                                                        S. Wenger
Internet-Draft                                               Vidyo, Inc.
Intended status: Informational                          October 31, 2011
Expires: May 3, 2012

                            CLUE Definitions


   This document collects terminology and definitions to be used
   consistently among documents produced in the CLUE working group.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 3, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   ( in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Wenger                     Expires May 3, 2012                  [Page 1]
Internet-Draft              CLUE Definitions                October 2011

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . . . 3
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 3
   3.  Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 3
   4.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . . 7
   5.  IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 7
   6.  Security Considerations . . . . . . . . . . . . . . . . . . . . 7
   7.  Informative References  . . . . . . . . . . . . . . . . . . . . 7
   Appendix A.  Draft History  . . . . . . . . . . . . . . . . . . . . 8
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . . . 9

Wenger                     Expires May 3, 2012                  [Page 2]
Internet-Draft              CLUE Definitions                October 2011

1.  Introduction

   This document collects terminology and definitions to be used
   consistently among documents produced in the CLUE working group.  The
   draft is intended to be a living document, to be updated as the
   working group discussions progress.  It is not intended for
   publication as an RFC; instead, it is expected that definitions and
   terminology defined herein are going to be replicated in other CLUE

   All defined terms herein are captitalized.  It is recommended that
   authors of other CLUE document follow this conventions when using
   definitions from this document.

   It should be understood that the definitions herein are not to be
   interpreted as antedating design decisions to be made later.  It's
   entierly reasonable that design choices conflicting with definitions
   in this draft may become necessary.  In such cases, CLUE document
   authors can either request the definition be updated, or can simply
   avoid the use of a definition of this document.  However, an author
   should never use a capitzalied definition as set forth herein with a
   different meaning in a different CLUE document.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Definitions

        Audio Capture: Media Capture for audio.  Denoted as ACn.

        Audio Mixing: refers to the accumulation of scaled audio signals
        to produce a single audio stream.  See RTP Topologies,

        Camera-Left and Right: For media captures, camera-left and
        camera-right are from the point of view of a person observing
        the rendered media.  They are the opposite of stage-left and

        Capture Device: A device that converts audio and video input
        into an electrical signal, in most cases to be fed into a media
        encoder.  Cameras, microphones, and keyboards are examples for
        capture devices.

Wenger                     Expires May 3, 2012                  [Page 3]
Internet-Draft              CLUE Definitions                October 2011

        Capture Scene: the scene that is captured by a collection of
        Capture Devices.  A Capture Scene may be represented by more
        than one type of Media.  A Capture Scene may include more than
        one Media Capture of the same type.  An example of a Capture
        Scene is the video image of a group of people seated next to
        each other, along with the sound of their voices, which could be
        represented by some number of VCs and ACs.  A middle box may
        also express Capture Scenes that it constructs from Media
        streams it receives.

        Capture Set: includes Media Captures that all represent some
        aspect of the same Capture Scene.  The items (rows) in a Capture
        Set represent different alternatives for representing the same
        Capture Scene.

        Conference: used as defined in [RFC4353], A Framework for
        Conferencing within the Session Initiation Protocol (SIP).

        Encoding Group: Encoding group: A set of encoding parameters
        representing a device's complete encoding capabilities or a
        subdivision of them.  Media stream providers formed of multiple
        physical units, in each of which resides some encoding
        capability, would typically advertise themselves to the remote
        media stream consumer as being formed multiple encoding groups.
        Within each encoding group, multiple potential actual encodings
        are possible, with the sum of those encodings' characteristics
        constrained to being less than or equal to the group-wide

        Endpoint: The logical point of final termination through
        receiving, decoding and rendering, and/or initiation through
        capturing, encoding, and sending of media streams.  An endpoint
        consists of one or more physical devices which source and sink
        media streams, and exactly one [RFC4353] Participant (which, in
        turn, includes exactly one SIP User Agent).  In contrast to an
        endpoint, an MCU may also send and receive media streams, but it
        is not the initiator nor the final terminator in the sense that
        Media is Captured or Rendered.  Endpoints can be anything from
        multiscreen/multicamera rooms to handheld devices.

        Endpoint Characteristics: include placement of Capture and
        Rendering Devices, capture/render angle, resolution of cameras
        and screens, spatial location and mixing parameters of
        microphones.  Endpoint characteristics are not specific to
        individual media streams sent by the endpoint.

Wenger                     Expires May 3, 2012                  [Page 4]
Internet-Draft              CLUE Definitions                October 2011

        Front: the portion of the room closest to the cameras.  In going
        towards back you move away from the cameras.

        Individual Encode: A variable with a set of attributes that
        describes the maximum values of a single audio or video capture
        encoding.  The attributes include: maximum bandwidth- and for
        video maximum macroblocks, maximum width, maximum height,
        maximum frame rate.  [Edt. These are based on H.264.]

        Layout: How rendered media streams are spatially arranged with
        respect to each other on a single screen/mono audio telepresence
        endpoint, and how rendered media streams are arranged with
        respect to each other on a multiple screen/loudspeaker
        telepresence endpoint.  Note that audio as well as video is
        encompassed by the term layout--in other words, included is the
        placement of audio streams on loudspeakers as well as video
        streams on video screens.  Edit. note: this is on the RENDERER

        Left: to be interpreted in the context of the description where
        the word occurs."  Edt. note: there has been a lot of mailing
        list discussions about "left" and "right", and we seem not to be
        able to arrive at a common understanding.

        Local: Sender and/or receiver physically co-located ("local",
        "in the same room") in the context of the discussion.

        MCU: Multipoint Control Unit (MCU) - a device that connects two
        or more endpoints together into one single multimedia conference
        [RFC5117].  An MCU includes an [RFC4353] Mixer.  Edt. Note:
        RFC4353 is tardy in requireing that media from the mixer be sent
        to EACH participant.  I think we have practical use cases where
        this is not the case.  But the bug (if it is one) is in 4353 and
        not herein.

        Media: Any data that, after suitable encoding, can be conveyed
        over RTP, including audio, video or timed text.  Edt. note: does
        Media include far end camera control (which can be conveyed over

        Media Capture: a source of Media, such as from one or more
        Capture Devices.  A Media Capture (MC) may be the source of one
        or more Media streams.  A Media Capture may also be constructed
        from other Media streams.  A middle box can express Media
        Captures that it constructs from Media streams it receives.

Wenger                     Expires May 3, 2012                  [Page 5]
Internet-Draft              CLUE Definitions                October 2011

        Media Consumer: an Endpoint or middle box that receives Media

        Media Provider: an Endpoint or middle box that sends Media

        Model: a set of assumptions a telepresence system of a given
        vendor adheres to and expects the remote telepresence system(s)
        also to adhere to.

        Participant: to be interpreted as defined in RFC 4353

        Remote: Sender and/or receiver on the other side of the
        communication channel (depending on context); not Local.  A
        remote can be an Endpoint or an MCU.

        Render: the process of generating a representation from a media,
        such as displayed motion video or sound emitted from

        Rendering Device: A device that converts an electrical signal -
        in most cases stemming from a media decoder - to audible and
        visual signals.  Screens and loudspeakers are examples for
        rendering devices.

        Right: to be interpreted in the context of the description where
        the word occurs."/>

        Simultaneous Transmission Set: a set of media captures that can
        be transmitted simultaneously from a Media Provider.

        Source selection policies: rules for determining which media
        source(s) to play or show.

        Spatial Relation: The arrangement in space of two objects, in
        contrast to relation in time or other relationships.  See also
        Left and Right.

        RTP stream as in [RFC3550].

        Stream Characteristics: include media stream attributes commonly
        used in non-CLUE SIP/SDP environments (such as: media codec, bit
        rate, resolution, profile/level etc.) as well as CLUE specific
        attributes (which could include for example and depending on the
        solution found: the I-D or spatial location of a capture device
        a stream originates from).

Wenger                     Expires May 3, 2012                  [Page 6]
Internet-Draft              CLUE Definitions                October 2011

        Telepresence: an environment that gives non co-located users or
        user groups a feeling of (co-located) presence - the feeling
        that a Local user is in the same room with other Local users and
        the Remote parties.  The inclusion of Remote parties is achieved
        through multimedia communication including at least audio and
        video signals of high fidelity.

        "Telepresence Extensions": The protocol extensions beyond SIP
        adn its extensions defined as of July 2011, specified by RFC
        XXXX.  (XXX to be replaced by the RFC editor with any documents
        developed by the CLUE WG that contain normative content, such as
        architecture, protocol specification, and whatnot.)

        Video Capture: Media Capture for video.  Denoted as VCn.

        Video composite: A single image that is formed from combining
        visual elements from separate sources.

4.  Acknowledgements

   Most of this stuff is copied from the CLUE requirements and framework
   draft.  Go there to look at dignitaries.

5.  IANA Considerations


6.  Security Considerations


7.  Informative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5117]  Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
              January 2008.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC4353]  Rosenberg, J., "A Framework for Conferencing with the

Wenger                     Expires May 3, 2012                  [Page 7]
Internet-Draft              CLUE Definitions                October 2011

              Session Initiation Protocol (SIP)", RFC 4353,
              February 2006.

              Wikipedia, "Blocking (stage), available from http://
              May 2011, <

Appendix A.  Draft History

   01: removed "stage direction" in "left" "right".  Replaced with "to
   be interpreted in context".

   clarified "local" to mean "in the same room" per J. Polk email

   Consistently refer to "loudspeaker" per Christer's email 6/13/11

   Added "keyboard" to listed caputure devices per Christer's email

   Rendering device: changed "optical" to "visual".  Hope Christer finds
   this OK.  Per Christer's email 6/13/11

   Note: the other changes proposed by Christer in his 6/13 email appear
   to change more than just words but semantics.  Not included here as
   such semantic changes are best handled in the requirement and other

   Edt. Note: there have been lengthily discussions about the term
   "Participant".  One camp wanted the word to refer to human users of
   telepresence equipment so to use intuitive language, the other (as
   this document) suggests to use the word in the RFC

   Participant defined as per RFC 4353.  I think this is what we arrived
   at on the mailing list, right?

   Added "Telepresence Extensions" per long "solutions" thread.

   00: Initial version; mostly copy-past from requirements-02.

   Removed "session".  Editors are advised to be specific about SIP,
   RTP, or whatever else sessions.

   Made clear that Layout refers to rendering, not capturing.

Wenger                     Expires May 3, 2012                  [Page 8]
Internet-Draft              CLUE Definitions                October 2011

   added Left, Right, in stage direction, w/ Wikipedia reference

Author's Address

   Stephan Wenger
   Vidyo, Inc.
   433 Hackensack Ave., 7th Floor
   Hackensack, NJ  07601


Wenger                     Expires May 3, 2012                  [Page 9]