CLUE Working Group                                             R. Presta
Internet-Draft                                              S P. Romano
Intended status: Informational                      University of Napoli
Expires: September 22, 2014                               March 21, 2014


                 An XML Schema for the CLUE data model
                  draft-ietf-clue-data-model-schema-04

Abstract

   This document provides an XML schema file for the definition of CLUE
   data model types.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 22, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Presta & Romano        Expires September 22, 2014               [Page 1]
Internet-Draft    draft-ietf-clue-data-model-schema-04        March 2014

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  XML Schema
   4.
   5.
   6.
   7.
   8.
   9.
   10.
     10.1.
     10.2.
     10.3.
     10.4.
       10.4.1.
       10.4.2.
     10.5.
     10.6.
     10.7.
     10.8.
     10.9.
     10.10.
     10.11.
     10.12.
     10.13.
     10.14.
     10.15.
     10.16.
     10.17.
     10.18.
     10.19.
     10.20.
     10.21.
       10.21.1.
     10.22.  captureID attribute
   11.  Audio captures
     11.1.
   12.  Video captures
     12.1.
   13.  Text captures
   14.
     14.1.
     14.2.  sceneID attribute
     14.3.  scale attribute
   15.
     15.1.
     15.2.  sceneEntryID attribute
     15.3.  mediaType attribute
   16.
     16.1.
     16.2.
     16.3.  encodingGroupID attribute
   17.
     17.1.
     17.2.
   18.
   19.
     19.1.
       19.1.1.  participantID attribute
       19.1.2.
       19.1.3.
   20.
     20.1.
     20.2.
     20.3.
   21.
   22.  Sample XML file
   23.  MCC example
   24.  Diff with draft-ietf-clue-data-model-schema-02 version
   25.  Diff with draft-ietf-clue-data-model-schema-03 version
   26.  Informative References

1.  Introduction

   This document provides an XML schema file for the definition of CLUE
   data model types.

   The schema is based on information contained in
   [I-D.ietf-clue-framework]. It encodes the information and
   constraints defined in that document in order to provide a formal
   representation of the concepts presented therein. The schema
   definition is intended to be modified according to the changes
   applied to that CLUE document.

   The document currently represents a proposal aimed at defining a
   coherent structure for all the information associated with the
   description of a telepresence scenario.

2.  Terminology

   This document uses the same terminology as
   [I-D.ietf-clue-framework]. Some of the main terms used in the
   document are briefly recalled here.

   Audio Capture: Media Capture for audio. Denoted as ACn in the
      example cases in this document.

   Camera-Left and Right: For Media Captures, camera-left and
      camera-right are from the point of view of a person observing the
      rendered media. They are the opposite of Stage-Left and
      Stage-Right.

   Capture: Same as Media Capture.

   Capture Device: A device that converts audio and video input into an
      electrical signal, in most cases to be fed into a media encoder.

   Capture Encoding: A specific encoding of a Media Capture, to be sent
      by a Media Provider to a Media Consumer via RTP.

   Capture Scene: An abstraction grouping semantically-coupled Media
      Captures available at the Media Provider's side, representing a
      precise portion of the local scene that can be transmitted
      remotely. A Capture Scene MAY correspond to a part of the
      telepresence room or MAY focus only on the presentation media. A
      Capture Scene is characterized by a set of attributes and by a
      set of Capture Scene Entries.

   Capture Scene Entry: A list of Media Captures of the same media type
      that constitute a possible representation of a Capture Scene.
      Media Captures belonging to the same Capture Scene Entry can be
      sent simultaneously by the Media Provider.

   CLUE Participant: An entity able to use the CLUE protocol within a
      telepresence session. It can be an Endpoint or an MCU able to use
      the CLUE protocol.

   Consumer: Same as Media Consumer.

   Encoding or Individual Encoding: The representation of an encoding
      technology. In the CLUE data model, each encoding is provided
      with a set of parameters representing the encoding constraints,
      such as the maximum bandwidth of the Media Provider that the
      encoding can consume.

   Encoding Group: The representation of a group of encodings. Each
      group is provided with a set of parameters representing the
      constraints to be applied to the group as a whole. An example is
      the maximum bandwidth that can be consumed when the contained
      encodings are used simultaneously.

   Endpoint: The logical point of final termination through receiving,
      decoding and rendering, and/or initiation through capturing,
      encoding, and sending of media streams. An endpoint consists of
      one or more physical devices which source and sink media streams,
      and exactly one SIP Conferencing Framework Participant (which, in
      turn, includes exactly one SIP User Agent). Endpoints can be
      anything from multiscreen/multicamera room controllers to
      handheld devices.

   MCU: Multipoint Control Unit (MCU) - a device that connects two or
      more endpoints together into one single multimedia conference.
      An MCU may include a Mixer.

   Media: Any data that, after suitable encoding, can be conveyed over
      RTP, including audio, video or timed text.

   Media Capture: A "Media Capture", or simply "Capture", is a source
      of Media of a single type (i.e., audio or video or text).

   Media Stream: The term "Media Stream", or simply "Stream", is used
      as a synonym for Capture Encoding.

   Media Provider: A CLUE participant (i.e., an Endpoint or an MCU)
      able to send Media Streams.

   Media Consumer: A CLUE participant (i.e., an Endpoint or an MCU)
      able to receive Media Streams.

   Scene: Same as Capture Scene.

   Scene Entry: Same as Capture Scene Entry.

   Stream: Same as Media Stream.

   Multiple Content Capture: A Capture that can contain different Media
      Captures of the same media type. It is denoted as MCC in this
      document. In the Stream resulting from the MCC, the Streams
      coming from the encoding of the constituent Media Captures can
      appear simultaneously, if the MCC is the result of a mixing
      operation, or alternately over time, according to a certain
      switching policy.

   Plane of Interest: The spatial plane containing the most relevant
      subject matter.

   Provider: Same as Media Provider.

   Render:

   Simultaneous Transmission Set: A set of Media Captures that can be
      transmitted simultaneously from a Media Provider.

   Single Media Capture: A Capture representing the Media coming from a
      single-source Capture Device.

   Spatial Information: Data about the spatial position of a Capture
      Device that generates a Single Media Capture within the context
      of a Capture Scene representing a physical portion of a
      Telepresence Room.

   Stream Characteristics: The union of the features used to describe a
      Stream in the CLUE environment and in the SIP-SDP environment.

   Video Capture: A Media Capture for video.

3.  XML Schema

   This section contains the proposed CLUE data model schema
   definition.
   The element and attribute definitions are a formal representation of
   the concepts needed to describe the capabilities of a Media Provider
   and the streams that are requested by a Media Consumer given the
   Provider's offer.

   The main groups of information are:

      the list of media captures available (Section 4)

      the list of encoding groups (Section 5)

      the list of capture scenes (Section 6)

      the list of simultaneous transmission sets (Section 7)

      the list of global capture entry sets (Section 8)

      metadata about the participants represented in the telepresence
      session (Section 19). [to be discussed]

      the list of instantiated capture encodings (Section 9)

   All of the above refer to concepts that have been introduced in
   [I-D.ietf-clue-framework] and are detailed further in the remainder
   of this document.

   The following sections describe the XML schema in more detail.

4.

   This element represents the list of one or more media captures
   available on the media provider's side. Each media capture is
   represented by a dedicated element (Section 10).

5.

   This element represents the list of the encoding groups organized on
   the media provider's side. Each encoding group is represented by a
   dedicated element (Section 16).

6.

   This element represents the list of the capture scenes organized on
   the media provider's side. Each capture scene is represented by a
   dedicated element (Section 14).

7.

   This element contains the simultaneous sets indicated by the media
   provider. Each simultaneous set is represented by a dedicated
   element (Section 17).

8.

   This element contains a set of alternative representations of all
   the scenes that are offered by a Media Provider to a Media Consumer.
   Each alternative is named "global capture entry" and is represented
   by a dedicated element (Section 18).

9.

   This element is a list of capture encodings. It can represent the
   list of the desired capture encodings indicated by the media
   consumer or the list of instantiated captures on the provider's
   side. Each capture encoding is represented by a dedicated element
   (Section 20).

10.

   According to the CLUE framework, a media capture is the fundamental
   representation of a media flow that is available on the provider's
   side. Media captures are characterized (i) by a set of features
   that are independent of the specific type of medium, and (ii) by a
   set of features that are media-specific. The features that are
   common to all media types appear within the media capture type,
   which has been designed as an abstract complex type. Media-specific
   captures, such as video captures, audio captures and others, are
   specializations of that abstract media capture type, as in a typical
   generalization-specialization hierarchy.
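
   As an illustration of the generalization-specialization pattern
   just described, the sketch below shows how an abstract complex type
   and one media-specific extension could be written in XML Schema. It
   is not the schema defined by this document: the namespace, the type
   names (mediaCaptureType, audioCaptureType), the child element names
   and the use of xs:ID are all assumptions made for the example. Only
   the captureID attribute name is taken from Section 10.22.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only: namespace, type names and child element
     names are assumptions, not the definitions given in this document. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="urn:example:clue-sketch"
           targetNamespace="urn:example:clue-sketch"
           elementFormDefault="qualified">

  <!-- Abstract base type: features common to every media type. -->
  <xs:complexType name="mediaCaptureType" abstract="true">
    <xs:sequence>
      <!-- hypothetical child carrying the media type of the capture -->
      <xs:element name="mediaType" type="xs:string"/>
    </xs:sequence>
    <!-- mandatory capture identifier (Section 10.22); xs:ID is assumed -->
    <xs:attribute name="captureID" type="xs:ID" use="required"/>
  </xs:complexType>

  <!-- Media-specific specialization extending the abstract base type. -->
  <xs:complexType name="audioCaptureType">
    <xs:complexContent>
      <xs:extension base="mediaCaptureType">
        <xs:sequence>
          <!-- hypothetical audio-specific child element -->
          <xs:element name="audioChannelFormat" type="xs:string"
                      minOccurs="0"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```

   Because the base type in this sketch is abstract, an element
   declared with it cannot appear in an instance document as such: the
   instance has to name a concrete derived type, for example through
   the xsi:type attribute.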

   The following is the XML Schema definition of the media capture
   type:

10.1.

   This element is a mandatory field specifying the media type of the
   capture ("audio", "video", "text",...).

10.2.

   This element is a mandatory field containing the identifier of the
   capture scene the media capture belongs to. Indeed, each media
   capture must be associated with one and only one capture scene.
   When a media capture is spatially definable, some spatial
   information is provided along with it in the form of point
   coordinates (see Section 10.4). Such coordinates refer to the
   coordinate space defined for the capture scene containing the
   capture.

10.3.

   This element is a mandatory field containing the identifier of the
   encoding group the media capture is associated with.

10.4.

   Media captures are divided into two categories: (i) non-spatially
   definable captures and (ii) spatially definable captures.

   Captures are spatially definable when it is possible to provide at
   least the coordinates of the device position within the telepresence
   room of origin. Such coordinates are expressed according to the
   coordinate space of the capture scene the media capture belongs to.

   Non-spatially definable captures cannot be characterized within the
   physical space of the telepresence room of origin. Captures of this
   kind are, for example, those related to recordings, text captures,
   DVDs, recorded presentations, or external streams that are played in
   the telepresence room and transmitted to remote sites. Another
   example is represented by switched captures: their content, in fact,
   comes from different devices over time.

   Spatially definable captures represent a part of the telepresence
   room. The captured part of the telepresence room is described by
   means of the spatial information element.
   Non-spatially definable captures do not include such an element in
   their XML description: they are instead characterized by having the
   tag described in Section 10.5 set to "true".

   The definition of the spatial information type is the following:

   The capture point element contains the coordinates of the capture
   device that is taking the capture, as well as, optionally, the
   pointing direction (see Section 10.4.1). It is a mandatory field
   when the media capture is spatially definable, independently of the
   media type.

   The capture area element is an optional field containing four points
   defining the area covered by the capture (see Section 10.4.2).

10.4.1.

   This element is used to represent the position and the line of
   capture of a capture device. The XML Schema definition of the
   element type is the following:

   The point type contains three spatial coordinates (x,y,z)
   representing a point in the space associated with a certain capture
   scene.

   The capture point type extends the point type, i.e., it is
   represented by three coordinates identifying the position of the
   capture device, but can add further information. Such further
   information is conveyed by another point-type element representing
   the "point on line of capture", which gives the pointing direction
   of the capture device.

   The coordinates of the point on line of capture MUST NOT be
   identical to the capture point coordinates.

   If the point on line of capture is not specified, no assumptions are
   made about the pointing direction of the capturing device.

10.4.2.

   This element is an optional element that can be contained within the
   spatial information associated with a media capture. It represents
   the spatial area captured by the media capture.
   The XML representation of that area is provided through a set of
   four point-type elements, as can be seen from the following
   definition:

   The four points should be co-planar.

   By comparing the capture areas of different media captures within
   the same capture scene, a consumer can better determine the spatial
   relationships between them and render them correctly.

10.5.

   When media captures are non-spatially definable, they are marked
   with a boolean element set to "true" and no spatial information is
   provided. Indeed, that boolean element and the spatial information
   element are mutually exclusive tags, according to the corresponding
   section within the XML Schema definition of the media capture type.

10.6.

   A media capture can be (i) an individual media capture or (ii) a
   multiple content capture (MCC). A multiple content capture is made
   of different captures that can be arranged spatially (by a
   composition operation), or temporally (by a switching operation), or
   that can result from a combination of both techniques. If a media
   capture is an MCC, then its XML data model representation MUST
   include the element described in this section. It is a mandatory
   element composed of a list of media capture identifiers
   ("captureIDREF") and capture scene entry identifiers
   ("sceneEntryIDREF"), where the latter are used as shortcuts to refer
   to multiple capture identifiers.

10.7.

   This element is an optional element for multiple content captures
   that contains a numeric identifier. Multiple content captures
   marked with the same identifier in this element contain, at each
   time, captures coming from the same room endpoint.

10.8.

   This element is an optional boolean element that can be used only
   for multiple content captures. It indicates whether or not a
   multiple content capture is a mix (audio) or a composition (video)
   of streams.
   If set to "true", it means that the capture can result, at a certain
   time, from more than one capture. This attribute is useful for a
   media consumer, for example to avoid nesting a composed video
   capture into another composed capture or rendering.

10.9.

   This element is an optional boolean element that can be used only
   for multiple content captures. It indicates whether or not a
   multiple content capture switches over time. If set to "true", it
   means that the content of the MCC (in terms of the actual
   constituent captures) can change over time.

10.10.

   This element is an optional element that can be used only for
   multiple content captures. It indicates the criteria applied to
   build the multiple content capture using the media captures
   referenced in the content of the MCC (Section 10.6). Such element
   can assume one of a list of pre-defined values ([todo]).

10.11.

   This element is an optional element that can be used only for
   multiple content captures. It indicates the maximum number of media
   captures that can be represented in the multiple content capture at
   a time.

10.12.

   This element is a boolean element that MUST be used for
   single-content captures. Its value is fixed and set to "true".
   Such element indicates that the capture being described is not a
   multiple content capture. Indeed, this element and the
   aforementioned tags related to MCC attributes (from Section 10.6 to
   Section 10.11) are mutually exclusive, according to the
   corresponding section within the XML Schema definition of the media
   capture type.

10.13.

   This element is used to optionally provide human-readable textual
   information about a media capture. The same element is also used to
   describe capture scenes and capture scene entries, as it is included
   in their XML representation. A media capture can be described by
   using multiple such elements, each one providing information in a
   different language.
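
   For instance, a capture described in two languages might carry a
   pair of textual elements such as the ones sketched below. The
   fragment is only an illustration: the element names "capture" and
   "description" and the text values are assumptions made for the
   example, while "captureID" and "lang" are the attribute names used
   in this document.

```xml
<!-- Illustrative fragment only: the element names "capture" and
     "description" and the text values are assumptions made for this
     example. -->
<capture captureID="VC0">
  <!-- one textual description per language, told apart by "lang" -->
  <description lang="en">Zoomed-out view of the room</description>
  <description lang="it">Vista panoramica della sala</description>
</capture>
```

   A consumer could then pick the description whose "lang" value best
   matches the language of its user interface.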
   The element definition is the following:

   As can be seen, it is a string element with an attribute ("lang")
   indicating the language used in the textual description.

10.14.

   This element is an optional unsigned integer field indicating the
   importance of a media capture according to the media provider's
   perspective. It can be used on the receiver's side to automatically
   identify the most relevant contribution from the media provider.
   The higher the importance, the lower the contained value. When
   media captures are marked with a "0" priority value, it means that
   they are "not subject to priority".

10.15.

   This element is an optional element containing the language used in
   the capture, if any.

10.16.

   This element is an optional element indicating whether or not the
   capture device originating the capture may move during the
   telepresence session. It can assume one of the three following
   values: (i) static, (ii) dynamic or (iii) highly dynamic, defined as
   in [I-D.ietf-clue-framework].

10.17.

   This optional element contains an unsigned integer indicating the
   maximum number of capture encodings that can be simultaneously
   active for the media capture. If absent, this parameter defaults to
   1. The minimum value for this attribute is 1. The number of
   simultaneous capture encodings is also limited by the restrictions
   of the encoding group the media capture refers to by means of the
   element described in Section 10.3.

10.18.

   This optional element contains the value of the ID attribute of the
   media capture it refers to. The media capture marked with such an
   element can be, for example, the translation of a main media capture
   into a different language.

10.19.

   This element is an optional tag describing what is represented in
   the spatial area covered by a media capture. The current possible
   values are: "table", "lectern", "individual", and "audience", as
   listed in the enumerative view type in the following.
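
   An enumeration of this kind can be sketched in XML Schema as shown
   below. The four values are those listed in the text above, whereas
   the type name "viewType" and the xs:string base type are assumptions
   made for the example and are not taken from the schema defined by
   this document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only: the type name and base type are
     assumptions; the four values come from Section 10.19. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="viewType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="table"/>
      <xs:enumeration value="lectern"/>
      <xs:enumeration value="individual"/>
      <xs:enumeration value="audience"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

   With a restriction of this kind, a validating parser rejects any
   value outside the listed ones, so adding a new view requires a
   schema change.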

10.20.

   This element is an optional tag used for media captures conveying
   information about presentations within the telepresence session.
   The current possible values are "slides" and "images", as listed in
   the enumerative presentation type in the following.

10.21.

   This optional element is used to indicate which telepresence session
   participants are represented within the media captures. For each
   participant, a dedicated element is provided. [to be discussed]

10.21.1.

   This element contains the identifier of the represented participant.
   Metadata about the represented participant can be retrieved by
   accessing the participants list (Section 19).

10.22.  captureID attribute

   The "captureID" attribute is a mandatory field containing the
   identifier of the media capture.

11.  Audio captures

   Audio captures inherit all the features of a generic media capture
   and present further audio-specific characteristics. The XML Schema
   definition of the audio capture type is reported below:

   Audio-specific information about the audio capture is contained in
   a dedicated element (Section 11.1).

11.1.

   This optional element is a field with enumerated values ("mono" and
   "stereo") which describes the method of encoding used for audio. A
   value of "mono" means the audio capture has one channel. A value of
   "stereo" means the audio capture has two audio channels, left and
   right. A single stereo capture is different from two mono captures
   that have a left-right spatial relationship. A stereo capture maps
   to a single RTP stream, while each mono audio capture maps to a
   separate RTP stream.

   The XML Schema definition of the element type is provided below:

12.  Video captures

   Video captures, similarly to audio captures, extend the information
   of a generic media capture with video-specific features, such as the
   one described in Section 12.1.

   The XML Schema representation of the video capture type is provided
   in the following:

12.1.

   This element is a boolean element indicating that there is text
   embedded in the video capture. The language used in such embedded
   textual description is reported in the "lang" attribute.

   The XML Schema definition of the element is:

13.  Text captures

   Text captures can also be described by extending the generic media
   capture information, similarly to audio captures and video captures.

   The XML Schema representation of the text capture type currently
   lacks text-specific information, as can be seen from the definition
   below:

14.

   A media provider organizes the available captures in capture scenes
   in order to help the receiver both in the rendering and in the
   selection of the group of captures. Capture scenes are made of
   capture scene entries, which are sets of media captures of the same
   media type. Each capture scene entry is an alternative way of
   completely representing a capture scene for a fixed media type.

   The XML Schema representation of a capture scene element is the
   following:

   The element can contain zero or more textual description elements,
   defined as in Section 10.13. Besides those, there is the element
   described in Section 14.1, which is the list of the capture scene
   entries.

14.1.

   This element is a mandatory field of a capture scene containing the
   list of scene entries. Each scene entry is represented by a
   dedicated element (Section 15).

14.2.  sceneID attribute

   The sceneID attribute is a mandatory attribute containing the
   identifier of the capture scene.

14.3.  scale attribute

   The scale attribute is a mandatory attribute that specifies the
   scale of the coordinates provided in the spatial information of the
   media captures belonging to the considered capture scene. The scale
   attribute can assume three different values:

   "millimeters" - the scale is in millimeters. Systems which know
      their physical dimensions (for example professionally installed
      telepresence room systems) should always provide those real-world
      measurements.

   "unknown" - the scale is not necessarily millimeters, but the scale
      is the same for every media capture in the capture scene.
      Systems which don't know specific physical dimensions but still
      know relative distances should select "unknown" in the scale
      attribute of the capture scene to be described.

   "noscale" - there is no common physical scale among the media
      captures of the capture scene. That means the scale could be
      different for each media capture.

15.

   This element represents a capture scene entry, which contains a set
   of media captures of the same media type describing a capture scene.

   A scene entry element is characterized as follows.

   One or more optional textual description elements provide
   human-readable information about what the scene entry contains.
   They are defined as already seen in Section 10.13.

   The remaining child elements are described in the following
   subsections.

15.1.

   This element is the list of the identifiers of the media captures
   included in the scene entry. It is an element of the
   captureIDListType type, which is defined as a sequence of elements,
   each one containing the identifier of a media capture listed within
   the list of available media captures:

15.2.  sceneEntryID attribute

   The sceneEntryID attribute is a mandatory attribute containing the
   identifier of the capture scene entry represented by the element.

15.3.  mediaType attribute

   The mediaType attribute contains the media type of the media
   captures included in the scene entry.

16.

   This element represents an encoding group, which is made up of a set
   of one or more individual encodings and some parameters that apply
   to the group as a whole. Encoding groups contain references to
   individual encodings that can be applied to captures of the same
   media type. In other words, they can group audio encodings or,
   alternatively, video encodings. The definition of the element is
   the following:

   In the following, the contained elements are further described.

16.1.

   This element is an optional field containing the maximum bitrate
   that can be shared by the individual encodings included in the
   encoding group.

16.2.

   This element is the list of the individual encodings grouped
   together in the encoding group. Each individual encoding is
   represented through its identifier, contained within a dedicated
   element.

16.3.  encodingGroupID attribute

   The encodingGroupID attribute contains the identifier of the
   encoding group.

17.

   This element represents a simultaneous transmission set, i.e., a
   list of captures of the same media type that can be transmitted at
   the same time by a media provider. There are different simultaneous
   transmission sets for each media type.

   Besides the identifiers of the captures, the identifiers of capture
   scene entries can also be used, as shortcuts.

17.1.

   This element contains the identifier of the media capture that
   belongs to the simultaneous set.

17.2.

   This element contains the identifier of the scene entry containing a
   group of captures that can be sent simultaneously with the other
   captures of the simultaneous set.

18.

   This element represents a set of captures of the same media type
   representing a summary of the complete Media Provider's offer.
The media type of a global capture entry is reported in the
"mediaType" attribute. Similarly to the simultaneous set case, the
content of a global capture entry is expressed by leveraging both
media capture identifiers and scene entry identifiers.

19.

Information about the participants that are represented in the media
captures is conveyed via the element. As can be seen from the XML
Schema depicted below, for each participant, a element is provided.

[to be discussed]

19.1.

includes all the metadata related to a person represented within one
or more media captures. This element currently provides at least the
vCard of the subject (via the element, see Section 19.1.2) and
optionally the subject's conference role(s) (via one or more
elements, see Section 19.1.3). Furthermore, it has a mandatory
"participantID" attribute (Section 19.1.1).

19.1.1. participantID attribute

The participantID attribute carries the identifier of a represented
participant. This identifier can be used to refer to the participant,
as in the element in the media capture representation (Section
10.21).

19.1.2.

The element is the XML representation of all the fields composing a
vCard as specified in the xCard RFC [RFC6351]. The vcardType is
imported from the xCard XML Schema provided by
[I-D.ietf-ecrit-additional-data]. As that schema specifies, the
element within is mandatory.

19.1.3.

The value of the element determines the role of the represented
participant within the telepresence session organization. It can be
one of the following terms, which are defined in the framework
document: "presenter", "timekeeper", "attendee", "minute taker",
"translator", "chairman", "vice-chairman".

A participant can have more than one conference role.
In that case, more than one element will appear in the participant's
description.

20.

A is given by the association of a media capture and an individual
encoding, which form a capture stream as defined in
[I-D.ietf-clue-framework]. A possible way to model such an entity is
provided in the following.

20.1.

is the mandatory element containing the identifier of the media
capture that has been encoded to form the capture encoding.

20.2.

is the mandatory element containing the identifier of the applied
individual encoding.

20.3.

is an optional element to be used when configuring MCCs. It contains
the list of capture identifiers and capture scene entry identifiers
the Media Consumer wants within the MCC. That element is structured
as the element used to describe the content of an MCC, i.e. it
contains

The total number of the media captures listed in the must be lower
than or equal to the value carried within the attribute of the MCC.

21.

The element has been left within the XML Schema to represent a draft
version of the body of an ADVERTISEMENT message (see the example
section).

22. Sample XML file

The following XML document represents a schema-compliant example of a
CLUE telepresence scenario. Taking inspiration from the example
described in Sec. XX of the framework draft
([I-D.ietf-clue-framework]), the XML representation of an
endpoint-style Media Provider's offer is provided.

There are three cameras, where the central one is also capable of
capturing a zoomed-out view of the overall telepresence room. Besides
the three video captures coming from these cameras, the MP makes
available a further multi-content capture of the loudest segment of
the room, obtained by switching the video source across the three
cameras.
For the sake of simplicity, only one audio capture is advertised for
the audio of the whole room.

The three cameras are placed in front of three participants (Alice,
Bob and Ciccio), whose vCard and conference role details are also
provided.

Media captures are arranged into four capture scene entries:

1. (VC0, VC1, VC2) - left, center and right camera video captures

2. (VC3) - video capture associated with the loudest room segment

3. (VC4) - video capture with a zoomed-out view of all people in the
room

4. (AC0) - main audio

There are two encoding groups: (i) EG0, for video encodings, and (ii)
EG1, for audio encodings.

As to the simultaneous sets, only VC1 and VC4 cannot be transmitted
simultaneously, since they are captured by the same device, i.e., the
central camera (VC4 is a zoomed-out view while VC1 is a focused view
of the front participant). The simultaneous sets would then be the
following:

SS1 made by VC3 and all the captures in the first capture scene
entry (VC0, VC1, VC2);

SS2 made by VC3, VC0, VC2, VC4

audio CS1 EG1 0.5 1.0 0.5 0.5 0.0 0.5 true main audio from the room 1 it static room alice bob ciccio 1
video CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true left camera video capture 1 it static individual ciccio 2
video CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true central camera video capture 1 it static individual alice 2
video CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true right camera video capture 1 it static individual bob 2
video CS1 EG0 true SE1 false true Soundlevel:0 1 loudest room segment 1 it static individual 1
video CS1 EG0 0.5
1.0 0.5 0.5 0.0 0.5 true zoomed out view of all people in the room 1 it static room alice bob ciccio 1
600000 ENC1 ENC2 ENC3
300000 ENC4 ENC5
VC0 VC1 VC2 VC3 VC4 VC4 VC3 SE1 VC0 VC2 VC4 VC3
Bob minute taker Alice presenter Ciccio chairman timekeeper

23. MCC example

Extending the scenario presented in the previous example, the Media
Provider is able to advertise a composed capture, VC7, made of a big
picture representing the current speaker (VC3) and two
picture-in-picture boxes representing the previous speakers (the
previous one, VC5, and the oldest one, VC6). The provider does not
want to instantiate and send VC5 and VC6, so it does not associate
any encoding group with them. Their XML representations are provided
only to enable the description of VC7.

A possible description for that scenario could be the following:

audio CS1 EG1 0.5 1.0 0.5 0.5 0.0 0.5 true main audio from the room 1 it static room alice bob ciccio 1
video CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true left camera video capture 1 it static individual ciccio 2
video CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true central camera video capture 1 it static individual alice 2
video CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true right camera video capture 1 it static individual bob 2
video CS1 EG0 true SE1 false true Soundlevel:0 1 loudest room segment 1 it static individual 1
video CS1 EG0 0.5 1.0 0.5 0.5 0.0
0.5 true zoomed out view of all people in the room 1 it static room alice bob ciccio 1
video CS1 true SE1 false true Soundlevel:1 1 penultimate loudest room segment 1 it static individual 1
video CS1 true SE1 false true Soundlevel:2 1 last but two loudest room segment 1 it static individual 1
video CS1 true VC3 VC5 VC6 true true 1 big picture of the current speaker + pips about previous speakers 1 it static individual 1
600000 ENC1 ENC2 ENC3
300000 ENC4 ENC5
participants' individual videos VC0 VC1 VC2
loudest segment of the room VC3
loudest segment of the room + pips VC7
room audio AC0
room video VC4
VC7 SE1 VC0 VC2 VC4 VC7
Bob minute taker Alice presenter Ciccio chairman timekeeper

24. Diff with draft-ietf-clue-data-model-schema-02 version

captureParameters and encodingParameters have been removed from the
captureEncodingType

The data model example has been updated and validated according to
the new schema. Further descriptions of the represented scenario have
been provided.

A multiple content capture example has been added.

Obsolete comments and references have been removed.

25. Diff with draft-ietf-clue-data-model-schema-03 version

The encodings section has been removed.

Global capture entries have been introduced.

Capture scene entry identifiers are used as shortcuts in listing the
content of an MCC (similarly to simultaneous sets and global capture
entries).

Examples have been updated. A new example with global capture entries
has been added.
has been made optional.

has been renamed into

Obsolete comments have been removed.

Participants information has been added.

26. Informative References

[I-D.ietf-clue-framework]
Duckworth, M., Pepperell, A., and S. Wenger, "Framework for
Telepresence Multi-Streams", draft-ietf-clue-framework-14 (work in
progress), February 2014.

[I-D.ietf-ecrit-additional-data]
Rosen, B., Tschofenig, H., Marshall, R., Gellens, R., and J.
Winterbottom, "Additional Data related to an Emergency Call",
draft-ietf-ecrit-additional-data-21 (work in progress), March 2014.

[RFC4796]
Hautakorpi, J. and G. Camarillo, "The Session Description Protocol
(SDP) Content Attribute", RFC 4796, February 2007.

[RFC6351]
Perreault, S., "xCard: vCard XML Representation", RFC 6351, August
2011.

Authors' Addresses

Roberta Presta
University of Napoli
Via Claudio 21
Napoli 80125
Italy

EMail: roberta.presta@unina.it

Simon Pietro Romano
University of Napoli
Via Claudio 21
Napoli 80125
Italy

EMail: spromano@unina.it