CLUE Working Group                                             R. Presta
Internet-Draft                                              S P. Romano
Intended status: Standards Track                    University of Napoli
Expires: October 19, 2015                                 April 17, 2015

An XML Schema for the CLUE data model
draft-ietf-clue-data-model-schema-09

Abstract

This document provides an XML schema file for the definition of CLUE data model types.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on October 19, 2015.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Presta & Romano Expires October 19, 2015 [Page 1]

Table of Contents

1. Introduction
2. Terminology
3. XML Schema
4.
5.
6.
7.
8.
9.
10.
10.1. captureID attribute
10.2. mediaType attribute
10.3.
10.4.
10.5.
10.5.1.
10.5.2.
10.6.
10.7.
10.8.
10.9.
10.10.
10.11.
10.12.
10.13.
10.14.
10.15.
10.16.
10.17.
10.18.
10.19.
10.20.
10.21.
10.21.1.
11. Audio captures
11.1.
12. Video captures
13. Text captures
14. Other capture types
15.
15.1.
15.2.
15.3. sceneID attribute
15.4. scale attribute
16.
16.1.
16.2. sceneViewID attribute
17.
17.1.
17.2.
17.3. encodingGroupID attribute
18.
18.1. setID attribute
18.2. mediaType attribute
18.3.
18.4.
18.5.
19.
20.
20.1.
20.1.1. personID attribute
20.1.2.
20.1.3.
21.
21.1.
21.2.
21.3.
22.
23. XML Schema extensibility
23.1. Example of extension
24. Security considerations
25. IANA considerations
25.1. XML namespace registration
25.2. XML Schema registration
25.3. MIME Media Type Registration for 'application/clue_info+xml'
26. Sample XML file
27. MCC example
28. Diff with draft-ietf-clue-data-model-schema-08 version
29. Diff with draft-ietf-clue-data-model-schema-07 version
30. Diff with draft-ietf-clue-data-model-schema-06 version
31. Diff with draft-ietf-clue-data-model-schema-04 version
32. Diff with draft-ietf-clue-data-model-schema-03 version
33. Diff with draft-ietf-clue-data-model-schema-02 version
34. Acknowledgments
35. Informative References

1. Introduction

This document provides an XML schema file for the definition of CLUE data model types.

The schema is based on information contained in [I-D.ietf-clue-framework]. It encodes information and constraints defined in the aforementioned document in order to provide a formal representation of the concepts therein presented.

The document aims at the definition of a coherent structure for information associated with the description of a telepresence scenario. Such information is used within the CLUE protocol messages ([I-D.ietf-clue-protocol]) enabling the dialogue between a Media Provider and a Media Consumer.
CLUE protocol messages are XML messages allowing (i) a Media Provider to advertise its telepresence capabilities in terms of media captures, capture scenes, and other features envisioned in the CLUE framework, according to the format defined herein, and (ii) a Media Consumer to request the desired telepresence options in the form of capture encodings, represented as described in this document.

2. Terminology

This document refers to the same terminology used in [I-D.ietf-clue-framework], except for the "CLUE Participant" definition (which is still under discussion). We briefly recall herein some of the main terms used in the document.

Audio Capture: Media Capture for audio. Denoted as ACn, n being an unsigned integer number, in the example cases in this document.

Camera-Left and Right: For Media Captures, Camera-Left and Camera-Right are from the point of view of a person observing the rendered media. They are the opposite of Stage-Left and Stage-Right.

Capture: Same as Media Capture.

Capture Device: A device that converts audio and video input into an electrical signal, in most cases to be fed into a media encoder.

Capture Encoding: A specific encoding of a Media Capture, to be sent by a Media Provider to a Media Consumer via RTP.

Capture Scene: An abstraction grouping semantically-coupled Media Captures available at the Media Provider's side, representing a precise portion of the local scene that can be transmitted remotely. A Capture Scene MAY correspond to a part of the telepresence room or MAY focus only on the presentation media. A Capture Scene is characterized by a set of attributes and by a set of Capture Scene Views.

Capture Scene View: A list of Media Captures of the same media type that constitute a possible representation of a Capture Scene.
Media Captures belonging to the same Capture Scene View can be sent simultaneously by the Media Provider.

CLUE Participant: This term is not imported from the framework terminology. A CLUE Participant identifies a generic entity (either an Endpoint or an MCU) making use of the CLUE protocol.

Consumer: Same as Media Consumer.

Encoding or Individual Encoding: The representation of an encoding technology. In the CLUE data model, each encoding comes with a set of parameters representing the encoding constraints, such as the maximum bandwidth of the Media Provider that the encoding can consume.

Encoding Group: The representation of a group of encodings. Each group comes with a set of parameters representing the constraints to be applied to the group as a whole. An example is the maximum bandwidth that can be consumed when the contained encodings are all used simultaneously.

Endpoint: The logical point of final termination through receiving, decoding and rendering, and/or initiation through capturing, encoding, and sending of media streams. An endpoint consists of one or more physical devices which source and sink media streams, and exactly one SIP Conferencing Framework Participant (which, in turn, includes exactly one SIP User Agent). Endpoints can be anything from multiscreen/multicamera room controllers to handheld devices.

MCU: Multipoint Control Unit (MCU) - a device that connects two or more endpoints together into one single multimedia conference. An MCU may include a Mixer.

Media: Any data that, after suitable encoding, can be conveyed over RTP, including audio, video or timed text.

Media Capture: A "Media Capture", or simply "Capture", is a source of Media of a single type (i.e., audio or video or text).

Media Stream: The term "Media Stream", or simply "Stream", is used as a synonym for Capture Encoding.
Media Provider: A CLUE participant (i.e., an Endpoint or an MCU) able to send Media Streams.

Media Consumer: A CLUE participant (i.e., an Endpoint or an MCU) able to receive Media Streams.

Scene: Same as Capture Scene.

Scene View: Same as Capture Scene View.

Stream: Same as Media Stream.

Multiple Content Capture: A Capture that can contain different Media Captures of the same media type. It is denoted as MCC in this document. In the Stream resulting from the MCC, the Streams coming from the encoding of the composing Media Captures can appear simultaneously, if the MCC is the result of a mixing operation, or alternately over time, according to a certain switching policy.

Plane of Interest: The spatial plane containing the most relevant subject matter.

Provider: Same as Media Provider.

Render: The process of reproducing the received Streams, for instance by displaying the remote video on the Media Consumer's screens or by playing the remote audio through loudspeakers.

Simultaneous Transmission Set: A set of Media Captures of the same media type that can be transmitted simultaneously from a Media Provider.

Single Media Capture: A Capture representing the Media coming from a single-source Capture Device.

Spatial Information: Data about the spatial position of a Capture Device that generates a Single Media Capture within the context of a Capture Scene representing a physical portion of a Telepresence Room.

Stream Characteristics: The union of the features used to describe a Stream in the CLUE environment and in the SIP-SDP environment.

Video Capture: A Media Capture for video.

3. XML Schema

This section contains the CLUE data model schema definition.
The element and attribute definitions are formal representations of the concepts needed to describe the capabilities of a Media Provider and the streams that are requested by a Media Consumer given the Media Provider's ADVERTISEMENT ([I-D.ietf-clue-protocol]).

The main groups of information are:

o  the list of media captures available (Section 4)
o  the list of encoding groups (Section 5)
o  the list of capture scenes (Section 6)
o  the list of simultaneous transmission sets (Section 7)
o  the list of global views (Section 8)
o  metadata about the participants represented in the telepresence session (Section 20)
o  the list of instantiated capture encodings (Section 9)

All of the above refer to concepts that have been introduced in [I-D.ietf-clue-framework] and are further detailed in the remainder of this document.

The following sections describe the XML
schema in more detail.

4.

This element represents the list of one or more media captures available at the Media Provider's side. Each media capture is represented by a dedicated element (Section 10).

5.

This element represents the list of the encoding groups organized on the Media Provider's side. Each encoding group is represented by a dedicated element (Section 17).

6.

This element represents the list of the capture scenes organized on the Media Provider's side. Each capture scene is represented by a dedicated element (Section 15).

7.

This element contains the simultaneous sets indicated by the Media Provider. Each simultaneous set is represented by a dedicated element (Section 18).

8.

This element contains a set of alternative representations of all the scenes that are offered by a Media Provider to a Media Consumer. Each alternative is named "global view" and is represented by a dedicated element (Section 19).

9.

This element is a list of capture encodings. It can represent the list of the desired capture encodings indicated by the Media Consumer or the list of instantiated captures on the provider's side. Each capture encoding is represented by a dedicated element (Section 21).

10.

According to the CLUE framework, a media capture is the fundamental representation of a media flow that is available on the provider's side. Media captures are characterized (i) by a set of features that are independent from the specific type of medium, and (ii) by a set of features that are media-specific. The features that are common to all media types appear within the media capture type, which has been designed as an abstract complex type. Media-specific captures, such as video captures, audio captures and others, are specializations of that abstract media capture type, as in a typical generalization-specialization hierarchy.
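The generalization-specialization relationship described above can be pictured with a small Python sketch. It is only an analogy for the abstract complex type and its media-specific extensions, not the schema itself: the class names and the `capture_id` field are hypothetical, chosen to echo the captureID and mediaType attributes.

```python
import inspect
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class MediaCapture(ABC):
    """Features shared by every capture, whatever the medium (hypothetical model)."""
    capture_id: str  # assumed to mirror the mandatory captureID attribute

    @property
    @abstractmethod
    def media_type(self) -> str:
        """Each specialization states its own media type."""


@dataclass
class AudioCapture(MediaCapture):
    @property
    def media_type(self) -> str:
        return "audio"


@dataclass
class VideoCapture(MediaCapture):
    @property
    def media_type(self) -> str:
        return "video"


# The base type cannot be instantiated directly; only its specializations can.
base_is_abstract = inspect.isabstract(MediaCapture)
captures = [AudioCapture("AC0"), VideoCapture("VC0")]
media_types = [c.media_type for c in captures]
```

As with the abstract complex type, only the specializations can be instantiated, and each carries the common features plus its own.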
The following is the XML Schema definition of the media capture type:

10.1. captureID attribute

The "captureID" attribute is a mandatory field containing the identifier of the media capture.

10.2. mediaType attribute

The "mediaType" attribute is a mandatory attribute specifying the media type of the capture ("audio", "video", "text",...).

10.3.

This mandatory field contains the identifier of the capture scene in which the media capture is defined. Indeed, each media capture must be defined within one and only one capture scene. When a media capture is spatially definable, some spatial information is provided along with it in the form of point coordinates (see Section 10.5). Such coordinates refer to the space of coordinates defined for the capture scene containing the capture.

10.4.

This optional field contains the identifier of the encoding group the media capture is associated with. Media captures that are not associated with any encoding group cannot be instantiated as media streams.

10.5.

Media captures are divided into two categories: (i) non spatially definable captures and (ii) spatially definable captures.

Captures are spatially definable when at least (i) it is possible to provide the coordinates of the device position within the telepresence room of origin (capture point) together with its capturing direction specified by a second point (point on line of capture), or (ii) it is possible to provide the represented area within the telepresence room, by listing the coordinates of the four co-planar points identifying the plane of interest (area of capture). The coordinates of the above-mentioned points must be expressed according to the coordinate space of the capture scene the media capture belongs to.
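The two criteria above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the schema: the function and parameter names are hypothetical, the rule is read literally from the paragraph above, and co-planarity is tested with a scalar triple product and an arbitrary tolerance.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def _sub(a: Point, b: Point) -> Point:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(u: Point, v: Point) -> Point:
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _dot(u: Point, v: Point) -> float:
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def are_coplanar(p0: Point, p1: Point, p2: Point, p3: Point, tol: float = 1e-9) -> bool:
    """Four points are co-planar when the scalar triple product of the
    three edge vectors leaving p0 is zero (within a tolerance)."""
    return abs(_dot(_sub(p1, p0), _cross(_sub(p2, p0), _sub(p3, p0)))) <= tol


def is_spatially_definable(capture_point: Optional[Point] = None,
                           line_point: Optional[Point] = None,
                           area: Optional[Sequence[Point]] = None) -> bool:
    """Criterion (i): capture point plus point on line of capture.
    Criterion (ii): four co-planar points describing the area of capture."""
    has_origin = capture_point is not None and line_point is not None
    has_area = area is not None and len(area) == 4 and are_coplanar(*area)
    return has_origin or has_area


wall = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
definable_by_area = is_spatially_definable(area=wall)
definable_by_origin = is_spatially_definable((0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
not_definable = is_spatially_definable()
```

All coordinates are assumed to be expressed in the coordinate space of the capture scene the capture belongs to.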
Non spatially definable captures cannot be characterized within the physical space of the telepresence room of origin. Captures of this kind are, for example, those related to recordings, text captures, DVDs, registered presentations, or external streams that are played in the telepresence room and transmitted to remote sites.

Spatially definable captures represent a part of the telepresence room. The captured part of the telepresence room is described by means of the spatial information element. By comparing the spatial information of different media captures within the same capture scene, a consumer can better determine the spatial relationships between them and render them correctly. Non spatially definable captures do not embed such an element in their XML description: they are instead characterized by having the dedicated boolean tag set to "true" (see Section 10.6).

The definition of the spatial information type is the following:

One child element contains the coordinates of the capture device that is taking the capture, as well as, optionally, the pointing direction (see Section 10.5.1). Another, optional, child element contains four points defining the captured area covered by the capture (see Section 10.5.2).

10.5.1.

This element is used to represent the position and optionally the line of capture of a capture device. It MUST be included in spatially definable audio captures, while it is optional for spatially definable video captures.

The XML Schema definition of the element type is the following:

The point type contains three spatial coordinates (x,y,z) representing a point in the space associated with a certain capture scene.

The capture point type extends the point type, i.e., it is represented by three coordinates identifying the position of the capture device, but can add further information.
Such further information is conveyed by an additional point-type element representing the "point on line of capture", which gives the pointing direction of the capture device.

The coordinates of the point on line of capture MUST NOT be identical to the capture point coordinates. For a spatially definable video capture, if the point on line of capture is provided, it MUST belong to the region between the point of capture and the capture area. For a spatially definable audio capture, if the point on line of capture is not provided, the sensitivity pattern should be considered omnidirectional.

10.5.2.

This optional element can be contained within the spatial information associated with a media capture. It represents the spatial area captured by the media capture. It MUST be included in the spatial information of spatially definable video captures, while it MUST NOT be associated with audio captures.

The XML representation of that area is provided through a set of four point-type elements, as can be seen from the following definition:

The four points MUST be co-planar.

10.6.

When media captures are non spatially definable, they are marked with a boolean element set to "true" and no spatial information is provided. Indeed, that boolean element and the spatial information element are mutually exclusive, as enforced by the XML Schema definition of the media capture type.

10.7.

A media capture can be (i) an individual media capture or (ii) a multiple content capture (MCC). A multiple content capture is made of different captures that can be arranged spatially (by a composition operation), or temporally (by a switching operation), or that can result from the orchestration of both techniques. If a media capture is an MCC, then its XML data model representation can include a dedicated element.
It is composed of a list of media capture identifiers ("captureIDREF") and capture scene view identifiers ("sceneViewIDREF"), where the latter are used as shortcuts to refer to multiple capture identifiers. The referenced captures are used to create the MCC according to a certain strategy. If that element does not appear in an MCC, or has no child elements, then the MCC is assumed to be made of multiple sources but no information regarding those sources is provided.

10.8.

This optional element for multiple content captures contains a numeric identifier. Multiple content captures marked with the same identifier in this element contain at all times captures coming from the same source. It is the Media Provider (MP) that determines what the source for the captures is. In this way, the MP can choose how to group together single captures for the purpose of keeping them synchronized according to the SynchronisationID attribute.

10.9.

This optional boolean element applies to multiple content captures. It indicates whether or not the Provider allows the Consumer to choose a specific subset of the captures referenced by the MCC. If this attribute is true, and the MCC references other captures, then the Consumer MAY specify in a CONFIGURE message a specific subset of those captures to be included in the MCC, and the Provider MUST then include only that subset. If this attribute is false, or the MCC does not reference other captures, then the Consumer MUST NOT select a subset. If this element is not shown in the XML description of the MCC, its value is to be considered "false".

10.10.

This optional element can be used only for multiple content captures. It indicates the criteria applied to build the multiple content capture using the media captures it references (see Section 10.7).

10.11.
This optional element can be used only for multiple content captures. It provides information about the number of media captures that can be represented in the multiple content capture at a time. If this element is not provided, all the media captures referenced by the multiple content capture (see Section 10.7) can appear at a time in the capture encoding. The type definition is provided below.

When the "exactNumber" attribute is set to "1", it means the element carries the exact number of the media captures appearing at a time. Otherwise, the number of the represented media captures MUST be considered less than or equal to ("<=") the value.

10.12.

This boolean element MUST be used for single-content captures. Its value is fixed and set to "true". It indicates that the capture being described is not a multiple content capture. Indeed, this element and the aforementioned tags related to MCC attributes (from Section 10.7 to Section 10.11) are mutually exclusive, as enforced by the XML Schema definition of the media capture type.

10.13.

This element optionally provides human-readable textual information about a media capture. Besides media captures, the same element is used to describe capture scenes and capture scene views, as it is included in their XML representation. A media capture can be described by using multiple such elements, each providing information in a different language. The element definition is the following:

As can be seen, it is a string element with an attribute ("lang") indicating the language used in the textual description.

10.14.

This optional unsigned integer field indicates the importance of a media capture according to the Media Provider's perspective. It can be used on the receiver's side to automatically identify the most relevant contribution from the Media Provider. The higher the importance, the lower the contained value.
When media captures are marked with a "0" priority value, it means that they are "not subject to priority".

10.15.

This optional element contains the language used in the capture, if any.

10.16.

This optional element indicates whether or not the capture device originating the capture may move during the telepresence session. It can assume one of the three following values: (i) static, (ii) dynamic or (iii) highly dynamic.

10.17.

This optional element contains the value of the ID attribute of the media capture it refers to. A media capture marked with this element can be, for example, the translation of a main media capture into a different language.

10.18.

This optional tag describes what is represented in the spatial area covered by a media capture. The current possible values are: "table", "lectern", "individual", and "audience", as listed in the enumerative view type in the following.

10.19.

This optional tag is used for media captures conveying information about presentations within the telepresence session. The current possible values are "slides" and "images", as listed in the enumerative presentation type in the following.

10.20.

This boolean element indicates that there is text embedded in the media capture (e.g., in a video capture). The language used in such embedded textual description is reported in the "lang" attribute.

The XML Schema definition of the element is:

10.21.

This optional element is used to indicate which telepresence session participants are represented within the media captures. For each participant, a dedicated element is provided.

10.21.1.

This element contains the identifier of the represented person. Metadata about the represented participant can be retrieved by accessing the list of participants (Section 20).
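The priority rule of Section 10.14 (the lower the value, the higher the importance, with "0" meaning "not subject to priority") can be sketched in Python as follows. The function name and the tuple layout are hypothetical, and placing unranked captures after the ranked ones is an assumption of this sketch rather than something the text prescribes.

```python
from typing import Iterable, List, Optional, Tuple

# (capture identifier, priority value or None when no priority is given)
Capture = Tuple[str, Optional[int]]


def order_by_priority(captures: Iterable[Capture]) -> List[Capture]:
    """Most important captures first: a lower non-zero value means higher importance.
    Captures with priority 0 or no priority at all are not ranked; this sketch
    simply appends them after the ranked ones, in their original order."""
    items = list(captures)
    ranked = sorted((c for c in items if c[1]), key=lambda c: c[1])
    unranked = [c for c in items if not c[1]]
    return ranked + unranked


ordered = order_by_priority([("VC0", 2), ("VC1", 0), ("VC2", 1), ("VC3", None)])
```

A consumer could use such an ordering to pick the most relevant contribution automatically when it cannot render every capture.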
11. Audio captures

Audio captures inherit all the features of a generic media capture and present further audio-specific characteristics. The XML Schema definition of the audio capture type is reported below:

An example of audio-specific information that can be included is represented by the element described in Section 11.1.

11.1.

This element is an optional field describing the characteristics of the nominal sensitivity pattern of the microphone capturing the audio signal. The XML Schema definition is provided below:

12. Video captures

Video captures, similarly to audio captures, extend the information of a generic media capture with video-specific features.

The XML Schema representation of the video capture type is provided in the following:

13. Text captures

Text captures can also be described by extending the generic media capture information, similarly to audio captures and video captures.

The XML Schema representation of the text capture type currently lacks text-specific information, as can be seen from the definition below:

14. Other capture types

Other media capture types can be described by using the CLUE data model. They can be represented by using the "otherCaptureType" type. This media capture type is conceived to be filled in with elements defined within extensions of the current schema, i.e., with elements defined in other XML schemas (see Section 23 for an example). The otherCaptureType inherits all the features envisioned for the abstract mediaCaptureType.

The XML Schema representation of the otherCaptureType is the following:

15.
A Media Provider organizes the available captures in capture scenes in order to help the receiver both in the rendering and in the selection of the group of captures. Capture scenes are made of media captures and capture scene views, which are sets of media captures of the same media type. Each capture scene view is an alternative way to completely represent a capture scene for a fixed media type.

The XML Schema representation of a capture scene element is the following:

Each capture scene is identified by a "sceneID" attribute. The capture scene element can contain zero or more textual description elements, defined as in Section 10.13. Besides these, there is an optional element (Section 15.1), which contains structured information about the scene in the vCard format, and an optional element (Section 15.2), which is the list of the capture scene views. When no list of capture scene views is provided, the capture scene is assumed to be made of all the media captures which contain the value of its sceneID attribute in their mandatory captureSceneIDREF attribute.

15.1.

This element contains optional information about the capture scene according to the vCard format.

15.2.

This element is a mandatory field of a capture scene containing the list of scene views. Each scene view is represented by a dedicated element (Section 16).

15.3. sceneID attribute

The sceneID attribute is a mandatory attribute containing the identifier of the capture scene.

15.4. scale attribute

The scale attribute is a mandatory attribute that specifies the scale of the coordinates provided in the spatial information of the media captures belonging to the considered capture scene. The scale attribute can assume three different values:

"mm" - the scale is in millimeters.
Systems which know their physical dimensions (for example, professionally installed telepresence room systems) should always provide such real-world measurements.

"unknown" - the scale is not necessarily millimeters, but the scale is the same for every media capture in the capture scene. Systems which are not aware of specific physical dimensions yet still know relative distances should select "unknown" in the scale attribute of the capture scene to be described.

"noscale" - there is no common physical scale among the media captures of the capture scene. That means the scale could be different for each media capture.

16.

A element represents a capture scene view, which contains a set of media captures of the same media type describing a capture scene. A element is characterized as follows.

One or more optional elements provide human-readable information about what the scene view contains. This element is defined as already seen in Section 10.13. The remaining child elements are described in the following subsections.

16.1.

This element is the list of the identifiers of the media captures included in the scene view. It is an element of the captureIDListType type, which is defined as a sequence of elements, each containing the identifier of a media capture listed within the element:

16.2. sceneViewID attribute

The sceneViewID attribute is a mandatory attribute containing the identifier of the capture scene view represented by the element.

17.

The element represents an encoding group, which is made up of a set of one or more individual encodings and some parameters that apply to the group as a whole. Encoding groups contain references to individual encodings that can be applied to media captures.
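As an illustration only, the structure of an encoding group can be sketched in Python. The class and field names below are hypothetical and are not taken from the schema; the identifiers and bandwidth values echo the sample scenario of Section 26, and the pairing of each identifier with a bandwidth value is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EncodingGroup:
    """Hypothetical in-memory view of an encoding group (illustrative names)."""

    encoding_group_id: str
    encoding_ids: List[str]                     # references to individual encodings
    max_group_bandwidth: Optional[int] = None   # optional, in bits per second

    def __post_init__(self) -> None:
        # An encoding group is made up of one or more individual encodings.
        if not self.encoding_ids:
            raise ValueError("an encoding group needs at least one encoding")


def groups_offering(groups: List[EncodingGroup], encoding_id: str) -> List[str]:
    """Return the identifiers of the groups that reference a given encoding."""
    return [g.encoding_group_id for g in groups if encoding_id in g.encoding_ids]


# Assumed pairing of identifiers and values, for illustration only.
EG0 = EncodingGroup("EG0", ["ENC1", "ENC2", "ENC3"], 600000)
EG1 = EncodingGroup("EG1", ["ENC4", "ENC5"], 300000)
```

The group-level bandwidth is kept optional in the sketch, mirroring the optional nature of the corresponding field in the data model.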
The definition of the element is the following:

In the following, the contained elements are further described.

17.1.

This element is an optional field containing the maximum bitrate, expressed in bits per second, that can be shared by the individual encodings included in the encoding group.

17.2.

This element is the list of the individual encodings grouped together in the encoding group. Each individual encoding is represented through its identifier contained within an element.

17.3. encodingGroupID attribute

The encodingGroupID attribute contains the identifier of the encoding group.

18.

This element represents a simultaneous transmission set, i.e., a list of captures of the same media type that can be transmitted at the same time by a Media Provider. There are different simultaneous transmission sets for each media type.

Besides the identifiers of the captures ( elements), the identifiers of capture scene views and of capture scenes can also be exploited as shortcuts ( and elements).

18.1. setID attribute

The "setID" attribute is a mandatory field containing the identifier of the simultaneous set. When only capture scene identifiers are listed within a simultaneous set, the media type attribute MUST be used in order to determine which media captures can be sent together simultaneously.

18.2. mediaType attribute

The "mediaType" attribute is an optional attribute containing the media type of the captures referenced by the simultaneous set.

When only capture scene identifiers are listed within a simultaneous set, the media type attribute MUST appear in the XML description in order to determine which media captures can be sent together simultaneously.

18.3.

This element contains the identifier of the media capture that belongs to the simultaneous set.

18.4.
This element contains the identifier of the scene view containing a group of captures that can be sent simultaneously with the other captures of the simultaneous set.

18.5.

This element contains the identifier of the capture scene where all the included captures of a certain media type can be sent together with the other captures of the simultaneous set.

19.

This element is a set of captures of the same media type representing a summary of the complete Media Provider's offer. The content of a global view is expressed by leveraging only scene view identifiers, put within elements. Each global view is identified by a unique identifier within the "globalViewID" attribute.

20.

Information about the participants that are represented in the media captures is conveyed via the element. As can be seen from the XML Schema depicted below, a element is provided for each participant.

20.1.

This element includes all the metadata related to a person represented within one or more media captures. It provides the vCard of the subject (via the element, see Section 20.1.2) and his or her conference role(s) (via one or more elements, see Section 20.1.3). Furthermore, it has a mandatory "personID" attribute (Section 20.1.1).

20.1.1. personID attribute

The "personID" attribute carries the identifier of a represented person. Such an identifier can be used to refer to the participant, as in the element in the media captures representation (Section 10.21).

20.1.2.

The element is the XML representation of all the fields composing a vCard as specified in the xCard RFC [RFC6351]. The vcardType is imported from the xCard XML Schema provided by [I-D.ietf-ecrit-additional-data]. As that schema specifies, the element within is mandatory.

20.1.3.
The value of the element determines the role of the represented participant within the telepresence session organization. It can be one of the following terms, which are defined in the framework document: "presenter", "timekeeper", "attendee", "minute taker", "translator", "chairman", "vice-chairman".

A participant can play more than one conference role. In that case, more than one element will appear in his or her description.

21.

A capture encoding is given by the association of a media capture with an individual encoding, to form a capture stream as defined in [I-D.ietf-clue-framework]. The model of such an entity is provided in the following.

21.1.

This is the mandatory element containing the identifier of the media capture that has been encoded to form the capture encoding.

21.2.

This is the mandatory element containing the identifier of the applied individual encoding.

21.3.

This is an optional element to be used in case of configuration of MCCs. It contains the list of capture identifiers and capture scene view identifiers the Media Consumer wants within the MCC. That element is structured as the element used to describe the content of an MCC, i.e., it contains The total number of media captures listed in the must be lower than or equal to the value carried within the attribute of the MCC.

22.

The element has been left within the XML Schema to represent a draft version of the body of an ADVERTISEMENT message (see the example section).

23. XML Schema extensibility

The telepresence data model defined in this document is meant to be extensible.
Extensions are accomplished by defining elements or attributes qualified by namespaces other than "urn:ietf:params:xml:ns:clue-info" and "urn:ietf:params:xml:ns:vcard-4.0" for use wherever the schema allows such extensions (i.e., where the XML Schema definition specifies "anyAttribute" or "anyElement"). Elements or attributes from unknown namespaces MUST be ignored.

23.1. Example of extension

When extending the CLUE data model, a new schema with a new namespace associated with it needs to be specified.

In the following, an example of extension is provided. The extension defines a new audio capture attribute ("newAudioFeature") and an attribute for characterizing the captures belonging to an "otherCaptureType" defined by the user. An XML document compliant with the extension is also included. The XML file validates against the current CLUE data model schema.

CS1 true true EG1 newAudioFeatureValue CS1 true EG1 OtherValue 300000 ENC4 ENC5

24. Security considerations

This document defines an XML Schema data model for telepresence scenarios. The modeled information is identified in the CLUE framework as necessary to enable a fully featured media stream negotiation and rendering. Indeed, the XML elements herein defined are used within CLUE protocol messages to describe both the media streams representing the MP's telepresence offer and the desired selection requested by the MC. Security concerns described in [I-D.ietf-clue-framework], Section 15, apply to this document.

Data model information carried within CLUE messages SHOULD be accessed only by authenticated endpoints.
Indeed, some information published by the MP might reveal sensitive data about who and what is represented in the transmitted streams. The vCards included in the elements (Section 20.1) necessarily contain the identity of the represented person. Optionally, vCards can also carry the person's contact addresses, together with his or her photo and other personal data. Similar privacy-critical information can be conveyed by means of elements (Section 15.1) describing the capture scenes. The elements can also specify details that should be protected about the content of media captures (Section 10.13), capture scenes (Section 15), and scene views (Section 16).

Integrity attacks on the data model information encapsulated in CLUE messages can compromise the setup of the telepresence session by misleading the MC's and MP's interpretation of the offered and desired media streams.

The assurance of the authenticated access and of the integrity of the data model information is up to the involved transport mechanisms, namely the CLUE protocol [I-D.ietf-clue-protocol] and the CLUE data channel [I-D.ietf-clue-datachannel].

25. IANA considerations

This document registers a new XML namespace, a new XML schema and the MIME type for the schema.

25.1. XML namespace registration

URI: urn:ietf:params:xml:ns:clue-info

Registrant Contact: IETF CLUE Working Group, Roberta Presta

XML:

BEGIN
CLUE Data Model Namespace

Namespace for CLUE Data Model

urn:ietf:params:xml:ns:clue-info

See RFC XXXX.

END

25.2. XML Schema registration

This section registers an XML schema per the guidelines in [RFC3688].

URI: urn:ietf:params:xml:schema:clue-info

Registrant Contact: CLUE working group (clue@ietf.org), Roberta Presta (roberta.presta@unina.it).

Schema: The XML for this schema can be found as the entirety of Section 3 of this document.

25.3. MIME Media Type Registration for 'application/clue_info+xml'

This section registers the "application/clue_info+xml" MIME type.

To: ietf-types@iana.org

Subject: Registration of MIME media type application/clue_info+xml

MIME media type name: application

MIME subtype name: clue_info+xml

Required parameters: (none)

Optional parameters: charset. Same as the charset parameter of "application/xml" as specified in [RFC3023], Section 3.2.

Encoding considerations: Same as the encoding considerations of "application/xml" as specified in [RFC3023], Section 3.2.

Security considerations: This content type is designed to carry data related to telepresence information. Some of the data could be considered private. This media type does not provide any protection and thus other mechanisms such as those described in Section 24 are required to protect the data. This media type does not contain executable content.

Interoperability considerations: None.

Published specification: RFC XXXX [[NOTE TO IANA/RFC-EDITOR: Please replace XXXX with the RFC number for this specification.]]

Applications that use this media type: None.

Additional Information: Magic Number(s): (none), File extension(s): .clue, Macintosh File Type Code(s): TEXT.

Person & email address to contact for further information: Roberta Presta (roberta.presta@unina.it).
Intended usage: LIMITED USE

Author/Change controller: The IETF

Other information: This media type is a specialization of application/xml [RFC3023], and many of the considerations described there also apply to application/clue_info+xml.

26. Sample XML file

The following XML document represents a schema-compliant example of a CLUE telepresence scenario. Taking inspiration from the examples described in the framework draft ([I-D.ietf-clue-framework]), the XML representation of an endpoint-style Media Provider's offer is provided.

There are three cameras, where the central one is also capable of capturing a zoomed-out view of the overall telepresence room. Besides the three video captures coming from such cameras, the MP makes available a further multi-content capture about the loudest segment of the room, obtained by switching the video source across the three cameras. For the sake of simplicity, only one audio capture is advertised for the audio of the whole room.

The three cameras are placed in front of three participants (Alice, Bob and Ciccio), whose vCard and conference role details are also provided.

Media captures are arranged into four capture scene views:

1. (VC0, VC1, VC2) - left, center and right camera video captures

2. (VC3) - video capture associated with loudest room segment

3. (VC4) - video capture zoomed out view of all people in the room

4. (AC0) - main audio

There are two encoding groups: (i) EG0, for video encodings, and (ii) EG1, for audio encodings.

As to the simultaneous sets, only VC1 and VC4 cannot be transmitted simultaneously since they are captured by the same device, i.e., the central camera (VC4 is a zoomed-out view while VC1 is a focused view of the front participant).
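The constraint just described can be worked through with a short Python sketch. It is only an illustration under stated assumptions: the function and variable names are hypothetical, the conflict list contains just the VC1/VC4 pair named above, and the brute-force enumeration is a convenience for this five-capture example rather than a procedure defined by the data model. It enumerates every group of video captures that avoids the conflicting pair and keeps only the maximal ones.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List

# Video captures of the sample scenario (audio is a different media type).
VIDEO_CAPTURES = ["VC0", "VC1", "VC2", "VC3", "VC4"]

# Pairs that cannot be sent together because they share one device.
CONFLICTS = [frozenset({"VC1", "VC4"})]


def is_compatible(subset: Iterable[str], conflicts: List[FrozenSet[str]]) -> bool:
    """True when the subset contains no conflicting pair in full."""
    members = set(subset)
    return not any(pair <= members for pair in conflicts)


def maximal_simultaneous_sets(
    captures: List[str], conflicts: List[FrozenSet[str]]
) -> List[FrozenSet[str]]:
    """Brute-force search for the largest conflict-free groups of captures."""
    compatible = [
        frozenset(combo)
        for size in range(1, len(captures) + 1)
        for combo in combinations(captures, size)
        if is_compatible(combo, conflicts)
    ]
    # Keep a set only if no other compatible set strictly contains it.
    return [s for s in compatible if not any(s < other for other in compatible)]


RESULT = maximal_simultaneous_sets(VIDEO_CAPTURES, CONFLICTS)
```

Under these assumptions the search yields exactly two maximal groups, one containing VC1 without VC4 and one containing VC4 without VC1.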
The simultaneous sets would then be the following:

SS1 made by VC3 and all the captures in the first capture scene view (VC0, VC1, VC2);

SS2 made by VC3, VC0, VC2, VC4

CS1 EG1 0.5 1.0 0.5 0.5 0.0 0.5 true main audio from the room 1 it static room alice bob ciccio CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true left camera video capture 1 it static individual ciccio CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true central camera video capture 1 it static individual alice CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true right camera video capture 1 it static individual bob CS1 EG0 true Soundlevel:0 loudest room segment 1 it static individual CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true zoomed out view of all people in the room 1 it static room alice bob ciccio 600000 ENC1 ENC2 ENC3 300000 ENC4 ENC5 VC0 VC1 VC2 VC3 VC4 VC4 VC3 SE1 VC0 VC2 VC4 VC3 Bob minute taker Alice presenter Ciccio chairman timekeeper

27. MCC example

Enhancing the scenario presented in the previous example, the Media Provider is able to advertise a composed capture VC7 made by a big picture representing the current speaker (VC3) and two picture-in-picture boxes representing the previous speakers (the previous one, VC5, and the oldest one, VC6). The provider does not want to instantiate and send VC5 and VC6, so it does not associate any encoding group with them. Their XML representations are provided for enabling the description of VC7.
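The relationship between the composed capture and its constituents can be sketched in Python as follows. This is a minimal illustration, not part of the data model: the dictionaries and the function name are hypothetical, and the only facts taken from the scenario are that VC7 is composed of VC3, VC5 and VC6, that VC3 is associated with encoding group EG0, and that VC5 and VC6 have no encoding group.

```python
from typing import Dict, List, Optional, Tuple

# Capture identifier -> encoding group identifier, or None when the
# provider associates no encoding group with the capture.
ENCODING_GROUP_OF: Dict[str, Optional[str]] = {
    "VC3": "EG0",
    "VC5": None,
    "VC6": None,
}

# Content of the multiple content capture, by constituent identifier.
MCC_CONTENT: Dict[str, List[str]] = {"VC7": ["VC3", "VC5", "VC6"]}


def split_content(mcc_id: str) -> Tuple[List[str], List[str]]:
    """Split MCC constituents into (individually sendable, described only)."""
    sendable: List[str] = []
    described_only: List[str] = []
    for capture_id in MCC_CONTENT[mcc_id]:
        if ENCODING_GROUP_OF.get(capture_id) is not None:
            sendable.append(capture_id)
        else:
            described_only.append(capture_id)
    return sendable, described_only


SENDABLE, DESCRIBED_ONLY = split_content("VC7")
```

In this sketch VC5 and VC6 end up in the "described only" group: they appear in the description so that the content of VC7 can be expressed, but they cannot be requested as separate streams.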
A possible description for that scenario could be the following:

CS1 EG1 0.5 1.0 0.5 0.5 0.0 0.5 true main audio from the room 1 it static room alice bob ciccio CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true left camera video capture 1 it static individual ciccio CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true central camera video capture 1 it static individual alice CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true right camera video capture 1 it static individual bob CS1 EG0 true SE1 Soundlevel:0 loudest room segment 1 it static individual CS1 EG0 0.5 1.0 0.5 0.5 0.0 0.5 true zoomed out view of all people in the room 1 it static room alice bob ciccio CS1 true SE1 Soundlevel:1 penultimate loudest room segment 1 it static individual CS1 true SE1 Soundlevel:2 last but two loudest room segment 1 it static individual CS1 true VC3 VC5 VC6 big picture of the current speaker + pips about previous speakers 1 it static individual 600000 ENC1 ENC2 ENC3 300000 ENC4 ENC5 participants' individual videos VC0 VC1 VC2 loudest segment of the room VC3 loudest segment of the room + pips VC7 room audio AC0 room video VC4 VC7 SE1 VC0 VC2 VC4 VC7 Bob minute taker Alice presenter Ciccio chairman timekeeper

28.
Diff with draft-ietf-clue-data-model-schema-08 version

o Typos corrected

29. Diff with draft-ietf-clue-data-model-schema-07 version

o IANA Considerations: text added

o maxCaptureEncodings removed

o personTypeType values aligned with CLUE framework

o allowSubsetChoice added for multiple content captures

o embeddedText moved from videoCaptureType definition to mediaCaptureType definition

o typos removed from section Terminology

30. Diff with draft-ietf-clue-data-model-schema-06 version

o Capture Scene Entry/Entries renamed as Capture Scene View/Views in the text, / renamed as / in the XML schema.

o Global Scene Entry/Entries renamed as Global View/Views in the text, / renamed as /

o Security section added.

o Extensibility: a new type is introduced to describe other types of media capture (otherCaptureType), text and example added.

o Spatial information section updated: capture point optional, text is now coherent with the framework one.

o Audio capture description: added, removed, disallowed.

o Simultaneous set definition: added to refer to capture scene identifiers as shortcuts and an optional mediaType attribute which is mandatory to use when only capture scene identifiers are listed.

o Encoding groups: removed the constraint of the same media type.

o Updated text about media captures without (optional in the XML schema).

o "mediaType" attribute removed from homogeneous groups of captures (scene views and global views)

o "mediaType" attribute removed from the global view textual description.

o "millimeters" scale value changed to "mm"

31.
Diff with draft-ietf-clue-data-model-schema-04 version

globalCaptureEntries/Entry renamed as globalSceneEntries/Entry;

sceneInformation added;

Only capture scene entry identifiers listed within global scene entries (media capture identifiers removed);

renamed as in the <clueInfo> template

renamed as to synch with the framework terminology

renamed as to synch with the framework terminology

renamed as in the media capture type definition to remove ambiguity

Examples have been updated with the new definitions of and of .

32. Diff with draft-ietf-clue-data-model-schema-03 version

encodings section has been removed

global capture entries have been introduced

capture scene entry identifiers are used as shortcuts in listing the content of MCC (similarly to simultaneous set and global capture entries)

Examples have been updated. A new example with global capture entries has been added.

has been made optional.

has been renamed into

Obsolete comments have been removed.

participants information has been added.

33. Diff with draft-ietf-clue-data-model-schema-02 version

captureParameters and encodingParameters have been removed from the captureEncodingType

data model example has been updated and validated according to the new schema. Further description of the represented scenario has been provided.

A multiple content capture example has been added.

Obsolete comments and references have been removed.

34. Acknowledgments

The authors thank all the CLUErs for their valuable feedback and support.

35. Informative References

[I-D.ietf-clue-datachannel] Holmberg, C., "CLUE Protocol data channel", draft-ietf-clue-datachannel-09 (work in progress), March 2015.

[I-D.ietf-clue-framework] Duckworth, M., Pepperell, A., and S.
Wenger, "Framework for Telepresence Multi-Streams", draft-ietf-clue-framework-22 (work in progress), April 2015.

[I-D.ietf-clue-protocol] Presta, R. and S. Romano, "CLUE protocol", draft-ietf-clue-protocol-03 (work in progress), February 2015.

[I-D.ietf-ecrit-additional-data] Randy, R., Rosen, B., Tschofenig, H., Marshall, R., and J. Winterbottom, "Additional Data Related to an Emergency Call", draft-ietf-ecrit-additional-data-29 (work in progress), March 2015.

[RFC3023] Murata, M., St. Laurent, S., and D. Kohn, "XML Media Types", RFC 3023, January 2001.

[RFC3688] Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688, January 2004.

[RFC4796] Hautakorpi, J. and G. Camarillo, "The Session Description Protocol (SDP) Content Attribute", RFC 4796, February 2007.

[RFC6351] Perreault, S., "xCard: vCard XML Representation", RFC 6351, August 2011.

Authors' Addresses

Roberta Presta
University of Napoli
Via Claudio 21
Napoli 80125
Italy
EMail: roberta.presta@unina.it

Simon Pietro Romano
University of Napoli
Via Claudio 21
Napoli 80125
Italy
EMail: spromano@unina.it