INTERNET-DRAFT J. Ott/D. Kutscher/C. Bormann Expires: December 1999 Universitaet Bremen June 1999 Capability description for group cooperation draft-ott-mmusic-cap-00.txt Status of this memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet- Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Abstract This document presents a notation for describing potential and specific configurations of end systems in multiparty collaboration sessions. The objective is to define a configuration description framework that can be used to define end system capabilities, to calculate a set of appropriate common capabilities based on the descriptions of all (end) systems and to express a selected media description for use in session descriptions. One application for this framework would be multiparty multimedia conferencing, an application area where multiple tools have to be configured on conference startup (and/or during the conference) concerning media encoding types and other parameters. Other applications are IP Telephony and media gateway control. This document is intended for discussion in the Multiparty Multimedia Session Control (MMUSIC) working group of the Internet Engineering Task Force. Comments are solicited and should be addressed to the working group's mailing list at confctrl@isi.edu and/or the authors. Ott/Kutscher/Bormann [Page 1] INTERNET-DRAFTCapability description for group cooperation June 1999 1. Introduction 1.1. Background 1.1.1. Motivation Multiparty multimedia conferencing is one application that requires the dynamic interchange of end system capabilities and the negotiation of a parameter set that is appropriate for all sending and receiving end systems in a conference. Currently the parameter negotiation is either done by out of band means or, for loosely coupled conferences, parameters are simply fixed by the initiator of a conference. In the latter scenario no negotiation is required because only those participants with media tools that support the predefined settings can join a media session and/or a conference. This approach is applicable for conferences that are announced some time ahead of the actual start date of the conference. Potential participants can check the availability of media tools in advance and tools like session directories can configure tools on startup. This procedure however fails to work for conferences initiated spontaneously like Internet phone calls or ad-hoc multiparty conferences. Fixed settings for parameters like media types, their encoding etc. can easiliy inhibit the initiation of conferences, for example in situations where a caller insists on a fixed audio encoding that is not available at the callee's end system. To allow for spontaneous conferences, the process of defining a conference's parameter set must therefore be performed either at conference start (for closed conferences) or maybe (potentially) even repeatedly every time a new participant joins an active conference. The latter approach may not be appropriate for every type of conference: For conferences with TV-broadcast or lecture characteristics (one main active source) it is usually not desired to re-negotiate parameters every time a new participant with an exotic configuration joins because it may exclude the main source from media sessions. But conferences with equal ``rights'' for participants that are open for new participants do need dynamic capability negotiation, for example a telephone call that is extented to a 3-parties conference at some time during the session. 1.1.2. Current practices in the IETF community Capability and session descriptions play different roles in applications of IETF conferencing standards and are currently almost always specified as SDP (Session Description Protocol) [11] session descriptions. In session announcements with SAP (Session Announcement Protocol) [12] they are used to define media encodings and parameters for conferences and thus at least reflect the system capabilities of Ott/Kutscher/Bormann [Page 2] INTERNET-DRAFTCapability description for group cooperation June 1999 the participants or the active source. Within the context of SIP (Session Initiation Protocol) capability descriptions can be expressed in different session description languages, one of them SDP. For example, in a SIP-INVITE message for a unicast session, the session description enumerates the media types and formats that the caller is willing to use and thus expresses the capabilities of the caller's end system. The SDP content is however not only used to express a caller's preferences but is also used to configure communication channels in a somewhat crude way. For example, if a callee does not want to send or receive data on a offered stream he has to set the port number of that stream to zero in its media description that he sends as a reply to the caller. The use of SDP as a capability description and negotiation mechanism has lead to a whole set of conventions and requirements that have to be considered by implementations because SDP itself is not powerful enough for this purpose. This is clearly not a defect of SDP which has never been designed to be a complete capability description and negotiation mechanism. SDP has been developed in the context of SAP to describe simple static media sets. The misuse of SDP reveals a lack of a powerful, yet simple way to perform capability description and negotiation in a conference setup or reconfiguration phase in the current IETF conferencing model. 1.2. Purpose The configuration negotiation framework consists of three components: o A language that allows expressing capability descriptions, potential configurations, unambiguously; o an algorithm that compares different capability descriptions and produces an appropriate ``collapsed'' subset that can be used as a common set of potential configurations; and o a concrete capability name and value range specification for specific applications. This documents specifies ways to express potential and concrete configurations as well as rules to combine, constrain, and collaps these configurations. How a particular component's potential configurations are gained, what relationship exists to system capabilities, and similar meta-discussions are beyond the scope of this dcoument. It is also not the purpose of this document to specify a complete framework including mandatory protocols for capability exchange. Names and value ranges for different applications should be defined in a follow-up document and registered with the IANA. Besides modeling and rules, this document specifies a syntax for Ott/Kutscher/Bormann [Page 3] INTERNET-DRAFTCapability description for group cooperation June 1999 expressing configurations and describes a basic and a concise representation format as well as an XML-based notation. A number of appendices provide mappings to other specification formats (in particular SDP and H.245) as far as possible and also give an overview of semantic definitions for configurations for audio codecs. 1.3. Relation to other Developments A few other generic or application specific models have been developed that deal with capability description and/or capability negotiation. RFC 2295 (Transparent Content Negotiation in HTTP) [3] proposes a negotiation mechanism layered on top of HTTP that allows for automatically selecting the ``best'' version of documents that are accessible by a single URI. A server can describe the properties of each variant of a document associated with ``quality degration factors''. The content negotiation process will either allow the client to select the appropriate version according a variant list provided by the server or the server itself may choose a document version relying on Accept-headers that are included in the client's request. The Resource Description Framework (RDF) [4] provides a specification model for properties of Web resources and aims at automating processing Web resources with respect to resource discovery, cataloging, resource selection and other applications. CC/PP [5] is an on-going development that is creating a framework for describing user preferences and device capabilities that uses RDF to express those descriptions. In the CC/PP model a user agent can provide capability profiles that enable servers and proxies to customize content accordingly. The IETF Content Negotiation (conneg) working group is developing a collection of media features for display, print and fax [6], a registration procedure for feature tags (the names of capability properties) [7] as well as description and negotiation models [8] [9] for media features and capabilities. One of conneg's goals is to develop a ``tag independent negotiation'' process that can work without knowing the meaning of feature tags. Whereas TCN, RDF and CC/PP focus on describing/negotiating capabilities for client/server scenarios such as the WWW, where a server provides content with certain properties and a client has certain preferences/capabilities, the conneg approach is more general. The conneg framework provides the abstraction of ``feature sets'' that are media feature collections. Feature sets can either be interpreted as a set of variants that a server can provide as data formats or as a set of capabilities of a receiver. Content negotiation in this model would be to find a non-empty feature set that is compatible with both the sender's and the receiver's original Ott/Kutscher/Bormann [Page 4] INTERNET-DRAFTCapability description for group cooperation June 1999 feature set. H.245, the multimedia control protocol employed across all newer H.32x Recommendations for tightly-coupled multimedia conferencing (particularly included H.323) provides the concept of capability specification and exchange between terminals and uses the same description mechanisms to define particular instantiations of media streams in a conference. For capability description purposes, H.245 provides means to express all the capabilities supported by a system (``AlternativeCapabilitySets'') as well as to describe permitted combinations of these capability sets to be instantiated at the same time. Capability exchange is defined on a peer-to-peer basis, common (``collapsed'') capabilities are calculated by some central entity that controls the mode of operation in a multipoint conference. This calculation requires the central entity to understand the (semantics of) the individual endpoints' capability descriptions. T.124 specifies a framework for exchanging and collapsing capabilities. This framework specifies a core set of rules (minimum, maximum, logical AND) and capability types as well as a naming scheme, but leaves definition of specific semantics to the application protocols. This concept makes the framework extensible and enables entities to calculate a common set of supported capabilities without having to understand their semantics. Also, T.124 distinguishes between capability descriptions and particular instantiations for application sessions. In addition to these collapsing capabilities, T.124 supports the notion of non- collapsing capabilities to which the collapsing process is not applied. Capability negotiation for groups of senders and receivers as presented in this document can be viewed as a specialization of the general conneg approach that focuses on simplicity for capability descriptions. Some expressional power of the conneg framework is abandoned in favor of simplicity. 1.4. Terminology for requirement specifications In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in RFC 2119 [1] and indicate requirement levels for compliant implementations. 2. Requirements and Concepts 2.1. System Model Any (computer) system has a number of rather fixed hardware as well as software resources. These resources ultimately define the limitations on what can be captured, displayed, rendered, replayed, etc. with this particular machine. We term features enabled and Ott/Kutscher/Bormann [Page 5] INTERNET-DRAFTCapability description for group cooperation June 1999 restricted by these resources "system capabilities". Example: System capabilities may include the limitation of the screen resolution for true color by the graphics board; available audio hardware or software may offer only certain media encodings (e.g. G.711 and G.723.1 but not GSM); and CPU processing power and quality of implementation may constrain the possible video encoding algorithms. In multiparty multimedia conferences, participants employ different ``components'' in conducting the conference. Example: In lecture multicast conferences one component might be the voice transmission for the lecturer, another the transmission of video pictures showing the lecturer and the third the transmission of presentation material that are different components in a conference. Depending on system capabilities, user preferences and other technical and political constraints, different configurations can be chosen to accomplish the ``deployment'' of these components. Each component can be characterized at least by (a) its intended use (i.e. the function it shall provide) and (b) a one or more possible ways to realize this function. Each way of realizing a particular function is referred to as a "configuration". Example: A conference component's intended use may be to make transparencies of a presentation visible to the audience on the Mbone. This can be achieved either by a video camera capturing the image and transmitting a video stream via some video tool or by loading an copy of the slides into a distributed eletronic whiteboard. For each of these cases, additional parameters may exist, leading to additional configurations (see below). Two configurations are considered different regardless whether they employ entirely different mechanisms and protocols (as in the previous example) or they choose the same and differ only in a single parameter. Example: In case of video transmission, a JPEG-based still image protocol may be used, H.261 encoded CIF images could be sent as could H.261 encoded QCIF images. All three cases constitute different configurations. Of course there are many more detailed protocol parameters. Each component's configurations are limited by the system capabilities. In addition, the intended use of a component may constrain the possible configurations further to a subset suitable for the particular component's purpose. Example: In a system for highly interactive audio communication the component responsible for audio may decide not to use the Ott/Kutscher/Bormann [Page 6] INTERNET-DRAFTCapability description for group cooperation June 1999 available G.723.1 audio codec to avoid the additional latency but only use G.711. This would be reflected in this component only showing configurations based upon G.711. Still, multiple configurations are possible, e.g. depending on the use of A-law or u-Law, packetization and redundancy parameters, etc. We distinguish two types of configurations: o potential configurations (a set of any number of configurations per component) indicating a system's functional capabilities as constrained by the intended use of the various components; o actual configurations (exactly one per instance of a component) reflecting the mode of operation of this component's particular instantiation. Example: The potential configuration of the aforementioned video component may indicate support for JPEG, H.261/CIF, and H.261/QCIF. A particular instantiation for a video conference may use the actual configuration of H.261/CIF for exchanging video streams. A configuration consists of any number of properties and is uniquely identified by a tag. Potential configurations can be grouped into alternatives each of which indicates a possible mode of operation of a component. In a conference, each involved peer contributes to the formation of a component's configuration -- by specifying its its own features and limitations during the capability exchange process. Based upon all systems' input, a set of common capbilities -- potential configurations -- is calculated through the collapsing process. The collapsing process may be influenced by additional constraints that may be expressed on the possible combinations of alternatives -- between multiple instances of the same component as well as across (instances of) different components. Also, user preferences may be taken into account -- during the collapsing process as well as when deciding on which potential configuration is to be instantiated as the actual configuration for a component. 2.2. Definition of terms From the system model described above, the following core terms can be extracted: o conference component An element of a multiparty multimedia conference that can appear Ott/Kutscher/Bormann [Page 7] INTERNET-DRAFTCapability description for group cooperation June 1999 as a media stream and has a set of potential configurations. o configuration A set of named attributes, expressing constraints to a system's capabilities. o capability Resources or system features that influence the selection of useful configurations for components. o alternative When comparing different potential configurations, one potential configuration is an alternative to other configurations. o property A property is a label-value pair. The capability description language specified in this document is called CAP. 2.3. Description language The objective of a capability description language is to allow the definition of supported media types, encodings and features of an end system. The language must be unambiguous, easily parsable and allow for concise definitions to minimize the transport overhead for a capability negotiation phase during a conference. It should also be extensible and not fixed to certain features, because new encodings must be supported without changes to the language definition. To ensure the unambiguousness it is however required to have a common understanding on the meaning of identifiers and values. E.g. if two end systems used different names for the audio encoding ``GSM'' a capability negotiation would not lead to the desired result. The need for well-known identifiers and the need for extensibility require to seperate the definition of identifiers and values from the definition of the description language itself. Identifiers and values should therefore be standardized and registered. 2.4. Collapsing Algorithm The objective of the collapsing algorithm is to take capability description sets from each end system in order to find a set of media-types, encodings and features that are supported by all end system, or, if this is not possible, to find a subset that would exclude as few systems as possible. Ott/Kutscher/Bormann [Page 8] INTERNET-DRAFTCapability description for group cooperation June 1999 The procedure described above would be the default algorithm. In certain scenarios where some end systems are priveleged it must be possible to ensure that the result of the collapsing process does not exclude those privileged systems. It must therefore be possible to parameterize the process with the policy to be applied. 3. Specification of the Decription Language Two, semantically equivalent, notations are introduced. The first notation is simple but leads to verbose capability descriptions and the second notation is more complex but allows for concise descriptions.[1] This specification also defines how to translate descriptions using the concise notation to the other, simpler, format. Please note that all tags and values are just examples and not a subject of this specification. 3.1. Basic Description Language In the basic description language a end system's capability description is a set of alternatives. An alternative is a set of constraints for certain parameters. A constraint can be understood as a restriction because it limits the capability alternative according to the constraint's meaning. A constraint is constituted of three components: +---------+-------------------------------------+ |tag | name of the constraint | |operator | defines the type of the constraint | |value | a value for the constraint operator | +---------+-------------------------------------+ Table 1: Components of a Constraint A constraint that limits the capability of an end system to a maximum transfer rate of 64 kbit/s (say in a description of audio receiver capabilities) would be written as follows: bps <= 64000; with bps as the tag, <= as the operator and 64000 as the value of this constraint (plus a semicolon as a end-of-statement-symbol). A complete alternative (a set of constraints) would be written as: _________________________ [1] A third, XML-based notation is included in appendix A. Ott/Kutscher/Bormann [Page 9] INTERNET-DRAFTCapability description for group cooperation June 1999 media = audio; mode = receive | send; channels = 1; encoding = g711; compression = mulaw; sampling_rate = 8000 | 11025 | 16000; This example exhibits another way of expressing constraints using the = operator. The = operator can be used to define a set of supported values in a single constraint. The value of the = operator's value is actually a list of names seperated by ``|''. In the definition of the media constraint it is shown how a single name is used as a value for the = operator, which has the meaning that (for the respective alternative ) only the media-type audio is supported. Another operator that is not shown in the example is the operator >= that can be used to express minimum constraints. Table 2 provides an overview of the operators: +---+---------------------------+ |<= | maximum | |>= | minimum | |= | selection of fixed values | +---+---------------------------+ Table 2: Operators for Capability Constraints The reason why the sampling_rate constraint is expressed with a = and not with a <= operator is that defining the rate capability as a maximum constraint with a value of 16000 would allow any value less than 16000 as a valid parameter which would not match the application specific semantics in this case.[2] The example above contains one alternative of a capability description. It could be used as a complete description expressing that the end system does not support more than this specific alternative. Most end system however support more variants of audio parameters, requiring the definition of more alternatives. E.g. supporting ``GSM'' as a second encoding would lead to the following capability description: tag: audio/g711 media = audio; mode = receive | send; channels = 1; encoding = g711; compression = mulaw; sampling_rate = 8000 | 11025 | 16000; _________________________ [2] Most codecs do not support arbitrary sampling rates. Ott/Kutscher/Bormann [Page 10] INTERNET-DRAFTCapability description for group cooperation June 1999 tag: audio/gsm media = audio; mode = receive | send; channels = 1; encoding = gsm; compression = half | full | enhanced_full; This description expresses that the end system supports one media type ``audio'' and two audio encodings ``g711'' and ``gsm'', each with certain other constraints. This way of defining capabilities is very redundant as many constraints are the same for both alternatives. It is important to know all the constraints of an alternative for a later negotiation phase (see below) but for writing and transferring capability descriptions another notation that expresses common constraints and allows for more concise definition is useful. The = operator is actually already used to aggregate several constraints into one: A hypothetic even more primitive notation could translate each alternative containing a = constraint into a set of alternatives each containing a ``equality constraint'' for one value of the = value list. E.g. for the GSM alternative there would be 3 alternatives for each compression type (each variant again would require an alternative for receive and for send mode in this example). This has not been done in this example in order to avoid the obvious verbosity. Every alternative containing a = constraint with n values can however unrolled to n different alternatives if this granularity is required. Each alternative also contains a tag that allows to reference it later in simultaneous capability specifications. Due to the possibility to aggregate alternatives with = constraints several specific codec parameters for a media codec can be subsumed under one common tag like in the example above. This allows to handle common cases, where this is desired, efficiently. Again, if more granularity is needed for specific applications, = constraints can be unrolled. The ABNF[2] specification for the basic description language is as follows: Ott/Kutscher/Bormann [Page 11] INTERNET-DRAFTCapability description for group cooperation June 1999 +-------------------------------------------------------------------+ |caps = alternative *(CRLF CRLF alternative) | |alternative = tag-definition CRLF *constraint | |constraint = *WSP (min-constraint / max-constraint / | | oneof-constraint) *WSP [CRLF *WSP] | |tag-definition = *WSP "tag:" *WSP identifier *WSP ";" | |min-constraint = label *WSP ">=" *WSP numval *WSP ";" | |max-constraint = label *WSP "<=" *WSP numval *WSP ";" | |oneof-constraint = label *WSP "=" *WSP [oneof-list] *WSP | | ";" | |oneof-list = val / (oneof-list *WSP "|" *WSP val) | |label = identifier | |val = identifier | |numval = 1*DIGIT | |identifier = ALPHA, *(ALPHA / DIGIT) | +-------------------------------------------------------------------+ Note that the specification does currently not provide ``non- collapsing'' attributes, i.e. attributes that are not considered in collapsing rules, except for tags. Another syntactic element for those attribute will be added in the future. 3.2. Concise Description Language 3.2.1. Syntax The goal of the concise description language is to express the same capability description more concisely by grouping shared constraints of alternatives. The concise language provides the same constraint operators but introduces the concept of alternative groups. The above, verbose example can be expressed like this: media: audio { mode = receive | send; channels = 1; encoding: g711 { compression = mulaw; sampling_rate = 8000 | 11025 | 16000; } || encoding: gsm { compression = half | full | enhanced_full; }; }; An alternative group contains those constraints (and subgroups) that are specific to an alternative and cannot be expressed in the common part. A group is enclosed by curly brackets and follows a group-tag (like ``encoding: g711'' in the example). A group-tag is semantically a ``='' constraint (with one value) but is used in the concise notation to introduce a new subgroup of constraints. Ott/Kutscher/Bormann [Page 12] INTERNET-DRAFTCapability description for group cooperation June 1999 The example above contains three groups: The top-level group ``media: audio'' and two second-level groups ``encoding: g711'' and ``encoding: gsm''. Groups on the same hierarchy level (siblings) are connected by ``||''. Groups can be nested to arbitrary levels and there is no limit for the number of siblings in a hierarchy. The next example shows how the ``encoding: g711'' group can be split-up into 2 subgroups: media: audio { mode = receive | send; channels = 1; encoding: g711 { compression: mulaw { sampling_rate = 8000 | 11025 | 16000; } || compression: alaw { sampling_rate = 8000 | 11025 | 32000; }; } || encoding: gsm { compression = half | full | enhanced_full; }; }; Note that there are no explicit tags allowed for the concise notation. Instead group tags serve as implicit tags components that can be composed to unique tags for each expressed alternative. A alternative can be uniquely specified by joining the group tags of all enclosing groups. The specification example above would thus define three alternatives: audio/g711/mulaw, audio/g711/alaw and audio/gsm. Tag concatenation uses "/" (slash) as a delimiting character. The ABNF[2] specification for the concise description language is as follows (as an extension to the ABNF of the basic language, see section 3.1): +------------------------------------------------------------+ |caps = 1*(group LWSP *("||" LWSP group) *WSP | | ";") | |group = group-tag *WSP "{" LWSP *constraint | | [caps] LWSP "}" | |group-tag = name ":" *WSP tag | +------------------------------------------------------------+ 3.2.2. Translation to Basic Notation Transforming a capability description from concise to basic notation MUST be done by applying the following algorithm, starting at the outermost hierarchy level and transforming subgroups recursively: Ott/Kutscher/Bormann [Page 13] INTERNET-DRAFTCapability description for group cooperation June 1999 transform group-tag to = constraint; push group-tag to tag stack; adopt all other constraints within the group; for each group in this level { add adopted constraints and transformed group-tag to every alternative obtained from transforming the subgroups recursively resulting in a set of alternatives; if (is innermost group) { construct tag by concatenating all group-tags from tag stack and add it to alternative; } pop tag stack; } Two innermost subgroups at the same hierachy level are thus converted to two alternatives. An Example: A: B { C <= 1; D: E { F <= 2; G:H { I <= 3 ; } || J: K { L <=4; } } || M: L { N <= 5; } } would be transformed into the following set of alternatives: tag: B/E/H A = B; C <= 1; D = E; F <= 2; G = H; I <= 3; tag: B/E/K A = B; C <= 1; D = E; F <= 2; J = K; L <= 4; Ott/Kutscher/Bormann [Page 14] INTERNET-DRAFTCapability description for group cooperation June 1999 tag: B/L A = B; C <= 1; M = L; N <= 5; 3.2.3. Translation from Basic to Concise Format The mapping from basic to concise representation is not unique by itself: In principle, for all alternatives constraints with common values can be factored out. Depending on the constraints that are chosen for outer groups the results will differ. Nevertheless it would be possible to define an algorithm that will guarantee uniqueness, for example by defining certain tags as implicit outer- level tags (e.g. ``media'') and by demanding that those constraints with the largest number of equal values in many alternatives will appear in the outermost groups. Conflicts could be avoided by imposing a lexicographic ordering on the tags. Only ``='' constraints with one parameter can be chosen for group tags. 4. Specification of constraints for simulataneous capabilities For some applications it is not sufficient to be able to express the capability to support a list of media types and codec parameters. Instead constraints of how many instances of codecs of different types can be active at a given time must also be specified as an input parameter for a negotiation/selection process. For example a gateway may be able to handle either 5 GSM streams or, alternatively, 5 G.711 streams at the same time but not both GSM and G.711 at the same time. The specification presented here enables the definitions of such constraints by the tagging mechanism. Alternative capability can be refered to in rules expressing those simultaneous constraints using their tags. The specification of such a definition language is however not subject of this draft and will have to be defined elsewhere. 5. Specification of the Collapsing Process The collapsing process generates a set of alternatives, according to the collapsing policy and the set of alternatives that are used as the input to this process. 5.1. Finding compatible alternatives Ott/Kutscher/Bormann [Page 15] INTERNET-DRAFTCapability description for group cooperation June 1999 The general collapsing process tries to find a set of alternatives that are supported by every end system. This must be accomplished by comparing each alternative of an end system's alternative set with each alternative of every other alternative set. The process of collapsing two alternatives works as follows: find intersection of constraints of the two alternatives by keeping all constraint with same names and operators; for all constraints in the intersection { find according constraint (same name and operator) in second set; if(operator==''<='') { calculate minimum of both constraint values and add maximum constraint with that value to result set; } if(operator==''>='') { calculate maximum of both constraint values and add minimum constraint with that value to result set; } if(operator==''='') { Build intersection of tags in both constraints; add = constraint with a value of the intersection to the result set; } } Tags are ignored in the collapsing process. If the result set of alternatives contains = constraints with empty value lists the collapsing of these two alternatives has failed and the resulting set must be discarded. 5.2. Other policies Other collapsing policies will have to be defined. 6. Composed Configurations For certain configurations it is required to compose configurations by combining or referencing other configurations. Sample application could be redundant and FEC encodings. A full specification how this can be accomplished will have to be defined. The general outline would be to use the structuring and referencing mechanisms (tagged alternative) to express the required constraints for the respective encodings. 7. Security Considerations Security considerations will also have to be defined. Ott/Kutscher/Bormann [Page 16] INTERNET-DRAFTCapability description for group cooperation June 1999 8. Authors' Addresses Joerg Ott Universitaet Bremen, TZI, MZH 5180 Bibliothekstr. 1 D-28359 Bremen Germany voice +49 421 201-7028 fax +49 421 218-7000 Dirk Kutscher Universitaet Bremen, TZI, MZH 5160 Bibliothekstr. 1 D-28359 Bremen Germany voice +49 421 218-7595 fax +49 421 218-7000 Carsten Bormann Universitaet Bremen, TZI, MZH 5180 Bibliothekstr. 1 D-28359 Bremen Germany voice +49 421 218-7024 fax +49 421 218-7000 9. References [1] S. Bradner, ``Key words for use in RFCs to Indicate Requirement Levels'' RFC 2119, March 1997 [2] D. Crocker, P. Overell, ``Augmented BNF for Syntax Specifications: ABNF'', RFC 2234, November 1997 [3] K. Holtman, A. Mutz, ``Transparent Content Negotiation in HTTP'', RFC 2295, March 1998 [4] O. Lassila, R. Swick, ``Resource Description Framework (RDF) Model und Syntax Specification'', W3C Proposed Recommendation, January 1999, work in progress, http://www.w3.org/TR/1999 [5] F. Reynolds, J. Hjelm, S. Dawkins, S. Singhal, ``Composite Capability/Preference Profiles (CC/PP): A user side framework for content negotiation'', W3C Note 30, November 1998, work in progress, http://www.w3.org/TR/1998/NOTE-CPP-19981130 [6] L. Massinter, K. Holtman, A. Mutz, D. Wing, ``Media Features for Ott/Kutscher/Bormann [Page 17] INTERNET-DRAFTCapability description for group cooperation June 1999 Display, Print, and Fax'', Internet Draft draft-ietf-conneg- media-features-05.txt, January 1998, Work in Progress [7] K. Holtman, A. Mutz, T. Hardie, ``Media Feature Tag Registration Procedure'', Internet Draft draft-ietf-conneg-feature- reg-03.txt, July 1998, Work in Progress [8] G. Klyne, ``A syntax for describing media feature sets'', Internet Draft draft-ietf-conneg-feature-syntax-04.txt, December 1998, Work in Progress [9] G. Klyne, ``An algebra for describing media feature sets'', Internet Draft draft-ietf-conneg-feature-algebra-03.txt, August 1998, Work in Progress [10] G. Klyne, ``W3C Composite Capability/Preference Profiles'', Internet-Draft draft-ietf-conneg-W3C-ccpp-01.txt, December 1998, Work in progress [11] M. Handley, ``SDP: Session Description Protocol'', RFC 2327, April 1998 [12] M.Handley, C. Perkins, E. Whelan, ``Session Announcement Protocol'', Internet-Draft draft-ietf-mmusic-sap-v2-01.txt, June 1999, Work in progress Ott/Kutscher/Bormann [Page 18] INTERNET-DRAFTCapability description for group cooperation June 1999 Appendix A: XML-DTD for the description language A XML-DTD for XML documents representing concise CAP descriptions: The example explained above represented in XML: Ott/Kutscher/Bormann [Page 19] INTERNET-DRAFTCapability description for group cooperation June 1999 receive send 1 8000 11025 8000 11025 half full enhanced_full Note that = constraints with one alternative are represented as property elements for brevity while = constraints with multiple alternatives are represented as one.of elements with a property element (without name attribute) for each value. Ott/Kutscher/Bormann [Page 20] INTERNET-DRAFTCapability description for group cooperation June 1999 Appendix B: Mapping from/to SDP Note that this appendix is still prelimenary as it does not yet cover all the features provided by the capability description language presented in this document. SDP allows for describing all parameters required for establishing a conference. The media parameters that can be interpreted as a caller's capabilities are only a subset of the session desription. Other information like origin (``o='' field) or communication parameters are not related to a system's capability description (although they need to be expressable in a session description language, as well). An example SDP description: v=0 o=mhandley 2890844526 2890842807 IN IP4 126.16.64.4 s=SDP Seminar i=A Seminar on the session description protocol u=http://www.cs.ucl.ac.uk/staff/M.Handley/sdp.03.ps e=mjh@isi.edu (Mark Handley) c=IN IP4 224.2.17.12/127 t=2873397496 2873404696 a=recvonly m=audio 49170 RTP/AVP 0 m=video 51372 RTP/AVP 31 m=video 51374 RTP/AVP 98 a=rtpmap:98 X-H.263+ m=application 32416 udp wb a=orient:portrait Only the ``m='' and the respective ``a='' fields contain relevant information for a mapping to our capability description language. The first element of a ``m='' field is the media type that can be mapped to the tag of a top-level ``group-tag'' in the concise description language. The second element of a ``m='' field, the transport port, is a communication parameter and can therefore be neglected for now. The third and fourth (and subsequent) elements define a transport protocol (that can be regarded as some kind of capability) and media formats (encodings). The ``m='' field may be followed by an ``a='' field that can contain arbitrary constraints on the media description, notably the rtpmap attribute, that maps a dynamic RTP payload type number to a media format (and additional encoding parameters, depending on the concrete encoding). Further encoding specific parameters are specified using a ``a=fmtp'' attribute. All parameters of a ``a=fmtp'' attribute will be mapped to respective constraints in our description language. The concrete mapping is yet to be defined for some common uses of ``a=fmtp''. For the sake of generality we must translate the implicit encoding paramters expressed in static RTP payload numbers to explicit descriptions and extract the relevant information from the ``a='' fields for dynamic payload types. Ott/Kutscher/Bormann [Page 21] INTERNET-DRAFTCapability description for group cooperation June 1999 The example above could therefore be translated as: media: audio { mode = receive | send encoding: g711 { transport = RTP compression = mulaw sampling_rate = 8000 channels = 1 } } media: video { mode = receive | send encoding: h261 { transport = RTP } || encoding: h263+ { transport = RTP } } media: application { type: wb { transport = UDP orientation = portrait } } The constraints inside the ``g711'' group have to be adopted from the payload types definition in RFC 1890. The ``transport'' constraint could also be factored-out to the outer groups ``audio'' and ``video'' -- this is not relevant to the semantics of the description. Note that the empty group for ``h261'' and ``h263+'' can also be abbreviated as a ``= constraint'' if no specific constraints exist for those encodings. The mapping process can thus defined as follows: 1) Each ``m='' format specification is mapped to a ``group'' nested in a ``group'' for the respective media. The tag for that group is inferred either from the static payload type or in case of dynamic payload types looked up from a corresponding ``a=rtpmap'' field. The corresponding registered payload type name leads to an encoding name (by a yet to be defined name map). A mapping for unregistered payload type names has to be defined, as well. 2) The transport of a ``m='' field becomes a ``='' constraint in the ``group'' for the encoding 3) For registered payload type names the additional parameters as defined in RFC 1890 such as sampling rate and number of channels are each translated into corresponding ``='' constraints of the encoding group. Ott/Kutscher/Bormann [Page 22] INTERNET-DRAFTCapability description for group cooperation June 1999 4) Translation of ``a=fmtp'' has to be defined... 5) All other ``a='' fields relating to a ``m='' and representing a single attribute-value mapping (like orient:portrait) are translated into single ``='' constraints with one value. Future versions of this specification will also define how integrate other SDP configuration parameters into CAP using non-collapsing parameters (see section xx) that are yet to be defined. Translating a description written in the concise decscription language (back) to SDP again would rely on a well-defined mapping of encoding names: 1) CAP names for video or audio that cannot be translated into registered payload type names will be translated as dynamic payload types with a corresponding ``a=rtpmap'' field. 2) CAP groups with encoding names that can be mapped are either translated into ``m='' fields with static payload types if the encoding parameters (sampling rate and number of channels) conform to the specification of a static payload type or, if one of these parameters differ are translated to ``m='' fields with a dynamic payload type that will be defined in a subsequent ``a=rtpmap'' field. 3) For other media types the encoding groups will be translated to ``m=application''fields with the encoding name as the fourth element. 4) The transport constraint of the CAP description is will be reflected in the ``m='' field, as well. 5) Other = constraints with one value will be translated to ``a='' fields. Ott/Kutscher/Bormann [Page 23] INTERNET-DRAFTCapability description for group cooperation June 1999 Appendix C: Integration into SDP Instead of translating a CAPs specification into SDP media descriptions it can be more efficient to directly add it to a SDP description and thus retain the original specification. This can be done by using dynamic payload types: m=audio 98 a=rtpmap:98 X-CAP a={ channels = 1; encoding: g711 { compression = mulaw);sampling_rate = 8000 | 11025 | 16000; } || encoding: gsm { compression = half | full | enhanced_full; }; } Ott/Kutscher/Bormann [Page 24] INTERNET-DRAFTCapability description for group cooperation June 1999 Appendix D: Mapping to H.245 TBD. Ott/Kutscher/Bormann [Page 25]